2017-05-28 89 views
-3

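IndexError: can an int be used as an index?

I am getting the error "IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices". I am building a sound recognition application. My code is: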

import numpy as np 
import pandas as pd 
import scipy as sp 
import pickle 
from scipy import fft 
from time import localtime, strftime 
import matplotlib.pyplot as plt 
from skimage.morphology import disk,remove_small_objects 
from skimage.filter import rank 
from skimage.util import img_as_ubyte 
import wave 

folder = 'mlsp_contest_dataset/' 


essential_folder = folder+'essential_data/' 
supplemental_folder = folder+'supplemental_data/' 
spectro_folder =folder+'my_spectro/' 
single_spectro_folder =folder+'my_spectro_single/' 
dp_folder = folder+'DP/' 

# Each audio file has a unique recording identifier ("rec_id"), ranging from 0 to 644. 
# The file rec_id2filename.txt indicates which wav file is associated with each rec_id. 
rec2f = pd.read_csv(essential_folder + 'rec_id2filename.txt', sep = ',') 

# There are 19 bird species in the dataset. species_list.txt gives each a number from 0 to 18. 
species = pd.read_csv(essential_folder + 'species_list.txt', sep = ',') 
num_species = 19 

# The dataset is split into training and test sets. 
# CVfolds_2.txt gives the fold for each rec_id. 0 is the training set, and 1 is the test set. 
cv = pd.read_csv(essential_folder + 'CVfolds_2.txt', sep = ',') 

# This is your main label training data. For each rec_id, a set of species is listed. The format is: 
# rec_id,[labels] 
raw = pd.read_csv(essential_folder + 'rec_labels_test_hidden.txt', sep = ';') 
label = np.zeros(len(raw)*num_species) 
label = label.reshape([len(raw),num_species]) 
for i in range(len(raw)): 
    line = raw.iloc[i] 
    labels = line[0].split(',') 
    labels.pop(0) # rec_id == i 
    for c in labels: 
     if(c != '?'): 
      print(label) 
      label[i,c] = 1 

When I run this code, I get the error at the line label[i,c] = 1. I tried inspecting the label variable with print(label), and it looks like this:

warn(skimage_deprecation('The `skimage.filter` module has been renamed ' 
[[ 0. 0. 0. ..., 0. 0. 0.] 
[ 0. 0. 0. ..., 0. 0. 0.] 
[ 0. 0. 0. ..., 0. 0. 0.] 
..., 
[ 0. 0. 0. ..., 0. 0. 0.] 
[ 0. 0. 0. ..., 0. 0. 0.] 
[ 0. 0. 0. ..., 0. 0. 0.]] 

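I think the error means that integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays cannot be used as array indices, but I have used an int as an array index many times, so I don't understand why this error occurs. Debugging tells me that labels holds ['?'], and that c in

for c in labels[i]:

is '?'. I really don't understand what type this ? is. I think this ? causes the error, but I don't know how to fix it. How can I fix this?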

+0

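'for c in labels: ...', but labels is a list of strings. A string is not in the set "*integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays*". (Also note: 'np.zeros((len(raw), num_species))' is simpler.) –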

+0

@AndrasDeak Thank you very much! Which part does the np.zeros((len(raw), num_species)) you mentioned replace? How can I fix this? – user21063

+0

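I only noticed that the two lines before the for loop can be done in one line, without the reshape. As for your question: I don't know what you are trying to do, but using a character as a numpy array index definitely won't work. –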

Answer

0

The error message is saying that, when indexing a numpy array,

only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices 

label is a 2D array of floats, initially all 0:

label = np.zeros([len(raw),num_species]) 

Then, in the loop:

for i in range(len(raw)):  # i=0,1,2,... 

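Have you checked what raw looks like? Since it comes from pd.read_csv, I assume it is a DataFrame; iloc[i] selects one row, but that row has not yet been split into multiple columns?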

    line = raw.iloc[i] 
    labels = line[0].split(',') 
    labels.pop(0) # rec_id == i 

What does labels look like? I am guessing it is a list made up entirely of strings.

for c in labels: 
     if(c != '?'):   # evidently `c` is a string 
      print(label)  # prints the 2d array 
      label[i,c] = 1 

Indexing a 2D array should look like label[0,1]. c could be any of the other things listed in the error message, but it cannot be a string.
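A quick way to check this guess is to look at what split returns. The sample line below is made up for illustration (it only follows the "rec_id,[labels]" format described in the question and is not real dataset content):

```python
# Hypothetical line in the "rec_id,[labels]" format (not taken from the dataset)
line = "3,0,11,?"

labels = line.split(',')
labels.pop(0)  # drop the rec_id

print(labels)                      # the remaining items are strings such as '11'
print([type(c) for c in labels])   # every element is a str, never an int
```

str.split always yields strings, even when the text looks like a number, which is why c cannot be used directly as an index.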

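DataFrames do allow indexing with strings; that is a pandas feature. But numpy arrays must be indexed with numbers or with one of the few alternatives listed in the message. They are not indexed with strings (except in the case of structured arrays).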
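The difference can be seen side by side in a small sketch. The names df, arr and rec and their contents are invented for this example and have nothing to do with the question's data:

```python
import numpy as np
import pandas as pd

# pandas: a string is a valid key, here it selects the column named 'a'
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
col = df['a']

# plain numpy array: a string index raises IndexError
arr = np.zeros((2, 2))
try:
    arr[0, 'a']
    raised = False
except IndexError:
    raised = True

# structured array: field names are strings, so rec['a'] selects a field
rec = np.zeros(2, dtype=[('a', 'i4'), ('b', 'f8')])
rec['a'] = [10, 20]
```

In the structured case the string names a field declared in the dtype; it is still not a positional index into rows or columns.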


In [209]: label = np.zeros((3,5)) 
In [210]: label 
Out[210]: 
array([[ 0., 0., 0., 0., 0.], 
     [ 0., 0., 0., 0., 0.], 
     [ 0., 0., 0., 0., 0.]]) 
In [211]: label[1,3] 
Out[211]: 0.0 
In [212]: label[1,3]=1  # index with integers OK 
In [213]: label[0,2]=1 
In [214]: label[0,'?'] =1 # index with a string - ERROR 
--------------------------------------------------------------------------- 
IndexError        Traceback (most recent call last) 
<ipython-input-214-3738f623c78e> in <module>() 
----> 1 label[0,'?'] =1 

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices 

In [215]: label[0,:] =2  # index with a slice 
In [216]: label 
Out[216]: 
array([[ 2., 2., 2., 2., 2.], 
     [ 0., 0., 0., 1., 0.], 
     [ 0., 0., 0., 0., 0.]])
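
Given all this, one possible fix for the loop in the question is to convert each label to an int before indexing. This is only a sketch: the lines list is hypothetical sample data standing in for the rows of raw, and it assumes that every label other than '?' is a decimal species number.

```python
import numpy as np

num_species = 19

# Hypothetical rows in the "rec_id,[labels]" format, standing in for `raw`
lines = ["0,11,12", "1,?", "2,10", "3"]

# One call builds the 2D array; no separate reshape is needed
label = np.zeros((len(lines), num_species))

for i, line in enumerate(lines):
    labels = line.split(',')
    labels.pop(0)  # drop the rec_id
    for c in labels:
        if c != '?':
            # int(c) turns a string such as '11' into the integer 11
            label[i, int(c)] = 1
```

If a label could be something other than '?' or a number, int(c) would raise ValueError, so real data may need an extra check.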