2017-10-12 171 views

Python TypeError: 'float' object cannot be interpreted as an index

The following code uses audio files to build a feature matrix in TensorFlow:

import tensorflow as tf 

directory = "audio_dataset/*.wav" 

filenames = tf.train.match_filenames_once(directory) 

init = (tf.global_variables_initializer(), tf.local_variables_initializer()) 

count_num_files = tf.size(filenames) 
filename_queue = tf.train.string_input_producer(filenames) 
reader = tf.WholeFileReader() 
filename, file_contents = reader.read(filename_queue) 

with tf.Session() as sess: 
    sess.run(init) 
    num_files = sess.run(count_num_files) 

    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 

    for i in range(num_files): 
        audio_file = sess.run(filename)
        print(audio_file)

This uses a toolkit that converts audio from the time domain to the frequency domain:

import numpy as np
from bregman.suite import *


chromo = tf.placeholder(tf.float32) 
max_freqs = tf.argmax(chromo, 0) 


def get_next_chromogram(sess): 
    audio_file = sess.run(filename) 
    F = Chromagram(audio_file, nfft=16384, wfft=8192, nhop=2205) 
    return F.X 


def extract_feature_vector(sess, chromo_data): 
    num_features, num_samples = np.shape(chromo_data) 
    freq_vals = sess.run(max_freqs, feed_dict={chromo: chromo_data}) 
    hist, bins = np.histogram(freq_vals, bins=range(num_features + 1)) 
    return hist.astype(float)/num_samples 


def get_dataset(sess): 
    num_files = sess.run(count_num_files) 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 
    xs = [] 
    for _ in range(num_files): 
        chromo_data = get_next_chromogram(sess)
        x = [extract_feature_vector(sess, chromo_data)]
        x = np.matrix(x)
        if len(xs) == 0:
            xs = x
        else:
            xs = np.vstack((xs, x))
    return xs 
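
For reference, the feature extraction step can be checked without TensorFlow or the Bregman Toolkit. This is only a minimal sketch: the function name feature_vector_from_chroma and the synthetic 12 x 4 matrix are assumptions for illustration, standing in for the real F.X data.

```python
import numpy as np

def feature_vector_from_chroma(chroma):
    """Return the normalized histogram of the dominant pitch class per frame."""
    num_features, num_samples = chroma.shape
    # Index of the strongest pitch class in each time frame (column).
    dominant = np.argmax(chroma, axis=0)
    # One bin per pitch class; bin edges are 0, 1, ..., num_features.
    hist, _ = np.histogram(dominant, bins=range(num_features + 1))
    return hist.astype(float) / num_samples

# Synthetic chromagram (an assumption; real data would come from F.X).
chroma = np.zeros((12, 4))
chroma[0, 0] = 1.0  # frames 0 and 1 peak at pitch class 0
chroma[0, 1] = 1.0
chroma[7, 2] = 1.0  # frames 2 and 3 peak at pitch class 7
chroma[7, 3] = 1.0
features = feature_vector_from_chroma(chroma)
print(features)
```

Each entry is the fraction of frames in which that pitch class was the strongest, so the entries sum to 1.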

This clusters the data around two centroids:

k = 2 
max_iterations = 100 

def initial_cluster_centroids(X, k): 
    return X[0:k, :] 

def assign_cluster(X, centroids): 
    expanded_vectors = tf.expand_dims(X, 0) 
    expanded_centroids = tf.expand_dims(centroids, 1) 
    distances = tf.reduce_sum(tf.square(tf.subtract(expanded_vectors, expanded_centroids)), 2) 
    mins = tf.argmin(distances, 0) 
    return mins 

def recompute_centroids(X, Y): 
    sums = tf.unsorted_segment_sum(X, Y, k) 
    counts = tf.unsorted_segment_sum(tf.ones_like(X), Y, k) 
    return sums/counts 

with tf.Session() as sess: 
    sess.run(init) 
    X = get_dataset(sess) 
    centroids = initial_cluster_centroids(X, k) 
    i, converged = 0, False 
    while not converged and i < max_iterations: 
        i += 1
        Y = assign_cluster(X, centroids)
        centroids = sess.run(recompute_centroids(X, Y))
    print(centroids) 
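
Note that the loop as posted never sets converged, so it always runs max_iterations times. The same two steps with a convergence check might look like the NumPy-only sketch below. The toy data, the initial centroid choice, and the assumption that no cluster ever ends up empty are all illustrative and not part of the original code.

```python
import numpy as np

def assign_cluster(X, centroids):
    # Squared Euclidean distance from every centroid to every point,
    # shape (k, n), built by broadcasting like the tf.expand_dims version.
    diffs = X[np.newaxis, :, :] - centroids[:, np.newaxis, :]
    distances = np.sum(diffs ** 2, axis=2)
    return np.argmin(distances, axis=0)

def recompute_centroids(X, labels, k):
    # Mean of the points assigned to each cluster (assumes none is empty).
    return np.array([X[labels == j].mean(axis=0) for j in range(k)])

# Toy data: two well-separated groups (an assumption for illustration).
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids = X[[0, 2]]  # hypothetical initial centroids
for _ in range(10):
    labels = assign_cluster(X, centroids)
    new_centroids = recompute_centroids(X, labels, k=2)
    if np.allclose(new_centroids, centroids):
        break  # centroids stopped moving
    centroids = new_centroids
print(labels)
print(centroids)
```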

But I get the following traceback:

Traceback (most recent call last): 
    File "components.py", line 776, in <module> 
    X = get_dataset(sess) 
    File "ccomponents.py", line 745, in get_dataset 
    chromo_data = get_next_chromogram(sess) 
    File "coffee_components.py", line 728, in get_next_chromogram 
    F = Chromagram(audio_file, nfft=16384, wfft=8192, nhop=2205) 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features.py", line 143, in __init__ 
    Features.__init__(self, arg, feature_params) 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 70, in __init__ 
    self.extract() 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 213, in extract 
    self.extract_funs.get(f, self._extract_error)() 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 711, in _chroma 
    if not self._cqft(): 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 588, in _cqft 
    self._make_log_freq_map() 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 353, in _make_log_freq_map 
    mxnorm = P.empty(self._cqtN) # Normalization coefficients   
TypeError: 'float' object cannot be interpreted as an index 

As far as I can tell, 'range' is given an 'int', not a 'float'.

Can someone please point out my mistake?


Where is 'range'? It isn't in the stack trace. It seems to be complaining about the line 'X = get_dataset(sess)'. – Antimony


Yes, 'get_dataset(sess)' is a function (see above) that iterates using 'range()'. This error usually means that a 'float' was passed to 'range', but I'm not sure that is the case here. – outkast


Maybe you could check the value of 'audio_file' in 'get_next_chromogram()'? It is the only non-integer passed to 'Chromagram()'. – Antimony
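
The usual cause mentioned in the comments, a float passed to range, can be reproduced in isolation. This is a small sketch for comparison only; try_range is a hypothetical helper, and the exact wording of the error message depends on the Python version.

```python
def try_range(n):
    """Return list(range(n)), or the TypeError message if range rejects n."""
    try:
        return list(range(n))
    except TypeError as exc:
        return str(exc)

print(try_range(3))         # an int works
print(try_range(3.0))       # a float raises TypeError
print(try_range(int(3.0)))  # casting first avoids the error
```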

Answer

1

The problem is that you are using Python 3, but the Bregman Toolkit was written for Python 2. The error comes from this line:

mxnorm = P.empty(self._cqtN) 

self._cqtN is a float. In Python 2, the pylab library accepts float input:

pylab.empty(5.0) 
__main__:1: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future 
array([ 0., 0., 0., 0., 0.]) 

In Python 3, however, the same call raises the same kind of error:

pylab.empty(5.0) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
TypeError: 'float' object cannot be interpreted as an integer 
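
The warning text in the first session is issued by NumPy, so the installed NumPy release may matter here as well as the Python version. A quick way to check how a given environment behaves is sketched below; allocate is a hypothetical helper, and the result for a float size is expected to differ between older and newer NumPy releases.

```python
import numpy as np

def allocate(n):
    """Try np.empty(n) and report whether NumPy accepted the size."""
    try:
        return np.empty(n).shape
    except TypeError as exc:
        return "TypeError: {}".format(exc)

print(allocate(5))         # integer size is always accepted
print(allocate(5.0))       # float size; recent NumPy releases reject this
print(allocate(int(5.0)))  # casting to int first is accepted
```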

You should be able to fix this error by editing the line in the file I linked above and casting the value to int:

mxnorm = P.empty(int(self._cqtN)) 
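
A plain int() truncates silently, so a value such as 83.999 would become 83. If that is a concern, a slightly stricter cast is possible. This is only a sketch: as_size is a hypothetical helper, not part of the Bregman Toolkit, and the tolerance is an arbitrary assumption.

```python
def as_size(value, tol=1e-9):
    """Convert a whole-number float such as 84.0 to int; reject 84.5."""
    as_int = int(round(value))
    if abs(value - as_int) > tol:
        raise ValueError("expected a whole number, got {!r}".format(value))
    return as_int

print(as_size(84.0))
# Hypothetical use in features_base.py:
#     mxnorm = P.empty(as_size(self._cqtN))
```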

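However, I would be surprised if you did not run into other errors because of the incompatible versions. You may want to try using Python 2 or look for an alternative to the Bregman Toolkit.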
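
To confirm which interpreter and NumPy release are actually active in the environment that runs the script, something like the following can be run inside it. This is a minimal sketch; describe_environment is a hypothetical helper.

```python
import sys

def describe_environment():
    """Collect the Python version and, if installed, the NumPy version."""
    info = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    try:
        import numpy
        info["numpy"] = numpy.__version__
    except ImportError:
        info["numpy"] = None  # NumPy is not installed in this environment
    return info

print(describe_environment())
```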


I don't understand. I am using a 'Python 2.X' conda environment for this. That should not be the problem. – outkast
