2014-12-03

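How do I convert a wav file into 10 ms chunks of data? I am trying to split the data I read from a wav file into 10 ms segments for dynamic time warping.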

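    import contextlib
    import wave

    import numpy as np
    from scipy.io import wavfile

    # file_path is the path to the wav file
    data = np.zeros((1, 7000))
    rate, wav_data = wavfile.read(file_path)
    with contextlib.closing(wave.open(file_path, 'r')) as f:
        frames = f.getnframes()
        rate = f.getframerate()
        duration = frames / float(rate)  # length of the recording in seconds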

Is there an existing library that does this?

Thanks

Answer


If you want to post-process the data, you will probably want to work with it as a numpy array.

>>> import wave 
>>> import numpy as np 
>>> f = wave.open('911.wav', 'r') 
>>> data = f.readframes(f.getnframes()) 
>>> data[:10] # just to show it is a string of bytes 
'"5AMj\x88\x97\xa6\xc0\xc9' 
>>> numeric_data = np.frombuffer(data, dtype=np.uint8) # np.fromstring is deprecated for binary data
>>> numeric_data 
array([ 34, 53, 65, ..., 128, 128, 128], dtype=uint8) 
>>> 10e-3*f.getframerate() # how many frames per 10ms? 
110.25 

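That is not an integer, so unless you want to interpolate your data, you need to pad it with zeros to get neat 110-frame-long segments (which is roughly 10 ms at this frame rate).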
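The same arithmetic can be sketched for any frame rate. The two helper names below, `frames_per_segment` and `padding_needed`, are hypothetical and not part of any library. They assume the segment length is rounded down to a whole number of frames, as in the 110-frame choice above.

```python
def frames_per_segment(rate, segment_ms=10):
    """Whole number of frames in one segment, rounded down (hypothetical helper)."""
    return rate * segment_ms // 1000


def padding_needed(n_frames, seg_len):
    """Frames to append so that n_frames becomes a multiple of seg_len.

    Returns 0 when n_frames is already a multiple of seg_len.
    """
    return -n_frames % seg_len


seg_len = frames_per_segment(11025)      # 110.25 rounded down
pad = padding_needed(186816, seg_len)    # frames of padding for the file above
```

Rounding down means each segment is slightly shorter than 10 ms (110 / 11025 s), which is the "roughly 10 ms" caveat mentioned above.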

>>> numeric_data.shape, f.getnframes() # there are just as many samples in the numpy array as there were frames 
((186816,), 186816) 
>>> padding_length = -numeric_data.shape[0] % 110 # 0 if the length is already a multiple of 110
>>> padded = np.hstack((numeric_data, np.zeros(padding_length))) 
>>> segments = padded.reshape(-1, 110) 
>>> segments 
array([[ 34., 53., 65., ..., 216., 222., 228.], 
     [ 230., 227., 224., ..., 72., 61., 45.], 
     [ 34., 33., 32., ..., 147., 158., 176.], 
     ..., 
     [ 128., 128., 128., ..., 128., 128., 128.], 
     [ 127., 128., 128., ..., 128., 129., 129.], 
     [ 129., 129., 128., ..., 0., 0., 0.]]) 
>>> segments.shape 
(1699, 110) 

So now each row of the segments array is about 10 ms long.
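The session above works because the file is 8-bit mono, so one byte is one frame. For other files, the steps could be wrapped up roughly as follows. This is only a sketch: `split_wav` is a hypothetical helper, it assumes uncompressed 8-bit or 16-bit PCM, it keeps only the first channel, and it pads with the silence value for the sample format (128 for unsigned 8-bit, 0 for signed 16-bit) rather than with plain zeros.

```python
import wave

import numpy as np


def split_wav(path, segment_ms=10):
    """Split a PCM wav file into fixed-length segments (hypothetical helper).

    Returns (segments, rate), where segments has shape
    (n_segments, frames_per_segment). Only the first channel is kept.
    """
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        channels = f.getnchannels()
        width = f.getsampwidth()
        raw = f.readframes(f.getnframes())

    if width == 1:
        dtype, silence = np.dtype(np.uint8), 128   # 8-bit wav is unsigned
    elif width == 2:
        dtype, silence = np.dtype("<i2"), 0        # 16-bit wav is signed little-endian
    else:
        raise ValueError("unsupported sample width: %d bytes" % width)

    # Frames are interleaved by channel; keep channel 0 only.
    samples = np.frombuffer(raw, dtype=dtype).reshape(-1, channels)[:, 0]

    seg_len = rate * segment_ms // 1000            # whole frames per segment
    pad = -len(samples) % seg_len                  # 0 if already a multiple
    tail = np.full(pad, silence, dtype=samples.dtype)
    padded = np.concatenate([samples, tail])
    return padded.reshape(-1, seg_len), rate
```

Calling `segments, rate = split_wav('911.wav')` on the file from the session should give an array with 110 columns, with the last row padded by 128 instead of 0.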