2017-01-09 59 views
8

我正在使用pyaudio将我的声音录制为wav文件。我使用下面的代码:为Google Speech API创建合适的WAV文件

def voice_recorder(): 
    FORMAT = pyaudio.paInt16 
    CHANNELS = 2 
    RATE = 22050 
    CHUNK = 1024 
    RECORD_SECONDS = 4 
    WAVE_OUTPUT_FILENAME = "first.wav" 

    audio = pyaudio.PyAudio() 

    # start Recording 
    stream = audio.open(format=FORMAT, channels=CHANNELS, 
        rate=RATE, input=True, 
        frames_per_buffer=CHUNK) 
    print "konusun..." 
    frames = [] 

    for i in range(0, int(RATE/CHUNK * RECORD_SECONDS)): 
     data = stream.read(CHUNK) 
     frames.append(data) 
    #print "finished recording" 


    # stop Recording 
    stream.stop_stream() 
    stream.close() 
    audio.terminate() 

    waveFile = wave.open(WAVE_OUTPUT_FILENAME, 'wb') 
    waveFile.setnchannels(CHANNELS) 
    waveFile.setsampwidth(audio.get_sample_size(FORMAT)) 
    waveFile.setframerate(RATE) 
    waveFile.writeframes(b''.join(frames)) 
    waveFile.close() 

我用下面的代码谷歌语音API该语音转换基本上在WAV文件文本:https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/api-client/transcribe.py

当我尝试导入WAV文件,该文件由pyaudio产生谷歌的代码,我收到以下错误:

googleapiclient.errors.HttpError: <HttpError 400 when requesting https://speech.googleapis.com/v1beta1/speech:syncrecognize?alt=json returned "Invalid Configuration, Does not match Wav File Header. 
Wav Header Contents: 
Encoding: LINEAR16 
Channels: 2 
Sample Rate: 22050. 
Request Contents: 
Encoding: linear16 
Channels: 1 
Sample Rate: 22050."> 

我用下面的办法解决这个:我的WAV文件转换成与ffmpeg的MP3,再经过我转换MP3文件用sox再次挥手:

def wav_to_mp3(): 
    FNULL = open(os.devnull, 'w') 
    subprocess.call(['ffmpeg', '-i', 'first.wav', '-ac', '1', '-ab', '6400', '-ar', '16000', 'second.mp3', '-y'], stdout=FNULL, stderr=subprocess.STDOUT) 

def mp3_to_wav(): 
    subprocess.call(['sox', 'second.mp3', '-r', '16000', 'son.wav']) 

谷歌的API与这个WAV输出一起工作,但由于质量下降太多,它表现不佳。

那么如何在第一步中用pyaudio创建Google兼容的WAV文件呢?

回答

4

转换的WAV文件,FLAC文件,avconv并将其发送到谷歌语音API解决了这个问题

subprocess.call(['avconv', '-i', 'first.wav', '-y', '-ar', '48000', '-ac', '1', 'last.flac'])