2015-08-14 223 views

How do I mix PCM audio sources (Java)?

Here is what I am working with right now:

for (int i = 0, numSamples = soundBytes.length/2; i < numSamples; i += 2) 
{ 
    // Get the samples. 
    int sample1 = ((soundBytes[i] & 0xFF) << 8) | (soundBytes[i + 1] & 0xFF); // Automatically converts to unsigned int 0...65535         
    int sample2 = ((outputBytes[i] & 0xFF) << 8) | (outputBytes[i + 1] & 0xFF); // Automatically converts to unsigned int 0...65535 

    // Normalize for simplicity. 
    float normalizedSample1 = sample1/65535.0f; 
    float normalizedSample2 = sample2/65535.0f; 

    float normalizedMixedSample = 0.0f; 

    // Apply the algorithm. 
    if (normalizedSample1 < 0.5f && normalizedSample2 < 0.5f) 
     normalizedMixedSample = 2.0f * normalizedSample1 * normalizedSample2; 
    else 
     normalizedMixedSample = 2.0f * (normalizedSample1 + normalizedSample2) - (2.0f * normalizedSample1 * normalizedSample2) - 1.0f; 

    int mixedSample = (int)(normalizedMixedSample * 65535); 

    // Replace the sample in soundBytes array with this mixed sample. 
    soundBytes[i] = (byte)((mixedSample >> 8) & 0xFF); 
    soundBytes[i + 1] = (byte)(mixedSample & 0xFF); 
} 

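As far as I can tell, this is an accurate implementation of the algorithm defined on this page: http://www.vttoth.com/CMS/index.php/technical-notes/68

However, merely mixing a sound with silence (all 0s) produces output that very obviously sounds wrong; it is perhaps best described as higher-pitched and louder.

I would appreciate help determining whether I have implemented the algorithm correctly, or whether I simply need to go about this a different way (a different algorithm or method).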

Answer


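The author of the linked article assumes that A and B represent entire streams of audio. More specifically, X means the maximum absolute value of all the samples in stream X, where X is either A or B. So his algorithm scans both streams in their entirety to compute the maximum absolute sample of each, and then scales things so that the output theoretically peaks at 1.0. You need to make multiple passes over the data to implement this algorithm, so if your data is streaming in, it simply will not work.

Here is an example of how I think the algorithm works. It assumes the samples have already been converted to floating point, to sidestep the problem that your conversion code is wrong. I will explain what is wrong with it later: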

double[] samplesA = ConvertToDouble(samples1);
double[] samplesB = ConvertToDouble(samples2);
double A = ComputeMaxPeak(samplesA);
double B = ComputeMaxPeak(samplesB);

// Z equals 1 whenever either stream peaks at full scale (A == 1 or B == 1),
// which is not a useful piece of information.
double Z = A + B - A * B;

// What we really need is a value x such that x*A + x*B = 1, i.e. x = 1/(A + B).
// If both streams are silent (A + B == 0), leave the samples unscaled.
double x = (A + B) > 0 ? 1.0 / (A + B) : 1.0;

// Now mix and scale the samples.
double[] samples = MixAndScale(samplesA, samplesB, x);

Mixing and scaling:

double[] MixAndScale(double[] samplesA, double[] samplesB, double scalingFactor)
{
    // Assumes samplesB is at least as long as samplesA.
    double[] result = new double[samplesA.length];
    for (int i = 0; i < samplesA.length; i++)
        result[i] = scalingFactor * (samplesA[i] + samplesB[i]);
    return result;
}

Computing the maximum peak:

double ComputeMaxPeak(double[] samples)
{
    double max = 0;
    for (int i = 0; i < samples.length; i++)
    {
        double x = Math.abs(samples[i]);
        if (x > max)
            max = x;
    }
    return max;
}

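And the conversion. Note how I use a short so that the sign bit is maintained correctly:

double[] ConvertToDouble(byte[] bytes)
{
    double[] samples = new double[bytes.length/2];
    for (int i = 0; i < samples.length; i++)
    {
        // Big-endian, as in the question: high byte first. The high byte keeps its sign;
        // the low byte is masked so that it is not sign-extended.
        short tmp = (short)((bytes[i*2] << 8) | (bytes[i*2+1] & 0xFF));
        samples[i] = tmp/32768.0; // maps -32768..32767 to [-1.0, 1.0)
    }
    return samples;
}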
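
For reference, the steps above (convert, find each peak, mix, scale) can be put together into one runnable sketch. This is only an illustration under stated assumptions: the class name PeakScaledMixDemo, the encode helper, and the choice of big-endian 16-bit signed PCM (matching the byte order in the question's code) are not part of the answer, and the scale factor 1/(A + B) follows from requiring x*A + x*B = 1.

```java
import java.util.Arrays;

// Hypothetical demo class; names and the encode step are assumptions for illustration.
final class PeakScaledMixDemo {

    // Decode big-endian signed 16-bit PCM into doubles in [-1.0, 1.0).
    static double[] decode(byte[] bytes) {
        double[] samples = new double[bytes.length / 2];
        for (int i = 0; i < samples.length; i++) {
            // High byte keeps its sign; low byte is masked to avoid sign extension.
            short s = (short) ((bytes[2 * i] << 8) | (bytes[2 * i + 1] & 0xFF));
            samples[i] = s / 32768.0;
        }
        return samples;
    }

    // Encode doubles back to big-endian signed 16-bit PCM, clamping to the short range.
    static byte[] encode(double[] samples) {
        byte[] bytes = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            long scaled = Math.round(samples[i] * 32767.0);
            short s = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            bytes[2 * i] = (byte) (s >> 8);
            bytes[2 * i + 1] = (byte) s;
        }
        return bytes;
    }

    // First pass: largest absolute sample value in the stream.
    static double peak(double[] samples) {
        double max = 0;
        for (double v : samples) {
            max = Math.max(max, Math.abs(v));
        }
        return max;
    }

    // Second pass: sum the streams and scale so the theoretical peak is 1.0.
    static byte[] mix(byte[] a, byte[] b) {
        double[] sa = decode(a);
        double[] sb = decode(b);
        double total = peak(sa) + peak(sb);
        double x = total > 0 ? 1.0 / total : 1.0; // both silent: leave unscaled
        int n = Math.min(sa.length, sb.length);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = x * (sa[i] + sb[i]);
        }
        return encode(out);
    }

    public static void main(String[] args) {
        byte[] tone = {0x20, 0x00, (byte) 0xE0, 0x00}; // samples +8192, -8192
        byte[] silence = new byte[4];
        // Mixing with silence keeps the waveform shape but normalizes it to full scale.
        System.out.println(Arrays.toString(mix(tone, silence))); // prints [127, -1, -128, 1]
    }
}
```

Note that with this scaling, mixing a quiet stream with silence raises its volume to full scale, because the scale factor depends only on the two peaks.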

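I tried this code. After fixing a few compile errors and missing parentheses, there is still white noise in the background when the two audio sources are mixed. Is anything else missing?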


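After struggling with this problem for a very long time, I decided not to use this conversion and to work on shorts instead, using 'ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shorts);'. To go back to bytes I use 'ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(shorts);'. This works perfectly.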
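
A minimal sketch of the ByteBuffer round trip described in this comment, assuming little-endian 16-bit signed PCM. The comment does not say how the shorts were mixed, so the sum-and-clamp step below is an assumption chosen for illustration (it is not the algorithm from the linked article), and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; only the ByteBuffer calls come from the comment above.
final class ShortBufferMixDemo {

    // Read little-endian 16-bit PCM bytes as signed shorts.
    static short[] toShorts(byte[] bytes) {
        short[] shorts = new short[bytes.length / 2];
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shorts);
        return shorts;
    }

    // Write signed shorts back out as little-endian 16-bit PCM bytes.
    static byte[] toBytes(short[] shorts) {
        byte[] bytes = new byte[shorts.length * 2];
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(shorts);
        return bytes;
    }

    // Assumed mixing step: add the samples and clamp to the 16-bit range.
    static byte[] mix(byte[] a, byte[] b) {
        short[] sa = toShorts(a);
        short[] sb = toShorts(b);
        int n = Math.min(sa.length, sb.length);
        short[] out = new short[n];
        for (int i = 0; i < n; i++) {
            int sum = sa[i] + sb[i]; // int arithmetic, so the sum cannot overflow
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return toBytes(out);
    }
}
```

Because the samples are treated as signed values centered on zero, mixing any stream with silence (all zero bytes) returns the stream unchanged. The approach also works one buffer at a time, so it suits streaming input, at the cost of hard clipping when both sources are loud.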