How to mix PCM audio sources (Java)?

Here's what I'm working with right now:

for (int i = 0, numSamples = soundBytes.length/2; i < numSamples; i += 2) 
{ 
    // Get the samples. 
    int sample1 = ((soundBytes[i] & 0xFF) << 8) | (soundBytes[i + 1] & 0xFF); // Automatically converts to unsigned int 0...65535         
    int sample2 = ((outputBytes[i] & 0xFF) << 8) | (outputBytes[i + 1] & 0xFF); // Automatically converts to unsigned int 0...65535 

    // Normalize for simplicity. 
    float normalizedSample1 = sample1/65535.0f; 
    float normalizedSample2 = sample2/65535.0f; 

    float normalizedMixedSample = 0.0f; 

    // Apply the algorithm. 
    if (normalizedSample1 < 0.5f && normalizedSample2 < 0.5f) 
     normalizedMixedSample = 2.0f * normalizedSample1 * normalizedSample2; 
    else 
     normalizedMixedSample = 2.0f * (normalizedSample1 + normalizedSample2) - (2.0f * normalizedSample1 * normalizedSample2) - 1.0f; 

    int mixedSample = (int)(normalizedMixedSample * 65535); 

    // Replace the sample in soundBytes array with this mixed sample. 
    soundBytes[i] = (byte)((mixedSample >> 8) & 0xFF); 
    soundBytes[i + 1] = (byte)(mixedSample & 0xFF); 
}

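As far as I can tell, this is an accurate representation of the algorithm defined on this page: http://www.vttoth.com/CMS/index.php/technical-notes/68

However, just mixing a sound with silence (all 0s) produces a result that very obviously doesn't sound right. It is perhaps best described as higher-pitched and louder.

I would appreciate help determining whether I'm implementing the algorithm correctly, or whether I simply need to go about this a different way (a different algorithm or method).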

Source

2015-08-14 FTLRalph

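The author of the linked article assumes that A and B represent entire streams of audio. More specifically, X means the maximum absolute value of all the samples in stream X, where X is either A or B. So his algorithm scans both streams in their entirety to compute the max abs sample of each, and then scales things so that the output theoretically peaks at 1.0. You need to make multiple passes over the data to implement this algorithm, and if your data is streaming in, it simply will not work.

Here is an example of how I think the algorithm works. It assumes the samples have already been converted to floating point, to sidestep the issue of your conversion code being wrong. I'll explain what is wrong with it later: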

double[] samplesA = ConvertToDouble(samples1); 
double[] samplesB = ConvertToDouble(samples2); 
double A = ComputeMaxPeak(samplesA); 
double B = ComputeMaxPeak(samplesB); 

// Z is the article's formula. It equals 1 whenever either peak is at full scale, 
// so it is not a useful scaling factor here. 
double Z = A+B-A*B; 

// What we really need is a value x such that xA+xB=1, i.e. x = 1/(A+B). 
// Guard against two silent streams, where A+B is 0. 
double x = (A + B) > 0 ? 1/(A + B) : 1; 

// Now mix and scale the samples 
double[] samples = MixAndScale(samplesA, samplesB, x);

Mixing and scaling:

double[] MixAndScale(double[] samplesA, double[] samplesB, double scalingFactor) 
{ 
    // Assumes both streams have the same length. 
    double[] result = new double[samplesA.length]; 
    for (int i = 0; i < samplesA.length; i++) 
     result[i] = scalingFactor * (samplesA[i] + samplesB[i]); 
    return result; 
}

Computing the max peak:

double ComputeMaxPeak(double[] samples) 
{ 
    double max = 0; 
    for (int i = 0; i < samples.length; i++) 
    { 
     double x = Math.abs(samples[i]); 
     if (x > max) 
      max = x; 
    } 
    return max; 
}
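
To make the two passes concrete, here is a minimal self-contained sketch of the peak scan followed by the mix. The class name PeakScaledMixDemo, the helper mix, and the choice of gain = 1/(A+B) are my own assumptions for illustration (it is simply the value that satisfies xA+xB=1), not something taken from the linked article:

```java
class PeakScaledMixDemo {
    // First pass: largest absolute sample value in a stream.
    static double computeMaxPeak(double[] samples) {
        double max = 0;
        for (double s : samples) {
            max = Math.max(max, Math.abs(s));
        }
        return max;
    }

    // Second pass: sum the two streams and apply one constant gain.
    static double[] mixAndScale(double[] a, double[] b, double gain) {
        double[] result = new double[Math.min(a.length, b.length)];
        for (int i = 0; i < result.length; i++) {
            result[i] = gain * (a[i] + b[i]);
        }
        return result;
    }

    // Hypothetical convenience wrapper combining both passes.
    static double[] mix(double[] a, double[] b) {
        double peakSum = computeMaxPeak(a) + computeMaxPeak(b);
        // Assumption: gain = 1 / (A + B). Two silent streams need no scaling.
        double gain = peakSum == 0 ? 1.0 : 1.0 / peakSum;
        return mixAndScale(a, b, gain);
    }

    public static void main(String[] args) {
        double[] a = {1.0, -0.5, 0.0};
        double[] b = {1.0, 0.5, 0.0};
        // Both peaks are 1.0, so the gain is 0.5 and the sum stays within [-1, 1].
        System.out.println(java.util.Arrays.toString(mix(a, b)));
        // Mixing with an all-zero stream leaves a full-scale stream unchanged.
        System.out.println(java.util.Arrays.toString(mix(a, new double[3])));
    }
}
```

Note that with this gain a quiet pair of streams gets amplified (for example, peaks of 0.2 and 0.2 give a gain of 2.5), which matches the idea of scaling so the output theoretically peaks at 1.0.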

And the conversion. Notice how I'm using a short so that the sign bit is properly maintained:

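double[] ConvertToDouble(byte[] bytes) 
{ 
    double[] samples = new double[bytes.length/2]; 
    for (int i = 0; i < samples.length; i++) 
    { 
     // Big-endian: high byte first. The high byte keeps its sign; the low byte is 
     // masked so it is not sign-extended. The cast to short preserves the sign bit. 
     short tmp = (short)((bytes[i*2] << 8) | (bytes[i*2+1] & 0xFF)); 
     samples[i] = tmp/32767.0; 
    } 
    return samples; 
}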
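
As a rough illustration of why the sign matters, the sketch below reuses the mixing formula from the question on values normalized to 0..1. That formula only leaves a sample unchanged when the other input is 0.5, so it effectively treats the midpoint as silence, while all-zero bytes read as unsigned decode to 0.0. The class and method names are hypothetical, and it assumes the data is signed 16-bit PCM with the high byte first, as in the question's code:

```java
class SilenceMidpointDemo {
    // The mixing formula from the question, on values normalized to 0..1.
    static float mixUnsigned(float a, float b) {
        if (a < 0.5f && b < 0.5f) {
            return 2.0f * a * b;
        }
        return 2.0f * (a + b) - 2.0f * a * b - 1.0f;
    }

    // Decode one big-endian 16-bit sample as a signed value.
    static short decodeSigned(byte hi, byte lo) {
        return (short) ((hi << 8) | (lo & 0xFF));
    }

    public static void main(String[] args) {
        // Mixing with the midpoint (0.5) returns the sample unchanged.
        System.out.println(mixUnsigned(0.75f, 0.5f));
        // Mixing with 0.0 (all-zero bytes read as unsigned) changes the sample.
        System.out.println(mixUnsigned(0.75f, 0.0f));
        System.out.println(mixUnsigned(0.25f, 0.0f));
        // Bytes 0xFF 0xFE are -2 when read as signed, but 65534 when read as unsigned.
        System.out.println(decodeSigned((byte) 0xFF, (byte) 0xFE));
    }
}
```

So with the question's unsigned conversion, a silent buffer does not sit at the formula's neutral point, which would explain why mixing with silence audibly alters the other sound.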

Source

2015-08-15 09:50:51 jaket

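Tried this code. After fixing a few compile errors and a missing parenthesis, there is still white noise in the background when the two audio sources are mixed. Is there anything else missing?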

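After struggling with this problem for a long time, I decided not to use this conversion approach and instead used 'ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shorts);' to work on shorts, then 'ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(shorts);' to go back to bytes. This works perfectly.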
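
For anyone wanting to try the ByteBuffer approach from the comment above, here is a small sketch of the round trip. The class and method names are made up for illustration. LITTLE_ENDIAN is taken from the comment; the byte order has to match your actual audio format (the question's code reads the high byte first, which would be BIG_ENDIAN):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ShortBufferConversionDemo {
    // Interpret raw bytes as signed 16-bit little-endian samples.
    static short[] toShorts(byte[] bytes) {
        short[] shorts = new short[bytes.length / 2];
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shorts);
        return shorts;
    }

    // Write signed 16-bit samples back out as little-endian bytes.
    static byte[] toBytes(short[] shorts) {
        byte[] bytes = new byte[shorts.length * 2];
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(shorts);
        return bytes;
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFE, (byte) 0xFF, 0x01, 0x00};
        short[] samples = toShorts(raw); // low byte first: 0xFFFE and 0x0001
        System.out.println(java.util.Arrays.toString(samples));
        System.out.println(java.util.Arrays.equals(raw, toBytes(samples)));
    }
}
```

Because short is signed in Java, this keeps the sign bit without any manual shifting or masking.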
