2016-09-22 227 views
1

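Difference between normed plt.xcorr at 0-lag and np.corrcoef

I am investigating the cross-correlation between two relatively small time series, and I have run into a problem I cannot reconcile on my own. To begin with, I understand how plt.xcorr depends on np.correlate. What I cannot reconcile is the difference between plt.xcorr at zero lag and np.corrcoef.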

a = np.array([ 7.35846410e+08, 8.96271634e+08, 6.16249222e+08, 
    8.00739868e+08, 1.06116376e+09, 9.05690167e+08, 
    6.31383600e+08]) 
b = np.array([ 1.95621617e+09, 2.06263134e+09, 2.27717015e+09, 
    2.27281916e+09, 2.71090116e+09, 2.84676385e+09, 
    3.19578883e+09]) 

np.corrcoef(a,b) 
# returns: 
array([[ 1.  , 0.02099573], 
     [ 0.02099573, 1.  ]]) 

plt.xcorr(a,b,normed=True, maxlags=1) 
# returns: 
array([-1, 0, 1]), 
array([ 0.90510941, 0.97024415, 0.79874158]) 

I expected these to return the same result. I clearly do not understand how plt.xcorr is normalized. Can someone set me straight?

Answers

0

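I referred to http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xcorr:

normed : boolean, optional, default: True

if True, normalize the data by the autocorrelation at the 0-th lag.

In the code below, the correlation values returned by plt.xcorr (the second element of plt_corr) equal np_corr: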

plt_corr = plt.xcorr(a, b, normed=True, maxlags=6) 

c = np.correlate(a, a) # autocorrelation of a 
d = np.correlate(b, b) # autocorrelation of b 
np_corr = np.correlate(a/np.sqrt(c), b/np.sqrt(d), 'full') 
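The same normalization can be checked without drawing a plot. The following is a minimal sketch, assuming only what the documentation quoted above describes: the full cross-correlation is divided by the square root of the product of the two zero-lag autocorrelations, then trimmed to the requested lags. The helper name `normed_xcorr` is hypothetical and is not part of matplotlib or NumPy.

```python
import numpy as np

a = np.array([7.35846410e+08, 8.96271634e+08, 6.16249222e+08,
              8.00739868e+08, 1.06116376e+09, 9.05690167e+08, 6.31383600e+08])
b = np.array([1.95621617e+09, 2.06263134e+09, 2.27717015e+09,
              2.27281916e+09, 2.71090116e+09, 2.84676385e+09, 3.19578883e+09])

def normed_xcorr(x, y, maxlags):
    """Hypothetical helper: normalized cross-correlation for lags -maxlags..maxlags."""
    n = len(x)
    # Full cross-correlation covers lags -(n-1) .. (n-1); lag 0 sits at index n-1.
    full = np.correlate(x, y, mode="full")
    # Normalize by the zero-lag autocorrelations of x and y (no mean removal).
    full = full / np.sqrt(np.dot(x, x) * np.dot(y, y))
    lags = np.arange(-maxlags, maxlags + 1)
    return lags, full[n - 1 - maxlags:n + maxlags]

lags, c = normed_xcorr(a, b, maxlags=1)
print(lags)
print(c)
```

The middle entry of `c` is the zero-lag value. Because the means are never subtracted, it is not expected to match the off-diagonal entry of np.corrcoef(a, b).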
1

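The standard "Pearson product-moment correlation coefficient" is computed from samples that have been shifted by their means. The cross-correlation coefficient does not use mean-shifted samples. Apart from that, the computations are similar, but the two coefficients still have different formulas and different meanings. They are equal only if the means of the samples a and b are both 0 (that is, if shifting by the mean does not change the samples).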
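Written out for samples of length n, the two quantities described above look as follows. The symbols r_ab and c_ab(0) are notation chosen here for illustration; the third line is the identity that links the two numerators.

```latex
r_{ab} = \frac{\sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b})}
              {\sqrt{\sum_{i=1}^{n} (a_i - \bar{a})^2}\,\sqrt{\sum_{i=1}^{n} (b_i - \bar{b})^2}}

c_{ab}(0) = \frac{\sum_{i=1}^{n} a_i b_i}
                 {\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}

\sum_{i=1}^{n} a_i b_i = \sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b}) + n\,\bar{a}\,\bar{b}
```

When both means are 0, the extra term n·ā·b̄ vanishes (and the denominators coincide as well), so the two coefficients agree.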

import numpy as np 
import matplotlib.pyplot as plt 

a = np.array([7.35846410e+08, 8.96271634e+08, 6.16249222e+08, 
    8.00739868e+08, 1.06116376e+09, 9.05690167e+08, 6.31383600e+08]) 
b = np.array([1.95621617e+09, 2.06263134e+09, 2.27717015e+09, 
    2.27281916e+09, 2.71090116e+09, 2.84676385e+09, 3.19578883e+09]) 

y = np.corrcoef(a, b) 
z = plt.xcorr(a, b, normed=True, maxlags=1) 
print("Pearson product-moment correlation coefficient between `a` and `b`:", y[0][1]) 
print("Cross-correlation coefficient between `a` and `b` with 0-lag:", z[1][1], "\n") 


# Calculate manually: 

def pearson(a, b): 
    # Length. 
    n = len(a) 

    # Means. 
    ma = sum(a)/n 
    mb = sum(b)/n 

    # Shifted samples. 
    _ama = a - ma 
    _bmb = b - mb 

    # Standard deviations. 
    sa = np.sqrt(np.dot(_ama, _ama)/n) 
    sb = np.sqrt(np.dot(_bmb, _bmb)/n) 

    # Covariance.
    cov = np.dot(_ama, _bmb)/n

    # Final formula.
    # Note: the divisions by `n` in the standard deviations and the covariance
    #  cancel each other in the final formula and could be omitted.
    return cov/(sa * sb) 

def cross0lag(a, b): 
    return np.dot(a, b)/np.sqrt(np.dot(a, a) * np.dot(b, b)) 

pearson_coeff = pearson(a, b) 
cross_coeff = cross0lag(a, b) 

print("Manually calculated coefficients:") 
print(" Pearson =", pearson_coeff) 
print(" Cross =", cross_coeff, "\n") 


# Mean-centered samples (means shifted to 0):
am0 = a - sum(a)/len(a) 
bm0 = b - sum(b)/len(b) 
pearson_coeff = pearson(am0, bm0) 
cross_coeff = cross0lag(am0, bm0) 
print("Coefficients for samples with means = 0:") 
print(" Pearson =", pearson_coeff) 
print(" Cross =", cross_coeff) 

Output:

Pearson product-moment correlation coefficient between `a` and `b`: 0.020995727082 
Cross-correlation coefficient between `a` and `b` with 0-lag: 0.970244146831 

Manually calculated coefficients: 
    Pearson = 0.020995727082 
    Cross = 0.970244146831 

Coefficients for samples with means = 0: 
    Pearson = 0.020995727082 
    Cross = 0.020995727082 
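To see why the un-centered value is so close to 1, it may help to restore the means gradually. This is a minimal sketch using the same data; the `fraction` variable is an illustrative knob introduced here, not something from matplotlib or NumPy. Both series sit far from zero relative to their spread, so the mean terms dominate the zero-lag cross-correlation.

```python
import numpy as np

a = np.array([7.35846410e+08, 8.96271634e+08, 6.16249222e+08,
              8.00739868e+08, 1.06116376e+09, 9.05690167e+08, 6.31383600e+08])
b = np.array([1.95621617e+09, 2.06263134e+09, 2.27717015e+09,
              2.27281916e+09, 2.71090116e+09, 2.84676385e+09, 3.19578883e+09])

def cross0lag(x, y):
    # Zero-lag cross-correlation normalized by the zero-lag autocorrelations.
    return np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y))

# Mean-centered copies of the samples.
ac = a - a.mean()
bc = b - b.mean()

# fraction = 0 keeps the samples centered; fraction = 1 restores the original means.
results = {}
for fraction in (0.0, 0.25, 0.5, 1.0):
    x = ac + fraction * a.mean()
    y = bc + fraction * b.mean()
    results[fraction] = cross0lag(x, y)
    print(f"fraction of mean restored = {fraction:4.2f}  cross0lag = {results[fraction]:.6f}")

pearson = np.corrcoef(a, b)[0, 1]
print("Pearson =", pearson)
```

With fraction 0 the result coincides with the Pearson coefficient, and with fraction 1 it reproduces the plt.xcorr value at zero lag. If a Pearson-like quantity is wanted at every lag, one option is to subtract the means before calling plt.xcorr.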