Pearson correlation coefficient: code implementations
① Use NumPy's corrcoef function directly
import numpy as np
x = [2, 7, 18, 88, 157, 90, 177, 570]
y = [3, 5, 15, 90, 180, 88, 160, 580]
print(np.corrcoef(x, y))  # prints the 2x2 correlation matrix of x and y
Output (the off-diagonal entries, top right and bottom left, are r):
[[1.         0.99834875]
 [0.99834875 1.        ]]
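Because np.corrcoef returns the whole matrix, a single r value has to be picked out by indexing an off-diagonal entry. A minimal sketch reusing the data above; the names corr_matrix and r are illustrative and not from the original code:

```python
import numpy as np

x = [2, 7, 18, 88, 157, 90, 177, 570]
y = [3, 5, 15, 90, 180, 88, 160, 580]

corr_matrix = np.corrcoef(x, y)  # symmetric 2x2 matrix with ones on the diagonal
r = corr_matrix[0, 1]            # same value as corr_matrix[1, 0]
print(r)
```

Either off-diagonal index works, since the matrix is symmetric.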
② Call the SciPy library function
from scipy.stats import pearsonr
v1 = [2,7,18,88,157, 90,177,570]
v2 = [3,5,15,90,180, 88,160,580]
pccs = pearsonr(v1, v2)
print(pccs)
Output (the first element is r, the second is the two-sided p-value):
(0.99834874864405, 1.1241947891650549e-08)
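If only one of the two values is needed, the result of pearsonr can be unpacked into separate names. A small sketch, assuming SciPy is installed; the names r and p_value are illustrative:

```python
from scipy.stats import pearsonr

v1 = [2, 7, 18, 88, 157, 90, 177, 570]
v2 = [3, 5, 15, 90, 180, 88, 160, 580]

# Unpack the correlation coefficient and the two-sided p-value
r, p_value = pearsonr(v1, v2)
print(f"r = {r:.6f}, p = {p_value:.3e}")
```

A very small p-value, as here, suggests that a correlation this strong would be unlikely if the two variables were uncorrelated.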
③ Manual implementation
import math

def pearson(v1, v2):
    n = len(v1)
    # simple sums
    sum1 = sum(float(v) for v in v1)
    sum2 = sum(float(v) for v in v2)
    # sums of squares
    sum1_pow = sum(pow(v, 2.0) for v in v1)
    sum2_pow = sum(pow(v, 2.0) for v in v2)
    # sum of products
    p_sum = sum(v1[i] * v2[i] for i in range(n))
    # numerator (num) and denominator (den)
    num = p_sum - (sum1 * sum2 / n)
    den = math.sqrt((sum1_pow - pow(sum1, 2) / n) * (sum2_pow - pow(sum2, 2) / n))
    # a constant vector has zero variance; return 0.0 instead of dividing by zero
    if den == 0:
        return 0.0
    return num / den

vector1 = [2, 7, 18, 88, 157, 90, 177, 570]
vector2 = [3, 5, 15, 90, 180, 88, 160, 580]
print(pearson(vector2, vector1))
Output:
0.9983487486440501
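The manual version works from raw sums and sums of squares, which can lose precision when the values are large relative to their spread. One common alternative is to subtract the means first. The sketch below is only an illustration of that idea: pearson_centered is a hypothetical helper, not part of the original code, and it is checked against np.corrcoef on the same data.

```python
import numpy as np

def pearson_centered(a, b):
    """Pearson r from mean-centered vectors (hypothetical helper for illustration)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da = a - a.mean()  # deviations from the mean of a
    db = b - b.mean()  # deviations from the mean of b
    den = np.sqrt((da ** 2).sum() * (db ** 2).sum())
    if den == 0:
        return 0.0  # constant input: follow the manual version's convention
    return float((da * db).sum() / den)

vector1 = [2, 7, 18, 88, 157, 90, 177, 570]
vector2 = [3, 5, 15, 90, 180, 88, 160, 580]

r_centered = pearson_centered(vector1, vector2)
r_numpy = np.corrcoef(vector1, vector2)[0, 1]
print(np.isclose(r_centered, r_numpy))
```

On this small data set both formulations should agree to within floating-point rounding.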