Implementing the Pearson correlation coefficient in code
① Use NumPy's corrcoef function directly
- import numpy as np
-
- x = [2, 7, 18, 88, 157, 90, 177, 570]
- y = [3, 5, 15, 90, 180, 88, 160, 580]
-
- # np.corrcoef returns the 2x2 correlation matrix of x and y
- print(np.corrcoef(x, y))
Output (the off-diagonal entries, top right and bottom left, are r):
[[1. 0.99834875]
[0.99834875 1. ]]
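If only the coefficient itself is needed, it can be read from the off-diagonal of that matrix. The following is a minimal sketch reusing the same data; the variable names corr_matrix and r are chosen here for illustration.

```python
import numpy as np

x = [2, 7, 18, 88, 157, 90, 177, 570]
y = [3, 5, 15, 90, 180, 88, 160, 580]

# 2x2 symmetric matrix: the diagonal is 1, the off-diagonal entries are r
corr_matrix = np.corrcoef(x, y)

# Either off-diagonal entry gives the correlation between x and y
r = corr_matrix[0, 1]
print(r)
```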
② Call the SciPy library function
- from scipy.stats import pearsonr
-
- v1 = [2, 7, 18, 88, 157, 90, 177, 570]
- v2 = [3, 5, 15, 90, 180, 88, 160, 580]
-
- # pearsonr returns the correlation coefficient and the two-sided p-value
- pccs = pearsonr(v1, v2)
- print(pccs)
Output (the first value is r, the second is the p-value):
(0.99834874864405, 1.1241947891650549e-08)
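The two values can also be unpacked into separate names, which is usually more convenient than indexing the result. This is a small sketch on the same data; r and p_value are names picked for illustration. Recent SciPy releases may print the result as a named result object rather than a plain tuple, but, as far as I know, it still unpacks in the same way.

```python
from scipy.stats import pearsonr

v1 = [2, 7, 18, 88, 157, 90, 177, 570]
v2 = [3, 5, 15, 90, 180, 88, 160, 580]

# Unpack the coefficient and the two-sided p-value
r, p_value = pearsonr(v1, v2)

print("r =", r)
print("p-value =", p_value)
```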
③ Implement it by hand
- import math
-
- def pearson(v1, v2):
-     n = len(v1)
-     # simple sums
-     sum1 = sum(float(v) for v in v1)
-     sum2 = sum(float(v) for v in v2)
-     # sums of squares
-     sum1_pow = sum(pow(v, 2.0) for v in v1)
-     sum2_pow = sum(pow(v, 2.0) for v in v2)
-     # sum of the products
-     p_sum = sum(a * b for a, b in zip(v1, v2))
-     # numerator (num) and denominator (den)
-     num = p_sum - (sum1 * sum2 / n)
-     den = math.sqrt((sum1_pow - pow(sum1, 2) / n) * (sum2_pow - pow(sum2, 2) / n))
-     # a zero denominator means one input has no variance, so r is undefined; return 0.0
-     if den == 0:
-         return 0.0
-     return num / den
-
- vector1 = [2, 7, 18, 88, 157, 90, 177, 570]
- vector2 = [3, 5, 15, 90, 180, 88, 160, 580]
-
- print(pearson(vector2, vector1))
Output:
0.9983487486440501
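The hand-written version above uses the raw-sums shortcut. An equivalent formulation works from deviations about the mean, which tends to be less sensitive to rounding when the values are large. The sketch below is one way to write it using only the standard library; the function name pearson_centered and the length check are assumptions added here, not part of the original code.

```python
import math

def pearson_centered(v1, v2):
    """Pearson r computed from deviations about the mean (illustrative helper)."""
    n = len(v1)
    if n != len(v2) or n < 2:
        raise ValueError("inputs must have the same length, at least 2")
    mean1 = sum(v1) / n
    mean2 = sum(v2) / n
    # deviations of each value from its own mean
    d1 = [a - mean1 for a in v1]
    d2 = [b - mean2 for b in v2]
    # numerator: sum of products of deviations
    num = sum(a * b for a, b in zip(d1, d2))
    # denominator: square root of the product of the sums of squared deviations
    den = math.sqrt(sum(a * a for a in d1) * sum(b * b for b in d2))
    # same choice as above: return 0.0 when one input has no variance
    return num / den if den != 0 else 0.0

vector1 = [2, 7, 18, 88, 157, 90, 177, 570]
vector2 = [3, 5, 15, 90, 180, 88, 160, 580]

print(pearson_centered(vector1, vector2))
```

On this data the result should agree with the three values above up to floating-point rounding.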