Study notes on the sklearn.metrics module: Regression metrics

sklearn.metrics is the module of the sklearn package that collects evaluation metrics. These notes cover some of its regression metric functions.

1.Explained variance score

Let \(y\) be the true values and \(\hat{y}\) the predictions. The explained variance score is defined as \(\text{explained variance}(y,\hat{y}) = 1-\dfrac {Var(y-\hat{y})} {Var(y)}\).
The best possible score is 1, and lower values are worse. For example:

>>> from sklearn.metrics import explained_variance_score
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> explained_variance_score(y_true, y_pred)  
0.957...
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> explained_variance_score(y_true, y_pred, multioutput='raw_values')
array([0.967..., 1.        ])
>>> explained_variance_score(y_true, y_pred, multioutput=[0.3, 0.7])
0.990...

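Parameters:
When the targets are two-dimensional (multi-output), the multioutput parameter controls how the per-output scores are combined. The default, uniform_average, returns the unweighted mean of the explained variance scores of the individual outputs. raw_values returns the score of each output separately. variance_weighted weights each output's score by the variance of its true values. An array of weights returns the weighted average of the per-output scores.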
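To make the formula and the multioutput options concrete, here is a minimal pure-Python sketch that recomputes the numbers from the example above by hand. The helper names `variance` and `explained_variance` are made up for illustration and are not part of sklearn; the sketch assumes the population variance (division by n).

```python
def variance(values):
    """Population variance (divide by n), assumed to match Var in the formula."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(y_true, y_pred):
    """1 - Var(y - y_hat) / Var(y) for a single output (hypothetical helper)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - variance(residuals) / variance(y_true)


# Single output: same data as the first example above.
single = explained_variance([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])

# Multi-output: score each column separately, then combine the scores.
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
raw = [explained_variance(t_col, p_col)
       for t_col, p_col in zip(zip(*y_true), zip(*y_pred))]

# Unweighted mean of the per-output scores, as with 'uniform_average'.
uniform = sum(raw) / len(raw)

# Weighted mean of the per-output scores, as with multioutput=[0.3, 0.7].
weights = [0.3, 0.7]
weighted = sum(w * s for w, s in zip(weights, raw)) / sum(weights)

print(single, raw, uniform, weighted)
```

The per-column list `raw` plays the role of multioutput='raw_values'; the other two options are just different ways of averaging that list.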

2.Mean absolute error

Let \(y\) be the true values and \(\hat{y}\) the predictions. The mean absolute error is \(MAE(y,\hat{y}) = \dfrac 1 m\displaystyle\sum_{i=1}^m|y_i-\hat{y}_i|\), where \(m\) is the number of samples. For example:

>>> from sklearn.metrics import mean_absolute_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_absolute_error(y_true, y_pred)
0.5
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> mean_absolute_error(y_true, y_pred)
0.75
>>> mean_absolute_error(y_true, y_pred, multioutput='raw_values')
array([0.5, 1. ])
>>> mean_absolute_error(y_true, y_pred, multioutput=[0.3, 0.7])
0.85...

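Parameters: multioutput works as it does for the explained variance score. uniform_average (the default) returns the unweighted mean of the per-output errors, raw_values returns each output's error separately, and an array of weights returns their weighted average.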
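As a rough check of the MAE formula, the following sketch reproduces the multi-output results above without sklearn. `mean_abs_error` and `per_output` are hypothetical helper names introduced only for this illustration.

```python
def mean_abs_error(y_true, y_pred):
    """Average of |y_i - y_hat_i| over the m samples of one output."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def per_output(metric, y_true, y_pred):
    """Apply a single-output metric to each column of 2-D targets."""
    return [metric(t_col, p_col)
            for t_col, p_col in zip(zip(*y_true), zip(*y_pred))]


y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]

raw = per_output(mean_abs_error, y_true, y_pred)   # one error per column
uniform = sum(raw) / len(raw)                      # unweighted mean of the columns

weights = [0.3, 0.7]
weighted = sum(w * s for w, s in zip(weights, raw)) / sum(weights)

print(raw, uniform, weighted)
```

Because MAE is in the same units as the target, the per-column values can be read directly as "average distance from the true value".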

3.Mean squared error

Let \(y\) be the true values and \(\hat{y}\) the predictions. The mean squared error is \(MSE(y,\hat{y}) = \dfrac 1 m\displaystyle\sum_{i=1}^m(y_i - \hat{y}_i)^2\), where \(m\) is the number of samples. For example:

>>> from sklearn.metrics import mean_squared_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_squared_error(y_true, y_pred)
0.375
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> mean_squared_error(y_true, y_pred)  
0.7083...
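The sketch below recomputes the single-output MSE above by hand and, as an illustration on made-up data, shows how squaring makes one large error count for more than several small ones. The helper names are hypothetical, and the square root at the end is just the usual RMSE definition rather than a call into sklearn.

```python
import math


def mean_sq_error(y_true, y_pred):
    """Average of (y_i - y_hat_i)^2 over the m samples of one output."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_abs_error(y_true, y_pred):
    """Average of |y_i - y_hat_i|, used here only for comparison."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Same data as the first example above.
mse = mean_sq_error([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
rmse = math.sqrt(mse)  # root mean squared error, back in the units of y

# Made-up data: the same total absolute error, spread out vs. concentrated.
target = [0, 0, 0, 0]
spread = [1, 1, 1, 1]        # four errors of size 1
concentrated = [0, 0, 0, 4]  # one error of size 4

mae_spread = mean_abs_error(target, spread)
mae_concentrated = mean_abs_error(target, concentrated)
mse_spread = mean_sq_error(target, spread)
mse_concentrated = mean_sq_error(target, concentrated)

print(mse, rmse)
print(mae_spread, mae_concentrated, mse_spread, mse_concentrated)
```

Both made-up predictions have the same MAE, but the concentrated error yields a four times larger MSE, which is why MSE is the more outlier-sensitive of the two.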

4.Mean squared logarithmic error

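Let \(y\) be the true values and \(\hat{y}\) the predictions. The mean squared logarithmic error is
\(MSLE(y,\hat{y}) = \dfrac 1 m \displaystyle\sum_{i=1}^m (\ln(1+y_i) - \ln(1+\hat{y}_i))^2\), where \(m\) is the number of samples. This metric is best suited to targets that grow exponentially. For example: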

>>> from sklearn.metrics import mean_squared_log_error
>>> y_true = [3, 5, 2.5, 7]
>>> y_pred = [2.5, 5, 4, 8]
>>> mean_squared_log_error(y_true, y_pred)  
0.039...
>>> y_true = [[0.5, 1], [1, 2], [7, 6]]
>>> y_pred = [[0.5, 2], [1, 2.5], [8, 8]]
>>> mean_squared_log_error(y_true, y_pred)  
0.044...
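A small sketch of the MSLE formula, using `math.log1p` for \(\ln(1+x)\). The helper name `mean_sq_log_error` is hypothetical. The second half uses made-up numbers to illustrate why the metric suits exponentially growing targets: a prediction that is off by a factor of two is penalized about equally at small and large scales, while the squared error differs by orders of magnitude.

```python
import math


def mean_sq_log_error(y_true, y_pred):
    """Average of (ln(1 + y_i) - ln(1 + y_hat_i))^2 over the m samples."""
    return sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / len(y_true)


# Same data as the first example above.
msle = mean_sq_log_error([3, 5, 2.5, 7], [2.5, 5, 4, 8])

# Made-up data: the prediction is twice the target at two different scales.
msle_small = mean_sq_log_error([100], [200])
msle_large = mean_sq_log_error([10000], [20000])
sq_err_small = (100 - 200) ** 2
sq_err_large = (10000 - 20000) ** 2

print(msle)
print(msle_small, msle_large)      # close to each other
print(sq_err_small, sq_err_large)  # differ by a factor of 10**4
```

Note that the logarithm requires \(1+y > 0\), so this sketch, like the metric itself, is only meaningful for non-negative targets and predictions.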

5.Median absolute error

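Let \(y\) be the true values and \(\hat{y}\) the predictions. The median absolute error is
\(MedAE(y,\hat{y}) = median(|y_1- \hat{y}_1|,|y_2 - \hat{y}_2|,...,|y_m - \hat{y}_m|)\), where \(m\) is the number of samples. At the time of writing, this function does not support multioutput. For example: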

>>> from sklearn.metrics import median_absolute_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> median_absolute_error(y_true, y_pred)
0.5
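The following sketch computes the median absolute error with the standard library and, on a deliberately corrupted copy of the predictions, compares it with the mean absolute error. The helper names and the outlier value 100 are assumptions made for illustration only.

```python
from statistics import median


def median_abs_error(y_true, y_pred):
    """Median of the absolute errors |y_i - y_hat_i|."""
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_abs_error(y_true, y_pred):
    """Mean of the absolute errors, used here only for comparison."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]         # same data as the example above
y_pred_outlier = [2.5, 0.0, 2, 100]  # last prediction replaced by a wild value

med_clean = median_abs_error(y_true, y_pred)
med_outlier = median_abs_error(y_true, y_pred_outlier)
mae_clean = mean_abs_error(y_true, y_pred)
mae_outlier = mean_abs_error(y_true, y_pred_outlier)

print(med_clean, med_outlier)  # the median does not move
print(mae_clean, mae_outlier)  # the mean jumps
```

A single bad prediction leaves the median unchanged while the mean grows sharply, which is the main reason to prefer this metric when outliers are expected.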

6.R² score, the coefficient of determination

Let \(y\) be the true values, \(\bar{y}\) the mean of the true values, and \(\hat{y}\) the predictions. The R² score is
\(R^2 = 1-\dfrac {\displaystyle\sum_{i=1}^m(y_i - \hat{y}_i)^2} {\displaystyle\sum_{i=1}^m(y_i - \bar{y})^2}\), where \(m\) is the number of samples. For example:

>>> from sklearn.metrics import r2_score
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> r2_score(y_true, y_pred)  
0.948...
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> r2_score(y_true, y_pred, multioutput='variance_weighted')
0.938...
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> r2_score(y_true, y_pred, multioutput='uniform_average')
0.936...
>>> r2_score(y_true, y_pred, multioutput='raw_values')
array([0.965..., 0.908...])
>>> r2_score(y_true, y_pred, multioutput=[0.3, 0.7])
0.925...
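The sketch below recomputes R² by hand, reproduces the variance_weighted result above by weighting each column with its total sum of squares, and then contrasts R² with the explained variance score on made-up predictions that are off by a constant. All helper names are hypothetical, and weighting by the total sum of squares is an assumption that is equivalent to weighting by the variance of each output.

```python
def sum_sq_total(y_true):
    """Sum of squared deviations of y from its mean (denominator of R^2)."""
    mean = sum(y_true) / len(y_true)
    return sum((t - mean) ** 2 for t in y_true)


def r2(y_true, y_pred):
    """1 - SS_res / SS_tot for a single output."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1 - ss_res / sum_sq_total(y_true)


def explained_variance(y_true, y_pred):
    """1 - Var(y - y_hat) / Var(y); both variances share the factor 1/m."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - sum_sq_total(residuals) / sum_sq_total(y_true)


# Single output: same data as the first example above.
single = r2([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])

# Multi-output: per-column scores, then a variance-weighted average.
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
true_cols, pred_cols = list(zip(*y_true)), list(zip(*y_pred))
raw = [r2(t, p) for t, p in zip(true_cols, pred_cols)]
col_weights = [sum_sq_total(t) for t in true_cols]
variance_weighted = (sum(w * s for w, s in zip(col_weights, raw))
                     / sum(col_weights))

# Made-up data: every prediction is too high by exactly 1.
biased_true, biased_pred = [1, 2, 3], [2, 3, 4]
r2_biased = r2(biased_true, biased_pred)
ev_biased = explained_variance(biased_true, biased_pred)

print(single, raw, variance_weighted)
print(r2_biased, ev_biased)
```

On the constant-offset data the explained variance score is still perfect while R² is negative: the two metrics differ only in that R² also penalizes a non-zero mean of the residuals.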

posted on 2018-11-07 10:29  小马927