Weighted Least Squares (WLS)

When and How to use Weighted Least Squares (WLS) Models

Main use of WLS:
Handling heteroscedasticity, that is, cases where the variance of y at a given x grows as x increases.
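
For reference, the criterion that WLS minimizes is usually written as below. This is a sketch in standard textbook notation: the symbols $w_i$ (weight of observation $i$) and $\sigma_i^2$ (noise variance at $x_i$) are introduced here and do not appear in the original post. Setting every $w_i$ to the same value recovers ordinary least squares.

```latex
\hat{\beta}_{\mathrm{WLS}}
  = \arg\min_{\beta_0,\,\beta_1} \sum_{i=1}^{n} w_i \left( y_i - \beta_0 - \beta_1 x_i \right)^2,
\qquad w_i \propto \frac{1}{\sigma_i^2}
```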

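Heteroscedastic data: data whose variability depends on the input, so the variance typically changes as the input changes.
For example:
Net worth tends to spread out as age increases.
Revenue tends to spread out as company size grows.
Or, infant weight tends to spread out as height increases.
In each case, the variance of y at a given x increases as x increases.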
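
The pattern described above can be checked numerically. The following is a minimal sketch on synthetic data, not part of the original post: the noise model (standard deviation equal to `0.5 * x`), the sample size and the split point at `x = 5` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)              # fixed seed so the sketch is repeatable

n = 2000
x = rng.uniform(0, 10, n)
noise = rng.standard_normal(n) * 0.5 * x    # assumed noise model: std grows linearly with x
y = 2 * x + noise

residuals = y - 2 * x                       # deviation from the true line y = 2x

# Compare the spread of y around the line for small and large x
var_low = residuals[x < 5].var()
var_high = residuals[x >= 5].var()

print(f"residual variance, x < 5:  {var_low:.2f}")
print(f"residual variance, x >= 5: {var_high:.2f}")
```

If the data are heteroscedastic in the sense described here, the second variance should come out clearly larger than the first.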

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# import and fit an OLS model, check coefficients
from sklearn.linear_model import LinearRegression

# %matplotlib inline  # Jupyter magic; uncomment when running in a notebook

# generate random data
np.random.seed(24)
x = np.random.uniform(-5, 5, 25)
ϵ = 2 * np.random.randn(25)
y = 2 * x + ϵ

# alternate error as a function of x: the noise now scales with (x + 5)
ϵ2 = ϵ * (x + 5)
y2 = 2 * x + ϵ2
sns.regplot(x=x, y=y)
sns.regplot(x=x, y=y2)

# OLS fit on the homoscedastic data
model = LinearRegression()
model.fit(x.reshape(-1, 1), y)
print(model.intercept_, model.coef_)

# add a strong outlier for high x
x_high = np.append(x, 5)
y_high = np.append(y2, 160)

# add a strong outlier for low x
x_low = np.append(x, -4)
y_low = np.append(y2, 160)

# calculate weights for sets with low and high outlier
# note: the noise std scales with (x + 5), so inverse-variance weights would be
# 1 / (x + 5)**2; the 1 / (x + 5) weights used here still down-weight high-x points
sample_weights_low = 1 / (x_low + 5)
sample_weights_high = 1 / (x_high + 5)

# reshape for compatibility (scikit-learn expects a 2-D feature array)
X_low = x_low.reshape(-1, 1)
X_high = x_high.reshape(-1, 1)

# OLS on the data with the low-x outlier
model = LinearRegression()
model.fit(X_low, y_low)

# fit WLS using sample_weights
WLS = LinearRegression()
WLS.fit(X_low, y_low, sample_weight=sample_weights_low)

sns.regplot(x=x_low, y=y_low)

print(model.intercept_, model.coef_)
print('WLS')
print(WLS.intercept_, WLS.coef_)

# OLS and WLS on the data with the high-x outlier
model = LinearRegression()
model.fit(X_high, y_high)
WLS = LinearRegression()
WLS.fit(X_high, y_high, sample_weight=sample_weights_high)
print(model.intercept_, model.coef_)
print('WLS')
print(WLS.intercept_, WLS.coef_)

# OLS and WLS on the original homoscedastic data, for comparison
model = LinearRegression()
model.fit(x.reshape(-1, 1), y)
WLS = LinearRegression()
sample_weights = 1 / (x + 5)
WLS.fit(x.reshape(-1, 1), y, sample_weight=sample_weights)
print(model.intercept_, model.coef_)
print('WLS')
print(WLS.intercept_, WLS.coef_)

plt.show()
```
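
To make explicit what a weighted fit computes, here is a minimal NumPy-only sketch of WLS through the weighted normal equations. It is an illustration rather than the post's method: `wls_fit` is a hypothetical helper named here for convenience, and the data are regenerated with a different random generator, so the numbers will not match the output of the code above. The result is cross-checked against `np.polyfit`, whose `w` argument multiplies the residuals before squaring and therefore takes the square root of the weights.

```python
import numpy as np

def wls_fit(x, y, weights):
    """Return (intercept, slope) minimizing sum(weights * (y - a - b*x)**2).

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # design matrix with an intercept column
    XtW = X.T * w                               # equals X.T @ diag(w) without building diag(w)
    beta = np.linalg.solve(XtW @ X, XtW @ y)    # weighted normal equations
    return beta[0], beta[1]

rng = np.random.default_rng(24)
x = rng.uniform(-5, 5, 25)
y2 = 2 * x + 2 * rng.standard_normal(25) * (x + 5)   # noise std proportional to (x + 5)

weights = 1 / (x + 5)                                # same weighting scheme as in the post
intercept, slope = wls_fit(x, y2, weights)

# Cross-check: np.polyfit applies w to the unsquared residuals, so pass sqrt(weights)
slope_pf, intercept_pf = np.polyfit(x, y2, deg=1, w=np.sqrt(weights))

print(intercept, slope)
print(intercept_pf, slope_pf)
```

Passing `np.ones_like(x)` as `weights` reduces the helper to an ordinary least-squares fit, which is a quick way to see how much the weighting changes the line.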

posted @ 2020-12-16 16:45  blog_hfg