Python sklearn knowledge summary

Simple linear regression: using Python to predict the 2021 Double Eleven (Singles' Day) sales of an e-commerce platform (某宝)

import pandas as pd
import numpy as np

df = pd.read_excel("./历史双十一销售额.xlsx")

x = np.array(df.iloc[:, 0]).reshape(-1, 1)
y = np.array(df.iloc[:, 1])

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

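# Build the polynomial regression as a pipeline:
# expand features to degree 2, standardize them, then fit a linear model
poly_reg = Pipeline([
    ('poly', PolynomialFeatures(degree=2)),
    ('std_scaler', StandardScaler()),
    ('lin_reg', LinearRegression())
])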

poly_reg.fit(x, y)

# Model score: R^2, computed here on the same data used for fitting
poly_reg.score(x, y)

current_year = 2021

predict = poly_reg.predict([[current_year]])

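# DataFrame.append was removed in pandas 2.0, so add the predicted row with pd.concat
new_row = pd.DataFrame([{"年份": current_year, "成交总额": predict[0]}])
df_new = pd.concat([df, new_row], ignore_index=True)
df_new["成交总额"] = df_new["成交总额"].round(2)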

df_new

import plotly.express as px

fig = px.line(df_new, x="年份", y="成交总额", text="成交总额")
fig.update_traces(textposition="top center")
fig.show()

Evaluating a regression model

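import matplotlib.pyplot as plt

# model is an already fitted regressor; X_test and y_test are the held-out test split
predicted = model.predict(X_test)

# Plot predictions against the true values, sample by sample
plt.figure()
plt.plot(range(len(y_test)), predicted)      # predicted values (solid line)
plt.plot(range(len(y_test)), y_test, 'r-.')  # true values (red dash-dot line)
plt.show()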
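
The plot above gives a visual check only. A minimal sketch of a numeric evaluation follows, using the metrics in sklearn.metrics. The synthetic data (y = 3x + 2 plus noise), the 80/20 split, and all variable names are assumptions made for illustration; they are not from the original notes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data: y = 3x + 2 plus Gaussian noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 0.5, size=200)

# Hold out 20% of the samples for evaluation (assumed ratio)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
predicted = model.predict(X_test)

# Common regression metrics, all computed on the held-out test set
mae = mean_absolute_error(y_test, predicted)  # mean absolute error
mse = mean_squared_error(y_test, predicted)   # mean squared error
r2 = r2_score(y_test, predicted)              # coefficient of determination
print(f"MAE={mae:.3f}  MSE={mse:.3f}  R2={r2:.3f}")
```

Lower MAE and MSE and an R2 closer to 1 indicate a better fit. Scoring on the test split rather than the training data gives a more honest picture of how the model generalizes.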

Splitting the dataset: from sklearn.model_selection import train_test_split

data_train,data_test, target_train,target_test = train_test_split(data,target,test_size=,random_state=42)
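
A minimal runnable sketch of the split is shown here. The toy arrays and the value test_size=0.2 are assumptions chosen for illustration; the notes above do not say which ratio was used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 2 features each, and one target per sample
data = np.arange(20).reshape(10, 2)
target = np.arange(10)

# test_size=0.2 is an assumed ratio; random_state fixes the shuffle for reproducibility
data_train, data_test, target_train, target_test = train_test_split(
    data, target, test_size=0.2, random_state=42
)

print(data_train.shape, data_test.shape)  # (8, 2) (2, 2)
```

Rows of data and entries of target are shuffled together, so each sample keeps its own target after the split.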

A transformer has three methods: fit (learns the rules from the data), transform (applies the learned rules), and fit_transform (does both in one call)
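
The three methods can be sketched with StandardScaler as the transformer. The tiny arrays are made up for illustration. The point to notice is that the rules (mean and standard deviation) are learned from the training data only and then reused on the test data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data
X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[4.0]])

scaler = StandardScaler()
scaler.fit(X_train)                         # generate the rules: mean and std of X_train
X_train_scaled = scaler.transform(X_train)  # apply the rules to the training data
X_test_scaled = scaler.transform(X_test)    # apply the SAME rules to the test data

# fit_transform is equivalent to fit followed by transform on the same data
X_train_scaled_2 = StandardScaler().fit_transform(X_train)

print(scaler.mean_, scaler.scale_)
print(X_test_scaled)
```

Calling fit_transform on the test data would learn new rules from it, so the test set is normally passed through transform only.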

Use transformers for data preprocessing and PCA dimensionality reduction
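
As a rough sketch of that workflow, the snippet below standardizes the features and then reduces them with PCA. The random 5-feature data and the choice of n_components=2 are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 100 samples, 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Step 1: preprocessing, standardize every feature to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# Step 2: PCA is also a transformer, so it uses the same fit/transform interface
pca = PCA(n_components=2)  # keep 2 principal components (assumed value)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                # (100, 2)
print(pca.explained_variance_ratio_)  # share of variance kept by each component
```

Standardizing first is a common choice because PCA is sensitive to the scale of the features.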

An estimator has two methods: fit and predict
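
A minimal sketch of the two methods, using LinearRegression as the estimator. The data points are made up and lie exactly on the line y = 2x + 1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data lying exactly on y = 2x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

estimator = LinearRegression()
estimator.fit(X, y)                                # learn the model from the data
prediction = estimator.predict(np.array([[5.0]]))  # predict for an unseen sample

print(prediction)  # approximately 11, since 2 * 5 + 1 = 11
```

Other sklearn estimators, such as classifiers and clustering models, follow the same fit then predict pattern.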

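FMI evaluation method: used to evaluate clustering models. The greater the similarity within groups and the greater the difference between groups, the better the clustering result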
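
In sklearn the FMI (Fowlkes-Mallows index) is available as fowlkes_mallows_score. It compares predicted cluster labels with known true labels, so it needs ground truth, and it ranges from 0 to 1 with higher being better. Below is a small sketch under assumed conditions: two well-separated synthetic blobs clustered with KMeans. The data, n_clusters=2, and the variable names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import fowlkes_mallows_score

# Hypothetical data: two well-separated 2-D blobs with known group membership
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(50, 2))
X = np.vstack([blob_a, blob_b])
labels_true = np.array([0] * 50 + [1] * 50)

# Cluster the points, then score the result against the true labels
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels_pred = kmeans.fit_predict(X)

fmi = fowlkes_mallows_score(labels_true, labels_pred)
print(f"FMI = {fmi:.3f}")
```

The score does not depend on how the clusters are numbered, only on which samples are grouped together.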

posted @ 2021-11-17 22:22 付明旭
