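Artificial Neural Network (ANN)
Model highlights
- Initial test-set score was 0.29; after parameter tuning the test-set score reached 0.98
- The dataset ships with sklearn
-----------------------------------------Model implementation follows-----------------------------------------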
Step1. Load the data
- Use load_iris to load the iris dataset

from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
x = iris.data    # feature matrix
y = iris.target  # class labels

# Wrap the arrays in DataFrames with readable column names
df_x = pd.DataFrame(x)
df_y = pd.DataFrame(y)
df_x.columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
df_y.columns = ['class']
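As a quick sanity check on what load_iris returns, the sketch below puts features and labels into one DataFrame and prints its size and class balance. The combined frame `df` is an illustrative name, not part of the original post.

```python
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()

# One frame holding the four features plus the label column
df = pd.DataFrame(
    iris.data,
    columns=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
)
df['class'] = iris.target

print(df.shape)                                 # (150, 5)
print(list(iris.target_names))                  # names behind labels 0, 1, 2
print(df['class'].value_counts().sort_index())  # 50 samples per class
```

The classes are perfectly balanced, which is why plain accuracy is a reasonable score for this dataset.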


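Step2. Split into training and test sets
why split into training and test sets?
- If every sample is used for training, the model is only quizzed on questions it has already seen, so of course it gets them all right~
- Train on part of the samples and test on new questions to check how well the model really performs~

from sklearn.model_selection import train_test_split

# Hold out 30% of the samples for testing; random_state fixes the shuffle
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1)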
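To see what the split produces, the sketch below repeats it and prints the resulting sizes. The second call adds `stratify=y`, which is an optional variation not used in the original post; it keeps the class proportions identical in both parts.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import numpy as np

x, y = load_iris(return_X_y=True)

# Same split as in the post: 70% train, 30% test
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=1)
print(len(x_train), len(x_test))  # 105 45

# Optional variation (an assumption, not in the post): stratified split
x_tr_s, x_te_s, y_tr_s, y_te_s = train_test_split(
    x, y, test_size=0.3, random_state=1, stratify=y)
print(np.bincount(y_te_s))  # [15 15 15]
```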
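Step3. Start the classifier
how to start the classifier?
- Set the initial parameters
- Fit the training data

from sklearn.neural_network import MLPClassifier

# Initial parameters: two hidden layers with a single neuron each
model = MLPClassifier(hidden_layer_sizes=(1, 1), activation='relu', solver='adam',
                      alpha=0.0001, learning_rate_init=0.001)
model.fit(x_train, y_train)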
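To make the initial setting hidden_layer_sizes=(1,1) concrete, the sketch below fits the same kind of network and prints its layer count and weight shapes. `random_state=0` and the warning filter are additions for a reproducible, quiet run and are not in the original.

```python
import warnings
from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=1)

model = MLPClassifier(hidden_layer_sizes=(1, 1), random_state=0)
with warnings.catch_warnings():
    # The tiny network may not converge within the default iterations
    warnings.simplefilter("ignore", ConvergenceWarning)
    model.fit(x_train, y_train)

print(model.n_layers_)                  # 4: input, two hidden, output
print([w.shape for w in model.coefs_])  # [(4, 1), (1, 1), (1, 3)]
```

All four features are squeezed through a single neuron twice before reaching the three outputs, which is a plausible reason for the low initial score.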
Step4. Evaluate the model
what is model evaluation?
- For classification, model.score -> accuracy_score
- For regression, model.score -> r2_score

print("Training-set score:", round(model.score(x_train, y_train), 2))
print("Test-set score:", round(model.score(x_test, y_test), 2))
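The mapping model.score -> accuracy_score can be checked directly. The sketch below compares the two on the test set; the network size, `max_iter` and `random_state` are illustrative assumptions, not the post's settings.

```python
import warnings
from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=1)

model = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    model.fit(x_train, y_train)

# score() on a classifier is accuracy computed on its own predictions
score = model.score(x_test, y_test)
manual = accuracy_score(y_test, model.predict(x_test))
print(score == manual)  # True
```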
Step5. Tune the parameters
why tune the parameters? how to tune them?
- To raise the test-set score
- Random search, which suits a large parameter space

from sklearn.model_selection import RandomizedSearchCV
import numpy as np

# Candidate hidden-layer sizes as (first, second) tuples:
# first layer 1~20 neurons, second layer 1~10 neurons
options = []
for i in range(1, 21):
    for j in range(1, 11):
        options.append((i, j))

params = {
    'hidden_layer_sizes': options,
    'activation': ['identity', 'logistic', 'tanh', 'relu'],
    'solver': ['lbfgs', 'sgd', 'adam'],
    'alpha': np.arange(0.0001, 10.1, 0.01),
    'learning_rate_init': np.arange(0.001, 0.01, 0.001),
}

# Try 100 random combinations with 3-fold cross-validation
model = RandomizedSearchCV(model, params, n_iter=100, cv=3)
model.fit(x_train, y_train)
print("Best parameters:", model.best_params_)
print("Best score:", round(model.best_score_, 2))  # mean cross-validated score
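Note that best_score_ is the mean cross-validation score on the training data, not the test-set score. The sketch below runs a smaller search and then scores the refitted best model on the held-out test set. `n_iter=10`, `max_iter=500`, `random_state=0`, the `itertools.product` shortcut and the name `search` are assumptions made to keep the example quick and reproducible.

```python
import warnings
from itertools import product

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier

x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=1)

# Same 20 x 10 grid of (first, second) hidden-layer sizes
options = list(product(range(1, 21), range(1, 11)))
params = {
    'hidden_layer_sizes': options,
    'activation': ['identity', 'logistic', 'tanh', 'relu'],
    'solver': ['lbfgs', 'sgd', 'adam'],
    'alpha': np.arange(0.0001, 10.1, 0.01),
    'learning_rate_init': np.arange(0.001, 0.01, 0.001),
}

search = RandomizedSearchCV(MLPClassifier(max_iter=500), params,
                            n_iter=10, cv=3, random_state=0)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide convergence warnings
    search.fit(x_train, y_train)

cv_score = search.best_score_              # mean accuracy over the 3 folds
test_score = search.score(x_test, y_test)  # accuracy on unseen data
print("Best parameters:", search.best_params_)
print("CV score:", round(cv_score, 2), "Test score:", round(test_score, 2))
```

With only 10 sampled combinations the result may be worse than the post's 100-iteration search; raising `n_iter` trades time for coverage.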
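Step6. Save the best model
why save the model? how to save it?
- for job-lib (job liberty)~
- dump writes the model to disk in pkl format
- load reads the model back

import joblib  # sklearn.externals.joblib has been removed from recent scikit-learn releases

joblib.dump(model, r'd:\ann.pkl')  # raw string, so \a is not treated as an escape
new_model = joblib.load(r'd:\ann.pkl')
new_model.predict(x_test)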
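A minimal round-trip check, assuming the standalone joblib package is installed: the sketch below saves a fitted model to a temporary directory instead of d:\, reloads it, and confirms that both copies predict the same labels. The small model and `random_state` are illustrative, not the tuned model from the post.

```python
import tempfile
import warnings
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=1)

model = MLPClassifier(hidden_layer_sizes=(10,), random_state=0)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide convergence warnings
    model.fit(x_train, y_train)

# Portable path in a temporary directory (removed when the block exits)
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "ann.pkl"
    joblib.dump(model, path)
    restored = joblib.load(path)

same = np.array_equal(model.predict(x_test), restored.predict(x_test))
print(same)  # True
```

A pkl file should be loaded with the same scikit-learn version that wrote it, and only from a trusted source, since loading a pickle can execute arbitrary code.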
-END
