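Kaggle Tutorial 8: Pipelines
Pipelines
A pipeline combines data preprocessing and modeling into a single object, which keeps the code shorter and cleaner.
Most scikit-learn objects are either transformers (which provide a transform method) or models (which provide a predict method).
A pipeline must start with one or more transformers and end with a model.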
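To make the transformer/model distinction concrete, here is a minimal sketch. The tiny dataset is made up for illustration, and SimpleImputer is used as the transformer on the assumption that a recent scikit-learn version is installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline

# Tiny made-up dataset with one missing value (illustrative only).
X = np.array([[1.0, 2.0], [np.nan, 3.0], [4.0, 5.0], [6.0, 7.0]])
y = np.array([10.0, 20.0, 30.0, 40.0])

imputer = SimpleImputer()  # a transformer: fit / transform
model = RandomForestRegressor(n_estimators=10, random_state=0)  # a model: fit / predict

has_transform = hasattr(imputer, "transform")
has_predict = hasattr(model, "predict")

# Transformers go first, the model goes last.
pipeline = make_pipeline(imputer, model)
step_names = [name for name, _ in pipeline.steps]

pipeline.fit(X, y)
predictions = pipeline.predict(X)
print(step_names)
```

make_pipeline names each step after its lowercased class name, so the printed list shows the order in which the data flows through the steps.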
(For now, columns of object dtype must be encoded before the dataset is passed into the pipeline.)
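One possible way to do that encoding up front is one-hot encoding with pandas.get_dummies. This is only a sketch: the post does not say which encoding was used, and the column names and values below are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline

# Hypothetical data: "color" is an object-dtype column, "size" has a missing value.
raw_X = pd.DataFrame({
    "size": [1.0, np.nan, 3.0, 4.0],
    "color": ["red", "blue", "red", "green"],
})
y = pd.Series([10.0, 20.0, 30.0, 40.0])

# One-hot encode the object column before the data reaches the pipeline.
# dtype=float keeps every resulting column numeric.
encoded_X = pd.get_dummies(raw_X, columns=["color"], dtype=float)

my_pipeline = make_pipeline(
    SimpleImputer(),
    RandomForestRegressor(n_estimators=10, random_state=0),
)
my_pipeline.fit(encoded_X, y)
predictions = my_pipeline.predict(encoded_X)
```

When the training and test sets are encoded separately, their columns may not match, so the two frames usually need to be aligned before calling predict.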
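Example:
With a pipeline:
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer  # replaces Imputer, which was removed in scikit-learn 0.22
from sklearn.pipeline import make_pipeline

# Imputation and the model are fitted together as one object
my_pipeline = make_pipeline(SimpleImputer(), RandomForestRegressor())
my_pipeline.fit(train_X, train_y)
predictions = my_pipeline.predict(test_X)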
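Without a pipeline:
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer  # replaces Imputer, which was removed in scikit-learn 0.22

my_imputer = SimpleImputer()
my_model = RandomForestRegressor()
# Fit the imputer on the training data only, then reuse it on the test data
imputed_train_X = my_imputer.fit_transform(train_X)
imputed_test_X = my_imputer.transform(test_X)
my_model.fit(imputed_train_X, train_y)
predictions = my_model.predict(imputed_test_X)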
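The two versions above should produce the same predictions when the model's randomness is fixed. The following sketch checks this on synthetic data; the data, the random_state values, and the n_estimators setting are assumptions made for the demonstration and are not from the original example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline

# Synthetic data with a few missing values (illustrative only).
rng = np.random.RandomState(0)
train_X = rng.rand(40, 3)
train_y = train_X.sum(axis=1)
test_X = rng.rand(10, 3)
train_X[::7, 0] = np.nan
test_X[::4, 1] = np.nan

# With a pipeline
pipe = make_pipeline(
    SimpleImputer(),
    RandomForestRegressor(n_estimators=20, random_state=0),
)
pipe.fit(train_X, train_y)
pipe_pred = pipe.predict(test_X)

# Without a pipeline
imputer = SimpleImputer()
model = RandomForestRegressor(n_estimators=20, random_state=0)
model.fit(imputer.fit_transform(train_X), train_y)
manual_pred = model.predict(imputer.transform(test_X))

# Same steps, same seed, so the predictions should match.
same = np.allclose(pipe_pred, manual_pred)
print(same)
```

The pipeline version does the same work in fewer lines and makes it harder to forget a preprocessing step on the test data.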
posted on 2019-03-11 15:40 wangzhonghan