Basic XGBoost Usage
https://www.jianshu.com/p/e119f00bd93f
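The code below uses the xgboost and sklearn packages. This post does not cover how to download or install them; search for the installation instructions yourself. It also does not explain the parameters, although it points to documentation describing what each one means, and the code does not perform any parameter search.
Parameter reference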
# coding: utf-8
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn import preprocessing


def train(mall_id, X, shop_ids, TEST, row_ids):
    """
    mall_id: mall ID (e.g. m_23232)
    X: training feature vectors
    shop_ids: shop labels (e.g. s_223234)
    TEST: test feature vectors
    row_ids: test-set row IDs
    """
    # Encode the raw labels as integers for training; shop_ids is a list
    lbl = preprocessing.LabelEncoder()
    lbl.fit(shop_ids)
    y = lbl.transform(shop_ids)
    class_num = y.max() + 1  # number of classes
    # Split into training and validation sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    # The missing-value marker is a DMatrix argument, not a booster parameter
    xg_train = xgb.DMatrix(X_train, label=y_train, missing=-999)
    xg_test = xgb.DMatrix(X_test, label=y_test, missing=-999)
    watchlist = [(xg_train, 'train'), (xg_test, 'test')]
    # Define the parameters
    params = {
        'objective': 'multi:softmax',
        'eta': 0.1,
        'max_depth': 9,
        'eval_metric': 'merror',
        'seed': 0,
        'num_class': class_num,  # XGBoost names this parameter num_class
        'verbosity': 0,  # replaces the deprecated 'silent': 1
    }
    # Train
    bst = xgb.train(params, xg_train, 60, watchlist, early_stopping_rounds=15)
    # Predict per-class probabilities (requires the 'multi:softprob' objective)
    # pred_prob = bst.predict(xg_test).reshape(X_test.shape[0], class_num)
    # Predict labels
    pred = bst.predict(xg_test)
    # Print the accuracy
    acc = (y_test == pred).mean()
    print('accuracy', acc)
    # Convert the predictions back to the original labels
    pred = lbl.inverse_transform(pred.astype(int))
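Author: 衣介书生
Link: https://www.jianshu.com/p/e119f00bd93f
Source: Jianshu (简书)
Copyright on Jianshu belongs to the author. For reproduction in any form, contact the author for authorization and cite the source.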
