Cross-validation in sklearn
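When splitting a dataset, keeping the class proportions the same in the training and test sets requires the stratify parameter of train_test_split. This parameter accepts array-like data, which is used as the class labels for the split (typically the label array y).
For K-fold cross-validation, keeping the class proportions the same in the training and validation sets of every fold requires StratifiedKFold.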
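from sklearn.model_selection import train_test_split, StratifiedKFold

# X, y, and param_space are assumed to be defined elsewhere.
# stratify=y keeps the class proportions of y in both the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

K = 2
kf = StratifiedKFold(n_splits=K)
for param in param_space:
    # Each fold preserves the class proportions of y_train.
    for train_ind, val_ind in kf.split(X_train, y_train):
        ...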
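The snippet above leaves X, y, param_space, and the loop body open. As a rough, self-contained sketch of how the pieces might fit together, the example below fills them in with assumptions that are not part of the original note: synthetic data from make_classification, a LogisticRegression model, and a hypothetical list of values for its C parameter. It also checks that the stratified split keeps the positive-class fraction close to that of the full dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, train_test_split

# Assumption: synthetic, imbalanced binary data (roughly 80% / 20%).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=42
)

# Stratified hold-out split: class proportions follow those of y.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fraction of positive samples in each subset, for comparison.
full_ratio = y.mean()
train_ratio = y_train.mean()
test_ratio = y_test.mean()
print(f"positive fraction: full={full_ratio:.3f}, "
      f"train={train_ratio:.3f}, test={test_ratio:.3f}")

K = 2
kf = StratifiedKFold(n_splits=K)

# Assumption: a hypothetical search space for LogisticRegression's C.
param_space = [0.1, 1.0, 10.0]

mean_scores = {}
for C in param_space:
    fold_scores = []
    for train_ind, val_ind in kf.split(X_train, y_train):
        model = LogisticRegression(C=C, max_iter=1000)
        model.fit(X_train[train_ind], y_train[train_ind])
        # Accuracy on the validation part of this fold.
        fold_scores.append(model.score(X_train[val_ind], y_train[val_ind]))
    mean_scores[C] = float(np.mean(fold_scores))

# Pick the value with the highest mean validation accuracy.
best_C = max(mean_scores, key=mean_scores.get)
print("mean validation accuracy per C:", mean_scores)
print("best C:", best_C)
```

The test set is only touched once, after best_C has been chosen on the training folds. Accuracy is used here for brevity; on imbalanced data a different metric may be a better choice.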