Hyperparameters of the KNN Algorithm
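1: Definition
A hyperparameter is a parameter whose value is set before the learning process starts, as opposed to a parameter whose value is learned from the training data.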
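2: Common hyperparameters
The k of k-nearest neighbors (n_neighbors), the voting weights (weights), and the exponent p of the Minkowski distance (p). All three are arguments of the KNeighborsClassifier constructor.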
3: Shared code
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn import datasets

digits = datasets.load_digits()
x = digits.data
y = digits.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
4: Best value of k
best_score = 0.0
best_k = -1
# Try k = 1..10 and keep the value with the highest test accuracy
for k in range(1, 11):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(x_train, y_train)
    t = knn.score(x_test, y_test)
    if t > best_score:
        best_score = t
        best_k = k
print(best_k)
print(best_score)
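5: Best value of weights
With weights='uniform', every neighbor casts one equal vote. For example, with k = 3, if the three nearest points belong to three different classes, the vote is tied and sklearn has to pick one of them as the prediction. With weights='distance', each neighbor's vote carries a weight, usually the inverse of its distance. For example, if the query point is at distances 1, 3 and 4 from its three neighbors, their weights are 1, 1/3 and 1/4, so the class of the neighbor at distance 1 is returned as the prediction.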
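The weighting rule above can be sketched in plain Python. This is only an illustration of the idea, not sklearn's implementation: the function name weighted_vote and the class labels "A", "B" and "C" are made up for the example, and it assumes every distance is greater than zero.

```python
from collections import defaultdict


def weighted_vote(neighbors, weights="uniform"):
    """Sum the votes per class for a list of (distance, label) neighbors.

    Hypothetical helper for illustration; assumes all distances are > 0.
    """
    scores = defaultdict(float)
    for distance, label in neighbors:
        if weights == "uniform":
            # Every neighbor counts the same
            scores[label] += 1.0
        elif weights == "distance":
            # Closer neighbors count more: weight = 1 / distance
            scores[label] += 1.0 / distance
        else:
            raise ValueError("weights must be 'uniform' or 'distance'")
    return dict(scores)


# Three neighbors at distances 1, 3 and 4, each from a different class
neighbors = [(1.0, "A"), (3.0, "B"), (4.0, "C")]

uniform_scores = weighted_vote(neighbors, "uniform")
distance_scores = weighted_vote(neighbors, "distance")

# With distance weighting the nearest neighbor's class wins
winner = max(distance_scores, key=distance_scores.get)

print(uniform_scores)
print(distance_scores)
print(winner)
```

With uniform weights all three classes end up with the same score, so the vote alone cannot decide. With distance weights the scores are 1, 1/3 and 1/4, and the class of the nearest neighbor wins.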
best_score = 0.0
best_k = -1
best_method = ''
# Search over both weighting schemes and k = 1..10
for method in ['uniform', 'distance']:
    for k in range(1, 11):
        knn = KNeighborsClassifier(n_neighbors=k, weights=method)
        knn.fit(x_train, y_train)
        t = knn.score(x_test, y_test)
        if t > best_score:
            best_score = t
            best_k = k
            best_method = method
print(best_score)
print(best_k)
print(best_method)
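6: Best value of p
p is the exponent of the Minkowski distance: p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance. The search below fixes weights='distance' while varying p. sklearn also accepts p together with weights='uniform', since p only changes how distances are measured.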
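For reference, the Minkowski distance between two points a and b is (sum over i of |a_i - b_i|^p)^(1/p). A minimal sketch of that formula in plain Python follows. The function name minkowski_distance and the two sample points are chosen only for this example; sklearn computes the distance internally.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance between two equal-length sequences of numbers.

    Hypothetical helper for illustration only.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


a = (0.0, 0.0)
b = (3.0, 4.0)

manhattan = minkowski_distance(a, b, 1)  # |3| + |4|
euclidean = minkowski_distance(a, b, 2)  # sqrt(3^2 + 4^2)
cubic = minkowski_distance(a, b, 3)      # larger p stresses the biggest coordinate gap

print(manhattan, euclidean, cubic)
```

As p grows, the largest coordinate difference dominates the result, which is why different values of p can change which points count as the nearest neighbors.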
best_score = 0.0
best_k = -1
best_p = 1
# Search over p = 1..5 and k = 1..10 with distance-weighted voting
for p in range(1, 6):
    for k in range(1, 11):
        knn = KNeighborsClassifier(n_neighbors=k, weights='distance', p=p)
        knn.fit(x_train, y_train)
        t = knn.score(x_test, y_test)
        if t > best_score:
            best_k = k
            best_score = t
            best_p = p
print(best_p)
print(best_score)
print(best_k)
