scikit-learn study notes
References:
A concise tutorial on the Python machine learning library scikit-learn: random forests
http://nbviewer.jupyter.org/github/donnemartin/data-science-ipython-notebooks/blob/master/kaggle/titanic.ipynb
scikit-learn (sklearn) 0.18 official documentation, Chinese edition
https://github.com/jakevdp/sklearn_pycon2015
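Official site: http://scikit-learn.org/stable/
Scikit-learn (sklearn): learning machine learning elegantly (莫烦 Python tutorial)
A concise tutorial on the Python machine learning library scikit-learn: the AdaBoost algorithm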
http://www.docin.com/p-1775095945.html
https://www.bilibili.com/video/av22530538/?p=6
https://github.com/Fdevmsy/Image_Classification_with_5_methods
https://github.com/huangchuchuan/SVM-HOG-images-classifier
https://blog.csdn.net/always2015/article/details/47100713
DBScan https://www.cnblogs.com/pinard/p/6208966.html
1. Using KNN
carto@cartoPC:~$ python
Python 2.7.12 (default, Dec 4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from sklearn import datasets
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.neighbors import KNeighborsClassifier
>>> iris=datasets.load_iris()
>>> iris_X=iris.data
>>> iris_y=iris.target
>>> print(iris_X[:2,:])
[[ 5.1 3.5 1.4 0.2]
[ 4.9 3. 1.4 0.2]]
>>> print(iris_y)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
>>> X_train,X_test,y_train,y_test=train_test_split(iris_X,iris_y,test_size=0.3)
>>> print(y_train)
[2 1 0 0 0 2 0 0 1 1 2 2 1 1 2 2 2 0 1 0 2 2 1 1 1 1 1 0 1 1 0 2 1 0 0 2 2
0 0 2 1 0 0 2 1 2 1 2 1 1 1 2 1 2 0 2 0 1 1 2 1 0 1 2 2 0 2 2 1 0 1 1 2 2
1 0 1 1 2 0 0 1 0 1 0 2 0 1 1 0 2 1 2 0 2 0 2 0 2 1 0 2 0 2 2]
>>> knn=KNeighborsClassifier()
>>> knn.fit()  # mistake: fit() needs the training features and labels
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: fit() takes exactly 3 arguments (1 given)
>>> knn.fit(X_train,y_train)
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=1, n_neighbors=5, p=2,
weights='uniform')
>>> print(knn.predict(X_test))
[1 1 2 0 1 1 1 1 2 0 0 2 0 1 0 0 0 1 2 2 2 2 0 1 2 0 1 2 2 0 1 2 0 0 1 0 0
0 0 1 0 1 1 2 0]
>>> print(y_test)
[1 1 2 0 1 1 1 1 2 0 0 2 0 1 0 0 0 1 2 2 2 2 0 1 2 0 1 2 2 0 2 2 0 0 2 0 0
0 0 1 0 1 1 2 0]
>>>
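The two printed arrays above can be compared directly to see how well the classifier did. The sketch below is only an illustration: it copies the values from the `knn.predict(X_test)` and `y_test` output of this session (the split was random, so another run will give different numbers), and the names `y_pred`, `y_true`, `mismatches`, and `accuracy` are chosen here, not taken from the original notes.

```python
import numpy as np

# Values copied from the printed knn.predict(X_test) output above
y_pred = np.array([1, 1, 2, 0, 1, 1, 1, 1, 2, 0, 0, 2, 0, 1, 0, 0, 0, 1, 2,
                   2, 2, 2, 0, 1, 2, 0, 1, 2, 2, 0, 1, 2, 0, 0, 1, 0, 0,
                   0, 0, 1, 0, 1, 1, 2, 0])
# Values copied from the printed y_test output above
y_true = np.array([1, 1, 2, 0, 1, 1, 1, 1, 2, 0, 0, 2, 0, 1, 0, 0, 0, 1, 2,
                   2, 2, 2, 0, 1, 2, 0, 1, 2, 2, 0, 2, 2, 0, 0, 2, 0, 0,
                   0, 0, 1, 0, 1, 1, 2, 0])

# Positions where the prediction differs from the true label
mismatches = np.flatnonzero(y_pred != y_true)
# Accuracy is the fraction of test samples predicted correctly
accuracy = np.mean(y_pred == y_true)

print("test samples: %d" % y_true.size)
print("misclassified positions: %s" % mismatches.tolist())
print("accuracy: %.4f" % accuracy)
```

For this particular run, 43 of the 45 test samples agree (about 0.956), and both errors are class 2 samples predicted as class 1. Inside the session, `knn.score(X_test, y_test)` returns the same mean accuracy without copying any values by hand.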
2. Using SVC
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import train_test_split

def load_data():
    # Load the iris dataset and hold out 10% of the samples for testing
    iris = datasets.load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, test_size=0.10, random_state=0)
    return X_train, X_test, y_train, y_test

def test_LinearSVC(X_train, X_test, y_train, y_test):
    # Fit a linear support vector classifier, then report its parameters and test accuracy
    cls = svm.LinearSVC()
    cls.fit(X_train, y_train)
    print('Coefficients:%s, intercept %s' % (cls.coef_, cls.intercept_))
    print('Score: %.2f' % cls.score(X_test, y_test))

if __name__ == "__main__":
    X_train, X_test, y_train, y_test = load_data()
    test_LinearSVC(X_train, X_test, y_train, y_test)
Running the script:
carto@cartoPC:~/python_ws$ python svmtest2.py
Coefficients:[[ 0.18424504 0.45123335 -0.80794237 -0.45071267]
 [-0.13381099 -0.75235247 0.57223898 -1.11494325]
 [-0.7943601 -0.95801711 1.31465593 1.8169808 ]], intercept [ 0.10956304 1.86593164 -1.72576407]
Score: 1.00
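The output shows a 3 x 4 coefficient matrix and three intercepts: LinearSVC trains one linear decision function per iris class (one-vs-rest), each with one weight per feature. The following is a rough sketch of how those numbers turn into predictions. The `random_state=0` and `max_iter=10000` arguments to `LinearSVC`, and the names `scores` and `manual_pred`, are assumptions added here for repeatability and are not part of the original script, so the fitted values may differ slightly from those printed above.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

# Same data split as the script above
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.10, random_state=0)

# random_state and max_iter are assumptions, not settings from the original script
cls = svm.LinearSVC(random_state=0, max_iter=10000)
cls.fit(X_train, y_train)

# One row of weights per class, one column per feature
print(cls.coef_.shape)
print(cls.intercept_.shape)

# Score every test sample against each class: X . W^T + b
scores = np.dot(X_test, cls.coef_.T) + cls.intercept_
# The predicted class is the one with the highest score
manual_pred = cls.classes_[np.argmax(scores, axis=1)]

print(np.array_equal(manual_pred, cls.predict(X_test)))
```

With `test_size=0.10` the test set holds only 15 of the 150 samples, so a score of 1.00 is a fairly coarse estimate of how the model generalizes.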
Author: 太一吾鱼水