Chapter 4: Building Features with PolynomialFeatures

Examples:
https://www.kaggle.com/discdiver/category-encoders-examples
Explanation:
https://towardsdatascience.com/smarter-ways-to-encode-categorical-data-for-machine-learning-part-1-of-3-6dca2f71b159

 

Use sklearn.preprocessing.PolynomialFeatures to construct new features.

It builds polynomial combinations of the input features. Given two features a and b, the degree-2 expansion is (1, a, b, a^2, ab, b^2).
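As a quick check of the expansion above, the following minimal sketch transforms a single sample with a = 2 and b = 3. These values and the variable names are chosen here purely for illustration and do not come from the original article.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features: a = 2, b = 3 (illustrative values).
sample = np.array([[2.0, 3.0]])

# Default settings: degree=2, interaction_only=False, include_bias=True.
poly_ab = PolynomialFeatures(degree=2)
expanded = poly_ab.fit_transform(sample)

# Column order is 1, a, b, a^2, ab, b^2.
print(expanded.tolist())  # [[1.0, 2.0, 3.0, 4.0, 6.0, 9.0]]
```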

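PolynomialFeatures has three main parameters:

degree: the degree of the polynomial. The default is 2.

interaction_only: defaults to False. If set to True, terms that combine a feature with itself are omitted, so the degree-2 expansion above has no a^2 or b^2.

include_bias: defaults to True. If True, the constant term 1 shown above is included.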
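The effect of each parameter can be seen on the same two-feature sample. This is a small sketch: the sample values a = 2, b = 3 and all variable names are illustrative assumptions, not part of the original article.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

sample = np.array([[2.0, 3.0]])  # a = 2, b = 3 (illustrative values)

# interaction_only=True drops the pure powers a^2 and b^2, leaving 1, a, b, ab.
inter = PolynomialFeatures(degree=2, interaction_only=True)
inter_row = inter.fit_transform(sample).tolist()
print(inter_row)  # [[1.0, 2.0, 3.0, 6.0]]

# include_bias=False drops the constant column, leaving a, b, a^2, ab, b^2.
no_bias = PolynomialFeatures(degree=2, include_bias=False)
no_bias_row = no_bias.fit_transform(sample).tolist()
print(no_bias_row)  # [[2.0, 3.0, 4.0, 6.0, 9.0]]

# degree=3 adds the cubic terms; n_output_features_ reports the column count.
cubic = PolynomialFeatures(degree=3).fit(sample)
print(cubic.n_output_features_)  # 10
```

The number of output columns grows quickly with degree, so a higher degree is worth trying only when the number of input features is small.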

 

import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Data: https://archive.ics.uci.edu/ml/datasets/Activity+Recognition+from+Single+Chest-Mounted+Accelerometer
path = r"activity_recognizer\1.csv"
df = pd.read_csv(path, header=None)
df.columns = ['index', 'x', 'y', 'z', 'activity']

# KNN setup (not used by the feature-construction step below)
knn = KNeighborsClassifier()
knn_params = {'n_neighbors': [3, 4, 5, 6]}
X = df[['x', 'y', 'z']]
y = df['activity']

# Degree-2 expansion without the constant column
poly = PolynomialFeatures(degree=2, include_bias=False, interaction_only=False)
X_poly = poly.fit_transform(X)
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out().
# Fitted on a DataFrame, it reports the column names (x, y, z) instead of x0, x1, x2.
X_poly_df = pd.DataFrame(X_poly, columns=poly.get_feature_names_out())
print(X_poly_df.head())

 


Output (recorded with the older get_feature_names(), which labels the columns x0, x1, x2):

       x0      x1      x2       x0^2      x0 x1      x0 x2       x1^2  \
0  1502.0  2215.0  2153.0  2256004.0  3326930.0  3233806.0  4906225.0
1  1667.0  2072.0  2047.0  2778889.0  3454024.0  3412349.0  4293184.0
2  1611.0  1957.0  1906.0  2595321.0  3152727.0  3070566.0  3829849.0
3  1601.0  1939.0  1831.0  2563201.0  3104339.0  2931431.0  3759721.0
4  1643.0  1965.0  1879.0  2699449.0  3228495.0  3087197.0  3861225.0

       x1 x2       x2^2
0  4768895.0  4635409.0
1  4241384.0  4190209.0
2  3730042.0  3632836.0
3  3550309.0  3352561.0
4  3692235.0  3530641.0
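The code above imports Pipeline, GridSearchCV and KNeighborsClassifier without using them. One plausible next step, sketched here as an assumption rather than as the author's method, is to put the polynomial expansion and the KNN classifier in a pipeline and grid-search both sets of parameters. The data below is a synthetic stand-in (not the accelerometer dataset), and the step names 'poly' and 'knn' are arbitrary labels chosen for this sketch.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the accelerometer data (NOT the real dataset):
# 200 samples, three features, and a label that depends on a product term.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = (X_demo[:, 0] * X_demo[:, 1] > 0).astype(int)

# Step names 'poly' and 'knn' are arbitrary labels chosen for this sketch.
pipe = Pipeline([
    ('poly', PolynomialFeatures(include_bias=False)),
    ('knn', KNeighborsClassifier()),
])

# Grid keys use the '<step name>__<parameter>' convention.
param_grid = {
    'poly__degree': [1, 2],
    'poly__interaction_only': [True, False],
    'knn__n_neighbors': [3, 4, 5, 6],
}

grid = GridSearchCV(pipe, param_grid, cv=3)
grid.fit(X_demo, y_demo)
print(grid.best_params_, round(grid.best_score_, 3))
```

Wrapping the transformer in the pipeline means the expansion is refitted inside each cross-validation fold, and the search can compare models with and without the added polynomial terms.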

 

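----------------
Copyright notice: this is an original article by CSDN blogger 「xtiange」, released under the CC 4.0 BY-SA license. When reposting, please include the original source link and this notice.
Original link: https://blog.csdn.net/tiange_xiao/article/details/79755793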

Posted 2021-05-12 16:54 by 锦绣良缘