Posts in category: sklearn

Summary:
from sklearn.externals import joblib
import pandas as pd
import numpy
from sklearn.preprocessing import OneHotEncoder
#import link_and_train
# Concatenate the test set and one-hot encode it
onehot = OneHot... Read more
posted @ 2018-04-25 11:31 我想休息 Views (412) Comments (0) Recommended (0)
Summary:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.externals import joblib
onehot = OneHotEncoder()
for b in range(1,115):
    addata = pd.read_csv("adFeature.csv")
    ... Read more
posted @ 2018-04-25 11:23 我想休息 Views (331) Comments (0) Recommended (0)
Summary:
import pandas as pd
user_feature_data = []
flag = 1
print("landing")
with open("userFeature.data","r",encoding="utf-8") as f :
    for i,line in enumerate(f) :
        line = li... Read more
posted @ 2018-04-23 17:46 我想休息 Views (187) Comments (0) Recommended (0)
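Summary: Competition task: https://pan.baidu.com/s/1Re0k81XieiXFI6hkwgL8oA Analysis: 1. There are many users and many ads. One user may act on several ads, and one ad may be clicked by several users, which is clearly awkward to handle. Suppose instead that there is only one ad. For any given user it then has only two possible outcomes, clicked or not clicked, and this... Read more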
posted @ 2018-04-23 17:40 我想休息 Views (323) Comments (0) Recommended (1)
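
The single-ad simplification in the analysis above can be illustrated with a small sketch. The post itself is truncated, so this is only one reading of it: fix one ad at a time, and each user then carries a two-valued label for that ad. The record layout `(aid, uid, label)`, the 1/0 label values, the sample data, and the helper name `split_by_ad` are all assumptions made for illustration. The sketch uses only the standard library.

```python
from collections import defaultdict

# Hypothetical click log: (ad id, user id, label), where label 1 = clicked
# and 0 = not clicked. The layout and values are illustrative assumptions.
log = [(10, 1, 1), (10, 2, 0), (10, 3, 0), (20, 1, 0), (20, 3, 1)]


def split_by_ad(records):
    """Group the log by ad so that each ad becomes its own two-outcome problem."""
    per_ad = defaultdict(list)
    for aid, uid, label in records:
        per_ad[aid].append((uid, label))
    return dict(per_ad)


problems = split_by_ad(log)
for aid, rows in sorted(problems.items()):
    print(aid, rows)
```

Each value in `problems` is a list of users with a clicked / not-clicked label for a single ad, which is the shape a two-class classifier expects.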
Summary:
from sklearn.preprocessing import OneHotEncoder
import numpy
onehot = OneHotEncoder()
# Build a mapping that represents co-occurring feature values as a single number, e.g. for features {a,b}: a is 1, b is 2, ab is 3 ----(1)
import pandas
data = pandas.read_csv("... Read more
posted @ 2018-04-23 17:00 我想休息 Views (1666) Comments (0) Recommended (0)
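
The mapping described in the code comment above (a is 1, b is 2, ab is 3) might look like the sketch below. The author's actual implementation is cut off in the summary, so this is an assumed reading: each distinct combination of values gets one integer code in order of first appearance, and the code is then one-hot encoded. The helper names `build_combo_codes` and `one_hot` and the sample rows are hypothetical. A plain-Python one-hot stands in for `OneHotEncoder` to keep the sketch free of dependencies.

```python
def build_combo_codes(rows):
    """Assign one integer code to each distinct combination of feature values.

    Codes start at 1 and are handed out in order of first appearance.
    The order of values inside a row does not matter.
    """
    codes = {}
    for values in rows:
        key = frozenset(values)
        if key not in codes:
            codes[key] = len(codes) + 1
    return codes


def one_hot(code, n_codes):
    """Return a one-hot list of length n_codes for a 1-based code."""
    return [1 if i == code else 0 for i in range(1, n_codes + 1)]


# Hypothetical multi-valued feature column: each row lists the values present.
rows = [["a"], ["b"], ["a", "b"], ["b", "a"]]

codes = build_combo_codes(rows)
encoded = [one_hot(codes[frozenset(r)], len(codes)) for r in rows]
```

One trade-off of this scheme is that the number of codes can grow quickly when many values co-occur, since every distinct combination becomes its own category.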