3.2 DS Summary

# Pandas

## Spark dataframe vs Pandas dataframe

  https://blog.csdn.net/weixin_31866177/article/details/120754456

## 100 pandas tricks

  https://blog.csdn.net/yuxiaosmd/article/details/114647974

## 50 pandas exercises

  https://www.jianshu.com/p/4250e35c2d45

  https://www.jianshu.com/p/b4338aa7bf55

  https://www.jianshu.com/p/3f3c0fb8fa4e

  https://www.jianshu.com/p/7ceafbe79ed4

  https://www.heywhale.com/mw/project/59e77a636d213335f38daec2

## read_csv error when reading a CSV file: 'utf-8' codec can't decode byte 0xbe in position 0

  https://blog.csdn.net/K_nightWang/article/details/79862206
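
This error usually means the file is not actually UTF-8. A minimal sketch, assuming the file was saved as GBK (common for CSVs exported on Chinese-locale Windows); the sample content and temporary file are hypothetical, and the real encoding of your file may differ:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample: a small CSV saved in GBK rather than UTF-8.
raw = "姓名,城市\n张三,北京\n".encode("gbk")
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as f:
    f.write(raw)
    path = f.name

# Passing the file's real encoding avoids the UnicodeDecodeError
# that the default encoding="utf-8" raises on these bytes.
df = pd.read_csv(path, encoding="gbk")
os.remove(path)
```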

## ValueError: Cannot mask with non-boolean array containing NA / NaN values

  https://blog.csdn.net/K_nightWang/article/details/79862206

## Filtering rows that contain a given string

  https://blog.csdn.net/qq_46240549/article/details/109553344
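
A minimal sketch of string filtering with `Series.str.contains`, on hypothetical data. Passing `na=False` treats missing values as non-matches, which also avoids the "Cannot mask with non-boolean array containing NA / NaN values" error:

```python
import pandas as pd

# Hypothetical data with one missing value.
df = pd.DataFrame({"name": ["apple pie", None, "banana", "pineapple"]})

# na=False turns missing entries into False so the mask is purely boolean.
mask = df["name"].str.contains("apple", na=False)
matched = df[mask]
```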

## Generating SQL statement strings

  https://blog.csdn.net/SeafyLiang/article/details/120262833

## Pandas exercises

  Reference: https://blog.csdn.net/cd_sywe/article/details/103150386

  https://www.jianshu.com/p/4ab8720071dd

## Iterating over pandas rows and columns

  https://blog.csdn.net/weixin_43115411/article/details/126030711

## pandas rows and columns not displayed in full

  https://blog.csdn.net/m0_46624667/article/details/109773960

  https://blog.csdn.net/weekdawn/article/details/81389865

  https://blog.csdn.net/wugou2014/article/details/129770484

## pandas to_dict usage explained

  Reference: https://www.jb51.net/article/141481.htm

## pandas error: 'Series' object has no attribute 'as_matrix'

  Reference: https://blog.csdn.net/weixin_44550865/article/details/105785876
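
`as_matrix` was removed in pandas 1.0. A short sketch of the usual replacements, on hypothetical data:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])

# to_numpy() is the recommended replacement for the removed as_matrix().
arr = s.to_numpy()

# .values also returns the underlying array and works on older versions.
arr_values = s.values
```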

## pandas error: index must be monotonic increasing or decreasing

  Reference: https://blog.csdn.net/weixin_44149358/article/details/110874865

## pandas error: 'Series' object has no attribute 'order'

  Reference: https://blog.csdn.net/NextAction/article/details/85097904
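
`Series.order()` no longer exists in current pandas; `sort_values()` is its replacement. A minimal sketch on hypothetical data:

```python
import pandas as pd

s = pd.Series([3, 1, 2])

# sort_values() replaces the old order() method.
ascending = s.sort_values()
descending = s.sort_values(ascending=False)
```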

## pandas error: ModuleNotFoundError: No module named 'pandas.io.data'

  Reference: https://blog.csdn.net/qq_23347459/article/details/104966818

## pandas: Timestamp object has no attribute dt

  Reference: https://stackoverflow.com/questions/62803633/timestamp-object-has-no-attribute-dt/62815103
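
The `.dt` accessor exists only on a datetime `Series`; a single `Timestamp` exposes the same fields directly. A short sketch with hypothetical dates:

```python
import pandas as pd

# A scalar Timestamp: read the attribute directly, without .dt.
ts = pd.Timestamp("2022-04-07 22:25")
year_scalar = ts.year

# A Series of datetimes: go through the .dt accessor.
s = pd.Series(pd.to_datetime(["2022-04-07", "2023-01-01"]))
years = s.dt.year
```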

## pandas: difference between unique() and nunique()

  Reference: https://blog.csdn.net/feizxiang3/article/details/93380525
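
In short, `unique()` returns the distinct values and `nunique()` returns how many there are. A minimal sketch on hypothetical data; note that `nunique()` skips missing values unless `dropna=False` is passed:

```python
import pandas as pd

s = pd.Series(["a", "b", "a", None])

distinct_values = s.unique()            # distinct values, missing value included
n_distinct = s.nunique()                # count of distinct values, missing excluded
n_with_na = s.nunique(dropna=False)     # count including the missing value
```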

## pandas: groupby explained

  Reference: https://zhuanlan.zhihu.com/p/101284491

## pandas: map, apply, and applymap explained

  Reference: https://zhuanlan.zhihu.com/p/100064394

## pandas: data type conversion

  Reference: https://blog.csdn.net/u010916338/article/details/82385618

## pandas: converting two columns into dictionary keys and values

  Reference: https://blog.csdn.net/mao15827639402/article/details/107832903
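
Two common ways to do this, sketched on a hypothetical frame with columns `code` and `value`:

```python
import pandas as pd

df = pd.DataFrame({"code": ["a", "b", "c"], "value": [1, 2, 3]})

# Option 1: zip the two columns together.
mapping_zip = dict(zip(df["code"], df["value"]))

# Option 2: make one column the index, then convert the other to a dict.
mapping_index = df.set_index("code")["value"].to_dict()
```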

## pandas: deleting rows that contain elements meeting a condition

  Reference: https://blog.csdn.net/u014636245/article/details/104202889
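
A minimal sketch of two equivalent approaches, using a hypothetical `score` column and an arbitrary threshold of 60:

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c"], "score": [90, 45, 70]})

# Option 1: keep only the rows that do NOT meet the deletion condition.
kept = df[df["score"] >= 60]

# Option 2: look up the index of the matching rows and drop them.
dropped = df.drop(df[df["score"] < 60].index)
```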

## pandas: checking whether a value is in a list

  Reference: https://blog.csdn.net/qq_38115310/article/details/103290582
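
`Series.isin` is the usual tool for this. A short sketch on hypothetical data; `~` negates the mask for a "not in" filter:

```python
import pandas as pd

df = pd.DataFrame({"city": ["Beijing", "Tokyo", "Shanghai", "Paris"]})
targets = ["Beijing", "Shanghai"]

in_list = df[df["city"].isin(targets)]
not_in_list = df[~df["city"].isin(targets)]
```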

## pandas: findall()

  https://blog.csdn.net/claroja/article/details/64927541

## Summary of str methods on pandas Series objects

  Reference: https://blog.csdn.net/weixin_43750377/article/details/107979607

## The as_index parameter of pandas groupby

  Reference: https://www.cnblogs.com/zhangzhixing/p/11074416.html
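
A minimal sketch of what `as_index` changes, on hypothetical data: by default the group key becomes the index of the result, while `as_index=False` keeps it as an ordinary column.

```python
import pandas as pd

df = pd.DataFrame({"team": ["x", "x", "y"], "pts": [1, 2, 3]})

# Default (as_index=True): result is a Series indexed by the group key.
by_index = df.groupby("team")["pts"].sum()

# as_index=False: result is a DataFrame with the key as a regular column.
by_column = df.groupby("team", as_index=False)["pts"].sum()
```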

## pandas error: xlrd.biffh.XLRDError: Excel xlsx file; not supported

  Reference: https://blog.csdn.net/weixin_44073728/article/details/111054157

## pandas error: TypeError: read_excel() got an unexpected keyword argument `sheetname`

  Reference: https://www.cnblogs.com/mmjing/p/11935889.html

## Usage of pandas.get_dummies

  Reference: https://blog.csdn.net/maymay_/article/details/80198468

## pandas error: Getting TypeError: reduction operation 'argmax' not allowed for this dtype when trying to use idxmax()

  Reference: https://stackoverflow.com/questions/48719937

## pandas error: ImportError: No module named 'pandas.tools'

  Reference: https://www.cnblogs.com/zhhy236400/p/11111036.html

## pandas function: diff

  Reference: https://www.cnblogs.com/anovana/p/10429237.html
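
`diff` subtracts an earlier element from each element. A short sketch on hypothetical data; the first `periods` positions have no predecessor and come out as NaN:

```python
import pandas as pd

s = pd.Series([1, 4, 9, 16])

step1 = s.diff()      # s[i] - s[i-1]
step2 = s.diff(2)     # s[i] - s[i-2]
```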

## Converting between DataFrame and list

  https://blog.csdn.net/liujingwei8610/article/details/125438336

## DataFrame reset_index()

  https://blog.csdn.net/longge_number1/article/details/117203253

## DataFrame window function rolling()

  https://blog.csdn.net/chinacmt/article/details/104757646
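
A minimal sketch of a rolling mean over a window of 3, on hypothetical data. By default a window needs all 3 observations, so the first two results are NaN; `min_periods=1` relaxes that:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# Full windows only: the first two positions are NaN.
strict = s.rolling(3).mean()

# Allow partial windows at the start.
relaxed = s.rolling(3, min_periods=1).mean()
```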

## pandas time series

  https://www.jianshu.com/p/93558d24509c

  https://www.jianshu.com/p/8d3d612afbb2

## The rank function

  https://blog.51cto.com/u_15671528/5358933
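
The `method` argument of `rank` controls how ties are handled. A short sketch on hypothetical data with one tie:

```python
import pandas as pd

s = pd.Series([10, 20, 10, 30])

average = s.rank()                 # ties share the average rank (default)
minimum = s.rank(method="min")     # ties share the lowest rank, with gaps after
dense = s.rank(method="dense")     # ties share a rank, no gaps after
```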

 

 

# Numpy

## Usage of numpy.random.randint

  Reference: https://blog.csdn.net/u011851421/article/details/83544853

## Fix for arrays not printing in full

  Reference: https://blog.csdn.net/hustwayne/article/details/84393485

## numpy stack(), hstack(), and vstack() explained

  Reference: https://blog.csdn.net/csdn15698845876/article/details/73380803
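
A minimal sketch comparing the three functions on two hypothetical 1-D arrays. `stack` adds a new axis, while `hstack` and `vstack` join along existing directions:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked_rows = np.stack([a, b])           # new leading axis: shape (2, 3)
stacked_cols = np.stack([a, b], axis=1)   # new trailing axis: shape (3, 2)
horizontal = np.hstack([a, b])            # end to end: shape (6,)
vertical = np.vstack([a, b])              # one row per input: shape (2, 3)
```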

## Introduction to numpy.random.uniform

  Reference: https://blog.csdn.net/u013920434/article/details/52507173

## Error: AxisError: axis 0 is out of bounds for array of dimension 0

  Reference: https://www.cnblogs.com/WMT-Azura/p/13632440.html

## Computing mean, variance, and standard deviation with numpy and pandas

  Reference: https://www.shangmayuan.com/a/88af74b8434f49b0ba255fe2.html

## numpy.reshape(-1,1)

  Reference: https://blog.csdn.net/qq_42804678/article/details/99062431
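
In `reshape`, `-1` means "infer this dimension from the array size". A short sketch on a hypothetical array:

```python
import numpy as np

a = np.arange(6)

column = a.reshape(-1, 1)   # one column, rows inferred: shape (6, 1)
row = a.reshape(1, -1)      # one row, columns inferred: shape (1, 6)
```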

## numpy array concatenation methods

  Reference: https://blog.csdn.net/zyl1042635242/article/details/43162031

## numpy exercises

  https://www.machinelearningplus.com/python/101-numpy-exercises-python/

## numpy: the squeeze() function

  https://blog.csdn.net/weixin_44001371/article/details/125008596
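
`squeeze` removes axes of length 1. A minimal sketch on a hypothetical array of shape `(1, 3, 1)`:

```python
import numpy as np

a = np.zeros((1, 3, 1))

all_squeezed = a.squeeze()          # drop every length-1 axis: shape (3,)
first_only = a.squeeze(axis=0)      # drop only axis 0: shape (3, 1)
```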

## Difference between mat and array multiplication in numpy
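
A minimal sketch of the difference, on a hypothetical 2x2 input. With an `ndarray`, `*` is elementwise and `@` is the matrix product; with a matrix object, `*` is already the matrix product. `np.asmatrix` is used here as the constructor on the assumption that it is available in your NumPy version:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
m = np.asmatrix(a)

array_elementwise = a * a                 # elementwise product
array_matmul = a @ a                      # matrix product
matrix_star = m * m                       # matrix product for matrix objects
matrix_elementwise = np.multiply(m, m)    # elementwise product for matrix objects
```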

 

# Statistics

## z-tests and t-tests in Python

  https://blog.csdn.net/robert_chen1988/article/details/103378351

  https://www.statsmodels.org/stable/generated/statsmodels.stats.weightstats.ztest.html#statsmodels-stats-weightstats-ztest

## Computing statistical power in Python

  https://www.statsmodels.org/dev/generated/statsmodels.stats.power.zt_ind_solve_power.html

  https://campus.datacamp.com/courses/practicing-statistics-interview-questions-in-python/statistical-experiments-and-significance-testing?ex=7
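
A minimal sketch using `zt_ind_solve_power` from the statsmodels page linked above. The function solves for whichever argument is left as `None`; the effect size, alpha, and power values below are illustrative assumptions, not recommendations:

```python
from statsmodels.stats.power import zt_ind_solve_power

# Solve for the sample size per group needed to reach 80% power.
n_per_group = zt_ind_solve_power(
    effect_size=0.5, alpha=0.05, power=0.8, ratio=1.0, alternative="two-sided"
)

# Solve for the power achieved with 64 observations per group.
achieved_power = zt_ind_solve_power(
    effect_size=0.5, nobs1=64, alpha=0.05, ratio=1.0, alternative="two-sided"
)
```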

## Computing probabilities under the normal distribution

  https://blog.csdn.net/a857553315/article/details/117554389
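
A minimal sketch using the standard library's `statistics.NormalDist` (the linked article may use a different library). The mean of 100 and standard deviation of 15 are hypothetical values:

```python
from statistics import NormalDist

# Standard normal: P(Z <= 1.96).
standard = NormalDist(mu=0, sigma=1)
p_below = standard.cdf(1.96)

# General normal: P(85 < X < 115) as a difference of two CDF values.
scores = NormalDist(mu=100, sigma=15)
p_between = scores.cdf(115) - scores.cdf(85)

# Inverse direction: the value below which 97.5% of the mass lies.
z_critical = standard.inv_cdf(0.975)
```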

# Sklearn

## Using Sklearn's train_test_split

  Reference: https://blog.csdn.net/fxlou/article/details/79189106

## Fix for No module named 'sklearn.cross_validation'

  Reference: https://blog.csdn.net/rocling/article/details/89002209
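
The `sklearn.cross_validation` module was removed; its helpers now live in `sklearn.model_selection`. A minimal sketch of the updated import together with `train_test_split`, on hypothetical data:

```python
import numpy as np
from sklearn.model_selection import train_test_split  # not sklearn.cross_validation

# Hypothetical data: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 30% of the samples; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
```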

## No module named 'sklearn.grid_search'

  Reference: https://blog.csdn.net/u012852847/article/details/84639213/

## Using DictVectorizer

  Reference: https://blog.csdn.net/qq_36847641/article/details/78279309

## Difference between Sklearn preprocessing functions fit_transform() and transform()

  Reference: https://blog.csdn.net/quiet_girl/article/details/72517053
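
A minimal sketch with `StandardScaler` on hypothetical data: `fit_transform` learns the mean and standard deviation from the training set and applies them, while `transform` reuses those learned statistics on new data without refitting.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[0.0], [2.0]])   # mean 1, standard deviation 1
X_test = np.array([[3.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(X_train)   # fit on training data, then scale it
test_scaled = scaler.transform(X_test)         # scale with the training statistics
```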

## Sklearn preprocessing function StandardScaler

  Reference: https://blog.csdn.net/wzyaiwl/article/details/90549391

## sklearn: text feature extraction with CountVectorizer

  Reference: https://blog.csdn.net/weixin_38278334/article/details/82320307

## sklearn: difference between predict() and predict_proba()

  Reference: https://www.cnblogs.com/mrtop/p/10309083.html

## sklearn error: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty

  Reference: https://blog.csdn.net/qq_22592457/article/details/103504796

## sklearn error: TypeError: 'KFold' object is not iterable

  Reference: https://stackoverflow.com/questions/48641290
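
With the `sklearn.model_selection` API, a `KFold` object is a splitter rather than an iterable; the folds come from calling `split` on the data. A minimal sketch on hypothetical data:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12).reshape(6, 2)

kf = KFold(n_splits=3)

# Iterate over kf.split(X), not over kf itself.
folds = []
for train_idx, test_idx in kf.split(X):
    folds.append((train_idx.tolist(), test_idx.tolist()))
```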

 

# Theory

## TF-IDF theory and Sklearn's TfidfVectorizer

  Reference: http://www.ruanyifeng.com/blog/2013/03/tf-idf.html

  https://blog.csdn.net/m0_37991005/article/details/105074754

## Text vectorization in practice

  Reference: https://zhuanlan.zhihu.com/p/44917421

## Basic usage of the jieba word segmentation module

  Reference: https://www.cnblogs.com/jiayongji/p/7119065.html

## ROC curve: threshold evaluation criteria

  Reference: https://blog.csdn.net/abcjennifer/article/details/7359370

## Confusion matrix

  Reference: https://blog.csdn.net/u011734144/article/details/80277225

## Correlation analysis

  https://blog.csdn.net/chenlei456/article/details/123603257

## Association rules: the Apriori algorithm

  https://zhuanlan.zhihu.com/p/432733354

# Other

## Steps for opening a .data file

  Reference: https://blog.csdn.net/ziqingnian/article/details/108013340

## name 'json' is not defined

  Reference: https://blog.csdn.net/qq_38161040/article/details/91410095

## Cause of the error: string indices must be integers

  Reference: https://blog.csdn.net/weixin_43256057/article/details/83867876

## Parsing timestamps and computing the time difference in milliseconds

  Reference: https://oomake.com/question/1192202
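
A minimal sketch with the standard library, assuming timestamps in the hypothetical format `YYYY-MM-DD HH:MM:SS.fff`; adjust the format string to match your data:

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed input format

start = datetime.strptime("2022-04-07 22:25:00.250", FMT)
end = datetime.strptime("2022-04-07 22:25:01.750", FMT)

delta = end - start

# Dividing by a one-millisecond timedelta gives the difference in ms.
elapsed_ms = delta / timedelta(milliseconds=1)
```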

## Converting between the time and datetime modules

  Reference: http://www.jquerycn.cn/a_38061

## Blank rows produced by csv.writer().writerow() in the csv module

  Reference: https://blog.csdn.net/youzhouliu/article/details/53138661
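
The extra blank rows typically appear on Windows when the file is opened without `newline=""`, because the writer's own `\r\n` is translated a second time. A minimal sketch with a hypothetical temporary file:

```python
import csv
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)

# newline="" stops Python from translating the writer's line endings.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "name"])
    writer.writerow([1, "Ann"])

with open(path, "rb") as f:
    raw = f.read()
os.remove(path)
```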

## Handling content that contains commas when generating a CSV file

  Reference: https://blog.csdn.net/hjp1137/article/details/48656049
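
One approach, which may differ from the linked article, is to let the `csv` module do the quoting: with its default settings it wraps any field containing a comma in double quotes. A minimal sketch on hypothetical data:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

# The first field contains a comma, so the writer quotes it automatically.
writer.writerow(["Beijing, China", 100])
text = buf.getvalue()

# Reading it back recovers the original field intact.
parsed = next(csv.reader(io.StringIO(text)))
```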

## Cursors explained

  Reference: https://blog.csdn.net/pdcfighting/article/details/104085622/

## Error: TypeError: Object of type Decimal is not JSON serializable

  Reference: https://blog.csdn.net/ILovePythonhao/article/details/105347755
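
`json.dumps` accepts a `default` callable for types it cannot serialize. A minimal sketch on a hypothetical row; `default=str` keeps full precision as a string, while `default=float` emits a JSON number at the cost of possible rounding:

```python
import json
from decimal import Decimal

row = {"price": Decimal("9.99")}

# Option 1: serialize the Decimal as a string (no precision loss).
as_string = json.dumps(row, default=str)

# Option 2: serialize the Decimal as a float.
as_number = json.dumps(row, default=float)
```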

 

 

posted on 2022-04-07 22:25  Hiteration
