Post category - Pandas

Summary: fillna() fills NaN values and returns the filled result. To modify the original DataFrame instead, set inplace to True. df = pd.DataFrame({ 'data1':[1,2,np.nan,3,np.nan,4,5], 'data2':[34,67,np.nan,np.nan,52,77 Read more
posted @ 2020-05-10 15:11 籽俊 Views (157) Comments (0) Recommendations (0)
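A minimal sketch of the two fillna() modes described in the summary above. The sample values and the fill value 0 are assumptions chosen for illustration; they are not the data from the truncated post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not the values from the original post).
df = pd.DataFrame({
    'data1': [1, 2, np.nan, 3],
    'data2': [34, np.nan, 52, np.nan],
})

# Default behaviour: a new DataFrame is returned and df keeps its NaN values.
filled = df.fillna(0)
print(filled)
print(df.isna().sum())  # df still contains missing values here

# inplace=True: df itself is modified.
df.fillna(0, inplace=True)
print(df)
```

Assigning the result back (`df = df.fillna(0)`) has the same visible effect as `inplace=True` and is often easier to chain with other calls.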
Summary: # Deduplication with duplicated() s = pd.Series([1,1,1,1,2,2,2,3,4,5,5,5,5,5]) print(s[s.duplicated()==False]) out: 0 1 4 2 7 3 8 4 9 5 dtype: int64 # Remove duplicate values, default inplace Read more
posted @ 2020-05-06 22:02 籽俊 Views (126) Comments (0) Recommendations (0)
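The deduplication in the summary above can be sketched two ways: filtering with the boolean mask from duplicated(), or calling drop_duplicates(). The Series is the one shown in the summary; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 5, 5, 5, 5])

# duplicated() marks every repeat after the first occurrence as True.
mask = s.duplicated()

# Keep only the first occurrences; ~mask is the idiomatic form of mask == False.
unique_by_mask = s[~mask]
print(unique_by_mask)

# drop_duplicates() gives the same result and returns a new Series by default.
unique_by_drop = s.drop_duplicates()
print(unique_by_drop)
```

Both results keep the original index labels (0, 4, 7, 8, 9), which matches the output quoted in the summary.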
Summary: # Common string methods (1) - lower, upper, len, startswith, endswith s = pd.Series(['A','b','bbhello','123',np.nan]) print(s.str.lower(),'→ lower: lowercase\n') print(s.str.uppe Read more
posted @ 2020-05-06 17:14 籽俊 Views (222) Comments (0) Recommendations (0)
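A short sketch of the five `.str` methods named in the summary above, applied to the same Series. The prefix 'b' and suffix '3' passed to startswith and endswith are assumptions picked for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series(['A', 'b', 'bbhello', '123', np.nan])

lower = s.str.lower()            # lowercase each string; the missing value stays missing
upper = s.str.upper()            # uppercase each string
lengths = s.str.len()            # number of characters per element
starts = s.str.startswith('b')   # does the element begin with 'b'? (prefix is illustrative)
ends = s.str.endswith('3')       # does the element end with '3'? (suffix is illustrative)

for name, result in [('lower', lower), ('upper', upper), ('len', lengths),
                     ('startswith', starts), ('endswith', ends)]:
    print(name)
    print(result, '\n')
```

The `.str` accessor applies each method element by element, so no explicit loop over the values is needed.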
Summary: # Basic parameters: axis, skipna df = pd.DataFrame({ 'key1':[4,5,3,np.nan,2], 'key2':[1,2,np.nan,4,5], 'key3':[1,2,3,'j','k']}, index= ['a','b','c','d','e']) print(d Read more
posted @ 2020-05-06 15:34 籽俊 Views (220) Comments (0) Recommendations (0)
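A minimal sketch of how axis and skipna affect an aggregation, using mean() as an assumed example since the post's own calls are truncated. The mixed-type 'key3' column from the summary is left out here so that every column is numeric.

```python
import numpy as np
import pandas as pd

# Numeric columns only; 'key3' from the summary is omitted (an assumption for this sketch).
df = pd.DataFrame({
    'key1': [4, 5, 3, np.nan, 2],
    'key2': [1, 2, np.nan, 4, 5],
}, index=['a', 'b', 'c', 'd', 'e'])

# axis=0 (default): aggregate down each column; NaN is skipped by default.
col_mean = df.mean()
print(col_mean)

# axis=1: aggregate across each row.
row_mean = df.mean(axis=1)
print(row_mean)

# skipna=False: any NaN in a column makes that column's result NaN.
col_mean_strict = df.mean(skipna=False)
print(col_mean_strict)
```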
Summary: # Create a time index directly; str and datetime.datetime inputs are supported rng=pd.DatetimeIndex(['12/1/2017','12/2/2017','12/3/2017','12/4/2017','12/5/2017']) print(rng,type(rng)) print(rng Read more
posted @ 2020-05-06 00:34 籽俊 Views (1420) Comments (0) Recommendations (0)
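A brief sketch of building a DatetimeIndex from strings and from datetime.datetime objects, as the summary above describes. Only the first three dates are used, and the Series values are hypothetical.

```python
import datetime
import pandas as pd

# From strings; this format is parsed month-first, so '12/1/2017' is 1 December 2017.
rng = pd.DatetimeIndex(['12/1/2017', '12/2/2017', '12/3/2017'])
print(rng, type(rng))

# From datetime.datetime objects; the dates are the same as above.
rng_dt = pd.DatetimeIndex([
    datetime.datetime(2017, 12, 1),
    datetime.datetime(2017, 12, 2),
    datetime.datetime(2017, 12, 3),
])
print(rng_dt)

# Use the index for a time series (the values are hypothetical).
ts = pd.Series([10, 20, 30], index=rng)
print(ts.loc[pd.Timestamp('2017-12-02')])
```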