pandas Study Summary

 

Author: csj  Updated: 2018.04.02, Shenzhen

email:59888745@qq.com

home: http://www.cnblogs.com/csj007523/p/8149929.html

 

1.import

2.export

3.create object

4.viewing,inspecting data

5.select data

6.data cleaning

7.filter,sort,groupby

8.join:merge,concat

 

import:

pd.read_csv('path')

pd.read_excel('path')

pd.read_table('path')

pd.read_sql(query,connstr)

pd.read_html(url)

pd.read_json(jsonstr)

pd.DataFrame(dict)
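
As a minimal sketch of the readers listed above, the following parses CSV text held in memory and builds the same table from a dict. The sample data and column names are made up for illustration, and io.StringIO stands in for a real file path.

```python
import io
import pandas as pd

# Hypothetical sample data; io.StringIO stands in for a file on disk.
csv_text = "name,score\nalice,90\nbob,85\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# A dict mapping column names to lists of values builds a DataFrame directly.
df_dict = pd.DataFrame({"name": ["alice", "bob"], "score": [90, 85]})

print(df_csv)
print(df_dict)
```

Passing a path such as 'data.csv' (a hypothetical filename) in place of the StringIO object reads from disk in the same way.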

 

exporting:

df.to_csv(filename)

df.to_excel(filename)

df.to_json(filename)

df.to_sql(tablename,connstr)
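
The writers above can be tried without touching the file system. This is only a sketch with made-up data: when to_csv is called without a path it returns the CSV text, which can then be read back to check the round trip.

```python
import io
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3.5, 4.5]})

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)

# Reading the text back should reproduce the original frame.
round_trip = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
```

index=False leaves the row index out of the output; without it, the index is written as an extra first column.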

 

create object:

pd.DataFrame(np.random.rand(20,4))

pd.Series(mylist)

df.index=pd.date_range('2018/01/01',periods=df.shape[0])
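
Put together, the three lines above might look like the sketch below. The column names a to d and the list values are assumptions added for readability.

```python
import numpy as np
import pandas as pd

# 20 rows x 4 columns of uniform random numbers in [0, 1).
df = pd.DataFrame(np.random.rand(20, 4), columns=["a", "b", "c", "d"])

# Replace the default integer index with one calendar day per row.
df.index = pd.date_range("2018/01/01", periods=df.shape[0])

# A Series built from a plain Python list.
s = pd.Series([10, 20, 30])

print(df.head())
```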

 

viewing/inspecting data:

df.head()

df.tail()

df.shape

df.info()

df.describe()

df.apply(func)

df.columns

df.index

s.value_counts()
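
A short sketch of the inspection calls on a small made-up table. Note that shape, columns and index are attributes, so they take no parentheses.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"fruit": ["apple", "pear", "apple"], "qty": [3, 5, 2]})

print(df.head(2))           # first two rows
print(df.shape)             # (rows, columns)
print(df.columns.tolist())  # column labels

# Occurrences of each distinct value in one column.
counts = df["fruit"].value_counts()
print(counts)
```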

 

select data:

df[col]

df[[col1,col2]]

df.col1

df.loc[indexname]  #row by index label

df.loc[:,col1]  #column by label

df.iloc[0,:]

df.iloc[0,0]
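
The selection forms above differ mainly in whether they go by label or by position. A minimal sketch, using made-up labels r1 to r3 and columns x and y:

```python
import pandas as pd

# Made-up data with string row labels.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["r1", "r2", "r3"])

col = df["x"]              # one column, returned as a Series
sub = df[["x", "y"]]       # list of columns, returned as a DataFrame
by_label = df.loc["r2"]    # row selected by index label
first_row = df.iloc[0, :]  # row selected by integer position
cell = df.iloc[0, 0]       # single value at row 0, column 0
```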

 

data cleaning:

pd.isnull(df)

pd.notnull(df)

df.columns=['a','b','c','d']

df.dropna(how='any')

df.dropna(how='all')

df.dropna()

df.fillna(x)

df.fillna(s.mean())

s.astype(float)

s.replace(1,'one')

s.replace([1,3],['one','three'])

df.rename(columns=lambda x:x+1)

df.rename(columns={'oldcolname':'newcolumns'})

df.rename(index=lambda x:x+1)

df.set_index('colu1')
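
To make the difference between the dropna and fillna variants concrete, here is a sketch on a made-up frame with missing values. The column names a, b and alpha are assumptions.

```python
import numpy as np
import pandas as pd

# Made-up data: the middle row is entirely missing.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

any_dropped = df.dropna(how="any")  # drop rows with at least one NaN
all_dropped = df.dropna(how="all")  # drop rows where every value is NaN
filled = df.fillna(df.mean())       # fill each column with its own mean

s = pd.Series([1, 2, 3])
replaced = s.replace([1, 3], ["one", "three"])  # map 1 -> 'one', 3 -> 'three'
renamed = df.rename(columns={"a": "alpha"})     # rename a single column
```

None of these calls modify df or s in place; each returns a new object.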

 

filter,sort,groupby:

df[df[col]>10]

df[(df[col] > 5) & (df[col] < 10)]  #each condition needs its own parentheses

df.sort_values(col1)

df.sort_values(col1,ascending=False)

df.sort_values([col1,col2],ascending=[False,True])

df.groupby([col1,col2])

df.groupby(col).agg(np.mean)

df.apply(np.mean)

df.apply(np.max,axis=1)  #across each row

df.pivot_table(index=col1,values=[col2,col3],aggfunc='mean')
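
A small sketch of filtering, sorting and grouping on made-up data. The team and score columns are assumptions. When combining conditions, each comparison is wrapped in parentheses because & binds more tightly than > and <.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [4, 8, 12, 6],
})

# Rows whose score lies strictly between 5 and 10.
mid = df[(df["score"] > 5) & (df["score"] < 10)]

# Sort by team ascending, then by score descending within each team.
ranked = df.sort_values(["team", "score"], ascending=[True, False])

# Mean score per team, computed two equivalent ways.
means = df.groupby("team")["score"].mean()
table = df.pivot_table(index="team", values="score", aggfunc="mean")
```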

join/combine:

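pd.merge(left,right,how='inner',on=['key1','key2'])  #how is one of 'left', 'right', 'outer', 'inner'; horizontal join that combines DataFrames into one on shared keys

pd.concat([df1,df2],axis=1)  #axis=1 joins side by side (horizontal), axis=0 stacks rows (vertical)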
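
The sketch below contrasts the two on made-up frames: merge matches rows by key value, while concat simply glues frames together along an axis. The key, l and r column names are assumptions.

```python
import pandas as pd

# Made-up data: keys 2 and 3 appear in both frames.
left = pd.DataFrame({"key": [1, 2, 3], "l": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "r": ["x", "y", "z"]})

inner = pd.merge(left, right, how="inner", on="key")  # keys present in both frames
outer = pd.merge(left, right, how="outer", on="key")  # keys present in either frame

stacked = pd.concat([left, left], axis=0, ignore_index=True)  # vertical: more rows
side = pd.concat([left, right], axis=1)                       # horizontal: more columns
```

With how='outer', rows that have no match on the other side get NaN in the columns that come from it.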

Statistics:

df.describe()

df.mean()

df.corr()

df.count()

df.max()

df.min()

df.median()

df.std()
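
A quick sketch of a few of these summaries on made-up numeric data. Here y is exactly twice x, so the two columns are perfectly correlated.

```python
import pandas as pd

# Made-up data: y is exactly 2 * x.
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 6, 8]})

print(df.mean())    # column means
print(df.median())  # column medians
print(df.std())     # sample standard deviation (ddof=1 by default)

corr = df.corr()    # pairwise Pearson correlation between columns
print(corr)
```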

posted @ 2018-04-02 18:04 大树2