Post category - Python data analysis
Summary: pd.scatter_matrix(trans_data, diagonal='kde', color='k', alpha=0.3) raises an error; change it to pd.plotting.scatter_matrix(trans_data, diagonal='kde', color='k', alpha=0.3).
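A minimal sketch of the corrected call, for readers who want something runnable. The contents of trans_data are not shown in the summary, so the random DataFrame below is a hypothetical stand-in, and diagonal='hist' is used here only because diagonal='kde' additionally requires SciPy.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

# Hypothetical stand-in for trans_data (not the data from the post)
rng = np.random.default_rng(0)
trans_data = pd.DataFrame(rng.standard_normal((200, 3)), columns=["a", "b", "c"])

# scatter_matrix lives in pd.plotting; it returns a 2-D array of Axes
axes = pd.plotting.scatter_matrix(trans_data, diagonal="hist", color="k", alpha=0.3)
print(axes.shape)
```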
Summary: values.hist(bins=100, alpha=0.3, color='g', normed=True) raises an error; change normed=True to density=True: values.hist(bins=100, alpha=0.3, color='g', density=True).
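The sketch below shows the density=True form on made-up data. The Series named values is a hypothetical stand-in for the one in the post, and the area check is only there to illustrate what density normalization means.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the post's `values`
values = pd.Series(np.random.default_rng(0).standard_normal(1000))

fig, ax = plt.subplots()
values.hist(bins=100, alpha=0.3, color="g", density=True, ax=ax)

# With density=True the bar areas add up to 1
area = sum(p.get_height() * p.get_width() for p in ax.patches)
print(round(area, 6))
```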
Summary: fig, ax = plt.subplots() ax.plot(2, 3) plt.rcParams['font.sans-serif'] = ['SimHei'] # display Chinese characters correctly ax.set_title('中文标题') plt.show
Summary: import re # the regex for one or more whitespace characters is \s+ text = "foo bar\t baz \tqux" regex = re.compile(r'\s+') print(regex.split(text)) # equivalent to re.split(r'\s+', text) # ['foo', '
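Written out in full, the whitespace split from the summary might look like the sketch below. A raw string is used for the pattern so that \s is not treated as a string escape.

```python
import re

text = "foo bar\t baz \tqux"

# Compile once when the same pattern is reused many times
regex = re.compile(r"\s+")
parts = regex.split(text)
print(parts)  # ['foo', 'bar', 'baz', 'qux']

# The module-level function gives the same result without an explicit compile
same = re.split(r"\s+", text)
```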
Summary: from pandas import DataFrame,Series import pandas as pd import numpy as np # if a column of a DataFrame contains K distinct values, a K-column matrix can be derived from it df = DataFrame({'key':['b','b','a'
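One way to derive such a K-column indicator matrix is pd.get_dummies. The excerpt is cut off before any call is shown, so both the choice of get_dummies and the column values below are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: the post's DataFrame is truncated in the summary
df = pd.DataFrame({"key": ["b", "b", "a", "c", "a", "b"], "data1": range(6)})

# One indicator column per distinct value of 'key' (K = 3 here)
dummies = pd.get_dummies(df["key"])
print(dummies)
```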
Summary: pd.read_table('movies.dat', sep='::') emits ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (sep
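The warning is usually avoided by passing engine='python' explicitly. The sketch below uses an in-memory buffer with invented rows in place of movies.dat, and the column names are hypothetical.

```python
import io
import warnings

import pandas as pd
from pandas.errors import ParserWarning

# Invented '::'-separated rows standing in for movies.dat
buffer = io.StringIO("1::Toy Story (1995)::Animation\n2::Jumanji (1995)::Adventure\n")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    movies = pd.read_table(
        buffer,
        sep="::",
        header=None,
        names=["movie_id", "title", "genres"],  # hypothetical column names
        engine="python",  # request the python engine up front
    )

parser_warnings = [w for w in caught if issubclass(w.category, ParserWarning)]
print(len(parser_warnings))
```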
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 3114: invalid continuation byte
Summary: pd.read_table('movies.dat', sep='::') fails with the error above; adding encoding='ISO-8859-1' resolves it: pd.read_table('movies.dat', sep='::', encoding='ISO-8859-1').
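To see the encoding fix in isolation, the sketch below writes a small Latin-1 file and reads it back. The file contents and column names are invented for illustration; only the sep and encoding arguments come from the summary.

```python
import os
import tempfile

import pandas as pd

# Invented rows; the 'é' is stored as the single byte 0xe9 in ISO-8859-1
rows = "1::Toy Story (1995)::Animation\n2::Les Misérables (1995)::Drama\n"
path = os.path.join(tempfile.mkdtemp(), "movies.dat")
with open(path, "w", encoding="ISO-8859-1") as f:
    f.write(rows)

movies = pd.read_table(
    path,
    sep="::",
    header=None,
    names=["movie_id", "title", "genres"],  # hypothetical column names
    engine="python",
    encoding="ISO-8859-1",  # decode the file as Latin-1 instead of the UTF-8 default
)
print(movies)
```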
Summary: from pandas import DataFrame,Series import pandas as pd import numpy as np # numpy.random.permutation can be used to permute (randomly reorder) a Series or a DataFrame df = DataFrame(np.aran
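A rough sketch of the permutation idea follows. The DataFrame shape is an assumption, since the excerpt is truncated, and take is one of several ways to apply the permuted positions.

```python
import numpy as np
import pandas as pd

# Hypothetical 5x4 frame; the post's own construction is cut off
df = pd.DataFrame(np.arange(20).reshape((5, 4)))

# A random ordering of the row positions 0..4
sampler = np.random.permutation(len(df))

# take() selects rows by position, in the permuted order
shuffled = df.take(sampler)
print(shuffled)
```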
Summary: from pandas import DataFrame,Series import pandas as pd import numpy as np np.random.seed(12345) data = DataFrame(np.random.randn(1000,4)) # find the values in a column whose absolute value is greater
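Filtering by absolute value can be sketched as below. The cutoff is not visible in the truncated excerpt, so threshold is a hypothetical name with a placeholder value, and the clipping step at the end is an optional extra rather than something the summary shows.

```python
import numpy as np
import pandas as pd

np.random.seed(12345)
data = pd.DataFrame(np.random.randn(1000, 4))

threshold = 3  # hypothetical cutoff; the excerpt ends before the value

# Values in one column whose absolute value exceeds the cutoff
col = data[3]
large_in_col = col[col.abs() > threshold]

# Rows where any column exceeds the cutoff
rows_with_large = data[(data.abs() > threshold).any(axis=1)]

# Optionally cap everything to the interval [-threshold, threshold]
clipped = data.clip(lower=-threshold, upper=threshold)
```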
Summary: from pandas import DataFrame,Series import pandas as pd import numpy as np ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32] bins = [18,25,35,60
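The ages and bins variables suggest binning with pd.cut, although the excerpt ends before any call appears. In the sketch below the final bin edge (100) is an assumption, because the list is truncated in the summary.

```python
import pandas as pd

ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]
bins = [18, 25, 35, 60, 100]  # last edge is assumed; the source list is cut off

# Each age is assigned to a half-open interval (left, right]
cats = pd.cut(ages, bins)

# Number of ages per interval, in interval order
counts = cats.value_counts()
print(counts)
```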
Summary: The Categorical.levels attribute is no longer available; use Categorical.categories instead.
Summary: The Categorical.labels attribute is no longer available; use Categorical.codes instead.
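A small illustration of the two current attribute names, using made-up values:

```python
import pandas as pd

# Made-up values for illustration
cat = pd.Categorical(["low", "high", "low", "mid"], categories=["low", "mid", "high"])

# .categories replaces the old .levels: the distinct category values
print(list(cat.categories))

# .codes replaces the old .labels: the integer position of each value in .categories
print(cat.codes.tolist())
```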
Summary: from pandas import DataFrame,Series import pandas as pd import numpy as np data = DataFrame(np.arange(12).reshape((3,4)), index=["Aa","Bb","Cc"], colu
Summary: from pandas import Series import numpy as np data = Series([1,-999,2,-999,-1000,3]) print(data) ''' 0 1 1 -999 2 2 3 -999 4 -1000 5 3 dtype: int64 '''
Summary: from pandas import DataFrame,Series import pandas as pd import numpy as np data = DataFrame({'k1':['A']*3+['B']*4, 'k2':[1,1,2,3,3,4,4]}) print(data)
Summary: Replace take_last=True with keep='last'.
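As a sketch of keep='last' in use, the example below borrows the k1/k2 data from the previous entry in this list. That the original post applied the argument to this exact DataFrame is an assumption.

```python
import pandas as pd

data = pd.DataFrame({"k1": ["A"] * 3 + ["B"] * 4,
                     "k2": [1, 1, 2, 3, 3, 4, 4]})

# Mark every duplicate except the last occurrence of each (k1, k2) pair
is_dup = data.duplicated(keep="last")

# Keep only the last occurrence of each (k1, k2) pair
deduped = data.drop_duplicates(["k1", "k2"], keep="last")
print(deduped)
```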
Summary: from pandas import DataFrame,Series import numpy as np a = Series([np.nan,2.5,np.nan,3.5,4.5,np.nan], index=['f','e','d','c','b','a']) print(a) ''' f
Summary: from pandas import DataFrame,Series import pandas as pd import numpy as np arr = np.arange(12).reshape((3,4)) print(arr) ''' [[ 0 1 2 3] [ 4 5 6 7] [
Summary: from pandas import DataFrame left = DataFrame([[1,2],[3,4],[5,6]],index=['a','c','e'],columns=['item1','item2']) right = DataFrame([[7,8],[9,10],[11,1
Summary: from pandas import DataFrame import pandas as pd df1 = DataFrame({'key':['b','b','a','c','a','a','b'], 'data1':range(7)}) df2 = DataFrame({'key':['a',