The difference between loc, iloc, and ix in pandas

Let's start with some code:

In [46]: import pandas as pd

In [47]: data = [[1,2,3],[4,5,6]]

In [48]: index = [0,1]

In [49]: columns=['a','b','c']

In [50]: df = pd.DataFrame(data=data, index=index, columns=columns)

In [51]: df
Out[51]: 
   a  b  c
0  1  2  3
1  4  5  6

1. loc: select rows by row label

1.1 loc[1] selects the row whose index label is 1 (the index holds integers)

In [52]: df.loc[1]
Out[52]: 
a    4
b    5
c    6
Name: 1, dtype: int64
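
Here loc[1] matches the label 1, not the second position; the two coincide only because the index is [0, 1]. A minimal sketch of the difference, assuming a hypothetical frame df_int (not part of the original session) whose integer labels are 10 and 20:

```python
import pandas as pd

# Hypothetical frame whose integer labels do not match the row positions.
df_int = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                      index=[10, 20], columns=['a', 'b', 'c'])

row_by_label = df_int.loc[10]    # the row labeled 10 (the first row)

try:
    df_int.loc[0]                # there is no label 0, so loc raises KeyError
    label_zero_found = True
except KeyError:
    label_zero_found = False
```

With labels like these, df_int.loc[0] fails even though a row exists at position 0, which is why loc should be read as "by label" even when the labels are numbers.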

1.2 loc['d'] selects the row labeled 'd' (the index holds strings)

In [53]: import pandas as pd    
    ...: data = [[1,2,3],[4,5,6]]    
    ...: index = ['d','e']    
    ...: columns=['a','b','c']    
    ...: df = pd.DataFrame(data=data, index=index, columns=columns)
    ...: 

In [54]: df
Out[54]: 
   a  b  c
d  1  2  3
e  4  5  6

In [55]: df.loc['d']
Out[55]: 
a    1
b    2
c    3
Name: d, dtype: int64

1.3 Passing a column label to loc as if it were a row label raises an error

In [56]: df.loc['a']
Traceback (most recent call last):

  File "<ipython-input-56-5dbae926782f>", line 1, in <module>
    df.loc['a']

  File "E:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__
    return self._getitem_axis(key, axis=0)
    ...
KeyError: 'the label [a] is not in the [index]'
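
The error arises because the first key given to loc is looked up among the row labels. A small sketch, rebuilding the same df as above, that checks where 'a' actually lives (the variable names are illustrative only):

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

in_index = 'a' in df.index        # 'a' is not a row label
in_columns = 'a' in df.columns    # 'a' is a column label

try:
    df.loc['a']                   # looked up in df.index, so it fails
    raised_key_error = False
except KeyError:
    raised_key_error = True
```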

1.4 loc can select multiple rows


In [57]: df.loc['d':]
Out[57]: 
   a  b  c
d  1  2  3
e  4  5  6
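
Besides an open-ended slice, loc also accepts a closed label slice or a list of labels. A minimal sketch on the same df (variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

by_slice = df.loc['d':'e']       # label slice; both endpoints are included
by_list = df.loc[['e', 'd']]     # list of labels; rows come back in the listed order
```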

1.5 loc, extended: select specific columns of a row

In [58]: df.loc['d',['b','c']]
Out[58]: 
b    2
c    3
Name: d, dtype: int64
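
The type of the result depends on whether each key is a single label or a list. A short sketch on the same df, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

cell = df.loc['d', 'b']                # single row, single column: a scalar
row_part = df.loc['d', ['b', 'c']]     # single row, list of columns: a Series
block = df.loc[['d'], ['b', 'c']]      # list of rows, list of columns: a DataFrame
```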

1.6 loc, extended: select a column

In [59]: df.loc[:,['c']]
Out[59]: 
   c
d  3
e  6

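Of course, the most direct way to get a column is df[column_label], for example df['c']. Selecting through loc is useful when rows and columns are chosen in the same expression. Note that loc still requires the label itself; if only the column's position is known, use iloc (section 2.4).

Note that a label slice with loc includes both endpoints: on a frame whose index contains 1, 2, and 3, df.loc[1:3] returns all three rows, unlike ordinary Python slicing, which excludes the stop value.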
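
The slicing rule can be checked directly. The sketch below assumes a hypothetical five-row frame df_range with the default integer index 0 to 4, so that labels and positions coincide and only the endpoint handling differs:

```python
import pandas as pd

# Hypothetical frame with default integer labels 0..4.
df_range = pd.DataFrame({'x': [10, 11, 12, 13, 14]})

by_label = df_range.loc[1:3]       # label slice: rows 1, 2, 3 (stop included)
by_position = df_range.iloc[1:3]   # position slice: rows 1, 2 (stop excluded)
```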

2. iloc: select rows by row number (position)

2.1 Pass the number of the row you want

First, recall the df from before:

In [54]: df
Out[54]: 
   a  b  c
d  1  2  3
e  4  5  6

Now use iloc:

In [60]: df.iloc[1]  # get row 1 (the second row; positions start at 0)
Out[60]: 
a    4
b    5
c    6
Name: e, dtype: int64

In [61]: df.iloc[0]  # get row 0 (the first row)
Out[61]: 
a    1
b    2
c    3
Name: d, dtype: int64

2.2 Indexing with a label (a string) raises an error

In [62]: df.iloc['a']
Traceback (most recent call last):

  File "<ipython-input-62-0c5fe4e92254>", line 1, in <module>
    df.iloc['a']

  File "E:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__
    return self._getitem_axis(key, axis=0)
  ...

TypeError: cannot do positional indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [a] of <class 'str'>
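
When only a label is at hand but a position is needed, one option is to convert the label first. A minimal sketch on the same df, using Index.get_loc; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

try:
    df.iloc['e']                 # iloc accepts positions only, so a string fails
    raised_type_error = False
except TypeError:
    raised_type_error = True

pos = df.index.get_loc('e')      # position of label 'e' (an int when labels are unique)
row = df.iloc[pos]
```

In most cases df.loc['e'] is the simpler choice; the conversion is mainly useful when the rest of the code works with positions.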

2.3 Row numbers can likewise select multiple rows

In [63]: df.iloc[0:]   # get row 0 and every row after it
Out[63]: 
   a  b  c
d  1  2  3
e  4  5  6
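
Position slices follow ordinary Python rules, so the stop value is excluded and negative numbers count from the end. A short sketch on the same df, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

first_only = df.iloc[0:1]    # position 0 only; the stop position is excluded
last_row = df.iloc[-1]       # negative positions count from the end
picked = df.iloc[[1, 0]]     # list of positions, returned in the listed order
```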

2.4 Selecting columns with iloc

In [64]: df.iloc[:,[0]]
Out[64]: 
   a
d  1
e  4

In [65]: df.iloc[:,[1]]
Out[65]: 
   b
d  2
e  5

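3. ix: mixed indexing that combines the previous two (ix is no longer recommended: it was deprecated in pandas 0.20.0 and removed in pandas 1.0, so use loc or iloc instead)

3.1 Indexing by row number

First, recall the df from before: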

In [54]: df
Out[54]: 
   a  b  c
d  1  2  3
e  4  5  6

Now look at how .ix is used:

In [66]: df.ix[1]
__main__:1: DeprecationWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing

See the documentation here: http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate_ix
Out[66]: 
a    4
b    5
c    6
Name: e, dtype: int64

3.2 Indexing by row label

In [67]: df.ix['e']
Out[67]: 
a    4
b    5
c    6
Name: e, dtype: int64
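
Since the deprecation warning points to loc and iloc, the two .ix calls above can be rewritten as follows. This is one possible migration sketch on the same df; the mixed-access lines at the end are an illustration added here, not something from the original session:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['d', 'e'], columns=['a', 'b', 'c'])

by_position = df.iloc[1]     # replaces df.ix[1]
by_label = df.loc['e']       # replaces df.ix['e']

# Mixed access (row by position, column by label): convert one side first.
mixed_via_loc = df.loc[df.index[1], 'b']
mixed_via_iloc = df.iloc[1, df.columns.get_loc('b')]
```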

Reference: https://blog.csdn.net/roamer314/article/details/52179191

posted on 2018-04-15 22:49 by 星辰之衍
