Comparing loc, iloc, and ix

First, create a DataFrame with pandas:

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: df = pd.DataFrame(np.random.randn(6,4),index=pd.date_range('20180101',periods=6),columns=list('ABCD'))

In [4]: df
Out[4]:
                   A         B         C         D
2018-01-01 -0.603510  0.269480  0.197354 -0.433003
2018-01-02  1.230502  0.474616  1.473517 -0.627363
2018-01-03 -0.402034  0.569097  0.675872 -0.317995
2018-01-04  0.220638  0.527543 -1.140620 -0.348089
2018-01-05 -2.494331  0.593269  0.596578  1.653347
2018-01-06 -2.766239 -0.919777  0.462890  0.156048

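Suppose you want the third row of data.

If you carry over the habit of indexing a Python list and write df[2] directly, it fails, so a different approach is needed: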

KeyError                                  Traceback (most recent call last)
D:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3062             try:
-> 3063                 return self._engine.get_loc(key)
   3064             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 2

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-5-b5f2749c85df> in <module>()
----> 1 df[2]

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2683             return self._getitem_multilevel(key)
   2684         else:
-> 2685             return self._getitem_column(key)
   2686
   2687     def _getitem_column(self, key):

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   2690         # get column
   2691         if self.columns.is_unique:
-> 2692             return self._get_item_cache(key)
   2693
   2694         # duplicate columns & possible reduce dimensionality

D:\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   2484         res = cache.get(item)
   2485         if res is None:
-> 2486             values = self._data.get(item)
   2487             res = self._box_item_values(item, values)
   2488             cache[item] = res

D:\Anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   4113
   4114             if not isna(item):
-> 4115                 loc = self.items.get_loc(item)
   4116             else:
   4117                 indexer = np.arange(len(self.items))[isna(self.items)]

D:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3063                 return self._engine.get_loc(key)
   3064             except KeyError:
-> 3065                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   3066
   3067         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 2
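df[2] raises a KeyError. This is not a syntax error: a single key inside df[...] is looked up among the column labels, and this DataFrame has no column labelled 2.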
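To make the cause of the KeyError concrete, here is a minimal sketch. It assumes a DataFrame with the same index and columns as above but with fixed integer values (np.arange) instead of random ones, so the result is reproducible; the names col_a and raised are only for illustration.

```python
import numpy as np
import pandas as pd

# Same shape and labels as the DataFrame above, but with fixed values
# (an assumption made here so the example is reproducible).
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("20180101", periods=6),
    columns=list("ABCD"),
)

# A single key inside df[...] is looked up among the column labels.
col_a = df["A"]  # works: "A" is a column label

try:
    df[2]  # fails: no column is labelled 2
    raised = False
except KeyError:
    raised = True

print(raised)
```

In other words, df[2] never looks at row positions at all, which is why the list-style habit does not carry over.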

There are actually several correct ways to do it.

Method 1:

In [6]: df[2:3]
Out[6]:
                   A         B         C         D
2018-01-03 -0.402034  0.569097  0.675872 -0.317995

Method 2:

In [7]: df['20180103':'20180103']  # a slice is required here; a single string would be looked up as a column label and raise a KeyError
Out[7]:
                   A         B         C         D
2018-01-03 -0.402034  0.569097  0.675872 -0.317995
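Both methods rely on the fact that a slice inside df[...] selects rows rather than columns. The sketch below, again assuming fixed integer values in place of the random ones, checks that the two slices return the same row and shows one difference worth remembering: an integer slice excludes its end point, while a label slice includes it.

```python
import numpy as np
import pandas as pd

# Fixed values are an assumption made for reproducibility.
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("20180101", periods=6),
    columns=list("ABCD"),
)

by_position = df[2:3]                  # integer slice: end point excluded
by_label = df["20180103":"20180103"]   # label slice: both end points included

same = by_position.equals(by_label)

# The end-point rule matters once the slice spans more than one row.
n_pos = len(df[2:4])                     # positions 2 and 3
n_lab = len(df["20180103":"20180105"])   # January 3rd, 4th and 5th

print(same, n_pos, n_lab)
```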

Python-style single-element indexing evidently does not work here, which brings us to today's main topic: loc, iloc, and ix.

(1) .loc: select by label

In [8]: df.loc['2018/01/03']
Out[8]:
A   -0.402034
B    0.569097
C    0.675872
D   -0.317995
Name: 2018-01-03 00:00:00, dtype: float64
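Beyond a single row, .loc also accepts a column label and label slices. The following is a small sketch under the assumption of fixed integer values instead of random data; the variable names row, cell and block are illustrative only.

```python
import numpy as np
import pandas as pd

# Fixed values are an assumption made for reproducibility.
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("20180101", periods=6),
    columns=list("ABCD"),
)

row = df.loc["2018-01-03"]        # one row, returned as a Series
cell = df.loc["2018-01-03", "B"]  # row label and column label: one value

# Label slices include both end points; a list picks specific columns.
block = df.loc["2018-01-02":"2018-01-04", ["A", "C"]]

print(row.tolist(), cell, block.shape)
```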

 

(2) .iloc: select by integer position

In [9]: df.iloc[2]
Out[9]:
A   -0.402034
B    0.569097
C    0.675872
D   -0.317995
Name: 2018-01-03 00:00:00, dtype: float64
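.iloc follows the usual Python conventions for positions: counting starts at 0, slices exclude the end point, and negative numbers count from the end. A short sketch, assuming fixed integer values in place of the random data:

```python
import numpy as np
import pandas as pd

# Fixed values are an assumption made for reproducibility.
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("20180101", periods=6),
    columns=list("ABCD"),
)

row = df.iloc[2]               # third row, returned as a Series
cell = df.iloc[2, 1]           # third row, second column
block = df.iloc[1:4, [0, 2]]   # rows 1..3 (4 excluded), columns 0 and 2
last = df.iloc[-1]             # last row

print(row.tolist(), cell, block.shape, last.tolist())
```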

 

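(3) .ix: mixed indexing, accepting either labels or integer positions. Note that .ix was deprecated in pandas 0.20.0 and removed in pandas 1.0.0, so the calls below only run on older versions.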

In [10]: df.ix['2018/01/03']
Out[10]:
A   -0.402034
B    0.569097
C    0.675872
D   -0.317995
Name: 2018-01-03 00:00:00, dtype: float64

In [11]: df.ix[2]
Out[11]:
A   -0.402034
B    0.569097
C    0.675872
D   -0.317995
Name: 2018-01-03 00:00:00, dtype: float64
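On pandas versions that no longer provide .ix, a mixed lookup (for example, a row by position and a column by label) can be written by converting one side and then using a single indexer. This is one possible sketch rather than the only way to do it, and it assumes fixed integer values in place of the random data.

```python
import numpy as np
import pandas as pd

# Fixed values are an assumption made for reproducibility.
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=pd.date_range("20180101", periods=6),
    columns=list("ABCD"),
)

# Row by position, column by label.
# Option 1: turn the row position into a label, then use .loc.
via_loc = df.loc[df.index[2], "B"]

# Option 2: turn the column label into a position, then use .iloc.
via_iloc = df.iloc[2, df.columns.get_loc("B")]

print(via_loc, via_iloc)
```

Both options read the same cell; which one reads better depends on whether the surrounding code mostly works with labels or with positions.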

 

posted @ 2018-09-14 12:35  明王不动心