Pandas unique function

 

Unique values of a Pandas Series object

The unique() function returns the unique values of a Series object.

Unique values are returned in order of appearance. The implementation is hash table-based, so the result is NOT sorted.
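To make the ordering point concrete, here is a small sketch that contrasts Series.unique() with NumPy's np.unique(), which does sort its result. The sample values and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, 1, 3, 2, 1])

# Series.unique() keeps values in the order they first appear
appearance_order = s.unique()

# np.unique() returns the sorted unique values instead
sorted_order = np.unique(s.to_numpy())

print(appearance_order.tolist())  # [3, 1, 2]
print(sorted_order.tolist())      # [1, 2, 3]
```

If a sorted result is needed, sorting the output of unique() afterwards is one option.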

Syntax:

Series.unique()


Returns: ndarray or ExtensionArray
The unique values returned as a NumPy array. See Notes.

Notes: The unique values are returned as a NumPy array. For an extension-array backed Series, a new ExtensionArray of that type containing just the unique values is returned. This includes:

    • Categorical
    • Period
    • Datetime with Timezone
    • Interval
    • Sparse
    • IntegerNA
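The difference in return type described in the notes can be checked directly. The following is a minimal sketch, assuming a pandas version that supports the nullable "Int64" dtype; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Plain integer Series: unique() gives back a NumPy array
plain = pd.Series([2, 4, 3, 3]).unique()

# Nullable integer Series (extension-array backed): unique() gives back
# an ExtensionArray of the same kind, with the missing value kept once
nullable = pd.Series([2, 4, None, 4], dtype="Int64").unique()

print(type(plain).__name__)
print(type(nullable).__name__)
print(isinstance(plain, np.ndarray))
print(isinstance(nullable, pd.api.extensions.ExtensionArray))
```

Code that needs a NumPy array in every case may have to convert the result explicitly, for example with np.asarray(), which can produce an object-dtype array for some extension types.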

 

Examples

In [1]:
import numpy as np
import pandas as pd
In [2]:
pd.Series([2, 4, 3, 3], name='P').unique()
Out[2]:
array([2, 4, 3], dtype=int64)
In [3]:
pd.Series([pd.Timestamp('2019-01-01') for _ in range(3)]).unique()
Out[3]:
array(['2019-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
In [4]:
pd.Series([pd.Timestamp('2019-01-01', tz='US/Eastern')
           for _ in range(3)]).unique()
Out[4]:
<DatetimeArray>
['2019-01-01 00:00:00-05:00']
Length: 1, dtype: datetime64[ns, US/Eastern]
 

An unordered Categorical will return categories in the order of appearance.

In [5]:
pd.Series(pd.Categorical(list('qppqr'))).unique()
Out[5]:
[q, p, r]
Categories (3, object): [q, p, r]
 

An ordered Categorical preserves the category ordering.

In [6]:
pd.Series(pd.Categorical(list('qppqr'), categories=list('pqr'),
                         ordered=True)).unique()
Out[6]:
[q, p, r]
Categories (3, object): [p < q < r]
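The ordered case can also be inspected programmatically rather than by reading the printed repr. This is a small sketch using the same data as the example above; the variable names are illustrative.

```python
import pandas as pd

cat = pd.Categorical(list('qppqr'), categories=list('pqr'), ordered=True)
result = pd.Series(cat).unique()

# The values come back in order of appearance ...
print(list(result))             # ['q', 'p', 'r']

# ... while the category ordering p < q < r is preserved
print(list(result.categories))  # ['p', 'q', 'r']
print(result.ordered)           # True
```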
Posted 2020-11-06 15:07 by DaisyLinux