Python data analysis: Series in pandas

1 Series

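A Series is a linear data structure: a one-dimensional array whose items carry index labels.
By default pandas uses 0 to n-1 as the index of a Series, but you can also specify the index yourself (you can think of the index as the keys of a dict).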
##  1.1 Creating a Series

```python
import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128])

print(s)
```

  * Output

```text
0          9
1      zheng
2    beijing
3        128
dtype: object
```

  * Access part of the data

```python
print(s[1:2])

# Output
# 1    zheng
# dtype: object
```
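A slice such as `s[1:2]` returns a Series, while a single label returns the element itself. The short sketch below shows the difference on the same data; the variable names `item` and `part` are only for illustration.

```python
import pandas as pd

s = pd.Series([9, 'zheng', 'beijing', 128])

print(s.index)   # the default index, 0 to n-1

item = s[1]      # a single label: the element itself
part = s[1:2]    # a slice: a shorter Series

print(item)
print(type(part).__name__)
```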

##  1.2 Specifying the index

```python
import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

print(s)
```

  * Output

```text
1          9
2      zheng
3    beijing
e        128
f        usa
g        990
dtype: object
```

  * Look up a value by its index label

```python
print(s['f'])    # usa
```
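When the index mixes integers and strings, it may help to say explicitly whether a key is a label or a position. A minimal sketch using `.loc` (labels), `.iloc` (positions) and `Series.get` (lookup with a fallback); the fallback string `'n/a'` is just an illustrative choice.

```python
import pandas as pd

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990],
              index=[1, 2, 3, 'e', 'f', 'g'])

by_label = s.loc['f']         # the value stored under label 'f'
by_int_label = s.loc[1]       # 1 is treated as a label here, not a position
by_position = s.iloc[1]       # position 1, counting from 0
missing = s.get('zzz', 'n/a') # a label that does not exist: returns the fallback

print(by_label, by_int_label, by_position, missing)
```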

##  1.3 Building a Series from a dictionary

```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "car": None}

sa = pd.Series(s, name="age")

print(sa)
```

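  * Output (printed by an older pandas release that sorted the dict keys; recent releases keep the insertion order ton, mary, jack, car)

```text
car      NaN
jack    19.0
mary    18.0
ton     20.0
Name: age, dtype: float64
```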

  * Check the type

```python
print(type(sa))    # <class 'pandas.core.series.Series'>
```
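The `None` in the dictionary becomes `NaN`, which is why the integers end up as `float64`. A small sketch of how the missing entry could be detected, dropped or filled; the name `ages` and the fill value 0 are assumptions made for the example.

```python
import pandas as pd

ages = pd.Series({"ton": 20, "mary": 18, "jack": 19, "car": None}, name="age")

print(ages.dtype)         # float64, because NaN is a float
print(ages.isna())        # True only for the missing entry

known = ages.dropna()     # a copy without the missing entry
filled = ages.fillna(0)   # a copy with the missing entry replaced by 0
```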

##  1.4 Building a Series from a NumPy ndarray

  * Generate random numbers

```python
import pandas as pd
import numpy as np

num_abc = pd.Series(np.random.randn(5), index=list('abcde'))
num = pd.Series(np.random.randn(5))

print(num)
print(num_abc)

# Output (the values differ on every run)
# 0   -0.102860
# 1   -1.138242
# 2    1.408063
# 3   -0.893559
# 4    1.378845
# dtype: float64
# a   -0.658398
# b    1.568236
# c    0.535451
# d    0.103117
# e   -1.556231
# dtype: float64
```
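Because the numbers are random, every run prints something different. If repeatable output is wanted, one option is a seeded generator; the sketch below assumes NumPy's `default_rng`, and the seed 42 is an arbitrary choice.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)   # seeded generator: same numbers on every run
num_abc = pd.Series(rng.standard_normal(5), index=list('abcde'))

arr = num_abc.to_numpy()          # back to a plain ndarray when needed

print(num_abc)
print(type(arr).__name__)
```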

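##  1.5 Selecting data

```python
import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

# The index mixes integer and string labels, so an integer inside plain []
# can mean a label or a position depending on the pandas version.
# .iloc always selects by position (counting from 0).
print(s.iloc[1:3])     # positions 1 and 2 (left-inclusive, right-exclusive): zheng beijing
print(s.iloc[[1, 3]])  # positions 1 and 3: zheng 128
print(s.iloc[:-1])     # everything except the last item: 9 zheng beijing 128 usa
```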
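Slicing by position excludes the right end, while slicing by label with `.loc` includes it. A minimal sketch on made-up data (the Series `t` is hypothetical and not part of the original example):

```python
import pandas as pd

t = pd.Series([10, 20, 30, 40, 50], index=list('abcde'))

by_pos = t.iloc[1:3]       # positions 1 and 2; position 3 is excluded
by_lab = t.loc['b':'d']    # labels 'b' through 'd'; 'd' is included

print(by_pos)
print(by_lab)
```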

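##  1.6 Operating on data

```python
import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

# Named sum0 rather than sum so the built-in sum() is not shadowed
sum0 = s[1:3] + s[1:3]  # same labels on both sides
sum1 = s[1:4] + s[1:4]  # same labels on both sides
sum2 = s[1:3] + s[1:4]  # label 'e' exists only on the right
sum3 = s[:3] + s[1:]    # only labels 2 and 3 exist on both sides

print(sum0)
print(sum1)
print(sum2)
print(sum3)
```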

  * Output (values are matched by index label; a label that exists on only one side gives NaN)

```text
2        zhengzheng
3    beijingbeijing
dtype: object
2        zhengzheng
3    beijingbeijing
e               256
dtype: object
2        zhengzheng
3    beijingbeijing
e               NaN
dtype: object
1               NaN
2        zhengzheng
3    beijingbeijing
e               NaN
f               NaN
g               NaN
dtype: object
```
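The NaN entries come from index alignment: `+` pairs values by label, not by position. A small sketch with two made-up numeric Series (`a` and `b` are hypothetical) shows the effect, and how `Series.add` with `fill_value` could be used when a missing side should count as 0.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pd.Series([10, 20, 30], index=['y', 'z', 'w'])

plain = a + b                     # NaN for labels present on only one side
filled = a.add(b, fill_value=0)   # treat the missing side as 0 instead

print(plain)
print(filled)
```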

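##  1.7 Searching

  * Check for existence (using `s` from section 1.6; `in` tests the index labels, so the values have to be checked through `s.values`)

```python
print('f' in s)           # True: 'f' is an index label
print('usa' in s.values)  # True: 'usa' is one of the values
```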
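To test several candidate values at once, `Series.isin` returns a boolean mask that can be used for selection. A minimal sketch, with the list of wanted values chosen only for illustration:

```python
import pandas as pd

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990],
              index=[1, 2, 3, 'e', 'f', 'g'])

mask = s.isin(['usa', 990])   # True where the value is one of the candidates
found = s[mask]

print(found)
```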

  * Filter by a condition

```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(sa[sa>19])  # keep only the values greater than 19
```

![](https://img-blog.csdn.net/20180311131749213?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQva2luZ292/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)  

  * Median

```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(sa.median())  # 20.0 (the NaN entry is skipped)
```

  * Test whether each value is greater than the median

```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(sa>sa.median())  # a boolean Series, one True/False per label
```

![](https://img-blog.csdn.net/20180311132042901?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQva2luZ292/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)  

  * Select the values greater than the median

```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(sa[sa > sa.median()])
```

![](https://img-blog.csdn.net/20180311132206419?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQva2luZ292/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)  

  * Store the boolean mask in a variable and reuse it

```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

more_than_median = sa>sa.median()  # boolean mask

print(more_than_median)

print('---------------------')

print(sa[more_than_median])  # the values where the mask is True
```

![](https://img-blog.csdn.net/20180311132520743?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQva2luZ292/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)  
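Boolean masks can also be combined. The sketch below selects an age range in two equivalent ways, with `&` (each condition needs its own parentheses) and with `Series.between`; the bounds 19 and 22 are arbitrary example values.

```python
import pandas as pd

sa = pd.Series({"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None},
               name="age")

in_range = sa[(sa >= 19) & (sa <= 22)]   # both conditions must hold
same = sa[sa.between(19, 22)]            # inclusive on both ends by default
low = sa[sa.notna() & ~(sa > 19)]        # ~ negates a mask; NaN is excluded explicitly

print(in_range)
print(low)
```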

  

##  1.8 Assigning to a Series

```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(s)  # print the original dict

print('----------------')

sa['ton'] = 99  # overwrite the value stored under label 'ton'

print(sa)
```

![](https://img-blog.csdn.net/20180311132813516?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQva2luZ292/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)  
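Assignment works much like a dict here: an existing label is overwritten, and a label that does not exist yet is added. A short sketch; the label `'amy'` and all values are made up for the example.

```python
import pandas as pd

sa = pd.Series({"ton": 20.0, "mary": 18.0}, name="age")

sa['ton'] = 99                       # existing label: the value is overwritten
sa['amy'] = 21                       # new label: a new entry is added
sa.loc[['mary', 'amy']] = [30, 31]   # several labels at once

print(sa)
```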

  

##  1.9 Assigning to every value that meets a condition

```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(s) # print the original dict

print('---------------------')   # separator

sa[sa>19] = 88 # set every value greater than 19 to 88

print(sa) # print the modified data

print('---------------------')   # separator

print(sa / 2) # divide every value by 2
```
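Assigning through a mask changes the Series in place. If the original should stay untouched, `Series.mask` and `Series.where` return a new Series instead. A minimal sketch under the same data:

```python
import pandas as pd

sa = pd.Series({"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None},
               name="age")

capped = sa.mask(sa > 19, 88)   # replace values where the condition is True
kept = sa.where(sa > 19)        # keep values where it is True, NaN elsewhere

print(capped)
print(kept)
print(sa)                       # the original is unchanged
```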
Posted on 2021-07-07 16:19 by BabyGo000