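# Python data analysis: Series in pandas

# 1 Series

A Series is a linear data structure: a one-dimensional labeled array.

By default, pandas uses the integers 0 to n-1 as the index of a Series, but you can also specify the index yourself (think of the index as the keys of a dict).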
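
To make the dict analogy concrete, here is a minimal sketch that compares the default index with an explicit one. The names `default_s` and `labeled_s` are illustrative only and do not appear elsewhere in this article.

```python
import pandas as pd

# Without an explicit index, pandas assigns the labels 0..n-1.
default_s = pd.Series([10, 20, 30])
print(list(default_s.index))   # [0, 1, 2]

# With an explicit index, each label behaves much like a dict key.
labeled_s = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(labeled_s["b"])          # 20
print(labeled_s.to_dict())     # {'a': 10, 'b': 20, 'c': 30}
```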
## 1.1 Creating a Series
```python
import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128])

print(s)
```
* Output
```
0          9
1      zheng
2    beijing
3        128
dtype: object
```
* Access part of the data
```python
print(s[1:2])  # a slice returns a Series holding the selected elements

# Output
# 1    zheng
# dtype: object
```
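
The slice above returns a one-element Series rather than the bare value. As a small sketch of the difference (using `.iloc` for explicit positional access; the names `one_value` and `sub_series` are just for illustration):

```python
import pandas as pd

s = pd.Series([9, "zheng", "beijing", 128])

one_value = s.iloc[1]      # a single position returns the element itself
sub_series = s.iloc[1:2]   # a slice returns a Series, even with one element

print(one_value)                                   # zheng
print(type(sub_series).__name__, len(sub_series))  # Series 1
```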
## 1.2 Specifying the index
```python
import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

print(s)
```
* Output
```
1          9
2      zheng
3    beijing
e        128
f        usa
g        990
dtype: object
```
* Look up a value by its index label
```python
print(s['f'])  # usa
```
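
Label lookups can also be written with `.loc`, and `Series.get` returns a fallback instead of raising `KeyError` when a label is absent. A minimal sketch, where the label `"z"` and the fallback string `"missing"` are made up for the example:

```python
import pandas as pd

s = pd.Series([9, "zheng", "beijing", 128, "usa", 990],
              index=[1, 2, 3, "e", "f", "g"])

print(s.loc["f"])              # usa
print(s.loc[1])                # 9 (the label 1, not position 1)
print(s.get("z", "missing"))   # missing (no KeyError for an absent label)
```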
## 1.3 Building a Series from a dictionary
```python
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "car": None}

sa = pd.Series(s, name="age")

print(sa)
```
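* Output (recent pandas versions keep the dict's insertion order; older versions sorted the keys alphabetically). `None` becomes `NaN`, so the values are stored as `float64`.
```
ton     20.0
mary    18.0
jack    19.0
car      NaN
Name: age, dtype: float64
```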
* Check the type
```python
print(type(sa))  # <class 'pandas.core.series.Series'>
```
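
Because `None` ends up as `NaN`, it is often useful to count or drop the missing entries. The sketch below also shows the `index=` argument selecting and ordering dict keys. The variable names and the extra key `"bob"` are assumptions for illustration.

```python
import pandas as pd

ages = pd.Series({"ton": 20, "mary": 18, "jack": 19, "car": None}, name="age")

print(ages.isna().sum())                     # 1
print(ages.dropna().astype(int).to_dict())   # {'ton': 20, 'mary': 18, 'jack': 19}

# Passing index= selects and orders the keys; keys not in the dict become NaN.
subset = pd.Series({"ton": 20, "mary": 18}, index=["mary", "ton", "bob"])
print(subset.isna().to_dict())               # {'mary': False, 'ton': False, 'bob': True}
```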
## 1.4 Building a Series from a NumPy ndarray
* Generate random numbers
```python
import pandas as pd
import numpy as np

num_abc = pd.Series(np.random.randn(5), index=list('abcde'))
num = pd.Series(np.random.randn(5))

print(num)
print(num_abc)

# Output (the values differ on every run)
# 0   -0.102860
# 1   -1.138242
# 2    1.408063
# 3   -0.893559
# 4    1.378845
# dtype: float64
# a   -0.658398
# b    1.568236
# c    0.535451
# d    0.103117
# e   -1.556231
# dtype: float64
```
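
If repeatable numbers are needed, one option is a seeded NumPy generator. This is a sketch rather than part of the original example: the seed value 42 is arbitrary, and `rng` and `values` are illustrative names.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)   # seeded generator; the seed is arbitrary
values = rng.standard_normal(5)        # five draws from a standard normal

num_abc = pd.Series(values, index=list("abcde"))

print(num_abc.dtype)                                # float64
print(num_abc.index.tolist())                       # ['a', 'b', 'c', 'd', 'e']
print(np.array_equal(num_abc.to_numpy(), values))   # True
```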
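## 1.5 Selecting data
```python
import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

# The index mixes integers and strings, so .iloc makes positional selection explicit.
print(s.iloc[1:3])     # positions 1 and 2 (start inclusive, end exclusive): zheng, beijing
print(s.iloc[[1, 3]])  # positions 1 and 3: zheng, 128
print(s.iloc[:-1])     # everything except the last element: 9, zheng, beijing, 128, usa
```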
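
With an index that contains the integers 1, 2 and 3 as labels, the same numbers can mean a position or a label. A short sketch contrasting `.iloc` (positions) with `.loc` (labels) on the same Series:

```python
import pandas as pd

s = pd.Series([9, "zheng", "beijing", 128, "usa", 990],
              index=[1, 2, 3, "e", "f", "g"])

print(s.iloc[[1, 3]].tolist())   # ['zheng', 128]  -> positions 1 and 3
print(s.loc[[1, 3]].tolist())    # [9, 'beijing']  -> labels 1 and 3
```

Spelling out `.loc` or `.iloc` avoids depending on how plain `s[...]` resolves integer keys, which has varied between pandas versions.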
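## 1.6 Operating on data
```python
import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

# Addition aligns on index labels; a label missing from either side gives NaN.
# sum0 avoids shadowing the built-in sum().
sum0 = s.iloc[1:3] + s.iloc[1:3]
sum1 = s.iloc[1:4] + s.iloc[1:4]
sum2 = s.iloc[1:3] + s.iloc[1:4]
sum3 = s.iloc[:3] + s.iloc[1:]

print(sum0)
print(sum1)
print(sum2)
print(sum3)
```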
* Output
```
2        zhengzheng
3    beijingbeijing
dtype: object
2        zhengzheng
3    beijingbeijing
e               256
dtype: object
2        zhengzheng
3    beijingbeijing
e               NaN
dtype: object
1               NaN
2        zhengzheng
3    beijingbeijing
e               NaN
f               NaN
g               NaN
dtype: object
```
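
The `NaN` rows in the output come from label alignment: a label present on only one side has nothing to add to. The sketch below shows the same effect with two small numeric Series (`a` and `b` are made-up examples) and how `Series.add` with `fill_value` treats the missing side as 0 instead.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

total = a + b                     # aligned on labels, not on position
print(total["y"])                 # 12.0
print(int(total.isna().sum()))    # 2 ("x" and "w" exist on one side only)

filled = a.add(b, fill_value=0)   # a missing side counts as 0
print(filled["w"])                # 30.0
print(filled["x"])                # 1.0
```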
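## 1.7 Searching
* Check whether a value exists (`in` applied directly to a Series tests the index labels, so search `s.values` to test the data; `s` is the Series from section 1.6)
```python
print('usa' in s.values)  # True
```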
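
Membership tests on a Series are easy to get wrong because `in` looks at the index. A small sketch of the difference, using `Series.isin` as one way to test the values:

```python
import pandas as pd

s = pd.Series([9, "zheng", "beijing", 128, "usa", 990],
              index=[1, 2, 3, "e", "f", "g"])

print("f" in s)                 # True: "f" is an index label
print("usa" in s)               # False: "usa" is a value, not a label
print(s.isin(["usa"]).any())    # True: isin() tests the values
```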
* Filter by range
```python
import pandas as pd
import numpy as np
s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
sa = pd.Series(s, name="age")
print(sa[sa>19])  # keeps only the entries greater than 19: ton, jim, lj
```
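
The expression `sa > 19` produces a boolean Series that is then used as a mask. A sketch of that mask and of combining two conditions (the names `ages`, `mask`, and `in_range` are illustrative):

```python
import pandas as pd

ages = pd.Series(
    {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None},
    name="age",
)

mask = ages > 19                     # NaN compares as False
print(int(mask.sum()))               # 3
print(ages[mask].index.tolist())     # ['ton', 'jim', 'lj']

# Combine conditions with & or |, wrapping each one in parentheses.
in_range = ages[(ages >= 19) & (ages <= 22)]
print(in_range.index.tolist())       # ['ton', 'jack', 'jim']
```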

* Median
```code
import pandas as pd
import numpy as np
s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}