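pandas: renaming columns, modifying specified columns, deleting specified columns, adding and removing columns, turning a column into rows
pandas explode(): exploding a column into multiple rows | pandas tutorial - 盖若 https://gairuo.com/p/pandas-explode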
pandas.DataFrame.explode — pandas 2.2.3 documentation https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.explode.html
pandas.DataFrame.explode
DataFrame.explode(column, ignore_index=False)
Transform each element of a list-like to a row, replicating index values.
Parameters:
    column : IndexLabel
        Column(s) to explode. For multiple columns, specify a non-empty list in which each element is a str or tuple; the list-like data in all specified columns must have matching length on each row of the frame.
        Added in version 1.3.0: Multi-column explode.
    ignore_index : bool, default False
        If True, the resulting index will be labeled 0, 1, …, n - 1.
Returns:
    DataFrame
        Exploded lists to rows of the subset columns; index will be duplicated for these rows.
Raises:
    ValueError
        - If columns of the frame are not unique.
        - If the specified columns to explode is an empty list.
        - If the specified columns to explode do not have a matching count of elements row-wise in the frame.
See also
    DataFrame.unstack : Pivot a level of the (necessarily hierarchical) index labels.
    DataFrame.melt : Unpivot a DataFrame from wide format to long format.
    Series.explode : Explode a DataFrame from list-like columns to long format.
Notes
This routine will explode list-likes including lists, tuples, sets, Series, and np.ndarray. The result dtype of the subset rows will be object. Scalars will be returned unchanged, and empty list-likes will result in a np.nan for that row. In addition, the ordering of rows in the output will be non-deterministic when exploding sets.
Reference the user guide for more examples.
Examples
>>> df = pd.DataFrame({'A': [[0, 1, 2], 'foo', [], [3, 4]],
...                    'B': 1,
...                    'C': [['a', 'b', 'c'], np.nan, [], ['d', 'e']]})
>>> df
           A  B          C
0  [0, 1, 2]  1  [a, b, c]
1        foo  1        NaN
2         []  1         []
3     [3, 4]  1     [d, e]
Single-column explode.
>>> df.explode('A')
     A  B          C
0    0  1  [a, b, c]
0    1  1  [a, b, c]
0    2  1  [a, b, c]
1  foo  1        NaN
2  NaN  1         []
3    3  1     [d, e]
3    4  1     [d, e]
Multi-column explode.
>>> df.explode(list('AC'))
     A  B    C
0    0  1    a
0    1  1    b
0    2  1    c
1  foo  1  NaN
2  NaN  1  NaN
3    3  1    d
3    4  1    e
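A frequent "column to rows" case is a delimited string column rather than a column of lists. The sketch below assumes a hypothetical frame with a comma-separated tags column (the column names and data are illustrative, not from the pandas docs). It splits each string into a list and then passes ignore_index=True so the result gets a fresh 0..n-1 index.

```python
import pandas as pd

# Hypothetical data: the column names and values are illustrative only.
df = pd.DataFrame({"user": ["ann", "bob"], "tags": ["x,y", "z"]})

# str.split turns each string into a list, which explode() can expand into rows.
long_df = df.assign(tags=df["tags"].str.split(",")).explode("tags", ignore_index=True)

print(long_df)
```

Without ignore_index=True, both rows produced from "x,y" would keep the original index label 0.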
import pandas as pd
# Create a simple DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})
# Add a new column
df['C'] = [7, 8, 9]
print(df)
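Besides direct assignment, columns can be added and removed with methods that make the in-place versus copy behaviour explicit. A minimal sketch, reusing the A/B frame above; the ID column and its values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# assign() returns a new frame with the extra column; df itself is unchanged.
with_c = df.assign(C=df["A"] + df["B"])

# insert() places a column at a given position and modifies the frame in place.
with_c.insert(0, "ID", [10, 20, 30])

# drop(columns=...) returns a new frame without the named columns.
without_b = with_c.drop(columns=["B"])

print(without_b)
```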
from typing import Dict

class CommonQuery:
    @staticmethod
    def modify_df_mask(df: pd.DataFrame, zone_index_name_map: Dict):
        """Replace the values of the given DataFrame columns with their mapped labels."""
        # Writing a string label straight into an int64 column triggers:
        #   FutureWarning: Setting an item of incompatible dtype is deprecated and will
        #   raise in a future error of pandas. Value '[320, 439)' has dtype incompatible
        #   with int64, please explicitly cast to a compatible dtype first.
        # One fix is to cast first and assign the result back (inplace=True on the
        # copy returned by astype() would not change df):
        #   df[column_name] = df[column_name].astype(str).mask(df[column_name] == order_index, order_index_val)
        # This method avoids the cast by building a temporary column instead.
        rename = {}
        for column_name, order_index_lst in zone_index_name_map.items():
            tmp_column_name = f'tmp_{column_name}'
            rename[tmp_column_name] = column_name
            # Iterate over values, not positions, so a non-default index also works.
            df[tmp_column_name] = [order_index_lst[v] for v in df[column_name]]
            df.drop([column_name], axis=1, inplace=True)
        CommonQuery.modify_df_rename(df, rename)

    @staticmethod
    def modify_df_rename(df: pd.DataFrame, name_to_show_dict: Dict):
        """Rename DataFrame columns in place."""
        if not df.empty and name_to_show_dict:
            df.rename(columns=name_to_show_dict, inplace=True)
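The relabel-then-rename step can often be written without a temporary column or inplace=True. A minimal sketch, assuming the lookup is a dict from the stored integer code to its display label; the zone and count columns, the label strings, and name_to_show are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"zone": [0, 1, 0], "count": [5, 7, 9]})

# Hypothetical lookup: stored integer code -> label to display.
zone_labels = {0: "[0, 320)", 1: "[320, 439)"}
name_to_show = {"zone": "Zone", "count": "Count"}

# map() builds a new Series of labels, so no string is written into an int64 column.
result = df.assign(zone=df["zone"].map(zone_labels)).rename(columns=name_to_show)

print(result)
```

Unlike the temporary-column approach, this keeps the relabelled column in its original position and leaves df untouched.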
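DataFrame.eval(self, expr, inplace=False, **kwargs)
Evaluate a string describing operations on DataFrame columns.
Operates on columns only, not on specific rows or elements. This allows eval to run arbitrary code, which can make you vulnerable to code injection if you pass user input to this function.
Parameters:
    expr : str
        The expression string to evaluate.
    inplace : bool, default False
        If the expression contains an assignment, whether to perform the operation in place and mutate the existing DataFrame. Otherwise, a new DataFrame is returned.
        New in version 0.18.0.
    kwargs :
Returns:
    ndarray, scalar, or pandas object
        The result of the evaluation.
Notes
For more details, see the API documentation for eval(). For detailed examples, see "Enhancing performance with eval".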
Examples
>>> df = pd.DataFrame({'A': range(1, 6), 'B': range(10, 0, -2)})
>>> df
   A   B
0  1  10
1  2   8
2  3   6
3  4   4
4  5   2
>>> df.eval('A + B')
0    11
1    10
2     9
3     8
4     7
dtype: int64
Assignment is allowed, though by default the original DataFrame is not modified.
>>> df.eval('C = A + B')
   A   B   C
0  1  10  11
1  2   8  10
2  3   6   9
3  4   4   8
4  5   2   7
>>> df
   A   B
0  1  10
1  2   8
2  3   6
3  4   4
4  5   2
Use inplace=True to modify the original DataFrame.
>>> df.eval('C = A + B', inplace=True)
>>> df
   A   B   C
0  1  10  11
1  2   8  10
2  3   6   9
3  4   4   8
4  5   2   7
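Expressions passed to eval() can also refer to ordinary Python variables by prefixing them with @. A minimal sketch on the same A/B frame; the scale variable is a hypothetical local added for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": range(1, 6), "B": range(10, 0, -2)})

scale = 2  # hypothetical local variable, referenced in the expression as @scale

# Without inplace=True, eval() returns a new DataFrame and leaves df unchanged.
scaled = df.eval("C = (A + B) * @scale")

print(scaled["C"].tolist())
```

Because the expression is a string that gets evaluated, it is safer to build it from fixed text than from user input.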
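The pandas mask() method replaces the values of a row or column wherever a condition holds. Here we use such a condition to change every "female" in the gender column to 0.
Syntax: df['column_name'] = df['column_name'].mask(df['column_name'] == 'some_value', value)
Note: the form df['column_name'].mask(..., inplace=True) acts on a selected column and is not guaranteed to update df when Copy-on-Write is enabled, so assigning the result back is the safer pattern.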
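The gender replacement described above could look like the following. This is only a sketch: the frame, its column names, and its values are made up for illustration, and the result is assigned back to the column instead of relying on inplace=True.

```python
import pandas as pd

# Hypothetical data; dtype=object lets the column hold both strings and integers.
df = pd.DataFrame(
    {"name": ["Li", "Wang", "Zhao"], "gender": ["female", "male", "female"]},
    dtype=object,
)

# mask() replaces values where the condition is True and keeps the rest.
df["gender"] = df["gender"].mask(df["gender"] == "female", 0)

print(df["gender"].tolist())
```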