pandas.concat
https://pandas.pydata.org/pandas-docs/version/0.21/generated/pandas.concat.html
pandas.concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, copy=True)[source]
Concatenate pandas objects along a particular axis with optional set logic along the other axes.
Can also add a layer of hierarchical indexing on the concatenation axis, which may be useful if the labels are the same (or overlapping) on the passed axis number.
| Parameters: |
objs : a sequence or mapping of Series, DataFrame, or Panel objects
axis : {0/’index’, 1/’columns’}, default 0
join : {‘inner’, ‘outer’}, default ‘outer’
join_axes : list of Index objects
ignore_index : boolean, default False
keys : sequence, default None
levels : list of sequences, default None
names : list, default None
verify_integrity : boolean, default False
copy : boolean, default True
|
|---|---|
| Returns: |
concatenated : object, type of objs
|
See also
Series.append, DataFrame.append, DataFrame.join, DataFrame.merge
Notes
The keys, levels, and names arguments are all optional.
A walkthrough of how this method fits in with other tools for combining pandas objects can be found here.
Examples
Combine two Series.
>>> s1 = pd.Series(['a', 'b'])
>>> s2 = pd.Series(['c', 'd'])
>>> pd.concat([s1, s2])
0 a
1 b
0 c
1 d
dtype: object
Clear the existing index and reset it in the result by setting the ignore_index option to True.
>>> pd.concat([s1, s2], ignore_index=True)
0 a
1 b
2 c
3 d
dtype: object
Add a hierarchical index at the outermost level of the data with the keys option.
>>> pd.concat([s1, s2], keys=['s1', 's2',])
s1 0 a
1 b
s2 0 c
1 d
dtype: object
Label the index keys you create with the names option.
>>> pd.concat([s1, s2], keys=['s1', 's2'],
... names=['Series name', 'Row ID'])
Series name Row ID
s1 0 a
1 b
s2 0 c
1 d
dtype: object
Combine two DataFrame objects with identical columns.
>>> df1 = pd.DataFrame([['a', 1], ['b', 2]],
... columns=['letter', 'number'])
>>> df1
letter number
0 a 1
1 b 2
>>> df2 = pd.DataFrame([['c', 3], ['d', 4]],
... columns=['letter', 'number'])
>>> df2
letter number
0 c 3
1 d 4
>>> pd.concat([df1, df2])
letter number
0 a 1
1 b 2
0 c 3
1 d 4
Combine DataFrame objects with overlapping columns and return everything. Columns outside the intersection will be filled with NaN values.
>>> df3 = pd.DataFrame([['c', 3, 'cat'], ['d', 4, 'dog']],
... columns=['letter', 'number', 'animal'])
>>> df3
letter number animal
0 c 3 cat
1 d 4 dog
>>> pd.concat([df1, df3])
animal letter number
0 NaN a 1
1 NaN b 2
0 cat c 3
1 dog d 4
Combine DataFrame objects with overlapping columns and return only those that are shared by passing inner to the join keyword argument.
>>> pd.concat([df1, df3], join="inner")
letter number
0 a 1
1 b 2
0 c 3
1 d 4
Combine DataFrame objects horizontally along the x axis by passing in axis=1.
>>> df4 = pd.DataFrame([['bird', 'polly'], ['monkey', 'george']],
... columns=['animal', 'name'])
>>> pd.concat([df1, df4], axis=1)
letter number animal name
0 a 1 bird polly
1 b 2 monkey george
Prevent the result from including duplicate index values with the verify_integrity option.
>>> df5 = pd.DataFrame([1], index=['a'])
>>> df5
0
a 1
>>> df6 = pd.DataFrame([2], index=['a'])
>>> df6
0
a 2
>>> pd.concat([df5, df6], verify_integrity=True)
Traceback (most recent call last):
...
ValueError: Indexes have overlapping values: ['a']
浙公网安备 33010602011771号