Intro
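Hive has an explode function that turns a single row into multiple rows. pandas offers a similar feature, DataFrame.explode, added in version 0.25.0. Let's go straight to an example.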
Signature: df1.explode(column: Union[str, Tuple], ignore_index: bool = False) -> 'DataFrame'
Docstring:
Transform each element of a list-like to a row, replicating index values.
.. versionadded:: 0.25.0
Parameters
----------
column : str or tuple
Column to explode.
ignore_index : bool, default False
If True, the resulting index will be labeled 0, 1, …, n - 1.
.. versionadded:: 1.1.0
Returns
-------
DataFrame
Exploded lists to rows of the subset columns;
index will be duplicated for these rows.
Raises
------
ValueError :
if columns of the frame are not unique.
See Also
--------
DataFrame.unstack : Pivot a level of the (necessarily hierarchical)
index labels.
DataFrame.melt : Unpivot a DataFrame from wide format to long format.
Series.explode : Explode a DataFrame from list-like columns to long format.
Notes
-----
This routine will explode list-likes including lists, tuples,
Series, and np.ndarray. The result dtype of the subset rows will
be object. Scalars will be returned unchanged. Empty list-likes will
result in a np.nan for that row.
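The behaviours listed in the Notes can be checked with a small sketch. The frame below is made up for illustration (the names df_notes, key, and vals are not from the docstring); it mixes a list, a tuple, a scalar, and an empty list in one column.

```python
import pandas as pd

# Hypothetical sample data: a list, a tuple, a scalar, and an empty list
df_notes = pd.DataFrame({
    "key": ["w", "x", "y", "z"],
    "vals": [[1, 2], (3,), 4, []],
})

# Expand every list-like in "vals" into one row per element
exploded = df_notes.explode("vals")

print(exploded)
print(exploded.index.tolist())  # index labels are repeated for expanded rows
```

In this sketch the list and the tuple are expanded with their index labels repeated, the scalar row passes through unchanged, and the empty list turns into a single NaN row.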
Case
import pandas as pd
pd.__version__
'1.1.5'
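# Renamed from id/id2/id3 so the built-in id() is not shadowed
ids = ['a', 'b', 'c']
id2s = [4, 6, [2, 3, 8]]  # mixes scalars with one list; only the list is expanded
id3s = [1, 1, 1]
df = pd.DataFrame({'id': ids, 'id2': id2s, 'id3': id3s})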
df
|   | id | id2 | id3 |
|---|---|---|---|
| 0 | a | 4 | 1 |
| 1 | b | 6 | 1 |
| 2 | c | [2, 3, 8] | 1 |
df.explode('id2')
|   | id | id2 | id3 |
|---|---|---|---|
| 0 | a | 4 | 1 |
| 1 | b | 6 | 1 |
| 2 | c | 2 | 1 |
| 2 | c | 3 | 1 |
| 2 | c | 8 | 1 |
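The exploded rows keep the repeated index label 2, and per the docstring the exploded column ends up with object dtype. A minimal sketch of one way to tidy both, assuming pandas 1.1.0 or later for ignore_index (the variable name flat and the cast to int64 are illustrative choices, not part of the original example):

```python
import pandas as pd

# Same data as the example above
df = pd.DataFrame({
    "id": ["a", "b", "c"],
    "id2": [4, 6, [2, 3, 8]],
    "id3": [1, 1, 1],
})

# ignore_index=True relabels the result 0..n-1 instead of repeating labels
flat = df.explode("id2", ignore_index=True)

# Every value is an integer here, so the column can be cast to int64
flat["id2"] = flat["id2"].astype("int64")

print(flat)
print(flat.dtypes)
```

On versions before 1.1.0, calling df.explode("id2").reset_index(drop=True) should give the same relabelled index.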
2021-03-25, Jiulonghu, Jiangning District, Nanjing