① Column-to-row methods
- stack function: pandas.DataFrame.stack(self, level=-1, dropna=True)
Run ?pandas.DataFrame.stack to view the help documentation
- Signature: pandas.DataFrame.stack(self, level=-1, dropna=True)
- Docstring:
- Pivot a level of the (possibly hierarchical) column labels, returning a
- DataFrame (or Series in the case of an object with a single level of
- column labels) having a hierarchical index with a new inner-most level
- of row labels.
- The level involved will automatically get sorted.
- In [16]: import pandas as pd
- ...: import numpy as np
- ...: df = pd.DataFrame(np.arange(6).reshape(2,3),index=['AA','BB'],columns=
- ...: ['three','two','one'])
- ...: df
- ...:
- Out[16]:
- three two one
- AA 0 1 2
- BB 3 4 5
- In [17]: df.stack()
- Out[17]:
- AA three 0
- two 1
- one 2
- BB three 3
- two 4
- one 5
- dtype: int32
- In [18]: df.stack(level=0)
- Out[18]:
- AA three 0
- two 1
- one 2
- BB three 3
- two 4
- one 5
- dtype: int32
- In [19]: df.stack(level=-1)
- Out[19]:
- AA three 0
- two 1
- one 2
- BB three 3
- two 4
- one 5
- dtype: int32
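The three calls above return the same result because this frame has only one level of column labels, so level=0 and level=-1 refer to the same level. A minimal sketch that checks this, reusing the frame from In [16]; the variable names (stacked, same_result, bb_two) are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["AA", "BB"],
    columns=["three", "two", "one"],
)

stacked = df.stack()

# With a single level of column labels the result is a Series whose
# index has two levels: (row label, former column label).
is_series = isinstance(stacked, pd.Series)
n_levels = stacked.index.nlevels

# Read one cell by label: row "BB", former column "two".
bb_two = stacked.loc[("BB", "two")]

# level=0 and level=-1 name the same (only) column level here.
same_result = df.stack(level=0).equals(df.stack(level=-1))
```

Reading values by label, as with stacked.loc[("BB", "two")], avoids depending on the row order of the stacked result.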
- In [31]: import pandas as pd
- ...: import numpy as np
- ...: df = pd.DataFrame(np.arange(8).reshape(2,4),index=['AA','BB'],columns=
- ...: [['two','two','one','one'],['A','B','C','D']])
- ...: df
- ...:
- Out[31]:
- two one
- A B C D
- AA 0 1 2 3
- BB 4 5 6 7
- In [32]: df.stack()
- Out[32]:
- one two
- AA A NaN 0.0
- B NaN 1.0
- C 2.0 NaN
- D 3.0 NaN
- BB A NaN 4.0
- B NaN 5.0
- C 6.0 NaN
- D 7.0 NaN
- In [33]: df.stack(level=0)
- Out[33]:
- A B C D
- AA one NaN NaN 2.0 3.0
- two 0.0 1.0 NaN NaN
- BB one NaN NaN 6.0 7.0
- two 4.0 5.0 NaN NaN
- In [34]: df.stack(level=1)
- Out[34]:
- one two
- AA A NaN 0.0
- B NaN 1.0
- C 2.0 NaN
- D 3.0 NaN
- BB A NaN 4.0
- B NaN 5.0
- C 6.0 NaN
- D 7.0 NaN
- In [35]: df.stack(level=-1)
- Out[35]:
- one two
- AA A NaN 0.0
- B NaN 1.0
- C 2.0 NaN
- D 3.0 NaN
- BB A NaN 4.0
- B NaN 5.0
- C 6.0 NaN
- D 7.0 NaN
- In [36]: df.stack(level=[0,1])
- Out[36]:
- AA one C 2.0
- D 3.0
- two A 0.0
- B 1.0
- BB one C 6.0
- D 7.0
- two A 4.0
- B 5.0
- dtype: float64
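With two levels of column labels, level chooses which level moves into the row index. The sketch below repeats the frame from In [31] and reads individual cells by label rather than by position, because the ordering of the result may differ between pandas versions. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(8).reshape(2, 4),
    index=["AA", "BB"],
    columns=[["two", "two", "one", "one"], ["A", "B", "C", "D"]],
)

# level=1 (the same level as the default -1 here): the inner labels A..D
# become row labels, leaving "one" and "two" as columns.
inner = df.stack(level=1)
aa_a_two = inner.loc[("AA", "A"), "two"]  # came from column ("two", "A")
aa_a_one_missing = pd.isna(inner.loc[("AA", "A"), "one"])  # no ("one", "A") column

# level=0: the outer labels "one"/"two" become row labels,
# leaving A..D as columns.
outer = df.stack(level=0)
bb_one_d = outer.loc[("BB", "one"), "D"]

# Both levels: every column label moves into the index,
# so a Series comes back.
both = df.stack(level=[0, 1])
bb_two_b = both.loc[("BB", "two", "B")]
```

The NaN cells in Out[32] to Out[35] correspond to label combinations such as ("one", "A") that never existed as columns in the original frame.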
- unstack function: pandas.DataFrame.unstack(self, level=-1, fill_value=None)
Run ?pandas.DataFrame.unstack to view the help documentation
- Signature: pandas.DataFrame.unstack(self, level=-1, fill_value=None)
- Docstring:
- Pivot a level of the (necessarily hierarchical) index labels, returning
- a DataFrame having a new level of column labels whose inner-most level
- consists of the pivoted index labels. If the index is not a MultiIndex,
- the output will be a Series (the analogue of stack when the columns are
- not a MultiIndex).
- The level involved will automatically get sorted.
- In [20]: df
- Out[20]:
- three two one
- AA 0 1 2
- BB 3 4 5
- In [21]: df.unstack()
- Out[21]:
- three AA 0
- BB 3
- two AA 1
- BB 4
- one AA 2
- BB 5
- dtype: int32
- In [22]: df.unstack(0)
- Out[22]:
- three AA 0
- BB 3
- two AA 1
- BB 4
- one AA 2
- BB 5
- dtype: int32
- In [23]: df.unstack(-1)
- Out[23]:
- three AA 0
- BB 3
- two AA 1
- BB 4
- one AA 2
- BB 5
- dtype: int32
- In [37]: df
- Out[37]:
- two one
- A B C D
- AA 0 1 2 3
- BB 4 5 6 7
- In [38]: df.unstack()
- Out[38]:
- two A AA 0
- BB 4
- B AA 1
- BB 5
- one C AA 2
- BB 6
- D AA 3
- BB 7
- dtype: int32
- In [39]: df.unstack(0)
- Out[39]:
- two A AA 0
- BB 4
- B AA 1
- BB 5
- one C AA 2
- BB 6
- D AA 3
- BB 7
- dtype: int32
- In [40]: df.unstack(1)
- Out[40]:
- two A AA 0
- BB 4
- B AA 1
- BB 5
- one C AA 2
- BB 6
- D AA 3
- BB 7
- dtype: int32
- In [41]: df.unstack(-1)
- Out[41]:
- two A AA 0
- BB 4
- B AA 1
- BB 5
- one C AA 2
- BB 6
- D AA 3
- BB 7
- dtype: int32
- In [42]: df.unstack(level=[0,1])
- IndexError: Too many levels: Index has only 1 level, not 2
- In [44]: df
- Out[44]:
- two one
- A B C D
- AA 0 1 2 3
- BB 4 5 6 7
- In [45]: df.unstack(level=5)
- Out[45]:
- two A AA 0
- BB 4
- B AA 1
- BB 5
- one C AA 2
- BB 6
- D AA 3
- BB 7
- dtype: int32
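In the sessions above the row index has a single level, so unstack returns a Series keyed by (column label, row label), as the docstring describes for a non-MultiIndex index. A small sketch that reads that Series by label and compares it with transposing and then stacking. The comparison is added here for illustration and was not part of the original session:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["AA", "BB"],
    columns=["three", "two", "one"],
)

unstacked = df.unstack()

# The outer index level holds the former column labels,
# the inner level holds the row labels.
two_bb = unstacked.loc[("two", "BB")]

# Transposing and stacking gives the same label -> value mapping.
via_transpose = df.T.stack()
same_values = unstacked.to_dict() == via_transpose.to_dict()
```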
- melt function: pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
Run ?pandas.melt to view the help documentation
- Signature: pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
- Docstring:
- "Unpivots" a DataFrame from wide format to long format, optionally leaving
- identifier variables set.
- This function is useful to massage a DataFrame into a format where one
- or more columns are identifier variables (`id_vars`), while all other
- columns, considered measured variables (`value_vars`), are "unpivoted" to
- the row axis, leaving just two non-identifier columns, 'variable' and
- 'value'.
- In [46]: df = pd.DataFrame(np.arange(8).reshape(2,4),index=['AA','BB'],columns=
- ...: ['A','B','C','D'])
- ...: df
- ...:
- Out[46]:
- A B C D
- AA 0 1 2 3
- BB 4 5 6 7
- In [47]: pd.melt(df,id_vars=['A','C'],value_vars=['B','D'],var_name='B|D',
- ...: value_name='(B|D)_value')
- Out[47]:
- A C B|D (B|D)_value
- 0 0 2 B 1
- 1 4 6 B 5
- 2 0 2 D 3
- 3 4 6 D 7
- In [48]: pd.melt(df,id_vars=['A'],value_vars=['B','D'],var_name='B|D',
- ...: value_name='(B|D)_value')
- Out[48]:
- A B|D (B|D)_value
- 0 0 B 1
- 1 4 B 5
- 2 0 D 3
- 3 4 D 7
- In [49]: pd.melt(df,id_vars=['A'],value_vars=['B'],var_name='B',
- ...: value_name='B_value')
- Out[49]:
- A B B_value
- 0 0 B 1
- 1 4 B 5
- In [50]: df1 = pd.DataFrame(np.arange(8).reshape(2,4),
- ...: columns=[list('ABCD'),list('EFGH')])
- ...: df1
- ...:
- Out[50]:
- A B C D
- E F G H
- 0 0 1 2 3
- 1 4 5 6 7
- In [51]: pd.melt(df1,col_level=0,id_vars=['A'],value_vars=['D'])
- Out[51]:
- A variable value
- 0 0 D 3
- 1 4 D 7
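As Out[47] to Out[49] show, melt replaces the original row labels (AA, BB) with a fresh integer index. If those labels are needed in the long format, one option is to turn the index into an ordinary column first. A minimal sketch, assuming the default column name "index" that reset_index gives to an unnamed index; the var_name and value_name strings are arbitrary choices:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(8).reshape(2, 4),
    index=["AA", "BB"],
    columns=list("ABCD"),
)

# reset_index() moves the row labels into a column called "index",
# which can then be kept as an identifier variable.
long_df = pd.melt(
    df.reset_index(),
    id_vars=["index", "A"],
    value_vars=["B", "D"],
    var_name="variable",
    value_name="value",
)
```

Each row of long_df then carries its original row label next to the identifier column A, the name of the melted column, and the melted value.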
② Row-to-column methods
- unstack function: pandas.DataFrame.unstack(self, level=-1, fill_value=None)
- In [26]: df2=df.stack()
- ...: df2
- ...:
- Out[26]:
- AA three 0
- two 1
- one 2
- BB three 3
- two 4
- one 5
- dtype: int32
- In [27]: df2.unstack()
- Out[27]:
- three two one
- AA 0 1 2
- BB 3 4 5
- In [28]: df2.unstack(0)
- Out[28]:
- AA BB
- three 0 3
- two 1 4
- one 2 5
- In [29]: df2.unstack(1)
- Out[29]:
- three two one
- AA 0 1 2
- BB 3 4 5
- In [30]: df2.unstack(-1)
- Out[30]:
- three two one
- AA 0 1 2
- BB 3 4 5
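The outputs above suggest that unstack on the inner level undoes stack, while unstack(0) yields the transposed layout. A short sketch that checks both by label; the variable names are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["AA", "BB"],
    columns=["three", "two", "one"],
)

df2 = df.stack()

# Inner level ("three", "two", "one") goes back to the columns.
restored = df2.unstack()
# Outer level ("AA", "BB") goes to the columns instead.
transposed = df2.unstack(0)

bb_two = restored.loc["BB", "two"]
two_bb = transposed.loc["two", "BB"]

# Compare with the original after aligning row and column order.
round_trip_ok = bool(
    (restored.loc[df.index, df.columns] == df).all().all()
)
```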