Pandas melt() Function
The pandas melt() function changes a DataFrame from wide format to long format. One or more columns serve as identifiers. All the remaining columns are treated as values and unpivoted to the row axis, leaving only two non-identifier columns: variable and value.
1. Pandas melt() Example
The melt() function is easiest to understand through an example.
import pandas as pd
d1 = {"Name": ["Pankaj", "Lisa", "David"], "ID": [1, 2, 3], "Role": ["CEO", "Editor", "Author"]}
df = pd.DataFrame(d1)
print(df)
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"])
print(df_melted)
Output:
Name ID Role
0 Pankaj 1 CEO
1 Lisa 2 Editor
2 David 3 Author
ID variable value
0 1 Name Pankaj
1 2 Name Lisa
2 3 Name David
3 1 Role CEO
4 2 Role Editor
5 3 Role Author
We can pass the var_name and value_name parameters to rename the "variable" and "value" columns.
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"], var_name="Attribute", value_name="Value")
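The call above does not print anything, so here is a minimal, self-contained sketch of what the renaming does. It rebuilds the same sample DataFrame and inspects the resulting column names and shape. The names "Attribute" and "Value" are only the labels chosen in this article, not pandas defaults.

```python
import pandas as pd

# Same sample data as in the first example
df = pd.DataFrame({"Name": ["Pankaj", "Lisa", "David"],
                   "ID": [1, 2, 3],
                   "Role": ["CEO", "Editor", "Author"]})

# var_name and value_name replace the default "variable" and "value" labels
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"],
                    var_name="Attribute", value_name="Value")

print(list(df_melted.columns))  # ['ID', 'Attribute', 'Value']
print(df_melted.shape)          # (6, 3): 3 rows x 2 melted columns
```

Only the column labels change. The rows are the same as in the first melted output.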
2. Multiple Columns as id_vars
Let's see what happens when we pass multiple columns as the id_vars parameter.
df_melted = pd.melt(df, id_vars=["ID", "Name"], value_vars=["Role"])
print(df_melted)
Output:
ID Name variable value
0 1 Pankaj Role CEO
1 2 Lisa Role Editor
2 3 David Role Author
3. Skipping Columns in melt() Function
It's not required to use all the columns from the source DataFrame. Any column that appears in neither id_vars nor value_vars is dropped from the result. Let's skip the "ID" column in the next example.
df_melted = pd.melt(df, id_vars=["Name"], value_vars=["Role"])
print(df_melted)
Output:
Name variable value
0 Pankaj Role CEO
1 Lisa Role Editor
2 David Role Author
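The opposite case is also worth a quick look. When value_vars is omitted altogether, melt() unpivots every column that is not listed in id_vars. The sketch below reuses the article's sample data; the variable name df_all is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Pankaj", "Lisa", "David"],
                   "ID": [1, 2, 3],
                   "Role": ["CEO", "Editor", "Author"]})

# No value_vars: every non-identifier column ("Name" and "Role") is melted
df_all = pd.melt(df, id_vars=["ID"])

print(df_all["variable"].unique().tolist())  # ['Name', 'Role']
print(df_all.shape)                          # (6, 3)
```

So columns are skipped only when value_vars is given explicitly and leaves them out.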
4. Unmelting DataFrame using pivot() function
We can use the pivot() function to unmelt a DataFrame object and recover the original data. The "index" argument of pivot() should be the same as the "id_vars" value passed to melt(). The "columns" argument should be the name of the "variable" column.
import pandas as pd
d1 = {"Name": ["Pankaj", "Lisa", "David"], "ID": [1, 2, 3], "Role": ["CEO", "Editor", "Author"]}
df = pd.DataFrame(d1)
# print(df)
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"], var_name="Attribute", value_name="Value")
print(df_melted)
# unmelting using pivot()
df_unmelted = df_melted.pivot(index='ID', columns='Attribute')
print(df_unmelted)
Output:
ID Attribute Value
0 1 Name Pankaj
1 2 Name Lisa
2 3 Name David
3 1 Role CEO
4 2 Role Editor
5 3 Role Author
            Value
Attribute    Name    Role
ID
1          Pankaj     CEO
2            Lisa  Editor
3           David  Author
The unmelted DataFrame holds the same values as the original DataFrame. However, its columns form a two-level index under "Value" and "ID" is the row index, so a few minor changes are needed to get back to a flat layout like the original. Note that the column order still differs from the original: "ID" now comes first.
df_unmelted = df_unmelted['Value'].reset_index()
df_unmelted.columns.name = None
print(df_unmelted)
Output:
ID Name Role
0 1 Pankaj CEO
1 2 Lisa Editor
2 3 David Author
Reference: pandas melt() API Doc
Translated from: https://www.journaldev.com/33398/pandas-melt-unmelt-pivot-function