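import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# One-hot encode the lists in df['genres']: one 0/1 column per distinct genre
mlb = MultiLabelBinarizer()
df1 = pd.DataFrame(mlb.fit_transform(df['genres']), columns=mlb.classes_, index=df.index)
# Join the indicator columns back onto the original frame by index
df = df.join(df1)
print(df)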
                        genres  Action  Adventure  Comedy  Drama  Family  \
0                      [Drama]       0          0       0      1       0
1      [Music, Drama, Romance]       0          0       0      1       0
2  [Action, Adventure, Comedy]       1          1       1      0       0
3   [Thriller, Romance, Drama]       0          0       0      1       0
4          [Adventure, Family]       0          1       0      0       1

   Music  Romance  Thriller
0      0        0         0
1      1        1         0
2      0        0         0
3      0        1         1
4      0        0         0
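For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The sample frame is an assumption rebuilt from the printed rows, and the names `indicators` and `result` are illustrative rather than taken from the answer.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Sample data reconstructed from the printed output (an assumption)
df = pd.DataFrame({
    "genres": [
        ["Drama"],
        ["Music", "Drama", "Romance"],
        ["Action", "Adventure", "Comedy"],
        ["Thriller", "Romance", "Drama"],
        ["Adventure", "Family"],
    ]
})

mlb = MultiLabelBinarizer()
# fit_transform returns a 0/1 matrix with one column per distinct genre;
# mlb.classes_ holds the genre names in sorted order
indicators = pd.DataFrame(
    mlb.fit_transform(df["genres"]),
    columns=mlb.classes_,
    index=df.index,
)

# Join on the index so each row keeps its original genres list
result = df.join(indicators)
print(result)
```

Passing `index=df.index` matters when the frame does not have a default index: without it, `join` would align on mismatched labels.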
If you need to restrict the genres to a given list, add reindex:
genres = ['Action', 'Adventure', 'Comedy', 'Drama']
df1 = pd.DataFrame(mlb.fit_transform(df['genres']),columns=mlb.classes_, index=df.index)
df = df.join(df1.reindex(columns=g