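The original goal was to trace industry chains, but the data volume turned out to be too large, so this approach is not practical for that purpose.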
Converting two DataFrame columns into a dictionary's keys and values
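import pandas as pd

df = pd.DataFrame({"a": [1, 1, 1, 2, 2, 2, 2, 3], "b": ["q", "q", "q", "q", "q", "q", "q", "w"], "c": [0, 0, 0, 0, 0, 0, 0, 0], "d": [1, 1, 1, 1, 1, 1, 1, 1]})
print(df)
# Use column "a" as the index, then read column "b" as a dict of {a: b}.
# Repeated values in "a" collapse to a single key; the last row wins.
aa = df[["a", "b"]].set_index("a").to_dict()["b"]
print(aa)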
Output:
   a  b  c  d
0  1  q  0  1
1  1  q  0  1
2  1  q  0  1
3  2  q  0  1
4  2  q  0  1
5  2  q  0  1
6  2  q  0  1
7  3  w  0  1
{1: 'q', 2: 'q', 3: 'w'}
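Because a dictionary holds one value per key, the conversion above silently drops information whenever a key in the first column is paired with more than one value in the second. The following is a minimal sketch of two alternatives, using a small made-up DataFrame (not the one above) in which key 1 maps to both "q" and "r". The names `last_wins` and `all_values` are illustrative only.

```python
import pandas as pd

# Hypothetical data: key 1 is paired with two different values.
df = pd.DataFrame({"a": [1, 1, 2, 2, 3], "b": ["q", "r", "q", "q", "w"]})

# Same "last row wins" behaviour as set_index(...).to_dict(), written with zip.
last_wins = dict(zip(df["a"], df["b"]))

# Keep every distinct value per key, in order of first appearance.
all_values = df.groupby("a")["b"].unique().apply(list).to_dict()

print(last_wins)
print(all_values)
```

Which form is appropriate depends on the data: if each key really has exactly one value, the one-line version is enough; otherwise the grouped version keeps the pairs that would be overwritten.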
Finding all parent nodes
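# Maps each node to its parent. Note that "南京企业爱普" appears twice as a key;
# a dict keeps only the last value, so its parent here is "河南胡辣汤".
parent_of = {"南京企业爱普": "南京企业爱普22", "南京企业爱普22": "南京企业爱普33", "南京企业爱普": "河南胡辣汤", "南京企业爱普1": "武汉热干面", "北京烤鸭": "西安肉夹馍", "深圳前海": "南京企业爱普"}

def get_father(word):
    # Walk up the chain recursively until a node with no recorded parent is reached.
    # A cycle in parent_of would recurse until Python's recursion limit is hit.
    if word not in parent_of:
        return []
    father = parent_of[word]
    return [father] + get_father(father)

print(get_father("深圳前海"))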
Output:
['南京企业爱普', '河南胡辣汤']
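The output above contains only two ancestors because a dictionary can record a single parent per node, so the pair with "南京企业爱普22" is overwritten. The sketch below is one possible variant, not the original method: it assumes the relations are stored as a list of (child, parent) pairs, collects every parent of each node, and walks upward iteratively with a visited set so that cycles or long chains do not cause runaway recursion. The names `edges`, `parents`, and `get_ancestors` are introduced here for illustration.

```python
from collections import defaultdict, deque

# Assumed input format: (child, parent) pairs, so a child may have several parents.
edges = [
    ("南京企业爱普", "南京企业爱普22"),
    ("南京企业爱普22", "南京企业爱普33"),
    ("南京企业爱普", "河南胡辣汤"),
    ("南京企业爱普1", "武汉热干面"),
    ("北京烤鸭", "西安肉夹馍"),
    ("深圳前海", "南京企业爱普"),
]

# Build child -> list of parents.
parents = defaultdict(list)
for child, parent in edges:
    parents[child].append(parent)

def get_ancestors(word):
    """Return every ancestor of word in breadth-first order, each at most once."""
    seen = {word}
    order = []
    queue = deque([word])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:  # guards against cycles and repeated paths
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order

print(get_ancestors("深圳前海"))
# ['南京企业爱普', '南京企业爱普22', '河南胡辣汤', '南京企业爱普33']
```

The breadth-first order lists nearer ancestors first. Whether this scales to the full data set is a separate question that this sketch does not answer.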