In the following DataFrame, I have three columns:
Code | Category | Count
X A 89734
X A 239487
Y B 298787
Z B 87980
W C 098454
I need to add a column that flags a category as a test whenever that category contains more than one unique code (like B in the example above).
So the output I am looking for is this:
Code | Category | Count | Test_Flag
X A 89734
X A 239487
Y B 298787 T
Z B 87980 T
W C 098454
Solution
You could also use groupby with transform('nunique') to count the unique codes in each row's category, and then numpy.where to fill in the flag values.
import numpy as np
df['Test_flag'] = np.where(df.groupby('Category').Code.transform('nunique') > 1, 'T', '')
>>> df
Category Code Count Test_flag
0 A X 89734
1 A X 239487
2 B Y 298787 T
3 B Z 87980 T
4 C W 98454
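
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt by hand from the sample in the question (with Count assumed to be integer, so the leading zero in 098454 is dropped), and the intermediate name codes_per_category is only illustrative. The column is named Test_Flag here to match the spelling in the desired output.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (Count assumed to be integer)
df = pd.DataFrame({
    "Code": ["X", "X", "Y", "Z", "W"],
    "Category": ["A", "A", "B", "B", "C"],
    "Count": [89734, 239487, 298787, 87980, 98454],
})

# For every row, the number of distinct codes found in that row's category.
# transform keeps the original index, so the result lines up with df.
codes_per_category = df.groupby("Category")["Code"].transform("nunique")

# 'T' where the category has more than one unique code, empty string otherwise
df["Test_Flag"] = np.where(codes_per_category > 1, "T", "")

print(df)
```

Because transform returns one value per original row rather than one per group, the comparison can be handed straight to numpy.where without a merge back onto df.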