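Implementation steps:
1. Deduplicate the data twice with drop_duplicates: once removing every row that has a duplicate (keep=False), stored as data1, and once keeping a single copy of each duplicated row (keep='first'), stored as data2;
2. Take the difference of data2 and data1: pd.concat([data2, data1]).drop_duplicates(keep=False). The older data2.append(data1) form no longer works, since DataFrame.append was removed in pandas 2.0. The result holds one copy of each row that was duplicated in the original data.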
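import pandas as pd  # df is assumed to be an existing DataFrame
data1 = df.drop_duplicates(keep=False)  # drop every row that has a duplicate
data2 = df.drop_duplicates(keep='first')  # keep one copy of each duplicated row
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
# Rows that were unique in df now appear twice and are dropped; rows that were duplicated appear once and survive.
cll = pd.concat([data2, data1]).drop_duplicates(keep=False)
print(cll)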
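
To see the two steps at work, here is a minimal self-contained sketch. The sample DataFrame and its column names (id, val) are made up for illustration, and the mask-based alt line is an alternative of my own, not part of the method described above; it is included only as a cross-check.

```python
import pandas as pd

# Hypothetical sample data: rows (1, 'a') and (2, 'b') occur more than once.
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 2, 4, 1],
    "val": ["a", "a", "b", "c", "b", "d", "a"],
})

data1 = df.drop_duplicates(keep=False)    # rows that occur exactly once
data2 = df.drop_duplicates(keep="first")  # one copy of every distinct row

# Stack data2 and data1, then drop everything that now appears twice.
# Only rows that were duplicated in df appear once here, so they survive.
cll = pd.concat([data2, data1]).drop_duplicates(keep=False)

# Cross-check (assumed alternative): select all duplicated rows with a
# boolean mask, then collapse each group to a single copy.
alt = df[df.duplicated(keep=False)].drop_duplicates()

print(cll)
```

With this sample data, cll keeps one copy each of (1, 'a') and (2, 'b'), and alt selects the same rows. Note that both approaches compare all columns by default; passing subset= to drop_duplicates and duplicated would restrict the comparison to chosen columns.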