In pandas, given a DataFrame D:
+-----+--------+--------+--------+
| | 1 | 2 | 3 |
+-----+--------+--------+--------+
| 0 | apple | banana | banana |
| 1 | orange | orange | orange |
| 2 | banana | apple | orange |
| 3 | NaN | NaN | NaN |
| 4 | apple | apple | apple |
+-----+--------+--------+--------+
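When there are three or more columns, how can I return only the rows whose values are identical across all columns, so that the result is: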
+-----+--------+--------+--------+
| | 1 | 2 | 3 |
+-----+--------+--------+--------+
| 1 | orange | orange | orange |
| 4 | apple | apple | apple |
+-----+--------+--------+--------+
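Note that rows in which every value is NaN are skipped.
With only two columns I would normally write D[D[1] == D[2]], but I don't know how to generalize this to a DataFrame with more than two columns.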
Solution:
Similar to Andy Hayden's answer, check whether each row's min equals its max (if so, all elements in the row are identical):
D[D.apply(lambda row: min(row) == max(row), axis=1)]
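As a possible alternative, the two-column comparison from the question can be generalized by comparing every column against the first one. This is a minimal sketch and not part of the original answer: it rebuilds D from the table in the question, and the names same_as_first and result are illustrative. Because NaN never compares equal to NaN, the all-NaN row is dropped. It also avoids the TypeError that the built-in min and max can raise in Python 3 when a row mixes strings with NaN.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame D from the question.
D = pd.DataFrame(
    {
        1: ["apple", "orange", "banana", np.nan, "apple"],
        2: ["banana", "orange", "apple", np.nan, "apple"],
        3: ["banana", "orange", "orange", np.nan, "apple"],
    }
)

# Compare every column with the first column, row by row (axis=0
# aligns the first column on the row index). NaN == NaN is False,
# so rows containing NaN never match.
same_as_first = D.eq(D.iloc[:, 0], axis=0)

# Keep only the rows where every column matched the first one.
result = D[same_as_first.all(axis=1)]
print(result)
```

The comparison is vectorized, so it should also scale better than a row-wise apply on large frames.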
Tags: python, pandas, dataframe
Source: https://codeday.me/bug/20190926/1819260.html