Question:
How can I remove duplicate rows of a 2-dimensional numpy array?
data = np.array([[1,8,3,3,4],
                 [1,8,9,9,4],
                 [1,8,3,3,4]])
The answer should be as follows:
ans = array([[1,8,3,3,4],
             [1,8,9,9,4]])
If there are two rows that are the same, then I would like to remove one "duplicate" row.
Answer:
You can use np.unique. Since you want unique rows rather than unique elements, each row has to be treated as a single item. Starting from:
import numpy as np
data = np.array([[1,8,3,3,4],
                 [1,8,9,9,4],
                 [1,8,3,3,4]])
Just applying np.unique to the data array will result in this:
>>> uniques = np.unique(data)
>>> uniques
array([1, 3, 4, 8, 9])
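That is the sorted unique elements of the flattened array, not the unique rows. Converting the rows to tuples is not enough on its own, because np.unique turns the list of tuples back into a 2-D array and flattens it. Passing axis=0 (NumPy 1.13 or later) makes it compare whole rows:
new_array = [tuple(row) for row in data]
uniques = np.unique(new_array, axis=0)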
which prints:
>>> uniques
array([[1, 8, 3, 3, 4],
       [1, 8, 9, 9, 4]])
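
On NumPy 1.13 or later the tuple conversion can likely be skipped, since np.unique accepts the 2-D array directly when axis=0 is given. The sketch below shows that, plus one possible way to keep the rows in their original order instead of sorted order. The names unique_rows, first_idx and unique_rows_in_order are only illustrative and do not come from the original answer.

```python
import numpy as np

data = np.array([[1, 8, 3, 3, 4],
                 [1, 8, 9, 9, 4],
                 [1, 8, 3, 3, 4]])

# axis=0 makes np.unique compare whole rows; the result is sorted row-wise.
unique_rows = np.unique(data, axis=0)

# return_index gives the position of each unique row's first occurrence;
# sorting those positions restores the original row order.
_, first_idx = np.unique(data, axis=0, return_index=True)
unique_rows_in_order = data[np.sort(first_idx)]

print(unique_rows)
print(unique_rows_in_order)
```

For this particular data the two results happen to be identical, because the first occurrences already appear in sorted order. They would differ if, say, the [1,8,9,9,4] row came first.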