I have a DataFrame and I want to get some non-null elements as a list.
Specifically, given df:
import pandas as pd
df = pd.DataFrame({"a": ["A", None, "B"], "b": [None, "C", "D"], "c": ["E", "F", None]})
      a     b     c
0     A  None     E
1  None     C     F
2     B     D  None
and the list of interesting columns ["a","c"], I want to extract the non-None elements of the specified columns as a list, i.e.,
["A","B","E","F"]
I guess I can do
[value for colname in interesting_columns
       for value in df.loc[df[colname].notnull(), colname]]
but I was wondering if there is some non-iterative magic trick.
Solution
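You can stack the selected columns into long format and retrieve the data with the .values accessor. Under the legacy default (dropna=True), stack() drops missing values automatically. Note that the newer implementation (future_stack=True, available since pandas 2.1) keeps missing values, so there you would chain .dropna() after stack().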
df[['a', 'c']].T.stack().values
# array(['A', 'B', 'E', 'F'], dtype=object)
Or if you want a list:
df[['a', 'c']].T.stack().tolist()
# ['A', 'B', 'E', 'F']
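The transpose (.T) is needed to get the values in the order you requested: stack() walks the frame row by row, so without it the result would be ['A', 'E', 'F', 'B'].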
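If you would rather not depend on how stack() treats missing values, one possible alternative is to flatten the selected columns in column-major order with NumPy and mask out the nulls. This is only a sketch: the helper name non_null_values is made up for illustration, and it assumes the selected columns can share a single object array.

```python
import pandas as pd

df = pd.DataFrame({"a": ["A", None, "B"], "b": [None, "C", "D"], "c": ["E", "F", None]})
interesting_columns = ["a", "c"]


def non_null_values(frame, columns):
    """Hypothetical helper: non-null values of `columns`, column by column."""
    # order="F" flattens column-major, so all of "a" comes before all of "c".
    flat = frame[columns].to_numpy().ravel(order="F")
    # pd.notna returns a boolean mask that is False for None/NaN entries.
    return flat[pd.notna(flat)].tolist()


result = non_null_values(df, interesting_columns)
print(result)  # ['A', 'B', 'E', 'F']
```

There is no transpose here because the column-major flatten already yields the requested order.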