A DataFrame has both a row index and a column index, and can be thought of as a dictionary of Series that share the same index.
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(100, size=12).reshape(3, 4),
                  index=['one', 'two', 'three'],
                  columns=['a', 'b', 'c', 'd'])
print(df)
============================
        a   b   c   d
one    35  35  17  50
two    53   4  51  23
three  82  12  51  97
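The "dictionary of Series" view can be made concrete with a small sketch. The variable names and the values here are illustrative assumptions, not part of the example above: each Series becomes one column, and the shared index becomes the row index.

```python
import pandas as pd

# Illustrative data (assumed for this sketch): two Series sharing one index.
idx = ['one', 'two', 'three']
series_by_column = {
    'a': pd.Series([35, 53, 82], index=idx),
    'b': pd.Series([35, 4, 12], index=idx),
}

# Building a DataFrame from a dict of Series: keys become column names.
df_from_series = pd.DataFrame(series_by_column)

# Looking up a key returns a Series, much like a dict lookup.
col_a = df_from_series['a']
print(type(col_a))

# keys() returns the column labels, mirroring dict.keys().
print(list(df_from_series.keys()))
```

This is why indexing with a single column name gives back a Series in the examples that follow.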
# Select columns by name: a single column returns a Series, multiple columns return a DataFrame
data1 = df['a']
data2 = df[['a','c']]
print(data1,type(data1))
print(data2,type(data2))
============================
one      35
two      53
three    82
Name: a, dtype: int32 <class 'pandas.core.series.Series'>
        a   c
one    35  17
two    53  51
three  82  51 <class 'pandas.core.frame.DataFrame'>
# Select rows by index label: a single row returns a Series, multiple rows return a DataFrame
data3 = df.loc['one']
data4 = df.loc[['one','two']]
print(data3,type(data3))
print(data4,type(data4))
============================
a    35
b    35
c    17
d    50
Name: one, dtype: int32 <class 'pandas.core.series.Series'>
      a   b   c   d
one  35  35  17  50
two  53   4  51  23 <class 'pandas.core.frame.DataFrame'>
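Using df[ ]
df[ ] selects columns by default: put the column name(s) inside the [ ]. (For this reason, columns are usually given explicit names rather than left with the default integer labels, which are easy to confuse with the index.) Selecting a single column returns a Series; selecting multiple columns returns a DataFrame. Every name you ask for must exist in the source data's columns, otherwise pandas raises a KeyError.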
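As a minimal sketch of the error case, assuming a small illustrative frame (`df_demo` and the flag names are made up for this example): asking df[ ] for a name that is not a column raises a KeyError, and so does passing a row label, because df[ ] looks names up in the columns.

```python
import pandas as pd

# Illustrative frame (assumed for this sketch).
df_demo = pd.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['one', 'two'])

missing_single = False
missing_in_list = False
row_label_fails = False

try:
    df_demo['z']            # 'z' is not a column
except KeyError:
    missing_single = True

try:
    df_demo[['a', 'z']]     # one missing name is enough to fail
except KeyError:
    missing_in_list = True

try:
    df_demo['one']          # 'one' is a row label, not a column
except KeyError:
    row_label_fails = True

print(missing_single, missing_in_list, row_label_fails)
```

To select a row by its label, use df.loc instead, as in the earlier example.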
data1 = df['a']
data2 = df[['b','c']]
print(data1)
print