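I had been following the textbook's pace and always felt it was very inefficient. Over the past couple of days I found a fairly good course on 慕课 (a MOOC site) and have been following the instructor. I'll keep taking notes here: partly so I can review in spare moments, and partly as a record of my own learning journey.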
import pandas as pd
data = {"ID":["000001","000002","000003","000004","000005","000006","000007"],
"name":["liming","zhaoyichun","zhangfuping","baili","niuyude","yaohua","linan"],
"gender":["男","女","男","女","男","女","男"],
"age":["16","20","18","18","18","17","18"],
"height":["1.88","1.78","1.81","1.86","1.74","1.75","1.76"]
}
# Selecting specific data
Frame = pd.DataFrame(data, index=[1, 2, 3, 4, 5, 6, 7])
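print(Frame['ID'][0:1]) # Select the first row of the ID column; the 0 is a position (the first row), not an index label, and the right bound is excluded.
# Alternatively, print(Frame.loc[[1, 3]]) selects the rows whose index labels are 1 and 3.
# Alternatively, print(Frame.loc[:, ['gender']]) selects every row of the gender column; the colon means "all rows".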
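# Alternatively, print(Frame.loc[5:6, ['gender', 'age']]) selects the gender and age columns for the rows labeled 5 and 6; unlike a positional slice, a loc slice includes the right bound.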
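# Alternatively, print(Frame.iloc[1, 3]) selects by position the value in the second row and fourth column.
# Alternatively, print(Frame.iloc[1:2, 2:4]) selects by position the second row and the third and fourth columns; the right bound is excluded.
# Alternatively, print(Frame.iloc[[3, 1], [0, 2]]) selects the fourth and second rows and the first and third columns.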
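# Alternatively, print(Frame[(Frame['age'].astype(int) > 17) & (Frame['height'].astype(float) > 1.7)]) filters rows by condition; age and height are stored as strings here, so they are converted to numbers before comparing. To test for equality, for example against a boolean value, use "==".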
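
The selection rules in the comments above are easy to mix up, so here is a minimal sketch that puts label-based `loc`, position-based `iloc`, and a boolean filter side by side. It uses a trimmed four-row copy of the same table; the names `frame`, `by_label`, `by_position`, `numeric`, and `matched_ids`, and the 1.8 height threshold, are my own choices for illustration and do not come from the course.

```python
import pandas as pd

# A trimmed copy of the table above (four rows); every value is a string.
data = {
    "ID": ["000001", "000002", "000003", "000004"],
    "name": ["liming", "zhaoyichun", "zhangfuping", "baili"],
    "gender": ["男", "女", "男", "女"],
    "age": ["16", "20", "18", "18"],
    "height": ["1.88", "1.78", "1.81", "1.86"],
}
frame = pd.DataFrame(data, index=[1, 2, 3, 4])

# loc is label-based, and its slices include the right bound.
by_label = frame.loc[2:3, ["gender", "age"]]

# iloc is position-based, and its slices exclude the right bound.
by_position = frame.iloc[1:3, [2, 3]]

# Both selections pick the rows labeled 2 and 3 and the same two columns.
same_rows = by_label.equals(by_position)

# Convert the string columns to numbers before comparing them.
numeric = frame.astype({"age": int, "height": float})
filtered = numeric[(numeric["age"] > 17) & (numeric["height"] > 1.8)]
matched_ids = filtered["ID"].tolist()

print(by_label)
print(same_rows)
print(matched_ids)
```

Note the parentheses around each condition: `&` binds more tightly than `>`, so leaving them out changes how the expression is parsed.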