1. Count how many times each value occurs
df["category"].value_counts()
10 20125
1 2686
0 1211
3 720
4 228
2 144
Name: category, dtype: int64
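A minimal sketch of the same call on made-up data (the values below are hypothetical, not the dataset above). It also shows two variations: normalize=True for relative frequencies, and .get() for looking up the count of a single value.

```python
import pandas as pd

# Hypothetical toy data; the real "category" column comes from the author's dataset.
df = pd.DataFrame({"category": [10, 10, 1, 0, 10, 1]})

# Count per distinct value, most frequent first.
counts = df["category"].value_counts()

# Same, but as a fraction of all rows instead of a raw count.
shares = df["category"].value_counts(normalize=True)

# Count for one specific value; falls back to 0 if the value never occurs.
n_tens = counts.get(10, 0)
n_missing = counts.get(99, 0)

print(counts)
print(n_tens, n_missing)
```

value_counts() skips NaN by default; pass dropna=False if missing values should be counted too.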
2. Match the rows that two files have in common
import pandas as pd

buildFeatPath = r"./Dataset/train_nj_polyMark(newDensity)_0205.csv"
# radius is not defined in this snippet; it must be set before this line runs.
poi_onehot_path = r"./result/onehot_{}m.csv".format(radius)
df_poi = pd.read_csv(poi_onehot_path, sep=";", usecols=["build_id"])
# print(df_poi.shape)
df_build = pd.read_csv(buildFeatPath, sep=";")
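# Intersection of the two DataFrames: an inner join keeps only rows whose key appears in both.
# With no `on` argument, merge joins on the columns the two frames share (here, build_id).
# http://bluewhale.cc/2016-08-15/python-merge.html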
df_inner = pd.merge(df_poi, df_build, how="inner")
Reference: http://bluewhale.cc/2016-08-15/python-merge.html
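A small sketch of the inner merge on in-memory frames instead of the CSV files. The "height" column and all values are made up for illustration. Naming the key explicitly with on="build_id" avoids accidentally joining on other columns the two frames happen to share, and indicator=True on an outer merge is one way to see which ids failed to match.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files.
df_poi = pd.DataFrame({"build_id": [1, 2, 3]})
df_build = pd.DataFrame({"build_id": [2, 3, 4], "height": [10.0, 20.0, 30.0]})

# Inner join: keep only the build_id values present in both frames.
df_inner = pd.merge(df_poi, df_build, how="inner", on="build_id")

# Outer join with an indicator column to inspect the rows that did not match.
df_check = pd.merge(df_poi, df_build, how="outer", on="build_id", indicator=True)
unmatched = df_check[df_check["_merge"] != "both"]

print(df_inner)
print(unmatched)
```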
3. Sort data read with pandas
# ascending: whether to sort in ascending order; defaults to True, use False for descending
# "build_id" is sorted ascending, "cate_two" descending
df = df_poi.sort_values(by=["build_id", "cate_two"], ascending=[True, False])
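A quick sketch of the mixed-direction sort on made-up rows, to show how the two keys interact: rows are ordered by build_id first, and cate_two only breaks ties within the same build_id. The data here is hypothetical.

```python
import pandas as pd

# Hypothetical toy data with the same column names as above.
df_poi = pd.DataFrame({"build_id": [2, 1, 2, 1], "cate_two": [5, 7, 9, 3]})

# build_id ascending, then cate_two descending within each build_id.
df_sorted = df_poi.sort_values(by=["build_id", "cate_two"], ascending=[True, False])

# sort_values keeps the original index labels; reset them for a clean 0..n-1 index.
df_sorted = df_sorted.reset_index(drop=True)

print(df_sorted)
```

sort_values returns a new DataFrame and leaves df_poi unchanged, which is why the result is assigned to a new name.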