A small optimization trick for matching and filtering across multiple CSV files: a 50x speedup
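Task description: process two CSV files, users (millions of rows) and shops (tens of thousands of rows). For each shop, look up the corresponding user by the userid stored in shops, then compute the longitude and latitude differences between the matched shop and user, i.e. (shop.lon-user.lon, shop.lat-user.lat). To make this easier to work with, load both files into DataFrames, giving df_shops and df_users.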
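To make the task concrete, here is a minimal sketch on toy data. The column names (shopid, userid, lon, lat), the sample values, and the helper name coordinate_diffs are assumptions made for illustration; the real files may be laid out differently. It expresses the lookup as a pandas merge, which is one vectorized way to state the problem and not necessarily the approach this post arrives at.

```python
import io

import pandas as pd

# Toy stand-ins for the two CSV files; column names and values are assumed.
users_csv = io.StringIO(
    "userid,lon,lat\n"
    "1,116.0,39.0\n"
    "2,121.5,31.0\n"
    "3,113.0,23.0\n"
)
shops_csv = io.StringIO(
    "shopid,userid,lon,lat\n"
    "10,2,121.0,31.5\n"
    "11,1,116.5,39.5\n"
    "12,9,100.0,20.0\n"
)

df_users = pd.read_csv(users_csv)
df_shops = pd.read_csv(shops_csv)


def coordinate_diffs(df_shops, df_users):
    """Return (shop.lon - user.lon, shop.lat - user.lat) for each matched shop."""
    # An inner join keeps only shops whose userid exists in df_users.
    merged = df_shops.merge(
        df_users, on="userid", how="inner", suffixes=("_shop", "_user")
    )
    merged["dlon"] = merged["lon_shop"] - merged["lon_user"]
    merged["dlat"] = merged["lat_shop"] - merged["lat_user"]
    return merged[["shopid", "userid", "dlon", "dlat"]]


result = coordinate_diffs(df_shops, df_users)
print(result)
```

In this toy data, shop 12 points at a userid that is absent from users, so the inner join drops it. Passing how="left" instead would keep such shops with missing differences.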
Approach before optimization:
...
for index in df_shops.index:
    lon1 = df_shops.iloc[index, 2]
    lat1 = df_shops.iloc[index, 3]
    for index2 in df_users.index:
        if(df_shops.iloc