Pandas Study Notes (6): Renaming and Combining

1. Theory

1. The renaming function rename()

rename() changes column or index labels by passing a dictionary that maps old names to new ones. For example, renaming the points column to score:

reviews.rename(columns={'points': 'score'})


Index labels can be renamed the same way, via the index parameter:

reviews.rename(index={0: 'firstEntry', 1: 'secondEntry'})


Both the row index and the column index can carry their own name; rename_axis() sets it:

reviews.rename_axis("wines", axis='rows').rename_axis("fields", axis='columns')

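The snippets above assume the Kaggle reviews DataFrame. As a minimal self-contained sketch (the tiny two-row DataFrame below is an assumption, not the real wine-reviews data), the same calls can be tried like this:

import pandas as pd

# Toy stand-in for the reviews data (hypothetical values).
df = pd.DataFrame({'points': [87, 92], 'country': ['Italy', 'Portugal']})

renamed_cols = df.rename(columns={'points': 'score'})                 # change a column label
renamed_rows = df.rename(index={0: 'firstEntry', 1: 'secondEntry'})   # change index labels
named_axes = df.rename_axis('wines', axis='rows').rename_axis('fields', axis='columns')
print(named_axes)

Note that rename() and rename_axis() return new DataFrames; the original df is left unchanged unless the result is assigned back.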

2. The combining functions are concat(), join(), and merge(); most of what merge() can do is also covered by join()

concat() stacks together DataFrames (or Series) that share the same fields (columns). For example, the Canadian and British YouTube trending-video files have identical columns:

canadian_youtube = pd.read_csv("../input/youtube-new/CAvideos.csv")
british_youtube = pd.read_csv("../input/youtube-new/GBvideos.csv")

pd.concat([canadian_youtube, british_youtube])

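Since the CSV files from the Kaggle dataset are not included here, a minimal runnable sketch of the same idea with made-up frames (the column names and values below are assumptions):

import pandas as pd

# Two toy frames with identical columns, standing in for CAvideos.csv and GBvideos.csv.
canadian = pd.DataFrame({'title': ['a', 'b'], 'views': [10, 20]})
british = pd.DataFrame({'title': ['c'], 'views': [30]})

stacked = pd.concat([canadian, british])                            # keeps each frame's own row labels
stacked_clean = pd.concat([canadian, british], ignore_index=True)   # renumber rows 0..n-1
print(stacked_clean)

ignore_index=True is worth knowing because plain concat() otherwise leaves duplicate row labels in the result.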

join() combines DataFrames that have an index in common. Here both tables are given a composite index of title and trending_date, so the join lines up videos that were trending in both countries on the same day; lsuffix and rsuffix keep the overlapping column names apart:

left = canadian_youtube.set_index(['title', 'trending_date'])
right = british_youtube.set_index(['title', 'trending_date'])

left.join(right, lsuffix='_CAN', rsuffix='_UK')

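Again as a self-contained sketch (the tiny frames and values below are made up, not the real YouTube data):

import pandas as pd

# Toy frames sharing a (title, trending_date) index, standing in for the two YouTube tables.
can = pd.DataFrame({'title': ['x', 'y'],
                    'trending_date': ['17.14.11', '17.14.11'],
                    'views': [100, 200]}).set_index(['title', 'trending_date'])
uk = pd.DataFrame({'title': ['x'],
                   'trending_date': ['17.14.11'],
                   'views': [300]}).set_index(['title', 'trending_date'])

joined = can.join(uk, lsuffix='_CAN', rsuffix='_UK')   # left join on the shared index; the suffixes
print(joined)                                          # disambiguate the overlapping 'views' column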

2. Exercises

1.region_1 and region_2 are pretty uninformative names for locale columns in the dataset. Create a copy of reviews with these columns renamed to region and locale, respectively.

renamed = reviews.rename(columns={'region_1': 'region', 'region_2': 'locale'})

2.Set the index name in the dataset to wines.

reindexed = reviews.rename_axis("wines", axis='index')

3. Combining two data files

gaming_products = pd.read_csv("../input/things-on-reddit/top-things/top-things/reddits/g/gaming.csv")
gaming_products['subreddit'] = "r/gaming"
movie_products = pd.read_csv("../input/things-on-reddit/top-things/top-things/reddits/m/movies.csv")
movie_products['subreddit'] = "r/movies"

Create a DataFrame of products mentioned on either subreddit.

combined_products = pd.concat([gaming_products, movie_products])

4. Joining two tables on a shared key

powerlifting_meets = pd.read_csv("../input/powerlifting-database/meets.csv")
powerlifting_competitors = pd.read_csv("../input/powerlifting-database/openpowerlifting.csv")

Both tables include references to a MeetID, a unique key for each meet (competition) included in the database. Using this, generate a dataset combining the two tables into one.

powerlifting_combined = powerlifting_meets.set_index("MeetID").join(powerlifting_competitors.set_index("MeetID"))
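As the theory section notes, merge() covers much of the same ground; an equivalent formulation (a sketch, not part of the original notes) would be:

# Same combination via merge(); here MeetID stays a regular column instead of becoming the index.
powerlifting_combined_alt = powerlifting_meets.merge(
    powerlifting_competitors, on="MeetID", how="left"
)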