2020.08.23 Datawhale Team Learning: Data Analysis 03, Data Restructuring 01


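This chapter introduces data restructuring: reorganizing the data, based on our understanding of it, into a form that better suits the analysis.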

import numpy as np
import pandas as pd

Chapter 2: Data Restructuring

Merging Data

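Task 2: Using the concat method, merge train-left-up.csv with train-right-up.csv horizontally, and likewise the two "down" files, to get two tables; then concatenate those two tables vertically into one and save it (the task names the output result_up; the code below writes result.csv).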
text_left_up = pd.read_csv("data02/train-left-up.csv")
text_left_down = pd.read_csv("data02/train-left-down.csv")
text_right_up = pd.read_csv("data02/train-right-up.csv")
text_right_down = pd.read_csv("data02/train-right-down.csv")
text_up = pd.concat([text_left_up,text_right_up],axis = 1)
text_up.head()
   PassengerId  Survived  Pclass  Name                                               Sex      Age  SibSp  Parch  Ticket               Fare  Cabin  Embarked
0          1.0       0.0     3.0  Braund, Mr. Owen Harris                            male    22.0    1.0    0.0  A/5 21171          7.2500  NaN    S
1          2.0       1.0     1.0  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0    1.0    0.0  PC 17599          71.2833  C85    C
2          3.0       1.0     3.0  Heikkinen, Miss. Laina                             female  26.0    0.0    0.0  STON/O2. 3101282   7.9250  NaN    S
3          4.0       1.0     1.0  Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0    1.0    0.0  113803            53.1000  C123   S
4          5.0       0.0     3.0  Allen, Mr. William Henry                           male    35.0    0.0    0.0  373450             8.0500  NaN    S
text_down = pd.concat([text_left_down,text_right_down],axis = 1)
text_down.head()
   PassengerId  Survived  Pclass  Name                                         Sex      Age  SibSp  Parch  Ticket          Fare  Cabin  Embarked
0          440         0       2  Kvillner, Mr. Johan Henrik Johannesson       male    31.0      0      0  C.A. 18723    10.500  NaN    S
1          441         1       2  Hart, Mrs. Benjamin (Esther Ada Bloomfield)  female  45.0      1      1  F.C.C. 13529  26.250  NaN    S
2          442         0       3  Hampe, Mr. Leon                              male    20.0      0      0  345769         9.500  NaN    S
3          443         0       3  Petterson, Mr. Johan Emil                    male    25.0      1      0  347076         7.775  NaN    S
4          444         1       2  Reynaldo, Ms. Encarnacion                    female  28.0      0      0  230434        13.000  NaN    S
text = pd.concat([text_up,text_down])
text.head()
   PassengerId  Survived  Pclass  Name                                               Sex      Age  SibSp  Parch  Ticket               Fare  Cabin  Embarked
0          1.0       0.0     3.0  Braund, Mr. Owen Harris                            male    22.0    1.0    0.0  A/5 21171          7.2500  NaN    S
1          2.0       1.0     1.0  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0    1.0    0.0  PC 17599          71.2833  C85    C
2          3.0       1.0     3.0  Heikkinen, Miss. Laina                             female  26.0    0.0    0.0  STON/O2. 3101282   7.9250  NaN    S
3          4.0       1.0     1.0  Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0    1.0    0.0  113803            53.1000  C123   S
4          5.0       0.0     3.0  Allen, Mr. William Henry                           male    35.0    0.0    0.0  373450             8.0500  NaN    S
text.to_csv('result.csv')
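To see what the two concat calls do without the CSV files, here is a minimal sketch on toy frames. The column names and values are made-up stand-ins, not the Titanic data.

```python
import pandas as pd

# Toy stand-ins for the four CSV pieces (hypothetical data)
left_up = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
right_up = pd.DataFrame({"age": [22, 38], "fare": [7.25, 71.28]})
left_down = pd.DataFrame({"id": [3, 4], "name": ["c", "d"]})
right_down = pd.DataFrame({"age": [26, 35], "fare": [7.92, 53.10]})

# axis=1 places the frames side by side, aligning rows on the index
up = pd.concat([left_up, right_up], axis=1)
down = pd.concat([left_down, right_down], axis=1)

# The default axis=0 stacks frames vertically; each frame keeps its own
# index, so the labels 0 and 1 appear twice in the result
stacked = pd.concat([up, down])

# ignore_index=True renumbers the rows 0..n-1 instead
renumbered = pd.concat([up, down], ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```

Note that to_csv also writes the index as the first column by default; passing index=False leaves it out if the repeated labels are not wanted in the file.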
Task 4: Using the DataFrame's own join and append methods, complete Task 2 and Task 3 again.
text_up = text_left_up.join(text_right_up)
text_down = text_left_down.join(text_right_down)
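text = text_up.append(text_down)  # DataFrame.append was removed in pandas 2.0; on newer versions use pd.concat([text_up, text_down])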
text.head()
# text.to_csv('result.csv')
   PassengerId  Survived  Pclass  Name                                               Sex      Age  SibSp  Parch  Ticket               Fare  Cabin  Embarked
0            1         0       3  Braund, Mr. Owen Harris                            male    22.0    1.0    0.0  A/5 21171          7.2500  NaN    S
1            2         1       1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0    1.0    0.0  PC 17599          71.2833  C85    C
2            3         1       3  Heikkinen, Miss. Laina                             female  26.0    0.0    0.0  STON/O2. 3101282   7.9250  NaN    S
3            4         1       1  Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0    1.0    0.0  113803            53.1000  C123   S
4            5         0       3  Allen, Mr. William Henry                           male    35.0    0.0    0.0  373450             8.0500  NaN    S
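The join-then-append pattern can be sketched on toy frames as follows. The data here is hypothetical, and pd.concat stands in for append, which was deprecated in pandas 1.4 and removed in 2.0.

```python
import pandas as pd

# Hypothetical stand-ins for the left/right and up/down pieces
left_up = pd.DataFrame({"id": [1, 2]})
right_up = pd.DataFrame({"age": [22, 38]})
left_down = pd.DataFrame({"id": [3, 4]})
right_down = pd.DataFrame({"age": [26, 35]})

# join aligns the two frames on their index (a left join by default)
up = left_up.join(right_up)
down = left_down.join(right_down)

# Row-wise combination; gives the same rows as up.append(down) did
combined = pd.concat([up, down])

print(combined)
```

join raises a ValueError if the two frames share a column name, unless lsuffix or rsuffix is given. The left and right pieces here have disjoint columns, so no suffix is needed.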
Task 5: Using pandas' merge function and the DataFrame append method, complete Task 2 and Task 3 again.
text_up = pd.merge(text_left_up,text_right_up,left_index=True,right_index=True)
text_down = pd.merge(text_left_down,text_right_down,left_index=True,right_index=True)
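text = text_up.append(text_down)  # DataFrame.append was removed in pandas 2.0; on newer versions use pd.concat([text_up, text_down])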
text.head()
   PassengerId  Survived  Pclass  Name                                               Sex      Age  SibSp  Parch  Ticket               Fare  Cabin  Embarked
0            1         0       3  Braund, Mr. Owen Harris                            male    22.0    1.0    0.0  A/5 21171          7.2500  NaN    S
1            2         1       1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0    1.0    0.0  PC 17599          71.2833  C85    C
2            3         1       3  Heikkinen, Miss. Laina                             female  26.0    0.0    0.0  STON/O2. 3101282   7.9250  NaN    S
3            4         1       1  Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0    1.0    0.0  113803            53.1000  C123   S
4            5         0       3  Allen, Mr. William Henry                           male    35.0    0.0    0.0  373450             8.0500  NaN    S
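merge with left_index=True and right_index=True and join both align on the index, but their default join types differ. The sketch below uses made-up frames whose indexes only partly overlap, so the difference shows up; with the CSV pieces above, which share the same default index, both give the same rows.

```python
import pandas as pd

# Hypothetical frames with partly overlapping index labels
left = pd.DataFrame({"id": [1, 2, 3]}, index=[0, 1, 2])
right = pd.DataFrame({"age": [22, 38, 26]}, index=[1, 2, 3])

# merge defaults to an inner join: only labels present in both frames survive
inner = pd.merge(left, right, left_index=True, right_index=True)

# join defaults to a left join: every row of `left` is kept,
# with NaN where `right` has no matching label
left_joined = left.join(right)

# how="left" makes merge keep the same rows as the default join
merged_left = pd.merge(left, right, left_index=True, right_index=True, how="left")

print(inner.index.tolist())
print(left_joined.index.tolist())
```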

Looking at the Data from Another Angle

The stack function

text2 = text.stack()
text2.to_csv('result2.csv')
df2 = pd.read_csv('result2.csv')
df2.head()
   Unnamed: 0  Unnamed: 1   0
0           0  PassengerId  1
1           0  Survived     0
2           0  Pclass       3
3           0  Name         Braund, Mr. Owen Harris
4           0  Sex          male
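A small sketch of what stack does, on a made-up two-column frame: the column labels move into an inner index level, so the result is a Series with a two-level index of (row label, column name). Because neither index level has a name, a plain read_csv labels them "Unnamed: 0" and "Unnamed: 1"; passing index_col=[0, 1] is one way to read them back as an index. The in-memory buffer below replaces the result2.csv file purely for illustration.

```python
import io

import pandas as pd

# Hypothetical toy frame, not the Titanic data
df = pd.DataFrame({"Survived": [0, 1], "Pclass": [3, 1]})

# stack moves the column labels into an inner index level
stacked = df.stack()

# unstack pivots that inner level back into columns
restored = stacked.unstack()

# Round-trip through CSV text, reading the first two columns as the index
buf = io.StringIO()
stacked.to_csv(buf)
buf.seek(0)
back = pd.read_csv(buf, index_col=[0, 1])

print(stacked)
print(back.index.nlevels)
```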