**复习:这门课程得主要目的是通过真实的数据,以实战的方式了解数据分析的流程和熟悉数据分析python的基本操作。知道了课程的目的之后,我们接下来我们要正式的开始数据分析的实战教学,完成kaggle上泰坦尼克的任务,实战数据分析全流程。
这里有两份资料:
教材《Python for Data Analysis》和 baidu.com &
google.com(善用搜索引擎)
1 第一章:数据载入及初步观察
1.1 载入数据
数据集下载 https://www.kaggle.com/c/titanic/overview
1.1.1 任务一:导入numpy和pandas
#写入代码
import numpy as ny
import pandas as pd
【提示】如果加载失败,学会如何在你的python环境下安装numpy和pandas这两个库
1.1.2 任务二:载入数据
(1) 使用相对路径载入数据
(2) 使用绝对路径载入数据
#写入代码
pd.read_csv("train.csv")
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S |
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S |
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.0000 | NaN | S |
887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.0000 | B42 | S |
888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | NaN | 1 | 2 | W./C. 6607 | 23.4500 | NaN | S |
889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.0000 | C148 | C |
890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0 | 0 | 370376 | 7.7500 | NaN | Q |
891 rows × 12 columns
#写入代码
pd.read_csv(r"E:\Documents\otherFiles\hands-on-data-analysis-master\第一单元项目集合\train.csv")
#绝对路径是将文件的存储完整地址
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S |
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S |
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.0000 | NaN | S |
887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.0000 | B42 | S |
888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | NaN | 1 | 2 | W./C. 6607 | 23.4500 | NaN | S |
889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.0000 | C148 | C |
890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0 | 0 | 370376 | 7.7500 | NaN | Q |
891 rows × 12 columns
pd.read_table("train.csv")
PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked | |
---|---|
0 | 1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/... |
1 | 2,1,1,"Cumings, Mrs. John Bradley (Florence Br... |
2 | 3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,S... |
3 | 4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May ... |
4 | 5,0,3,"Allen, Mr. William Henry",male,35,0,0,3... |
... | ... |
886 | 887,0,2,"Montvila, Rev. Juozas",male,27,0,0,21... |
887 | 888,1,1,"Graham, Miss. Margaret Edith",female,... |
888 | 889,0,3,"Johnston, Miss. Catherine Helen ""Car... |