Python for Data Analysis_2nd_Task 5: Advanced Pandas

Ten classic exercises for working through data analysis with Pandas

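  • Getting to know your data: exploring the Chipotle fast-food data
  • Filtering and sorting: exploring the Euro 2012 data
  • Grouping: exploring the alcohol consumption data
  • The apply function: exploring the 1960-2014 US crime data
  • Merging: exploring the fictitious names data
  • Statistics: exploring the wind speed data
  • Visualization: exploring the Titanic disaster data
  • Creating DataFrames: exploring the Pokemon data
  • Time series: exploring the Apple stock price data
  • Deleting data: exploring the Iris flower data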

Getting to know your data: exploring the Chipotle fast-food data

Exploring the Chipotle fast-food data

List the files in the dataset directory
ls ../input/pandas_exercise/exercise_data/

Apple_stock.csv  drinks.csv          second_cars_info.csv          wechart.csv
cars.csv         Euro2012_stats.csv  train.csv                     wind.data
chipotle.tsv     iris.csv            US_Crime_Rates_1960_2014.csv
Step 1: Import the necessary libraries
import pandas as pd
Step 2: Set the path to the dataset
path1 = "../input/pandas_exercise/exercise_data/chipotle.tsv"
Step 3: Load the dataset into a DataFrame named chipo
chipo = pd.read_csv(path1, sep = '\t')
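
The effect of `sep='\t'` can be checked without the real file. The sketch below is only an illustration: the two-row `sample` string and the names `demo` and `wrong` are made up for the example, laid out in the same tab-separated format as chipotle.tsv.

```python
import io

import pandas as pd

# Hypothetical two-row sample in the same tab-separated layout as chipotle.tsv
sample = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tIzze\t[Clementine]\t$3.39\n"
    "1\t1\tNantucket Nectar\t[Apple]\t$3.39\n"
)

# sep='\t' tells read_csv to split each line on tab characters
demo = pd.read_csv(io.StringIO(sample), sep="\t")

# With the default comma separator, every line ends up in a single column
wrong = pd.read_csv(io.StringIO(sample))

print(demo.shape)   # five columns, one per tab-separated field
print(wrong.shape)  # one column, because the sample contains no commas
```

If a frame loaded from a .tsv file has only one column, a missing or wrong `sep` is a likely cause.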
Step 4: View the first 10 rows
chipo.head(10)

 	order_id 	quantity 	item_name 	choice_description 	item_price
0 	1 	1 	Chips and Fresh Tomato Salsa 	NaN 	$2.39
1 	1 	1 	Izze 	[Clementine] 	$3.39
2 	1 	1 	Nantucket Nectar 	[Apple] 	$3.39
3 	1 	1 	Chips and Tomatillo-Green Chili Salsa 	NaN 	$2.39
4 	2 	2 	Chicken Bowl 	[Tomatillo-Red Chili Salsa (Hot), [Black Beans... 	$16.98
5 	3 	1 	Chicken Bowl 	[Fresh Tomato Salsa (Mild), [Rice, Cheese, Sou... 	$10.98
6 	3 	1 	Side of Chips 	NaN 	$1.69
7 	4 	1 	Steak Burrito 	[Tomatillo Red Chili Salsa, [Fajita Vegetables... 	$11.75
8 	4 	1 	Steak Soft Tacos 	[Tomatillo Green Chili Salsa, [Pinto Beans, Ch... 	$9.25
9 	5 	1 	Steak Burrito 	[Fresh Tomato Salsa, [Rice, Black Beans, Pinto... 	$9.25
Step 5: Check the number of columns and rows in the dataset
# columns
chipo.shape[1]

5

# rows
chipo.shape[0]

4622
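
`shape` is a plain `(rows, columns)` tuple, so both counts can be read in one step. A minimal sketch on a made-up three-row frame (the `toy` frame and its values are assumptions for illustration, not the real dataset):

```python
import pandas as pd

# Hypothetical miniature frame standing in for chipo
toy = pd.DataFrame(
    {
        "order_id": [1, 1, 2],
        "quantity": [1, 1, 2],
        "item_name": ["Izze", "Nantucket Nectar", "Chicken Bowl"],
    }
)

# Unpack the (rows, columns) tuple instead of indexing it twice
n_rows, n_cols = toy.shape

# len() on the frame and on its columns gives the same two numbers
print(n_rows, len(toy))
print(n_cols, len(toy.columns))
```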
Step 6: Print all column names
chipo.columns

Index(['order_id', 'quantity', 'item_name', 'choice_description',
       'item_price'],
      dtype='object')

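chipo.info  # no parentheses, so this shows the bound method and the frame's repr [4622 rows x 5 columns]; call chipo.info() for the per-column summary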

<bound method DataFrame.info of       order_id  quantity                              item_name  \
0            1         1           Chips and Fresh Tomato Salsa   
1            1         1                                   Izze   
2            1         1                       Nantucket Nectar   
3            1         1  Chips and Tomatillo-Green Chili Salsa   
4            2
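
The output above starts with `<bound method DataFrame.info of ...` because `info` was referenced without being called. A short sketch of the difference, again on a made-up `toy` frame (an assumption for illustration); the `buf` argument is used here only so the summary can be captured as a string:

```python
import io

import pandas as pd

# Hypothetical miniature frame standing in for chipo
toy = pd.DataFrame(
    {
        "order_id": [1, 1, 2],
        "quantity": [1, 1, 2],
        "item_name": ["Izze", "Nantucket Nectar", "Chicken Bowl"],
    }
)

# Without parentheses: just a reference to the method, nothing is summarized
method = toy.info
print(callable(method))

# With parentheses: info() writes the summary (index range, column dtypes,
# non-null counts, memory usage) and returns None
buffer = io.StringIO()
result = toy.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```

In a notebook, calling `toy.info()` with no arguments prints the same summary directly.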
This book mainly uses pandas to tie together SciPy and NumPy; data processing with pandas was a popular topic at PyCon 2012. Another powerful tool is Sage, which integrates many open-source packages behind a unified Python interface.

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.

Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.

  • Use the IPython interactive shell as your primary development environment
  • Learn basic and advanced NumPy (Numerical Python) features
  • Get started with data analysis tools in the pandas library
  • Use high-performance tools to load, clean, transform, merge, and reshape data
  • Create scatter plots and static or interactive visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
  • Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples