Intermediate Python Study Notes (1): Getting Started with Pandas

Intermediate Python -- DataCamp

Intermediate data manipulation (Dictionary & DataFrame)

List

Values listed on their own, one after another, form a list.

pop = [30.55, 2.77, 39.21]

countries = ["afghanistan", "albania", "algeria"]

ind_alb = countries.index("albania")

ind_alb

1

pop[ind_alb]

2.77

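Lists are not very convenient here: finding one country's population takes two steps, first looking up the position in one list and then indexing the other.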

Dictionary

Pairing the entries of two lists, one as keys and the other as values, gives a dictionary.

world = {"afghanistan":30.55, "albania":2.77, "algeria":39.21}

world["albania"]

2.77
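The step from two lists to one dictionary can be sketched with the built-in zip and dict functions. This is only one way to build the mapping; the notes above type the dictionary out by hand.

```python
# Parallel lists from the List section above
countries = ["afghanistan", "albania", "algeria"]
pop = [30.55, 2.77, 39.21]

# zip pairs each country with its population;
# dict() turns those pairs into key:value entries
world = dict(zip(countries, pop))

# One lookup by key replaces the index() + positional lookup
print(world["albania"])  # 2.77
```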

Adding or deleting entries

world["sealand"] = 0.000028

world

{'afghanistan': 30.55, 'albania': 2.77, 'algeria': 39.21, 'sealand': 2.8e-05}

del(world["sealand"])

world

{'afghanistan': 30.55, 'albania': 2.77, 'algeria': 39.21}
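A small sketch of the same add and delete operations, plus two things the notes do not show: the in operator for checking a key, and pop(), which deletes an entry and returns its value. The updated Albania figure is just an illustrative number.

```python
world = {"afghanistan": 30.55, "albania": 2.77, "algeria": 39.21}

# Assigning to a new key adds an entry;
# assigning to an existing key overwrites its value
world["sealand"] = 0.000028
world["albania"] = 2.81

# `in` tests whether a key is present
print("sealand" in world)  # True

# del removes an entry; pop() removes it and hands back the value
del world["sealand"]
albania_pop = world.pop("albania")

print("sealand" in world)  # False
print(albania_pop)         # 2.81
print(world)               # {'afghanistan': 30.55, 'algeria': 39.21}
```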

Nested dictionaries

Put simply, a nested dictionary is a dictionary whose values are themselves dictionaries.

# Dictionary of dictionaries

europe = { 'spain': { 'capital':'madrid', 'population':46.77 },

'france': { 'capital':'paris', 'population':66.03 },

'germany': { 'capital':'berlin', 'population':80.62 },

'norway': { 'capital':'oslo', 'population':5.084 } }

# Print out the capital of France

europe["france"]["capital"]

# Create sub-dictionary data

data={"capital":"rome","population":59.83}

# Add data to europe under key 'italy'

europe["italy"]=data

# Print europe

print(europe)

{'france': {'population': 66.03, 'capital': 'paris'}, 'italy': {'population': 59.83, 'capital': 'rome'}, 'germany': {'population': 80.62, 'capital': 'berlin'}, 'norway': {'population': 5.084, 'capital': 'oslo'}, 'spain': {'population': 46.77, 'capital': 'madrid'}}
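To make the nested lookup concrete, here is a shortened version of the europe dictionary with a loop over it. The items() loop and the get() call are additions for illustration and are not part of the original exercise.

```python
europe = {
    "spain": {"capital": "madrid", "population": 46.77},
    "france": {"capital": "paris", "population": 66.03},
}

# Chained indexing: the outer key picks the country,
# the inner key picks a field of that country
print(europe["france"]["capital"])  # paris

# items() yields (country, sub-dictionary) pairs
for country, info in europe.items():
    print(country, info["capital"], info["population"])

# get() returns None for a missing key instead of raising KeyError
print(europe.get("italy"))  # None
```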

DataFrame

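In data analysis, the structure used most often is actually the DataFrame. Working with DataFrames requires the pandas package.

Converting a dictionary into a DataFrame

Key syntax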

import xx as x

pd.DataFrame()

# Pre-defined lists

names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']

dr = [True, False, False, False, True, True, True]

cpc = [809, 731, 588, 18, 200, 70, 45]

# Import pandas as pd

import pandas as pd

# Create dictionary my_dict with three key:value pairs: my_dict

my_dict={"country":names,"drives_right":dr,"cars_per_cap":cpc}

# Build a DataFrame cars from my_dict: cars

cars=pd.DataFrame(my_dict)

# Print cars

print(cars)

   cars_per_cap        country  drives_right
0           809  United States          True
1           731      Australia         False
2           588          Japan         False
3            18          India         False
4           200         Russia          True
5            70        Morocco          True
6            45          Egypt          True
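A shortened sketch of how the dictionary maps onto the DataFrame: each key becomes a column and each list supplies that column's values. One assumption worth flagging: the output above lists the columns alphabetically, which is what older pandas versions did, while recent versions keep the dictionary's insertion order.

```python
import pandas as pd

names = ["United States", "Australia", "Japan"]
dr = [True, False, False]
cpc = [809, 731, 588]

cars = pd.DataFrame({"country": names, "drives_right": dr, "cars_per_cap": cpc})

# Each dictionary key is now a column label
print(list(cars.columns))

# shape is (rows, columns)
print(cars.shape)  # (3, 3)

# Selecting the columns in an explicit order reproduces
# the alphabetical layout of the printed output
print(cars[["cars_per_cap", "country", "drives_right"]])
```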

Setting row labels

cars.index=xxxxx

import pandas as pd

# Build cars DataFrame

names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']

dr = [True, False, False, False, True, True, True]

cpc = [809, 731, 588, 18, 200, 70, 45]

cars_dict = { 'country':names, 'drives_right':dr, 'cars_per_cap':cpc }

cars = pd.DataFrame(cars_dict)

print(cars)

# Definition of row_labels

row_labels = ['US', 'AUS', 'JPN', 'IN', 'RU', 'MOR', 'EG']

# Specify row labels of cars

cars.index=row_labels

# Print cars again

print(cars)

   cars_per_cap        country  drives_right
0           809  United States          True
1           731      Australia         False
2           588          Japan         False
3            18          India         False
4           200         Russia          True
5            70        Morocco          True
6            45          Egypt          True

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JPN           588          Japan         False
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True
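As an alternative to assigning cars.index afterwards, the row labels can be passed when the DataFrame is built, using the index parameter of pd.DataFrame. A shortened sketch with three of the countries:

```python
import pandas as pd

cars_dict = {
    "country": ["United States", "Australia", "Japan"],
    "drives_right": [True, False, False],
    "cars_per_cap": [809, 731, 588],
}
row_labels = ["US", "AUS", "JPN"]

# index= sets the row labels at construction time,
# with the same result as assigning cars.index afterwards
cars = pd.DataFrame(cars_dict, index=row_labels)

print(cars)
print(list(cars.index))  # ['US', 'AUS', 'JPN']
```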

Reading a CSV file

cars= pd.read_csv("cars.csv",index_col = 0)
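To show what index_col = 0 does without needing the actual file, the sketch below reads from an in-memory string. The CSV text is an assumed stand-in for cars.csv, shortened to three rows; the real file's layout may differ.

```python
import io
import pandas as pd

# Hypothetical stand-in for cars.csv: the first column holds the row labels
csv_text = """,cars_per_cap,country,drives_right
US,809,United States,True
AUS,731,Australia,False
JPN,588,Japan,False
"""

# index_col=0 uses the first column as the row index
cars = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(cars)

# Without index_col, pandas keeps the labels as an ordinary column
# and adds a default integer index
cars_default = pd.read_csv(io.StringIO(csv_text))
print(cars_default)
```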

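Basic Pandas operations

Pandas will be covered in more detail in later notes.

Selecting columns

The difference between [] and [[ ]]: single brackets return a Series, which prints without a column header, while double brackets return a DataFrame, which prints with the column name.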

print(cars["country"])

US     United States
AUS        Australia
JPN            Japan
IN             India
RU            Russia
MOR          Morocco
EG             Egypt
Name: country, dtype: object

print(cars[["country"]])

           country
US   United States
AUS      Australia
JPN          Japan
IN           India
RU          Russia
MOR        Morocco
EG           Egypt

print(cars[["country","drives_right"]])

           country  drives_right
US   United States          True
AUS      Australia         False
JPN          Japan         False
IN           India         False
RU          Russia          True
MOR        Morocco          True
EG           Egypt          True
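The difference between the two bracket styles is easiest to see by checking the type of the result. A minimal sketch on a two-row version of cars:

```python
import pandas as pd

cars = pd.DataFrame(
    {"country": ["United States", "Australia"], "drives_right": [True, False]},
    index=["US", "AUS"],
)

# Single brackets with one column name return a Series
single = cars["country"]
print(isinstance(single, pd.Series))     # True

# Double brackets pass a list of column names and return a DataFrame,
# even when the list holds a single name
double = cars[["country"]]
print(isinstance(double, pd.DataFrame))  # True
```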

Selecting rows

A slice inside square brackets selects the specified rows, much like in R.

# Import cars data

import pandas as pd

cars = pd.read_csv('cars.csv', index_col = 0)

# Print out first 3 observations

print(cars[0:3])

# Print out fourth, fifth and sixth observation

print(cars[3:6])

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JPN           588          Japan         False

     cars_per_cap        country  drives_right
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
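A short sketch of how the row slices behave, using a four-row version of cars built inline so it runs without cars.csv. As with list slicing, the end position is not included.

```python
import pandas as pd

cars = pd.DataFrame(
    {"country": ["United States", "Australia", "Japan", "India"],
     "cars_per_cap": [809, 731, 588, 18]},
    index=["US", "AUS", "JPN", "IN"],
)

# A slice inside [] selects rows by position; position 2 is excluded here
first_two = cars[0:2]
print(list(first_two.index))  # ['US', 'AUS']

# The third and fourth observations sit at positions 2 and 3
next_two = cars[2:4]
print(list(next_two.index))   # ['JPN', 'IN']
```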


loc and iloc

# Import cars data

import pandas as pd

cars = pd.read_csv('cars.csv', index_col = 0)

# Print out drives_right value of Morocco

print(cars.loc["MOR", "drives_right"])

# Print sub-DataFrame

print(cars.loc[["RU","MOR"],["country","drives_right"]])

output:

True

     country  drives_right
RU    Russia          True
MOR  Morocco          True
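The heading mentions iloc, but the notes only use loc. As a rough sketch of how the two relate: loc selects by row and column label, iloc by integer position. The three-row DataFrame is built inline so the example runs without cars.csv.

```python
import pandas as pd

cars = pd.DataFrame(
    {"cars_per_cap": [200, 70, 45],
     "country": ["Russia", "Morocco", "Egypt"],
     "drives_right": [True, True, True]},
    index=["RU", "MOR", "EG"],
)

# loc: row label first, then column label
by_label = cars.loc["MOR", "drives_right"]

# iloc: row position 1, column position 2 is the same cell
by_position = cars.iloc[1, 2]
print(by_label, by_position)

# Lists of positions return a sub-DataFrame,
# just as lists of labels do with loc
sub = cars.iloc[[0, 1], [1, 2]]
print(sub)
```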
