Intermediate Python Study Notes (1): Getting Started with Pandas


weixin_39834205


Tags: intermediate Python study notes

Article link: https://blog.csdn.net/weixin_39834205/article/details/111483436


Intermediate Python -- DataCamp

Intermediate data manipulation (Dictionary & DataFrame)

List

A list simply stores values one after another.

pop = [30.55, 2.77, 39.21]

countries = ["afghanistan", "albania", "algeria"]

ind_alb = countries.index("albania")

ind_alb

1

pop[ind_alb]

2.77

Lists are not very convenient for this: finding a value takes two parallel lists and an index lookup.

Dictionary

Pairing up the items of the two lists as key:value pairs gives a dictionary.

world = {"afghanistan":30.55, "albania":2.77, "algeria":39.21}

world["albania"]

2.77
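As a minimal sketch of how the two parallel lists could be paired up programmatically, the built-in zip and dict can be combined. The variable name albania_pop is introduced here only for illustration.

```python
# Parallel lists from the notes above
countries = ["afghanistan", "albania", "algeria"]
pop = [30.55, 2.77, 39.21]

# zip pairs the items position by position;
# dict turns each (country, population) pair into a key:value entry
world = dict(zip(countries, pop))

# A single lookup by key replaces index() plus a positional lookup
albania_pop = world["albania"]
print(albania_pop)
```

This only works cleanly when both lists have the same length, since zip stops at the shorter one.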

Adding or deleting entries

world["sealand"] = 0.000028

world

{'afghanistan': 30.55, 'albania': 2.77, 'algeria': 39.21, 'sealand': 2.8e-05}

del world["sealand"]

world

{'afghanistan': 30.55, 'albania': 2.77, 'algeria': 39.21}
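The add and delete steps can be sketched together with a membership check. This is an illustration rather than part of the course material; the names has_sealand and removed are hypothetical.

```python
world = {"afghanistan": 30.55, "albania": 2.77, "algeria": 39.21}

# Assigning to a key that does not exist yet adds a new entry
world["sealand"] = 0.000028

# "in" tests the keys, not the values
has_sealand = "sealand" in world

# del removes the entry; it raises KeyError if the key is absent
del world["sealand"]

# pop with a default removes the key if present and never raises
removed = world.pop("sealand", None)

print(has_sealand, removed, world)
```

Using pop with a default is a reasonable choice when it is not known in advance whether the key exists.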

Nested dictionaries

Put simply, this is a dictionary whose values are themselves dictionaries.

# Dictionary of dictionaries

europe = { 'spain': { 'capital':'madrid', 'population':46.77 },

'france': { 'capital':'paris', 'population':66.03 },

'germany': { 'capital':'berlin', 'population':80.62 },

'norway': { 'capital':'oslo', 'population':5.084 } }

# Print out the capital of France

europe["france"]["capital"]

# Create sub-dictionary data

data={"capital":"rome","population":59.83}

# Add data to europe under key 'italy'

europe["italy"]=data

# Print europe

print(europe)

{'france': {'population': 66.03, 'capital': 'paris'}, 'italy': {'population': 59.83, 'capital': 'rome'}, 'germany': {'population': 80.62, 'capital': 'berlin'}, 'norway': {'population': 5.084, 'capital': 'oslo'}, 'spain': {'population': 46.77, 'capital': 'madrid'}}
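A short sketch of reading from a nested dictionary, using a subset of the europe data above. The names capitals and missing_capital are assumptions made for this example.

```python
europe = {
    "spain": {"capital": "madrid", "population": 46.77},
    "france": {"capital": "paris", "population": 66.03},
}
europe["italy"] = {"capital": "rome", "population": 59.83}

# Chained keys: the outer key picks the country, the inner key picks the field
france_capital = europe["france"]["capital"]

# Iterating over items() gives each country together with its inner dictionary
capitals = {country: info["capital"] for country, info in europe.items()}

# get() returns a default instead of raising KeyError for a missing country
missing_capital = europe.get("norway", {}).get("capital")

print(france_capital, capitals, missing_capital)
```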

DataFrame

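In data analysis, the structure used most often is actually the DataFrame. Working with DataFrames requires the pandas package.

Converting a dictionary to a DataFrame

Key syntax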

import xx as x

pd.DataFrame()

# Pre-defined lists

names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']

dr = [True, False, False, False, True, True, True]

cpc = [809, 731, 588, 18, 200, 70, 45]

# Import pandas as pd

import pandas as pd

# Create dictionary my_dict with three key:value pairs: my_dict

my_dict={"country":names,"drives_right":dr,"cars_per_cap":cpc}

# Build a DataFrame cars from my_dict: cars

cars=pd.DataFrame(my_dict)

# Print cars

print(cars)

   cars_per_cap        country  drives_right
0           809  United States          True
1           731      Australia         False
2           588          Japan         False
3            18          India         False
4           200         Russia          True
5            70        Morocco          True
6            45          Egypt          True
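To check what the constructor produced, a few attributes can be inspected. The sketch below uses a three-row subset of the data. One caveat: recent pandas versions on Python 3.7+ keep the dictionary's insertion order for the columns, so the column order may differ from the alphabetical order in the printout above.

```python
import pandas as pd

my_dict = {
    "country": ["United States", "Australia", "Japan"],
    "drives_right": [True, False, False],
    "cars_per_cap": [809, 731, 588],
}

# Each key becomes a column label, each list becomes that column's values
cars = pd.DataFrame(my_dict)

print(cars.shape)           # (number of rows, number of columns)
print(list(cars.columns))   # column labels taken from the dictionary keys
print(cars.dtypes)          # one dtype per column
```

All lists in the dictionary need the same length, otherwise the constructor raises a ValueError.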

Setting row labels

cars.index=xxxxx

import pandas as pd

# Build cars DataFrame

names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']

dr = [True, False, False, False, True, True, True]

cpc = [809, 731, 588, 18, 200, 70, 45]

cars_dict = { 'country':names, 'drives_right':dr, 'cars_per_cap':cpc }

cars = pd.DataFrame(cars_dict)

print(cars)

# Definition of row_labels

row_labels = ['US', 'AUS', 'JPN', 'IN', 'RU', 'MOR', 'EG']

# Specify row labels of cars

cars.index=row_labels

# Print cars again

print(cars)

   cars_per_cap        country  drives_right
0           809  United States          True
1           731      Australia         False
2           588          Japan         False
3            18          India         False
4           200         Russia          True
5            70        Morocco          True
6            45          Egypt          True

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JPN           588          Japan         False
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True
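As an alternative to assigning cars.index afterwards, the labels can also be passed when the DataFrame is built, through the index parameter of pd.DataFrame. A minimal sketch on a three-row subset:

```python
import pandas as pd

cars_dict = {
    "country": ["United States", "Australia", "Japan"],
    "drives_right": [True, False, False],
    "cars_per_cap": [809, 731, 588],
}
row_labels = ["US", "AUS", "JPN"]

# index= sets the row labels at construction time;
# it needs one label per row
cars = pd.DataFrame(cars_dict, index=row_labels)

print(cars)
print(list(cars.index))
```

Both approaches give the same result; setting the index at construction just saves a separate assignment.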

Reading a CSV file

cars = pd.read_csv("cars.csv", index_col=0)
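The effect of index_col=0 is easier to see with and without it. The contents of cars.csv are not shown in these notes, so the sketch below assumes a layout in which the first, unnamed column holds the row labels, and reads it from an in-memory string instead of a file.

```python
import io
import pandas as pd

# Assumed file contents (hypothetical): the first column holds the row labels
csv_text = """,cars_per_cap,country,drives_right
US,809,United States,True
AUS,731,Australia,False
JPN,588,Japan,False
"""

# With index_col=0 the first column becomes the row index
cars = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Without it, pandas adds a default integer index and keeps the labels
# as an ordinary column named "Unnamed: 0"
no_index = pd.read_csv(io.StringIO(csv_text))

print(cars)
print(no_index)
```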

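Basic pandas operations

Pandas will be covered in more detail in later notes.

Selecting columns

The difference between [] and [[ ]]: single brackets return a Series, which prints without a column header; double brackets return a DataFrame, which prints with the column name as a header.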

print(cars["country"])

US     United States
AUS        Australia
JPN            Japan
IN             India
RU            Russia
MOR          Morocco
EG             Egypt
Name: country, dtype: object

print(cars[["country"]])

           country
US   United States
AUS      Australia
JPN          Japan
IN           India
RU          Russia
MOR        Morocco
EG           Egypt

print(cars[["country","drives_right"]])

           country  drives_right
US   United States          True
AUS      Australia         False
JPN          Japan         False
IN           India         False
RU          Russia          True
MOR        Morocco          True
EG           Egypt          True
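The difference between the two bracket forms can be confirmed by checking the type of each result. This is a small sketch on a three-row subset; the names as_series and as_frame are introduced only for the example.

```python
import pandas as pd

cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588],
        "country": ["United States", "Australia", "Japan"],
        "drives_right": [True, False, False],
    },
    index=["US", "AUS", "JPN"],
)

# Single brackets with one label return a one-dimensional Series
as_series = cars["country"]

# Double brackets pass a list of labels and return a DataFrame,
# even when the list contains only one column
as_frame = cars[["country"]]

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```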

Selecting rows

Specific rows can be selected with a slice, which is similar to R.

# Import cars data

import pandas as pd

cars = pd.read_csv('cars.csv', index_col = 0)

# Print out first 3 observations

print(cars[0:3])

# Print out fourth, fifth and sixth observation

print(cars[3:6])

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JPN           588          Japan         False

     cars_per_cap        country  drives_right
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
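One detail worth illustrating: square brackets select rows only when given a slice. A single integer is treated as a column label instead. The sketch below uses a three-row subset; the flag single_int_fails is hypothetical and only records what happened.

```python
import pandas as pd

cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588],
        "country": ["United States", "Australia", "Japan"],
        "drives_right": [True, False, False],
    },
    index=["US", "AUS", "JPN"],
)

# A slice inside [] selects rows by position; the end position is excluded
first_two = cars[0:2]
print(first_two)

# A single integer is looked up as a column label, and no column is named 0
single_int_fails = False
try:
    cars[0]
except KeyError:
    single_int_fails = True

print(single_int_fails)
```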


loc and iloc

# Import cars data

import pandas as pd

cars = pd.read_csv('cars.csv', index_col = 0)

# Print out drives_right value of Morocco

print(cars.loc["MOR", "drives_right"])

# Print sub-DataFrame

print(cars.loc[["RU","MOR"],["country","drives_right"]])

output:

True

     country  drives_right
RU    Russia          True
MOR  Morocco          True
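The examples above only use loc. As a hedged sketch of how iloc compares, the same selections can be written with integer positions. It uses a three-row subset, so the positions differ from those in the full cars data.

```python
import pandas as pd

cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 200, 70],
        "country": ["United States", "Russia", "Morocco"],
        "drives_right": [True, True, True],
    },
    index=["US", "RU", "MOR"],
)

# loc selects by label: row label first, then column label
by_label = cars.loc["MOR", "drives_right"]

# iloc selects by integer position: row 2, column 2 in this small frame
by_position = cars.iloc[2, 2]

# Lists of labels or positions return a sub-DataFrame
sub_loc = cars.loc[["RU", "MOR"], ["country", "drives_right"]]
sub_iloc = cars.iloc[[1, 2], [1, 2]]

# Label slices with loc include the end label; iloc slices exclude the end position
rows_loc = cars.loc["RU":"MOR"]
rows_iloc = cars.iloc[1:2]

print(by_label, by_position)
print(sub_loc.equals(sub_iloc))
print(len(rows_loc), len(rows_iloc))
```

Label-based selection with loc tends to be more robust when rows are reordered, while iloc is convenient when only the position is known.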
