Pandas Study Check-in Diary #2

fangyan0819

Published 2020-04-23 19:50:14

Category column: pandas study check-in

Article link: https://blog.csdn.net/fangyan0819/article/details/105715742

# Assignment 1

import numpy as np
import pandas as pd
df = pd.read_csv('F:/data/UFO.csv')
df.head()

	datetime	shape	duration (seconds)	latitude	longitude
0	10/10/1949 20:30	cylinder	2700.0	29.883056	-97.941111
1	10/10/1949 21:00	light	7200.0	29.384210	-98.581082
2	10/10/1955 17:00	circle	20.0	53.200000	-2.916667
3	10/10/1956 21:00	circle	20.0	28.978333	-96.645833
4	10/10/1960 20:00	light	900.0	21.418056	-157.803611

# Shorten the column name so it is easier to reference
df.rename(columns={'duration (seconds)': 'duration'}, inplace=True)
df.head()

	datetime	shape	duration	latitude	longitude
0	10/10/1949 20:30	cylinder	2700.0	29.883056	-97.941111
1	10/10/1949 21:00	light	7200.0	29.384210	-98.581082
2	10/10/1955 17:00	circle	20.0	53.200000	-2.916667
3	10/10/1956 21:00	circle	20.0	28.978333	-96.645833
4	10/10/1960 20:00	light	900.0	21.418056	-157.803611

# Most common shape among sightings that lasted longer than 60 seconds
df.loc[lambda x: x['duration'] > 60]['shape'].value_counts().index[0]

'light'
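
The same filter-then-count step can also be written with a boolean mask and `idxmax`. The snippet below is only a sketch on a small made-up table (the `sightings` frame and its values are assumptions for illustration, not rows taken from UFO.csv).

```python
import pandas as pd

# Hypothetical sample rows standing in for the UFO data
sightings = pd.DataFrame({
    "shape": ["light", "circle", "light", "cylinder", "circle"],
    "duration": [7200.0, 20.0, 900.0, 2700.0, 20.0],
})

# Keep only the shapes of sightings longer than 60 seconds
long_sightings = sightings.loc[sightings["duration"] > 60, "shape"]

# idxmax returns the label with the highest count
most_common_shape = long_sightings.value_counts().idxmax()
print(most_common_shape)
```

Selecting the row mask and the column in a single `.loc` call avoids chained indexing, and `idxmax` states the intent more directly than `.index[0]`.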

# Longitude bins 30 degrees wide, latitude bins 18 degrees wide
interval1 = pd.interval_range(start=-180, end=180, freq=30)
interval2 = pd.interval_range(start=-90, end=90, freq=18)
interval2

IntervalIndex([(-90, -72], (-72, -54], (-54, -36], (-36, -18], (-18, 0], (0, 18], (18, 36], (36, 54], (54, 72], (72, 90]],
              closed='right',
              dtype='interval[int64]')

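# Assign each sighting to its longitude and latitude bin,
# then overwrite the raw coordinates with the bins
cuts1 = pd.cut(df['longitude'], bins=interval1)
cuts2 = pd.cut(df['latitude'], bins=interval2)
df['longitude'] = cuts1
df['latitude'] = cuts2
df.head()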

	datetime	shape	duration	latitude	longitude
0	10/10/1949 20:30	cylinder	2700.0	(18, 36]	(-120, -90]
1	10/10/1949 21:00	light	7200.0	(18, 36]	(-120, -90]
2	10/10/1955 17:00	circle	20.0	(36, 54]	(-30, 0]
3	10/10/1956 21:00	circle	20.0	(18, 36]	(-120, -90]
4	10/10/1960 20:00	light	900.0	(18, 36]	(-180, -150]

# Five (longitude bin, latitude bin) regions with the most sightings
df.set_index([cuts1, cuts2]).index.value_counts().head()

((-90, -60], (36, 54])      27891
((-120, -90], (18, 36])     14280
((-120, -90], (36, 54])     11960
((-90, -60], (18, 36])       9923
((-150, -120], (36, 54])     9658
dtype: int64
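
As a rough sketch of how the binning and counting fit together, the example below repeats the steps on a tiny made-up table and counts regions with `groupby` instead of `set_index`. The `sample` frame, its values, and the `lon_bin` / `lat_bin` column names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical coordinates standing in for the UFO data
sample = pd.DataFrame({
    "longitude": [-97.9, -98.6, -2.9, -96.6, -157.8],
    "latitude": [29.9, 29.4, 53.2, 29.0, 21.4],
})

lon_bins = pd.interval_range(start=-180, end=180, freq=30)
lat_bins = pd.interval_range(start=-90, end=90, freq=18)

# Store the bins in new columns so the raw coordinates are preserved
sample["lon_bin"] = pd.cut(sample["longitude"], bins=lon_bins)
sample["lat_bin"] = pd.cut(sample["latitude"], bins=lat_bins)

# Count rows per (longitude bin, latitude bin) pair;
# observed=True drops bin combinations that contain no rows
region_counts = (
    sample.groupby(["lon_bin", "lat_bin"], observed=True)
    .size()
    .sort_values(ascending=False)
)
top_region = region_counts.index[0]
print(region_counts)
```

Because the intervals are closed on the right, a value exactly equal to the lowest edge (for example a longitude of -180) falls outside every bin and becomes NaN.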

# Assignment 2

df2 = pd.read_csv('F:/data/Pokemon.csv')
df2.head()

	#	Name	Type 1	Type 2	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation	Legendary
0	1	Bulbasaur	Grass	Poison	318	45	49	49	65	65	45	1	False
1	2	Ivysaur	Grass	Poison	405	60	62	63	80	80	60	1	False
2	3	Venusaur	Grass	Poison	525	80	82	83	100	100	80	1	False
3	3	VenusaurMega Venusaur	Grass	Poison	625	80	100	123	122	120	80	1	False
4	4	Charmander	Fire	NaN	309	39	52	43	60	50	65	1	False

# Share of Pokémon that have a second type (count() ignores NaN)
df2['Type 2'].count() / df2.shape[0]

0.5175
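
An equivalent way to get this share is to take the mean of a not-null mask. The snippet is a sketch on a made-up four-row column (`type2` and its values are assumptions, not data from Pokemon.csv).

```python
import numpy as np
import pandas as pd

# Hypothetical column with two missing entries out of four
type2 = pd.Series(["Poison", np.nan, "Flying", np.nan], name="Type 2")

# The mean of a boolean mask is the fraction of True values
share_with_second_type = type2.notna().mean()

# Same result as dividing the non-null count by the number of rows
share_by_count = type2.count() / len(type2)
print(share_with_second_type, share_by_count)
```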

# Among Pokémon with Total >= 580, the proportion that are Legendary
df2.query('Total >= 580')['Legendary'].value_counts(normalize=True)

True     0.575221
False    0.424779
Name: Legendary, dtype: float64

# Top three Fighting-type Pokémon by Attack
df_temp = df2.loc[lambda x: x['Type 1'] == 'Fighting']
df_temp.sort_values(by='Attack', ascending=False).head(3)

	#	Name	Type 1	Type 2	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation	Legendary
498	448	LucarioMega Lucario	Fighting	Steel	625	70	145	88	140	70	112	4	False
594	534	Conkeldurr	Fighting	NaN	505	105	140	95	55	65	45	5	False
74	68	Machamp	Fighting	NaN	505	90	130	80	65	85	55	1	False

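# Spread between the highest and lowest of the six base stats (HP through Speed)
df2['range'] = df2.iloc[:, 5:11].max(axis=1) - df2.iloc[:, 5:11].min(axis=1)
df_temp = df2[['Type 1', 'range']].set_index('Type 1')
# Find the Type 1 with the largest mean spread
mrg = 0
result = ''
for i in df_temp.index.unique():
    temp = df_temp.loc[i, :].mean()
    if temp.values[0] > mrg:
        mrg = temp.values[0]
        result = i
result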

'Steel'
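
The loop over unique index values can usually be replaced by `groupby` followed by `idxmax`. The sketch below shows the idea on a small invented table (the `toy` frame, its three stat columns, and all values are assumptions for illustration, not rows from Pokemon.csv).

```python
import pandas as pd

# Hypothetical mini-table; values are invented for illustration
toy = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Fire", "Fire"],
    "HP": [45, 60, 39, 58],
    "Attack": [49, 62, 52, 64],
    "Speed": [45, 60, 65, 80],
})

stat_cols = ["HP", "Attack", "Speed"]

# Per-row spread between the largest and smallest stat
toy["range"] = toy[stat_cols].max(axis=1) - toy[stat_cols].min(axis=1)

# Mean spread per type, then the type with the largest mean
mean_range = toy.groupby("Type 1")["range"].mean()
widest_type = mean_range.idxmax()
print(widest_type)
```

Selecting the stat columns by name rather than by position also keeps the calculation correct if the column order of the file changes.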

# Most common Type 1 among Legendary Pokémon
df2.query('Legendary == True')['Type 1'].value_counts().index[0]

'Psychic'

# Type 1 with the highest mean Total among Legendary Pokémon
df_temp = df2.query('Legendary == True')[['Type 1', 'Total']].set_index('Type 1')
mval = 0
result = ''
for i in df_temp.index.unique():
    temp = float(df_temp.loc[i, :].mean())
    if temp > mval:
        mval = temp
        result = i
result

'Normal'
