Web Scraping: pandas

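This article shows how to use the pandas library to process data collected by a web scraper, including reading the CSV files site.csv and 123.csv and converting data to DataFrame format with the toDataFrame.py script.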
123.py
import pandas as pd

df=pd.read_csv('./123.csv')

# Print one column; check whether that column contains missing values
print(df['NUM_BEDROOMS'])
print(df['NUM_BEDROOMS'].isnull())
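# dropna() returns a new DataFrame; pass inplace=True to modify the original data instead
df2=df.dropna()
# Drop a row only when the specified column is missing in that row
df3=df.dropna(subset=['ST_NUM'])
# Replace every missing value (the dirty data) with '666'
df4=df.fillna('666')
# Replace all missing values in one column with 12345
# (assigning the result back avoids chained-assignment problems in recent pandas)
df['PID']=df['PID'].fillna(12345)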
print(df) 
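# Fill missing values with the mean, the median, or the mode
avg=df['ST_NUM'].mean()
zhongwei=df['ST_NUM'].median()
# mode is a method and returns a Series (ties are possible), so call it and take the first value
zhong=df['ST_NUM'].mode()[0]
df.fillna(zhong,inplace=True)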
print(df)
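
Calling fillna() on the whole DataFrame with one number, as above, also writes that number into text columns. A minimal sketch of a per-column alternative, assuming a small hand-made frame (the sample values and the name `filled` are illustrative and are not taken from 123.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the contents of 123.csv
df = pd.DataFrame({
    "ST_NUM": [104.0, 197.0, np.nan, 201.0, 201.0],
    "ST_NAME": ["PUTNAM", "LEXINGTON", None, "BERKELEY", "BERKELEY"],
})

mean_value = df["ST_NUM"].mean()       # 175.75
median_value = df["ST_NUM"].median()   # 199.0
mode_value = df["ST_NUM"].mode()[0]    # 201.0

# A dict passed to fillna() fills each column with its own replacement value
filled = df.fillna({"ST_NUM": median_value, "ST_NAME": "UNKNOWN"})
print(filled)
```

Which statistic to use depends on the column: the median is less sensitive to outliers than the mean, and the mode is the usual choice for categorical values.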

 

123.csv

PID,ST_NUM,ST_NAME,OWN_OCCUPIED,NUM_BEDROOMS,NUM_BATH,SQ_FT
100001000,104,PUTNAM,Y,3,1,1000
100002000,197,LEXINGTON,N,3,1.5,--
100003000,,LEXINGTON,N,n/a,1,850
100004000,201,BERKELEY,12,1,NaN,700
,203,BERKELEY,Y,3,2,1600
100006000,207,BERKELEY,Y,NA,1,800
100007000,NA,WASHINGTON,,2,HURLEY,950
100008000,213,TREMONT,Y,1,1,
100009000,215,TREMONT,Y,na,2,1800
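
In this file, read_csv treats empty cells and the tokens n/a, NA and NaN as missing by default, while na and -- are read as ordinary strings. A sketch, assuming those two tokens should count as missing too, using the `na_values` parameter of `read_csv` on a few rows copied from the file:

```python
import io
import pandas as pd

# A few rows copied from 123.csv
csv_text = """PID,ST_NUM,ST_NAME,OWN_OCCUPIED,NUM_BEDROOMS,NUM_BATH,SQ_FT
100001000,104,PUTNAM,Y,3,1,1000
100002000,197,LEXINGTON,N,3,1.5,--
100003000,,LEXINGTON,N,n/a,1,850
100009000,215,TREMONT,Y,na,2,1800
"""

default_df = pd.read_csv(io.StringIO(csv_text))
# na_values adds extra tokens to the default list of missing-value markers
custom_df = pd.read_csv(io.StringIO(csv_text), na_values=["na", "--"])

print(default_df["NUM_BEDROOMS"].isnull().sum())  # only "n/a" is detected
print(custom_df["NUM_BEDROOMS"].isnull().sum())   # "n/a" and "na" are detected
print(custom_df["SQ_FT"].isnull().sum())          # "--" is detected
```

Values such as 12 in OWN_OCCUPIED or HURLEY in NUM_BATH are not missing, just wrong, so they need a separate validation step.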

 

data.py
import pandas as pd

data={
        "Date":['2020/12/01','2020/12/02','20201226'],
        "duration":[50,40,45]
}
df=pd.DataFrame(data,index=['day1','day2','day3'])
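# Note: the list mixes two date formats; pandas 2.0+ may reject this unless format='mixed' is passed
df['Date']=pd.to_datetime(df['Date'])
# Either print(df) or print(df.to_string()) works here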
print(df.to_string())
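
pandas 2.0 and later infer a single format from the first element, so a column that mixes '2020/12/01' with '20201226' can raise a ValueError. A minimal sketch of one workaround that should not depend on the pandas version: parse each string on its own. On pandas 2.0+, passing `format='mixed'` to `pd.to_datetime` is an alternative.

```python
import pandas as pd

data = {
    "Date": ["2020/12/01", "2020/12/02", "20201226"],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# Parse every string separately so no single format is imposed on the whole column
df["Date"] = [pd.to_datetime(value) for value in df["Date"]]
print(df.to_string())
```

Element-by-element parsing is slower than a vectorized call, which is acceptable for small frames like this one.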

 

json2.py
import pandas as pd

url='https://static.runoob.com/download/sites.json'
df = pd.read_json(url)
print(df)
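
read_json accepts a URL, a file path, or a file-like object. A sketch that runs without network access, assuming the JSON is an array of records; the records below are made up for illustration and are not necessarily what sites.json contains:

```python
import io
import pandas as pd

# Hypothetical records used only for this example
json_text = (
    '[{"id": "A001", "name": "Google", "likes": 124},'
    ' {"id": "A002", "name": "Runoob", "likes": 52}]'
)

# Each object in the array becomes one row; keys become column names
df = pd.read_json(io.StringIO(json_text))
print(df)
```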

 

import pandas as pd

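# nba.csv is inside this project; use an absolute path if the file is elsewhere, e.g. on the desktop
df=pd.read_csv('./nba.csv',encoding='GBK')
print(df)
# Writes nba2.csv, with the row index added as the first column
df.to_csv('./nba2.csv',encoding='utf-8')
# Define three lists
nme=["Google","Runoob","Taobao","Wiki"]
st=["www.google.com","www.runoob.com","www.taobao.com","www.wikipedia.org"]
ag=[90,40,80,98]
# Convert the lists into a dictionary (named data to avoid shadowing the built-in dict)
data={'name':nme,'site':st,'age':ag}
df = pd.DataFrame(data)
# Save the DataFrame to site.csv
df.to_csv('site.csv')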
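
By default to_csv() writes the row index as an unnamed first column, which is where the leading comma in the CSV header comes from. A small sketch, assuming you either want to omit that column or restore it when reading the file back; it works on in-memory strings, so no files are created:

```python
import io
import pandas as pd

df = pd.DataFrame({
    "name": ["Google", "Runoob"],
    "site": ["www.google.com", "www.runoob.com"],
    "age": [90, 40],
})

with_index = df.to_csv()                # no path given: returns the CSV as a string
without_index = df.to_csv(index=False)  # omit the row index column

# If the index was written, index_col=0 turns the first column back into the index
restored = pd.read_csv(io.StringIO(with_index), index_col=0)

print(with_index.splitlines()[0])
print(without_index.splitlines()[0])
```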

 

site.csv

,name,site,age
0,Google,www.google.com,90
1,Runoob,www.runoob.com,40
2,Taobao,www.taobao.com,80
3,Wiki,www.wikipedia.org,98

 

nba2.py
import pandas as pd

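# Read the CSV file into a DataFrame
df=pd.read_csv('./nba2.csv')
# Print the first 10 rows
print(df.head(10))
# Print the last 10 rows
print(df.tail(10))
# Show basic information: all column names, their data types, non-null counts, and so on
# (info is a method and prints its report itself, so it must be called with parentheses)
df.info()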
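
info() writes its report directly to stdout and returns None, so it is normally called on its own rather than wrapped in print(). A sketch, assuming you want the report as a string (for logging, say); the frame holds three rows in the shape of nba2.csv with only some of its columns:

```python
import io
import pandas as pd

# Three rows in the shape of nba2.csv (subset of columns)
df = pd.DataFrame({
    "Name": ["Avery Bradley", "Jae Crowder", "John Holland"],
    "Age": [25, 25, 27],
    "Salary": [7730337.0, 6796117.0, None],
})

print(df.head(2))  # first 2 rows
print(df.tail(1))  # last row

# Pass buf= to capture the report instead of printing it
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```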

nba2.csv
,Name,Team,Number,Position,Age,Height,Weight,College,Salary
0,Avery Bradley,Boston Celtics,0,PG,25,6月2日,180,Texas,7730337.0
1,Jae Crowder,Boston Celtics,99,SF,25,6月6日,235,Marquette,6796117.0
2,John Holland,Boston Celtics,30,SG,27,6月5日,205,Boston University,
3,R.J. Hunter,Boston Celtics,28,SG,22,6月5日,185,Georgia State,1148640.0
4,Jonas Jerebko,Boston Celtics,8,PF,29,6月10日,231,,5000000.0
5,Amir Johnson,Boston Celtics,90,PF,29,6月9日,240,,12000000.0
6,Jordan Mickey,Boston Celtics,55,PF,21,6月8日,235,LSU,1170960.0
7,Kelly Olynyk,Boston Celtics,41,C,25,7-0,238,Gonzaga,2165160.0
8,Terry Rozier,Boston Celtics,12,PG,22,6月2日,190,Louisville,1824360.0
9,Marcus Smart,Boston Celtics,36,PG,22,6月4日,220,Oklahoma State,3431040.0
10,Jared Sullinger,Boston Celtics,7,C,24,6月9日,260,Ohio State,2569260.0
11,Isaiah Thomas,Boston Celtics,4,PG,27,5月9日,185,Washington,6912869.0
12,Evan Turner,Boston Celtics,11,SG,27,6月7日,220,Ohio State,3425510.0
13,James Young,Boston Celtics,13,SG,20,6月6日,215,Kentucky,1749840.0
14,Tyler Zeller,Boston Celtics,44,C,26,7-0,253,North Carolina,2616975.0
15,Bojan Bogdanovic,Brooklyn Nets,44,SG,27,6月8日,216,,3425510.0
16,Markel Brown,Brooklyn Nets,22,SG,24,6月3日,190,Oklahoma State,845059.0
17,Wayne Ellington,Brooklyn Nets,21,SG,28,6月4日,200,North Carolina,1500000.0
18,Rondae Hollis-Jefferson,Brooklyn Nets,24,SG,21,6月7日,220,Arizona,1335480.0
19,Jarrett Jack,Brooklyn Nets,2,PG,32,6月3日,200,Georgia Tech,6300000.0
20,Sergey Karasev,Brooklyn Nets,10,SG,22,6月7日,208,,1599840.0
21,Sean Kilpatrick,Brooklyn Nets,6,SG,26,6月4日,219,Cincinnati,134215.0
22,Shane Larkin,Brooklyn Nets,0,PG,23,5月11日,175,Miami (FL),1500000.0
23,Brook Lopez,Brooklyn Nets,11,C,28,7-0,275,Stanford,19689000.0
24,Chris McCullough,Brooklyn Nets,1,PF,21,6月11日,200,Syracuse,1140240.0
25,Willie Reed,Brooklyn Nets,33,PF,26,6月10日,220,Saint Louis,947276.0
26,Thomas Robinson,Brooklyn Nets,41,PF,25,6月10日,237,Kansas,981348.0
27,Henry Sims,Brooklyn Nets,14,C,26,6月10日,248,Georgetown,947276.0
28,Donald Sloan,Brooklyn Nets,15,PG,28,6月3日,205,Texas A&M,947276.0
29,Thaddeus Young,Brooklyn Nets,30,PF,27,6月8日,221,Georgia Tech,11235955.0
30,Arron Afflalo,New York Knicks,4,SG,30,6月5日,210,UCLA,8000000.0
31,Lou Amundson,New York Knicks,17,PF,33,6月9日,220,UNLV,1635476.0
32,Thanasis Antetokounmpo,New York Knicks,43,SF,23,6月7日,205,,30888.0
33,Carmelo Anthony,New York Knicks,7,SF,32,6月8日,240,Syracuse,22875000.0
34,Jose Calderon,New York Knicks,3,PG,34,6月3日,200,,7402812.0
35,Cleanthony Early,New York Knicks,11,SF,25,6月8日,210,Wichita State,845059.0
36,Langston Galloway,New York Knicks,2,SG,24,6月2日,200,Saint Joseph's,845059.0
37,Jerian Grant,New York Knicks,13,PG,23,6月4日,195,Notre Dame,1572360.0
38,Robin Lopez,New York Knicks,8,C,28,7-0,255,Stanford,12650000.0
39,Kyle O'Quinn,New York Knicks,9,PF,26,6月10日,250,Norfolk State,3750000.0
40,Kristaps Porzingis,New York Knicks,6,PF,20,7月3日,240,,4131720.0
41,Kevin Seraphin,New York Knicks,1,C,26,6月10日,278,,2814000.0
42,Lance Thomas,New York Knicks,42,SF,28,6月8日,235,Duke,1636842.0
43,Sasha Vujacic,New York Knicks,18,SG,32,6月7日,195,,947276.0
44,Derrick Williams,New York Knicks,23,PF,25,6月8日,240,Arizona,4000000.0
45,Tony Wroten,New York Knicks,5,SG,23,6月6日,205,Washington,167406.0
46,Elton Brand,Philadelphia 76ers,42,PF,37,6月9日,254,Duke,
47,Isaiah Canaan,Philadelphia 76ers,0,PG,25,6-0,201,Murray State,947276.0
48,Robert Covington,Philadelphia 76ers,33,SF,25,6月9日,215,Tennessee State,1000000.0
49,Joel Embiid,Philadelphia 76ers,21,C,22,7-0,250,Kansas,4626960.0
50,Jerami Grant,Philadelphia 76ers,39,SF,22,6月8日,210,Syracuse,845059.0
51,Richaun Holmes,Philadelphia 76ers,22,PF,22,6月10日,245,Bowling Green,1074169.0
52,Carl Landry,Philadelphia 76ers,7,PF,32,6月9日,248,Purdue,6500000.0
53,Kendall Marshall,Philadelphia 76ers,5,PG,24,6月4日,200,North Carolina,2144772.0
54,T.J. McConnell,Philadelphia 76ers,12,PG,24,6月2日,200,Arizona,525093.0
55,Nerlens Noel,Philadelphia 76ers,4,PF,22,6月11日,228,Kentucky,3457800.0
56,Jahlil Okafor,Philadelphia 76ers,8,C,20,6月11日,275,Duke,4582680.0
57,Ish Smith,Philadelphia 76ers,1,PG,27,6-0,175,Wake Forest,947276.0
58,Nik Stauskas,Philadelphia 76ers,11,SG,22,6月6日,205,Michigan,2869440.0
59,Hollis Thompson,Philadelphia 76ers,31,SG,25,6月8日,206,Georgetown,947276.0
60,Christian Wood,Philadelphia 76ers,35,PF,20,6月11日,220,UNLV,525093.0
61,Bismack Biyombo,Toronto Raptors,8,C,23,6月9日,245,,2814000.0
62,Bruno Caboclo,Toronto Raptors,20,SF,20,6月9日,205,,1524000.0
63,DeMarre Carroll,Toronto Raptors,5,SF,29,6月8日,212,Missouri,13600000.0
64,DeMar DeRozan,Toronto Raptors,10,SG,26,6月7日,220,USC,10050000.0
65,James Johnson,Toronto Raptors,3,PF,29,6月9日,250,Wake Forest,2500000.0
66,Cory Joseph,Toronto Raptors,6,PG,24,6月3日,190,Texas,7000000.0
67,Kyle Lowry,Toronto Raptors,7,PG,30,6-0,205,Villanova,12000000.0
68,Lucas Nogueira,Toronto Raptors,92,C,23,7-0,220,,1842000.0
69,Patrick Patterson,Toronto Raptors,54,PF,27,6月9日,235,Kentucky,6268675.0
70,Norman Powell,Toronto Raptors,24,SG,23,6月4日,215,UCLA,650000.0
71,Terrence Ross,Toron