A Simple Python Project: Building a Simple Flight-Price Scraper in Python (Part 2)

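This post presents a simple flight-price scraper written in Python. It pulls flight information (departure and arrival times, flight numbers, and so on) from the CTrip API and formats the data. After scraping a large volume of data, the author found that Ctrip (携程) limits requests, which left part of the data invalid. The post reminds readers that scraping may involve illegal behavior and that they should follow the content provider's terms of use.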

0x01. Recap

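In the previous post I briefly covered the single-route approach to scraping the data. After a small update to the code, it can produce the following results: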

To get formatted output, I added the following code to the formatting function and also simplified the parameters the query needs:

def formatFlightInfo(d):
    # Parse flight info
    for i in d:
        # base info is contained in luggageVisaKey
        # outFlight=d['flightSegments'][0]
        # inFlight=d['flightSegments'][1]
        for priceList in i['priceList']:
            priceTags=''
            if priceList.get('priceTags',None):
                for pT in priceList['priceTags']:
                    priceTags+=str(pT['label'])+'|'
            detail=''
            for flight in json.loads(priceList['luggageVisaKey'])['criteria']['FlightInfoList']:
                detail+=flight['MarketingFlightNo']+","
                detail+=flight['DepartureCityCode']+","
                detail+=flight['TakeOffDateTime']+","
                detail+=flight['ArrivalCityCode']+","
                detail+=flight['ArrivalDateTime']+","
                detail+=str(flight['IsTransit'])+","
            print '{},{},{},{},{},{},{},{},{},{},{},{},{},{},{}'.format(
                str('"')+i['itineraryId']+str('"'),
                i['flightSegments'][0]['flightList'][0]['departureDateTime'],
                i['flightSegments'][0]['flightList'][0]['departureCityName']+" - "+i['flightSegments'][0]['flightList'][0]['departureAirportName']+" - "+i['flightSegments'][0]['flightList'][0].get('departureTerminal','null'),
                i['flightSegments'][0]['flightList'][len(i['flightSegments'][0]['flightL
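The flattening idea in the listing above can also be sketched in Python 3 with the standard `csv` module, which takes care of quoting for fields that contain commas. This is only a minimal sketch, not the author's code: the helper names `flight_rows` and `to_csv` are made up for illustration, the sample record is invented, and only the keys visible in the listing above (`itineraryId`, `priceList`, `priceTags`, `label`, `luggageVisaKey`, `criteria`, `FlightInfoList` and the per-leg fields) are assumed to exist in the response.

```python
import csv
import io
import json


def flight_rows(itineraries):
    """Yield one flat row (a list of strings) per price entry."""
    for itinerary in itineraries:
        for price in itinerary["priceList"]:
            # priceTags may be missing; join the labels with '|'
            # (no trailing separator, unlike the original listing).
            tags = "|".join(str(t["label"]) for t in price.get("priceTags") or [])
            # luggageVisaKey is a JSON string that embeds the per-leg details.
            legs = json.loads(price["luggageVisaKey"])["criteria"]["FlightInfoList"]
            row = [itinerary["itineraryId"], tags]
            for leg in legs:
                row += [
                    leg["MarketingFlightNo"],
                    leg["DepartureCityCode"],
                    leg["TakeOffDateTime"],
                    leg["ArrivalCityCode"],
                    leg["ArrivalDateTime"],
                    str(leg["IsTransit"]),
                ]
            yield row


def to_csv(itineraries):
    """Render all rows as CSV text; csv.writer quotes fields when needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in flight_rows(itineraries):
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical sample record; every value here is invented for illustration.
sample = [{
    "itineraryId": "XX100,XX200",
    "priceList": [{
        "priceTags": [{"label": "Sale"}],
        "luggageVisaKey": json.dumps({"criteria": {"FlightInfoList": [{
            "MarketingFlightNo": "XX100",
            "DepartureCityCode": "SHA",
            "TakeOffDateTime": "2018-10-01 08:00:00",
            "ArrivalCityCode": "BJS",
            "ArrivalDateTime": "2018-10-01 10:20:00",
            "IsTransit": False,
        }]}}),
    }],
}]

if __name__ == "__main__":
    print(to_csv(sample), end="")
```

Letting `csv.writer` handle quoting replaces the manual `str('"') + ... + str('"')` wrapping around `itineraryId`, and it keeps working if another field ever contains a comma.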
