Python crawler notes

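Crawlers are commonly used programs, and they are easy to write in Python with the help of a few libraries. These notes record some frequently used Python crawler code for future reference.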

1 requests

import requests

url ='http://onevanillachecker.com/'

rep = requests.get(url)

rep.encoding = 'utf-8'

print(rep.text)

Notes on some parameters

import random

import requests

url ='http://onevanillachecker.com/'

header={

'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',

'Accept-Encoding': 'gzip, deflate, sdch',

'Accept-Language': 'zh-CN,zh;q=0.8',

'Connection': 'keep-alive',

'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X x.y; rv:42.0) Gecko/20100101 Firefox/42.0'

}

timeout = random.choice(range(80, 180))

rep = requests.get(url,headers = header,timeout = timeout)

rep.encoding = 'utf-8'

print(rep.text)
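
The snippet above does not check whether the request actually succeeded. A minimal sketch of a more defensive fetch follows; the helper name `fetch_text`, the shortened header set and the 5 to 15 second timeout range are assumptions made for illustration, not part of the original notes.

```python
import random

import requests

# Shortened header set for illustration (an assumption, not the full set above).
HEADERS = {
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X x.y; rv:42.0) Gecko/20100101 Firefox/42.0',
}


def fetch_text(url, session=None):
    """Hypothetical helper: return the page text, or None if the request fails."""
    session = session or requests.Session()
    # Timeout in seconds; the 5-15 range is an arbitrary choice.
    timeout = random.randint(5, 15)
    try:
        rep = session.get(url, headers=HEADERS, timeout=timeout)
        rep.raise_for_status()  # raises on 4xx/5xx responses
    except requests.RequestException as exc:
        print('request failed:', exc)
        return None
    rep.encoding = 'utf-8'
    return rep.text
```

Calling `fetch_text('http://onevanillachecker.com/')` should return the HTML as a string, or `None` after printing the error if the site cannot be reached.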

2 urllib2

import urllib2

req = urllib2.Request('http://onevanillachecker.com/')

response = urllib2.urlopen(req)

html = response.read()
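
`urllib2` exists only in Python 2; in Python 3 the same functionality lives in `urllib.request`. A rough Python 3 equivalent of the snippet above might look like this. The helper names `build_request` and `fetch_html`, the User-Agent value and the 10-second timeout are assumptions added for illustration.

```python
from urllib.request import Request, urlopen


def build_request(url):
    """Hypothetical helper: build a Request carrying a browser-like User-Agent."""
    return Request(url, headers={'User-Agent': 'Mozilla/5.0'})


def fetch_html(url, timeout=10):
    """Hypothetical helper: return the response body decoded as UTF-8."""
    with urlopen(build_request(url), timeout=timeout) as response:
        # read() returns bytes in Python 3, so decode before parsing or printing
        return response.read().decode('utf-8', errors='replace')
```

With these definitions, `html = fetch_html('http://onevanillachecker.com/')` plays the role of the three `urllib2` lines above.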

3 beautifulsoup

beautifulsoup is a library for parsing pages, and it is very convenient to use.

Documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc.zh/

Below are brief notes on a few commonly used features, for future reference.

Installation

pip install beautifulsoup4

Basic usage

from bs4 import BeautifulSoup

import urllib2

req = urllib2.Request('http://onevanillachecker.com/')

response = urllib2.urlopen(req)

html = response.read()

# beautifulsoup

# name the parser explicitly; otherwise bs4 picks one itself and emits a warning
soup = BeautifulSoup(html, 'html.parser')

print(soup.title)

# <title>One Vanilla Gift Card Balance Check -Official Website</title>

print(soup.title.name)

# title

print(soup.title.string)

# One Vanilla Gift Card Balance Check -Official Website

print(soup.title.parent.name)

# head

print(soup.p)

# (the first <p> tag, containing:)
# Life happens every day. And OneVanilla
# helps make it simpler. Shop, dine, fill 'er up
# and more - all with one prepaid card.

# print(soup.p['class'])

print(soup.a)

# Vanilla Gift Card

print(soup.find_all('a'))

# Vanilla Gift Card, Check Vanilla 3 Balance

# Vanilla Gift Cards, Where to Buy # Sign In, About Vanilla Gift Card

# Using Your Vanilla Gift Card, Try Vanilla Gift

# ......

print(soup.find(alt="2"))

# 2

print(soup.get_text())
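
The calls above depend on a live page, so the results change whenever the site does. As a self-contained sketch, the same lookups can be tried on an inline HTML string; the markup below is made up for this example and is not taken from the site.

```python
from bs4 import BeautifulSoup

# A made-up page used only for this example.
html = """
<html><head><title>Demo page</title></head>
<body>
<p class="intro">Hello, <b>crawler</b>.</p>
<a href="/first" id="link1">First</a>
<a href="/second" id="link2">Second</a>
<img src="two.png" alt="2">
</body></html>
"""

soup = BeautifulSoup(html, 'html.parser')

title = soup.title.string                            # text inside <title>
intro_class = soup.p['class']                        # class is multi-valued, so this is a list
hrefs = [a.get('href') for a in soup.find_all('a')]  # href of every <a> tag
second = soup.find(id='link2').get_text()            # look up a tag by its id
img_src = soup.find(alt='2')['src']                  # look up a tag by another attribute

print(title, intro_class, hrefs, second, img_src)
```

`find_all` returns a list of tags, while `find` returns only the first match (or `None` when nothing matches), so the result of `find` is worth checking before indexing into it on real pages.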
