Python Network Programming Basics (2): Web Services

runningtortoise

Categories: Reading Notes, pyryday. Tags: python, web, programming, networking, import, debian

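Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.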

Article link: https://blog.csdn.net/runningtortoise/article/details/4324884

Chapter 6: Web Client Access

1. Fetching web pages

Reading a page:

import sys
import urllib2

req = urllib2.Request('http://www.python.org')
page = urllib2.urlopen(req)
# Write the response to stdout line by line.
for line in page:
    sys.stdout.write(line)

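If the URL passed to Request does not include a scheme (such as http://), the request fails with an error (ValueError: unknown url type).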
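The scheme check can be tried without touching the network. The following is a minimal sketch, assuming Python 3, where urllib2 became urllib.request and the check happens as soon as the Request object is constructed. The helper name build_request is hypothetical.

```python
import urllib.request


def build_request(url):
    """Return a Request for url, or None if the URL has no scheme."""
    try:
        return urllib.request.Request(url)
    except ValueError as exc:
        # Raised as "unknown url type" when the scheme is missing.
        print('rejected %r: %s' % (url, exc))
        return None


ok = build_request('http://www.python.org')
bad = build_request('www.python.org')
print(ok.full_url)
print(bad)
```

Prepending 'http://' to a bare host name before building the request is the usual fix.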

The info() method returns the headers of the response:

import urllib2

req = urllib2.Request('http://www.python.org/')
page = urllib2.urlopen(req)
# info() returns the response headers as a message object.
info = page.info()
print info

Output:

Date: Fri, 12 Jun 2009 13:07:11 GMT
Server: Apache/2.2.9 (Debian) DAV/2 SVN/1.5.1 mod_ssl/2.2.9 OpenSSL/0.9.8g mod_wsgi/2.3 Python/2.5.2
Last-Modified: Fri, 12 Jun 2009 10:01:57 GMT
ETag: "105800d-43bd-46c23cc794f40"
Accept-Ranges: bytes
Content-Length: 17341
Connection: close
Content-Type: text/html
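The object returned by info() behaves like an email-style message, so individual headers can be looked up by name rather than printed as a whole. The following is a minimal offline sketch, assuming Python 3 (where the returned object is an http.client.HTTPMessage, a subclass of email.message.Message). It parses a few of the header lines shown above with the standard email package instead of using a live response; the header name X-Not-Present is made up for illustration.

```python
import email

# A few header lines copied from the output above.
raw_headers = (
    'Date: Fri, 12 Jun 2009 13:07:11 GMT\n'
    'Content-Length: 17341\n'
    'Connection: close\n'
    'Content-Type: text/html\n'
    '\n'
)

headers = email.message_from_string(raw_headers)

# Header names are matched case-insensitively.
content_type = headers['content-type']
content_length = int(headers['Content-Length'])
# get() allows a default for headers the server did not send.
missing = headers.get('X-Not-Present', 'n/a')

print(content_type, content_length, missing)
```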

2. Authentication

dump_info_auth.py shows how to use urllib2 to open a page that requires authentication.
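The script dump_info_auth.py is not reproduced in these notes, and the code below is not that script. As a rough idea of what HTTP basic authentication can look like, here is a sketch assuming Python 3's urllib.request (Python 2's urllib2 provides classes with the same names). The URL and credentials are placeholders, and nothing is sent over the network.

```python
import urllib.request

# Placeholder values, not taken from the original script.
URL = 'http://www.example.com/protected/'
USER = 'alice'
PASSWORD = 'secret'

# The password manager maps (realm, URL prefix) to credentials.
# A realm of None means "use these for any realm at this URL".
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, URL, USER, PASSWORD)

# The handler answers 401 responses using the stored credentials.
auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(auth_handler)

# opener.open(URL) would now retry with an Authorization header
# whenever the server asks for basic authentication.
credentials = password_mgr.find_user_password(None, URL)
print(credentials)
```

Hard-coding credentials is only acceptable in a sketch. An interactive script would more likely prompt for them, for example with getpass.getpass().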

3. Submitting form data

With the GET method, you can build the URL by hand, or use urllib's urlencode function:

import urllib
import urllib2

url = 'http://www.wunderground.com/cgi-bin/findweather/getForecast'
# Append the URL-encoded query string to the base URL.
url = url + '?' + urllib.urlencode([('query', 'shenyang')])
# print url
req = urllib2.Request(url)
page = urllib2.urlopen(req)
info = page.info()
print info
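The main reason to prefer urlencode over hand-built URLs is escaping. The sketch below shows this offline, assuming Python 3, where urlencode lives in urllib.parse. The helper build_get_url and the units parameter are hypothetical and exist only to show the escaping.

```python
from urllib.parse import urlencode

BASE_URL = 'http://www.wunderground.com/cgi-bin/findweather/getForecast'


def build_get_url(base_url, params):
    """Append URL-encoded params to base_url as a query string."""
    return base_url + '?' + urlencode(params)


simple = build_get_url(BASE_URL, [('query', 'shenyang')])
# Spaces and reserved characters such as & are escaped automatically.
escaped = urlencode([('query', 'new york'), ('units', 'a&b')])

print(simple)
print(escaped)
```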

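With the POST method, unlike GET, the form data is not appended to the URL as a query string. Instead, the encoded data is passed as an argument to urlopen(), which sends it in the request body.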

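import urllib
import urllib2

url = 'http://www.wunderground.com/cgi-bin/findweather/getForecast'
# Encode the form fields; they travel in the request body, not the URL.
data = urllib.urlencode([('query', 'shenyang')])
req = urllib2.Request(url)
# Supplying data makes urlopen() issue a POST instead of a GET.
page = urllib2.urlopen(req, data)
info = page.info()
print info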
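Whether a request goes out as GET or POST depends only on whether data is supplied. The sketch below checks this without sending anything, assuming Python 3's urllib.request, where the POST body must be bytes and can also be attached to the Request object itself.

```python
import urllib.parse
import urllib.request

URL = 'http://www.wunderground.com/cgi-bin/findweather/getForecast'

# In Python 3 the POST body must be bytes, so encode the form string.
form = urllib.parse.urlencode([('query', 'shenyang')]).encode('ascii')

get_req = urllib.request.Request(URL)
post_req = urllib.request.Request(URL, data=form)

# No request is sent here; we only inspect what would be sent.
print(get_req.get_method())
print(post_req.get_method())
print(post_req.data)
```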

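4. Handling errors

error_all.py catches exceptions raised while connecting, and checks whether the length of the received document matches the Content-Length header.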
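The script error_all.py is not reproduced in these notes. The following sketch shows the same two ideas under stated assumptions: it targets Python 3 (urllib.request and urllib.error instead of urllib2), and the function names fetch and length_matches are hypothetical.

```python
import urllib.error
import urllib.request


def length_matches(content_length, body):
    """Compare a Content-Length header value with the bytes received.

    Returns None when the header is absent, so the caller can tell
    "no header" apart from "mismatch".
    """
    if content_length is None:
        return None
    return int(content_length) == len(body)


def fetch(url):
    """Download url; return the body, or None on HTTPError/URLError."""
    try:
        page = urllib.request.urlopen(url)
        body = page.read()
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404.
        print('HTTP error:', exc.code)
        return None
    except urllib.error.URLError as exc:
        # The connection itself failed (DNS, refused, timeout, ...).
        print('connection error:', exc.reason)
        return None
    if length_matches(page.info().get('Content-Length'), body) is False:
        print('warning: body length differs from Content-Length')
    return body
```

HTTPError is a subclass of URLError, so it has to be caught first; otherwise the more general handler would swallow it.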

Chapter 7: Parsing HTML and XHTML

Use the HTMLParser module that ships with Python. The following parser collects a document's title and counts its start tags.

# -*- coding: cp936 -*-
from HTMLParser import HTMLParser
import urllib2


# Parses the title of a web page and counts its start tags.
class TitleParser(HTMLParser):
    def __init__(self):
        # Text of the title element
        self.title = ''
        self.readingtitle = 0
        self.count = 0
        HTMLParser.__init__(self)

    def handle_starttag(self, tag, attrs):
        self.count += 1
        if tag == 'title':
            self.readingtitle = 1

    def handle_data(self, data):
        if self.readingtitle:
            # Append, since the title text can arrive in several chunks.
            self.title += data

    def handle_endtag(self, tag):
        if tag == 'title':
            self.readingtitle = 0
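The listing above only defines the parser; it never feeds it a document. The following sketch is a Python 3 variant of the same class (html.parser instead of HTMLParser) that appends title text rather than overwriting it, driven by a small hand-written HTML string. The SAMPLE document is an assumption made for illustration; a real script would feed in the downloaded page instead.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the <title> text and count start tags."""

    def __init__(self):
        super().__init__()
        self.title = ''
        self.readingtitle = False
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1
        if tag == 'title':
            self.readingtitle = True

    def handle_data(self, data):
        if self.readingtitle:
            # Text may arrive in several pieces, so accumulate it.
            self.title += data

    def handle_endtag(self, tag):
        if tag == 'title':
            self.readingtitle = False


# A small hand-written document, used instead of a downloaded page.
SAMPLE = ('<html><head><title>Sample page</title></head>'
          '<body><p>Hello</p><p>World</p></body></html>')

parser = TitleParser()
parser.feed(SAMPLE)
parser.close()
print(parser.title)
print(parser.count)
```

Only start tags are counted, so the sample yields one count each for html, head, title and body, plus two for the p elements.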
