Python Web Scraping, Step One: Fetching a Page's Source Code

Latest recommended article published 2023-04-21 22:18:00

红金龙-时光

Article link: https://blog.csdn.net/hongjinlongno1/article/details/51648687

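This post is a basic tutorial on fetching a web page's source code with the urllib library under Python 2.7. It covers fixing the Chinese-character encoding problem in Python source files and first steps in web scraping with PyCharm and Notepad++. (Summary generated automatically by CSDN.)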

Software used: Python 2.7 + PyCharm. I will try Python 3.5 + Notepad++ later.

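# coding: utf-8
# Python 2.7: fetch a page's source through a small helper function.
import urllib


def getHtml(url):
    # Open the URL and read the whole response body as a byte string.
    page = urllib.urlopen(url)
    html = page.read()
    return html


# html now holds the raw source of the page.
html = getHtml("http://blog.sina.com.cn/")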
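
The listing above targets Python 2.7, and the post mentions trying Python 3.5 later. In Python 3, `urllib.urlopen` no longer exists; the function lives in `urllib.request`, and `read()` returns bytes that need decoding. The following is a minimal sketch of what that port might look like, not the author's code: the name `get_html`, the `fallback_encoding` and `timeout` parameters, and the choice to replace undecodable bytes are all assumptions made for illustration.

```python
# Python 3 sketch; get_html and its parameters are illustrative names,
# not taken from the original post.
from urllib.request import urlopen


def get_html(url, fallback_encoding="utf-8", timeout=10):
    """Fetch url and return its body decoded to text."""
    with urlopen(url, timeout=timeout) as page:
        raw = page.read()  # bytes in Python 3, not str
        # Prefer the charset from the Content-Type header when one is sent.
        charset = page.headers.get_content_charset() or fallback_encoding
    # Assumption: substitute undecodable bytes instead of raising an error.
    return raw.decode(charset, errors="replace")
```

A call such as `html = get_html("http://blog.sina.com.cn/")` followed by `print(html[:200])` would show the start of the page. Pages that declare their charset only inside an HTML `<meta>` tag would still fall back to `fallback_encoding` in this sketch.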

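# coding=utf-8
# Python 2.7: the same fetch without a helper function.
import urllib

# Open the page and print its source directly.
page = urllib.urlopen("http://blog.sina.com.cn/")
print page.read()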
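
Both listings begin with a source-encoding comment, which is what lets a Python 2 file contain Chinese text (Python 2 assumes ASCII when no encoding is declared). The exact spelling matters: the interpreter only recognizes `coding` followed immediately by `:` or `=`, so a comment with a space before the `=` is silently ignored. The sketch below checks this with the standard library's `tokenize.detect_encoding`. It runs on Python 3, where the default is UTF-8, so `gbk` is used to make the difference visible; `declared_encoding` is a hypothetical helper name.

```python
# Python 3 sketch; declared_encoding is a hypothetical helper name.
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python would use to read this source text."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


print(declared_encoding(b"# coding: gbk\nimport urllib\n"))   # gbk
print(declared_encoding(b"# coding=gbk\nimport urllib\n"))    # gbk
# The space before "=" means the comment is not a valid declaration,
# so the Python 3 default encoding is reported instead.
print(declared_encoding(b"# coding = gbk\nimport urllib\n"))  # utf-8
```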
