Fetching Movies from 电影天堂 (Dianying Tiantang)

Latest recommended article published on 2019-05-23 08:39:08

J__333

Columns: Happy Programming, Python, Tech Interaction, Programmer Life. Tags: python, crawler, movies

Article link: https://blog.csdn.net/J__333/article/details/81988584


from urllib import request
import re
import pymysql

# Connect to the local MySQL server; the `dianying` table must already exist in `xueqiu`.
db = pymysql.connect(host='127.0.0.1', user='root', password='123456', port=3306, database='xueqiu')
cursor = db.cursor()
for i in range(3):
    # Listing page for page index i
    url = 'http://www.ygdy8.com/html/gndy/dyzz/list_23_' + str(i) + '.html'
    headers = {
        'Referer': 'http://www.ygdy8.com/',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.26 Safari/537.36 Core/1.63.5702.400 QQBrowser/10.2.1893.400'
    }
    req = request.Request(url, headers=headers)
    response = request.urlopen(req)
    # Decode as GBK, dropping any bytes that cannot be decoded
    html = response.read().decode('gbk', 'ignore')
    # Relative links to each movie's detail page
    art = r'<a href="(.*?)" class="ulink">'
    links = re.findall(art, html)  # separate name so the loop counter i is not overwritten
    for n in links:
        url = 'http://www.ygdy8.com' + str(n)
        req = request.Request(url, headers=headers)
        response = request.urlopen(req)
        html1 = response.read().decode('gbk', 'ignore')
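        # Patterns for the page title, the magnet link (cili) and the ftp link (xunlei)
        asr = r'<title>(.*?)</title>'
        aer = r'<a href="(.*?)"><strong><font'
        acr = r'bgcolor="#fdfddf"><a href="(.*?)">ftp'
        m_title = re.search(asr, html1)
        m_cili = re.search(aer, html1)
        m_xunlei = re.search(acr, html1)
        # re.search returns None on no match; skip such pages instead of crashing
        if not (m_title and m_cili and m_xunlei):
            continue
        title = m_title.group(1)
        cili = m_cili.group(1)
        xunlei = m_xunlei.group(1)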
        # Parameterized query: the driver escapes quotes in the scraped values
        sql = "insert into dianying(title, cili, xunlei) values (%s, %s, %s)"
        cursor.execute(sql, (title, cili, xunlei))
        db.commit()
cursor.close()
db.close()
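
The extraction step can be tried without touching the network or the database. Below is a minimal sketch that reuses the four regular expressions from the script; the helper names `parse_listing` and `parse_detail` and the sample HTML fragments are made up for illustration and are not taken from the real site.

```python
import re

# The same four patterns used in the script
DETAIL_LINK = re.compile(r'<a href="(.*?)" class="ulink">')
TITLE = re.compile(r'<title>(.*?)</title>')
MAGNET = re.compile(r'<a href="(.*?)"><strong><font')
FTP = re.compile(r'bgcolor="#fdfddf"><a href="(.*?)">ftp')


def parse_listing(html):
    """Return the relative detail-page links found in a listing page."""
    return DETAIL_LINK.findall(html)


def parse_detail(html):
    """Return (title, cili, xunlei), or None if any pattern does not match."""
    matches = [p.search(html) for p in (TITLE, MAGNET, FTP)]
    if not all(matches):
        return None
    return tuple(m.group(1) for m in matches)


# Hypothetical HTML fragments, invented for this example only
sample_listing = '<a href="/html/gndy/dyzz/1.html" class="ulink">Movie A</a>'
sample_detail = (
    '<title>Movie A</title>'
    '<a href="magnet:?xt=urn:btih:abc"><strong><font color="red">magnet</font></strong></a>'
    '<td bgcolor="#fdfddf"><a href="ftp://example.com/a.mkv">ftp://example.com/a.mkv</a></td>'
)

print(parse_listing(sample_listing))
print(parse_detail(sample_detail))
print(parse_detail('<title>only a title</title>'))
```

Returning `None` for a page with a missing field lets the caller skip that page rather than fail on `.group(1)`.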