Learning Python multithreading (continued)

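b. When there are many pages to visit, say 1000, we cannot start 1000 threads at once, because the machine may not be able to handle the load. Instead we can set up a thread pool that starts only 40 threads at a time and launches the next batch once those 40 have finished.
For example, below there are 10 threads and I set the pool size to 4.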
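To make the batching concrete, the sketch below works out how many threads each round would start for a given number of tasks and pool size. The helper name `batch_sizes` is hypothetical and exists only for this illustration. The snippet is written for Python 3, not the Python 2 used in the listing that follows.

```python
def batch_sizes(total_tasks, pool_size):
    """Return the number of threads started in each round.

    Hypothetical helper for illustration: each round starts at most
    pool_size threads, and the next round begins once they all finish.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    sizes = []
    remaining = total_tasks
    while remaining > 0:
        count = min(pool_size, remaining)
        sizes.append(count)
        remaining -= count
    return sizes


print(batch_sizes(10, 4))          # [4, 4, 2]
print(len(batch_sizes(1000, 40)))  # 25
```

With 10 tasks and a pool of 4, the rounds start 4, 4 and 2 threads. With 1000 tasks and a pool of 40, there are 25 rounds.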

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import threading
import urllib2
import time

def surf_net(url):
    start_time = time.time()
    print 'surf start', start_time
    req = urllib2.Request(url)
    time.sleep(2)  # artificial delay so the timing difference is visible
    try:
        # open the URL once and keep the status code for the report below
        code = urllib2.urlopen(req).code
    except urllib2.URLError as e:
        print e.reason
        return
    end_time = time.time()
    print url, code, end_time - start_time

url_list = ['https://www.taobao.com', 'https://www.baidu.com', 'https://www.jd.com',
            'http://mail.163.com', 'http://www.csdn.net', 'http://www.weibo.com',
            'http://www.youku.com', 'http://www.dianping.com/', 'https://mp.weixin.qq.com', 'https://xiumi.us/#/']
# ---- sequential baseline: fetch the URLs one by one ----
one_by_one_start = time.time()
for each_url in url_list:
    surf_net(each_url)
one_by_one_end = time.time()
print 'one by one run time is:', one_by_one_end - one_by_one_start
# ---- end of sequential baseline ----

threads = []
start_time = time.time()
# create one (not yet started) thread per URL
for each_url in url_list:
    one_thread = threading.Thread(target=surf_net, args=(each_url,))
    threads.append(one_thread)

thread_num = 4  # pool size: at most 4 threads run at the same time
while threads:
    count = min(thread_num, len(threads))
    print 'count', count  # 4, 4, 2 for 10 URLs

    # take the next batch off the end of the list
    batch = [threads.pop() for _ in range(count)]
    for t in batch:
        t.start()

    # wait for the whole batch to finish before starting the next one
    for t in batch:
        t.join()

end_time = time.time()

print 'start time to end time =', end_time - start_time

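Because the 10 chosen sites all respond quickly, we make the program sleep for 2 seconds on every page visit so that the single-threaded and multithreaded runs can be compared.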

The output is as follows:

surf start 1501241536.07
https://www.taobao.com 200 5.18160605431
surf start 1501241541.41
https://www.baidu.com 200 2.17890906334
surf start 1501241543.74
https://www.jd.com 200 2.21625900269
surf start 1501241546.14
http://mail.163.com 200 2.08968806267
surf start 1501241548.3
http://www.csdn.net 200 2.08174395561
surf start 1501241550.45
http://www.weibo.com 200 2.21271705627
surf start 1501241552.81
http://www.youku.com 200 2.1059448719
surf start 1501241555.0
http://www.dianping.com/ 200 12.4947040081
surf start 1501241577.89
https://mp.weixin.qq.com 200 2.1225938797
surf start 1501241580.13
https://xiumi.us/#/ 200 2.07485389709
one by one run time is: 46.2059190273
count 4
surf start 1501241582.28
surf start surf start 1501241582.281501241582.28

surf start 1501241582.28
http://www.youku.com https://mp.weixin.qq.com 200 7.10403490067
https://xiumi.us/#/ 200 7.15073108673
200 7.22556805611
http://www.dianping.com/ 200 12.6222820282
count 4
surf start 1501241605.45
surf startsurf start 1501241605.45 1501241605.45

surf start 1501241605.45
http://mail.163.com http://www.csdn.net http://www.weibo.com 200 2.07268214226
200 2.08485388756
https://www.jd.com 200 2.08593082428
200 2.19985890388
count 2
surf start 1501241607.84
 surf start 1501241607.84
https://www.baidu.com 200 2.16068816185
https://www.taobao.com 200 7.46481204033
start time to end time = 38.1950109005

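The single-threaded run takes about 46 seconds, while the multithreaded run with a pool of 4 takes about 38 seconds. Increasing the pool size to 5 gives the following output: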

surf start 1501241800.73
https://www.taobao.com 200 5.46020698547
surf start 1501241806.33
https://www.baidu.com 200 2.166918993
surf start 1501241808.65
https://www.jd.com 200 3.22009301186
surf start 1501241812.06
http://mail.163.com 200 2.10153102875
surf start 1501241814.22
http://www.csdn.net 200 2.10018992424
surf start 1501241816.39
http://www.weibo.com 200 7.5566380024
surf start 1501241826.47
http://www.youku.com 200 2.12182092667
surf start 1501241828.67
http://www.dianping.com/ 200 12.511051178
surf start 1501241851.53
https://mp.weixin.qq.com 200 2.22085404396
surf start 1501241853.87
https://xiumi.us/#/ 200 2.12279200554
one by one run time is: 55.3288900852
count 5
surf start 1501241856.06
surf start 1501241856.06surf start 1501241856.06

surf start 1501241856.06
 surf start 1501241856.06
https://xiumi.us/#/ http://www.youku.com https://mp.weixin.qq.com 200 7.07921099663
200 7.08860993385
200 7.13851284981
http://www.weibo.com http://www.dianping.com/ 200 7.34806513786
200 12.405025959
count 5
surf start surf start 1501241878.871501241878.87

surf start 1501241878.87
surf start 1501241878.87
surf start 1501241878.87
http://www.csdn.net http://mail.163.com https://www.baidu.com https://www.jd.com 200 2.14060592651
200 2.10476207733
200 2.20087099075
200 2.21591711044
https://www.taobao.com 200 5.41400504112
start time to end time = 28.6687788963

The single-threaded run takes 55.3 seconds. With a pool of 5 concurrent threads, the run takes 28.7 seconds.

With the pool size set to 10, i.e. thread_num = 10:

/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7 /Users/nfzhlkzn/Documents/StudyCode/StudyThreading/demo5.py
surf start 1501242133.5
https://www.taobao.com 200 2.2023191452
surf start 1501242135.87
https://www.baidu.com 200 2.17822694778
surf start 1501242138.2
https://www.jd.com 200 2.20729494095
surf start 1501242140.59
http://mail.163.com 200 2.11021089554
surf start 1501242142.75
http://www.csdn.net 200 2.09399604797
surf start 1501242144.93
http://www.weibo.com 200 2.21243000031
surf start 1501242147.31
http://www.youku.com 200 2.10412812233
surf start 1501242149.49
http://www.dianping.com/ 200 12.4396719933
surf start 1501242172.45
https://mp.weixin.qq.com 200 2.11350488663
surf start 1501242174.66
https://xiumi.us/#/ 200 2.11214208603
one by one run time is: 44.2924759388
count 10
surf start 1501242177.79
surf start 1501242177.8
surf start 1501242177.8
surf start 1501242177.8
surf start 1501242177.8
 surf start 1501242177.8
surf start 1501242177.8
surf start 1501242177.8
surf start 1501242177.8
surf start 1501242177.8
http://www.youku.com https://mp.weixin.qq.com https://xiumi.us/#/ http://mail.163.com http://www.csdn.net https://www.baidu.com 200 2.43743491173
200 7.09744596481
200 7.10696387291
200 2.10279512405
http://www.weibo.com 200 2.14740610123
https://www.taobao.com https://www.jd.com 200 10.1884939671
200 7.18714308739
200 10.1404249668
200 10.2309341431
http://www.dianping.com/ 200 15.3515269756
start time to end time = 25.8884620667

Process finished with exit code 0

The single-threaded run takes 44.3 seconds. With 10 concurrent threads, the run takes 25.9 seconds.

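Analysis: because the network is unstable, the total time for a single thread to visit the 10 sites differs from run to run. Within each run, however, the multithreaded version takes noticeably less time than the single-threaded one. Note also that a larger thread pool does not automatically bring a proportionally larger reduction in run time. There is an optimal pool size, but finding it is fairly involved and outside the scope of this discussion.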
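One reason a bigger pool gives diminishing returns in the batch-and-join scheme used here is that each round has to wait for its slowest thread. The sketch below is a rough model of that effect, not a measurement. It assumes threads in a round overlap perfectly, the function name `batched_wall_time` is hypothetical, and the durations are made-up numbers (nine quick pages and one slow one). It is written for Python 3.

```python
def batched_wall_time(durations, pool_size):
    """Estimate wall time when tasks run in rounds of pool_size threads.

    Simplified model (an assumption): tasks in a round overlap perfectly,
    so a round takes as long as its slowest task.
    """
    pending = list(durations)
    total = 0.0
    while pending:
        count = min(pool_size, len(pending))
        # Take the batch from the end of the list, like threads.pop() does.
        batch = [pending.pop() for _ in range(count)]
        total += max(batch)
    return total


# Made-up durations in seconds: nine quick pages and one slow one.
durations = [2, 2, 2, 2, 2, 2, 2, 12, 2, 2]

for pool_size in (1, 4, 5, 10):
    print(pool_size, batched_wall_time(durations, pool_size))
# 1 30.0
# 4 16.0
# 5 14.0
# 10 12.0
```

In this model the single slow task sets a floor of 12 seconds, so going from a pool of 4 to a pool of 10 saves far less than going from 1 to 4.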

Conclusion: for I/O-bound workloads, Python multithreading can improve running efficiency.
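The same effect can be reproduced without network access by replacing the request with a sleep. This is only a sketch under stated assumptions. It uses `concurrent.futures.ThreadPoolExecutor` from the Python 3 standard library, which is a different technique from the manual batching of `threading.Thread` objects shown in this article. `fake_fetch`, `timed` and the `page-N` names are hypothetical placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.1):
    """Stand-in for a network request: sleeps, then returns the url."""
    time.sleep(delay)  # sleeping releases the GIL, like waiting on a socket
    return url


def timed(func):
    """Run func() and return (result, elapsed_seconds)."""
    start = time.time()
    result = func()
    return result, time.time() - start


def run_pooled():
    """Fetch all urls with at most 4 worker threads."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(fake_fetch, urls))


urls = ['page-%d' % i for i in range(8)]  # placeholder names, not real URLs

sequential, seq_time = timed(lambda: [fake_fetch(u) for u in urls])
pooled, pool_time = timed(run_pooled)

print('sequential: %.2f s' % seq_time)
print('pool of 4:  %.2f s' % pool_time)
```

Unlike the batch-and-join loop, the executor hands a new task to a worker as soon as that worker is free, so one slow task does not hold back a whole round.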
