multidownloadXkcd: downloading images with multiple threads

 


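Both versions of the code below raised errors when I ran them, so I tried splitting the work up to find the problem.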

 

#! python3
# multidownloadXkcd.py - Downloads XKCD comics using multiple threads.

import requests, os, bs4, threading
os.makedirs('xkcd', exist_ok=True) # store comics in ./xkcd

def downloadXkcd(startComic, endComic):
    for urlNumber in range(startComic, endComic):
        # Download the page.
        print('Downloading page http://xkcd.com/%s...' % (urlNumber))
        res = requests.get('http://xkcd.com/%s' % (urlNumber))
        res.raise_for_status()

        # Name the parser explicitly; bs4 warns when it has to guess one.
        soup = bs4.BeautifulSoup(res.text, 'html.parser')

        # Find the URL of the comic image.
        comicElem = soup.select('#comic img')
        if comicElem == []:
            print('Could not find comic image.')
        else:
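            comicUrl = comicElem[0].get('src')
            # xkcd uses protocol-relative src values ('//imgs.xkcd.com/...');
            # requests.get() raises MissingSchema without a scheme, so add one.
            if comicUrl.startswith('//'):
                comicUrl = 'https:' + comicUrl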
            # Download the image.
            print('Downloading image %s...' % (comicUrl))
            res = requests.get(comicUrl)
            res.raise_for_status()

            # Save the image to ./xkcd
            with open(os.path.join('xkcd', os.path.basename(comicUrl)), 'wb') as imageFile:
                for chunk in res.iter_content(100000):
                    imageFile.write(chunk)

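# Create and start the Thread objects.
# Note: there is no comic 0, so http://xkcd.com/0 returns 404 and
# raise_for_status() ends the first thread with an HTTPError.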
downloadThreads = [] # a list of all the Thread objects
for i in range(0, 1400, 100): # loops 14 times, creates 14 threads
    downloadThread = threading.Thread(target=downloadXkcd, args=(i, i + 99))
    downloadThreads.append(downloadThread)
    downloadThread.start()

# Wait for all threads to end.
for downloadThread in downloadThreads:
    downloadThread.join()
print('Done.')

  

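Here is the new code I wrote. Since both versions had problems, I split the download of a single comic into its own function to experiment.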

#! python3
# multidownloadXkcd.py - Downloads XKCD comics using multiple threads.

import requests, os, bs4, threading
os.makedirs('xkcd', exist_ok=True) # store comics in ./xkcd

def download_single(urlNumber):
    print('Downloading page http://xkcd.com/%s...' % (urlNumber))
    res = requests.get('http://xkcd.com/%s' % (urlNumber))
    res.raise_for_status()

    soup = bs4.BeautifulSoup(res.text,"lxml")

    # Find the URL of the comic image.
    comicElem = soup.select('#comic img')
    if comicElem == []:
        print('Could not find comic image.')
    else:
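        comicUrl = comicElem[0].get('src')
        # xkcd uses protocol-relative src values ('//imgs.xkcd.com/...');
        # requests.get() raises MissingSchema without a scheme, so add one.
        if comicUrl.startswith('//'):
            comicUrl = 'https:' + comicUrl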
        # Download the image.
        print('Downloading image %s...' % (comicUrl))
        res = requests.get(comicUrl)
        res.raise_for_status()

        # Save the image to ./xkcd
        with open(os.path.join('xkcd', os.path.basename(comicUrl)), 'wb') as imageFile:
            for chunk in res.iter_content(100000):
                imageFile.write(chunk)
    

def downloadXkcd(startComic, endComic):
    for urlNumber in range(startComic, endComic):
        # Download the page.
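        try:
            download_single(urlNumber)
        except Exception as exc:
            # Report the failure (for example a 404 page) and move on.
            print('Skipping comic %s: %s' % (urlNumber, exc))
            continue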

# Create and start the Thread objects.
downloadThreads = [] # a list of all the Thread objects
for i in range(0, 1400, 100): # loops 14 times, creates 14 threads
    # i = 0, 100, 200, ..., 1300
    # args = (0, 99), (100, 199), (200, 299), ...
    downloadThread = threading.Thread(target=downloadXkcd, args=(i, i + 99))
    downloadThreads.append(downloadThread) # add to the list of threads
    downloadThread.start()
   
# Wait for all threads to end.
for downloadThread in downloadThreads:
    #print("downloadThread:",downloadThread)
    downloadThread.join()
print('Done.')
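
The listings above pass the img tag's src attribute to requests.get(). That value is not always a complete URL, so here is a minimal sketch of resolving it against the page URL with the standard library's urllib.parse.urljoin. The helper name absolute_image_url and the example paths are assumptions made for illustration; they are not part of the original program.

```python
from urllib.parse import urljoin

def absolute_image_url(page_url, src):
    """Resolve an <img> src value against the URL of the page it came from.

    Hypothetical helper for illustration only.
    """
    # urljoin copes with absolute URLs, protocol-relative ones ('//host/path')
    # and site-relative ones ('/path') in a single call.
    return urljoin(page_url, src)

print(absolute_image_url('https://xkcd.com/353/', '//imgs.xkcd.com/comics/python.png'))
```

With a helper like this, the download code would not need to know which form of URL the page happens to use.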

  

 

 

 

 

For example, calling downloadXkcd(140, 280) would loop over the downloading code to download the comics at http://xkcd.com/140, http://xkcd.com/141, http://xkcd.com/142, and so on, up to http://xkcd.com/279. Each thread that you create will call downloadXkcd() and pass a different range of comics to download.

The code that creates the threads appears in the listings above, under the comment # Create and start the Thread objects.

First we make an empty list, downloadThreads; the list will help us keep track of the many Thread objects we’ll create. Then we start our for loop. Each time through the loop, we create a Thread object with threading.Thread(), append the Thread object to the list, and call start() to start running downloadXkcd() in the new thread. Since the for loop steps the i variable from 0 up to (but not including) 1400 in steps of 100, i will be set to 0 on the first iteration, 100 on the second iteration, 200 on the third, and so on. Since we pass args=(i, i + 99) to threading.Thread(), the two arguments passed to downloadXkcd() will be 0 and 99 on the first iteration, 100 and 199 on the second iteration, 200 and 299 on the third, and so on. Note that range() excludes its end value, so downloadXkcd(0, 99) stops at comic 98 and comics 99, 199, 299, and so on are never requested; passing i + 100 as the second argument would close those gaps.
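
Because range() excludes its end value, it is easy to leave gaps between the per-thread ranges. The sketch below is one possible way to build (start, end) pairs that cover every comic exactly once. The function name comic_ranges and its parameters are hypothetical, and the numbers 1 and 1399 are only example bounds.

```python
def comic_ranges(first, last, chunk):
    """Split comics first..last (inclusive) into (start, end) pairs.

    Hypothetical helper: each pair is meant to be passed to a function
    that loops over range(start, end), so end is exclusive.
    """
    ranges = []
    for start in range(first, last + 1, chunk):
        # Clamp the final chunk so it does not run past the last comic.
        end = min(start + chunk, last + 1)
        ranges.append((start, end))
    return ranges

pairs = comic_ranges(1, 1399, 100)
# Expand every pair to check that no comic is skipped or repeated.
covered = [n for start, end in pairs for n in range(start, end)]
print(pairs[:3])
print(covered == list(range(1, 1400)))
```

Each pair could then be used as the args tuple of one thread, for example threading.Thread(target=downloadXkcd, args=pair).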

As the Thread object’s start() method is called and the new thread begins to run the code inside downloadXkcd(), the main thread will continue to the next iteration of the for loop and create the next thread.
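
To see that start() returns immediately, here is a small sketch that needs no network access. The worker function and the release event are assumptions made for the demonstration; the event stands in for a slow download.

```python
import threading

events = []                  # records the order in which things happen
release = threading.Event()  # stand-in for "the download has finished"

def worker():
    release.wait()           # block until the main thread sets the event
    events.append('worker finished')

t = threading.Thread(target=worker)
t.start()                          # returns right away; worker runs in its own thread
events.append('main continued')    # executed while the worker is still blocked
release.set()                      # now let the worker finish
t.join()
print(events)
```

The main thread records its entry first because the worker cannot proceed until release.set() is called.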

 

Step 3: Wait for All Threads to End

The main thread moves on as normal while the other threads we create download comics. But say there’s some code you don’t want to run in the main thread until all the threads have completed. Calling a Thread object’s join() method will block until that thread has finished. By using a for loop to iterate over all the Thread objects in the downloadThreads list, the main thread can call the join() method on each of the other threads. This is the final loop at the bottom of the listings above, under the comment # Wait for all threads to end.
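
The same start-then-join pattern can be tried without downloading anything. In this sketch, fake_download is a hypothetical stand-in for downloadXkcd() that only counts the comics in its range, and results and lock are names invented for the example.

```python
import threading

results = {}
lock = threading.Lock()

def fake_download(start, end):
    # Stand-in for downloadXkcd(): count the comics instead of fetching them.
    count = len(range(start, end))
    with lock:                      # protect the shared dict
        results[(start, end)] = count

threads = []
for i in range(0, 300, 100):
    t = threading.Thread(target=fake_download, args=(i, i + 100))
    threads.append(t)
    t.start()

for t in threads:
    t.join()          # blocks until this particular thread has finished

# Reading results is only safe once every thread has been joined.
print(sum(results.values()))
```

Code placed after the join loop can rely on all worker threads having finished.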

 

The join() method in multithreading

http://www.jb51.net/article/54628.htm

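Two threads start running concurrently. The main thread then calls join(2) on thread 1, waits 2 s for it, and stops waiting; next it calls join(2) on thread 2, waits 2 s, and stops waiting for that one too (thread 1 finishes during this time and prints its end message). The main thread then carries on and prints "end join". After 4 s, thread 2 finishes.

To summarize:

1. join() blocks the main thread (the statements after the join() call cannot run yet), so the program concentrates on running the worker threads.

2. With several threads and several join() calls, the join() calls run one after another: the next one is only made once the previous one has returned. The threads themselves keep running concurrently in the meantime.

3. With no argument, join() waits until that thread has finished before the next thread's join() is called.

4. With a timeout argument, join() waits for the thread for at most that long and then stops waiting, even though the thread has not finished. "Stops waiting" means the main thread is free to run the code that follows.

Finally, here is a chart of the program's execution flow with a timeout of 2, which I drew myself; it should make this easier to follow.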
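
The difference between join() with and without a timeout can be checked with a short sketch. The function slow_task and the durations are assumptions chosen to keep the example fast; they are not the values from the linked article.

```python
import threading
import time

def slow_task(seconds):
    time.sleep(seconds)          # stand-in for a long-running job

t = threading.Thread(target=slow_task, args=(0.5,))
t.start()

t.join(0.1)                      # wait at most 0.1 s, then stop waiting
still_running = t.is_alive()     # join() always returns None, so ask is_alive()
print('after join(0.1): alive =', still_running)

t.join()                         # no timeout: wait until the thread really ends
print('after join(): alive =', t.is_alive())
```

Since join() gives no indication of whether it timed out, is_alive() is the way to find out whether the thread is still running.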

 
