Threads

计科1401崔希艺

Article link: https://blog.csdn.net/cui2839255227/article/details/53192103

Let me first introduce multithreading and multiprocessing:

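Background

1. What is the GIL?

GIL stands for Global Interpreter Lock. It goes back to a decision made in Python's original design, taken for the sake of data safety.

2. Each CPU can execute only one thread at any given moment. (On a single-core CPU, multithreading is really just concurrency, not parallelism. Seen from a distance, both concurrency and parallelism mean handling several requests at once, but they differ: parallelism means two or more events happen at the same instant, while concurrency means two or more events happen within the same time interval.)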

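Under Python multithreading, each thread runs as follows:

1. Acquire the GIL.
2. Execute code until it sleeps or the Python virtual machine suspends it.
3. Release the GIL.

So a thread that wants to run must first obtain the GIL. Think of the GIL as a "pass", and there is only one GIL per Python process. A thread that cannot get the pass is not allowed onto the CPU.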

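In Python 2.x, the current thread releases the GIL when it hits an I/O operation or when the ticks count reaches 100. (ticks can be seen as a counter internal to Python that exists solely for the GIL; it is reset to zero after each release, and the threshold can be adjusted with sys.setcheckinterval.)

Every release of the GIL is followed by lock contention and a thread switch, which costs resources. And because of the GIL, a Python process can only ever execute one thread at a time (only the thread holding the GIL runs). This is why Python multithreading is not efficient on multi-core CPUs.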

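So is Python multithreading completely useless?

Let's look at two cases:

1. CPU-bound code (loops, counting, and so on). Because there is a lot of computation, the ticks count quickly reaches the threshold, which triggers a release of the GIL and another round of contention (and switching back and forth between threads costs resources). Python multithreading is therefore unfriendly to CPU-bound code.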
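To observe this on your own machine, here is a minimal sketch written for Python 3. The names count_down, run_sequential and run_threaded are hypothetical, chosen only for this illustration. It times the same CPU-bound loop run twice in a row and run in two threads. On a standard CPython build with the GIL the threaded run is usually no faster, but the exact numbers depend on the machine and interpreter version.

```python
import threading
import time

def count_down(n):
    # Pure-Python busy loop: CPU-bound, needs the GIL for every step
    while n > 0:
        n -= 1

def run_sequential(n, jobs):
    # Run the loop `jobs` times, one after another, and return the elapsed time
    start = time.perf_counter()
    for _ in range(jobs):
        count_down(n)
    return time.perf_counter() - start

def run_threaded(n, jobs):
    # Run the loop in `jobs` threads at once and return the elapsed time
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(jobs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    n = 2_000_000
    print("sequential: %.3f s" % run_sequential(n, 2))
    print("threaded:   %.3f s" % run_threaded(n, 2))
```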

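2. I/O-bound code (file processing, web crawlers, and so on). Multithreading improves efficiency here: in a single thread, an I/O operation means waiting on I/O and wasting time, whereas with multiple threads the interpreter switches to thread B while thread A waits, so CPU time is not wasted and the program runs faster. Python multithreading is therefore friendly to I/O-bound code.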
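The I/O-bound case can be sketched the same way. This is a Python 3 illustration in which time.sleep stands in for a blocking I/O call, and fake_io and run_io_threads are hypothetical names. Four waits of 0.2 s each overlap when they run in threads, so the total stays close to 0.2 s instead of about 0.8 s.

```python
import threading
import time

def fake_io(delay, results, index):
    # time.sleep stands in for blocking I/O; the GIL is released while sleeping
    time.sleep(delay)
    results[index] = "done-%d" % index

def run_io_threads(task_count, delay):
    results = [None] * task_count
    threads = [
        threading.Thread(target=fake_io, args=(delay, results, i))
        for i in range(task_count)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

results, elapsed = run_io_threads(4, 0.2)
print(results)
print("elapsed: %.2f s (four sequential waits would take about 0.8 s)" % elapsed)
```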

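In Python 3.x the GIL no longer uses the ticks count. It uses a timer instead (the current thread releases the GIL once its running time reaches a threshold). This is kinder to CPU-bound programs, but it still does not solve the problem that the GIL lets only one thread execute at a time, so performance remains unsatisfactory.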
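In Python 3 (3.2 and later) this timer threshold is exposed as the "switch interval" through sys.getswitchinterval() and sys.setswitchinterval(). A small sketch that reads the value, changes it, and restores it:

```python
import sys

# Current switch interval in seconds (CPython's default is 0.005, i.e. 5 ms)
original = sys.getswitchinterval()
print("switch interval:", original)

# Ask the interpreter to switch threads less often, then restore the old value
sys.setswitchinterval(0.01)
changed = sys.getswitchinterval()
print("after change:", changed)
sys.setswitchinterval(original)
```

Changing the interval only affects how often a running thread is asked to give up the GIL. It does not let two threads run Python code at the same time.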

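Multi-core performance

Multithreading on multiple cores performs worse than multithreading on a single core. On a single core, whichever thread is woken up after each GIL release gets the GIL, so execution continues seamlessly. On multiple cores, after CPU0 releases the GIL, the threads on the other CPUs compete for it, but CPU0 may grab it again right away. The threads woken on the other CPUs then stay awake waiting until the switch time, only to go back to the waiting-to-be-scheduled state. This causes thrashing and lowers efficiency even further.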

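Why doesn't multiprocessing suffer from this?

Each process has its own independent GIL and the processes do not interfere with one another, so they can run truly in parallel. In Python, multiprocessing therefore outperforms multithreading (on multi-core CPUs only).

The conclusion: on multiple cores, the general-purpose way to get a parallel speedup is to use multiple processes, which improves efficiency effectively.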
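A minimal multiprocessing sketch for Python 3, using the standard multiprocessing.Pool. The names count_down and run_in_processes are hypothetical. Each worker process has its own interpreter and its own GIL, so on a multi-core machine the two jobs can run at the same time.

```python
import multiprocessing
import time

def count_down(n):
    # CPU-bound work; each worker process runs it under its own GIL
    while n > 0:
        n -= 1
    return n

def run_in_processes(n, jobs):
    # Hand one job to each worker process and time the whole batch
    start = time.perf_counter()
    with multiprocessing.Pool(processes=jobs) as pool:
        results = pool.map(count_down, [n] * jobs)
    return results, time.perf_counter() - start

if __name__ == "__main__":
    # The guard is required on platforms that start workers by re-importing this module
    results, elapsed = run_in_processes(2_000_000, 2)
    print(results, "%.3f s" % elapsed)
```

Starting processes and passing data between them has its own cost, so this pays off mainly when each job does a substantial amount of computation.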

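Python thread basics:
1) What is a thread:
A thread is the smallest unit of program execution. A standard thread consists of a thread ID, the current instruction pointer (PC), a register set, and a stack. A thread is an entity within a process. It does not own system resources itself, only the few resources essential to running, but it shares all of the process's resources with the other threads of the same process. One thread can create and cancel another, and all threads within a process can execute concurrently.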
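The point that threads share the resources of their process can be illustrated with a short Python 3 sketch. The names shared and worker are hypothetical. Three threads append to one list that lives in the process, and the main thread sees all three entries afterwards.

```python
import threading

shared = []  # lives in the process; every thread sees the same list

def worker(tag):
    # Record the tag together with the ID of the thread that ran this call
    shared.append((tag, threading.get_ident()))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

tags = sorted(tag for tag, _ in shared)
print(tags)
```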

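2) Two ways to use threads in Python:
(1) Function style: thread.start_new_thread(function, args[, kwargs]). function is the thread function; args holds the arguments passed to the thread function and must be a tuple; kwargs is an optional dictionary of keyword arguments.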
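import thread  # Python 2 module; it is named _thread in Python 3
import time

# Define the function each thread will run
def print_time(threadName, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print "%s: %s" % (threadName, time.ctime(time.time()))

# Create two threads
try:
    thread.start_new_thread(print_time, ("Thread-1", 2))
    thread.start_new_thread(print_time, ("Thread-2", 4))
except thread.error:
    print "Error: unable to start thread"

# Keep the main thread alive: on most systems the other threads are
# killed as soon as the main thread exits
time.sleep(21)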
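A thread usually ends when its thread function returns naturally. The thread function can also call thread.exit(), which raises a SystemExit exception and thereby exits the thread.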
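The effect of SystemExit inside a thread can be checked with a small Python 3 sketch. Here sys.exit() is used as a stand-in for thread.exit(), since both raise SystemExit; the names log and worker are hypothetical. Only the worker thread ends, and the main thread carries on.

```python
import sys
import threading

log = []

def worker():
    log.append("worker started")
    sys.exit()  # raises SystemExit inside this thread only
    log.append("never reached")

t = threading.Thread(target=worker)
t.start()
t.join()

log.append("main thread still running")
print(log)
```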
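(2) The threading module:
Functions provided by the threading module:
threading.currentThread(): returns the current thread object.
threading.enumerate(): returns a list of the running threads. "Running" means after the thread has started and before it has ended; threads that have not started or have already terminated are not included.
threading.activeCount(): returns the number of running threads; the result is the same as len(threading.enumerate()).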
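A short Python 3 sketch of these three functions. It uses the snake_case spellings current_thread() and active_count(), which are the current names for currentThread() and activeCount(). The Event object and the names release and wait_for_release are assumptions added for the illustration, to keep the worker alive while it is being counted.

```python
import threading

release = threading.Event()

def wait_for_release():
    release.wait()  # block until the main thread sets the event

worker = threading.Thread(target=wait_for_release, name="waiter")
before = threading.active_count()
worker.start()

# While the worker is blocked it counts as a running thread
during = threading.active_count()
names = [t.name for t in threading.enumerate()]
current_name = threading.current_thread().name

release.set()
worker.join()
after = threading.active_count()

print(before, during, after, names, current_name)
```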

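The threading module also provides the Thread class for working with threads. The Thread class provides the following methods:
run(): the method representing the thread's activity.
start(): starts the thread's activity.

join([time]): waits until the thread terminates. This blocks the calling thread until the thread whose join() method was called terminates, either normally or through an unhandled exception, or until the optional timeout elapses. (For example, if thread1 calls thread2.join(), thread1 continues only after thread2 has finished.)

isAlive(): returns whether the thread is alive.
getName(): returns the thread's name.
setName(): sets the thread's name.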

import thread
import threading
import time

exitFlag = 0

# A class that wraps a thread
class myThread(threading.Thread):  # inherits from threading.Thread
    def __init__(self, threadID, threadName, count):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.threadName = threadName
        self.count = count

    def run(self):
        print "Starting " + self.threadName
        print_time(self.threadName, self.count, 4)
        print "Exiting " + self.threadName

def print_time(threadname, count, delaytime):
    while count:
        count -= 1
        if exitFlag:
            thread.exit()  # raises SystemExit, which ends only this thread
        time.sleep(delaytime)
        print "%s : %s" % (threadname, time.ctime(time.time()))

# Create the threads
thread1 = myThread(1, "thread-1", 5)
thread2 = myThread(2, "thread-2", 5)

# Start the threads
thread1.start()
thread2.start()

print "Exit main thread"
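The Thread methods listed earlier can also be tried one at a time. The sketch below is for Python 3, where is_alive() and the name attribute are the current spellings of isAlive() and getName()/setName(). The Sleeper class is a hypothetical example, not part of the threading module.

```python
import threading
import time

class Sleeper(threading.Thread):
    def __init__(self, name, delay):
        super().__init__(name=name)
        self.delay = delay

    def run(self):
        # run() is what start() executes in the new thread
        time.sleep(self.delay)

t = Sleeper("sleeper-1", 0.3)
alive_before_start = t.is_alive()

t.start()
t.join(0.05)                        # wait at most 0.05 s
alive_after_timeout = t.is_alive()  # still sleeping, so the join timed out

t.join()                            # wait without a timeout
alive_after_join = t.is_alive()

print(t.name, alive_before_start, alive_after_timeout, alive_after_join)
```

join() with a timeout returns whether or not the thread has finished, so is_alive() is the way to tell which of the two happened.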

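3) Thread synchronization:
If several threads modify the same data at the same time, errors can occur. In that case we synchronize the threads.
The Lock and RLock objects of the threading module provide simple thread synchronization. Both have acquire() and release() methods. For data that only one thread may operate on at a time, put the operations between acquire() and release().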
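As a minimal Python 3 sketch of this pattern, the code below protects a shared counter with a Lock. It uses the with statement, which calls acquire() on entry and release() on exit. The names counter, counter_lock and add_many are hypothetical. The last few lines show the one difference that matters most for RLock: the thread that holds it may acquire it again.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        # "with" calls acquire() on entry and release() on exit, even on error
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)

# An RLock may be acquired again by the thread that already holds it;
# doing the same with a plain Lock would block forever
rlock = threading.RLock()
with rlock:
    with rlock:
        nested_ok = True
```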

import thread
import threading
import time

# The lock shared by all threads
threadlock = threading.Lock()

threadlist = []
exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, threadID, threadName, count):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.threadName = threadName
        self.count = count

    def run(self):
        print "Starting " + self.threadName
        # acquire() blocks until the lock is obtained and then returns True.
        # acquire(False) does not block and returns False if the lock is taken.
        # (The timeout argument exists only in Python 3.2 and later.)
        threadlock.acquire()
        try:
            print_time(self.threadName, self.count, 4)
        finally:
            threadlock.release()  # release the lock even if the thread exits early
        print "Exiting " + self.threadName

def print_time(threadname, count, delaytime):
    while count:
        count -= 1
        if exitFlag:
            thread.exit()
        time.sleep(delaytime)
        print threading.currentThread()
        # print threading.enumerate()
        # print "%s : %s" % (threadname, time.ctime(time.time()))

# Create the threads
thread1 = myThread(1, "thread-1", 4)
thread2 = myThread(2, "thread-2", 5)

# Start the threads
thread1.start()
thread2.start()

# Add the threads to threadlist
threadlist.append(thread1)
threadlist.append(thread2)

# join() blocks the main thread until each worker has finished. Without this
# loop the main thread does not wait, so "Exit main thread!" is printed
# right away and the output order changes.
for t in threadlist:  # not named "thread", which would shadow the thread module
    print threading.currentThread()
    t.join()
print "Exit main thread!"
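The loop at the end of the example above calls join() on every thread. What that changes is easiest to see by recording the order of events with and without it. A minimal Python 3 sketch, in which worker, run and events are hypothetical names:

```python
import threading
import time

def worker(events):
    time.sleep(0.1)
    events.append("worker done")

def run(use_join):
    events = []
    t = threading.Thread(target=worker, args=(events,))
    t.start()
    if use_join:
        t.join()  # the main thread blocks here until worker returns
    events.append("main done")
    t.join()      # make sure the worker has finished before events is inspected
    return events

with_join = run(True)
without_join = run(False)
print(with_join)
print(without_join)
```

With join() the main thread's entry comes last. Without it the main thread moves on at once and its entry comes first, which is why removing the loop changes the output.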

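4) Thread priority queues (Queue):
Python's Queue module provides synchronized, thread-safe queue classes: the FIFO (first in, first out) queue Queue, the LIFO (last in, first out) queue LifoQueue, and the priority queue PriorityQueue. These queues all implement locking primitives and can be used directly from multiple threads. Queues can be used to synchronize threads.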
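The three orderings can be compared side by side. The sketch below is for Python 3, where the module is named queue rather than Queue; drain is a hypothetical helper that puts the same items into a queue and reads them back.

```python
import queue

def drain(q, items):
    # Put every item into the queue, then read the queue back until it is empty
    for item in items:
        q.put(item)
    out = []
    while not q.empty():
        out.append(q.get())
    return out

items = [3, 1, 2]
fifo = drain(queue.Queue(), items)          # same order as inserted
lifo = drain(queue.LifoQueue(), items)      # reverse of insertion order
prio = drain(queue.PriorityQueue(), items)  # smallest value first
print(fifo, lifo, prio)
```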
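Common methods in the Queue module:
Queue.qsize(): returns the size of the queue.
Queue.empty(): returns True if the queue is empty, otherwise False.
Queue.full(): returns True if the queue is full, otherwise False.
Queue.full corresponds to the maxsize setting.
Queue.get([block[, timeout]]): removes and returns an item from the queue; timeout is how long to wait.
Queue.get_nowait(): equivalent to Queue.get(False).
Queue.put(item[, block[, timeout]]): puts an item into the queue; timeout is how long to wait.
Queue.put_nowait(item): equivalent to Queue.put(item, False).
Queue.task_done(): called after finishing a piece of work, it signals the queue that the task is complete.
Queue.join(): blocks until every item in the queue has been retrieved and processed, and only then continues.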
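These calls can be tried in isolation before using them from threads. A small single-threaded sketch for Python 3 (module name queue), with a queue limited to two items so that the full and empty cases are easy to reach:

```python
import queue

q = queue.Queue(maxsize=2)
q.put("a")
q.put_nowait("b")
size_when_full = q.qsize()
is_full = q.full()

try:
    q.put_nowait("c")      # no free slot, so this raises instead of blocking
    overflow = False
except queue.Full:
    overflow = True

first = q.get()
second = q.get_nowait()
is_empty = q.empty()

try:
    q.get(timeout=0.1)     # nothing left; gives up after 0.1 s
    underflow = False
except queue.Empty:
    underflow = True

print(size_when_full, is_full, overflow, first, second, is_empty, underflow)
```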
import Queue
import threading
import time

exitFlag = 0

class myThread (threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q
    def run(self):
        print "Starting " + self.name
        process_data(self.name, self.q)
        print "Exiting " + self.name

def process_data(threadName, q):
    while not exitFlag:
        queueLock.acquire()
        if not workQueue.empty():
            data = q.get()
            queueLock.release()
            print "%s processing %s" % (threadName, data)
        else:
            queueLock.release()
        time.sleep(1)

threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = Queue.Queue(10)
threads = []
threadID = 1

# Create new threads
for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1

# Fill the queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()

# Wait for the queue to empty
while not workQueue.empty():
    pass

# Tell the threads it is time to exit
exitFlag = 1

# Wait for all threads to finish
for t in threads:
    t.join()
print "Exiting Main Thread"
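The example above polls workQueue.empty() in a busy loop and guards the queue with an extra lock. Because the queue classes are already thread-safe, the same job can be written with task_done() and join() instead. The following is a variant sketch for Python 3, not the original author's code: the worker function and the use of None as a "no more work" marker are assumptions made for this illustration.

```python
import queue
import threading

def worker(name, tasks, results):
    while True:
        item = tasks.get()          # blocks until an item is available
        if item is None:            # sentinel: no more work for this thread
            tasks.task_done()
            break
        results.append((name, item))
        tasks.task_done()           # one task_done() per completed get()

tasks = queue.Queue(10)
results = []
threads = [
    threading.Thread(target=worker, args=("Thread-%d" % i, tasks, results))
    for i in range(1, 4)
]
for t in threads:
    t.start()

for word in ["One", "Two", "Three", "Four", "Five"]:
    tasks.put(word)
for _ in threads:
    tasks.put(None)                 # one sentinel per worker

tasks.join()                        # returns once every item has been marked done
for t in threads:
    t.join()

processed = sorted(item for _, item in results)
print(processed)
```

Which thread handles which word varies from run to run, so the sketch only checks that all five words were processed.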