Python Parallel Programming Notes (1): Defining Threads, Lock, RLock

qq_35658177

Category: Parallel Algorithms | Tags: Python parallel programming

Article link: https://blog.csdn.net/qq_35658177/article/details/102675937

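Thread-based parallelism

Multithreading is the most widely used parallel programming paradigm in software applications. An application typically consists of one process divided into several independent threads that run in parallel, cooperate with one another, and carry out different kinds of tasks.
A thread is an independent flow of execution that can run in parallel or concurrently with other threads in the system. Threads can share data and resources through a shared memory space. How threads and processes are implemented depends on the operating system you run on, but in general a thread is contained in a process: the threads of one process share the same resources, whereas separate processes do not share resources.

Each thread essentially consists of three elements: a program counter, registers, and a stack. The resources it shares with the other threads of the same process are mainly data and system resources. Each thread also has its own execution state and can synchronize with other threads; the state is broadly one of ready, running, or blocked. A typical use of threads is to parallelize an application so that it makes full use of modern multicore processors, with each core running a single thread. The main advantage of threads over processes is performance: a context switch between processes is much heavier than a context switch between threads of the same process.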

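Defining a thread
The simplest way to use a thread is to instantiate a Thread with a target function and call its start() method. Python's threading module provides the Thread class for running functions or procedures in separate threads.

class threading.Thread(group=None, target=None, name=None, args=(), kwargs={})

- group: reserved for future extension

- target: the function to run when the thread starts

- name: the thread's name; defaults to the form Thread-N

- args: positional arguments passed to target; must be a tuple

- kwargs: keyword arguments passed to target; must be a dict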

import threading
def function(i):
    print('I am thread-%s' % i)
threads = []
for i in range(5):
    t = threading.Thread(target=function, args=(i,))
    threads.append(t)
    t.start()
    # t.join() blocks the main thread until t has finished, so the loop
    # starts the next thread only after the current one is done.
    t.join()

Output:

I am thread-0
I am thread-1
I am thread-2
I am thread-3
I am thread-4
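
The args and kwargs parameters described above can be combined in one call. The following is a minimal sketch, not part of the original notes: the function greet, its parameters, and the helper run_threads are hypothetical names chosen for illustration. Each thread writes only to its own slot of a pre-sized list, so no lock is needed here.

```python
import threading

def greet(index, out, greeting="hello", punctuation="!"):
    """Store a message built from positional and keyword arguments."""
    out[index] = f"{greeting} from thread {index}{punctuation}"

def run_threads(n=3):
    out = [None] * n  # one slot per thread; no two threads share a slot
    threads = [
        threading.Thread(
            target=greet,
            args=(i, out),              # positional arguments: a tuple
            kwargs={"greeting": "hi"},  # keyword arguments: a dict
        )
        for i in range(n)
    ]
    for t in threads:
        t.start()   # start every thread first ...
    for t in threads:
        t.join()    # ... then wait for all of them
    return out

print(run_threads())
# ['hi from thread 0!', 'hi from thread 1!', 'hi from thread 2!']
```

Unlike the loop above, this sketch starts all threads before joining any of them, so the threads can overlap in time.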

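Identifying a thread
You do not have to pass an argument to identify or name a thread: every Thread instance is created with a default name, which can be changed. Naming threads is useful in server processes, where one service process usually runs several threads that handle different operations.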
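
As a small sketch of both ways to set a name (the names worker-a and worker-b and the helper run_named are hypothetical), a name can be passed to the constructor or assigned through the name attribute afterwards:

```python
import threading

def report(names):
    # current_thread() returns the Thread object that is running this function
    names.append(threading.current_thread().name)

def run_named():
    names = []
    first = threading.Thread(target=report, args=(names,), name="worker-a")
    second = threading.Thread(target=report, args=(names,))  # gets a default name
    second.name = "worker-b"  # the default name can be replaced
    for t in (first, second):
        t.start()
        t.join()  # join right away so the order of the names is deterministic
    return names

print(run_named())  # ['worker-a', 'worker-b']
```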

import threading
import time
def first_function():
    print('%s is starting' % threading.current_thread().name)
    time.sleep(2)
    print('%s is exiting' % threading.current_thread().name)
def second_function():
    print('%s is starting' % threading.current_thread().name)
    time.sleep(2)
    print('%s is exiting' % threading.current_thread().name)
def third_function():
    print('%s is starting' % threading.current_thread().name)
    time.sleep(2)
    print('%s is exiting' % threading.current_thread().name)
if __name__ == '__main__':
    t1 = threading.Thread(name='first_function', target=first_function)
    t2 = threading.Thread(name='second_function', target=second_function)
    t3 = threading.Thread(name='third_function', target=third_function)

    # start all three threads, then wait for all of them
    t1.start()
    t2.start()
    t3.start()

    t1.join()
    t2.join()
    t3.join()

Output:

first_function is starting
second_function is starting
third_function is starting
first_function is exiting
second_function is exiting
third_function is exiting

Compare the effect of placing t.join() in different positions:

import threading
import time
def first_function():
    print('%s is starting' % threading.current_thread().name)
    time.sleep(2)
    print('%s is exiting' % threading.current_thread().name)
def second_function():
    print('%s is starting' % threading.current_thread().name)
    time.sleep(2)
    print('%s is exiting' % threading.current_thread().name)
def third_function():
    print('%s is starting' % threading.current_thread().name)
    time.sleep(2)
    print('%s is exiting' % threading.current_thread().name)
if __name__ == '__main__':
    t1 = threading.Thread(name='first_function', target=first_function)
    t2 = threading.Thread(name='second_function', target=second_function)
    t3 = threading.Thread(name='third_function', target=third_function)

    # wait for each thread to finish before starting the next one
    t1.start()
    t1.join()
    t2.start()
    t2.join()
    t3.start()
    t3.join()

Output:

first_function is starting
first_function is exiting
second_function is starting
second_function is exiting
third_function is starting
third_function is exiting

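The join() method of Thread makes the calling thread wait until that thread has finished before it continues. In the first example t1, t2, and t3 start concurrently; in the second they run one after another, t1 -> t2 -> t3.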
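
The difference between the two placements also shows up in elapsed time. The sketch below is an illustration rather than part of the original notes: the helper names are hypothetical, and the delay is shortened to 0.2 seconds so that it runs quickly. With three threads, joining after each start should take roughly three times as long as starting all threads first.

```python
import threading
import time

def work(delay):
    time.sleep(delay)  # stand-in for a task that waits

def start_all_then_join(n=3, delay=0.2):
    threads = [threading.Thread(target=work, args=(delay,)) for _ in range(n)]
    begin = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - begin

def join_after_each_start(n=3, delay=0.2):
    begin = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=work, args=(delay,))
        t.start()
        t.join()  # blocks until this thread is done before the next one starts
    return time.perf_counter() - begin

concurrent = start_all_then_join()
sequential = join_after_each_start()
print(f"start all, then join:  {concurrent:.2f}s")
print(f"join after each start: {sequential:.2f}s")
```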
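if __name__ == '__main__': a Python file can be used in two ways: run directly as a script, or imported into another Python script and called from there (module reuse). The if __name__ == '__main__': check controls what runs in each case: the code under it runs only in the first case (the file is executed directly as a script) and is skipped when the file is imported by another script.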

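Thread synchronization with Lock

When two or more concurrent threads operate on shared memory and at least one of them modifies the data without any synchronization mechanism, a race condition arises, which can lead to invalid code paths, bugs, and other abnormal behavior.

The simplest way to deal with a race condition is a lock. Its operation is simple: a thread that needs to access part of the shared memory must acquire the lock first. When it has finished with the shared resource it releases the lock, and only then can another thread acquire the lock and access that resource.

In practice, however, this approach often leads to deadlock. Deadlock occurs when different threads each hold a lock that another one needs: the program stops making progress because each thread is waiting for a lock held by the other.

Cause of deadlock: thread A is using resource 2 and thread B is using resource 1. If, before releasing its lock, thread A also needs resource 1 and thread B also needs resource 2, both locks are already taken, and each thread waits for the other's lock without releasing its own. The result is a deadlock.
Using locks to solve synchronization problems is workable, but it has potential pitfalls.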
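
The situation with threads A and B can be reproduced safely by asking for the second lock with a timeout instead of blocking forever. This is a sketch under stated assumptions: lock_1 and lock_2 stand for the locks on resources 1 and 2, and the Barrier exists only to force the unlucky interleaving on every run. All names are hypothetical.

```python
import threading

def demo_deadlock_attempt(timeout=0.2):
    lock_1 = threading.Lock()       # guards "resource 1"
    lock_2 = threading.Lock()       # guards "resource 2"
    barrier = threading.Barrier(2)  # keeps the two threads in step
    got_second = {}

    def worker(name, first, second):
        with first:                 # hold our own resource
            barrier.wait()          # now each thread holds exactly one lock
            ok = second.acquire(timeout=timeout)  # ask for the other one
            got_second[name] = ok
            barrier.wait()          # nobody releases until both have tried
            if ok:
                second.release()

    a = threading.Thread(target=worker, args=("A", lock_2, lock_1))
    b = threading.Thread(target=worker, args=("B", lock_1, lock_2))
    a.start()
    b.start()
    a.join()
    b.join()
    return got_second

print(sorted(demo_deadlock_attempt().items()))
# [('A', False), ('B', False)]
```

Neither thread obtains its second lock. With a plain blocking acquire() in place of the timeout, both threads would wait forever. A common way to avoid the problem is to make every thread acquire the locks in the same order.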

Synchronizing threads with Lock

import threading

shared_resource_with_lock = 0
shared_resource_with_no_lock = 0
count = 100000
# Create the lock. This only defines a lock object; it is not attached to any
# resource. It protects the data only if every thread that touches the data
# acquires this same lock first.
shared_resource_lock = threading.Lock()

# with lock
def increment_with_lock():
    # global is required to assign to a module-level variable inside a function
    global shared_resource_with_lock
    for i in range(count):
        shared_resource_lock.acquire()  # tell other threads the resource is in use; they must wait
        shared_resource_with_lock += 1
        shared_resource_lock.release()  # done with the resource; other threads may use it now
def decrement_with_lock():
    global shared_resource_with_lock
    for i in range(count):
        shared_resource_lock.acquire()
        shared_resource_with_lock -= 1
        shared_resource_lock.release()

# without lock
def increment_without_lock():
    global shared_resource_with_no_lock
    for i in range(count):
        shared_resource_with_no_lock += 1
def decrement_without_lock():
    global shared_resource_with_no_lock
    for i in range(count):
        shared_resource_with_no_lock -= 1

if __name__ == '__main__':
    t1 = threading.Thread(target=increment_with_lock)
    t2 = threading.Thread(target=decrement_with_lock)
    t3 = threading.Thread(target=increment_without_lock)
    t4 = threading.Thread(target=decrement_without_lock)
    t1.start()
    t2.start()
    t3.start()
    t4.start()
    t1.join()
    t2.join()
    t3.join()
    t4.join()
    print('the value of shared with lock is %s' % shared_resource_with_lock)
    print('the value of shared with no lock is %s' % shared_resource_with_no_lock)

Output:

the value of shared with lock is 0
the value of shared with no lock is 0
# sometimes:
the value of shared with lock is 0
the value of shared with no lock is 45693

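With the lock we always get the correct result; without it the result can be wrong.

Lock states:

A lock has two states: locked (held by some thread) and unlocked (available).
A lock is operated with acquire() and release().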
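
A Lock can also be used in a with statement, which calls acquire() on entry and release() on exit, even if the body raises an exception. The sketch below is an addition to the notes with hypothetical names (count_with_lock, state_lock): it shows the two states through locked() and repeats the counter experiment in the with form.

```python
import threading

def count_with_lock(n_threads=4, n_increments=10_000):
    lock = threading.Lock()
    total = 0

    def worker():
        nonlocal total
        for _ in range(n_increments):
            with lock:      # acquire() here, release() when the block ends
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

state_lock = threading.Lock()
print(state_lock.locked())      # False: unlocked
with state_lock:
    print(state_lock.locked())  # True: locked
print(state_lock.locked())      # False: unlocked again

print(count_with_lock())        # 40000
```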

global example

x = 1
def func():
    x = 2
func()
print('x',x)
y = 1
def func1():
    global y
    y = 2
func1()
print('y',y)

Output:

x 1
y 2

RLock

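If you want only the thread that acquired a lock to be able to release it, use an RLock() object. RLock() is useful when you need thread safety for calls from outside a class while the same methods are also called from inside the class.
RLock stands for reentrant lock: a lock that can be entered repeatedly, also called a recursive lock. Compared with Lock it has three properties: 1. whoever acquires the lock releases it; 2. the same thread can acquire the lock several times; 3. release() must be called as many times as acquire(), and only the last release() changes the RLock's state to unlocked.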
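
The first two properties can be checked directly against a plain Lock. This is an illustrative sketch; the helper names second_acquire_succeeds and release_from_other_thread are hypothetical. It uses acquire(blocking=False) so that a failed attempt returns False instead of hanging.

```python
import threading

def second_acquire_succeeds(lock):
    """Acquire once, then try again from the same thread without blocking."""
    lock.acquire()
    try:
        again = lock.acquire(blocking=False)
        if again:
            lock.release()  # undo the second acquire
        return again
    finally:
        lock.release()      # undo the first acquire

def release_from_other_thread(lock):
    """Acquire in this thread, then try to release from another thread."""
    outcome = []

    def other():
        try:
            lock.release()
            outcome.append("released")
        except RuntimeError:
            outcome.append("RuntimeError")

    lock.acquire()
    t = threading.Thread(target=other)
    t.start()
    t.join()
    if outcome[0] == "RuntimeError":
        lock.release()      # the lock is still ours, so release it here
    return outcome[0]

print(second_acquire_succeeds(threading.Lock()))     # False
print(second_acquire_succeeds(threading.RLock()))    # True
print(release_from_other_thread(threading.Lock()))   # released
print(release_from_other_thread(threading.RLock()))  # RuntimeError
```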

import threading
import time

class Box(object):
    lock = threading.RLock()
    
    def __init__(self):  # explained below
        self.total_items = 0
    def execute(self,n):
        Box.lock.acquire()
        self.total_items += n
        Box.lock.release()
    def add(self):
        Box.lock.acquire()
        self.execute(1)    
        Box.lock.release()
    def remove(self):
        Box.lock.acquire()
        self.execute(-1)
        Box.lock.release()
def adder(box,items):
    while items > 0 :
        print('adding 1 items in the box')
        box.add()    # call the add() method of the Box instance
        time.sleep(1)
        items -= 1
def remover(box,items):
    while items > 0 :
        print('removing 1 items in the box')
        box.remove()
        time.sleep(1)
        items -= 1
        
if __name__ == '__main__':
    items = 5
    print('putting %s items in the box'%items)
    box = Box()
    t1 = threading.Thread(target=adder,args=(box,items))
    t2 = threading.Thread(target=remover,args=(box,items))
    
    t1.start()
    t2.start()
    
    t1.join()
    t2.join()
    print("%s items still remain in the box " % box.total_items)

Output:

putting 5 items in the box
adding 1 items in the box
removing 1 items in the box
adding 1 items in the box
removing 1 items in the box
adding 1 items in the box
removing 1 items in the box
adding 1 items in the box
removing 1 items in the box
adding 1 items in the box
removing 1 items in the box
0 items still remain in the box

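def __init__(self, name, gender): if a class defines an __init__ method, it is called automatically whenever an instance of the class is created. It is generally used to initialize the instance's attributes.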

class testClass:
    def __init__(self,name,gender):
        self.Name = name
        self.Gender = gender
        print('hello')
testman = testClass('neo','male')
print(testman)
print(testman.Name)
print(testman.Gender)

Output:

hello
<__main__.testClass object at 0x00000212D077B518>
neo
male

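The three parameters of def __init__(self, name, gender):
self: the instance being created (testman in the example). You could use another name,
for example me, in which case self.Name below would have to be written me.Name.
self.Name = name: usually written self.name = name. The first one is capitalized here to show that the two are different things: the Name (or name) on the left of the equals sign is an attribute of the instance, while the one on the right is a parameter of __init__.
self.Gender = gender: usually written self.gender = gender.
print('hello'): shows that __init__ is called as soon as an instance of the class is created.
testman = testClass('neo', 'male'): creates an instance of the class testClass.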

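Lock and RLock

Mutex lock: Lock
- Drawbacks: adding locks reduces performance, and locks can cause deadlock (a thread re-acquiring a lock it already holds, or threads waiting on each other's locks)
Reentrant lock: RLock

Internally, an RLock maintains a Lock and a counter variable. The counter records the number of acquire() calls, so that the resource can be acquired multiple times. Other threads can obtain the resource only after the owning thread has released every one of its acquires. It is used in the same way as threading.Lock.
acquire() and release() must be called the same number of times.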
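
The counter behavior can be observed from a second thread. In the sketch below (the helper names are hypothetical, and the counter values in the comments describe the documented behavior rather than an inspectable attribute), the main thread acquires an RLock twice and a probe thread tries a non-blocking acquire after each step.

```python
import threading

def other_thread_can_acquire(lock):
    """Try a non-blocking acquire from a separate thread."""
    result = []

    def probe():
        ok = lock.acquire(blocking=False)
        if ok:
            lock.release()
        result.append(ok)

    t = threading.Thread(target=probe)
    t.start()
    t.join()
    return result[0]

def trace_rlock():
    rlock = threading.RLock()
    trace = []
    rlock.acquire()  # counter: 1
    rlock.acquire()  # counter: 2
    trace.append(other_thread_can_acquire(rlock))  # still held
    rlock.release()  # counter: 1
    trace.append(other_thread_can_acquire(rlock))  # still held
    rlock.release()  # counter: 0, the lock is unlocked
    trace.append(other_thread_can_acquire(rlock))  # available now
    return trace

print(trace_rlock())  # [False, False, True]
```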
