Chapter 28: Locks

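This post uses x86 assembly code to analyze several locking mechanisms: the single-memory-flag lock in flag.s, the spin lock in test-and-set.s, Peterson's algorithm, and the ticket lock. It discusses how the interrupt interval affects program behavior and how yield.s improves efficiency by yielding the CPU. It also looks at the optimization in test-and-test-and-set.s and stresses the importance of mutual exclusion and deadlock avoidance.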

Homework (Simulation):

This program, x86.py, allows you to see how different thread interleavings either cause or avoid race conditions. See the README for details on how the program works and answer the questions below.

Questions:

1. Examine flag.s. This code “implements” locking with a single memory flag. Can you understand the assembly?

flag.s is an assembly implementation of the program in Figure 28.1.

python3 x86.py -p flag.s -c

2. When you run with the defaults, does flag.s work? Use the -M and -R flags to trace variables and registers (and turn on -c to see their values). Can you predict what value will end up in flag? 

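count ends up at 2 and the program works as expected. flag ends up at 0, because each thread clears it when it releases the lock.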

python3 x86.py -p flag.s -M flag,count -R ax,bx -c

3. Change the value of the register %bx with the -a flag (e.g., -a bx=2,bx=2 if you are running just two threads). What does the code do? How does it change your answer for the question above?

count ends up at 4: with bx=2, each thread runs the loop twice. Thread 0 runs to completion first, and then Thread 1 runs.

python3 x86.py -p flag.s -a bx=2,bx=2 -M flag,count -R ax,bx -c

4. Set bx to a high value for each thread, and then use the -i flag to generate different interrupt frequencies; what values lead to bad outcomes? Which lead to good outcomes?

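The program works as expected only when the interrupt interval (-i) is a multiple of 11, because the loop consists of 11 instructions (1000-1010). An interval so large that Thread 0 finishes before the first interrupt is also safe.

The twelfth instruction, the final halt at 1011, executes only once, so it is not counted.

When the interrupt interval is not a multiple of 11, the problem shown in Figure 28.2 can occur: an interrupt lands between instructions 1001 and 1003, that is, after the thread has read flag but before it has set it, and the lock fails.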

python3 x86.py -p flag.s -a bx=50,bx=50 -i 11 -M flag,count -R ax,bx -c
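
The effect of the interrupt interval can be reproduced with a small model. The following is a minimal Python sketch, not x86.py itself: it assumes the 11-instruction loop described above, treats every `yield` as the end of one simulated instruction, and switches threads round-robin every `interval` instructions. The names `flag_lock_thread` and `run_flag_lock` are made up for this illustration.

```python
def flag_lock_thread(shared, loops):
    """One thread of a flag.s-style loop; each `yield` ends one simulated instruction."""
    for _ in range(loops):
        while True:
            ax = shared["flag"]        # 1000: mov  flag, %ax
            yield
            is_free = (ax == 0)        # 1001: test $0, %ax
            yield
            yield                      # 1002: jne  .acquire
            if is_free:
                break
        shared["flag"] = 1             # 1003: mov  $1, flag
        yield
        ax = shared["count"]           # 1004: mov  count, %ax
        yield
        ax += 1                        # 1005: add  $1, %ax
        yield
        shared["count"] = ax           # 1006: mov  %ax, count
        yield
        shared["flag"] = 0             # 1007: mov  $0, flag
        yield
        yield                          # 1008: sub  $1, %bx
        yield                          # 1009: test $0, %bx
        yield                          # 1010: jg   .top


def run_flag_lock(interval, loops=50):
    """Run two threads, switching every `interval` instructions; return the final count."""
    shared = {"flag": 0, "count": 0}
    alive = [flag_lock_thread(shared, loops) for _ in range(2)]
    while alive:
        for thread in list(alive):
            for _ in range(interval):
                try:
                    next(thread)
                except StopIteration:   # the thread reached its halt
                    alive.remove(thread)
                    break
    return shared["count"]


if __name__ == "__main__":
    for interval in (1, 4, 11, 22):
        print(interval, run_flag_lock(interval))
```

In this model a correct run ends with count equal to twice the loop count. An interval of 1 makes both threads read flag as 0 before either sets it, so increments are lost.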

5. Now let’s look at the program test-and-set.s. First, try to understand the code, which uses the xchg instruction to build a simple locking primitive. How is the lock acquire written? How about lock release?

Acquiring the lock:

mov  $1, %ax        
xchg %ax, mutex     # atomic swap of 1 and mutex
test $0, %ax        # if we get 0 back: lock is free!
jne  .acquire       # if not, try again

Releasing the lock:

mov  $0, mutex

6. Now run the code, changing the value of the interrupt interval (-i) again, and making sure to loop for a number of times. Does the code always work as expected? Does it sometimes lead to an inefficient use of the CPU? How could you quantify that?

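The program runs correctly, but it can use the CPU inefficiently, because threads spin while waiting for the lock.

This can be quantified as the ratio of spin instructions to total instructions.

Let bx=5.

With i=11, spin instructions / total instructions = 0, so no spinning occurs.

With i=4, spin instructions / total instructions = 44/156 = 28.2%, so 28.2% of the instructions are spent spinning, which wastes CPU.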

python3 x86.py -p test-and-set.s -a bx=5,bx=5 -i 4 -M mutex,count -R ax,bx -c
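
The ratio above can be checked with a few lines of arithmetic. This sketch assumes that each loop iteration of test-and-set.s takes 11 instructions when the lock is free (consistent with the zero-spin result at i=11) and that each thread executes one final halt; everything beyond that baseline is counted as spinning. `LOOP_BODY`, `HALT`, and `spin_fraction` are names chosen for this example.

```python
LOOP_BODY = 11  # assumed: instructions per uncontended loop iteration
HALT = 1        # assumed: one halt instruction per thread


def spin_fraction(total_instructions, threads, loops):
    """Return (spin instructions, fraction of all instructions spent spinning)."""
    useful = threads * (loops * LOOP_BODY + HALT)
    spin = total_instructions - useful
    return spin, spin / total_instructions


spin, fraction = spin_fraction(156, threads=2, loops=5)
print(f"{spin} spin instructions, {fraction:.1%} of the total")
# prints "44 spin instructions, 28.2% of the total"
```

With two threads and bx=5 the baseline is 2 x (5 x 11 + 1) = 112 instructions, so the 156 instructions observed at i=4 include 44 spent spinning.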

7. Use the -P flag to generate specific tests of the locking code. For example, run a schedule that grabs the lock in the first thread, but then tries to acquire it in the second. Does the right thing happen? What else should you test?

The program executes correctly.

python3 x86.py -p test-and-set.s -a bx=5,bx=5 -i 4 -M mutex,count -R ax,bx -P 000111 -c

8. Now let’s look at the code in peterson.s, which implements Peterson’s algorithm (mentioned in a sidebar in the text). Study the code and see if you can make sense of it.

peterson.s is an assembly implementation of Peterson's algorithm.

These three instructions prevent the situation shown in Figure 28.2:

mov turn, %ax
test %cx, %ax           # compare 'turn' and '1 - self'
je .spin1               # if turn==1-self, go back and start spin again

9. Now run the code with different values of -i. What kinds of different behavior do you see? Make sure to set the thread IDs appropriately (using -a bx=0,bx=1 for example) as the code assumes it.

python3 x86.py -p peterson.s -a bx=0,bx=1 -i 50 -M flag,turn,count -R ax,bx,cx -c

10. Can you control the scheduling (with the -P flag) to “prove” that the code works? What are the different cases you should show hold? Think about mutual exclusion and deadlock avoidance.

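First run 6 instructions of Thread 0, then 6 instructions of Thread 1. At this point turn = 0, and the three instructions mentioned above take effect: Thread 0 proceeds while Thread 1 spins.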

python3 x86.py -p peterson.s -a bx=0,bx=1 -M flag,turn,count -R ax,bx,cx -P 000000111111 
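
The same kind of schedule-driven check can be sketched outside the simulator. The Python model below is an illustration under stated assumptions, not a transcription of peterson.s: it assumes four setup instructions before the lock is taken (so that the first six steps of each thread end with the write to turn, as in the schedule above), treats each `yield` as one instruction, and repeats the schedule string until both threads halt. The `inside` and `max_inside` fields are bookkeeping added only to detect a mutual exclusion violation.

```python
from itertools import cycle, product


def peterson_thread(me, shared):
    """One thread running Peterson's algorithm once; each `yield` ends one step."""
    other = 1 - me
    for _ in range(4):                      # assumed setup instructions
        yield
    shared["flag"][me] = 1                  # flag[self] = 1
    yield
    shared["turn"] = other                  # turn = 1 - self
    yield
    while True:
        other_flag = shared["flag"][other]  # read flag[1 - self]
        yield
        if other_flag != 1:
            break
        turn = shared["turn"]               # read turn
        yield
        if turn != other:
            break
    shared["inside"] += 1                   # bookkeeping: entered critical section
    shared["max_inside"] = max(shared["max_inside"], shared["inside"])
    tmp = shared["count"]
    yield
    shared["count"] = tmp + 1
    yield
    shared["inside"] -= 1                   # bookkeeping: leaving critical section
    shared["flag"][me] = 0                  # release
    yield


def run_schedule(schedule):
    """Run both threads under a repeating schedule such as "000000111111"."""
    shared = {"flag": [0, 0], "turn": 0, "count": 0, "inside": 0, "max_inside": 0}
    threads = {0: peterson_thread(0, shared), 1: peterson_thread(1, shared)}
    for ch in cycle(schedule):
        if not threads:
            break
        tid = int(ch)
        if tid not in threads:              # that thread has halted; run the other
            tid = next(iter(threads))
        try:
            next(threads[tid])
        except StopIteration:
            del threads[tid]
    return shared


def holds_for_all_schedules(max_len):
    """Check count == 2 and at most one thread inside, for every short schedule."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            result = run_schedule("".join(bits))
            if result["count"] != 2 or result["max_inside"] != 1:
                return False
    return True


if __name__ == "__main__":
    print(run_schedule("000000111111"))
    print(holds_for_all_schedules(8))
```

Trying every short schedule is not a proof, but it covers the two cases the question asks about: both threads always finish (no deadlock) and they are never inside the critical section together (mutual exclusion).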

11. Now study the code for the ticket lock in ticket.s. Does it match the code in the chapter? Then run with the following flags: -a bx=1000,bx=1000 (causing each thread to loop through the critical section 1000 times). Watch what happens; do the threads spend much time spin-waiting for the lock?

Yes, it is an assembly implementation of Figure 28.7.

python3 x86.py -p ticket.s -a bx=1000,bx=1000 -i 50 -M ticket,turn,count -R ax,bx,cx -c

12. How does the code behave as you add more threads?

Building on test-and-set, fetch-and-add guarantees fairness: every thread gets an equal chance to run, and none starves.

Spin-waiting still occurs, however.

python3 x86.py -p ticket.s -t 4 -a bx=5 -i 2 -M ticket,turn,count -R ax,bx,cx -c
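
Both observations, strict ticket order and continued spinning, can be illustrated with a small model. This is a hedged Python sketch rather than a port of ticket.s: fetch-and-add is modeled as a single atomic step, each `yield` ends one step, and threads are switched round-robin every `interval` steps. The `served` and `spins` fields are bookkeeping introduced for this example.

```python
def ticket_thread(tid, shared, loops):
    """One thread using a ticket lock; each `yield` ends one simulated step."""
    for _ in range(loops):
        my_ticket = shared["ticket"]     # fetch-and-add on ticket, modeled
        shared["ticket"] += 1            # as one atomic step
        yield
        while True:
            turn = shared["turn"]        # read the turn counter
            yield
            if turn == my_ticket:
                break
            shared["spins"] += 1         # bookkeeping: one failed check
        shared["served"].append((my_ticket, tid))
        tmp = shared["count"]            # critical section
        yield
        shared["count"] = tmp + 1
        yield
        shared["turn"] += 1              # release: hand the lock to the next ticket
        yield


def run_ticket(num_threads, loops, interval):
    """Round-robin `num_threads` threads, switching every `interval` steps."""
    shared = {"ticket": 0, "turn": 0, "count": 0, "spins": 0, "served": []}
    alive = [ticket_thread(t, shared, loops) for t in range(num_threads)]
    while alive:
        for thread in list(alive):
            for _ in range(interval):
                try:
                    next(thread)
                except StopIteration:
                    alive.remove(thread)
                    break
    return shared


if __name__ == "__main__":
    result = run_ticket(num_threads=4, loops=5, interval=2)
    print(result["count"], result["spins"])
    print(result["served"])
```

In this model the lock is always granted in ticket order, which is the fairness property, while the `spins` counter shows that waiting threads still burn steps re-reading turn.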

13. Now examine yield.s, in which a yield instruction enables one thread to yield control of the CPU (realistically, this would be an OS primitive, but for the simplicity, we assume an instruction does the task). Find a scenario where test-and-set.s wastes cycles spinning, but yield.s does not.

yield.s builds on test-and-set.s by changing the jump condition and adding three instructions:

mov  $1, %ax        
xchg %ax, mutex     # atomic swap of 1 and mutex
test $0, %ax        # if we get 0 back: lock is free!
je .acquire_done    
yield               # if not, yield and try again
j .acquire
.acquire_done

python3 x86.py -p yield.s -i 5 -a bx=5,bx=5 -M mutex,count -R ax,bx -c
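
To see why yielding helps, the two acquire loops can be compared in a simplified model. The Python sketch below is an assumption-laden illustration, not the x86.py scheduler: each `yield` ends one step, a thread that yields the string "yield" gives up the rest of its time slice, and `wasted` counts failed attempts to take the lock. The names `tas_thread` and `run_tas` are invented for this example.

```python
def tas_thread(shared, loops, use_yield):
    """Test-and-set lock loop; with use_yield the thread gives up the CPU on failure."""
    for _ in range(loops):
        while True:
            old = shared["mutex"]        # xchg: atomically swap 1 into mutex
            shared["mutex"] = 1
            yield "step"
            if old == 0:
                break                    # got the lock
            shared["wasted"] += 1        # bookkeeping: one failed attempt
            if use_yield:
                yield "yield"            # hand the CPU to the other thread
        tmp = shared["count"]            # critical section
        yield "step"
        shared["count"] = tmp + 1
        yield "step"
        shared["mutex"] = 0              # release
        yield "step"


def run_tas(interval, loops, use_yield):
    """Round-robin two threads; a "yield" signal ends the current time slice early."""
    shared = {"mutex": 0, "count": 0, "wasted": 0}
    alive = [tas_thread(shared, loops, use_yield) for _ in range(2)]
    while alive:
        for thread in list(alive):
            for _ in range(interval):
                try:
                    signal = next(thread)
                except StopIteration:
                    alive.remove(thread)
                    break
                if signal == "yield":
                    break
    return shared


if __name__ == "__main__":
    print("spin :", run_tas(interval=5, loops=5, use_yield=False))
    print("yield:", run_tas(interval=5, loops=5, use_yield=True))
```

The scenario to look for is one where a thread is interrupted while holding the lock: the spinning version retries for its whole time slice, while the yielding version fails once and lets the lock holder continue.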

14. Finally, examine test-and-test-and-set.s. What does this lock do? What kind of savings does it introduce as compared to test-and-set.s?

Compared with test-and-set.s, it adds these three instructions:

mov  mutex, %ax
test $0, %ax
jne .acquire

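These three instructions are exactly the ones flag.s uses to check whether the lock is held, so the atomic xchg is attempted only once the mutex appears to be free.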
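
One common reading of the savings is that a waiting thread spins on a plain read and issues far fewer atomic exchanges. The Python sketch below illustrates that under simplifying assumptions: each `yield` ends one step, threads switch round-robin every `interval` steps, and the `xchg` field counts atomic exchanges. It is a model written for this note, not a measurement of the real programs.

```python
def lock_thread(shared, loops, test_first):
    """Spin lock loop; with test_first it reads mutex before attempting xchg."""
    for _ in range(loops):
        while True:
            if test_first:
                seen = shared["mutex"]   # plain read: mov mutex, %ax
                yield
                if seen != 0:
                    continue             # still held: keep reading, no xchg
            old = shared["mutex"]        # xchg: atomically swap 1 into mutex
            shared["mutex"] = 1
            shared["xchg"] += 1          # bookkeeping: count atomic exchanges
            yield
            if old == 0:
                break
        tmp = shared["count"]            # critical section
        yield
        shared["count"] = tmp + 1
        yield
        shared["mutex"] = 0              # release
        yield


def run_lock(interval, loops, test_first):
    """Round-robin two threads, switching every `interval` steps."""
    shared = {"mutex": 0, "count": 0, "xchg": 0}
    alive = [lock_thread(shared, loops, test_first) for _ in range(2)]
    while alive:
        for thread in list(alive):
            for _ in range(interval):
                try:
                    next(thread)
                except StopIteration:
                    alive.remove(thread)
                    break
    return shared


if __name__ == "__main__":
    print("test-and-set         :", run_lock(3, 5, test_first=False))
    print("test-and-test-and-set:", run_lock(3, 5, test_first=True))
```

A failed xchg can still happen when the lock is taken between the read and the exchange, but in this model it happens far less often than in the plain test-and-set loop.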
