[Erlang 0035] Erlang SMP

Latest recommended article published on 2016-12-13 22:16:17

坚强2002

Article tags: erlang system asynchronous optimization parallel each

Article link: https://blog.csdn.net/ligaorenvip/article/details/7923412

Erlang SMP

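Work on Erlang SMP (Symmetric Multiprocessing) began around 1997-98. The project followed a "First make it work, then measure, then optimize" development strategy, and the first stable version was released with R11B in 2006.

Force an SMP build: ./configure --enable-smp-support
Build without the SMP emulator: ./configure --disable-smp-support
Enable or disable SMP when starting erl: -smp enable / -smp disable

How it works

An Erlang VM without SMP support has a single scheduler running in the main thread. The scheduler takes the Erlang processes and IO jobs that need handling off the run queue. Because there is only one scheduler, there is no need to lock data.

An Erlang VM with SMP support can run 1 to 1024 schedulers, each in its own operating system thread; the operating system decides whether those threads run on different cores. With several schedulers, shared data has to be protected by locks, and a single Erlang process may be run by different schedulers over its lifetime.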
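To see how a running VM is configured, the scheduler settings can be read with erlang:system_info/1. The following is a minimal sketch; the module name smp_info and the function report/0 are made up for illustration, and logical_processors may come back as the atom unknown on platforms where the runtime cannot detect it.

```erlang
-module(smp_info).
-export([report/0]).

%% Returns the scheduler-related settings of the running VM as a
%% property list. Every key is a documented erlang:system_info/1 item.
report() ->
    [{smp_support,        erlang:system_info(smp_support)},
     {schedulers,         erlang:system_info(schedulers)},
     {schedulers_online,  erlang:system_info(schedulers_online)},
     {logical_processors, erlang:system_info(logical_processors)}].
```

After compiling with c(smp_info)., calling smp_info:report(). in the shell prints the values for the current node.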

Alright, so it was decided that lightweight processes with asynchronous message passing were the approach to take for Erlang. How to make this work? Well, first of all, the operating system can't be trusted to handle the processes. Operating systems have many different ways to handle processes, and their performance varies a lot. Most if not all of them are too slow or too heavy for what is needed by standard Erlang applications. By doing this in the VM, the Erlang implementers keep control of optimization and reliability. Nowadays, Erlang's processes take about 300 words of memory each and can be created in a matter of microseconds—not something doable on major operating systems these days.

To handle all these potential processes your programs could create, the VM starts one thread per core which acts as a scheduler. Each of these schedulers has a run queue, or a list of Erlang processes on which to spend a slice of time. When one of the schedulers has too many tasks in its run queue, some are migrated to another one. This is to say each Erlang VM takes care of doing all the load-balancing and the programmer doesn't need to worry about it. There are some other optimizations that are done, such as limiting the rate at which messages can be sent on overloaded processes in order to regulate and distribute the load.
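The claims about process cost can be checked roughly on your own machine. The sketch below is an illustration, not a rigorous benchmark: the module name proc_cost and the function measure/1 are hypothetical, and the numbers depend heavily on the OTP release and hardware. It spawns N idle processes, times the spawning, and reads the memory footprint of one of them with erlang:process_info/2.

```erlang
-module(proc_cost).
-export([measure/1]).

%% Spawns N idle processes and reports the average spawn time in
%% microseconds plus the size, in machine words, of one such process.
measure(N) when is_integer(N), N >= 1 ->
    Idle = fun() -> receive stop -> ok end end,
    {Micros, Pids} =
        timer:tc(fun() -> [spawn(Idle) || _ <- lists:seq(1, N)] end),
    %% The sampled process is still blocked in receive, so it is alive.
    {memory, Bytes} = erlang:process_info(hd(Pids), memory),
    %% Tell every process to exit so nothing is left behind.
    lists:foreach(fun(P) -> P ! stop end, Pids),
    #{spawn_us_per_process => Micros / N,
      words_per_process    => Bytes div erlang:system_info(wordsize)}.
```

Keep N below the VM's process limit (see erlang:system_info(process_limit)) when trying this.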

link:http://learnyousomeerlang.com/the-hitchhikers-guide-to-concurrency

Performance

The figures below are fairly old (January 2008), but they are still useful as a reference:

Measurements from a real telecom product showed a 1.7 speed improvement between a single and a dual core system.

The SMP VM with only one scheduler is slightly slower (10%) than the non-SMP VM.
This is because the SMP VM needs to use locks for all shared data structures. As long as there are no lock conflicts, the overhead caused by locking is not that high (it is the lock conflicts that take time).
This explains why in some cases it can be more efficient to run several SMP VMs with one scheduler each instead of one SMP VM with several schedulers. Of course, running several VMs requires that the application can be split into many parallel tasks that have little or no communication with each other.

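Usage

Starting with OTP R12B, SMP is enabled automatically whenever the operating system reports more than one CPU (or core), and the number of schedulers is set to match the number of CPUs or cores. Let's start a node and see: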

Erlang R15B (erts-5.9) [source] [64-bit] [smp:8:8] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9  (abort with ^G)
1>

Check the server's CPU information:

# grep "model name" /proc/cpuinfo | cut -f2 -d:
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz
Intel(R) Xeon(R) CPU           E5506 @ 2.13GHz

As you can see, the 8-core machine started with smp:8:8 enabled automatically. What do the two numbers mean?

+S Schedulers:SchedulerOnline

Sets the amount of scheduler threads to create and scheduler threads to set online when SMP support has been enabled. Valid range for both values are 1-1024. If the Erlang runtime system is able to determine the amount of logical processors configured and logical processors available, Schedulers will default to logical processors configured, and SchedulersOnline will default to logical processors available; otherwise, the default values will be 1. Schedulers may be omitted if :SchedulerOnline is not and vice versa. The amount of schedulers online can be changed at run time via erlang:system_flag(schedulers_online, SchedulersOnline).

Note: the results are similar whether symmetric multiprocessing is enabled or not. To prove it, you can just test it out by starting the Erlang VM with $ erl -smp disable.

To see if your Erlang VM runs with or without SMP support in the first place, start a new VM without any options and look for the first line output. If you can spot the text [smp:2:2] [rq:2], it means you're running with SMP enabled, and that you have 2 run queues (rq, or schedulers) running on two cores. If you only see [rq:1], it means you're running with SMP disabled.

If you wanted to know, [smp:2:2] means there are two cores available, with two schedulers. [rq:2] means there are two run queues active. In earlier versions of Erlang, you could have multiple schedulers, but with only one shared run queue. Since R13B, there is one run queue per scheduler by default; this allows for better parallelism.

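Notes:

Setting more schedulers than there are CPUs or cores brings no benefit.
Some operating systems let you bind a process to specific CPUs with commands such as taskset. The Erlang VM only detects the number of available CPUs, because such binding can happen at any moment; SchedulersOnline is the number of CPUs or cores actually available.
This parameter can be adjusted at run time.

To turn SMP off, use the following startup flag: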
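A minimal sketch of the run-time adjustment, assuming a hypothetical helper module sched_online: it temporarily sets the number of online schedulers with erlang:system_flag/2, runs a fun, and then restores the previous value. The requested count is clamped to the number of schedulers the VM was started with, since a larger value would be rejected with badarg.

```erlang
-module(sched_online).
-export([with_online/2]).

%% Runs Fun with at most N schedulers online, then restores the
%% previous setting even if Fun raises an exception.
with_online(N, Fun) when is_integer(N), N >= 1, is_function(Fun, 0) ->
    Max = erlang:system_info(schedulers),
    %% system_flag/2 returns the value that was in effect before.
    Old = erlang:system_flag(schedulers_online, min(N, Max)),
    try
        Fun()
    after
        erlang:system_flag(schedulers_online, Old)
    end.
```

For example, sched_online:with_online(1, fun() -> my_benchmark() end) would run a benchmark (my_benchmark/0 is a placeholder) on a single scheduler without restarting the node.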

-smp [enable|auto|disable]
-smp enable and -smp starts the Erlang runtime system with SMP support enabled. This may fail if no runtime system with SMP support is available. -smp auto starts the Erlang runtime system with SMP support enabled if it is available and more than one logical processor are detected. -smp disable starts a runtime system without SMP support. By default -smp auto will be used unless a conflicting parameter has been passed, then -smp disable will be used. Currently only the -hybrid parameter conflicts with -smp auto.

Parallelism is not the answer to every problem. In some cases, going parallel will even slow down your application. This can happen whenever your program is 100% sequential, but still uses multiple processes.

One of the best examples of this is the ring benchmark. A ring benchmark is a test where many thousands of processes pass a piece of data from one to the next in a circular manner. Think of it as a game of telephone if you want. In this benchmark, only one process at a time does something useful, but the Erlang VM still spends time distributing the load across cores and giving every process its share of time.

This plays against many common hardware optimizations and makes the VM spend time doing useless stuff. This often makes purely sequential applications run much slower on many cores than on a single one. In this case, disabling symmetric multiprocessing ($ erl -smp disable) might be a good idea.
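The ring benchmark described above can be sketched as follows. This is one possible formulation rather than a canonical one: the module name ring and the function run/2 are made up for illustration, and the calling process acts as the point where the ring closes. Running it under different scheduler counts (for example with the +S flag) gives a feel for the effect, though the results vary by machine and OTP release.

```erlang
-module(ring).
-export([run/2]).

%% Builds a ring of N relay processes closed by the caller, sends a
%% token around it M times, and returns the elapsed microseconds.
run(N, M) when is_integer(N), N >= 1, is_integer(M), M >= 1 ->
    {Micros, ok} = timer:tc(fun() -> start(N, M) end),
    Micros.

start(N, M) ->
    %% Each new process forwards to the one created before it; the
    %% first one created forwards back to the caller.
    First = lists:foldl(
              fun(_, Next) -> spawn(fun() -> relay(Next) end) end,
              self(), lists:seq(1, N)),
    laps(First, M),
    %% Shut the ring down and wait for the stop message to return.
    First ! stop,
    receive stop -> ok end.

%% Forwards every message to Next; exits after forwarding stop.
relay(Next) ->
    receive
        stop  -> Next ! stop;
        Token -> Next ! Token, relay(Next)
    end.

%% Sends the token around the ring M times, one lap at a time.
laps(_First, 0) -> ok;
laps(First, M) ->
    First ! token,
    receive token -> laps(First, M - 1) end.
```

Only one process is runnable at any moment here, which is exactly the purely sequential workload the passage warns about.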