进程间通信（IPC）模型的选择

最新推荐文章于 2023-03-16 10:16:21 发布

Frozen2022

最新推荐文章于 2023-03-16 10:16:21 发布

阅读量1.3k

点赞数

分类专栏： c++ c

c++ 同时被 2 个专栏收录

58 篇文章 2 订阅

订阅专栏

40 篇文章 0 订阅

订阅专栏

When selecting your IPC you should consider causes for performance differences including transfer buffer sizes, data transfer mechanisms, memory allocation schemes, locking mechanism implementations, and even code complexity.

cross-platform

In terms of speed, the best cross-platform IPC mechanism will be pipes. That assumes, however, that you want cross-platform IPC on the same machine. If you want to be able to talk to processes on remote machines, you'll want to look at using sockets instead. Luckily, if you're talking about TCP at least, sockets and pipes behave pretty much the same behavior. While the APIs for setting them up and connecting them are different, they both just act like streams of data.

The difficult part, however, is not the communication channel, but the messages you pass over it. You really want to look at something that will perform verification and parsing for you. I recommend looking at Google's Protocol Buffers. You basically create a spec file that describes the object you want to pass between processes, and there is a compiler that generates code in a number of different languages for reading and writing objects that match the spec. It's much easier (and less bug prone) than trying to come up with a messaging protocol and parser yourself.

For C++, check out Boost IPC.
You can probably create or find some bindings for the scripting languages as well.

Otherwise if it's really important to be able to interface with scripting languages your best bet is simply to use files, pipes or sockets or even a higher level abstraction like HTTP.

Non-cross platform

unix/linux

For LARGE messages, there is no doubt that shared memory is a very good technique, and very useful in many ways.

However, if the messages are small, there are drawbacks of having to come up with your own message-passing protocol and method of informing the other process that there is a message.

Pipes and named pipes are much easier to use in this case - they behave pretty much like a file, you just write data at the sending side, and read the data at the receiving side. If the sender writes something, the receiver side automatically wakes up. If the pipe is full, the sending side gets blocked. If there is no more data from the sender, the receiving side is automatically blocked. Which means that this can be implemented in fairly few lines of code with a pretty good guarantee that it will work at all times, every time.

Shared memory on the other hand relies on some other mechanism to inform the other thread that "you have a packet of data to process". Yes, it's very fast if you have LARGE packets of data to copy - but I would be surprised if there is a huge difference to a pipe, really. Main benefit would be that the other side doesn't have to copy the data out of the shared memory - but it also relies on there being enough memory to hold all "in flight" messages, or the sender having the ability to hold back things.

I'm not saying "don't use shared memory", I'm just saying that there is no such thing as "one solution that solves all problems 'best'".

To clarify: I would start by implementing a simple method using a pipe or named pipe [depending on which suits the purposes], and measure the performance of that. If a significant time is spent actually copying the data, then I would consider using other methods.

Of course, another consideration should be "are we ever going to use two separate machines [or two virtual machines on the same system] to solve this problem. In which case, a network solution is a better choice - even if it's not THE fastest, I've run a local TCP stack on my machines at work for benchmark purposes and got some 20-30Gbit/s (2-3GB/s) with sustained traffic. A raw memcpy within the same process gets around 50-100GBit/s (5-10GB/s) (unless the block size is REALLY tiny and fits in the L1 cache). I haven't measured a standard pipe, but I expect that's somewhere roughly in the middle of those two numbers. [This is numbers that are about right for a number of different medium-sized fairly modern PC's - obviously, on a ARM, MIPS or other embedded style controller, expect a lower number for all of these methods]。

If performance really becomes a problem you can use shared memory - but it's a lot more complicated than the other methods - you'll need a signalling mechanism to signal that data is ready (semaphore etc) as well as locks to prevent concurrent access to structures while they're being modified.

The upside is that you can transfer a lot of data without having to copy it in memory, which will definitely improve performance in some cases.

Perhaps there are usable libraries which provide higher level primitives via shared memory.

Shared memory is generally obtained by mmaping the same file using MAP_SHARED (which can be on a tmpfs if you don't want it persisted); a lot of apps also use System V shared memory (IMHO for stupid historical reasons; it's a much less nice interface to the same thing)

Shared memory is the fastest interprocess communication mechanism. The operating system maps a memory segment in the address space of several processes, so that several processes can read and write in that memory segment without calling operating system functions. However, we need some kind of synchronization between processes that read and write shared memory.

a paper on 【Performance Analysis of Various Mechanisms for Inter-process Communication】 that indicates Unix domain sockets for IPC may provide the best performance.conflicting results 【Comparing Some IPC Methods on Unix】 which indicate pipes may be better.

When sending small amounts of data, I prefer named pipes (FIFOs) for their simplicity. This requires a pair of named pipes for bi-directional communication. Unix domain sockets take a bit more overhead to setup (socket creation, initialization and connection), but are more flexible and may offer better performance (higher throughput).

You may need to run some benchmarks for your specific application/environment to determine what will work best for you. From the description provided, it sounds like Unix domain sockets may be the best fit.

Windows

The full list of IPC methods supported by Windows is available on the MSDN.

DDE , Shared memory solve some problems of a minimum magnitude.

Pipes are used for all medium level IPC requirements.

Sockets are used only when u have some thing really fancy that u can't do with pipes.

Sockets can have problems of their own ... sync/ async, Protocol problems ... etc ...

Using a DCOM approach is not bad neither, it will do a lot of things for u but u need a lot of learning for it.

still, if you just have two applications that want to share a memory block, you should create a named memory-mapped file (backed by the paging file) with CreateFileMapping/MapViewOfFile, that should be the most straightforward and fastest method. The details of file mapping are described on its page on MSDN.

The relevant Boost IPC classes can act as a thin wrapper around shared memory, AFAIK it only encapsulates the calls to the relevant system-specific APIs, but in the end you get the usual pointer to the shared memory block, so operation should be as fast as using the native APIs.

Because of this I advise you to use Boost.Interprocess, since it's portable, C++-friendly (it provides RAII semantics) and does not give you any performance penalty after the shared memory block has been created (it can provide additional functionalities on shared memory, but they are all opt-in - if you just want shared memory you get just it).

NOTE:ReadProcessMemory shouldn't even be listed as an IPC method; yes, it can be used as such, but it exists mainly for debugging purposes (if you check its reference, it's under the category "Debugging functions"), and it's surely slower than "real" shared memory because it copies the memory of a process into the specified buffer, while real shared memory doesn't have this overhead.

opinions on pipes and shared memory

Essentially, pipes - whether named or anonymous - are used like message passing. Someone sends a piece of information to the recipient and the recipient can receive it. Shared memory is more like publishing data - someone puts data in shared memory and the readers (potentially many) must use synchronization e.g. via semaphores to learn about the fact that there is new data and must know how to read the memory region to find the information.

With pipes the synchronization is simple and built into the pipe mechanism itself - your reads and writes will freeze and unfreeze the app when something interesting happens. With shared memory, it is easier to work asynchronously and check for new data only once in a while - but at the cost of much more complex code. Plus you can get many-to-many communication but it requires more work again. Also, due to the above, debugging of pipe-based communication is easier than debugging shared memory.

A minor difference is that fifos are visible directly in the filesystem while shared memory regions need special tools like ipcs for their management in case you e.g. create a shared memory segment but your app dies and doesn't clean up after itself (same goes for semaphores and many other synchronization mechanisms which you might need to use together with shared memory).

Shared memory also gives you more control over bufferring and resource use - within limits allowed by the OS it's you who decides how much memory to allocate and how to use it. With pipes, the OS controls things automatically, so once again you loose some flexibility but are relieved of much work.

Summary of most important points: pipes for one-to-one communication, less coding and letting the OS handle things, shared memory for many-to-many, more manual control over things but at the cost of more work and harder debugging.