Building a memory pool queue in Java; a thread-safe memory pool

My application is highly performance-critical and requests 3-5 million objects per frame. Initially, to get the ball rolling, I new'd everything just to get the application working and to test my algorithms. The application is multi-threaded.

Once I was happy with the performance, I started to create a memory manager for my objects. The obvious reasons are memory fragmentation and wastage. The application could not run for more than a few frames before crashing due to memory fragmentation. I have checked for memory leaks and know the application is leak-free.

So I started creating a simple memory manager using TBB's concurrent_queue. The queue stores a maximum set of elements the application is allowed to use. The class requiring new elements pops elements from the queue. The try_pop method is, according to Intel's documentation, lock-free. This worked quite well as far as memory consumption goes (although there is still memory fragmentation, but not nearly as much as before). The problem I am facing now is that the application's performance has slowed down approximately 4 times according to my own simple profiler (I do not have access to commercial profilers or know of any that will work on a real-time application... any recommendation would be appreciated).
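The queue-based pool described above can be sketched as follows. This is a hypothetical stand-in, not the author's code: the question used tbb::concurrent_queue<T*> (with its try_pop acquire path), while here a std::mutex-guarded std::deque plays that role so the sketch compiles without TBB. All names (SharedPool, try_acquire, release) are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Sketch of the pool idea described above: a fixed set of objects is created
// up front and handed out through a shared queue, so nothing is ever new'd
// per frame. A mutex-guarded deque stands in for tbb::concurrent_queue<T*>.
template <typename T>
class SharedPool {
    std::vector<T> m_storage;  // owns all objects for the pool's lifetime
    std::deque<T*> m_free;     // pointers currently available
    std::mutex m_lock;

public:
    explicit SharedPool(std::size_t count) : m_storage(count) {
        for (T& obj : m_storage) m_free.push_back(&obj);
    }

    // Returns nullptr when the pool is exhausted, mirroring try_pop's
    // "may fail" contract.
    T* try_acquire() {
        std::lock_guard<std::mutex> guard(m_lock);
        if (m_free.empty()) return nullptr;
        T* p = m_free.front();
        m_free.pop_front();
        return p;
    }

    void release(T* p) {
        std::lock_guard<std::mutex> guard(m_lock);
        m_free.push_back(p);
    }
};
```

Every acquire and release goes through one shared lock here, which is exactly the kind of cross-thread contention the answer below avoids by going per-thread.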

My question is: is there a thread-safe memory pool that is scalable? A must-have feature of the pool is fast recycling of elements and making them available again. If there is none, any performance tips/tricks?

EDIT: I thought I would explain the problem a bit more. I could easily initialize n arrays, where n is the number of threads, and start using the objects from the arrays per thread. This would work perfectly in some cases. In my case, I am recycling the elements as well (potentially every frame), and they could be recycled at any point in the array; i.e. it may be from elementArray[0] or elementArray[10] or elementArray[1000]. Now I will have a fragmented array consisting of elements that are ready to be used and elements that are in use :(

Solution

As said in the comments, don't use a thread-safe memory allocator; allocate memory per thread.

As you implied in your update, you need to manage free/in-use effectively. That is a pretty straightforward problem, given a constant type and no concurrency.

For example (off the top of my head, untested):

    template <typename T>
    class ThreadStorage
    {
        std::vector<T> m_objs;
        std::vector<size_t> m_avail;

    public:
        explicit ThreadStorage(size_t count) : m_objs(count, T()) {
            m_avail.reserve(count);
            for (size_t i = 0; i < count; ++i) m_avail.push_back(i);
        }

        T* alloc() {
            T* retval = &m_objs[0] + m_avail.back();
            m_avail.pop_back();
            return retval;
        }

        void free(T* p) {
            *p = T(); // Assuming this is enough destruction.
            m_avail.push_back(p - &m_objs[0]);
        }
    };

Then, for each thread, have a ThreadStorage instance, and call alloc() and free() as required.
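To make the per-thread usage concrete, here is a minimal sketch: the ThreadStorage class repeated (with the template parameters spelled out) so the snippet stands alone, plus an illustrative worker function of my own; none of the driver names come from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// The ThreadStorage from the answer, repeated so this snippet is
// self-contained.
template <typename T>
class ThreadStorage {
    std::vector<T> m_objs;
    std::vector<std::size_t> m_avail;  // stack of free indices into m_objs

public:
    explicit ThreadStorage(std::size_t count) : m_objs(count, T()) {
        m_avail.reserve(count);
        for (std::size_t i = 0; i < count; ++i) m_avail.push_back(i);
    }

    T* alloc() {
        T* retval = &m_objs[0] + m_avail.back();
        m_avail.pop_back();
        return retval;
    }

    void free(T* p) {
        *p = T();  // reset; assumed to be enough "destruction"
        m_avail.push_back(p - &m_objs[0]);
    }
};

// Each worker owns its own pool, so no locking is needed anywhere.
void worker(std::size_t count) {
    ThreadStorage<int> storage(count);  // per-thread, never shared
    std::vector<int*> live;
    for (std::size_t i = 0; i < count; ++i) live.push_back(storage.alloc());
    for (int* p : live) storage.free(p);  // recycle: indices return to the stack
}
```

Because m_avail is a LIFO stack, a freed slot is handed out again on the very next alloc(), which keeps recently-touched memory hot in cache.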

You can add smart pointers to manage calling free() for you, and you can optimise constructor/destructor calling if that's expensive.
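One way to sketch the smart-pointer idea is a std::unique_ptr with a custom deleter that returns the object to its pool instead of deleting it. The Pool, PoolDeleter, and make_pooled names below are hypothetical, and the pool is a trimmed-down version of the free list above, included only so the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Minimal free-list pool (same idea as ThreadStorage, trimmed down) so the
// deleter sketch below is self-contained.
template <typename T>
class Pool {
    std::vector<T> m_objs;
    std::vector<std::size_t> m_avail;

public:
    explicit Pool(std::size_t count) : m_objs(count) {
        for (std::size_t i = 0; i < count; ++i) m_avail.push_back(i);
    }
    T* alloc() { T* p = &m_objs[0] + m_avail.back(); m_avail.pop_back(); return p; }
    void free(T* p) { *p = T(); m_avail.push_back(p - &m_objs[0]); }
    std::size_t available() const { return m_avail.size(); }
};

// A deleter that hands the object back to its pool instead of calling delete.
template <typename T>
struct PoolDeleter {
    Pool<T>* pool;
    void operator()(T* p) const { pool->free(p); }
};

template <typename T>
using PoolPtr = std::unique_ptr<T, PoolDeleter<T>>;

// Acquire an object wrapped so that going out of scope recycles it.
template <typename T>
PoolPtr<T> make_pooled(Pool<T>& pool) {
    return PoolPtr<T>(pool.alloc(), PoolDeleter<T>{&pool});
}
```

With this, forgetting to call free() becomes impossible: the object is recycled the moment the PoolPtr leaves scope.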

You can also look at boost::pool.

Update:

The new requirement for keeping track of things that have been used, so that they can be processed in a second pass, seems a bit unclear to me. I think you mean that when the primary processing is finished on an object, you need to not release it, but keep a reference to it for second-stage processing. Some objects will just be released back to the pool and not used for second-stage processing.

I assume you want to do this in the same thread.

As a first pass, you could add a method like this to ThreadStorage and call it when you want to do processing on all unreleased instances of T. No extra bookkeeping is required.

    void do_processing(boost::function<void(T*)> const& f) {
        std::sort(m_avail.begin(), m_avail.end());
        size_t o = 0;
        for (size_t i = 0; i != m_avail.size(); ++i) {
            if (o < m_avail[i]) {
                do {
                    f(&m_objs[o]);
                } while (++o < m_avail[i]);
                ++o;
            } else if (o == m_avail[i])
                ++o;
        }
        for (; o < m_objs.size(); ++o) f(&m_objs[o]);
    }

This assumes no other thread is using the ThreadStorage instance, which is reasonable because it is thread-local by design. Again, off the top of my head, untested.
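To make the skipping logic concrete, here is a self-contained version of the class with that pass. std::function is substituted for boost::function so the sketch has no external dependencies, and the test scenario is illustrative, not from the answer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// ThreadStorage with the do_processing pass from the update, repeated so
// this compiles on its own; std::function stands in for boost::function.
template <typename T>
class ThreadStorage {
    std::vector<T> m_objs;
    std::vector<std::size_t> m_avail;

public:
    explicit ThreadStorage(std::size_t count) : m_objs(count, T()) {
        for (std::size_t i = 0; i < count; ++i) m_avail.push_back(i);
    }
    T* alloc() { T* p = &m_objs[0] + m_avail.back(); m_avail.pop_back(); return p; }
    void free(T* p) { *p = T(); m_avail.push_back(p - &m_objs[0]); }

    // Visits exactly the objects currently allocated: walk the object array
    // in index order, skipping over the (sorted) free indices.
    void do_processing(std::function<void(T*)> const& f) {
        std::sort(m_avail.begin(), m_avail.end());
        std::size_t o = 0;
        for (std::size_t i = 0; i != m_avail.size(); ++i) {
            if (o < m_avail[i]) {
                do { f(&m_objs[o]); } while (++o < m_avail[i]);
                ++o;  // skip the free slot at m_avail[i]
            } else if (o == m_avail[i])
                ++o;
        }
        for (; o < m_objs.size(); ++o) f(&m_objs[o]);  // tail past the last free slot
    }
};
```

For example, with a pool of 5 where indices 4, 3, 2 are allocated and 3 is then freed, the pass visits exactly the two live objects (indices 2 and 4).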
