MATLAB dlnode: preallocating cell arrays in MATLAB

This is more a question to understand a behavior rather than a specific problem.

MathWorks states that numeric arrays are stored contiguously in memory, which makes preallocation important. This is not the case for cell arrays.

Are they something similar to a vector or an array of pointers in C++?

This would mean that preallocation is not so important, since a pointer is half the size of a double (according to whos, although there surely is overhead somewhere to store the data type of the mxArray).

Running this code:

clear all
n = 1e6;

% Numeric array grown one element at a time
tic
A = [];
for i = 1:n
    A(end + 1) = 1;
end
fprintf('Numerical without preallocation %f s\n', toc)

clear A

% Numeric array preallocated to its final size
tic
A = zeros(1, n);
for i = 1:n
    A(i) = 1;
end
fprintf('Numerical with preallocation %f s\n', toc)

clear A

% Cell array grown one cell at a time
tic
A = cell(0);
for i = 1:n
    A{end + 1} = 1;
end
fprintf('Cell without preallocation %f s\n', toc)

% Cell array preallocated to its final size
tic
A = cell(1, n);
for i = 1:n
    A{i} = 1;
end
fprintf('Cell with preallocation %f s\n', toc)

returns:

Numerical without preallocation 0.429240 s
Numerical with preallocation 0.025236 s
Cell without preallocation 4.960297 s
Cell with preallocation 0.554257 s

There is no surprise for the numerical values. But the cell array timings did surprise me, since only the container of pointers, and not the data itself, would need reallocation. That should (since a pointer is smaller than a double) lead to a difference of less than 0.2 s. Where does this overhead come from?

A related question: what if I would like to build a data container for heterogeneous data in MATLAB (preallocation is not possible since the final size is not known at the beginning)? I think handle classes are not a good fit, since they also have huge overhead.

Already looking forward to learning something.

magu_

Edit: I tried out the linked list proposed by Eitan T, but I think the overhead from MATLAB is still rather big. I tried something with a double array as data (rand(200000,1)).

I made a little plot to illustrate:

[Figure: time (s) versus number of iterations for the cell array and the linked list]

Code for the graph (I used the dlnode class from the MATLAB homepage, as stated in the answer):

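D = rand(200000,1);          % payload stored in every element
s = linspace(10,20000,50);   % numbers of insertions to time
nC = zeros(50,1);            % timings for the cell array
nL = zeros(50,1);            % timings for the linked list

for i = 1:50
    % Grow a cell array without preallocation
    a = cell(0);
    tic
    for ii = 1:s(i)
        a{end + 1} = D;
    end
    nC(i) = toc;

    % Grow a linked list (list is the linked-list class used here,
    % presumably the dlnode example class under another name)
    a = list([]);
    tic
    for ii = 1:s(i)
        a.insertAfter(list(D));
    end
    nL(i) = toc;
end

figure
plot(s,nC,'r',s,nL,'g')
xlabel('#iter')
ylabel('time (s)')
legend({'cell' 'list'})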

Don't get me wrong, I love the idea of linked lists, since they are rather flexible, but I think the overhead might be too big.

1 Answer

#1

9

Are cell arrays something similar to a vector or an array of pointers in C++?

Cell arrays allow storing data of different types and sizes indeed, but each cell also adds a constant overhead of 112 bytes (see this other answer of mine). This is far more than an 8-byte double, and this is non-negligible, especially when dealing with large cell arrays as in your example.
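To see the per-cell overhead on your own machine, something like the following rough sketch can help. It compares what whos reports for a double array and for a cell array holding the same scalars. The variable names (n, d, c, infoD, infoC) are only for illustration, and the exact per-cell figure may differ between MATLAB releases and platforms, so I am not quoting an expected number here.

```matlab
% Compare the memory reported by whos for a double array and a cell array
% that hold the same n scalar values.
n = 1000;
d = ones(1, n);      % numeric array: one block of n doubles
c = num2cell(d);     % cell array: one separate scalar double per cell

infoD = whos('d');   % struct with fields such as name, size, bytes, class
infoC = whos('c');

fprintf('double array: %d bytes (%.1f bytes per element)\n', ...
    infoD.bytes, infoD.bytes / n);
fprintf('cell array:   %d bytes (%.1f bytes per element)\n', ...
    infoC.bytes, infoC.bytes / n);
```

The double array should report 8 bytes per element, while the cell array reports considerably more per element because each cell carries its own header in addition to the 8 bytes of data.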

It is reasonable to assume that a cell array is implemented as a contiguous array of pointers, each pointing to the actual content of the cell.

This means that you can modify the content of each cell individually without actually resizing the cell array container itself. However, this also means that adding new cells to the cell array requires dynamic storage allocation and this is why preallocating memory for a cell array improves performance.

A related question: what if I would like to build a data container for heterogeneous data in MATLAB (preallocation is not possible since the final size is not known at the beginning)?

Not knowing the final size may indeed be a problem, but you could always preallocate a cell array with the maximum supported size necessary (if there is one), and remove the empty cells at the end. I also suggest that you look into implementing linked lists in MATLAB.
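A minimal sketch of the preallocate-then-trim idea, assuming an upper bound exists. The names maxItems, nActual and count are hypothetical, and the fixed loop length merely stands in for a data source whose length is not known in advance.

```matlab
% Hypothetical upper bound on the number of items; adjust to the application.
maxItems = 1000;
A = cell(1, maxItems);   % preallocate the container once
count = 0;               % number of cells filled so far

% Stand-in for a data source of unknown length (assumption for this demo).
nActual = 250;
for k = 1:nActual
    if count == maxItems
        break            % upper bound reached, stop collecting
    end
    count = count + 1;
    A{count} = rand(1, 3);   % any heterogeneous payload could go here
end

A = A(1:count);          % drop the unused cells at the end
```

Keeping an explicit counter and trimming with A(1:count) avoids growing the container inside the loop. If the empty cells were scattered rather than all at the end, A(cellfun(@isempty, A)) = [] would remove them, but it would also drop cells that were intentionally left empty.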
