How pointer alignment affects memcpy speed
memcpy function prototype
- Defined in header <cstring>
- void* memcpy( void* dest, const void* src, std::size_t count );
- Function: Copies count bytes from the object pointed to by src to the object pointed to by dest. Both objects are reinterpreted as arrays of unsigned char.
Parameters
dest - pointer to the memory location to copy to
src - pointer to the memory location to copy from
count - number of bytes to copy
Return value
dest
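memcpy itself places no alignment requirement on dest or src; the tests in this article vary only how far each pointer sits from a 4-byte boundary. As a minimal sketch of how that offset can be computed (the helper name offset_from_word is hypothetical and not part of any library):

```c
#include <stdint.h>

/* Hypothetical helper: distance in bytes of p from the previous
   4-byte boundary. 0 means the pointer is 4-byte aligned. */
unsigned offset_from_word(const void *p)
{
    return (unsigned)((uintptr_t)p & 3u);
}
```

For a uint32_t array buf, offset_from_word(buf) is expected to be 0 and offset_from_word((uint8_t *)buf + 1) to be 1, which corresponds to the "4 byte aligned" and "4 byte aligned +1" cases.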
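Test conditions
IAR 8.30, Simulator, Cortex-M3, optimization -O0
Test code and results
Set breakpoints at "start" and "end", read the CCSTEP value when execution stops at the "end" breakpoint, and record it in the "Cycle speed" column.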
#include <stdint.h>
#include <stdio.h>   /* printf */
#include <string.h>  /* memcpy */
/*
memcpy function speed test:
-------------------------------------------------------------------------
| Test List | Dest pointer       | Src pointer        | Cycle speed |
-------------------------------------------------------------------------
| 0         | 4 byte aligned     | 4 byte aligned     | 2203        |
-------------------------------------------------------------------------
| 1         | 4 byte aligned     | 4 byte aligned +1  | 2765        |
-------------------------------------------------------------------------
| 2         | 4 byte aligned +1  | 4 byte aligned     | 2783        |
-------------------------------------------------------------------------
| 3         | 4 byte aligned +1  | 4 byte aligned +1  | 2229        |
-------------------------------------------------------------------------
| 4         | 4 byte aligned +2  | 4 byte aligned +1  | 2773        |
-------------------------------------------------------------------------
| 5         | 4 byte aligned +3  | 4 byte aligned +1  | 2763        |
-------------------------------------------------------------------------
*/
uint32_t buf[5000];   /* 20000 bytes; the two 1000-byte regions used below do not overlap */
void * p_s;           /* source pointer */
void * p_d;           /* destination pointer */
#define TEST_LIST 5   /* selects the row of the table above (0..5) */
void main( void )
{
#if TEST_LIST == 0
    p_s = (void *)buf;
    p_d = (void *)((uint8_t *)(buf+2000));
#elif TEST_LIST == 1
    p_s = (void *)((uint8_t *)(buf) +1 );
    p_d = (void *)((uint8_t *)(buf+2000));
#elif TEST_LIST == 2
    p_s = (void *)buf;
    p_d = (void *)((uint8_t *)(buf+2000) +1 );
#elif TEST_LIST == 3
    p_s = (void *)((uint8_t *)(buf) +1 );
    p_d = (void *)((uint8_t *)(buf+2000) +1 );
#elif TEST_LIST == 4
    p_s = (void *)((uint8_t *)(buf) +1 );
    p_d = (void *)((uint8_t *)(buf+2000) +2 );
#elif TEST_LIST == 5
    p_s = (void *)((uint8_t *)(buf) +1 );
    p_d = (void *)((uint8_t *)(buf+2000) +3 );
#endif
    printf("start\r\n");      /* breakpoint "start" */
    memcpy(p_d, p_s, 1000);   /* memcpy takes dest first, then src */
    printf("end\r\n");        /* breakpoint "end" */
}
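To compare the rows of the table, each cycle count can be expressed as a percentage above the fully aligned case (test 0, 2203 cycles). The following is a minimal sketch of that arithmetic; the function name percent_slower is hypothetical and was not part of the original test:

```c
#include <assert.h>

/* Hypothetical helper: how many percent more cycles a run took
   compared with a baseline run. */
double percent_slower(unsigned baseline_cycles, unsigned cycles)
{
    assert(baseline_cycles > 0);
    return 100.0 * ((double)cycles - (double)baseline_cycles)
                 / (double)baseline_cycles;
}
```

With the baseline of 2203 cycles, percent_slower(2203, 2765) comes out at roughly 25.5 and percent_slower(2203, 2229) at roughly 1.2.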
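Conclusion
- The copy is fastest when both the destination and the source are 4-byte aligned (2203 cycles).
- When the destination and the source have the same offset from a 4-byte boundary, the copy is second fastest (2229 cycles). Presumably memcpy first copies single bytes until both pointers are aligned and then copies the rest 4 bytes at a time.
- In all other cases, where the two offsets differ, the copy is slower than the fully aligned case and takes about 25% more cycles (2763 to 2783).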
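The strategy guessed at in the second point (byte copies up to a word boundary, then word copies) can be sketched as follows. This is only an illustration under stated assumptions, not the IAR library implementation: the name sketch_memcpy is hypothetical, the memory is assumed to be backed by uint32_t storage as buf is in the test, and for simplicity the sketch falls back to plain byte copies when the two offsets differ, which the real library may handle differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration of a head/body/tail copy. */
void *sketch_memcpy(void *dest, const void *src, size_t count)
{
    uint8_t *d = (uint8_t *)dest;
    const uint8_t *s = (const uint8_t *)src;

    assert(count == 0 || (dest != NULL && src != NULL));

    /* Word copies are only possible when both pointers have the
       same offset from a 4-byte boundary. */
    if ((((uintptr_t)d ^ (uintptr_t)s) & 3u) == 0) {
        /* Head: single bytes until d (and therefore s) is aligned. */
        while (count > 0 && ((uintptr_t)d & 3u) != 0) {
            *d++ = *s++;
            count--;
        }
        /* Body: one 32-bit load and store per 4 bytes.
           Assumes uint32_t-backed memory, as in the test buffer. */
        while (count >= 4) {
            *(uint32_t *)(void *)d = *(const uint32_t *)(const void *)s;
            d += 4;
            s += 4;
            count -= 4;
        }
    }

    /* Tail, or the whole range when the offsets differ. */
    while (count > 0) {
        *d++ = *s++;
        count--;
    }
    return dest;
}
```

With equal offsets, at most 3 head bytes and 3 tail bytes are copied one at a time, which would be consistent with test 3 costing only slightly more than test 0.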