在测试内存(AEP,6*256GB interleaved dax)性能的时候,发现通过8B循环写的带宽大概是4GB/s,然后无意间用了一下memcpy,发现带宽达到了10GB/s,就顺便研究了一下memcpy函数,做个记录如下:
glibc的memcpy函数实现如下
void *
memcpy (void *dstpp, const void *srcpp, size_t len)
{
unsigned long int dstp = (long int) dstpp;
unsigned long int srcp = (long int) srcpp;
/* Copy from the beginning to the end. */
/* If there not too few bytes to copy, use word copy. */
if (len >= OP_T_THRES)
{
/* Copy just a few bytes to make DSTP aligned. */
len -= (-dstp) % OPSIZ;
BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);
/* Copy whole pages from SRCP to DSTP by virtual address manipulation,
as much as possible. */
PAGE_COPY_FWD_MAYBE (dstp, srcp, len, len);
/* Copy from SRCP to DSTP taking advantage of the known alignment of
DSTP. Number of bytes remaining is put in the third argument,
i.e. in LEN. This number may va