pwnable: memcpy

Problem description

Are you tired of hacking?, take some rest here.
Just help me out with my small experiment regarding memcpy performance. 
after that, flag is yours.

http://pwnable.kr/bin/memcpy.c

memcpy.c

// compiled with : gcc -o memcpy memcpy.c -m32 -lm
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <sys/mman.h>
#include <math.h>

unsigned long long rdtsc(){
        asm("rdtsc");
}

char* slow_memcpy(char* dest, const char* src, size_t len){
    int i;
    for (i=0; i<len; i++) {
        dest[i] = src[i];
    }
    return dest;
}

char* fast_memcpy(char* dest, const char* src, size_t len){
    size_t i;
    // 64-byte block fast copy
    if(len >= 64){
        i = len / 64;
        len &= (64-1);
        while(i-- > 0){
            __asm__ __volatile__ (
            "movdqa (%0), %%xmm0\n"
            "movdqa 16(%0), %%xmm1\n"
            "movdqa 32(%0), %%xmm2\n"
            "movdqa 48(%0), %%xmm3\n"
            "movntps %%xmm0, (%1)\n"
            "movntps %%xmm1, 16(%1)\n"
            "movntps %%xmm2, 32(%1)\n"
            "movntps %%xmm3, 48(%1)\n"
            ::"r"(src),"r"(dest):"memory");
            dest += 64;
            src += 64;
        }
    }

    // byte-to-byte slow copy
    if(len) slow_memcpy(dest, src, len);
    return dest;
}

int main(void){

    setvbuf(stdout, 0, _IONBF, 0);
    setvbuf(stdin, 0, _IOLBF, 0);

    printf("Hey, I have a boring assignment for CS class.. :(\n");
    printf("The assignment is simple.\n");

    printf("-----------------------------------------------------\n");
    printf("- What is the best implementation of memcpy?        -\n");
    printf("- 1. implement your own slow/fast version of memcpy -\n");
    printf("- 2. compare them with various size of data         -\n");
    printf("- 3. conclude your experiment and submit report     -\n");
    printf("-----------------------------------------------------\n");

    printf("This time, just help me out with my experiment and get flag\n");
    printf("No fancy hacking, I promise :D\n");

    unsigned long long t1, t2;
    int e;
    char* src;
    char* dest;
    unsigned int low, high;
    unsigned int size;
    // allocate memory
    char* cache1 = mmap(0, 0x4000, 7, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    char* cache2 = mmap(0, 0x4000, 7, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    src = mmap(0, 0x2000, 7, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

    size_t sizes[10];
    int i=0;

    // setup experiment parameters
    for(e=4; e<14; e++){    // 2^13 = 8K
        low = pow(2,e-1);
        high = pow(2,e);
        printf("specify the memcpy amount between %d ~ %d : ", low, high);
        scanf("%d", &size);
        if( size < low || size > high ){
            printf("don't mess with the experiment.\n");
            exit(0);
        }
        sizes[i++] = size;
    }

    sleep(1);
    printf("ok, lets run the experiment with your configuration\n");
    sleep(1);

    // run experiment
    for(i=0; i<10; i++){
        size = sizes[i];
        printf("experiment %d : memcpy with buffer size %d\n", i+1, size);
        dest = malloc( size );

        memcpy(cache1, cache2, 0x4000);     // to eliminate cache effect
        t1 = rdtsc();
        slow_memcpy(dest, src, size);       // byte-to-byte memcpy
        t2 = rdtsc();
        printf("ellapsed CPU cycles for slow_memcpy : %llu\n", t2-t1);

        memcpy(cache1, cache2, 0x4000);     // to eliminate cache effect
        t1 = rdtsc();
        fast_memcpy(dest, src, size);       // block-to-block memcpy
        t2 = rdtsc();
        printf("ellapsed CPU cycles for fast_memcpy : %llu\n", t2-t1);
        printf("\n");
    }

    printf("thanks for helping my experiment!\n");
    printf("flag : ----- erased in this source code -----\n");
    return 0;
}

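Brief analysis
slow_memcpy copies one byte at a time. fast_memcpy copies 64-byte blocks through the XMM registers, using non-temporal stores (movntps) that bypass the cache. Anything shorter than 64 bytes, including the tail left over after the 64-byte blocks, is handed to slow_memcpy.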
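To make the split concrete, here is a small Python sketch of the length arithmetic that fast_memcpy performs. The helper name split_len is made up for illustration; only the two expressions in the comments come from memcpy.c.

```python
def split_len(length):
    """Mirror the length arithmetic in fast_memcpy (hypothetical helper)."""
    if length >= 64:
        blocks = length // 64        # i = len / 64
        tail = length & (64 - 1)     # len &= (64-1)
    else:
        blocks, tail = 0, length     # below 64 bytes only slow_memcpy runs
    return blocks, tail

# (number of 64-byte SSE blocks, bytes left for slow_memcpy)
for n in (8, 63, 64, 76, 128):
    print(n, split_len(n))
```

Only sizes of 64 and above ever reach the movdqa/movntps path, so the first three experiments (8 to 64 bytes chosen as 8, 16, 32) cannot be affected by anything specific to the SSE instructions.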

Compile, link, and run
For this run, one extra line was added after each allocation of dest: printf("dest addr :%p\n",dest);

$ gcc -o memcpy memcpy.c -m32 -lm
$ ./memcpy 
Hey, I have a boring assignment for CS class.. :(
The assignment is simple.
-----------------------------------------------------
- What is the best implementation of memcpy?        -
- 1. implement your own slow/fast version of memcpy -
- 2. compare them with various size of data         -
- 3. conclude your experiment and submit report     -
-----------------------------------------------------
This time, just help me out with my experiment and get flag
No fancy hacking, I promise :D
specify the memcpy amount between 8 ~ 16 : 8
specify the memcpy amount between 16 ~ 32 : 16
specify the memcpy amount between 32 ~ 64 : 32
specify the memcpy amount between 64 ~ 128 : 64
specify the memcpy amount between 128 ~ 256 : 128
specify the memcpy amount between 256 ~ 512 : 256
specify the memcpy amount between 512 ~ 1024 : 512
specify the memcpy amount between 1024 ~ 2048 : 1024
specify the memcpy amount between 2048 ~ 4096 : 2048
specify the memcpy amount between 4096 ~ 8192 : 4096
ok, lets run the experiment with your configuration
experiment 1 : memcpy with buffer size 8
ellapsed CPU cycles for slow_memcpy : 4620
dest addr :0x57f46410
ellapsed CPU cycles for fast_memcpy : 21792

experiment 2 : memcpy with buffer size 16
ellapsed CPU cycles for slow_memcpy : 828
dest addr :0x57f46420
ellapsed CPU cycles for fast_memcpy : 23100

experiment 3 : memcpy with buffer size 32
ellapsed CPU cycles for slow_memcpy : 768
dest addr :0x57f46438
ellapsed CPU cycles for fast_memcpy : 12456

experiment 4 : memcpy with buffer size 64
ellapsed CPU cycles for slow_memcpy : 1932
dest addr :0x57f46460
ellapsed CPU cycles for fast_memcpy : 14880

experiment 5 : memcpy with buffer size 128
ellapsed CPU cycles for slow_memcpy : 3192
dest addr :0x57f464a8
Segmentation fault

Debugging

$ gdb memcpy -q
Reading symbols from memcpy...(no debugging symbols found)...done.
gdb-peda$ set disassembly-flavor intel
gdb-peda$ r
Starting program: /home/pwd/Desktop/pwdmylife/pwnable/memcpy/memcpy 
Hey, I have a boring assignment for CS class.. :(
The assignment is simple.
-----------------------------------------------------
- What is the best implementation of memcpy?        -
- 1. implement your own slow/fast version of memcpy -
- 2. compare them with various size of data         -
- 3. conclude your experiment and submit report     -
-----------------------------------------------------
This time, just help me out with my experiment and get flag
No fancy hacking, I promise :D
specify the memcpy amount between 8 ~ 16 : 8
specify the memcpy amount between 16 ~ 32 : 16
specify the memcpy amount between 32 ~ 64 : 32
specify the memcpy amount between 64 ~ 128 : 64
specify the memcpy amount between 128 ~ 256 : 128
specify the memcpy amount between 256 ~ 512 : 256
specify the memcpy amount between 512 ~ 1024 : 512
specify the memcpy amount between 1024 ~ 2048 : 1024
specify the memcpy amount between 2048 ~ 4096 : 2048
specify the memcpy amount between 4096 ~ 8192 : 4096
ok, lets run the experiment with your configuration
experiment 1 : memcpy with buffer size 8
ellapsed CPU cycles for slow_memcpy : 5376
dest addr :0x56559410
ellapsed CPU cycles for fast_memcpy : 50632

experiment 2 : memcpy with buffer size 16
ellapsed CPU cycles for slow_memcpy : 544
dest addr :0x56559420
ellapsed CPU cycles for fast_memcpy : 20176

experiment 3 : memcpy with buffer size 32
ellapsed CPU cycles for slow_memcpy : 672
dest addr :0x56559438
ellapsed CPU cycles for fast_memcpy : 14136

experiment 4 : memcpy with buffer size 64
ellapsed CPU cycles for slow_memcpy : 1184
dest addr :0x56559460
ellapsed CPU cycles for fast_memcpy : 13944

experiment 5 : memcpy with buffer size 128
ellapsed CPU cycles for slow_memcpy : 2040
dest addr :0x565594a8


Program received signal SIGSEGV, Segmentation fault.

[----------------------------------registers-----------------------------------]
EAX: 0xf7fc8000 --> 0x0 
EBX: 0x56558000 --> 0x2ee8 
ECX: 0xffff9790 ("dest addr :0x565594a8\nr slow_memcpy : 2040\n\n2 : ")
EDX: 0x565594a8 --> 0x0 
ESI: 0x1 
EDI: 0xf7f55000 --> 0x1b2db0 
EBP: 0xffffbca8 --> 0xffffbd38 --> 0x0 
ESP: 0xffffbc98 --> 0xffffbcb4 --> 0xf7fc8000 --> 0x0 
EIP: 0x5655588f (<fast_memcpy+62>:  movntps XMMWORD PTR [edx],xmm0)
EFLAGS: 0x10202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x56555880 <fast_memcpy+47>: movdqa xmm1,XMMWORD PTR [eax+0x10]
   0x56555885 <fast_memcpy+52>: movdqa xmm2,XMMWORD PTR [eax+0x20]
   0x5655588a <fast_memcpy+57>: movdqa xmm3,XMMWORD PTR [eax+0x30]
=> 0x5655588f <fast_memcpy+62>: movntps XMMWORD PTR [edx],xmm0
   0x56555892 <fast_memcpy+65>: movntps XMMWORD PTR [edx+0x10],xmm1
   0x56555896 <fast_memcpy+69>: movntps XMMWORD PTR [edx+0x20],xmm2
   0x5655589a <fast_memcpy+73>: movntps XMMWORD PTR [edx+0x30],xmm3
   0x5655589e <fast_memcpy+77>: add    DWORD PTR [ebp+0x8],0x40
[------------------------------------stack-------------------------------------]
0000| 0xffffbc98 --> 0xffffbcb4 --> 0xf7fc8000 --> 0x0 
0004| 0xffffbc9c --> 0xf7e13a25 (<__GI___libc_malloc+197>:  test   eax,eax)
0008| 0xffffbca0 --> 0x56558000 --> 0x2ee8 
0012| 0xffffbca4 --> 0x1 
0016| 0xffffbca8 --> 0xffffbd38 --> 0x0 
0020| 0xffffbcac ("O\\UV\250\224UV")
0024| 0xffffbcb0 --> 0x565594a8 --> 0x0 
0028| 0xffffbcb4 --> 0xf7fc8000 --> 0x0 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x5655588f in fast_memcpy ()

Analysis
Reference material found online for the SSE instruction movntps:

movntps m128,XMM    m128 <== XMM    Stores the value in XMM directly to m128, bypassing the cache; m128 must be 16-byte aligned.

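Here edx holds the start address of dest (0x565594a8 in the dump above). 16-byte alignment requires the low 4 bits of that address to be 0, but this address ends in 0x8, so the store faults. The dest buffer comes from the heap via malloc.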
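The alignment test can be written as a one-line bit mask. The sketch below applies it to the two register values taken from the gdb dump; the function name is_aligned16 is only illustrative.

```python
def is_aligned16(addr):
    """True when the low 4 bits of addr are zero, i.e. addr is a multiple of 16."""
    return (addr & 0xF) == 0

# Values copied from the gdb register dump.
edx_dest = 0x565594a8   # destination operand of the faulting movntps
eax_src = 0xf7fc8000    # source operand of the movdqa loads

print(hex(edx_dest & 0xF), is_aligned16(edx_dest))
print(hex(eax_src & 0xF), is_aligned16(eax_src))
```

The source buffer comes from mmap and is page aligned, so the movdqa loads are fine; only the malloc'ed destination is a problem.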
Heap chunk layout on 32-bit:

|   4 bytes (prev_size)         | 4 bytes (size + 3 flag bits A|M|P) |
|       data                    |                                    |
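The dest addresses printed in the first run can be reproduced from this layout. The sketch below assumes the usual 32-bit glibc sizing rule with 8-byte malloc alignment (request plus the 4-byte size field, rounded up to a multiple of 8, minimum 16) and assumes chunks are carved back to back; both are assumptions about the allocator, not something memcpy.c states. The function names are made up for illustration.

```python
def chunk_size(request):
    """Assumed 32-bit glibc rule: add the 4-byte size field, round up to 8, minimum 16."""
    return max(16, (request + 4 + 7) & ~7)

def dest_addresses(first_dest, sizes):
    """Predict successive malloc results, assuming chunks are laid out back to back."""
    addrs, addr = [], first_dest
    for size in sizes:
        addrs.append(addr)
        addr += chunk_size(size)
    return addrs

# dest addresses printed by the first run in this write-up
observed = [0x57f46410, 0x57f46420, 0x57f46438, 0x57f46460, 0x57f464a8]
predicted = dest_addresses(0x57f46410, [8, 16, 32, 64, 128])

for p, o in zip(predicted, observed):
    print(hex(p), hex(o), "aligned" if p % 16 == 0 else "misaligned")
```

Under these assumptions the predicted addresses match the observed ones, and the fifth buffer lands on an address ending in 0x8, which is where the run crashed.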

Solution
Request sizes chosen so that the low 4 bits of every dest address are 0.

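#!/usr/bin/env python3
# coding: utf-8
# made by pwd

def chunk_size(n):
    # 32-bit glibc with 8-byte malloc alignment, as the dest addresses above suggest:
    # add the 4-byte size field, round up to a multiple of 8, minimum 16
    return max(16, (n + 4 + 7) & ~7)

for e in range(4, 14):                  # same bounds as the loop in memcpy.c
    low, high = 2 ** (e - 1), 2 ** e
    print("###########", low, "~", high)
    for n in range(low, high + 1):
        # a chunk size that is a multiple of 16 keeps the next dest 16-byte aligned
        if chunk_size(n) % 16 == 0:
            print(n)
            break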
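Before typing a set of sizes into the program, it can be sanity-checked offline. The sketch below is a rough model, not the author's method: it assumes the 32-bit glibc chunk rule (request + 4, rounded up to 8, minimum 16), back-to-back chunks, and a 16-byte-aligned first dest as seen in both runs above. The name first_fault and the candidate list are hypothetical.

```python
def chunk_size(request):
    """Assumed 32-bit glibc rule: add the 4-byte size field, round up to 8, minimum 16."""
    return max(16, (request + 4 + 7) & ~7)

def first_fault(sizes, first_dest=0x57f46410):
    """Return the 1-based experiment expected to fault in fast_memcpy, or None."""
    addr = first_dest
    for number, size in enumerate(sizes, start=1):
        # movntps only runs for sizes >= 64 and needs a 16-byte-aligned dest
        if size >= 64 and addr % 16 != 0:
            return number
        addr += chunk_size(size)
    return None

powers = [2 ** e for e in range(3, 13)]   # 8, 16, ..., 4096: the failing run
# Hypothetical answer: smallest value in each range whose chunk size is a multiple of 16.
candidate = [8, 21, 37, 69, 133, 261, 517, 1029, 2053, 4101]

print(first_fault(powers))
print(first_fault(candidate))
```

For the powers of two the model predicts a fault at experiment 5, which matches the crash shown earlier; for the candidate list it predicts none. On a libc whose malloc already returns 16-byte-aligned pointers, any in-range sizes would pass.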