Android pseudo-encryption and how to work around it


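As we all know, an Android APK is just a file in ZIP format. For work I often need to unpack APKs to get at the resources inside, but lately many APKs fail to extract with all sorts of archive tools, even though they install fine and aapt2 can still list their contents. I used to batch-extract them with Python's zip support, and having to install every APK instead would be unbearable, so I studied the zip extraction library in the Android 11 source to see what it does differently.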

ZIP format

The official specification is at https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.2.0.txt and is the place to look for the full details of the format.

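Roughly speaking, a ZIP file has three parts. The first part holds the file data. The second part is the central directory, which stores the metadata for every file in the first part. The last part is the end of central directory record (EOCD), which serves two purposes: it marks the end of the ZIP file, and it records where the central directory is. This is why a ZIP file is parsed from back to front.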

End of central directory record (EOCD)
  I.  End of central directory record:

        end of central dir signature    4 bytes  (0x06054b50) // 4-byte signature 0x06054b50, used to locate the EOCD
        number of this disk             2 bytes // number of the current disk
        number of the disk with the
        start of the central directory  2 bytes // disk on which the central directory starts
        total number of entries in the
        central directory on this disk  2 bytes // central directory entries stored on this disk
        total number of entries in
        the central directory           2 bytes // total number of central directory entries
        size of the central directory   4 bytes // size of the central directory in bytes
        offset of start of central
        directory with respect to
        the starting disk number        4 bytes // offset of the central directory's start, relative to its starting disk
        .ZIP file comment length        2 bytes // comment length
        .ZIP file comment       (variable size) // comment text

The first step in extracting a ZIP file is to read the central directory's start offset and size from the EOCD.
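To make the field list above concrete, here is a minimal Python sketch that unpacks the 22-byte EOCD with the standard `struct` module. It assumes the archive has no trailing comment, so the EOCD is simply the last 22 bytes; `parse_eocd` and `make_sample_zip` are names invented for this illustration and are not part of any library.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = 0x06054B50
# Little-endian, no padding: signature, four 2-byte fields,
# two 4-byte fields, and the 2-byte comment length (22 bytes in total).
EOCD_FORMAT = "<IHHHHIIH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)


def parse_eocd(data: bytes) -> dict:
    """Parse the EOCD of an archive that has no trailing comment (assumption)."""
    (signature, _disk, _cd_disk, _entries_on_disk,
     total_entries, cd_size, cd_offset, comment_len) = struct.unpack(
        EOCD_FORMAT, data[-EOCD_SIZE:])
    if signature != EOCD_SIGNATURE:
        raise ValueError("EOCD signature not found at the end of the file")
    return {
        "total_entries": total_entries,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment_len": comment_len,
    }


def make_sample_zip() -> bytes:
    """Build a small two-entry archive in memory for experimenting."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "hello")
        zf.writestr("b.txt", "world")
    return buf.getvalue()


if __name__ == "__main__":
    print(parse_eocd(make_sample_zip()))
```

An archive with a comment needs a backward search for the signature instead of a fixed offset, which is what the Android code discussed later does.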

central directory
 Central directory structure:

      [file header 1]
      .
      .
      . 
      [file header n]
      [digital signature] 

      File header:

        central file header signature   4 bytes  (0x02014b50) // magic number
        version made by                 2 bytes // version of the tool that created the entry
        version needed to extract       2 bytes // minimum version needed to extract
        general purpose bit flag        2 bytes // general-purpose flags; lowest bit 1 = encrypted, 0 = not encrypted
        compression method              2 bytes // compression method
        last mod file time              2 bytes // last modification time
        last mod file date              2 bytes // last modification date
        crc-32                          4 bytes // CRC-32 checksum
        compressed size                 4 bytes // size after compression
        uncompressed size               4 bytes // size before compression
        file name length                2 bytes // file name length
        extra field length              2 bytes // extra field length
        file comment length             2 bytes // file comment length
        disk number start               2 bytes // disk on which the file starts
        internal file attributes        2 bytes // internal file attributes
        external file attributes        4 bytes // external file attributes
        relative offset of local header 4 bytes // relative offset of the local file header

        file name (variable size)       // entry file name
        extra field (variable size)     // extra field
        file comment (variable size)    // file comment text

      Digital signature:

        header signature                4 bytes  (0x05054b50)
        size of data                    2 bytes
        signature data (variable size)

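The central directory is made up of file headers. Each file header describes one file: its name, and where its data is and how large it is, so the next step is to go to the data section and extract the file. `general purpose bit flag & 0x01` yields the lowest bit, which says whether the entry is encrypted. Setting that bit to 1 is the simplest form of pseudo-encryption: nothing was encrypted and no password was set when the package was built, only the flag was changed. Android does not read this flag when installing, whereas many zip libraries and archive tools use it to decide whether a password is required, so flipping it is enough to block extraction.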
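One way to undo this kind of pseudo-encryption is to clear bit 0 again in every header. The sketch below is a minimal illustration under a few assumptions: the archive is small enough to hold in memory, it is not ZIP64, and the last occurrence of the EOCD signature is the real one. `set_encryption_bit` is a hypothetical helper written for this article, and the offsets 8 and 6 come from the header layouts in the specification.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"     # 0x06054b50 as it appears on disk (little-endian)
CENTRAL_SIG = b"PK\x01\x02"  # 0x02014b50
LOCAL_SIG = b"PK\x03\x04"    # 0x04034b50
ENCRYPTED = 0x0001           # bit 0 of the general purpose bit flag


def set_encryption_bit(data: bytes, encrypted: bool) -> bytes:
    """Return a copy of the archive with bit 0 of every flag word set or cleared."""
    out = bytearray(data)
    eocd = out.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("EOCD not found")
    # total entries (offset 10), central directory size (12) and offset (16)
    total_entries, _cd_size, cd_offset = struct.unpack_from("<HII", out, eocd + 10)
    pos = cd_offset
    for _ in range(total_entries):
        if out[pos:pos + 4] != CENTRAL_SIG:
            raise ValueError("bad central directory signature")
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", out, pos + 28)
        (local_offset,) = struct.unpack_from("<I", out, pos + 42)
        if out[local_offset:local_offset + 4] != LOCAL_SIG:
            raise ValueError("bad local file header signature")
        # The flag word is 8 bytes into a central header, 6 bytes into a local header.
        for flag_pos in (pos + 8, local_offset + 6):
            (flags,) = struct.unpack_from("<H", out, flag_pos)
            flags = (flags | ENCRYPTED) if encrypted else (flags & ~ENCRYPTED)
            struct.pack_into("<H", out, flag_pos, flags)
        pos += 46 + name_len + extra_len + comment_len
    return bytes(out)


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("res/a.txt", "hello")
    fake = set_encryption_bit(buf.getvalue(), True)    # simulate pseudo-encryption
    fixed = set_encryption_bit(fake, False)            # undo it
    print(zipfile.ZipFile(io.BytesIO(fixed)).read("res/a.txt"))
```

Only the flag bit is touched, so this does nothing for an archive that is genuinely encrypted. It only helps when the data itself was never encrypted.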

    [local file header 1]
    [file data 1]
    [data descriptor 1]
    . 
    .
    .
    [local file header n]
    [file data n]
    [data descriptor n]
    
A.  Local file header:

        local file header signature     4 bytes  (0x04034b50) // signature
        version needed to extract       2 bytes // minimum version needed to extract
        general purpose bit flag        2 bytes // general-purpose bit flag
        compression method              2 bytes // compression method
        last mod file time              2 bytes // last modification time
        last mod file date              2 bytes // last modification date
        crc-32                          4 bytes // CRC-32 checksum
        compressed size                 4 bytes // size after compression
        uncompressed size               4 bytes // size before compression
        file name length                2 bytes // file name length
        extra field length              2 bytes // extra field length

        file name (variable size) // file name
        extra field (variable size) // extra field

  B.  File data

      Immediately following the local header for a file
      is the compressed or stored data for the file. 
      The series of [local file header][file data][data
      descriptor] repeats for each file in the .ZIP archive. 

  C.  Data descriptor: // usually not present

        crc-32                          4 bytes
        compressed size                 4 bytes
        uncompressed size               4 bytes

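As you can see, the local file header is almost identical to the matching entry in the central directory. The file data follows immediately after the local file header, and with the data length and the compression method you can decompress it.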
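As a rough illustration of that last step, the following Python sketch reads one entry straight from its local file header. It assumes the offset, compressed size, and method were already taken from the central directory, and it only handles the two methods an APK normally uses (0 = stored, 8 = deflate). `read_entry` is a name made up for this example.

```python
import io
import struct
import zipfile
import zlib

LOCAL_SIG = 0x04034B50
LOCAL_HEADER_FORMAT = "<IHHHHHIIIHH"  # the fixed 30-byte part of a local file header
STORED, DEFLATED = 0, 8


def read_entry(data: bytes, local_offset: int, compressed_size: int, method: int) -> bytes:
    """Return one entry's contents, using values taken from its central directory record."""
    fields = struct.unpack_from(LOCAL_HEADER_FORMAT, data, local_offset)
    signature, name_len, extra_len = fields[0], fields[9], fields[10]
    if signature != LOCAL_SIG:
        raise ValueError("bad local file header signature")
    # The data starts right after the fixed header, the file name and the extra field.
    start = local_offset + struct.calcsize(LOCAL_HEADER_FORMAT) + name_len + extra_len
    raw = data[start:start + compressed_size]
    if method == STORED:
        return raw
    if method == DEFLATED:
        # ZIP stores a raw deflate stream; negative wbits tells zlib to expect no header.
        return zlib.decompress(raw, -15)
    raise NotImplementedError(f"compression method {method}")


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.txt", "hello " * 50)
    data = buf.getvalue()
    info = zipfile.ZipFile(io.BytesIO(data)).getinfo("a.txt")
    print(read_entry(data, info.header_offset, info.compress_size, info.compress_type)[:12])
```

The sizes are taken from the central directory rather than from the local header because, when a data descriptor is present, the local header may carry zeros in those fields.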

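How Android extracts a ZIP

In frameworks, files can be extracted through frameworks/base/libs/androidfw/ZipUtils.cpp. A closer look at the code shows that this class is only a wrapper around functions from the ziparchive library, and every call ends up there. The source for that library lives in system/core/libziparchive/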

system/core/libziparchive/zip_archive.cc

int32_t OpenArchive(const char* fileName, ZipArchiveHandle* handle) {
  const int fd = open(fileName, O_RDONLY | O_BINARY, 0);
  ZipArchive* archive = new ZipArchive(fd, true);
  *handle = archive;

  if (fd < 0) {
    ALOGW("Unable to open '%s': %s", fileName, strerror(errno));
    return kIoError;
  }

  return OpenArchiveInternal(archive, fileName);
}
  1. Open the file at the given path to get an fd
  2. Create a ZipArchive object
  3. Call OpenArchiveInternal to parse the file
static int32_t OpenArchiveInternal(ZipArchive* archive, const char* debug_file_name) {
  int32_t result = -1;
  if ((result = MapCentralDirectory(debug_file_name, archive)) != 0) {  // parse the EOCD to get the central directory's location and related info
    return result;
  }

  if ((result = ParseZipArchive(archive))) {  // parse the central directory entries
    return result;
  }

  return 0;
}

This is where the central directory finally shows up. Next, let's see how MapCentralDirectory gets hold of it.

/*
 * Find the zip Central Directory and memory-map it.
 *
 * On success, returns 0 after populating fields from the EOCD area:
 *   directory_offset
 *   directory_ptr
 *   num_entries
 */
static int32_t MapCentralDirectory(const char* debug_file_name, ZipArchive* archive) {
    
  // some error-handling code omitted
  /*
   * Perform the traditional EOCD snipe hunt.
   *
   * We're searching for the End of Central Directory magic number,
   * which appears at the start of the EOCD block.  It's followed by
   * 18 bytes of EOCD stuff and up to 64KB of archive comment.  We
   * need to read the last part of the file into a buffer, dig through
   * it to find the magic number, parse some values out, and use those
   * to determine the extent of the CD.
   *
   * We start by pulling in the last part of the file.
   */
  off64_t read_amount = kMaxEOCDSearch;
  if (file_length < read_amount) {
    read_amount = file_length;
  }

  std::vector<uint8_t> scan_buffer(read_amount);
  int32_t result =
      MapCentralDirectory0(debug_file_name, archive, file_length, read_amount, scan_buffer.data());
  return result;
}

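This function only does some error handling; the actual parsing is done by MapCentralDirectory0. The error handling (omitted above) refers to the familiar EocdRecord, which is the struct that describes the EOCD.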

static int32_t MapCentralDirectory0(const char* debug_file_name, ZipArchive* archive,
                                    off64_t file_length, off64_t read_amount, uint8_t* scan_buffer) {
  const off64_t search_start = file_length - read_amount;

  if (!archive->mapped_zip.ReadAtOffset(scan_buffer, read_amount, search_start)) {
    ALOGE("Zip: read %" PRId64 " from offset %" PRId64 " failed", static_cast<int64_t>(read_amount),
          static_cast<int64_t>(search_start));
    return kIoError;
  }

  /*
   * Scan backward for the EOCD magic.  In an archive without a trailing
   * comment, we'll find it on the first try.  (We may want to consider
   * doing an initial minimal read; if we don't find it, retry with a
   * second read as above.)
   */
  // loop backward looking for the EOCD
  int i = read_amount - sizeof(EocdRecord);
  for (; i >= 0; i--) {
    if (scan_buffer[i] == 0x50) {
      uint32_t* sig_addr = reinterpret_cast<uint32_t*>(&scan_buffer[i]);
      if (get_unaligned<uint32_t>(sig_addr) == EocdRecord::kSignature) {  // kSignature = 0x06054b50; the signature identifies the EOCD
        ALOGV("+++ Found EOCD at buf+%d", i);
        break;
      }
    }
  }
  
  if (i < 0) {
    ALOGD("Zip: EOCD not found, %s is not zip", debug_file_name);
    return kInvalidFile;
  }

  const off64_t eocd_offset = search_start + i;
  const EocdRecord* eocd = reinterpret_cast<const EocdRecord*>(scan_buffer + i);  // view the bytes as an EocdRecord, the struct that mirrors the ZIP EOCD layout
  /*
   * Verify that there's no trailing space at the end of the central directory
   * and its comment.
   */
  const off64_t calculated_length = eocd_offset + sizeof(EocdRecord) + eocd->comment_length;
  if (calculated_length != file_length) {
    ALOGW("Zip: %" PRId64 " extraneous bytes at the end of the central directory",
          static_cast<int64_t>(file_length - calculated_length));
    return kInvalidFile;
  }

  /*
   * Grab the CD offset and size, and the number of entries in the
   * archive and verify that they look reasonable.
   */
  if (static_cast<off64_t>(eocd->cd_start_offset) + eocd->cd_size > eocd_offset) {
    ALOGW("Zip: bad offsets (dir %" PRIu32 ", size %" PRIu32 ", eocd %" PRId64 ")",
          eocd->cd_start_offset, eocd->cd_size, static_cast<int64_t>(eocd_offset));
#if defined(__ANDROID__)
    if (eocd->cd_start_offset + eocd->cd_size <= eocd_offset) {
      android_errorWriteLog(0x534e4554, "31251826");
    }
#endif
    return kInvalidOffset;
  }
  if (eocd->num_records == 0) {
    ALOGW("Zip: empty archive?");
    return kEmptyArchive;
  }

  // All sanity checks have passed: the EOCD is valid and gives the number of file headers in the central directory
  ALOGV("+++ num_entries=%" PRIu32 " dir_size=%" PRIu32 " dir_offset=%" PRIu32, eocd->num_records,
        eocd->cd_size, eocd->cd_start_offset);

  /*
   * It all looks good.  Create a mapping for the CD, and set the fields
   * in archive.
   */
  // InitializeCentralDirectory sets up and stores the central directory state
  if (!archive->InitializeCentralDirectory(debug_file_name,
                                           static_cast<off64_t>(eocd->cd_start_offset),
                                           static_cast<size_t>(eocd->cd_size))) {
    ALOGE("Zip: failed to intialize central directory.\n");
    return kMmapFailed;
  }

  archive->num_entries = eocd->num_records;
  archive->directory_offset = eocd->cd_start_offset;

  return 0;
}
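  1. The search for the EOCD starts at file_length - read_amount. read_amount is the largest size the EOCD can have, so the EOCD is looked for within the last read_amount bytes of the file
  2. After the various sanity checks, the EOCD that was found is known to be valid. This is also where many pseudo-encryption tricks are applied: Android searches the whole read_amount region, whereas many libraries and archive tools assume by default that there is no comment or extra data
  3. InitializeCentralDirectory sets up the central directory and stores the related fields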
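The search and the checks can be sketched in Python as follows. This is a simplified re-creation for illustration, not a port of the AOSP code: `find_eocd` is a hypothetical name, and `MAX_EOCD_SEARCH` is assumed to be the 22-byte record plus the largest possible comment (65535 bytes), matching the "up to 64KB of archive comment" note in the source comment.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"
EOCD_SIZE = 22
MAX_COMMENT = 0xFFFF
MAX_EOCD_SEARCH = EOCD_SIZE + MAX_COMMENT  # assumed stand-in for kMaxEOCDSearch


def find_eocd(data: bytes) -> int:
    """Return the EOCD offset, applying checks similar to MapCentralDirectory0."""
    file_length = len(data)
    read_amount = min(file_length, MAX_EOCD_SEARCH)
    search_start = file_length - read_amount
    # Scan backward, starting at the last position where a full EOCD could fit.
    i = read_amount - EOCD_SIZE
    while i >= 0 and data[search_start + i:search_start + i + 4] != EOCD_SIG:
        i -= 1
    if i < 0:
        raise ValueError("EOCD not found, not a zip")
    eocd_offset = search_start + i
    cd_size, cd_offset, comment_len = struct.unpack_from("<IIH", data, eocd_offset + 12)
    # Nothing may follow the EOCD and its comment.
    if eocd_offset + EOCD_SIZE + comment_len != file_length:
        raise ValueError("extraneous bytes after the EOCD")
    # The central directory must end at or before the EOCD.
    if cd_offset + cd_size > eocd_offset:
        raise ValueError("bad central directory offsets")
    return eocd_offset


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "hello")
        zf.comment = b"some archive comment"
    data = buf.getvalue()
    print(find_eocd(data), len(data))
```

A tool that only looks at the last 22 bytes would miss the EOCD in this sample because of the comment, which is the kind of mismatch point 2 describes.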

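Back in OpenArchiveInternal, once MapCentralDirectory has collected this information, ParseZipArchive is called to do the parsing.

// The function is long, so part of the error-handling code has been removed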
static int32_t ParseZipArchive(ZipArchive* archive) {
  const uint8_t* const cd_ptr = archive->central_directory.GetBasePtr();
  const size_t cd_length = archive->central_directory.GetMapLength();
  const uint16_t num_entries = archive->num_entries;

  /*
   * Create hash table.  We have a minimum 75% load factor, possibly as
   * low as 50% after we round off to a power of 2.  There must be at
   * least one unused entry to avoid an infinite loop during creation.
   */
  archive->hash_table_size = RoundUpPower2(1 + (num_entries * 4) / 3);  // size the hash table
  archive->hash_table =
      reinterpret_cast<ZipStringOffset*>(calloc(archive->hash_table_size, sizeof(ZipStringOffset)));
  /*
   * Walk through the central directory, adding entries to the hash
   * table and verifying values.
   */
  const uint8_t* const cd_end = cd_ptr + cd_length;
  const uint8_t* ptr = cd_ptr;
  for (uint16_t i = 0; i < num_entries; i++) {  // walk each CentralDirectoryRecord in turn
    if (ptr > cd_end - sizeof(CentralDirectoryRecord)) {
      ALOGW("Zip: ran off the end (item #%" PRIu16 ", %zu bytes of central directory)", i,
            cd_length);
#if defined(__ANDROID__)
      android_errorWriteLog(0x534e4554, "36392138");
#endif
      return kInvalidFile;
    }

    const CentralDirectoryRecord* cdr = reinterpret_cast<const CentralDirectoryRecord*>(ptr);
    if (cdr->record_signature != CentralDirectoryRecord::kSignature) {  // kSignature = 0x02014b50; the signature is checked on every record
      ALOGW("Zip: missed a central dir sig (at %" PRIu16 ")", i);
      return kInvalidFile;
    }

    const off64_t local_header_offset = cdr->local_file_header_offset;

    const uint16_t file_name_length = cdr->file_name_length;
    const uint16_t extra_length = cdr->extra_field_length;
    const uint16_t comment_length = cdr->comment_length;
    const uint8_t* file_name = ptr + sizeof(CentralDirectoryRecord);
    // Add the CDE filename to the hash table.
    std::string_view entry_name{reinterpret_cast<const char*>(file_name), file_name_length};  // build entry_name from the file name bytes
      
    const int add_result = AddToHash(archive->hash_table, archive->hash_table_size, entry_name,
                                     archive->central_directory.GetBasePtr());  // add to the hash table keyed by entry_name; the stored value locates this record relative to the central directory base pointer
    ptr += sizeof(CentralDirectoryRecord) + file_name_length + extra_length + comment_length;
  }

  ALOGV("+++ zip good scan %" PRIu16 " entries", num_entries);

  return 0;
}
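  1. Create a hash table
  2. Using the start address and entry count taken from the EOCD, loop over and parse every CentralDirectoryRecord
  3. Store every parsed CentralDirectoryRecord in the hash table

At this point the hash table of CentralDirectoryRecord entries is ready. To extract a file, look up its CentralDirectoryRecord in the hash table, use the record to find the address and length of the data, and read that slice.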
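The same walk can be sketched in Python, with a plain dict standing in for the hash table. This is only an illustration of the loop's shape: `index_central_directory` is a hypothetical helper, the stored fields are my own selection, and names are decoded as UTF-8 for simplicity.

```python
import io
import struct
import zipfile

CENTRAL_SIG = 0x02014B50
CDR_FORMAT = "<I6H3I5H2I"               # the fixed part of a central file header
CDR_SIZE = struct.calcsize(CDR_FORMAT)  # 46 bytes


def index_central_directory(data: bytes, cd_offset: int, num_entries: int) -> dict:
    """Map each entry name to the fields needed to locate and decode its data."""
    index = {}
    pos = cd_offset
    for i in range(num_entries):
        fields = struct.unpack_from(CDR_FORMAT, data, pos)
        if fields[0] != CENTRAL_SIG:
            raise ValueError(f"missed a central dir signature at entry {i}")
        flags, method = fields[3], fields[4]
        compressed_size, uncompressed_size = fields[8], fields[9]
        name_len, extra_len, comment_len = fields[10:13]
        local_offset = fields[16]
        name_start = pos + CDR_SIZE
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        index[name] = {
            "encrypted": bool(flags & 0x01),
            "method": method,
            "compressed_size": compressed_size,
            "uncompressed_size": uncompressed_size,
            "local_offset": local_offset,
        }
        # Skip the variable-length tail to reach the next record.
        pos += CDR_SIZE + name_len + extra_len + comment_len
    return index


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("AndroidManifest.xml", "<manifest/>")
        zf.writestr("res/a.txt", "hello " * 20)
    data = buf.getvalue()
    # No archive comment here, so the EOCD is the last 22 bytes.
    num_entries, _cd_size, cd_offset = struct.unpack_from("<HII", data, len(data) - 22 + 10)
    for name, entry in index_central_directory(data, cd_offset, num_entries).items():
        print(name, entry)
```

The `encrypted` field is included so that pseudo-encrypted entries can be spotted at a glance while listing an archive.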

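Summary

That is the whole ZIP extraction flow. Android still follows the standard procedure: find the EOCD and parse the central directory -> build a hash table of CentralDirectoryRecord entries from the central directory -> use the file address, length, and compression method in each CentralDirectoryRecord to fetch the data and decompress it. If some other field gets tampered with in the future and extraction fails, it should now be easy to track down and fix.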
