While testing a project on the Qualcomm 8.1 platform recently, every differential package we built came out at 800 MB. One APK in the build is 700 MB, which effectively means that APK never went through the diff flow. At the start I had two guesses: first, there is a size limit, and APKs over a certain size skip the diff; second, a splitting scheme is used, cutting the APK into small pieces that are diffed individually.
1. Overall flow
2. Source of the problem
3. Fixing the problem
4. Digging deeper
1. Overall flow
I won't go into detail here. Since we are only chasing this single issue, we skip a full analysis for now and just list the call chain:
```cpp
// payload_generator/generate_delta_main.cc
GenerateUpdatePayloadFile(payload_config, FLAGS_out_file,
                          FLAGS_private_key, &metadata_size);

// payload_generator/delta_diff_generator.cc
strategy->GenerateOperations(config, old_part, new_part, &blob_file, &aops);

// payload_generator/ab_generator.cc
GenerateOperations(const PayloadGenerationConfig& config,
                   const PartitionConfig& old_part,
                   const PartitionConfig& new_part,
                   BlobFileWriter* blob_file,
                   vector<AnnotatedOperation>* aops);

// payload_generator/delta_diff_utils.cc
DeltaReadPartition(vector<AnnotatedOperation>* aops,
                   const PartitionConfig& old_part,
                   const PartitionConfig& new_part,
                   ssize_t hard_chunk_blocks,
                   size_t soft_chunk_blocks,
                   const PayloadVersion& version,
                   BlobFileWriter* blob_file);
DeltaReadFile(aops, old_part.path, new_part.path, old_unvisited, new_unvisited,
              "<non-file-data>", soft_chunk_blocks, version, blob_file);
ReadExtentsToDiff(old_part, new_part, old_extents_chunk, new_extents_chunk,
                  version, &data, &operation);
```
That is the rough call order. Now that we know which methods are involved, let's see how to track the problem down.
2. Source of the problem
First, I took the 8.1 intermediate package and built it in the 9.0 environment. The resulting differential package was 400 MB, and the APK did go through the diff. That says it all: the 9.0 payload_generator clearly handles this case better.
Comparing the 9.0 and 8.1 packaging logs shows:
```cpp
// payload_generator/delta_diff_utils.cc

bool DeltaReadPartition(vector<AnnotatedOperation>* aops,
                        const PartitionConfig& old_part,
                        const PartitionConfig& new_part,
                        ssize_t hard_chunk_blocks,
                        size_t soft_chunk_blocks,
                        const PayloadVersion& version,
                        BlobFileWriter* blob_file) {
  ExtentRanges old_visited_blocks;
  ExtentRanges new_visited_blocks;
  // Handle the MOVE and ZERO cases.
  TEST_AND_RETURN_FALSE(DeltaMovedAndZeroBlocks(aops,
                                                old_part.path,
                                                new_part.path,
                                                old_part.size / kBlockSize,
                                                new_part.size / kBlockSize,
                                                soft_chunk_blocks,
                                                version,
                                                blob_file,
                                                &old_visited_blocks,
                                                &new_visited_blocks));

  map<string, vector<Extent>> old_files_map;
  if (old_part.fs_interface) {
    vector<FilesystemInterface::File> old_files;
    old_part.fs_interface->GetFiles(&old_files);
    for (const FilesystemInterface::File& file : old_files)
      old_files_map[file.name] = file.extents;
  }

  TEST_AND_RETURN_FALSE(new_part.fs_interface);
  vector<FilesystemInterface::File> new_files;
  new_part.fs_interface->GetFiles(&new_files);

  vector<FileDeltaProcessor> file_delta_processors;

  // The processing is very straightforward here, we generate operations for
  // every file (and pseudo-file such as the metadata) in the new filesystem
  // based on the file with the same name in the old filesystem, if any.
  // Files with overlapping data blocks (like hardlinks or filesystems with
  // tail packing or compression where the blocks store more than one file)
  // are only generated once in the new image, but are also used only once
  // from the old image due to some simplifications (see below).
  for (const FilesystemInterface::File& new_file : new_files) {
    // Ignore the files in the new filesystem without blocks. Symlinks with
    // data blocks (for example, symlinks bigger than 60 bytes in ext2) are
    // handled as normal files. We also ignore blocks that were already
    // processed by a previous file.
    vector<Extent> new_file_extents = FilterExtentRanges(
        new_file.extents, new_visited_blocks);
    new_visited_blocks.AddExtents(new_file_extents);

    if (new_file_extents.empty())
      continue;

    LOG(INFO) << "Encoding file " << new_file.name << " ("
              << BlocksInExtents(new_file_extents) << " blocks)";

    // We can't visit each dst image inode more than once, as that would
    // duplicate work. Here, we avoid visiting each source image inode
    // more than once. Technically, we could have multiple operations
    // that read the same blocks from the source image for diffing, but
    // we choose not to avoid complexity. Eventually we will move away
    // from using a graph/cycle detection/etc to generate diffs, and at that
    // time, it will be easy (non-complex) to have many operations read
    // from the same source blocks. At that time, this code can die. -adlr
    vector<Extent> old_file_extents = FilterExtentRanges(
        old_files_map[new_file.name], old_visited_blocks);
    old_visited_blocks.AddExtents(old_file_extents);
    // The per-file diff flow.
    TEST_AND_RETURN_FALSE(DeltaReadFile(aops,
                                        old_part.path,
                                        new_part.path,
                                        old_file_extents,
                                        new_file_extents,
                                        new_file.name,  // operation name
                                        hard_chunk_blocks,
                                        version,
                                        blob_file));
  }
  // Process all the blocks not included in any file. We provided all the
  // unused blocks in the old partition as available data.
  vector<Extent> new_unvisited = {
      ExtentForRange(0, new_part.size / kBlockSize)};
  new_unvisited = FilterExtentRanges(new_unvisited, new_visited_blocks);
  if (new_unvisited.empty())
    return true;

  vector<Extent> old_unvisited;
  if (old_part.fs_interface) {
    old_unvisited.push_back(ExtentForRange(0, old_part.size / kBlockSize));
    old_unvisited = FilterExtentRanges(old_unvisited, old_visited_blocks);
  }

  LOG(INFO) << "Scanning " << BlocksInExtents(new_unvisited)
            << " unwritten blocks using chunk size of "
            << soft_chunk_blocks << " blocks.";
  // We use the soft_chunk_blocks limit for the <non-file-data> as we don't
  // really know the structure of this data and we should not expect it to
  // have redundancy between partitions.
  // Handle the data that belongs to no file.
  TEST_AND_RETURN_FALSE(DeltaReadFile(aops,
                                      old_part.path,
                                      new_part.path,
                                      old_unvisited,
                                      new_unvisited,
                                      "<non-file-data>",  // operation name
                                      soft_chunk_blocks,
                                      version,
                                      blob_file));

  return true;
}
```
`DeltaReadFile` is called three times here for the different cases. Clearly, the per-file diff part in the middle must produce this log line:
```cpp
LOG(INFO) << "Encoding file " << new_file.name << " ("
          << BlocksInExtents(new_file_extents) << " blocks)";
```
In practice, the 8.1 environment never printed this line, which means the for loop was never entered. Evidently `new_part.fs_interface` plays the crucial role here; all we need is to find where it is assigned and why the value is wrong. (In reality I wasted a lot of effort before reaching this point, convinced that some parameter was wrong.)
3. Fixing the problem
Find where `fs_interface` is assigned:
```cpp
// payload_generator/payload_generation_config.cc

bool PartitionConfig::OpenFilesystem() {
  if (path.empty())
    return true;
  fs_interface.reset();
  // lbb test: the key to the problem is this branch. fs_interface is given
  // the correct value here, but without a return the code falls through to
  // the end of the function.
  if (utils::IsExtFilesystem(path)) {
    fs_interface = Ext2Filesystem::CreateFromFile(path);
    if (fs_interface) {
      // Add the return, with the block size fixed at 4096.
      TEST_AND_RETURN_FALSE(fs_interface->GetBlockSize() == kBlockSize);
      return true;
    }
  }

  if (!mapfile_path.empty()) {
    fs_interface = MapfileFilesystem::CreateFromFile(path, mapfile_path);
    if (fs_interface) {
      TEST_AND_RETURN_FALSE(fs_interface->GetBlockSize() == kBlockSize);
      return true;
    }
  }

  // Fall back to a RAW filesystem.
  TEST_AND_RETURN_FALSE(size % kBlockSize == 0);
  fs_interface = RawFilesystem::Create(
      "<" + name + "-partition>", kBlockSize, size / kBlockSize);
  return true;
}
```
After adding this code and rebuilding delta_generator, the problem is solved: 8.1 now goes through the diff flow as well.
4. Digging deeper
Before finding the root cause I did a lot of fruitless digging, and along the way ran into a few important parameters.
First, in payload_generator/generate_delta_main.cc:

```cpp
DEFINE_int32(chunk_size, 200 * 1024 * 1024,
             "Payload chunk size (-1 for whole files)");
```
Second, in payload_generator/delta_diff_utils.cc:

```cpp
// The maximum destination size allowed for bsdiff. In general, bsdiff should
// work for arbitrary big files, but the payload generation and payload
// application requires a significant amount of RAM. We put a hard-limit of
// 200 MiB that should not affect any released board, but will limit the
// Chrome binary in ASan builders.
const uint64_t kMaxBsdiffDestinationSize = 100 * 1024 * 1024;  // bytes

// The maximum destination size allowed for imgdiff. In general, imgdiff
// should work for arbitrary big files, but the payload application is quite
// memory intensive, so we limit these operations to 50 MiB.
// lbb test
const uint64_t kMaxImgdiffDestinationSize = 100 * 1024 * 1024;  // bytes
```
These two parameters look unrelated, yet together they control whether a file gets diffed at all:
```cpp
// payload_generator/delta_diff_utils.cc

bool DeltaReadFile(vector<AnnotatedOperation>* aops,
                   const string& old_part,
                   const string& new_part,
                   const vector<Extent>& old_extents,
                   const vector<Extent>& new_extents,
                   const string& name,
                   ssize_t chunk_blocks,
                   const PayloadVersion& version,
                   BlobFileWriter* blob_file) {
  brillo::Blob data;
  InstallOperation operation;

  uint64_t total_blocks = BlocksInExtents(new_extents);
  if (chunk_blocks == -1)
    chunk_blocks = total_blocks;

  for (uint64_t block_offset = 0; block_offset < total_blocks;
       block_offset += chunk_blocks) {
    // Split the old/new file in the same chunks. Note that this could drop
    // some information from the old file used for the new chunk. If the old
    // file is smaller (or even empty when there's no old file) the chunk will
    // also be empty.
    // Split the file into chunk_size-sized pieces -- the crucial step.
    vector<Extent> old_extents_chunk = ExtentsSublist(
        old_extents, block_offset, chunk_blocks);
    vector<Extent> new_extents_chunk = ExtentsSublist(
        new_extents, block_offset, chunk_blocks);
    NormalizeExtents(&old_extents_chunk);
    NormalizeExtents(&new_extents_chunk);

    // lbb test
    LOG(INFO) << "file name " << name.c_str() << ".";
    LOG(INFO) << "old_extents_DeltaReadFile "
              << utils::BlocksInExtents(old_extents_chunk) * kBlockSize
              << " bytes";

    TEST_AND_RETURN_FALSE(ReadExtentsToDiff(old_part,
                                            new_part,
                                            old_extents_chunk,
                                            new_extents_chunk,
                                            version,
                                            &data,
                                            &operation));
    ......
  ......
  return true;
}
```
```cpp
// payload_generator/delta_diff_utils.cc

bool ReadExtentsToDiff(const string& old_part,
                       const string& new_part,
                       const vector<Extent>& old_extents,
                       const vector<Extent>& new_extents,
                       const PayloadVersion& version,
                       brillo::Blob* out_data,
                       InstallOperation* out_op) {
  InstallOperation operation;

  // We read blocks from old_extents and write blocks to new_extents.
  uint64_t blocks_to_read = BlocksInExtents(old_extents);
  uint64_t blocks_to_write = BlocksInExtents(new_extents);

  // Disable bsdiff and imgdiff when the data is too big.
  // If the data exceeds kMaxBsdiffDestinationSize, skip the bsdiff step and
  // ship the chunk as full data instead.
  bool bsdiff_allowed =
      version.OperationAllowed(InstallOperation::SOURCE_BSDIFF) ||
      version.OperationAllowed(InstallOperation::BSDIFF);
  if (bsdiff_allowed &&
      blocks_to_read * kBlockSize > kMaxBsdiffDestinationSize) {
    LOG(INFO) << "bsdiff blacklisted, data too big: "
              << blocks_to_read * kBlockSize << " bytes";
    bsdiff_allowed = false;
  }
  // If the data exceeds kMaxImgdiffDestinationSize, skip the imgdiff step and
  // ship the chunk as full data instead.
  bool imgdiff_allowed = version.OperationAllowed(InstallOperation::IMGDIFF);
  if (imgdiff_allowed &&
      blocks_to_read * kBlockSize > kMaxImgdiffDestinationSize) {
    LOG(INFO) << "imgdiff blacklisted, data too big: "
              << blocks_to_read * kBlockSize << " bytes";
    imgdiff_allowed = false;
  }
  ......
  ......
}
```
The logic of these two values works like this:
1. Every incoming file is first split into chunks of chunk_size (200 MB).
2. Each chunk is then checked against kMaxBsdiffDestinationSize; if it exceeds the limit, the chunk is not diffed, out of consideration for device memory and performance.
3. As things stand, chunk_size equals kMaxBsdiffDestinationSize, so an APK of any size will be diffed; only the number of pieces differs. That makes the kMaxBsdiffDestinationSize check look somewhat redundant.
If we wanted to change the behavior so that files larger than chunk_size are not diffed, that's easy: just make kMaxBsdiffDestinationSize smaller than chunk_size.
This analysis deliberately skips a detailed walkthrough of the flow; we only looked at how this one problem was solved. A full walkthrough of delta_generator will follow in a later write-up.