转载 2017年09月06日 16:48:45

A few months ago Apple announced a ‘new feature,’ called ‘Bitcode.’ In this article, I will try to answer the questions like what is Bitcode, what problems it aims to solve, what issues it introduces and so on.

What is Bitcode?

To answer this question let’s look at what compilers do for us. Here is a brief overview of compilation process:

  • Lexer: takes source code as an input and translates it into a stream of tokens;
  • Parser: takes stream of tokens as an input and translates it into an AST;
  • Semantic Analysis: takes an AST as an input, checks if a program is correct (method called with correct amount of parameters, method called on object actually exists and non-private, etc.), fills in ‘missing types’ (e.g.: let x = yx has type of y) and passes AST to the next phase;
  • Code Generation: takes an AST as an input and emits some high-level IR (intermediate representation);
  • Optimization: takes IR, makes optimizations and emits IR which is potentially faster and/or smaller;
  • AsmPrinter: another code generation phase, it takes IR and emits assembly for particular CPU;
  • Assembler: takes assembly and converts it into an object code (stream of 0s and 1s);
  • Linker: usually programs refer to already compiled routines from other programs (e.g.: printf) to avoid recompilation of the same code over and over. Until this phase these links do not have correct addresses, they are just placeholders. Linker’s job is to resolve those placeholders so that they point to the correct addresses of their corresponding routines.

You can find more details here: The Compiler.

In the modern world these phases are split into two parts: compiler frontend (lexerparsersemantic analysiscode generation) and compiler backend (optimizationasm printerassemblerlinker). This separation makes much sense for both language designers and hardware manufacturers. If you want to create a new programming language you ‘just’ need to implement a frontend, and you get all available optimizations and support of different CPUs for free. On the other hand, if you created a new chip, you ‘just’ need to extend the backend and you get all the available languages (frontends) support for your CPU.

Below you can see a picture that illustrates compilation process using Clang and LLVM:

This picture clearly demonstrates how communication between frontend and backend is done using IR, LLVM has it is own format, that can be encoded using LLVM bitstream file format - Bitcode.

Just to recall it explicitly - Bitcode is a bitstream representation of LLVM IR.

What problems Apple’s Bitcode aims to solve?

Again, we need to dive a bit deeper and look at how an OS runs programs. This description is not precise and is given just to illustrate the process. For more details I can recommend reading this article: How OS X Executes Applications.

OS X and iOS can run on different CPUs (i386x86_64armarm64, etc.), if you want to run a program on any OS X/iOS setup, then the program should contain object code for each platform. Here is how a binary might look like:

When you run a program, OS reads the ‘Table Of Contents’ and looks for a slice corresponding to the OS CPU. For instance, if you run operating system on x86_64, then OS will load object code for x86_64 into a memory and run the program.

What’s happening with other slices? Nothing, they just waste your disk space.

This is the problem Apple wants to solve: currently, all the apps on the AppStore contain object code for arm and arm64 CPUs. Moreover, third-party proprietary libraries or frameworks contain object code for i386x86_64arm and arm64, so you can use them to test the app on a device or simulator. (Can you imagine how many copies of Google Analytics for i386 you have in your pocket?)

UPD: I do not know why, but I was sure that final executable contains these slices as well (i386x86_64, etc.), but it seems they are stripped during the build phase.

Apple did not give us that many details about how the Bitcode and App Thinning works, so let me assume how it may look:

When you submit an app (including Bitcode) Apple’s ‘BlackBox’ recompiles it for each supported platform and drops any ‘useless’ object code, so AppStore has a copy of the app for each CPU. When an end user wants to install the app - she installs the only version for the particular processor, without any unused stuff.

Bitcode might save up to 50% of disk space per program.

UPD: Of course, I do not take in count resources, it is just about binary itself. For instance, an app I am working on currently has size ~40 megabytes (including assets, xibs. fonts), a size of a binary itself is ~16 megabytes. I checked sizes of each slice: ~7MB for armv7 and 9MB for arm64, if we crop just one of them, it will decrease the size of the app by ~20%.

What problems do Bitcode introduce?

The idea of Bitcode and recompiling for each platform looks really great, and it is a huge improvement, though it has downsides as well: the biggest one is security.

To get the benefits of Bitcode, you should submit your app including Bitcode (surprisingly). If you use some proprietary third-party library, then it also should contain Bitcode, hence as a maintainer of a proprietary library, you should distribute the library with Bitcode.

To recall: Bitcode is just another form of LLVM IR.


Let’s write some code to see LLVM IR in action.

// main.c
extern int printf(const char *fmt, ...);

int main() {
  printf("Hello World\n");
  return 0;

Run the following:

clang -S -emit-llvm main.c

And you’ll have main.ll containing IR:

@.str = private unnamed_addr constant [13 x i8] c"Hello World\0A\00", align 1

; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
  %1 = alloca i32, align 4
  store i32 0, i32* %1
  %2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([13 x i8]* @.str, i32 0, i32 0))
  ret i32 0

declare i32 @printf(i8*, ...) #1

What can we see here? It is a bit more verbose than original C code, but it is still much more readable than assembler. Malefactors will be much happier to work with this representation, than with disassembled version of a binary (and they do not even have to pay for tools such Hopper or IDA).

How could malefactor get the IR?

iOS and OS X executables have their own format - Mach-O (read Parsing Mach-O files for more details). Mach-O file contains several segments such as Read-Only Data, Code, Symbol Table, etc. One of those sections contain xar archive with Bitcode:

It is really easy to retrieve it automatically, here I wrote a simple C program that does just that: bitcode_retriever. The workflow is pretty straightforward. Let’s assume that some_binary is a Mach-O file that contains object code for two CPUs (arm and x86_64), and each object code is built using two source files:

$ bitcode_retriever some_binary
$ xar -xvf arm.xar
$ llvm-dis 1 # outputs 1.ll
$ llvm-dis 2 # outputs 2.ll

Bitcode does not store any information about original filenames but uses numbers instead (123, etc.). Also, probably you do not have llvm-disinstalled/built on your machine, but you can easily obtain it, see this article for more details: Getting Started with Clang/LLVM on OS X.

Another potential issue (can’t confirm it) - Bitcode thingie works only for iOS 9, so if you submit your app to the AppStore and it includes Bitcode, then malefactor can get the whole IR from your app using iOS 78 and jailbroken device.

I know only one way to secure the IR - obfuscation. This task is not trivial itself, and it requires even much more efforts if you want to introduce this phase into your Xcode-Driven development flow.


  • Bitcode is a bitstream file format for LLVM IR
  • one of its goals is to decrease a size of an app by eliminating unused object code
  • malefactor can obtain your app or library, retrieve the IR from it and steal your ‘secret algorithm.’


内容来源: 今天在一个iOS培训网站上看到一篇关于第三方库不包含bitcode就会报错的文章,感觉剖析得很详细,分享出来,希望可以对iOS初入门者有所帮助。下面我们就一起来看看吧。 用Xc...
  • jijiji000111
  • jijiji000111
  • 2015年09月17日 15:21
  • 4132


Bitcode刚出来的时候查过相关的文档,知道只要是iOS程序就把Enable Bitcode选择NO,因为那个时候第三方SDK都没有支持Bitcode包,也就没有细看,今天我们来看看什么是Bitco...
  • PianZhideNanRen
  • PianZhideNanRen
  • 2016年05月03日 14:56
  • 4345


目录[-] 前言历史回顾编译器现状苹果的要求实践出真知此文之初衷 前言 做iOS开发的朋友们都知道,目前最新的Xcode7,新建项目默认就打开了bitcode设置.而且大部分开发者都被这个突如其...
  • ios_leungYL
  • ios_leungYL
  • 2016年03月12日 15:00
  • 1339


前言 昨天去美团吹水,聊到 BitCode 话题,朋友说他们发现美团没有开启 Bitcode,下载下来的包还是二进制文件只有一个架构的状态。我感到非常不可思议,晚上就去越狱了我的 iPad mi...
  • Gatsbyi
  • Gatsbyi
  • 2016年03月19日 12:57
  • 603


前言 做iOS开发的朋友们都知道,目前最新的Xcode7,新建项目默认就打开了bitcode设置.而且大部分开发者都被这个突如其来的bitcode功能给坑过导致项目编译失败,而这些因为bitco...
  • u011363981
  • u011363981
  • 2017年05月10日 11:52
  • 411

iOS 打包上线 bitcode问题

iOS 打包上线 All object files and libraries for bitcode must be generated from…报错错误现象: 一、App的开发对bitcode...
  • u012198553
  • u012198553
  • 2017年01月12日 19:12
  • 2595

Bitcode真机测试要修改为 NO

You must rebuild it with bitcode enabled(Xcode setting ENABLE_BITCODE), obtain an updated library fr...
  • ios_xumin
  • ios_xumin
  • 2016年06月17日 15:32
  • 748

Enable Bitcode 一种中间代码

今天试着用Xcode 7 beta 3在真机(iOS 8.3)上运行一下我们的工程,结果发现工程编译不过。看了下问题,报的是以下错误: 1 ld: ...
  • fanghan0_0
  • fanghan0_0
  • 2016年05月27日 17:06
  • 483


Bitcode是什么,如何配置?           今天在一个麦子学院上看到一篇关于第三方库不包含bitcode就会报错的文章,感觉剖析得很详细,分享出来,希望可以...
  • Lu_Ca
  • Lu_Ca
  • 2016年02月01日 10:42
  • 556


  • skylin19840101
  • skylin19840101
  • 2017年05月17日 17:26
  • 1176