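MCJIT Design and Implementation

This article is a selective paraphrase of the LLVM 7.0.1 document "MCJIT Design and Implementation", with the relevant source code attached at key points.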

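Introduction

This document describes the internal workings of the MCJIT execution engine and the RuntimeDyld component. It is a fairly high-level overview that mainly shows the flow of code generation and dynamic linking, and how the objects involved interact along the way.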

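Engine Creation

In most cases, an EngineBuilder is used to create an instance of the MCJIT execution engine. The EngineBuilder constructor takes an llvm::Module object as its argument. (Note that the excerpt below shows the ExecutionEngine constructor, which also takes a module, rather than the EngineBuilder constructor itself.)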

// lib/ExecutionEngine/ExecutionEngine.cpp

ExecutionEngine::ExecutionEngine(std::unique_ptr<Module> M)
    : DL(M->getDataLayout()), LazyFunctionCreator(nullptr) {
  Init(std::move(M));
}

In addition, the various options that the MCJIT engine needs can be set, including whether MCJIT is used at all (the engine kind).

// tools/lli/lli.cpp

int main(int argc, char **argv, char * const *envp) {
  ...

  builder.setMArch(MArch);
  builder.setMCPU(getCPUStr());
  builder.setMAttrs(getFeatureList());
  if (RelocModel.getNumOccurrences())
    builder.setRelocationModel(RelocModel);
  if (CMModel.getNumOccurrences())
    builder.setCodeModel(CMModel);
  builder.setErrorStr(&ErrorMsg);
  builder.setEngineKind(ForceInterpreter
                        ? EngineKind::Interpreter
                        : EngineKind::JIT);
  builder.setUseOrcMCJITReplacement(UseJITKind == JITKind::OrcMCJITReplacement);

  ...
}

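Of particular interest is the EngineBuilder::setMCJITMemoryManager function. If no memory manager has been created explicitly at this point, a default memory manager, SectionMemoryManager, is created automatically when the MCJIT engine is initialized.

Once the options are set, EngineBuilder::create creates the MCJIT engine instance. If no TargetMachine argument is passed in, a suitable TargetMachine is created automatically, based on the target triple and the module that was used to create the EngineBuilder.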

// include/llvm/ExecutionEngine/ExecutionEngine.h

class EngineBuilder {
  ...

  ExecutionEngine *create() {
    return create(selectTarget());
  }

  ExecutionEngine *create(TargetMachine *TM);

  ...
}

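EngineBuilder::create calls MCJIT::createJIT (strictly speaking, it calls through ExecutionEngine::MCJITCtor, a function pointer that points to MCJIT::createJIT) and passes it the module, the memory manager, and the TargetMachine object. From then on, the MCJIT object owns them.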

// lib/ExecutionEngine/ExecutionEngine.cpp

ExecutionEngine *EngineBuilder::create(TargetMachine *TM) {
  ...

    ExecutionEngine *EE = nullptr;
    if (ExecutionEngine::OrcMCJITReplacementCtor && UseOrcMCJITReplacement) {
      EE = ExecutionEngine::OrcMCJITReplacementCtor(ErrorStr, std::move(MemMgr),
                                                    std::move(Resolver),
                                                    std::move(TheTM));
      EE->addModule(std::move(M));
    } else if (ExecutionEngine::MCJITCtor)
      EE = ExecutionEngine::MCJITCtor(std::move(M), ErrorStr, std::move(MemMgr),
                                      std::move(Resolver), std::move(TheTM));

  ...
}

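MCJIT has a member variable, RuntimeDyld Dyld, which handles communication between MCJIT and the RuntimeDyldImpl object. The RuntimeDyldImpl object is created when an object is loaded.

MCJIT receives the Module pointer from EngineBuilder at creation time, but it does not generate code for the module right away. Code generation is deferred until MCJIT::finalizeObject or MCJIT::getPointerToFunction is called (lli.cpp calls both functions, one after the other).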
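The deferral described above can be pictured with a small standalone model. This is only a sketch and not LLVM code: ToyEngine, ToyModule, and lookup are hypothetical names that loosely stand in for MCJIT, llvm::Module, and getPointerToFunction, and "compiling" is reduced to setting a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Module: a name plus a "compiled" flag.
struct ToyModule {
  std::string name;
  bool compiled;
};

// Hypothetical stand-in for MCJIT. Adding a module only stores it;
// code generation runs on the first call that needs machine code.
class ToyEngine {
public:
  void addModule(const std::string &name) {
    added_.push_back(ToyModule{name, false});
  }

  // Plays the role of finalizeObject: compile everything still pending.
  void finalizeObject() {
    // Take the pending list out first, because compiling moves modules
    // from the "added" list to the "loaded" list.
    std::vector<ToyModule> pending;
    pending.swap(added_);
    for (ToyModule &m : pending) {
      m.compiled = true; // placeholder for real code generation
      ++compileCount_;
      loaded_.push_back(m);
    }
  }

  // Loosely plays the role of getPointerToFunction: it forces code
  // generation before answering whether the module is available.
  bool lookup(const std::string &name) {
    finalizeObject();
    for (const ToyModule &m : loaded_) {
      if (m.name == name) {
        assert(m.compiled && "loaded modules are always compiled");
        return true;
      }
    }
    return false;
  }

  std::size_t pendingCount() const { return added_.size(); }
  std::size_t compileCount() const { return compileCount_; }

private:
  std::vector<ToyModule> added_;
  std::vector<ToyModule> loaded_;
  std::size_t compileCount_ = 0;
};
```

After addModule("m"), compileCount() is still 0; the first lookup("m") triggers the single compile, and later lookups reuse the result.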

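Code Generation

When code generation starts, MCJIT first tries to retrieve an object image from its member variable ObjCache (of type ObjectCache*). If no image is available, it calls MCJIT::emitObject.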

// lib/ExecutionEngine/MCJIT/MCJIT.cpp

void MCJIT::generateCodeForModule(Module *M) {
  ...

  std::unique_ptr<MemoryBuffer> ObjectToLoad;
  // Try to load the pre-compiled object from cache if possible
  if (ObjCache)
    ObjectToLoad = ObjCache->getObject(M);

  assert(M->getDataLayout() == getDataLayout() && "DataLayout Mismatch");

  // If the cache did not contain a suitable object, compile the object
  if (!ObjectToLoad) {
    ObjectToLoad = emitObject(M);
    assert(ObjectToLoad && "Compilation did not produce an object.");
  }

  ...
}
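The "cache first, compile on a miss" flow in the excerpt above can be modelled without LLVM. The sketch below is an assumption-laden toy: ToyObjectCache, ToyCompiler, and getObjectImage are hypothetical names, modules are identified by a string, and an object image is just a string. Only the method names getObject and notifyObjectCompiled are borrowed from LLVM's ObjectCache interface.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an object cache, keyed by module name.
class ToyObjectCache {
public:
  // Returns the cached image, or an empty string on a miss.
  std::string getObject(const std::string &moduleName) const {
    auto it = images_.find(moduleName);
    return it == images_.end() ? std::string() : it->second;
  }

  // Called after a fresh compile so that later requests can skip it.
  void notifyObjectCompiled(const std::string &moduleName,
                            const std::string &image) {
    images_[moduleName] = image;
  }

private:
  std::map<std::string, std::string> images_;
};

// Counts how often the placeholder compiler actually ran.
struct ToyCompiler {
  int runs = 0;
  std::string emitObject(const std::string &moduleName) {
    ++runs;
    return "obj:" + moduleName; // placeholder for a real object image
  }
};

// Try the cache, compile on a miss, then hand the result to the cache.
std::string getObjectImage(const std::string &moduleName,
                           ToyObjectCache *cache, ToyCompiler &compiler) {
  std::string image;
  if (cache)
    image = cache->getObject(moduleName);
  if (image.empty()) {
    image = compiler.emitObject(moduleName);
    assert(!image.empty() && "compilation did not produce an object");
    if (cache)
      cache->notifyObjectCompiled(moduleName, image);
  }
  return image;
}
```

With a cache attached, requesting the same module twice runs the compiler once; with a null cache, every request compiles again.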

MCJIT::emitObject creates a legacy::PassManager instance and an ObjectBufferStream (raw_svector_ostream) instance, and passes both to TargetMachine::addPassesToEmitMC before calling PassManager::run.

// lib/ExecutionEngine/MCJIT/MCJIT.cpp

std::unique_ptr<MemoryBuffer> MCJIT::emitObject(Module *M) {
  ...

  legacy::PassManager PM;

  // The RuntimeDyld will take ownership of this shortly
  SmallVector<char, 4096> ObjBufferSV;
  raw_svector_ostream ObjStream(ObjBufferSV);

  // Turn the machine code intermediate representation into bytes in memory
  // that may be executed.
  if (TM->addPassesToEmitMC(PM, Ctx, ObjStream, !getVerifyModules()))
    report_fatal_error("Target does not support MC emission!");

  // Initialize passes.
  PM.run(*M);

  ...
}

PassManager does its actual work through the member variable PassManagerImpl *PM, so PassManager::run is essentially PassManagerImpl::run.

// lib/IR/LegacyPassManager.cpp

/// run - Execute all of the passes scheduled for execution.  Keep track of
/// whether any of the passes modifies the module, and if so, return true.
bool PassManager::run(Module &M) {
  return PM->run(M);
}

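PassManagerImpl::run generates a complete, relocatable binary object image in the ObjectBufferStream object (ELF or MachO, depending on the platform). If object caching is enabled, the image is also passed to the ObjectCache at this point.

At this point, the ObjectBufferStream contains the raw object image. Before the code can run, the code and data sections of this image must be loaded into suitable memory, relocations must be applied, memory permissions must be set, and the code cache must be invalidated (if required).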

Object Loading

An object image now exists, whether it was retrieved from the ObjectCache or generated directly, and it is passed to RuntimeDyld to be loaded.

// lib/ExecutionEngine/MCJIT/MCJIT.cpp

void MCJIT::generateCodeForModule(Module *M) {
  ...

  // Load the object into the dynamic linker.
  // MCJIT now owns the ObjectImage pointer (via its LoadedObjects list).
  Expected<std::unique_ptr<object::ObjectFile>> LoadedObject =
    object::ObjectFile::createObjectFile(ObjectToLoad->getMemBufferRef());
  if (!LoadedObject) {
    std::string Buf;
    raw_string_ostream OS(Buf);
    logAllUnhandledErrors(LoadedObject.takeError(), OS, "");
    OS.flush();
    report_fatal_error(Buf);
  }
  std::unique_ptr<RuntimeDyld::LoadedObjectInfo> L =
    Dyld.loadObject(*LoadedObject.get());

  ...
}

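Depending on the object's file type, RuntimeDyld creates a RuntimeDyldELF or RuntimeDyldMachO instance (both are subclasses of RuntimeDyldImpl; the source excerpt below also handles COFF through createRuntimeDyldCOFF), and then calls RuntimeDyldImpl::loadObject to do the actual loading.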

// lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp

std::unique_ptr<RuntimeDyld::LoadedObjectInfo>
RuntimeDyld::loadObject(const ObjectFile &Obj) {
  if (!Dyld) {
    if (Obj.isELF())
      Dyld =
          createRuntimeDyldELF(static_cast<Triple::ArchType>(Obj.getArch()),
                               MemMgr, Resolver, ProcessAllSections, Checker);
    else if (Obj.isMachO())
      Dyld = createRuntimeDyldMachO(
               static_cast<Triple::ArchType>(Obj.getArch()), MemMgr, Resolver,
               ProcessAllSections, Checker);
    else if (Obj.isCOFF())
      Dyld = createRuntimeDyldCOFF(
               static_cast<Triple::ArchType>(Obj.getArch()), MemMgr, Resolver,
               ProcessAllSections, Checker);
    else
      report_fatal_error("Incompatible object format!");
  }

  if (!Dyld->isCompatibleFile(Obj))
    report_fatal_error("Incompatible object format!");

  auto LoadedObjInfo = Dyld->loadObject(Obj);
  MemMgr.notifyObjectLoaded(*this, Obj);
  return LoadedObjInfo;
}

The ObjectImage shown in the original document's figure could not be found in the source code.
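The dispatch in RuntimeDyld::loadObject has two notable properties: the format-specific implementation is created lazily on the first load, and every later object must match that format. A minimal standalone sketch of the same idea follows. It is not LLVM code: ToyLinker and ToyLinkerImpl are hypothetical stand-ins for RuntimeDyld and RuntimeDyldImpl, the format is guessed from the leading magic bytes only, and just ELF and 64-bit little-endian Mach-O are recognized.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>
#include <stdexcept>
#include <vector>

enum class ObjectFormat { ELF, MachO, Unknown };

// Classify an object image by its first four bytes.
ObjectFormat detectFormat(const std::vector<std::uint8_t> &image) {
  if (image.size() < 4)
    return ObjectFormat::Unknown;
  static const std::uint8_t elfMagic[4] = {0x7f, 'E', 'L', 'F'};
  if (std::memcmp(image.data(), elfMagic, 4) == 0)
    return ObjectFormat::ELF;
  // 64-bit little-endian Mach-O: magic 0xfeedfacf is stored as cf fa ed fe.
  static const std::uint8_t machO64[4] = {0xcf, 0xfa, 0xed, 0xfe};
  if (std::memcmp(image.data(), machO64, 4) == 0)
    return ObjectFormat::MachO;
  return ObjectFormat::Unknown;
}

// Hypothetical stand-in for RuntimeDyldImpl and its subclasses.
struct ToyLinkerImpl {
  explicit ToyLinkerImpl(ObjectFormat f) : format(f) {}
  ObjectFormat format;
  int objectsLoaded = 0;
};

// Hypothetical stand-in for RuntimeDyld.
class ToyLinker {
public:
  void loadObject(const std::vector<std::uint8_t> &image) {
    const ObjectFormat f = detectFormat(image);
    if (f == ObjectFormat::Unknown)
      throw std::runtime_error("Incompatible object format!");
    if (!impl_)
      impl_.reset(new ToyLinkerImpl(f)); // created lazily on the first load
    else if (impl_->format != f)
      throw std::runtime_error("Incompatible object format!");
    assert(impl_ && "an implementation exists once a load is accepted");
    ++impl_->objectsLoaded;
  }

  const ToyLinkerImpl *impl() const { return impl_.get(); }

private:
  std::unique_ptr<ToyLinkerImpl> impl_;
};
```

Loading two ELF images reuses the implementation created for the first one; a Mach-O image offered afterwards is rejected, mirroring the isCompatibleFile check in the excerpt.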

The platform used here is Linux, so we look at RuntimeDyldELF::loadObject, which calls RuntimeDyldImpl::loadObjectImpl.

// lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp

std::unique_ptr<RuntimeDyld::LoadedObjectInfo>
RuntimeDyldELF::loadObject(const object::ObjectFile &O) {
  if (auto ObjSectionToIDOrErr = loadObjectImpl(O))
    return llvm::make_unique<LoadedELFObjectInfo>(*this, *ObjSectionToIDOrErr);
  else {
    HasError = true;
    raw_string_ostream ErrStream(ErrorStr);
    logAllUnhandledErrors(ObjSectionToIDOrErr.takeError(), ErrStream, "");
    return nullptr;
  }
}

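RuntimeDyldImpl::loadObjectImpl (the implementation behind loadObject) iterates over the symbols in the object::ObjectFile object, stores them in a JITSymbolResolver::LookupSet Symbols structure, and processes them one by one. The section for each function and data symbol is loaded into memory, and RuntimeDyldImpl::emitCommonSymbols is then called to build a section for the COMMON symbols.

Next, it iterates over the sections in the object::ObjectFile object and uses RuntimeDyldELF::processRelocationRef to process every relocation entry in each section.

The CreateObjectImage shown in the original document's figure could not be found in the source code.

At this point, the code and data sections are in place in memory and the relocation information has been prepared, but no relocations have been applied yet, so the code cannot run.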

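Address Remapping

If the generated code is to be used by an external process, MCJIT::mapSectionAddress is called after code generation and before finalizeObject to remap the address of each section into the external process's address space. This step must happen before the section memory is copied to its new location.

MCJIT::mapSectionAddress calls RuntimeDyldImpl::mapSectionAddress to do the actual work.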
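The key idea behind remapping is that each section has two addresses: where its bytes currently sit in the JIT process, and where they will sit when executed. The following sketch models that split under simplifying assumptions. ToySection and ToySectionTable are hypothetical types, not LLVM classes; only the (local address, target address) shape of mapSectionAddress follows the LLVM function of the same name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified counterpart of a loaded section.
struct ToySection {
  std::vector<std::uint8_t> bytes; // where the section sits in this process
  std::uint64_t loadAddress;       // where the section will sit when executed
};

class ToySectionTable {
public:
  // Adds a zero-filled section. Until remapped, it runs where it was loaded.
  std::size_t addSection(std::size_t size) {
    assert(size > 0 && "empty sections have no usable local address");
    ToySection s;
    s.bytes.assign(size, 0);
    s.loadAddress = reinterpret_cast<std::uintptr_t>(s.bytes.data());
    sections_.push_back(std::move(s)); // moving keeps the byte buffer in place
    return sections_.size() - 1;
  }

  // Finds the section by its local address and records the address it will
  // occupy in the target process. The bytes themselves are not moved.
  bool mapSectionAddress(const void *localAddress,
                         std::uint64_t targetAddress) {
    for (ToySection &s : sections_) {
      if (s.bytes.data() == localAddress) {
        s.loadAddress = targetAddress;
        return true;
      }
    }
    return false;
  }

  // The address a relocation should use for a symbol at `offset`.
  std::uint64_t symbolAddress(std::size_t id, std::uint64_t offset) const {
    assert(id < sections_.size() && offset < sections_[id].bytes.size());
    return sections_[id].loadAddress + offset;
  }

  const void *localAddress(std::size_t id) const {
    return sections_[id].bytes.data();
  }

private:
  std::vector<ToySection> sections_;
};
```

Because relocations are computed from loadAddress rather than from the local pointer, remapping has to happen before relocations are resolved, which matches the ordering required above.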

Final Preparations (Relocation)

MCJIT::finalizeObject uses RuntimeDyld::resolveRelocations to resolve external symbols and apply the relocations for the current object.

// lib/ExecutionEngine/MCJIT/MCJIT.cpp

void MCJIT::finalizeObject() {
  MutexGuard locked(lock);

  // Generate code for module is going to move objects out of the 'added' list,
  // so we need to copy that out before using it:
  SmallVector<Module*, 16> ModsToAdd;
  for (auto M : OwnedModules.added())
    ModsToAdd.push_back(M);

  for (auto M : ModsToAdd)
    generateCodeForModule(M);

  finalizeLoadedModules();
}

void MCJIT::finalizeLoadedModules() {
  MutexGuard locked(lock);

  // Resolve any outstanding relocations.
  Dyld.resolveRelocations();

  OwnedModules.markAllLoadedModulesAsFinalized();

  // Register EH frame data for any module we own which has been loaded
  Dyld.registerEHFrames();

  // Set page permissions.
  MemMgr->finalizeMemory();
}

// lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp

// Resolve the relocations for all symbols we currently know about.
void RuntimeDyldImpl::resolveRelocations() {
  MutexGuard locked(lock);

  // Print out the sections prior to relocation.
  LLVM_DEBUG(for (int i = 0, e = Sections.size(); i != e; ++i)
                 dumpSectionMemory(Sections[i], "before relocations"););

  // First, resolve relocations associated with external symbols.
  if (auto Err = resolveExternalSymbols()) {
    HasError = true;
    ErrorStr = toString(std::move(Err));
  }

  // Iterate over all outstanding relocations
  for (auto it = Relocations.begin(), e = Relocations.end(); it != e; ++it) {
    // The Section here (Sections[i]) refers to the section in which the
    // symbol for the relocation is located.  The SectionID in the relocation
    // entry provides the section to which the relocation will be applied.
    int Idx = it->first;
    uint64_t Addr = Sections[Idx].getLoadAddress();
    LLVM_DEBUG(dbgs() << "Resolving relocations Section #" << Idx << "\t"
                      << format("%p", (uintptr_t)Addr) << "\n");
    resolveRelocationList(it->second, Addr);
  }
  Relocations.clear();

  // Print out sections after relocation.
  LLVM_DEBUG(for (int i = 0, e = Sections.size(); i != e; ++i)
                 dumpSectionMemory(Sections[i], "after relocations"););
}

External symbol resolution (resolveExternalSymbols) is carried out by RTDyldMemoryManager::getPointerToNamedFunction (I have not yet found where the two are connected in the source):

// lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp

Error RuntimeDyldImpl::resolveExternalSymbols() {
  ...
  while (!ExternalSymbolRelocations.empty()) {
  ...
    if (Name.size() == 0) {
      ...
      resolveRelocationList(Relocs, 0);
    } else {
      ...
        resolveRelocationList(Relocs, Addr);
      ...
    }
    ...
  }
  ...
}

RuntimeDyld then iterates over the relocation list (resolveRelocationList) and calls RuntimeDyldELF::resolveRelocation for each entry to apply the platform-specific relocation.

// lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp

void RuntimeDyldImpl::resolveRelocationList(const RelocationList &Relocs,
                                            uint64_t Value) {
  for (unsigned i = 0, e = Relocs.size(); i != e; ++i) {
    const RelocationEntry &RE = Relocs[i];
    // Ignore relocations for sections that were not loaded
    if (Sections[RE.SectionID].getAddress() == nullptr)
      continue;
    resolveRelocation(RE, Value);
  }
}

// lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp

void RuntimeDyldELF::resolveRelocation(const SectionEntry &Section,
                                       uint64_t Offset, uint64_t Value,
                                       uint32_t Type, int64_t Addend,
                                       uint64_t SymOffset, SID SectionID) {
  switch (Arch) {
  case Triple::x86_64:
    resolveX86_64Relocation(Section, Offset, Value, Type, Addend, SymOffset);
    break;
  case Triple::x86:
    resolveX86Relocation(Section, Offset, (uint32_t)(Value & 0xffffffffL), Type,
                         (uint32_t)(Addend & 0xffffffffL));
    break;
  case Triple::aarch64:
  case Triple::aarch64_be:
    resolveAArch64Relocation(Section, Offset, Value, Type, Addend);
    break;
  case Triple::arm: // Fall through.
  case Triple::armeb:
  case Triple::thumb:
  case Triple::thumbeb:
    resolveARMRelocation(Section, Offset, (uint32_t)(Value & 0xffffffffL), Type,
                         (uint32_t)(Addend & 0xffffffffL));
    break;
  case Triple::ppc:
    resolvePPC32Relocation(Section, Offset, Value, Type, Addend);
    break;
  case Triple::ppc64: // Fall through.
  case Triple::ppc64le:
    resolvePPC64Relocation(Section, Offset, Value, Type, Addend);
    break;
  case Triple::systemz:
    resolveSystemZRelocation(Section, Offset, Value, Type, Addend);
    break;
  case Triple::bpfel:
  case Triple::bpfeb:
    resolveBPFRelocation(Section, Offset, Value, Type, Addend);
    break;
  default:
    llvm_unreachable("Unsupported CPU type!");
  }
}
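To make the relocation step more concrete, here is a toy resolver that keeps the same overall shape: pending relocations are grouped by the section their target lives in, and each entry names the section and offset to patch. Everything in it is an illustrative assumption rather than LLVM code. It supports only two relocation kinds, an absolute 64-bit value (S + A) and a 32-bit PC-relative value (S + A - P), which correspond to the formulas the x86-64 ABI gives for R_X86_64_64 and R_X86_64_PC32, and it writes values in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

enum class RelocType { Abs64, PCRel32 };

// One pending patch: where to write and how to compute the value.
struct RelocEntry {
  std::size_t sectionID; // section that receives the patch
  std::uint64_t offset;  // byte offset of the patched field
  RelocType type;
  std::int64_t addend;   // A: constant added to the target address
};

struct Section {
  std::vector<std::uint8_t> bytes; // local copy that gets patched
  std::uint64_t loadAddress;       // address the section will run at
};

// Apply one relocation against `value` (S), the address of its target.
void resolveRelocation(std::vector<Section> &sections, const RelocEntry &re,
                       std::uint64_t value) {
  Section &s = sections[re.sectionID];
  const std::uint64_t target = value + static_cast<std::uint64_t>(re.addend);
  if (re.type == RelocType::Abs64) {
    assert(re.offset + 8 <= s.bytes.size());
    std::memcpy(s.bytes.data() + re.offset, &target, sizeof target); // S + A
  } else {
    assert(re.offset + 4 <= s.bytes.size());
    const std::uint64_t place = s.loadAddress + re.offset;           // P
    const std::int64_t delta = static_cast<std::int64_t>(target - place);
    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    const std::int32_t v = static_cast<std::int32_t>(delta);         // S + A - P
    std::memcpy(s.bytes.data() + re.offset, &v, sizeof v);
  }
}

// Pending relocations, keyed by the section their target symbol lives in.
using RelocMap = std::map<std::size_t, std::vector<RelocEntry>>;

// Resolve every pending entry, then forget the list, as the excerpt does.
void resolveRelocations(std::vector<Section> &sections, RelocMap &pending) {
  for (auto &kv : pending) {
    const std::uint64_t addr = sections[kv.first].loadAddress;
    for (const RelocEntry &re : kv.second)
      resolveRelocation(sections, re, addr);
  }
  pending.clear();
}
```

For example, with a text section at 0x1000 and a data section at 0x2000, an Abs64 entry with addend 8 that targets the data section writes 0x2008 into the text section.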

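Once relocation is complete, the section data is handed to RTDyldMemoryManager::registerEHFrames, which lets the memory manager call whatever functions it needs (for example, to register the EH frame information).

Finally, MemMgr->finalizeMemory() (SectionMemoryManager::finalizeMemory) is called to set the final permissions on the memory pages.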
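The reason permissions are set last is that the loader needs writable memory while it copies sections and patches relocations, and only afterwards can write access be dropped. The sketch below shows that "write first, protect later" pattern with plain POSIX calls. It is an illustration under assumptions, not how SectionMemoryManager is implemented: loadThenProtect is a hypothetical helper, it runs only on POSIX systems, and it switches the pages to read-only, whereas a code section would request PROT_READ | PROT_EXEC instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>
#include <vector>

// Copies `contents` into fresh writable pages, then drops write permission.
// Returns true if every step succeeded. POSIX only.
bool loadThenProtect(const std::vector<std::uint8_t> &contents) {
  assert(!contents.empty());
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  const std::size_t size = (contents.size() + page - 1) / page * page;

  // Step 1: allocate read/write memory, which a loader needs while it is
  // still copying section data and patching relocations.
  void *mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED)
    return false;
  std::memcpy(mem, contents.data(), contents.size());

  // Step 2: "finalize" by switching the pages to their final permissions.
  bool ok = mprotect(mem, size, PROT_READ) == 0;

  // The contents must still be readable after the permission change.
  ok = ok && std::memcmp(mem, contents.data(), contents.size()) == 0;

  munmap(mem, size);
  return ok;
}
```

Calling loadThenProtect with a few bytes exercises the whole sequence: allocate, copy, protect, verify, and release.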


October 22-23, 2020
