IR hierarchy
Module -> Function -> BasicBlock -> Instruction
Tips:
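1. Every instruction lives inside a Function. Local values, including instruction results, are named with a leading % (global values use @). Metadata is not an instruction. A struct's variable names exist only in a debug build (compiled with -g) and have to be extracted from the metadata.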
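2. getelementptr can appear as a constant expression, that is, a ConstantExpr whose getOpcode() is Instruction::GetElementPtr. ConstantExpr is a subclass of Constant. Constants usually reach you as operands of an instruction, so to work with one you first get the operand and then check whether it is a ConstantExpr with if (llvm::isa<llvm::ConstantExpr>(ptr)).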
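3. In general, llvm::Value* ptr = instr->getOperand(0); gives you a Value (PS: as Wan-jie says, everything in IR is a Value). You then keep testing its type and downcast it to the concrete class: if (llvm::isa<llvm::ConstantExpr>(ptr)) llvm::ConstantExpr* constExpr = llvm::dyn_cast<llvm::ConstantExpr>(ptr); Since dyn_cast returns nullptr when the type does not match, the separate isa check can be folded into the dyn_cast. A Value cannot be cast to an arbitrary subclass; the cast is valid only when the object really is an instance of that class. Why downcast at all? Because each class has its own member functions. Once you hold a ConstantExpr, you can still pass it wherever a Value is expected, or call the member functions it inherits from its parent classes, without any further cast.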
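4. On writing your own helper functions with LLVM (my work only uses the functions in the IR library): get familiar with which header in the IR directory is likely to contain the function you need. For example, to operate on an instruction, look in Instruction.h or Instructions.h first, and if you find nothing there, check the parent or child classes. Suppose I need to create a GlobalVariable. I need to know that a GlobalVariable lives in the Module's GlobalList, but I found no matching create function in Module.h. In GlobalVariable.h in the IR directory, however, you can simply new a GlobalVariable (PS: IRBuilder.h contains a large number of Create functions that are worth borrowing from). The constructors are declared as follows: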
/// GlobalVariable ctor - If a parent module is specified, the global is
/// automatically inserted into the end of the specified modules global list.
GlobalVariable(Type *Ty, bool isConstant, LinkageTypes Linkage,
               Constant *Initializer = nullptr, const Twine &Name = "",
               ThreadLocalMode = NotThreadLocal, unsigned AddressSpace = 0,
               bool isExternallyInitialized = false);
/// GlobalVariable ctor - This creates a global and inserts it before the
/// specified other global.
GlobalVariable(Module &M, Type *Ty, bool isConstant, LinkageTypes Linkage,
               Constant *Initializer, const Twine &Name = "",
               GlobalVariable *InsertBefore = nullptr,
               ThreadLocalMode = NotThreadLocal, unsigned AddressSpace = 0,
               bool isExternallyInitialized = false);
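Read the parameters it requires and the usage comments carefully. To understand it properly you usually also need to look up the full function body in the corresponding .cpp file (in the LLVM source tree the GlobalVariable constructors are implemented in lib/IR/Globals.cpp).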
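Here is my own example of creating a GlobalVariable. It uses the second constructor: init is the initializer, "i" is the name, and gv is the existing global that the new one is inserted before.
GlobalVariable *gvar_new = new GlobalVariable(*module_, vir_op->getType(), false, GlobalValue::InternalLinkage, init, "i", gv);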
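Example code
The code below is a simple pass over a .bc file that I read in: it iterates over the instructions and extracts some information from each. The goal is to store the GlobalVariables it finds in a set.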
std::set<llvm::GlobalVariable *> global_vars;
// In runOnModule: simply iterate over the function list and inspect every instruction.
for (llvm::Function &F : *module_) {
  for (llvm::BasicBlock &bb : F) {
    for (llvm::Instruction &inst : bb) {
      llvm::Instruction *instr = &inst;
      outs() << instr->getName() << " " << instr->getOpcode() << "\n";
      unsigned operand_num = instr->getNumOperands();
      outs() << "total " << operand_num << " operands" << "\n";
      for (unsigned idx = 0; idx < operand_num; ++idx) {
        outs() << "operand " << idx + 1 << " is " << *instr->getOperand(idx) << "\n";
      }
      outs() << "IR :" << *instr << "\n\n";
      // Instructions such as `ret void` have no operands, so getOperand(0) would be invalid.
      if (operand_num == 0)
        continue;
      // Goal: treat operand 0 as a ConstantExpr to get the address of a struct element.
      llvm::Value *ptr = instr->getOperand(0);
      // dyn_cast returns nullptr when ptr is not a ConstantExpr.
      if (auto *constExpr = llvm::dyn_cast<llvm::ConstantExpr>(ptr)) {
        unsigned expr_num = constExpr->getNumOperands();
        outs() << "Expr_num :" << expr_num << "\n";
        // Print only the operands that exist; not every ConstantExpr has three.
        for (unsigned k = 0; k < expr_num; ++k) {
          outs() << "op" << k << " of const expr :" << *constExpr->getOperand(k) << "\n";
        }
        // For a getelementptr expression, operand 0 is the base pointer.
        if (auto *global_variable = llvm::dyn_cast<llvm::GlobalVariable>(constExpr->getOperand(0))) {
          global_vars.insert(global_variable);
          // llvm::Constant *global_var_init = global_variable->getInitializer();
        }
      }
    }
  }
}
getgGlobalVar(global_vars);
}