SymCC Source Code Analysis

Preface

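There is almost no write-up of the SymCC source code online, and it is fairly hard to read, so I am keeping notes here.
This touches on CMake syntax and LLVM, both of which I have only just started learning and pieced together bit by bit, so some descriptions may be inaccurate.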

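A compiler is usually split into three parts: frontend, optimizer, and backend.
LLVM follows the same design. Compiling a source file with LLVM goes through: preprocessing -> lexical analysis -> tokens -> parsing -> AST -> code generation -> LLVM IR -> optimization -> assembly -> object file -> linking -> executable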
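When we talk about instrumentation with the LLVM toolchain, we mean inserting our own code into the IR, typically through the IRBuilder (IRB) API.
To operate on the IR, LLVM provides passes. As the name suggests, a pass makes one traversal over the IR; you override some of its methods to do whatever you need.
LLVM IR has three equivalent representations: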

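  1. .ll format: human-readable text.
  2. .bc format: binary bitcode, suited to storage and machine processing.
  3. The in-memory representation.

How are the .ll and .bc formats generated and converted into one another?
.c -> .ll: clang -emit-llvm -S a.c -o a.ll
.c -> .bc: clang -emit-llvm -c a.c -o a.bc
.ll -> .bc: llvm-as a.ll -o a.bc
.bc -> .ll: llvm-dis a.bc -o a.ll
.bc -> .s: llc a.bc -o a.s

By analogy, a .ll file relates to a .bc file the way assembly text relates to machine code: the same content, once readable and once binary.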


CMakeLists.txt

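Start with the top-level CMakeLists.txt.
The external project lives under runtime; we will look at that directory shortly.

The build produces a shared object (.so). The library type is chosen by the keyword after the target name in add_library: STATIC (a .a archive), SHARED, or MODULE. Remember this .so: it is the instrumentation pass we are after.
The CMakeLists.txt under test adds the symbolic run-time and the symbolizing pass as dependencies.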

Next is the CMakeLists.txt under runtime.
Then comes the CMakeLists.txt of qsym_backend (the QSYM backend is used here). The most important part is the codegen step, which generates the two source files listed below.

The generated files are expr_builder__gen.cpp and expr__gen.cpp.

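These files contain the generated symbolic expression code and the expression builder. They implement the symbolic operations that the Symbolizer instrumentation inserts calls to (more on this when we get to the code).
We can also see the link libraries being added. SymCC itself essentially only provides a run-time library that wraps the symbolic functions for each instruction; the actual expression handling and constraint generation happen inside QSYM.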

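The library that symbolizes code at compile time lives under the compiler directory; look at the symcc.in file there.
It shows that clang is wrapped as symcc and that the pass it loads is SymCC's own. Next, the instrumentation code.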

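Instrumentation code

libSymbolize.so

Located under compiler:
Main.cpp: registers the pass
Pass.cpp: overrides the pass methods
Runtime.cpp: the run-time library functions that the instrumentation calls
Symbolizer.cpp: the actual symbolization logic (built on InstVisitor)

In effect, calls into the symbolic run-time are inserted at compile time without changing the program's control flow, and the backend handles those calls when the program runs. See the paper for details.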

pass.cpp

The code below is annotated with comments. It may still read as abstract, so a diagram follows later.

char SymbolizePass::ID = 0; // the value is irrelevant; LLVM identifies the pass by the address of ID

bool SymbolizePass::doInitialization(Module &M) {
  DEBUG(errs() << "Symbolizer module init\n");

  // Redirect calls to external functions to the corresponding wrappers and
  // rename internal functions.
  for (auto &function : M.functions()) {
    auto name = function.getName();
    if (isInterceptedFunction(function))
      function.setName(name + "_symbolized"); // rename intercepted libc functions (malloc, memcpy, read, ...) so that calls reach the wrappers
  }

  // Insert a constructor that initializes the runtime and any globals.
  Function *ctor; // kSymCtorName = __sym_ctor
  std::tie(ctor, std::ignore) = createSanitizerCtorAndInitFunctions(
      M, kSymCtorName, "_sym_initialize", {}, {}); // std::ignore discards the second result; only ctor is kept
  appendToGlobalCtors(M, ctor, 0); // register ctor as a global constructor (priority 0) so that _sym_initialize runs before main
  //Function *Ctor = createSanitizerCtor(M, CtorName);
  return true;
}

bool SymbolizePass::runOnFunction(Function &F) {
  auto functionName = F.getName();
  if (functionName == kSymCtorName)
    return false;

  DEBUG(errs() << "Symbolizing function ");
  DEBUG(errs().write_escaped(functionName) << '\n');

  SmallVector<Instruction *, 0> allInstructions;
  allInstructions.reserve(F.getInstructionCount());
  for (auto &I : instructions(F))
    allInstructions.push_back(&I);

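  Symbolizer symbolizer(*F.getParent()); // construct the symbolizer for the enclosing module
  symbolizer.symbolizeFunctionArguments(F); // obtain symbolic expressions for the function arguments

  for (auto &basicBlock : F)
    symbolizer.insertBasicBlockNotification(basicBlock); // notify the runtime on basic block entry

  for (auto *instPtr : allInstructions)
    symbolizer.visit(instPtr); // instrument each original instruction

  symbolizer.finalizePHINodes(); // complete the PHI nodes once all instructions are processed
  symbolizer.shortCircuitExpressionUses(); // skip symbolic computations at run time when all inputs are concrete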

  // DEBUG(errs() << F << '\n');
  assert(!verifyFunction(F, &errs()) &&
         "SymbolizePass produced invalid bitcode");

  return true;
}

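pass.cpp mainly relies on Runtime.h and Symbolizer.h.
Runtime sets up the run-time functions that the instrumentation will call.
Symbolizer walks the instructions and symbolizes them.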
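One detail of runOnFunction is worth spelling out: the pass first copies every instruction pointer into allInstructions and only then visits them, presumably because visiting inserts new instructions into the same function. Below is a minimal sketch of that snapshot-then-mutate pattern. It uses a std::list of strings as a stand-in for a function body; the names InstList, instrument, and the "notify(...)" text are illustrative assumptions, not SymCC or LLVM APIs.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Stand-in for a function body: each string plays the role of one instruction.
using InstList = std::list<std::string>;

// Insert a hypothetical "notify(<inst>)" entry before every original
// instruction. Taking the snapshot first guarantees that the newly inserted
// entries are never visited themselves.
inline void instrument(InstList &body) {
  std::vector<InstList::iterator> snapshot;
  snapshot.reserve(body.size());
  for (auto it = body.begin(); it != body.end(); ++it)
    snapshot.push_back(it);

  // std::list::insert does not invalidate existing iterators, so the
  // snapshot stays valid while the list grows.
  for (auto it : snapshot)
    body.insert(it, "notify(" + *it + ")");
}
```

Iterating the list directly while inserting would work here too, but with the snapshot it is obvious at a glance that only the original entries are processed.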

runtime.cpp

Only an excerpt is shown here.


template <typename... ArgsTy>
SymFnT import(llvm::Module &M, llvm::StringRef name, llvm::Type *ret,
              ArgsTy... args) {
#if LLVM_VERSION_MAJOR >= 9 && LLVM_VERSION_MAJOR < 11
  return M.getOrInsertFunction(name, ret, args...).getCallee();
#else
  // Look the function up in the module's symbol table; if it is missing,
  // insert a declaration with the given return and argument types.
  return M.getOrInsertFunction(name, ret, args...);
#endif
}

} // namespace

Runtime::Runtime(Module &M) { // fill in the function handles declared in Runtime.h; the calls themselves are emitted later with CreateCall
  IRBuilder<> IRB(M.getContext());
  auto *intPtrType = M.getDataLayout().getIntPtrType(M.getContext());
  auto *ptrT = IRB.getInt8PtrTy();
  auto *int8T = IRB.getInt8Ty();
  auto *voidT = IRB.getVoidTy();

  buildInteger = import(M, "_sym_build_integer", ptrT, IRB.getInt64Ty(), int8T);
  buildInteger128 = import(M, "_sym_build_integer128", ptrT, IRB.getInt64Ty(),
                           IRB.getInt64Ty());
  buildFloatToFloat =
      import(M, "_sym_build_float_to_float", ptrT, ptrT, IRB.getInt1Ty());
  buildBitsToFloat =
      import(M, "_sym_build_bits_to_float", ptrT, ptrT, IRB.getInt1Ty());
  buildFloatToBits = import(M, "_sym_build_float_to_bits", ptrT, ptrT);
  buildFloatToSignedInt =
      import(M, "_sym_build_float_to_signed_integer", ptrT, ptrT, int8T);
  buildFloatToUnsignedInt =
      import(M, "_sym_build_float_to_unsigned_integer", ptrT, ptrT, int8T);
  buildFloatAbs = import(M, "_sym_build_fp_abs", ptrT, ptrT);
  buildBoolAnd = import(M, "_sym_build_bool_and", ptrT, ptrT, ptrT);
  buildBoolOr = import(M, "_sym_build_bool_or", ptrT, ptrT, ptrT);
  buildBoolXor = import(M, "_sym_build_bool_xor", ptrT, ptrT, ptrT);
  buildBoolToBits = import(M, "_sym_build_bool_to_bits", ptrT, ptrT, int8T);
  pushPathConstraint = import(M, "_sym_push_path_constraint", voidT, ptrT,
                              IRB.getInt1Ty(), intPtrType);

  setParameterExpression =
      import(M, "_sym_set_parameter_expression", voidT, int8T, ptrT);
  getParameterExpression =
      import(M, "_sym_get_parameter_expression", ptrT, int8T);
  setReturnExpression = import(M, "_sym_set_return_expression", voidT, ptrT);
  getReturnExpression = import(M, "_sym_get_return_expression", ptrT);

These are the names of the run-time functions used for symbolization.

bool isInterceptedFunction(const Function &f) {
  static const StringSet<> kInterceptedFunctions = {
      "malloc",   "calloc",  "mmap",    "mmap64", "open",   "read",    "lseek",
      "lseek64",  "fopen",   "fopen64", "fread",  "fseek",  "fseeko",  "rewind",
      "fseeko64", "getc",    "ungetc",  "memcpy", "memset", "strncpy", "strchr",
      "memcmp",   "memmove", "ntohl",   "fgets",  "fgetc", "getchar"};

  return (kInterceptedFunctions.count(f.getName()) > 0);
}

These are the functions that get intercepted; each one is renamed with the suffix _symbolized.
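To make the renaming concrete, here is a small standalone sketch of the same idea using only the standard library. It only mimics the name check and suffixing; isIntercepted and symbolizedName are hypothetical helper names, and the set holds just a few of the entries from kInterceptedFunctions.

```cpp
#include <cassert>
#include <set>
#include <string>

// A handful of names taken from kInterceptedFunctions; the real list is longer.
inline bool isIntercepted(const std::string &name) {
  static const std::set<std::string> kIntercepted = {"malloc", "memcpy",
                                                     "read", "fread"};
  return kIntercepted.count(name) > 0;
}

// Intercepted functions get the "_symbolized" suffix; all others keep
// their name.
inline std::string symbolizedName(const std::string &name) {
  return isIntercepted(name) ? name + "_symbolized" : name;
}
```

With this, symbolizedName("malloc") yields "malloc_symbolized", while a function outside the list, such as printf, is left untouched.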

[Figure: overview diagram showing how the files above fit together]

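Symbolizer.h

I recommend going straight to the LLVM documentation for InstVisitor.
My annotated code follows; the original English comments next to it are already quite good.
First, the .h file: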

// Instruction visitor: InstVisitor dispatches each instruction to the matching visitXXX method
class Symbolizer : public llvm::InstVisitor<Symbolizer> { // InstVisitor is the base class to use when different instruction types need different handling
public:
  explicit Symbolizer(llvm::Module &M) // initializes the members below
      : runtime(M), dataLayout(M.getDataLayout()), // data layout of the module's target platform
        ptrBits(M.getDataLayout().getPointerSizeInBits()), // pointer size in bits
        intPtrType(M.getDataLayout().getIntPtrType(M.getContext())) {} // integer type at least as wide as a pointer

  /// Insert code to obtain the symbolic expressions for the function arguments.
  void symbolizeFunctionArguments(llvm::Function &F);

  /// Insert a call to the run-time library to notify it of the basic block
  /// entry.
  void insertBasicBlockNotification(llvm::BasicBlock &B);
  // A PHI instruction selects a value based on which predecessor block control
  // came from. LLVM IR is in SSA form, so a value that may arrive from
  // different basic blocks is merged with a PHI node, and PHI nodes must sit
  // at the start of a basic block.
  void finalizePHINodes();
  // Skip symbolic computations at run time when all of their inputs are concrete.
  void shortCircuitExpressionUses();

  void handleIntrinsicCall(llvm::CallBase &I); // calls to LLVM intrinsics
  void handleInlineAssembly(llvm::CallInst &I); // inline assembly
  void handleFunctionCall(llvm::CallBase &I, llvm::Instruction *returnPoint);
  /// A symbolic input.
  struct Input {
    llvm::Value *concreteValue; // the concrete value
    unsigned operandIndex; // index of the operand within user
    llvm::Instruction *user; // the instruction that uses the value
    // Instruction is a subclass of Value and User; it mainly tracks its opcode and its parent BasicBlock.
    llvm::Value *getSymbolicOperand() const {
      return user->getOperand(operandIndex); // fetch the operand at that index
    }

    void replaceOperand(llvm::Value *newOperand) {
      user->setOperand(operandIndex, newOperand); // replace the operand
    }
  };

  /// A symbolic computation with its inputs.
  struct SymbolicComputation { // a symbolic computation and its inputs
    llvm::Instruction *firstInstruction = nullptr, *lastInstruction = nullptr;
    llvm::SmallVector<Input, kExpectedSymbolicArgumentsPerComputation> inputs;
    //kExpectedSymbolicArgumentsPerComputation = 2
    SymbolicComputation() = default;

    SymbolicComputation(llvm::Instruction *first, llvm::Instruction *last,
                        llvm::ArrayRef<Input> in)
        : firstInstruction(first), lastInstruction(last),
          inputs(in.begin(), in.end()) {} // set the first and last instruction and the inputs

    /// Append another symbolic computation to this one.
    ///
    /// The computation that is to be appended must occur after the one that
    /// this method is called on.
    void merge(const SymbolicComputation &other) {
      if (&other == this)
        return;
      // firstInstruction is taken from other only if this computation was
      // empty; lastInstruction is always updated because other comes later.
      if (firstInstruction == nullptr)
        firstInstruction = other.firstInstruction;
      lastInstruction = other.lastInstruction;

      for (const auto &input : other.inputs)
        inputs.push_back(input);
    }
    // overload operator<< for debug output
    friend llvm::raw_ostream &
    operator<<(llvm::raw_ostream &out,
               const Symbolizer::SymbolicComputation &computation) {
      out << "\nComputation starting at " << *computation.firstInstruction
          << "\n...ending at " << *computation.lastInstruction
          << "\n...with inputs:\n";
      for (const auto &input : computation.inputs) {
        out << '\t' << *input.concreteValue << '\n';
      }
      return out;
    }
  };
  /// Create an expression that represents the concrete value.
  llvm::CallInst *createValueExpression(llvm::Value *V, llvm::IRBuilder<> &IRB);

  /// Get the (already created) symbolic expression for a value.
  llvm::Value *getSymbolicExpression(llvm::Value *V) {
    auto exprIt = symbolicExpressions.find(V); // symbolicExpressions maps a concrete Value to the Value holding its symbolic expression
    return (exprIt != symbolicExpressions.end()) ? exprIt->second : nullptr;
  }
  // Same lookup, but a value without a symbolic expression yields a null pointer constant.
  llvm::Value *getSymbolicExpressionOrNull(llvm::Value *V) {
    auto *expr = getSymbolicExpression(V);
    if (expr == nullptr)
      return llvm::ConstantPointerNull::get( // constant null pointer of type i8*
          llvm::IntegerType::getInt8PtrTy(V->getContext()));
    return expr;
  }

  bool isLittleEndian(llvm::Type *type) { // true for a non-aggregate type on a little-endian target
    return (!type->isAggregateType() && dataLayout.isLittleEndian());
  }

  /// Like buildRuntimeCall, but the call is always generated.
  SymbolicComputation
  forceBuildRuntimeCall(llvm::IRBuilder<> &IRB, SymFnT function,
                        llvm::ArrayRef<std::pair<llvm::Value *, bool>> args);


  // Create a call to the given run-time function. Each argument is a
  // (value, bool) pair; the bool says whether the value is symbolic. If all
  // arguments are known to be concrete, no call is emitted and an empty
  // optional is returned.
  std::optional<SymbolicComputation>
  buildRuntimeCall(llvm::IRBuilder<> &IRB, SymFnT function,
                   llvm::ArrayRef<std::pair<llvm::Value *, bool>> args) {
    if (std::all_of(args.begin(), args.end(),
                    [this](std::pair<llvm::Value *, bool> arg) {
                      return (getSymbolicExpression(arg.first) == nullptr);
                    })) {
      return {};
    }

    return forceBuildRuntimeCall(IRB, function, args);
  }

  /// Convenience overload that treats all arguments as symbolic.
  std::optional<SymbolicComputation>
  buildRuntimeCall(llvm::IRBuilder<> &IRB, SymFnT function,
                   llvm::ArrayRef<llvm::Value *> symbolicArgs) {
    std::vector<std::pair<llvm::Value *, bool>> args;
    for (const auto &arg : symbolicArgs) {
      args.emplace_back(arg, true); // true marks the argument as symbolic
    }

    return buildRuntimeCall(IRB, function, args);
  }

  /// Register the result of the computation as the symbolic expression
  /// corresponding to the concrete value and record the computation for
  /// short-circuiting.
  void registerSymbolicComputation(const SymbolicComputation &computation,
                                   llvm::Value *concrete = nullptr) {
    if (concrete != nullptr)
      symbolicExpressions[concrete] = computation.lastInstruction; // map from concrete Value to symbolic expression Value
    expressionUses.push_back(computation); // computations to short-circuit later
  }
  // The overload below takes a std::optional, so the result of
  // buildRuntimeCall can be passed in directly; that is why the two are
  // separate functions.
  /// Convenience overload for chaining with buildRuntimeCall.
  void registerSymbolicComputation(
      const std::optional<SymbolicComputation> &computation,
      llvm::Value *concrete = nullptr) {
    if (computation)
      registerSymbolicComputation(*computation, concrete);
  }
  /// Generate code that makes the solver try an alternative value for V.
  void tryAlternative(llvm::IRBuilder<> &IRB, llvm::Value *V);

  /// Helper to use a pointer to a host object as integer (truncating!).
  ///
  /// Note that the conversion will truncate the most significant bits of the
  /// pointer if the host uses larger addresses than the target. Therefore, use
  /// this function only when such loss is acceptable (e.g., when generating
  /// site identifiers to be passed to the backend, where collisions of the
  /// least significant bits are reasonably unlikely).
  ///
  /// Why not do a lossless conversion and make the backend accept 64-bit
  /// integers?
  ///
  /// 1. Performance: 32-bit architectures will process 32-bit values faster
  /// than 64-bit values.
  ///
  /// 2. Pragmatism: Changing the backend to accept and process 64-bit values
  /// would require modifying code that we don't control (in the case of Qsym).
  llvm::ConstantInt *getTargetPreferredInt(void *pointer) {
    return llvm::ConstantInt::get(intPtrType,
                                  reinterpret_cast<uint64_t>(pointer));
  }
  /// Compute the offset of a member in a (possibly nested) aggregate.
  uint64_t aggregateMemberOffset(llvm::Type *aggregateType,
                                 llvm::ArrayRef<unsigned> indices) const;
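
The early return in buildRuntimeCall is what keeps the instrumentation cheap: when no argument has a symbolic expression, nothing is emitted at all. The following is a rough standalone sketch of that "emit only if something is symbolic" decision. It assumes a toy model in which a null pointer means "concrete"; DemoSymExpr and buildCallIfSymbolic are made-up names, and the returned string merely describes the call that would be generated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in: a null pointer means "concrete", anything else is the name
// of a symbolic expression.
using DemoSymExpr = const char *;

// Returns a description of the call that would be emitted, or an empty
// optional when every argument is concrete.
inline std::optional<std::string>
buildCallIfSymbolic(const std::string &fn,
                    const std::vector<DemoSymExpr> &args) {
  if (std::all_of(args.begin(), args.end(),
                  [](DemoSymExpr e) { return e == nullptr; }))
    return std::nullopt;

  std::string call = fn + "(";
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (i != 0)
      call += ", ";
    call += args[i] ? args[i] : "null";
  }
  return call + ")";
}
```

Returning std::optional lets the caller chain the result into a registration step that simply does nothing for the empty case, which mirrors how the optional overload of registerSymbolicComputation is used.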

libSymRuntime.so

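Located under runtime.
Think of it as the set of functions that the instrumented program calls at run time. The actual constraint solving uses an SMT solver, in solver.cpp under qsym.
Shadow.h: shadow memory
LibcWrappers.cpp: wrappers around libc functions
Config.cpp: configuration
RuntimeCommon.h: definitions and implementations of the symbolic functions, matching the generated code (ExprBuilder)
GarbageCollection.h: garbage collection, which reclaims memory that is no longer needed
It uses Expr, which comes from the QSYM backend. There is one class whose definition I could not find, and searching did not tell me whether it is a built-in one:
ExprRef. I do not know what its members are, so I could not take the analysis any further.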

Shadow.h


#ifndef SHADOW_H
#define SHADOW_H

#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>
#include <map>

#include <Runtime.h> // needed for SymExpr and _sym_build_integer

#include <z3.h>

//
// This file is dedicated to the management of shadow memory.
//
// We manage shadows at page granularity. Since the shadow for each page is
// malloc'ed and thus at an unpredictable location in memory, we need special
// handling for memory allocations that cross page boundaries. This header
// provides iterators over shadow memory that automatically handle jumps between
// memory pages (and thus shadow regions). They should work with the C++
// standard library.
//
// We represent shadowed memory as a sequence of 8-bit expressions. The
// iterators therefore expose the shadow in the form of byte expressions.
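// Note: the shadow here stores one symbolic expression per byte of program
// memory, and the iterators take care of regions that cross page boundaries.
// For comparison, ASan's shadow memory maps 8 application bytes to 1 shadow
// byte and uses red zones around allocations.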

constexpr uintptr_t kPageSize = 4096;

/// Compute the corresponding page address.
constexpr uintptr_t pageStart(uintptr_t addr) { // start address of the page; a page is 4096 bytes (4 KiB)
  return (addr & ~(kPageSize - 1));
}

/// Compute the corresponding offset into the page.
constexpr uintptr_t pageOffset(uintptr_t addr) { // offset within that page
  return (addr & (kPageSize - 1));
}

/// A mapping from page addresses to the corresponding shadow regions. Each
/// shadow is large enough to hold one expression per byte on the shadowed page.
extern std::map<uintptr_t, SymExpr *> g_shadow_pages; // page address -> array of expressions for that page

/// An iterator that walks over the shadow bytes corresponding to a memory
/// region. If there is no shadow for any given memory address, it just returns
/// null.
class ReadShadowIterator
    : public std::iterator<std::bidirectional_iterator_tag, SymExpr> {
public:
  explicit ReadShadowIterator(uintptr_t address)
      : std::iterator<std::bidirectional_iterator_tag, SymExpr>(),
        address_(address), shadow_(getShadow(address)) {} // store the address and look up its shadow
  // increment and decrement operators
  ReadShadowIterator &operator++() {
    auto previousAddress = address_++;
    if (shadow_ != nullptr) // the current page has a shadow
      shadow_++; // so advance the shadow pointer too
    if (pageStart(address_) != pageStart(previousAddress)) // crossed a page boundary
      shadow_ = getShadow(address_); // look the shadow up again; it may still be null
    return *this;
  }

  ReadShadowIterator &operator--() { // mirror image of operator++
    auto previousAddress = address_--;
    if (shadow_ != nullptr)
      shadow_--;
    if (pageStart(address_) != pageStart(previousAddress))
      shadow_ = getShadow(address_);
    return *this;
  }

  SymExpr operator*() {
    assert((shadow_ == nullptr || *shadow_ == nullptr ||
            _sym_bits_helper(*shadow_) == 8) &&
           "Shadow memory always represents bytes"); // every shadow entry is an 8-bit expression, one per byte of memory
    return shadow_ != nullptr ? *shadow_ : nullptr;
  }

  bool operator==(const ReadShadowIterator &other) const {
    return (address_ == other.address_); // iterators are equal when they point at the same address
  }

  bool operator!=(const ReadShadowIterator &other) const {
    return !(*this == other); // defined in terms of operator==
  }

protected:
  static SymExpr *getShadow(uintptr_t address) {
    if (auto shadowPageIt = g_shadow_pages.find(pageStart(address)); // look up the page containing the address
        shadowPageIt != g_shadow_pages.end()) // found
      return shadowPageIt->second + pageOffset(address); // shadow base of the page plus the offset

    return nullptr; // no shadow for this page
  }

  uintptr_t address_; // current memory address
  SymExpr *shadow_; // pointer to the shadow entry for address_ (page shadow base + offset), or null
};

/// Like ReadShadowIterator, but return an expression for the concrete memory
/// value if a region does not have a shadow.
class NonNullReadShadowIterator : public ReadShadowIterator { // extends the read iterator so that it never yields null
public:
  explicit NonNullReadShadowIterator(uintptr_t address)
      : ReadShadowIterator(address) {}

  SymExpr operator*() {
    if (auto *symbolicResult = ReadShadowIterator::operator*()) // a shadow expression exists
      return symbolicResult;

    return _sym_build_integer(*reinterpret_cast<const uint8_t *>(address_), 8); // no shadow: build an 8-bit constant from the concrete byte in memory
  }
};

/// An iterator that walks over the shadow corresponding to a memory region and
/// exposes it for modification. If there is no shadow yet, it creates a new
/// one.
class WriteShadowIterator : public ReadShadowIterator {
public:
  WriteShadowIterator(uintptr_t address) : ReadShadowIterator(address) {
    shadow_ = getOrCreateShadow(address); // create the page shadow if it does not exist yet
  }
  // increment and decrement operators
  WriteShadowIterator &operator++() {
    auto previousAddress = address_++;
    shadow_++;
    if (pageStart(address_) != pageStart(previousAddress))
      shadow_ = getOrCreateShadow(address_);
    return *this;
  }

  WriteShadowIterator &operator--() {
    auto previousAddress = address_--;
    shadow_--;
    if (pageStart(address_) != pageStart(previousAddress))
      shadow_ = getOrCreateShadow(address_);
    return *this;
  }

  SymExpr &operator*() { return *shadow_; }

protected:
  static SymExpr *getOrCreateShadow(uintptr_t address) { // reuse the existing shadow, or allocate a zeroed one for the page
    if (auto *shadow = getShadow(address))
      return shadow;

    auto *newShadow =
        static_cast<SymExpr *>(malloc(kPageSize * sizeof(SymExpr)));
    memset(newShadow, 0, kPageSize * sizeof(SymExpr));
    g_shadow_pages[pageStart(address)] = newShadow;
    return newShadow + pageOffset(address);
  }
};

/// A view on shadow memory that exposes read-only functionality.
struct ReadOnlyShadow { // read-only view over the shadow of [addr, addr + len)
  template <typename T>
  ReadOnlyShadow(T *addr, size_t len)
      : address_(reinterpret_cast<uintptr_t>(addr)), length_(len) {} // store the start address (as an integer) and the length

  ReadShadowIterator begin() const { return ReadShadowIterator(address_); } // iterator at the first byte
  ReadShadowIterator end() const {
    return ReadShadowIterator(address_ + length_); // iterator one past the last byte
  }

  NonNullReadShadowIterator begin_non_null() const {
    return NonNullReadShadowIterator(address_); // non-null iterator at the first byte
  }

  NonNullReadShadowIterator end_non_null() const {
    return NonNullReadShadowIterator(address_ + length_); // non-null iterator one past the last byte
  }
  }

  uintptr_t address_;
  size_t length_;
};

/// A view on shadow memory that allows modifications.
template <typename T> struct ReadWriteShadow {
  ReadWriteShadow(T *addr, size_t len)
      : address_(reinterpret_cast<uintptr_t>(addr)), length_(len) {} // store the start address and the length

  WriteShadowIterator begin() { return WriteShadowIterator(address_); } // creates the shadow for this address if missing
  WriteShadowIterator end() { return WriteShadowIterator(address_ + length_); } // same for the end address

  uintptr_t address_;
  size_t length_;
};

/// Check whether the indicated memory range is concrete, i.e., there is no
/// symbolic byte in the entire region.
template <typename T> bool isConcrete(T *addr, size_t nbytes) {
  // Fast path for allocations within one page.
  auto byteBuf = reinterpret_cast<uintptr_t>(addr);
  if (pageStart(byteBuf) == pageStart(byteBuf + nbytes) &&
      !g_shadow_pages.count(pageStart(byteBuf))) // the page has no shadow at all, so nothing is symbolic
    return true;

  ReadOnlyShadow shadow(addr, nbytes); // otherwise check every byte in the range
  return std::all_of(shadow.begin(), shadow.end(),
                     [](SymExpr expr) { return (expr == nullptr); }); // std::all_of is true only if the predicate holds for every element
}

#endif
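
To see the page arithmetic and the lazy per-page allocation in isolation, here is a simplified model of the same scheme. It is only a sketch under stated assumptions: an int stands in for SymExpr (0 meaning "concrete"), a std::vector replaces the malloc'ed page shadow, and DemoShadow with its members is a hypothetical class, not part of SymCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

constexpr std::uintptr_t kDemoPageSize = 4096;

constexpr std::uintptr_t demoPageStart(std::uintptr_t addr) {
  return addr & ~(kDemoPageSize - 1);
}
constexpr std::uintptr_t demoPageOffset(std::uintptr_t addr) {
  return addr & (kDemoPageSize - 1);
}

// One int "expression" slot per byte of memory; 0 means concrete.
class DemoShadow {
public:
  // Read: 0 when the page has no shadow (compare ReadShadowIterator).
  int get(std::uintptr_t addr) const {
    auto it = pages_.find(demoPageStart(addr));
    return it == pages_.end() ? 0 : it->second[demoPageOffset(addr)];
  }

  // Write: allocate the page's shadow on first use (compare
  // WriteShadowIterator::getOrCreateShadow).
  void set(std::uintptr_t addr, int expr) {
    auto &page = pages_[demoPageStart(addr)];
    if (page.empty())
      page.assign(kDemoPageSize, 0);
    page[demoPageOffset(addr)] = expr;
  }

  // True if no byte in [addr, addr + n) has a symbolic entry.
  bool isConcrete(std::uintptr_t addr, std::size_t n) const {
    for (std::size_t i = 0; i < n; ++i)
      if (get(addr + i) != 0)
        return false;
    return true;
  }

  std::size_t pageCount() const { return pages_.size(); }

private:
  std::map<std::uintptr_t, std::vector<int>> pages_;
};
```

For example, address 0x1234 lies on the page starting at 0x1000 with offset 0x234, and a four-byte range starting at 0x1ffe spans two pages, which is exactly the case the iterators in Shadow.h handle by re-resolving the shadow at each page boundary.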

LibcWrappers.cpp

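Below are the wrappers for the intercepted functions. They carry the _symbolized suffix, which matches the renaming done by the instrumentation pass.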

#define SYM(x) x##_symbolized // appends the _symbolized suffix to a function name
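
The SYM macro relies on preprocessor token pasting. As a small illustration, the sketch below defines a wrapper through the macro; demo_strlen is a made-up function name chosen for this example and is not one of SymCC's wrappers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#define SYM(x) x##_symbolized

// Expands to: inline std::size_t demo_strlen_symbolized(const char *s)
inline std::size_t SYM(demo_strlen)(const char *s) {
  // A real wrapper would also update the symbolic state here; this sketch
  // only forwards to the libc function.
  return std::strlen(s);
}
```

After preprocessing, the function can be called either as SYM(demo_strlen)("abc") or directly as demo_strlen_symbolized("abc"), which is the name the renamed call sites in the instrumented program refer to.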

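Next comes this template function. It is easier to follow alongside the solver (in QSYM), which decides whether to solve a constraint depending on, among other things, whether it is considered interesting.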

template <typename V, typename F>
void tryAlternative(V value, SymExpr valueExpr, F caller) { // ask the solver to try a value other than the given one
  if (valueExpr) {
    _sym_push_path_constraint( // push a path constraint
        _sym_build_equal(valueExpr,                                  // _sym_build_equal is a wrapper; in the QSYM backend it ends up calling createEqual
                         _sym_build_integer(value, sizeof(value) * 8)), // the constraint: valueExpr equals the concrete value
        true, reinterpret_cast<uintptr_t>(caller));
  }
}

Config.cpp

void loadConfig() {
  auto *fullyConcrete = getenv("SYMCC_NO_SYMBOLIC_INPUT");
  if (fullyConcrete != nullptr)
    g_config.fullyConcrete = checkFlagString(fullyConcrete);
// SYMCC_NO_SYMBOLIC_INPUT=1: the input is never treated as symbolic
  auto *outputDir = getenv("SYMCC_OUTPUT_DIR");
  if (outputDir != nullptr)
    g_config.outputDir = outputDir;
// SYMCC_OUTPUT_DIR (default "/tmp/output")
  auto *inputFile = getenv("SYMCC_INPUT_FILE");
  if (inputFile != nullptr)
    g_config.inputFile = inputFile;

  auto *logFile = getenv("SYMCC_LOG_FILE");
  if (logFile != nullptr)
    g_config.logFile = logFile;
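// SYMCC_LOG_FILE (default empty): when set to a file name, SymCC creates the
// file (or overwrites any existing file!) and uses it to log backend activity,
// including solver output (simple backend only).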
  auto *pruning = getenv("SYMCC_ENABLE_LINEARIZATION");
  if (pruning != nullptr)
    g_config.pruning = checkFlagString(pruning);
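// SYMCC_ENABLE_LINEARIZATION (default 0): enables QSYM's basic-block pruning, a call-stack-aware strategy that reduces solver queries when code is executed repeatedly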
  auto *aflCoverageMap = getenv("SYMCC_AFL_COVERAGE_MAP");
  if (aflCoverageMap != nullptr)
    g_config.aflCoverageMap = aflCoverageMap;
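// SYMCC_AFL_COVERAGE_MAP: load the map before executing the target program and use it to skip solver queries for paths that are already covered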
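
The pattern in loadConfig is simple: each setting keeps its default unless the matching environment variable is set. Here is a self-contained sketch of that pattern. It rests on assumptions: parseDemoFlag is a hypothetical stand-in for checkFlagString (whose exact accepted values are not shown in the excerpt), DemoConfig covers only two of the settings, and the lookup function is injectable so the logic can be exercised without touching the real environment.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical flag parser; the set of accepted spellings is an assumption.
inline bool parseDemoFlag(std::string value) {
  std::transform(value.begin(), value.end(), value.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  if (value == "1" || value == "on" || value == "yes")
    return true;
  if (value.empty() || value == "0" || value == "off" || value == "no")
    return false;
  throw std::runtime_error("unknown flag value: " + value);
}

struct DemoConfig {
  bool fullyConcrete = false;
  std::string outputDir = "/tmp/output";
};

using EnvLookup = std::function<const char *(const char *)>;

// Read overrides through lookup (std::getenv by default); variables that
// are not set leave the defaults untouched.
inline DemoConfig loadDemoConfig(
    const EnvLookup &lookup = [](const char *name) -> const char * {
      return std::getenv(name);
    }) {
  DemoConfig cfg;
  if (const char *v = lookup("SYMCC_NO_SYMBOLIC_INPUT"))
    cfg.fullyConcrete = parseDemoFlag(v);
  if (const char *v = lookup("SYMCC_OUTPUT_DIR"))
    cfg.outputDir = v;
  return cfg;
}
```

Calling loadDemoConfig() with no argument reads the process environment, so running a program with SYMCC_OUTPUT_DIR=/tmp/demo set would change only outputDir in this sketch.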

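Roughly speaking, what follows is the implementation (the backend) of the calls inserted by the instrumentation.
Because QSYM is fairly complex, it is easier to read the simple Z3 backend instead. It implements the symbolic functions directly and uses the simplest strategy, which is to negate the constraint.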
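
Two patterns recur throughout the simple backend listed below: a macro that stamps out one builder function per binary operation, and a registry that records every expression handed out. The sketch here reproduces both without Z3. It is an illustration under assumptions: DemoExpr is a plain integer standing in for Z3_ast, and demo_mk_add, demo_mk_mul, demoRegistry, and the demo_build_* functions are invented names.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Plain integer standing in for an expression handle.
using DemoExpr = std::uint64_t;

// Every expression handed out is recorded once, similar in spirit to
// allocatedExpressions in the simple backend.
inline std::set<DemoExpr> &demoRegistry() {
  static std::set<DemoExpr> registry;
  return registry;
}

inline DemoExpr demoRegister(DemoExpr e) {
  demoRegistry().insert(e); // inserting an existing element is a no-op
  return e;
}

// Stand-ins for the underlying "make node" primitives.
inline DemoExpr demo_mk_add(DemoExpr a, DemoExpr b) { return a + b; }
inline DemoExpr demo_mk_mul(DemoExpr a, DemoExpr b) { return a * b; }

// One macro invocation defines one builder that wraps a primitive and
// registers its result.
#define DEF_DEMO_BINARY_BUILDER(name, prim)                                    \
  inline DemoExpr demo_build_##name(DemoExpr a, DemoExpr b) {                  \
    return demoRegister(demo_mk_##prim(a, b));                                 \
  }

DEF_DEMO_BINARY_BUILDER(add, add)
DEF_DEMO_BINARY_BUILDER(mul, mul)

#undef DEF_DEMO_BINARY_BUILDER
```

The macro keeps the long list of near-identical builders to one line each, at the cost of making the generated function names harder to find with a plain text search.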

// This file is part of SymCC.
//
// SymCC is free software: you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the Free Software
// Foundation, either version 3 of the License, or (at your option) any later
// version.
//
// SymCC is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
// A PARTICULAR PURPOSE. See the GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License along with
// SymCC. If not, see <https://www.gnu.org/licenses/>.

#include <Runtime.h>

#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstring>
#include <iostream>
#include <set>
#include <vector>

#ifndef NDEBUG
#include <chrono>
#endif

#include "Config.h"
#include "GarbageCollection.h"
#include "LibcWrappers.h"
#include "Shadow.h"

#ifndef NDEBUG
// Helper to print pointers properly.
#define P(ptr) reinterpret_cast<void *>(ptr)
#endif
// FSORT picks the Z3 floating-point sort: double precision (64-bit) or single precision (32-bit).
#define FSORT(is_double)                                                       \
  ((is_double) ? Z3_mk_fpa_sort_double(g_context)                              \
               : Z3_mk_fpa_sort_single(g_context))

/* TODO Eventually we'll want to inline as much of this as possible. I'm keeping
   it in C for now because that makes it easier to experiment with new features,
   but I expect that a lot of the functions will stay so simple that we can
   generate the corresponding bitcode directly in the compiler pass. */

namespace {

/// Indicate whether the runtime has been initialized.
std::atomic_flag g_initialized = ATOMIC_FLAG_INIT; // atomic boolean flag supporting test_and_set and clear

/// The global Z3 context.
Z3_context g_context;


/// The global floating-point rounding mode.
Z3_ast g_rounding_mode; // a Z3 AST node

/// The global Z3 solver.
Z3_solver g_solver; // TODO make thread-local (this is the SMT solver)

// Some global constants for efficiency.
Z3_ast g_null_pointer, g_true, g_false;

FILE *g_log = stderr;

#ifndef NDEBUG
[[maybe_unused]] void dump_known_regions() {
  std::cerr << "Known regions:" << std::endl;
  for (const auto &[page, shadow] : g_shadow_pages) {
    std::cerr << "  " << P(page) << " shadowed by " << P(shadow) << std::endl;
  }
}

void handle_z3_error(Z3_context c [[maybe_unused]], Z3_error_code e) {
  assert(c == g_context && "Z3 error in unknown context");
  std::cerr << Z3_get_error_msg(g_context, e) << std::endl;
  assert(!"Z3 error");
}
#endif

Z3_ast build_variable(const char *name, uint8_t bits) { // returns a bit-vector variable with the given name and width
  Z3_symbol sym = Z3_mk_string_symbol(g_context, name); // create a Z3 symbol from a C string
  auto *sort = Z3_mk_bv_sort(g_context, bits); // bit-vector sort of the given width (think of it as a machine integer)
  Z3_inc_ref(g_context, (Z3_ast)sort);
  Z3_ast result = Z3_mk_const(g_context, sym, sort); // declare and create a constant; shorthand for Z3_mk_func_decl followed by Z3_mk_app with no arguments
  Z3_inc_ref(g_context, result);
  Z3_dec_ref(g_context, (Z3_ast)sort);
  return result; // the variable expression
}
}

/// The set of all expressions we have ever passed to client code.
std::set<SymExpr> allocatedExpressions; // SymExpr is a typedef for Z3_ast

SymExpr registerExpression(Z3_ast expr) {
  if (allocatedExpressions.count(expr) == 0) { // not seen before
    // We don't know this expression yet. Record it and increase the reference
    // counter.
    allocatedExpressions.insert(expr); // std::set keeps its elements in a balanced tree
    Z3_inc_ref(g_context, expr); // increment the AST's reference counter; the context must have been created with Z3_mk_context_rc
  }

  return expr;
}

} // namespace

void _sym_initialize(void) {
  if (g_initialized.test_and_set()) // already initialized: nothing to do
    return;

#ifndef NDEBUG
  std::cerr << "Initializing symbolic runtime" << std::endl;
#endif

  loadConfig();
  initLibcWrappers();
  std::cerr << "This is SymCC running with the simple backend" << std::endl
            << "For anything but debugging SymCC itself, you will want to use "
               "the QSYM backend instead (see README.md for build instructions)"
            << std::endl;

  Z3_config cfg;

  cfg = Z3_mk_config(); // create a configuration for the Z3 context
  Z3_set_param_value(cfg, "model", "true");
  Z3_set_param_value(cfg, "timeout", "10000"); // milliseconds
  g_context = Z3_mk_context_rc(cfg); // create a context whose ASTs are reference-counted manually with Z3_inc_ref and Z3_dec_ref
  Z3_del_config(cfg); // the configuration is no longer needed

#ifndef NDEBUG
  Z3_set_error_handler(g_context, handle_z3_error); // register a Z3 error handler
#endif

  g_rounding_mode = Z3_mk_fpa_round_nearest_ties_to_even(g_context); // numeral of RoundingMode sort representing NearestTiesToEven
  Z3_inc_ref(g_context, g_rounding_mode); // increment the reference counter of this AST

  g_solver = Z3_mk_solver(g_context); // create a new solver; it is a "combined solver" that internally uses a non-incremental solver (solver1) and an incremental one (solver2)
  Z3_solver_inc_ref(g_context, g_solver); // increment the solver's reference counter

  auto *pointerSort = Z3_mk_bv_sort(g_context, 8 * sizeof(void *)); // bit-vector sort as wide as a pointer; returns a Z3_sort
  Z3_inc_ref(g_context, (Z3_ast)pointerSort); // pointerSort reference counter +1
  g_null_pointer = Z3_mk_int(g_context, 0, pointerSort); // the numeral 0 of the pointer sort
  Z3_inc_ref(g_context, g_null_pointer); // g_null_pointer reference counter +1
  Z3_dec_ref(g_context, (Z3_ast)pointerSort); // pointerSort reference counter -1
  g_true = Z3_mk_true(g_context); // AST node representing true
  Z3_inc_ref(g_context, g_true); // +1
  g_false = Z3_mk_false(g_context); // AST node representing false
  Z3_inc_ref(g_context, g_false); // +1

  if (g_config.logFile.empty()) { // Config.h declares Config g_config; logFile receives the solver log
    g_log = stderr; // a FILE * stream
  } else {
    g_log = fopen(g_config.logFile.c_str(), "w"); // logFile is a std::string, hence c_str()
  }
}

Z3_ast _sym_build_integer(uint64_t value, uint8_t bits) {
  auto *sort = Z3_mk_bv_sort(g_context, bits); // bit-vector sort of width bits; returns a Z3_sort
  Z3_inc_ref(g_context, (Z3_ast)sort);
  auto *result =
      registerExpression(Z3_mk_unsigned_int64(g_context, value, sort)); // numeral from a uint64_t, registered if not already known
  Z3_dec_ref(g_context, (Z3_ast)sort);
  return result; // the constant expression
}

Z3_ast _sym_build_integer128(uint64_t high, uint64_t low) {
  return registerExpression(Z3_mk_concat( // Z3_mk_concat(context, t1, t2) concatenates two bit-vectors; the result width is the sum of both
      g_context, _sym_build_integer(high, 64), _sym_build_integer(low, 64))); // 64 + 64 = 128 bits
}

Z3_ast _sym_build_float(double value, int is_double) {
  auto *sort = FSORT(is_double); // double or single precision floating-point sort
  Z3_inc_ref(g_context, (Z3_ast)sort);
  auto *result =
      registerExpression(Z3_mk_fpa_numeral_double(g_context, value, sort)); // floating-point numeral of that sort created from a double, then registered
  Z3_dec_ref(g_context, (Z3_ast)sort);
  return result;
}

Z3_ast _sym_get_input_byte(size_t offset) { // returns the expression for the input byte at offset
  static std::vector<SymExpr> stdinBytes; // static cache of the input byte variables

  if (offset < stdinBytes.size()) // already created: return the cached expression
    return stdinBytes[offset];

  auto varName = "stdin" + std::to_string(stdinBytes.size());
  auto *var = build_variable(varName.c_str(), 8); // create an 8-bit variable with that name

  stdinBytes.resize(offset); // grow the vector up to offset
  stdinBytes.push_back(var); // append the new expression at the end

  return var; // the new expression
}

Z3_ast _sym_build_null_pointer(void) { return g_null_pointer; } // the 0 numeral created during initialization
Z3_ast _sym_build_true(void) { return g_true; } // the true node created during initialization
Z3_ast _sym_build_false(void) { return g_false; } // the false node created during initialization
Z3_ast _sym_build_bool(bool value) { return value ? g_true : g_false; } // picks the true or false node depending on value

Z3_ast _sym_build_neg(Z3_ast expr) {
  return registerExpression(Z3_mk_bvneg(g_context, expr));
}

#define DEF_BINARY_EXPR_BUILDER(name, z3_name)                                 \
  SymExpr _sym_build_##name(SymExpr a, SymExpr b) {                            \
    return registerExpression(Z3_mk_##z3_name(g_context, a, b));               \
  }

DEF_BINARY_EXPR_BUILDER(add, bvadd) // Z3_mk_bvadd: two's complement addition; t1 and t2 must have the same bit-vector sort
DEF_BINARY_EXPR_BUILDER(sub, bvsub) // same, subtraction
DEF_BINARY_EXPR_BUILDER(mul, bvmul) // same, multiplication
DEF_BINARY_EXPR_BUILDER(unsigned_div, bvudiv) // unsigned division: t1/t2 if t2 is nonzero, undefined if t2 is zero; same sort required
DEF_BINARY_EXPR_BUILDER(signed_div, bvsdiv) // signed division: floor of t1/t2 if t2 != 0 and t1*t2 >= 0, ceiling of t1/t2 if t2 != 0 and t1*t2 < 0; undefined if t2 is zero; same sort required
DEF_BINARY_EXPR_BUILDER(unsigned_rem, bvurem) // unsigned remainder: t1 - (t1 /u t2) * t2, where /u is unsigned division
DEF_BINARY_EXPR_BUILDER(signed_rem, bvsrem) // signed remainder
DEF_BINARY_EXPR_BUILDER(shift_left, bvshl) // shift left
DEF_BINARY_EXPR_BUILDER(logical_shift_right, bvlshr) // logical shift right (ignores the sign bit)
DEF_BINARY_EXPR_BUILDER(arithmetic_shift_right, bvashr) // arithmetic shift right (preserves the sign bit)

DEF_BINARY_EXPR_BUILDER(signed_less_than, bvslt) // signed <
DEF_BINARY_EXPR_BUILDER(signed_less_equal, bvsle) // signed <=
DEF_BINARY_EXPR_BUILDER(signed_greater_than, bvsgt) // signed >
DEF_BINARY_EXPR_BUILDER(signed_greater_equal, bvsge) // signed >=
DEF_BINARY_EXPR_BUILDER(unsigned_less_than, bvult) // unsigned <
DEF_BINARY_EXPR_BUILDER(unsigned_less_equal, bvule) // unsigned <=
DEF_BINARY_EXPR_BUILDER(unsigned_greater_than, bvugt) // unsigned >
DEF_BINARY_EXPR_BUILDER(unsigned_greater_equal, bvuge) // unsigned >=
DEF_BINARY_EXPR_BUILDER(equal, eq) // AST node representing t1 = t2; both nodes must have the same sort

DEF_BINARY_EXPR_BUILDER(and, bvand) // bitwise and
DEF_BINARY_EXPR_BUILDER(or, bvor) // bitwise or
DEF_BINARY_EXPR_BUILDER(bool_xor, xor) // Boolean xor
DEF_BINARY_EXPR_BUILDER(xor, bvxor) // bitwise xor

DEF_BINARY_EXPR_BUILDER(float_ordered_greater_than, fpa_gt)//浮点数大于
DEF_BINARY_EXPR_BUILDER(float_ordered_greater_equal, fpa_geq)//浮点数大于等于
DEF_BINARY_EXPR_BUILDER(float_ordered_less_than, fpa_lt)//浮点数小于
DEF_BINARY_EXPR_BUILDER(float_ordered_less_equal, fpa_leq)//浮点数小于等于
DEF_BINARY_EXPR_BUILDER(float_ordered_equal, fpa_eq)//浮点数等于

#undef DEF_BINARY_EXPR_BUILDER

Z3_ast _sym_build_fp_add(Z3_ast a, Z3_ast b) {
  return registerExpression(Z3_mk_fpa_add(g_context, g_rounding_mode, a, b));  // Floating-point addition. Arguments: context, a term of RoundingMode sort, and two terms t1, t2 of floating-point sort
}

Z3_ast _sym_build_fp_sub(Z3_ast a, Z3_ast b) {
  return registerExpression(Z3_mk_fpa_sub(g_context, g_rounding_mode, a, b));  // Floating-point subtraction
}

Z3_ast _sym_build_fp_mul(Z3_ast a, Z3_ast b) {
  return registerExpression(Z3_mk_fpa_mul(g_context, g_rounding_mode, a, b));  // Floating-point multiplication
}

Z3_ast _sym_build_fp_div(Z3_ast a, Z3_ast b) {
  return registerExpression(Z3_mk_fpa_div(g_context, g_rounding_mode, a, b));  // Floating-point division
}

Z3_ast _sym_build_fp_rem(Z3_ast a, Z3_ast b) {
  return registerExpression(Z3_mk_fpa_rem(g_context, a, b));  // Floating-point remainder
}

Z3_ast _sym_build_fp_abs(Z3_ast a) {
  return registerExpression(Z3_mk_fpa_abs(g_context, a));  // Floating-point absolute value
}

Z3_ast _sym_build_not(Z3_ast expr) {
  return registerExpression(Z3_mk_bvnot(g_context, expr));  // Bitwise negation
}

Z3_ast _sym_build_not_equal(Z3_ast a, Z3_ast b) {  // Integer inequality (!=)
  return registerExpression(Z3_mk_not(g_context, Z3_mk_eq(g_context, a, b)));  // Z3_mk_not creates an AST node representing not(a); a must have Boolean sort
}

Z3_ast _sym_build_bool_and(Z3_ast a, Z3_ast b) {  // Boolean and (&&)
  Z3_ast operands[] = {a, b};  // a and b are Boolean terms
  return registerExpression(Z3_mk_and(g_context, 2, operands));  // Z3_mk_and creates an AST node representing args[0] and ... and args[num_args-1]; here num_args is 2
}

Z3_ast _sym_build_bool_or(Z3_ast a, Z3_ast b) {  // Same as above, for Boolean or (||)
  Z3_ast operands[] = {a, b};  // a and b are Boolean terms
  return registerExpression(Z3_mk_or(g_context, 2, operands));  // Z3_mk_or creates an AST node representing args[0] or ... or args[num_args-1]; here num_args is 2
}

Z3_ast _sym_build_float_ordered_not_equal(Z3_ast a, Z3_ast b) {  // Floating-point !=
  return registerExpression(
      Z3_mk_not(g_context, _sym_build_float_ordered_equal(a, b)));  // _sym_build_float_ordered_equal is defined above via DEF_BINARY_EXPR_BUILDER (floating-point ==)
}

Z3_ast _sym_build_float_ordered(Z3_ast a, Z3_ast b) {  // Ordered: neither operand is NaN (the negation of unordered)
  return registerExpression(
      Z3_mk_not(g_context, _sym_build_float_unordered(a, b)));
}

Z3_ast _sym_build_float_unordered(Z3_ast a, Z3_ast b) {  // Unordered: at least one operand is NaN
  Z3_ast checks[2];
  checks[0] = Z3_mk_fpa_is_nan(g_context, a);  // Predicate: a is NaN
  checks[1] = Z3_mk_fpa_is_nan(g_context, b);  // Predicate: b is NaN
  return registerExpression(Z3_mk_or(g_context, 2, checks));  // Z3_mk_or creates an AST node representing args[0] or ... or args[num_args-1]; here num_args is 2
}

Z3_ast _sym_build_float_unordered_greater_than(Z3_ast a, Z3_ast b) {  // Unordered >: meant to hold if either operand is NaN or a > b
  Z3_ast checks[3];
  checks[0] = Z3_mk_fpa_is_nan(g_context, a);  // a is NaN
  checks[1] = Z3_mk_fpa_is_nan(g_context, b);  // b is NaN
  checks[2] = _sym_build_float_ordered_greater_than(a, b);  // Boolean: a > b
  return registerExpression(Z3_mk_or(g_context, 2, checks));  // Note: num_args is 2 here, so as written only checks[0] and checks[1] enter the disjunction; the same holds for the unordered builders below
}

Z3_ast _sym_build_float_unordered_greater_equal(Z3_ast a, Z3_ast b) {  // Unordered >=
  Z3_ast checks[3];
  checks[0] = Z3_mk_fpa_is_nan(g_context, a);
  checks[1] = Z3_mk_fpa_is_nan(g_context, b);
  checks[2] = _sym_build_float_ordered_greater_equal(a, b);
  return registerExpression(Z3_mk_or(g_context, 2, checks));
}

Z3_ast _sym_build_float_unordered_less_than(Z3_ast a, Z3_ast b) {  // Unordered <
  Z3_ast checks[3];
  checks[0] = Z3_mk_fpa_is_nan(g_context, a);
  checks[1] = Z3_mk_fpa_is_nan(g_context, b);
  checks[2] = _sym_build_float_ordered_less_than(a, b);
  return registerExpression(Z3_mk_or(g_context, 2, checks));
}

Z3_ast _sym_build_float_unordered_less_equal(Z3_ast a, Z3_ast b) {  // Unordered <=
  Z3_ast checks[3];
  checks[0] = Z3_mk_fpa_is_nan(g_context, a);
  checks[1] = Z3_mk_fpa_is_nan(g_context, b);
  checks[2] = _sym_build_float_ordered_less_equal(a, b);
  return registerExpression(Z3_mk_or(g_context, 2, checks));
}

Z3_ast _sym_build_float_unordered_equal(Z3_ast a, Z3_ast b) {  // Unordered ==
  Z3_ast checks[3];
  checks[0] = Z3_mk_fpa_is_nan(g_context, a);
  checks[1] = Z3_mk_fpa_is_nan(g_context, b);
  checks[2] = _sym_build_float_ordered_equal(a, b);
  return registerExpression(Z3_mk_or(g_context, 2, checks));
}

Z3_ast _sym_build_float_unordered_not_equal(Z3_ast a, Z3_ast b) {  // Unordered !=
  Z3_ast checks[3];
  checks[0] = Z3_mk_fpa_is_nan(g_context, a);
  checks[1] = Z3_mk_fpa_is_nan(g_context, b);
  checks[2] = _sym_build_float_ordered_not_equal(a, b);
  return registerExpression(Z3_mk_or(g_context, 2, checks));
}

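Z3_ast _sym_build_sext(Z3_ast expr, uint8_t bits) {  // Sign-extend expr by `bits` additional bits
  return registerExpression(Z3_mk_sign_ext(g_context, bits, expr));  // Z3_mk_sign_ext(c, i, t1): sign-extend a bit-vector of size m to the (signed) equivalent bit-vector of size m+i
}

Z3_ast _sym_build_zext(Z3_ast expr, uint8_t bits) {  // Zero-extend expr by `bits` additional bits
  return registerExpression(Z3_mk_zero_ext(g_context, bits, expr));  // Z3_mk_zero_ext(c, i, t1): extend a bit-vector of size m with zeros to the (unsigned) equivalent bit-vector of size m+i
}

Z3_ast _sym_build_trunc(Z3_ast expr, uint8_t bits) {  // Truncate expr to its low `bits` bits
  return registerExpression(Z3_mk_extract(g_context, bits - 1, 0, expr));  // Z3_mk_extract(c, high, low, t1): extract bits high down to low from a bit-vector of size m, yielding a new bit-vector of size n = high - low + 1
}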

Z3_ast _sym_build_int_to_float(Z3_ast value, int is_double, int is_signed) {
  auto *sort = FSORT(is_double);  // Select the floating-point sort for single or double precision
  Z3_inc_ref(g_context, (Z3_ast)sort);
  auto *result = registerExpression(  // Convert a signed or unsigned bit-vector to a floating-point term of that sort, rounding according to g_rounding_mode
      is_signed   // Signed or unsigned source?
          ? Z3_mk_fpa_to_fp_signed(g_context, g_rounding_mode, value, sort)  // Arguments: context, a term of RoundingMode sort, the bit-vector AST, the floating-point sort
          : Z3_mk_fpa_to_fp_unsigned(g_context, g_rounding_mode, value, sort));
  Z3_dec_ref(g_context, (Z3_ast)sort);
  return result;
}

Z3_ast _sym_build_float_to_float(Z3_ast expr, int to_double) {  // Convert between floating-point sorts
  auto *sort = FSORT(to_double);
  Z3_inc_ref(g_context, (Z3_ast)sort);
  auto *result = registerExpression(
      Z3_mk_fpa_to_fp_float(g_context, g_rounding_mode, expr, sort));  // Convert a floating-point term to a floating-point term of another sort, rounding according to g_rounding_mode
  Z3_dec_ref(g_context, (Z3_ast)sort);
  return result;
}

Z3_ast _sym_build_bits_to_float(Z3_ast expr, int to_double) {  // Reinterpret bits as a float
  if (expr == nullptr)
    return nullptr;

  auto *sort = FSORT(to_double);
  Z3_inc_ref(g_context, (Z3_ast)sort);
  auto *result = registerExpression(Z3_mk_fpa_to_fp_bv(g_context, expr, sort));  // Convert the bit-vector term expr to a floating-point term of the given sort; expr must have bit-vector sort, sort must be a floating-point sort, and the size of expr must equal ebits+sbits of sort
  Z3_dec_ref(g_context, (Z3_ast)sort);
  return result;
}

Z3_ast _sym_build_float_to_bits(Z3_ast expr) {  // Reinterpret a float as bits
  if (expr == nullptr)
    return nullptr;
  return registerExpression(Z3_mk_fpa_to_ieee_bv(g_context, expr));  // The size of the resulting bit-vector is determined automatically
}

Z3_ast _sym_build_float_to_signed_integer(Z3_ast expr, uint8_t bits) {  // Convert a float to a signed integer
  return registerExpression(Z3_mk_fpa_to_sbv(  // Floating-point to signed bit-vector. Arguments: context, a term of RoundingMode sort, the floating-point term, the size of the resulting bit-vector
      g_context, Z3_mk_fpa_round_toward_zero(g_context), expr, bits));  // Z3_mk_fpa_round_toward_zero creates a RoundingMode numeral for the TowardZero mode, so the conversion truncates
}

Z3_ast _sym_build_float_to_unsigned_integer(Z3_ast expr, uint8_t bits) {  // Convert a float to an unsigned integer
  return registerExpression(Z3_mk_fpa_to_ubv(  // Floating-point to unsigned bit-vector, with the same arguments as above
      g_context, Z3_mk_fpa_round_toward_zero(g_context), expr, bits));
}

Z3_ast _sym_build_bool_to_bits(Z3_ast expr, uint8_t bits) {  // Convert a Boolean to a bit-vector of `bits` bits
  return registerExpression(Z3_mk_ite(g_context, expr,  // Z3_mk_ite creates an if-then-else node ite(t1, t2, t3); expr must have Boolean sort, and t2 and t3 must have the same sort
                                      _sym_build_integer(1, bits),  // The integer 1, `bits` bits wide
                                      _sym_build_integer(0, bits)));
}

void _sym_push_path_constraint(Z3_ast constraint, int taken,
                               uintptr_t site_id [[maybe_unused]]) {  // Add a path constraint; the alternative branch is obtained simply by negating it
  if (constraint == nullptr)
    return;

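  constraint = Z3_simplify(g_context, constraint);  // Interface to Z3's AST simplifier
  // Z3_simplify returns an AST equal to its argument, simplified with algebraic rules such as constant propagation (propagating true/false through logical connectives).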
  Z3_inc_ref(g_context, constraint);

  /* Check the easy cases first: if simplification reduced the constraint to
     "true" or "false", there is no point in trying to solve the negation or
     pushing the constraint to the solver... */
  if (Z3_is_eq_ast(g_context, constraint, Z3_mk_true(g_context))) {  // constraint == true
    assert(taken && "We have taken an impossible branch");
    Z3_dec_ref(g_context, constraint);
    return;
  }

  if (Z3_is_eq_ast(g_context, constraint, Z3_mk_false(g_context))) {  // constraint == false
    assert(!taken && "We have taken an impossible branch");
    Z3_dec_ref(g_context, constraint);
    return;
  }

  /* Generate a solution for the alternative */
  Z3_ast not_constraint =
      Z3_simplify(g_context, Z3_mk_not(g_context, constraint));  // Negate the current constraint: !constraint
  Z3_inc_ref(g_context, not_constraint);

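/********************************** Create a backtracking point *****************************************/

  Z3_solver_push(g_context, g_solver);  // Create a backtracking point so the alternative can be asserted temporarily (the solver keeps a stack of assertions)
  Z3_solver_assert(g_context, g_solver, taken ? not_constraint : constraint);  // Assert a constraint in the solver; Z3_solver_check and Z3_solver_check_assumptions can then check whether the logical context is consistent
  fprintf(g_log, "Trying to solve:\n%s\n",  // Write the details to the log file
          Z3_solver_to_string(g_context, g_solver));  // Convert the solver to a string

  Z3_lbool feasible = Z3_solver_check(g_context, g_solver);  // Check the assertions: Z3_L_TRUE means sat, Z3_L_FALSE means unsat, Z3_L_UNDEF means unknown (Z3 then does not guarantee that Z3_solver_get_model succeeds)
  if (feasible == Z3_L_TRUE) {  // The assertions are satisfiable
    Z3_model model = Z3_solver_get_model(g_context, g_solver);  // Retrieve the model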
    Z3_model_inc_ref(g_context, model);
    fprintf(g_log, "Found diverging input:\n%s\n",
            Z3_model_to_string(g_context, model));
    Z3_model_dec_ref(g_context, model);
  } else {
    fprintf(g_log, "Can't find a diverging input at this point\n");  // Not satisfiable (or unknown)
  }
  fflush(g_log);  // Flush the log stream
/********************************** Backtrack *****************************************/
  Z3_solver_pop(g_context, g_solver, 1);  // Pop one backtracking point, discarding the temporary assertion

  /* Assert the actual path constraint */
  Z3_ast newConstraint = (taken ? constraint : not_constraint);
  Z3_inc_ref(g_context, newConstraint);
  Z3_solver_assert(g_context, g_solver, newConstraint);  // Assert the constraint for the branch actually taken
  assert((Z3_solver_check(g_context, g_solver) == Z3_L_TRUE) &&
         "Asserting infeasible path constraint");
  Z3_dec_ref(g_context, constraint);
  Z3_dec_ref(g_context, not_constraint);
}

SymExpr _sym_concat_helper(SymExpr a, SymExpr b) {  // Concatenate two bit-vectors
  return registerExpression(Z3_mk_concat(g_context, a, b));  // Z3_mk_concat: a and b may have different bit-vector sorts
}

SymExpr _sym_extract_helper(SymExpr expr, size_t first_bit, size_t last_bit) {  // Extract a range of bits
  return registerExpression(
      Z3_mk_extract(g_context, first_bit, last_bit, expr));  // Arguments: context, unsigned high, unsigned low, Z3_ast
      // Extracts bits high down to low from a bit-vector of size m, yielding a new bit-vector of size n = high - low + 1
}

size_t _sym_bits_helper(SymExpr expr) {  // Return the bit width of expr
  auto *sort = Z3_get_sort(g_context, expr);  // Get the sort (type) of the AST node
  Z3_inc_ref(g_context, (Z3_ast)sort);
  auto result = Z3_get_bv_sort_size(g_context, sort);  // Return the size of the given bit-vector sort
  Z3_dec_ref(g_context, (Z3_ast)sort);
  return result;
}

/* No call-stack tracing */
void _sym_notify_call(uintptr_t) {}
void _sym_notify_ret(uintptr_t) {}
void _sym_notify_basic_block(uintptr_t) {}

/* Debugging */
const char *_sym_expr_to_string(SymExpr expr) {  // Convert an AST node to a string
  return Z3_ast_to_string(g_context, expr);
}

bool _sym_feasible(SymExpr expr) {  // Check whether the constraint expr is satisfiable together with the current path
  expr = Z3_simplify(g_context, expr);  // Simplify the constraint
  Z3_inc_ref(g_context, expr);

  Z3_solver_push(g_context, g_solver);  // Create a backtracking point
  Z3_solver_assert(g_context, g_solver, expr);  // Assert the constraint
  Z3_lbool feasible = Z3_solver_check(g_context, g_solver);  // Check for satisfiability
  Z3_solver_pop(g_context, g_solver, 1);  // Roll back

  Z3_dec_ref(g_context, expr);
  return (feasible == Z3_L_TRUE);  // true if a solution exists
}

/* Garbage collection */
void _sym_collect_garbage() {  // Drop expressions that are no longer reachable once the threshold is exceeded
  if (allocatedExpressions.size() < g_config.garbageCollectionThreshold)
    return;

#ifndef NDEBUG
  auto start = std::chrono::high_resolution_clock::now();
  auto startSize = allocatedExpressions.size();
#endif

  auto reachableExpressions = collectReachableExpressions();
  for (auto expr_it = allocatedExpressions.begin();
       expr_it != allocatedExpressions.end();) {
    if (reachableExpressions.count(*expr_it) == 0) {
      expr_it = allocatedExpressions.erase(expr_it);
    } else {
      ++expr_it;
    }
  }

#ifndef NDEBUG
  auto end = std::chrono::high_resolution_clock::now();
  auto endSize = allocatedExpressions.size();

  std::cerr << "After garbage collection: " << endSize
            << " expressions remain (before: " << startSize << ")" << std::endl
            << "\t(collection took "
            << std::chrono::duration_cast<std::chrono::milliseconds>(end -
                                                                     start)
                   .count()
            << " milliseconds)" << std::endl;
#endif
}
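
The `_sym_build_float_ordered_*` and `_sym_build_float_unordered_*` builders above follow LLVM's `fcmp` terminology: an ordered comparison is true only when neither operand is NaN and the comparison holds, while an unordered comparison is true when either operand is NaN or the comparison holds. The following is a minimal sketch of those semantics on concrete `double` values, using only the standard library. The function names `ordered_greater_than`, `unordered_greater_than`, and `demo` are hypothetical and are not part of SymCC.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Ordered >: true only if neither operand is NaN and a > b.
bool ordered_greater_than(double a, double b) {
  return !std::isnan(a) && !std::isnan(b) && a > b;
}

// Unordered >: true if either operand is NaN, or a > b.
// This is the three-way disjunction that the checks[3] arrays set up.
bool unordered_greater_than(double a, double b) {
  return std::isnan(a) || std::isnan(b) || a > b;
}

// Hypothetical usage: the two variants differ only when a NaN is involved.
void demo() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(ordered_greater_than(2.0, 1.0));
  assert(unordered_greater_than(2.0, 1.0));
  assert(!ordered_greater_than(nan, 1.0));
  assert(unordered_greater_than(nan, 1.0));
}
```

The two variants agree on ordinary numbers and diverge only for NaN operands, which is why the symbolic builders need the extra `Z3_mk_fpa_is_nan` checks.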

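That is about all there is to SymCC.

In other words, SymCC inserts the symbolic calls and the surrounding logic at compile time, and the backend then supplies the concrete implementation of those symbolic operations, as in the simple Z3 backend shown above.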
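
To make the flow of `_sym_push_path_constraint` easier to follow, here is a rough sketch of the same push / assert / check / pop pattern with Z3 swapped out for a toy brute-force solver over a single input byte. Everything here (`ToySolver`, `Constraint`, `pushPathConstraint`) is a hypothetical stand-in for illustration, not SymCC or Z3 code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// A constraint is a predicate over the single symbolic input byte.
using Constraint = std::function<bool(uint8_t)>;

// Toy replacement for the Z3 solver: a stack of assertions with scopes.
class ToySolver {
public:
  void push() { scopes_.push_back(constraints_.size()); }
  void pop() {
    assert(!scopes_.empty());
    constraints_.resize(scopes_.back()); // drop assertions made since push()
    scopes_.pop_back();
  }
  void add(Constraint c) { constraints_.push_back(std::move(c)); }
  // Brute force: return the smallest byte satisfying all assertions, if any.
  std::optional<uint8_t> check() const {
    for (int v = 0; v < 256; ++v) {
      bool ok = true;
      for (const auto &c : constraints_)
        if (!c(static_cast<uint8_t>(v))) { ok = false; break; }
      if (ok) return static_cast<uint8_t>(v);
    }
    return std::nullopt;
  }
private:
  std::vector<Constraint> constraints_;
  std::vector<std::size_t> scopes_;
};

// Same shape as _sym_push_path_constraint: solve the branch that was NOT
// taken inside a temporary scope, then record the branch that was taken.
std::optional<uint8_t> pushPathConstraint(ToySolver &solver, Constraint cond,
                                          bool taken) {
  Constraint negated = [cond](uint8_t v) { return !cond(v); };
  solver.push();
  solver.add(taken ? negated : cond);   // the alternative branch
  auto diverging = solver.check();      // an input that would flip the branch
  solver.pop();
  solver.add(taken ? cond : negated);   // the actual path constraint
  return diverging;
}
```

Each call returns an input that would drive execution down the other side of the branch, if one exists, and leaves the solver holding only the constraints of the path that was actually executed. A real backend works on bit-vector formulas instead of enumerating values, but the scoping logic is the same.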
