Introduction to the Clang AST (updating)

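This article gives a gentle introduction to the mysteries of the Clang AST. It is aimed at developers who want to contribute to Clang, or who use tools built on Clang's AST, such as the AST matchers.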

Slides (in my opinion they are very good and worth careful study)

Introduction

Clang’s AST is different from ASTs produced by some other compilers in that it closely resembles both the written C++ code and the C++ standard. For example, parenthesis expressions and compile time constants are available in an unreduced form in the AST. This makes Clang’s AST a good fit for refactoring tools.

Documentation for all Clang AST nodes is available via the generated Doxygen. The Doxygen online documentation is also indexed by your favorite search engine, so a search for clang and the AST node’s class name will usually turn up the Doxygen of the class you’re looking for (for example, search for: clang ParenExpr).

Examining the AST

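A good way to get familiar with the Clang AST is to look at the AST produced for some simple examples. Clang has a built-in AST-dump mode, which can be enabled with the flag -ast-dump.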

Let's look at a simple example AST:

$ cat test.cc
int f(int x) {
  int result = (x / 42);
  return result;
}

# Clang by default is a frontend for many tools; -Xclang is used to pass
# options directly to the C++ frontend.
$ clang -Xclang -ast-dump -fsyntax-only test.cc
TranslationUnitDecl 0x5aea0d0 <<invalid sloc>>
... cutting out internal declarations of clang ...
`-FunctionDecl 0x5aeab50 <test.cc:1:1, line:4:1> f 'int (int)'
  |-ParmVarDecl 0x5aeaa90 <line:1:7, col:11> x 'int'
  `-CompoundStmt 0x5aead88 <col:14, line:4:1>
    |-DeclStmt 0x5aead10 <line:2:3, col:24>
    | `-VarDecl 0x5aeac10 <col:3, col:23> result 'int'
    |   `-ParenExpr 0x5aeacf0 <col:16, col:23> 'int'
    |     `-BinaryOperator 0x5aeacc8 <col:17, col:21> 'int' '/'
    |       |-ImplicitCastExpr 0x5aeacb0 <col:17> 'int' <LValueToRValue>
    |       | `-DeclRefExpr 0x5aeac68 <col:17> 'int' lvalue ParmVar 0x5aeaa90 'x' 'int'
    |       `-IntegerLiteral 0x5aeac90 <col:21> 'int' 42
    `-ReturnStmt 0x5aead68 <line:3:3, col:10>
      `-ImplicitCastExpr 0x5aead50 <col:10> 'int' <LValueToRValue>
        `-DeclRefExpr 0x5aead28 <col:10> 'int' lvalue Var 0x5aeac10 'result' 'int'

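(Below is the complete output from compiling and running this on my own machine. The file there was named test.c and laid out slightly differently, so the source locations differ from the dump above.)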
TranslationUnitDecl 0x582af70 <<invalid sloc>>
|-TypedefDecl 0x582b470 <<invalid sloc>> __int128_t '__int128'
|-TypedefDecl 0x582b4d0 <<invalid sloc>> __uint128_t 'unsigned __int128'
|-TypedefDecl 0x582b820 <<invalid sloc>> __builtin_va_list '__va_list_tag [1]'
`-FunctionDecl 0x582b940 <test.c:1:1, line:5:1> f 'int (int)'
  |-ParmVarDecl 0x582b880 <line:1:7, col:11> x 'int'
  `-CompoundStmt 0x582bb78 <line:2:1, line:5:1>
    |-DeclStmt 0x582bb00 <line:3:3, col:22>
    | `-VarDecl 0x582ba00 <col:3, col:21> result 'int'
    |   `-ParenExpr 0x582bae0 <col:16, col:21> 'int'
    |     `-BinaryOperator 0x582bab8 <col:17, col:19> 'int' '/'
    |       |-ImplicitCastExpr 0x582baa0 <col:17> 'int' <LValueToRValue>
    |       | `-DeclRefExpr 0x582ba58 <col:17> 'int' lvalue ParmVar 0x582b880 'x' 'int'
    |       `-IntegerLiteral 0x582ba80 <col:19> 'int' 42
    `-ReturnStmt 0x582bb58 <line:4:3, col:10>
      `-ImplicitCastExpr 0x582bb40 <col:10> 'int' <LValueToRValue>
        `-DeclRefExpr 0x582bb18 <col:10> 'int' lvalue Var 0x582ba00 'result' 'int'

The toplevel declaration in a translation unit is always the translation unit declaration. In this example, our first user written declaration is the function declaration of “f”. The body of “f” is a compound statement, whose child nodes are a declaration statement that declares our result variable, and the return statement.
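To make the nesting described above concrete, here is a minimal sketch that rebuilds the example tree by hand and prints it with indentation. It does not use Clang at all: `Node`, `make`, `buildExample`, `dump` and `dumpToString` are hypothetical helpers invented for this illustration, and only the node kind names are taken from the dump.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy node, NOT a Clang class: the kind name printed by
// -ast-dump, an optional detail string, and the child nodes.
struct Node {
  std::string kind;
  std::string detail;
  std::vector<Node> children;
};

// Small factory so that leaf nodes can omit the child list.
Node make(std::string kind, std::string detail, std::vector<Node> children = {}) {
  return Node{std::move(kind), std::move(detail), std::move(children)};
}

// Rebuilds, by hand, the tree that the dump shows for
//   int f(int x) { int result = (x / 42); return result; }
Node buildExample() {
  Node division = make("BinaryOperator", "'/'", {
      make("ImplicitCastExpr", "<LValueToRValue>", {make("DeclRefExpr", "x")}),
      make("IntegerLiteral", "42")});
  Node declStmt = make("DeclStmt", "", {
      make("VarDecl", "result", {make("ParenExpr", "", {division})})});
  Node returnStmt = make("ReturnStmt", "", {
      make("ImplicitCastExpr", "<LValueToRValue>", {make("DeclRefExpr", "result")})});
  Node body = make("CompoundStmt", "", {declStmt, returnStmt});
  Node func = make("FunctionDecl", "f", {make("ParmVarDecl", "x"), body});
  return make("TranslationUnitDecl", "", {func});
}

// Writes one line per node, indented two spaces per tree level.
void dump(const Node &node, std::ostream &out, int depth = 0) {
  out << std::string(depth * 2, ' ') << node.kind;
  if (!node.detail.empty()) out << ' ' << node.detail;
  out << '\n';
  for (const Node &child : node.children) dump(child, out, depth + 1);
}

// Convenience wrapper that returns the whole dump as a string.
std::string dumpToString(const Node &root) {
  std::ostringstream out;
  dump(root, out);
  return out.str();
}
```

Calling `dumpToString(buildExample())` gives one line per node, starting with TranslationUnitDecl. Unlike the real dump, this toy output carries no addresses, source locations or types.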

AST Context

All information about the AST for a translation unit is bundled up in the class ASTContext. It allows traversal of the whole translation unit starting from getTranslationUnitDecl, or access to Clang’s table of identifiers for the parsed translation unit.

AST Nodes

Clang’s AST nodes are modeled on a class hierarchy that does not have a common ancestor. Instead, there are multiple larger hierarchies for basic node types like Decl and Stmt. Many important AST nodes derive from Type, Decl, DeclContext or Stmt, with some classes deriving from both Decl and DeclContext.

There are also a multitude of nodes in the AST that are not part of a larger hierarchy, and are only reachable from specific other nodes, like CXXBaseSpecifier.

Thus, to traverse the full AST, one starts from the TranslationUnitDecl and then recursively traverses everything that can be reached from that node; this information has to be encoded for each specific node type. This algorithm is encoded in the RecursiveASTVisitor. See the RecursiveASTVisitor tutorial.
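The point that reachability "has to be encoded for each specific node type" can be illustrated with a small toy model. This is only a sketch and does not use Clang's RecursiveASTVisitor: every type and function below (`ToyTranslationUnit`, `ToyFunction`, `ToyVisitor`, `makeExampleUnit` and so on) is hypothetical. Each toy node keeps its children in differently named fields, so the visitor needs one `traverse` overload per node type that knows where to look.

```cpp
#include <string>
#include <vector>

// Hypothetical toy node types, NOT Clang's classes. There is no shared base
// class and no generic child list, so children must be found per type.
struct ToyReturnStmt { std::string valueName; };
struct ToyCompoundStmt { std::vector<ToyReturnStmt> returns; };
struct ToyParam { std::string name; };
struct ToyFunction {
  std::string name;
  std::vector<ToyParam> params;
  ToyCompoundStmt body;
};
struct ToyTranslationUnit { std::vector<ToyFunction> functions; };

// Records a label for every node reached, in visiting order.
struct ToyVisitor {
  std::vector<std::string> visited;

  // Entry point: start at the translation unit and recurse from there.
  void traverse(const ToyTranslationUnit &unit) {
    visited.push_back("TranslationUnit");
    for (const ToyFunction &f : unit.functions) traverse(f);
  }
  // A function reaches its parameters first, then its body.
  void traverse(const ToyFunction &f) {
    visited.push_back("Function " + f.name);
    for (const ToyParam &p : f.params) traverse(p);
    traverse(f.body);
  }
  void traverse(const ToyParam &p) { visited.push_back("Param " + p.name); }
  void traverse(const ToyCompoundStmt &s) {
    visited.push_back("CompoundStmt");
    for (const ToyReturnStmt &r : s.returns) traverse(r);
  }
  void traverse(const ToyReturnStmt &r) {
    visited.push_back("Return " + r.valueName);
  }
};

// Builds a tiny unit loosely shaped like the example: f(x) returning result.
ToyTranslationUnit makeExampleUnit() {
  ToyFunction f;
  f.name = "f";
  f.params.push_back(ToyParam{"x"});
  f.body.returns.push_back(ToyReturnStmt{"result"});
  ToyTranslationUnit unit;
  unit.functions.push_back(f);
  return unit;
}
```

Creating a `ToyVisitor` and calling `traverse(makeExampleUnit())` fills `visited` with the unit, the function, its parameter, the compound statement and the return statement, in that order. Adding a new node type to this toy means adding another overload, which is the bookkeeping the real visitor takes care of.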

The two most basic nodes in the Clang AST are statements (Stmt) and declarations (Decl). Note that expressions (Expr) are also statements in Clang’s AST.
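The relationship between these three node kinds can be sketched with stand-in classes. The `toy` namespace below is an assumption made for illustration and mirrors only the inheritance described here; the real Clang classes have many more members and intermediate bases. The helper `isExpression` is also hypothetical and uses standard `dynamic_cast`, whereas LLVM and Clang code normally relies on LLVM's own `isa<>` and `dyn_cast<>` templates.

```cpp
#include <type_traits>

namespace toy {

// Stand-in for the statement hierarchy.
struct Stmt { virtual ~Stmt() = default; };
// Expressions are statements too, so Expr derives from Stmt.
struct Expr : Stmt {};
// Declarations form a separate hierarchy with no base shared with Stmt.
struct Decl { virtual ~Decl() = default; };

// A few concrete node kinds taken from the example dump.
struct ReturnStmt : Stmt {};
struct IntegerLiteral : Expr {};
struct VarDecl : Decl {};

// Returns true when the dynamic type of the statement is an expression.
inline bool isExpression(const Stmt &stmt) {
  return dynamic_cast<const Expr *>(&stmt) != nullptr;
}

// Compile-time checks of the relationships described in the text.
static_assert(std::is_base_of<Stmt, Expr>::value,
              "every expression is also a statement");
static_assert(!std::is_base_of<Stmt, Decl>::value,
              "declarations are not statements");
static_assert(!std::is_base_of<Decl, Stmt>::value,
              "statements are not declarations");

}  // namespace toy
```

With these stand-ins, `toy::isExpression` accepts any `Stmt`, returns true for a `toy::IntegerLiteral` and false for a `toy::ReturnStmt`. A `toy::VarDecl` cannot be passed to it at all, because it is not a `Stmt`.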

*Please credit the source when reposting.
