title: Parsing Device Tree Files with ANTLR4 and Python
top: 41
date: 2024-05-22 09:28:25
tags:
- ANTLR4
- Device Tree
categories:
- ANTLR4
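What Is a Device Tree?
Initializing and booting a computer system involves several software components working together. Firmware may perform low-level initialization of the system hardware before handing control to software such as an operating system, a bootloader, or a hypervisor. Bootloaders and hypervisors can, in turn, load an operating system and hand control to it. Standard, consistent interfaces and conventions make the interaction between these components easier. In this document, the term boot program refers generically to a software component that initializes the system state and then loads and executes another piece of software, referred to as a client program. Boot programs include firmware, bootloaders, and hypervisors. Client programs include bootloaders, hypervisors, operating systems, and special-purpose programs. A single piece of software may act as both a boot program and a client program.
The Devicetree Specification provides a complete definition of the interface between boot programs and client programs, together with a minimum set of requirements that supports the development of a wide variety of systems.
Sample Device Tree File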
// template.dts
/dts-v1/;
//memreserve/ <address> <length>;
/*
mem {
mem;
};*/
/ {
interrupt-parent = <&intc>;
compatible = "linux,dummy-virt";
#size-cells = <0x02>;
#address-cells = <0x02>;
psci {
cpu_on = <0xc4000003>;
compatible = "arm,psci-1.0", "arm,psci-0.2", "arm,psci";
cpu_suspend = <0xc4000001>;
migrate = <0xc4000005>;
cpu_off = <0x84000002>;
method = "hvc";
assigned-clocks = <&v2m_sysctl 0>, <&v2m_sysctl 1>, <&v2m_sysctl 3>, <&v2m_sysctl 3>;
};
mem {
};
memory {
reg = <0x00 0x80000000 0x00 0x10000000>;
device_type = "memory";
};
uart {
compatible = "arm,pl011", "arm,primecell";
clock-names = "uartclk", "apb_pclk";
clocks = <&apb_pclk>, <&apb_pclk>;
interrupts = <0x00 0x01 0x04>;
reg = <0x00 0x9000000 0x00 0x1000>;
};
intc: intc {
interrupt-controller;
#size-cells = <0x02>;
ranges;
compatible = "arm,cortex-a15-gic";
#interrupt-cells = <0x03>;
reg = <0x00 0x8000000 0x00 0x10000>, <0x00 0x8010000 0x00 0x10000>;
#address-cells = <0x02>;
v2m {
msi-controller;
compatible = "arm,gic-v2m-frame";
reg = <0x00 0x8020000 0x00 0x1000>;
};
};
timer {
compatible = "arm,armv8-timer\0arm,armv7-timer";
interrupts = <0x01 0x0d 0x104>, <0x01 0x0e 0x104>, <0x01 0x0b 0x104>, <0x01 0x0a 0x104>;
always-on;
};
ethernet: ethernet {
compatible = "smsc,lan9118", "smsc,lan9115";
reg-io-width = <4>;
reg = <0x0 0xC0000000 0x0 0x10000>;
smsc,irq-active-high;
interrupts = <0 4 4>;
phy-mode = "mii";
smsc,irq-push-pull;
};
chosen {
stdout-path = "/uart";
rng-seed = <0x35d0afb0 0x45d1f049 0x9b6bc5af 0x61073667 0xfa7f51b8 0xa46898c7 0xf96fbe17 0x2093044>;
kaslr-seed = <0xe61a0cad 0xcc2e0aab>;
};
cpus {
#size-cells = <0x00>;
};
};
/ {
cpus {
#size-cells = <0x00>;
#address-cells = <0x01>;
cpu-map {
socket0 {
cluster0 {
core0 {
cpu = <&cpu_0>;
};};};};
cpu_0: cpu@0 {
reg = <0x00>;
info: compatible = "arm,cortex-a53";
device_type = "cpu";
extra_1 = [ab cd ef byte4: 00 ff fe];
extra_2 = reglabel: <0 sizelabel: 0x1000000>;
};};};
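- Comments in device tree files use the same syntax as C.
- If a device tree file defines more than one root (`/`), the roots must be merged after parsing. When the same property key appears at the same path, the later value overwrites the earlier one.
- For the full device tree format definition, see reference [1].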
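The merge rule for multiple roots can be illustrated with a minimal sketch on plain dictionaries. `merge_nodes` is a hypothetical helper written only for this illustration. It assumes that child nodes are represented as nested dicts and that any other value is a property, and it uses two small root blocks modeled on the `cpus` node of the sample.

```python
def merge_nodes(base: dict, extra: dict) -> dict:
    """Merge `extra` into `base` in place and return `base`.

    Nested dicts stand for child nodes and are merged recursively;
    any other value stands for a property and is overwritten.
    """
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge_nodes(base[key], value)
        else:
            base[key] = value
    return base


# Two root blocks modeled on the `cpus` node of the sample (hypothetical data).
first_root = {"cpus": {"#size-cells": "0x00"}}
second_root = {
    "cpus": {
        "#size-cells": "0x00",
        "#address-cells": "0x01",
        "cpu@0": {"reg": "0x00"},
    }
}

merged = merge_nodes(first_root, second_root)
print(merged)
```

The second root contributes the keys that the first one lacks, while a key present in both places keeps the value from the later root.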
ANTLR4 Grammar (g4 Files)
DTSLexer.g4
lexer grammar DTSLexer;
LC : '{';
RC : '}';
LP : '(';
RP : ')';
SC : ';';
EQ : '=';
CO : ',';
SL : '/';
BEGIN_ADDRESS : '<' -> skip, pushMode(ADDRESS);
BEGIN_HEX_MODE : '[' -> skip, pushMode(HEX_MODE);
BEGIN_DEFINE_VAL : '"' -> skip, pushMode(DEFINE_VAL);
BEGIN_LINE_COMMENT : '//' -> skip, pushMode(LINE_COMMENT);
BEGIN_MUL_COMMENT : '/*' ->skip, pushMode(MUL_COMMENT);
WS : [ \t\r\n]+ -> skip ;
MEMRESERVE : SL 'memreserve' SL;
INCLUDE : 'include' ;
VERSION : SL 'dts-v' [0-9]+ SL;
LABEL : LABEL_NAME CL;
NODE_NAME : CHAR+(CHAR|[0-9]|[,._+]|'-')* AT (CHAR|[0-9]|[,._+]|'-')+
| CHAR+(CHAR|[0-9]|[,._+]|'-')*
;
PROPERTY_NAME : (CHAR|[0-9]|[,._+?#]|'-')+ ;
fragment CHAR : [a-zA-Z] ;
fragment HEX : '0' [Xx] [0-9a-fA-F]+ ;
fragment DEC : ('0' [Dd])?[0-9]+ ;
fragment OCT : '0' [Oo] [0-7]+ ;
fragment BIN : '0' [Bb] [0-1]+ ;
fragment NUM : HEX | DEC | OCT | BIN ;
fragment STR : [\u0021\u0023-\u003a\u003c-\u007e]+ ;
fragment NAME_ADDR : CHAR+(CHAR|[0-9]|[,._+]|'-')* ;
fragment AT : '@';
fragment HA : '#';
fragment CL : ':';
fragment NODE_PATH : (SL NODE_NAME)+ ;
fragment ASCII : [\u0000-\u007e] ;
fragment LABEL_NAME : CHAR+(CHAR|[0-9]|'_')* ;
// ADDRESS
mode ADDRESS;
WS_1 : [ \t\r\n]+ -> skip ;
PL : '+';
MI : '-';
AS : '*';
DI : '/' ;
MO : '%';
AM : '&';
LABEL_1 : LABEL ;
ADDRESS_VALUE : NUM+ | [0-9]+
| AM LABEL_NAME
| AM LC NODE_PATH RC
;
END_ADDRESS : '>' -> skip, popMode ;
// HEX_MODE
mode HEX_MODE;
WS_2 : WS -> skip;
LABEL_2 : LABEL ;
HEX_NUM : [0-9a-fA-F]+ ;
END_HEX_MODE : ']' -> skip, popMode ;
// DEFINE_VAL
mode DEFINE_VAL;
LABEL_3 : LABEL ;
PROPERTY_VALUE : [\u0021\u0023-\u007e]+ ;
END_DEFINE_VAL : '"' -> skip, popMode ;
// LINE_COMMENT
mode LINE_COMMENT;
END_LINE_COMMENT : '\n' -> skip, popMode ;
SKIP_ALL : ASCII -> skip ;
// MUL_COMMENT
mode MUL_COMMENT;
END_MUL_COMMENT : '*/' -> skip, popMode;
SKIP_ALL_1 : ASCII -> skip ;
DTSParser.g4
parser grammar DTSParser;
options {
tokenVocab = DTSLexer;
}
top : version_stat memory_stat? node+ EOF ;
version_stat : VERSION SC ;
memory_stat : MEMRESERVE ADDRESS_VALUE ADDRESS_VALUE SC ;
node : node_key LC node_value* RC SC ;
node_key : SL
| labels* NODE_NAME
;
node_value : stat
| node
;
stat : labels* property_key (EQ property_value)? SC ;
property_key : PROPERTY_NAME
| NODE_NAME
;
property_value : labels* value_type labels* (CO labels* value_type labels*)* CO? ;
value_type : PROPERTY_VALUE
| (labels* ADDRESS_VALUE)+
| (labels* HEX_NUM)+
;
labels : LABEL | LABEL_1 | LABEL_2 | LABEL_3 ;
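- Note that this grammar ignores `#include "file"` directives from the device tree syntax. These are preprocessor commands that need to be parsed and handled separately, so they are left out here for now.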
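One way to handle such directives is to expand them in a separate pass before the text reaches the lexer. The sketch below is an assumption about how that pass could look, not part of the grammar above: `expand_includes` and `INCLUDE_RE` are hypothetical names, only the quoted form `#include "file"` on a line of its own is recognized, and included paths are resolved relative to the including file.

```python
import re
from pathlib import Path

# Matches a whole line of the form: #include "file"
# (Angle-bracket includes and other forms are deliberately not handled.)
INCLUDE_RE = re.compile(r'^[ \t]*#include[ \t]+"([^"]+)"[ \t]*$', re.MULTILINE)


def expand_includes(path, _seen=()):
    """Return the text of `path` with each #include "file" line replaced
    by the recursively expanded content of that file.

    `_seen` holds the chain of files currently being expanded so that
    a circular include raises ValueError instead of recursing forever.
    """
    path = Path(path).resolve()
    if path in _seen:
        raise ValueError(f"circular include: {path}")
    text = path.read_text(encoding="utf-8")

    def replace(match):
        # Resolve the included file relative to the including file.
        target = path.parent / match.group(1)
        return expand_includes(target, _seen + (path,))

    return INCLUDE_RE.sub(replace, text)
```

The expanded text could then be given to the lexer through the runtime's `InputStream` class rather than `FileStream`, since the input is now a string instead of a file on disk.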
Python Implementation Based on the Visitor Interface
# myDTS.py
from antlr4 import *
from antlr4.error.ErrorListener import ErrorListener

from DTSLexer import DTSLexer
from DTSParser import DTSParser
from DTSParserVisitor import DTSParserVisitor


class MyVisitor(DTSParserVisitor):
    def __init__(self):
        # One dict per root node ("/") found in the file.
        self._roots = []

    def visitProperty_key(self, ctx: DTSParser.Property_keyContext):
        return ctx.getText()

    def visitValue_type(self, ctx: DTSParser.Value_typeContext):
        # Join the cells of a single value with spaces, skipping labels.
        text = []
        for it in ctx.getChildren():
            if it in ctx.labels():
                continue
            text.append(it.getText())
        return ' '.join(text)

    def visitProperty_value(self, ctx: DTSParser.Property_valueContext):
        # Collect the comma-separated values into a set
        # (order and duplicates are not preserved).
        val = set()
        for it in ctx.getChildren():
            if it in ctx.labels() or it.getText() == ',':
                continue
            val.add(self.visit(it))
        return val

    def visitStat(self, ctx: DTSParser.StatContext):
        # A property without a value (e.g. "ranges;") maps to an empty set.
        if ctx.property_value():
            return {self.visit(ctx.property_key()): self.visit(ctx.property_value())}
        else:
            return {self.visit(ctx.property_key()): set()}

    def visitNode_key(self, ctx: DTSParser.Node_keyContext):
        # The last child is the node name; any preceding children are labels.
        items = list(ctx.getChildren())
        return items[-1].getText()

    def visitNode(self, ctx: DTSParser.NodeContext):
        key = self.visit(ctx.node_key())
        val = {}
        for it in ctx.node_value():
            val.update(self.visit(it))
        return {key: val}

    def visitTop(self, ctx: DTSParser.TopContext):
        # Every top-level node is expected to be a root ("/").
        for it in ctx.node():
            self._roots.append(self.visit(it)['/'])

    def merge_roots(self):
        # Merge all later roots into the first one.
        root, *roots = self._roots
        for cur in roots:
            self.recursive_merge(root, cur, [''])
        del self._roots
        return root

    def recursive_merge(self, base, cur, keys):
        delimiter = '/'
        for key, val in cur.items():
            if key not in base:
                base.update({key: val})
            else:
                keys.append(key)
                if isinstance(val, set):
                    # Same property at the same path: the later value wins.
                    base.update({key: val})
                    print(f"WARNING: Replace key '{delimiter.join(keys)}'.")
                else:
                    self.recursive_merge(base.get(key), val, keys)
                # Drop this key again so sibling paths are reported correctly.
                keys.pop()


class DTSException(ErrorListener):
    # Turn parser errors into exceptions instead of console messages.
    def syntaxError(
        self,
        recognizer,
        offendingSymbol,
        line: int,
        column: int,
        msg: str,
        e: RecognitionException,
    ):
        raise SyntaxError(f"Syntax error on line {line}, column {column}: {msg}")


def dts2dict(file):
    lexer = DTSLexer(FileStream(file))
    token_stream = CommonTokenStream(lexer)
    parser = DTSParser(token_stream)
    parser.removeErrorListeners()
    parser.addErrorListener(DTSException())
    tree = parser.top()
    visitor = MyVisitor()
    visitor.visit(tree)
    return visitor.merge_roots()


if __name__ == '__main__':
    print(dts2dict('template.dts'))
Converting a Device Tree File to a Python Dictionary (dts2dict)
{'interrupt-parent': {'&intc'},
'compatible': {'linux,dummy-virt'},
'#size-cells': {'0x02'},
'#address-cells': {'0x02'},
'psci': {
'cpu_on': {'0xc4000003'},
'compatible': {'arm,psci-1.0', 'arm,psci', 'arm,psci-0.2'},
'cpu_suspend': {'0xc4000001'},
'migrate': {'0xc4000005'},
'cpu_off': {'0x84000002'},
'method': {'hvc'},
'assigned-clocks': {'&v2m_sysctl 3', '&v2m_sysctl 1', '&v2m_sysctl 0'}
},
'mem': {},
'memory': {
'reg': {'0x00 0x80000000 0x00 0x10000000'},
'device_type': {'memory'}
},
'uart': {
'compatible': {'arm,pl011', 'arm,primecell'},
'clock-names': {'uartclk', 'apb_pclk'},
'clocks': {'&apb_pclk'},
'interrupts': {'0x00 0x01 0x04'},
'reg': {'0x00 0x9000000 0x00 0x1000'}
},
'intc': {
'interrupt-controller': set(),
'#size-cells': {'0x02'},
'ranges': set(),
'compatible': {'arm,cortex-a15-gic'},
'#interrupt-cells': {'0x03'},
'reg': {'0x00 0x8000000 0x00 0x10000', '0x00 0x8010000 0x00 0x10000'},
'#address-cells': {'0x02'},
'v2m': {
'msi-controller': set(),
'compatible': {'arm,gic-v2m-frame'},
'reg': {'0x00 0x8020000 0x00 0x1000'}
}
},
'timer': {
'compatible': {'arm,armv8-timer\\0arm,armv7-timer'},
'interrupts': {'0x01 0x0a 0x104', '0x01 0x0e 0x104', '0x01 0x0d 0x104', '0x01 0x0b 0x104'},
'always-on': set()
},
'ethernet': {
'compatible': {'smsc,lan9115', 'smsc,lan9118'},
'reg-io-width': {'4'},
'reg': {'0x0 0xC0000000 0x0 0x10000'},
'smsc,irq-active-high': set(),
'interrupts': {'0 4 4'},
'phy-mode': {'mii'},
'smsc,irq-push-pull': set()
},
'chosen': {
'stdout-path': {'/uart'},
'rng-seed': {'0x35d0afb0 0x45d1f049 0x9b6bc5af 0x61073667 0xfa7f51b8 0xa46898c7 0xf96fbe17 0x2093044'},
'kaslr-seed': {'0xe61a0cad 0xcc2e0aab'}
},
'cpus': {
'#size-cells': {'0x00'},
'#address-cells': {'0x01'},
'cpu-map': {
'socket0': {
'cluster0': {
'core0': {
'cpu': {
'&cpu_0'
}
}
}
}
},
'cpu@0': {
'reg': {'0x00'},
'compatible': {'arm,cortex-a53'},
'device_type': {'cpu'},
'extra_1': {'ab cd ef 00 ff fe'},
'extra_2': {'0 0x1000000'}
}
}
}
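Since the result is a plain nested dictionary, a node or property can be looked up by its device tree path. The following is a minimal sketch under that assumption. `get_by_path` is a hypothetical helper that is not part of `myDTS.py`, and the `tree` literal is a small excerpt of the output above.

```python
def get_by_path(tree: dict, path: str):
    """Return the node (dict) or property value (set) found at `path`.

    `path` uses '/' as the separator, for example '/intc/v2m/reg'.
    Raises KeyError if any component is missing.
    """
    current = tree
    for part in path.strip("/").split("/"):
        if not part:
            # The path was "/" itself, so the root is returned.
            continue
        if not isinstance(current, dict):
            # Tried to descend into a property value.
            raise KeyError(path)
        current = current[part]
    return current


# A small excerpt of the dictionary shown above.
tree = {
    "chosen": {"stdout-path": {"/uart"}},
    "intc": {
        "interrupt-controller": set(),
        "v2m": {"reg": {"0x00 0x8020000 0x00 0x1000"}},
    },
}

print(get_by_path(tree, "/intc/v2m/reg"))
print(get_by_path(tree, "/chosen/stdout-path"))
```

Property values come back as sets, matching the visitor's representation, so a caller that needs a single string has to unpack the set itself.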
References
[1] Devicetree Specification
[2] Linux device tree syntax specification