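An approach to extracting the structure of Java code
Sometimes we need the code structure of a Java file, that is, only its class definitions, method declarations, and field definitions, without the method implementations.
Here is one way to do it: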
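- Parse the Java code with the Tree-sitter parser and collect the method implementations it contains
- Iterate over the list collected in the first step and replace each entry in the source code with an empty string; what remains is the structure of the class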
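The second step can be sketched as follows. This is a minimal illustration, not a definitive implementation: `strip_method_bodies` is a hypothetical helper, and the method bodies are hard-coded here in place of the strings that the parser would return in the first step.

```python
def strip_method_bodies(source, method_blocks):
    """Return source with every collected method body replaced by an empty string."""
    # Replace longer bodies first so that a short body which happens to be a
    # substring of a longer one does not corrupt it.
    for block in sorted(set(method_blocks), key=len, reverse=True):
        source = source.replace(block, "")
    # Drop the trailing whitespace left behind after each declaration.
    return "\n".join(line.rstrip() for line in source.splitlines())


java_source = """package demo;

public class Counter {
    private int count;

    public void increment() {
        count++;
    }

    public int get() {
        return count;
    }
}"""

# Assumed output of step 1: the text of each method's body block.
method_blocks = [
    "{\n        count++;\n    }",
    "{\n        return count;\n    }",
]

structure = strip_method_bodies(java_source, method_blocks)
print(structure)
```

Because this relies on plain string matching, a very short body such as `{}` could also match text outside a method, so the result is worth checking on real code.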
Note: for more on Tree-sitter, see the introduction on the official site:
https://tree-sitter.github.io/tree-sitter/
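Part of the code that collects a class's imported packages, complete method definitions, method bodies (code blocks), and block comments is shown below: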
from code_ast.parsers import ASTParser


class MyASTParser(ASTParser):
    def __init__(self, lang):
        super().__init__(lang)

    def parse_edit(self, tree, editcode, keep_text=True):
        # Re-parse edited source, passing the previous tree to the parser.
        code_bytes = editcode.encode("utf-8")
        return self.parser.parse(code_bytes, tree, keep_text)


java_parse = MyASTParser("java")
# target: the Java source code to analyze, as a str (defined elsewhere)
root_tree, _ = java_parse.parse(target)
root_node = root_tree.root_node


def get_infos(node, method_blocks, methods, comments, includes, packages):
    """Walk the tree depth-first and collect the text of the nodes of interest."""
    if node is None:
        return
    content = node.text.decode("utf-8")
    parent_type = node.parent.type if node.parent else None
    if node.type == "method_declaration":
        methods.append(content)  # whole method: signature plus body
    elif node.type == "package_declaration":
        packages.append(content)
    elif node.type == "import_declaration":
        includes.append(content)
    elif node.type == "block" and parent_type == "method_declaration":
        method_blocks.append(content)  # method body only
    elif node.type == "block_comment" and parent_type == "program":
        comments.append(content)  # top-level block comments only
    for child in node.children:
        get_infos(child, method_blocks, methods, comments, includes, packages)
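
The collector above stores node text, so the removal step depends on string matching. A possible variant, sketched below, records byte offsets instead and cuts the bodies out by position. It assumes each body's `start_byte` and `end_byte` (attributes exposed by Tree-sitter nodes) would be saved during the walk; here the spans are computed by hand with `bytes.find` so the sketch runs without a parser, and `remove_spans` is a hypothetical helper.

```python
def remove_spans(source_bytes, spans):
    """Cut the given (start, end) byte ranges out of source_bytes."""
    out = bytearray(source_bytes)
    # Work from the end of the file so that earlier offsets stay valid.
    for start, end in sorted(spans, reverse=True):
        del out[start:end]
    return bytes(out)


java_source = (
    "public class Greeter {\n"
    "    public String hello() { return \"hi\"; }\n"
    "    public String bye() { return \"bye\"; }\n"
    "}\n"
).encode("utf-8")

# Stand-in for (node.start_byte, node.end_byte) of each method body.
spans = []
for body in (b'{ return "hi"; }', b'{ return "bye"; }'):
    start = java_source.find(body)
    spans.append((start, start + len(body)))

structure = remove_spans(java_source, spans).decode("utf-8")
print(structure)
```

Offsets identify each body unambiguously, so two methods with identical bodies, or a body that also appears inside a comment, do not interfere with each other.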