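You only need to care about names that are actually imported into the module you are currently inspecting. Note that there are a few complications here:

Imported names can in turn be imported from the current module by other modules; `import foo` in module `bar` makes `bar.foo` available from the outside. So `from bar import foo` is really the same thing as `import foo`.
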
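Any object can be stored in a list or a tuple, become an attribute of another object, be stored in a dictionary, be assigned to another name, and be referenced dynamically. For example, an imported attribute stored in a list and referenced by index:

```python
import foo

spam = [foo.bar]
spam[0]()
```

calls the `foo.bar` object. Tracking some of these uses through AST analysis can be done, but Python is a highly dynamic language and you will soon run into limitations. For example, you cannot know with any certainty what `spam[0] = random.choice([foo.bar, foo.baz])` will produce.
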
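To make the limitation concrete, it can help to look at what the parser produces for the indexed call. The following is a small illustrative sketch, not part of the collector described later; it parses the three-line snippet above and shows that the call target is a `Subscript` on the name `spam`, so nothing at the call site mentions `foo`.

```python
import ast

source = """\
import foo
spam = [foo.bar]
spam[0]()
"""

tree = ast.parse(source)

# The last statement is an expression statement wrapping a Call node.
call = tree.body[-1].value

# The call target is a Subscript on the name 'spam'; a purely name-based
# search for 'foo' cannot see that foo.bar is what gets called here.
target_type = type(call.func).__name__
target_name = call.func.value.id

# Every name that is loaded (read) in the module, with its line number.
loaded = sorted(
    (node.id, node.lineno)
    for node in ast.walk(tree)
    if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
)

print(target_type, target_name)  # Subscript spam
print(loaded)                    # [('foo', 2), ('spam', 3)]
```

The only place where `foo` is read is line 2, where the attribute is placed in the list. Connecting that to the call on line 3 would require tracking the contents of `spam`, which is where static analysis starts to break down.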
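With the `global` and `nonlocal` statements, nested function scopes can alter names in parent scopes. So a contrived function such as:

```python
def bar():
    global foo
    import foo
```

would import module `foo` and add it to the global namespace, but only when `bar()` is called. Tracking this is difficult, because you would need to track when `bar()` is actually called. That could even happen outside the current module (`import weirdmodule; weirdmodule.bar()`).
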
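The "only when called" behaviour can be checked at runtime. The sketch below is an illustration under two assumptions: the standard-library module `json` stands in for the hypothetical `foo`, and a plain dict passed to `exec()` stands in for a module's global namespace so that the effect is easy to observe.

```python
# Source of a stand-in module that contains the contrived function.
source = """\
def bar():
    global json   # 'json' stands in for the hypothetical module foo
    import json
"""

namespace = {}
exec(source, namespace)           # defines bar(), but does not call it

bound_before = 'json' in namespace
namespace['bar']()                # the import statement runs only now
bound_after = 'json' in namespace

print(bound_before, bound_after)  # False True
```

A static pass over the source sees the `import` statement but has no way of knowing whether, or when, the binding in the global namespace will actually exist.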
If you ignore those complications and focus only on the names used in the import statements, you need to track `Import` and `ImportFrom` nodes, and track scopes (so you know whether a local name masks a global one, or whether an imported name was imported into a local scope). You then look for `Name(..., Load)` nodes that reference the imported names.

```python
import ast
from collections import ChainMap
from types import MappingProxyType as readonlydict


class ModuleUseCollector(ast.NodeVisitor):
    def __init__(self, modulename, package=''):
        self.modulename = modulename
        # used to resolve from ... import ... references
        self.package = package
        self.modulepackage, _, self.modulestem = modulename.rpartition('.')
        # track scope namespaces, with a mapping of imported names
        # (bound name to original). If a name references None it is used
        # for a different purpose in that scope and so masks a name in
        # the global namespace.
        self.scopes = ChainMap()
        self.used_at = []  # list of (name, alias, line) entries

    def visit_FunctionDef(self, node):
        self.scopes = self.scopes.new_child()
        self.generic_visit(node)
        self.scopes = self.scopes.parents

    def visit_Lambda(self, node):
        # lambdas are just functions, albeit with no statements
        self.visit_FunctionDef(node)

    def visit_ClassDef(self, node):
        # class scope is a special local scope that is re-purposed to form
        # the class attributes. By using a read-only dict proxy here we can
        # expect an exception when a class body contains an import
        # statement or uses names that would mask an imported name.
        self.scopes = self.scopes.new_child(readonlydict({}))
        self.generic_visit(node)
        self.scopes = self.scopes.parents

    def visit_Import(self, node):
        self.scopes.update({
            a.asname or a.name: a.name
            for a in node.names
            if a.name == self.modulename
        })

    def visit_ImportFrom(self, node):
        # resolve relative imports; from . import <name>, from .. import <name>
        source = node.module  # can be None
        if node.level:
            package = self.package
            if node.level > 1:
                # go up levels as needed
                package = '.'.join(self.package.split('.')[:-(node.level - 1)])
            source = f'{package}.{source}' if source else package
        if self.modulename == source:
            # names imported from our target module
            self.scopes.update({
                a.asname or a.name: f'{self.modulename}.{a.name}'
                for a in node.names
            })
        elif self.modulepackage and self.modulepackage == source:
            # from package import module, where package.module is what we want
            self.scopes.update({
                a.asname or a.name: self.modulename
                for a in node.names
                if a.name == self.modulestem
            })

    def visit_Name(self, node):
        if not isinstance(node.ctx, ast.Load):
            # store or del operation, means the name is masked in the
            # current scope
            try:
                self.scopes[node.id] = None
            except TypeError:
                # class scope, which we made read-only. These names can't
                # mask anything so just ignore these.
                pass
            return
        # find the scope this name was defined in, starting at the
        # current scope
        imported_name = self.scopes.get(node.id)
        if imported_name is None:
            return
        self.used_at.append((imported_name, node.id, node.lineno))
```

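The relative-import handling in `visit_ImportFrom` is the least obvious part of the visitor, so here is the same logic as a standalone function. This is only a sketch: the name `resolve_import_source` and its parameters are made up for illustration, and it assumes the import does not climb above the top-level package (which Python itself would reject at import time).

```python
def resolve_import_source(package, module, level):
    """Return the absolute module name that a from-import refers to.

    package: dotted name of the package containing the importing module
    module:  the name after the dots in 'from ..module import x', or None
    level:   number of leading dots (0 for an absolute import)
    """
    if not level:
        # absolute import, nothing to resolve
        return module
    parts = package.split('.')
    if level > 1:
        # one dot means "this package"; each extra dot climbs one level
        parts = parts[:-(level - 1)]
    base = '.'.join(parts)
    return f'{base}.{module}' if module else base


print(resolve_import_source('foo', 'bar', 1))      # from .bar import x
print(resolve_import_source('foo.sub', None, 2))   # from .. import x
print(resolve_import_source('foo.sub', 'bar', 2))  # from ..bar import x
print(resolve_import_source('foo', 'os.path', 0))  # from os.path import x
```

The four calls print `foo.bar`, `foo`, `foo.bar` and `os.path` respectively. The resolved name is what the visitor compares against the target module name and its parent package.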
Now, given a module name `foo.bar` and the following source file from a module in the `foo` package:

```python
from .bar import name1 as namealias1
from foo import bar as modalias1

def loremipsum(dolor):
    return namealias1(dolor)

def sitamet():
    from foo.bar import consectetur

    modalias1 = 'something else'
    consectetur(modalias1)

class Adipiscing:
    def elit_nam(self):
        return modalias1.name2(self)
```

you can parse the above (with the file's text held in the string `source`) and extract all `foo.bar` references with:

```python
>>> collector = ModuleUseCollector('foo.bar', 'foo')
>>> collector.visit(ast.parse(source))
>>> for name, alias, line in collector.used_at:
...     print(f'{name} ({alias}) used on line {line}')
...
foo.bar.name1 (namealias1) used on line 5
foo.bar.consectetur (consectetur) used on line 11
foo.bar (modalias1) used on line 15
```

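Note that the `modalias1` name in the `sitamet` scope is not treated as an actual reference to the imported module, because it is used as a local name there.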
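
The masking comes from how `collections.ChainMap` looks names up: the first mapping that contains the key wins, even when the stored value is `None`. The following sketch replays what happens to `modalias1` while `sitamet` is visited; it assumes the same convention as the collector, where `None` marks a name that is bound locally.

```python
from collections import ChainMap

# Module (global) scope: bound name -> what it was imported as.
scopes = ChainMap({'modalias1': 'foo.bar'})

# Entering a function pushes a fresh mapping in front of the global one.
scopes = scopes.new_child()
inherited = scopes.get('modalias1')   # still found in the global mapping

# A local assignment is recorded as None, which hides the global entry.
scopes['modalias1'] = None
masked = scopes.get('modalias1')

# Leaving the function drops the local mapping; the import is visible again.
scopes = scopes.parents
restored = scopes.get('modalias1')

print(inherited, masked, restored)    # foo.bar None foo.bar
```

Because the visitor processes statements in source order, a name counts as masked only from the assignment onwards. Python itself treats a name assigned anywhere in a function as local to the whole function, so this is an approximation.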