Converting a Nested Parenthesis Tree to a Nested List in Python

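This article shows how to convert a complex nested parenthesis tree into a nested list in Python, using a recursive function to parse it, and walks through the result.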

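In Python, you sometimes need to convert a nested parenthesis tree into a nested list. For example, given the following nested parenthesis tree: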

( Satellite (span 69 74) (rel2par Elaboration)
        ( Nucleus (span 69 72) (rel2par span)
          ( Nucleus (span 69 70) (rel2par span)
            ( Nucleus (leaf 69) (rel2par span) (text _!MERRILL LYNCH READY ASSETS TRUST :_!) )
            ( Satellite (leaf 70) (rel2par Elaboration) (text _!8.65 % ._!) )
          )
          ( Satellite (span 71 72) (rel2par Elaboration)
            ( Nucleus (leaf 71) (rel2par span) (text _!Annualized average rate of return_!) )
            ( Satellite (leaf 72) (rel2par Temporal) (text _!after expenses for the past 30 days ;_!) )
          )
        )
        ( Satellite (span 73 74) (rel2par Elaboration)
          ( Nucleus (leaf 73) (rel2par span) (text _!not a forecast_!) )
          ( Satellite (leaf 74) (rel2par Elaboration) (text _!of future returns ._!) )
        )
      )

we want to convert it into the following nested list:

['Satellite', '(span 69 74)', '(rel2par Elaboration)', ['Nucleus', '(span 69 72)', '(rel2par span)', ['Nucleus', '(span 69 70)', '(rel2par span)', ['Nucleus', '(leaf 69)', '(rel2par span)', '(text _!MERRILL LYNCH READY ASSETS TRUST :_!)'], ['Satellite', '(leaf 70)', '(rel2par Elaboration)', '(text _!8.65 % ._!)']], ['Satellite', '(span 71 72)', '(rel2par Elaboration)', ['Nucleus', '(leaf 71)', '(rel2par span)', '(text _!Annualized average rate of return_!)'], ['Satellite', '(leaf 72)', '(rel2par Temporal)', '(text _!after expenses for the past 30 days ;_!)']]], ['Satellite', '(span 73 74)', '(rel2par Elaboration)', ['Nucleus', '(leaf 73)', '(rel2par span)', '(text _!not a forecast_!)'], ['Satellite', '(leaf 74)', '(rel2par Elaboration)', '(text _!of future returns ._!)']]]

Solution

The following Python code converts the nested parenthesis tree into a nested list:

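import re

# Assumption: the article never shows resexp. This pattern splits on
# parentheses and, thanks to the capturing group, keeps them as tokens.
resexp = re.compile(r'([()])')


def parse(s):
    def parse_helper(level=0):
        try:
            token = next(tokens)
        except StopIteration:
            # Input ended: an error only if a group is still open.
            if level:
                raise Exception('Missing close paren')
            else:
                return []
        if token == ')':
            if not level:
                raise Exception('Missing open paren')
            else:
                return []
        elif token == '(':
            # Parse the nested group first, then the rest of this level.
            return [parse_helper(level+1)] + parse_helper(level)
        else:
            return [token] + parse_helper(level)

    # Split on parentheses, trim whitespace, and drop empty pieces.
    tokens = iter(filter(None, (i.strip() for i in resexp.split(s))))
    return parse_helper()

if __name__ == '__main__':
    with open('tree.thing', 'r') as treefile:
        tree = treefile.read()

    print(parse(tree))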

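Here resexp is a compiled regular expression that splits the nested parenthesis tree into individual tokens: the parentheses themselves and the text between them.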
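The article does not give the pattern behind resexp, so the following is only a sketch of what the tokenizing step might look like. It assumes the pattern `([()])`, and the helper name `tokenize` is hypothetical. The capturing group matters: with it, `re.split` returns the parentheses as well as the text between them.

```python
import re

# Assumed pattern (not given in the article): the capturing group makes
# re.split keep each parenthesis as its own item in the result.
resexp = re.compile(r'([()])')


def tokenize(s):
    """Return the whitespace-trimmed, non-empty pieces of s (hypothetical helper)."""
    return [t for t in (piece.strip() for piece in resexp.split(s)) if t]


sample = '( Nucleus (leaf 73) (rel2par span) (text _!not a forecast_!) )'
tokens = tokenize(sample)
print(tokens)
# ['(', 'Nucleus', '(', 'leaf 73', ')', '(', 'rel2par span', ')',
#  '(', 'text _!not a forecast_!', ')', ')']
```

Text between two parentheses stays together as one token, which is why an attribute such as `leaf 73` is not split on its internal space.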

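Running the code above on the example tree gives the following output (shown abbreviated: lists nested more than four levels deep are elided as [...]):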

[['Satellite',
  ['span 69 74'],
  ['rel2par Elaboration'],
  ['Nucleus',
   ['span 69 72'],
   ['rel2par span'],
   ['Nucleus', [...], [...], [...], [...]],
   ['Satellite', [...], [...], [...], [...]]],
  ['Satellite',
   ['span 73 74'],
   ['rel2par Elaboration'],
   ['Nucleus', [...], [...], [...]],
   ['Satellite', [...], [...], [...]]]]]
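The `[...]` entries in the listing above are display elisions rather than data, and a plain `print` call would not produce them. As a minimal sketch, a depth-limited view of this kind can be generated with the standard library's `pprint` module. The small hand-written list below is only a stand-in for the parser's result, and the depth value is chosen for illustration.

```python
from pprint import pformat

# Illustrative stand-in for a parsed tree (not output from the article).
nested = [['Satellite',
           ['span 73 74'],
           ['Nucleus', ['leaf 73'], ['text _!not a forecast_!']]]]

# depth=3 renders three levels of nesting and elides anything deeper.
short = pformat(nested, depth=3)
print(short)
```

Here the two sublists inside the `Nucleus` entry sit at the fourth level, so they are rendered as `[...]`, while `['span 73 74']` at the third level is printed in full.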

This is the tree as a nested list, and it is close to what we wanted. Two details differ from the target shown earlier: each attribute group such as (span 69 74) comes back as a one-element sublist, ['span 69 74'], rather than as the string '(span 69 74)', and the whole result is wrapped in one extra outer list.
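If the attribute groups should stay as strings such as '(span 69 74)', one possible variant is sketched below. It is not the article's method. It rests on the assumption that any group containing no nested parentheses is an attribute and should be kept as one string. The names `TOKEN` and `parse_flat_attrs` are hypothetical. The sketch also uses an explicit stack instead of recursion, because the recursive version makes one nested call per token and a large tree could run into Python's recursion limit.

```python
import re

# Assumption: a group with no nested parentheses, e.g. "(span 69 74)",
# is an attribute and is kept as a single string token.
TOKEN = re.compile(r'\([^()]*\)|[()]|[^\s()]+')


def parse_flat_attrs(s):
    """Parse s into nested lists, keeping innermost groups as strings."""
    root = []
    stack = [root]  # stack[-1] is the list currently being filled
    for token in TOKEN.findall(s):
        if token == '(':
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif token == ')':
            if len(stack) == 1:
                raise ValueError('Missing open paren')
            stack.pop()
        else:
            stack[-1].append(token)
    if len(stack) > 1:
        raise ValueError('Missing close paren')
    return root


tree = '''( Satellite (span 73 74) (rel2par Elaboration)
  ( Nucleus (leaf 73) (rel2par span) (text _!not a forecast_!) )
  ( Satellite (leaf 74) (rel2par Elaboration) (text _!of future returns ._!) )
)'''
result = parse_flat_attrs(tree)[0]  # [0] unwraps the single top-level node
print(result)
```

On this sub-tree the result is a list that starts with 'Satellite', '(span 73 74)', '(rel2par Elaboration)' and then holds one list per child node. One caveat of the assumption: text that itself contains parentheses would not be kept as a single string.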
