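In NLTK 2.0 you can use nltk.parse.generate to generate all possible
sentences for a given grammar.
The code below defines a function that generates a single sentence based on the production rules in a (P)CFG.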
# This example uses choice to choose from possible expansions
from random import choice
from nltk.grammar import Nonterminal

# This function is based on _generate_all() in nltk.parse.generate
def generate_sample(grammar, items=None):
    # Start from the grammar's start symbol when no items are given
    if items is None:
        items = [grammar.start()]
    frags = []
    for item in items:
        if isinstance(item, Nonterminal):
            # This is where we need to make our changes:
            # pick one production for this nonterminal instead of expanding them all
            chosen_expansion = choice(grammar.productions(lhs=item))
            frags.extend(generate_sample(grammar, chosen_expansion.rhs()))
        else:
            # Terminals are emitted as-is
            frags.append(item)
    return frags
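To make use of the weights in a PCFG, you'll obviously want a better sampling method than choice(), which implicitly assumes that all expansions of the current node are equally probable.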
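One way to do that is to draw each expansion in proportion to its probability. The following is a minimal sketch, not NLTK's own implementation. The names weighted_choice and generate_weighted_sample are made up for illustration. It assumes the grammar object offers start() and productions(lhs=...), and that each production offers rhs() and prob(), as NLTK's PCFG productions do. It also assumes that any symbol with no productions is a terminal, which avoids importing Nonterminal here.

```python
import random

def weighted_choice(productions, rng=random):
    """Pick one production with probability proportional to prod.prob()."""
    weights = [prod.prob() for prod in productions]
    return rng.choices(productions, weights=weights, k=1)[0]

def generate_weighted_sample(grammar, items=None, rng=random):
    """Generate one sentence (a list of terminals) from a PCFG-like grammar.

    Assumption: a symbol with no productions is treated as a terminal.
    """
    if items is None:
        items = [grammar.start()]
    frags = []
    for item in items:
        expansions = grammar.productions(lhs=item)
        if expansions:
            # Sample a single expansion according to the rule weights
            prod = weighted_choice(expansions, rng)
            frags.extend(generate_weighted_sample(grammar, prod.rhs(), rng))
        else:
            frags.append(item)
    return frags
```

With a recent NLTK, a grammar built by PCFG.fromstring(...) should be usable as the grammar argument. Passing a seeded random.Random instance as rng makes the samples reproducible.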