By default, the Stanford POS tagger demo outputs slashTags. How can I change the code so that it outputs XML instead?
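In MaxentTagger.java, change the default output style in public String apply(String o). The change goes in the else branch, which is only used when config is null: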
public String apply(String o) {
    StringBuilder taggedSentence = new StringBuilder();
    int outputStyle;
    boolean tokenize;
    if (config != null) {
        outputStyle = PlainTextDocumentReaderAndWriter.asIntOutputFormat(config.getOutputFormat());
        tokenize = config.getTokenize();
    } else {
        // Original default:
        // outputStyle = PlainTextDocumentReaderAndWriter.OUTPUT_STYLE_SLASH_TAGS;
        // Change the default tag style to XML:
        outputStyle = PlainTextDocumentReaderAndWriter.OUTPUT_STYLE_XML;
        tokenize = true;
    }
    // ... rest of the method unchanged
Output after the change:
<sentence id="0">
<word wid="0" pos="RB">Butterfly</word>
<word wid="1" pos=",">,</word>
<word wid="2" pos="PRP">you</word>
<word wid="3" pos="VBP">are</word>
<word wid="4" pos="JJ">mine</word>
<word wid="5" pos=".">.</word>
</sentence>
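If you would rather not patch MaxentTagger.java, another option is to post-process the slashTags output yourself. The following is only a minimal sketch and is not part of the Stanford tagger: the class SlashTagsToXml and its methods are hypothetical names made up for illustration, and it assumes the tagger output is a single sentence of whitespace-separated word/TAG tokens with a known separator character. It produces the same element and attribute layout as the XML shown above.

```java
// Hypothetical helper, not part of the Stanford POS tagger.
// Assumes input of the form "word/TAG word/TAG ..." for one sentence.
class SlashTagsToXml {

    // Escape characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Convert one slash-tagged sentence into <sentence>/<word> XML.
    static String toXml(String tagged, int sentenceId, char separator) {
        StringBuilder xml = new StringBuilder();
        xml.append("<sentence id=\"").append(sentenceId).append("\">\n");
        int wid = 0;
        for (String token : tagged.trim().split("\\s+")) {
            // Split on the LAST separator so words containing the
            // separator (for example "1/2/CD") keep their full text.
            int cut = token.lastIndexOf(separator);
            if (cut <= 0) {
                continue; // empty token or no word part: skip it
            }
            String word = token.substring(0, cut);
            String pos = token.substring(cut + 1);
            xml.append("<word wid=\"").append(wid++)
               .append("\" pos=\"").append(escapeXml(pos)).append("\">")
               .append(escapeXml(word))
               .append("</word>\n");
        }
        xml.append("</sentence>");
        return xml.toString();
    }

    public static void main(String[] args) {
        String tagged = "Butterfly/RB ,/, you/PRP are/VBP mine/JJ ./.";
        System.out.println(toXml(tagged, 0, '/'));
    }
}
```

If your tagger build separates word and tag with a different character, pass that character as the third argument. Unlike the patch above, this approach leaves the library source untouched, at the cost of parsing the output a second time.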