Research on Chinese Text Processing Based on the Python Language
温珍
【Journal】《南昌工程学院学报》 (Journal of Nanchang Institute of Technology)
【Year (Volume), Issue】2018(037)003
【Abstract】
With the popularization of computer technology, text processing based on machine language is widely used in various fields. How to combine the advantages of statistical and mechanical methods and apply them to the automatic processing of texts has naturally become a focus of corpus linguistic research at home and abroad. Compared with other countries, domestic research in the field of Chinese text processing lags behind. Research on Chinese text processing based on machine language is therefore all the more meaningful, especially for Chinese character encoding and word segmentation. The examples are drawn mainly from HSK test compositions written by English-speaking learners of Chinese and from a self-built corpus of compositions by native Chinese speakers. Using word segmentation, high-frequency word extraction, and syntactic analysis as the key steps in a study of verb-noun collocations, in-depth observation of the corpora reveals that learners of Chinese whose native language is English are less likely to use Verb+Object collocations than native Chinese speakers. The paper also proposes prospects for further research on Chinese Verb+Object collocations.
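As an illustration of how a mechanical method and a statistical method can be combined, the sketch below segments Chinese sentences by forward maximum matching against a dictionary and then counts word frequencies. It is a minimal sketch and not the procedure used in the paper: the lexicon, the sample sentences, and the function names `fmm_segment` and `top_words` are all hypothetical, and a real study would rely on a full dictionary or an established segmenter.

```python
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only).
LEXICON = {"我", "喜欢", "学习", "中文", "汉语", "提高", "水平"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def fmm_segment(text, lexicon=LEXICON, max_len=MAX_WORD_LEN):
    """Forward maximum matching: at each position, take the longest lexicon word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def top_words(sentences, n=3):
    """Count segmented tokens over all sentences and return the n most frequent."""
    counts = Counter()
    for sentence in sentences:
        counts.update(fmm_segment(sentence))
    return counts.most_common(n)


# Hypothetical sample sentences, not taken from the HSK corpus.
sentences = ["我喜欢学习中文", "我学习汉语", "学习中文提高水平"]
print(fmm_segment(sentences[0]))  # ['我', '喜欢', '学习', '中文']
print(top_words(sentences, 1))    # [('学习', 3)]
```

Dictionary matching handles the segmentation step, and the frequency count supplies the statistical step. Extracting verb-noun collocations would additionally require part-of-speech or syntactic information, which this sketch does not attempt.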
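With the spread of computer technology, text processing methods based on machine language have begun to be applied in many fields; how to combine statistical methods with mechanical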