Sentence Splitting for Chinese and English

This post relies on two packages: pyltp for Chinese and nltk for English.

Installation is skipped here; usage is as follows:

import nltk  # English sentence splitting
from pyltp import SentenceSplitter  # Chinese sentence splitting

s = "Since I was very small, I was very shy in the public place, so I always avoided giving performance in front of so many people. Though I tried hard to get over it in school, I still felt uneasy in the public place. When I came to the job market, I realized that I must get over my fear, or I would lose my stage."

print("\n".join(nltk.sent_tokenize(s)))

# Since I was very small, I was very shy in the public place, so I always avoided giving performance in front of so many people.
# Though I tried hard to get over it in school, I still felt uneasy in the public place.
# When I came to the job market, I realized that I must get over my fear, or I would lose my stage. 

x = "在我很小的时候,在公共场合我会感到非常的害羞,所以我总是避免在人多的情况下表演。虽然我在学校努力想要克服这个问题,但在公共场合我还是感到不自在。当我来到就业市场时,我意识到我必须克服我的恐惧了,否则我将失去自己的舞台。"

sents = SentenceSplitter.split(x)
print("\n".join(sents))

# 在我很小的时候,在公共场合我会感到非常的害羞,所以我总是避免在人多的情况下表演。
# 虽然我在学校努力想要克服这个问题,但在公共场合我还是感到不自在。
# 当我来到就业市场时,我意识到我必须克服我的恐惧了,否则我将失去自己的舞台。
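
When neither pyltp nor nltk can be installed, a rough rule-based fallback is possible. The sketch below is not part of either package: `split_sentences` is a hypothetical helper name, and it only breaks the text after Chinese or ASCII sentence-ending punctuation using the standard `re` module. It assumes that an ASCII period ends a sentence only when whitespace follows it, so it will still mis-split abbreviations such as "Mr. Smith", a case that nltk's trained tokenizer is designed to handle.

```python
import re


def split_sentences(text):
    """Rule-based splitter for Chinese and English text (illustrative sketch only)."""
    # Break after runs of full-width terminators and ASCII "!" / "?".
    text = re.sub(r'([。！？!?]+)', r'\1\n', text)
    # Break after an ASCII period only when whitespace follows,
    # so decimals such as "3.14" stay intact.
    text = re.sub(r'(\.)(\s+)', r'\1\n', text)
    # Existing newlines in the input also act as boundaries.
    return [s.strip() for s in text.split('\n') if s.strip()]


if __name__ == "__main__":
    print("\n".join(split_sentences("你好。Hello there. 再见！")))
```

Because both rules run on the same string, mixed Chinese and English input goes through a single call without any language detection.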

http://www.pythontip.com/blog/post/10012/
