POS tagger in Python without NLTK

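I want to build a part-of-speech tagger for determiners and prepositions in Sorani Kurdish. I use the code below to put a tag after each preposition or determiner in my Kurdish text.

import os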

SOR = open("SOR-1.txt", "r+", encoding="utf-8")
old_text = SOR.read()

# Insert a space before each punctuation mark so it splits off as its own token
punkt = [".", "!", ",", ":", ";"]
text = ""
for i in old_text:
    if i in punkt:
        text += " " + i
    else:
        text += i

d = {"DET":["ئێمە" , "ئێوە" , "ئەم" , "ئەو" , "ئەوان" , "ئەوەی", "چەند" ], "PREP":["بۆ","بێ","بێجگە","بە","بەبێ","بەدەم","بەردەم","بەرلە","بەرەوی","بەرەوە","بەلای","بەپێی","تۆ","تێ","جگە","دوای","دەگەڵ","سەر","لێ","لە","لەبابەت","لەباتی","لەبارەی","لەبرێتی","لەبن","لەبەینی","لەبەر","لەدەم","لەرێ","لەرێگا","لەرەوی","لەسەر","لەلایەن","لەناو","لەنێو","لەو","لەپێناوی","لەژێر","لەگەڵ","ناو","نێوان","وەک","وەک","پاش","پێش","" ], "punkt":[".", ",", "!"]}

text = text.split()

# Write word/TAG for every token found in one of the word lists
for w in text:
    for pos in d:
        if w in d[pos]:
            SOR.write(w + "/" + pos + " ")

SOR.close()

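What I want is to add the POS tag in the text right after each word that appears in the dictionary defined above, but what I get is a separate list of words and POS tags at the end of the file.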
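Two things in the code above seem to explain that result. After `SOR.read()` the file position sits at the end of the file, so every later `SOR.write(...)` appends instead of replacing the text. And the loop only writes tokens that are found in the dictionary, so untagged words never make it into the output. A minimal sketch of one possible fix follows. It rebuilds every line in memory and writes the result to a separate file. The names `build_lookup`, `tag_text`, `tag_file` and the output path are hypothetical, not part of the original code.

```python
import re

def build_lookup(lexicon):
    """Invert {tag: [words]} into {word: tag}, skipping empty strings."""
    return {word: tag for tag, words in lexicon.items() for word in words if word}

def tag_text(text, word_to_tag):
    """Return text with "/TAG" appended to every token found in word_to_tag."""
    tagged_lines = []
    for line in text.splitlines():
        # Put a space before punctuation so it becomes a separate token.
        spaced = re.sub(r"([.!,:;])", r" \1", line)
        tokens = []
        for token in spaced.split():
            tag = word_to_tag.get(token)
            # Keep every token; only known ones get a tag.
            tokens.append(f"{token}/{tag}" if tag else token)
        tagged_lines.append(" ".join(tokens))
    return "\n".join(tagged_lines)

def tag_file(src_path, dst_path, lexicon):
    """Read src_path, tag its contents, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(tag_text(text, build_lookup(lexicon)))
```

With the dictionary `d` from the question, a call could look like `tag_file("SOR-1.txt", "SOR-1.tagged.txt", d)`. Writing to a second file keeps the source text intact. To overwrite in place instead, the original `r+` handle would need `SOR.seek(0)` before writing and `SOR.truncate()` afterwards.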
