Python spaCy code

The code is as follows:

import spacy
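# 'en' is a shortcut link from spaCy 1.x/2.x; spaCy 3+ needs a package name such as 'en_core_web_sm'.
nlp = spacy.load('en')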
test_doc = nlp(u"it's word tokenize test for spacy")

# Tokenization
print("\n1. Tokenization")
print(test_doc)
for token in test_doc:
    print(token)

# Sentence segmentation
print("\n2. Sentence segmentation")
test_doc = nlp(u'Natural language processing (NLP) deals with the application of computational models to text or speech data. Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways. NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form. From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.')
print(test_doc)
for sent in test_doc.sents:
    print(sent)

# Lemmatization (token.lemma_ is the lemma string, token.lemma is its integer ID)
print("\n3. Lemmatization")
test_doc = nlp(u"you are best. it is lemmatize test for spacy. I love these books")
print(test_doc)
for token in test_doc:
    print(token, token.lemma_, token.lemma)

# Part-of-speech tagging
print("\n4. Part-of-speech tagging")
print(test_doc)
for token in test_doc:
    print(token, token.pos_, token.pos)


# Named entity recognition
print("\n5. Named entity recognition")
test_doc = nlp(u"Rami Eid is studying at Stony Brook University in New York")
print(test_doc)
for ent in test_doc.ents:
    print(ent, ent.label_, ent.label)

# Noun phrase extraction
print("\n6. Noun phrase extraction")
test_doc = nlp(u'Natural language processing (NLP) deals with the application of computational models to text or speech data. Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways. NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form. From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.')
print(test_doc)
for chunk in test_doc.noun_chunks:
    print(chunk)

# Similarity of two words computed from word vectors
print("\n7. Similarity of two words computed from word vectors")
test_doc = nlp(u"Apples and oranges are the same . Boots and hippos aren't.")
print(test_doc)
apples = test_doc[0]
print(apples)
oranges = test_doc[2]
print(oranges)
boots = test_doc[7]
print(boots)
hippos = test_doc[9]
print(hippos)

print(apples.similarity(oranges))
print(boots.similarity(hippos))
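
Step 7 relies on Token.similarity, which by default spaCy computes as the cosine similarity of the two tokens' vectors. The sketch below shows that calculation in plain Python so it runs without a model. The function name cosine_similarity and the three-dimensional vectors are assumptions made up for illustration; they are not spaCy's internals or real word vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: an all-zero vector scores 0.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, invented for illustration only.
vec_apples = [1.0, 2.0, 0.0]
vec_oranges = [2.0, 4.0, 0.0]  # same direction as vec_apples
vec_boots = [0.0, 0.0, 3.0]    # orthogonal to vec_apples

print(cosine_similarity(vec_apples, vec_oranges))
print(cosine_similarity(vec_apples, vec_boots))
```

Vectors pointing the same way score close to 1 and orthogonal vectors score 0, which is why a related pair of words is expected to score higher than an unrelated pair.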

Output:

/usr/bin/python3.5 /home/wmmm/PycharmProjects/untitled/zstp.py

1. Tokenization
it's word tokenize test for spacy
it
's
word
tokenize
test
for
spacy

2. Sentence segmentation
Natural language processing (NLP) deals with the application of computational models to text or speech data. Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways. NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form. From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.
Natural language processing (NLP) deals with the application of computational models to text or speech data.
Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways.
NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form.
From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.

3. Lemmatization
you are best. it is lemmatize test for spacy. I love these books
you -PRON- 561228191312463089
are be 10382539506755952630
best good 5711639017775284443
. . 12646065887601541794
it -PRON- 561228191312463089
is be 10382539506755952630
lemmatize lemmatize 4507259281035238268
test test 1618900948208871284
for for 16037325823156266367
spacy spacy 10639093010105930009
. . 12646065887601541794
I -PRON- 561228191312463089
love love 3702023516439754181
these these 6459564349623679250
books book 13814433107111459297

4. Part-of-speech tagging
you are best. it is lemmatize test for spacy. I love these books
you PRON 94
are VERB 99
best ADJ 83
. PUNCT 96
it PRON 94
is VERB 99
lemmatize ADJ 83
test NOUN 91
for ADP 84
spacy NOUN 91
. PUNCT 96
I PRON 94
love VERB 99
these DET 89
books NOUN 91

5. Named entity recognition
Rami Eid is studying at Stony Brook University in New York
Rami Eid PERSON 378
Stony Brook University ORG 381
New York GPE 382

6. Noun phrase extraction
Natural language processing (NLP) deals with the application of computational models to text or speech data. Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways. NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form. From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.
Natural language processing
the application
computational models
Application areas
NLP
automatic (machine) translation
languages
dialogue systems
a human
a machine
natural language
information extraction
the goal
unstructured text
structured (database) representations
flexible ways
NLP technologies
a dramatic impact
the way
people
computers
the way
people
the use
language
the way
people
the vast amount
linguistic data
electronic form
a scientific viewpoint
NLP
fundamental questions
formal models
example
natural language phenomena
algorithms
these models

7. Similarity of two words computed from word vectors
Apples and oranges are the same . Boots and hippos aren't.
Apples
oranges
Boots
hippos
0.518096
0.158362

Process finished with exit code 0
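
The noun-phrase output in section 6 lists phrases in document order, so repeats such as "the way" and "people" show up several times. One possible way to summarize them is to count the phrases. The sketch below hard-codes a few phrases copied from that output so it runs without a model; with a live Doc the list would instead be built from test_doc.noun_chunks, as noted in the comment.

```python
from collections import Counter

# Phrases copied from the section 6 output, in printed order.
# With a live spaCy Doc this list would come from
# [chunk.text for chunk in test_doc.noun_chunks].
chunks = [
    "NLP technologies", "a dramatic impact", "the way", "people",
    "computers", "the way", "people", "the use", "language",
    "the way", "people",
]

# Count how often each phrase occurs and show the most frequent ones.
counts = Counter(chunks)
for phrase, n in counts.most_common(3):
    print(phrase, n)
```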