Processing text data in Python: TextBlob is a Python (2 and 3) library for processing textual data

TextBlob: Simplified Text Processing


TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

from textblob import TextBlob

text = '''
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact.
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
'''

blob = TextBlob(text)

blob.tags           # [('The', 'DT'), ('titular', 'JJ'),
                    #  ('threat', 'NN'), ('of', 'IN'), ...]

blob.noun_phrases   # WordList(['titular threat', 'blob',
                    #            'ultimate movie monster',
                    #            'amoeba-like mass', ...])

for sentence in blob.sentences:
    print(sentence.sentiment.polarity)
# 0.060
# -0.341

TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.

Features

Noun phrase extraction

Part-of-speech tagging

Sentiment analysis

Classification (Naive Bayes, Decision Tree)

Tokenization (splitting text into words and sentences)

Word and phrase frequencies

Parsing

n-grams

Word inflection (pluralization and singularization) and lemmatization

Spelling correction

Add new models or languages through extensions

WordNet integration
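
To make two of the listed features concrete, here is a minimal standard-library sketch of n-grams and word frequencies. It is not TextBlob's implementation: the helper names tokenize, ngrams and word_counts and the regex tokenizer are assumptions made for illustration. In TextBlob itself the corresponding entry points are blob.ngrams(n=...) and blob.word_counts, which rely on the library's own tokenizer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out word tokens with a crude regex.

    Hypothetical helper: TextBlob uses a proper tokenizer instead.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def word_counts(tokens):
    """Map each token to the number of times it occurs."""
    return Counter(tokens)


sample = "The blob ate the town. The town fought the blob."
tokens = tokenize(sample)
bigrams = ngrams(tokens, n=2)
counts = word_counts(tokens)

print(tokens[:4])     # first four tokens
print(bigrams[:3])    # first three bigrams
print(counts["the"])  # how often "the" occurs
```

The sliding window yields len(tokens) - n + 1 n-grams, so a text shorter than n tokens gives an empty list.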

Get it now

$ pip install -U textblob

$ python -m textblob.download_corpora
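
After running the two commands above, a quick check like the following can confirm that the interpreter in use actually sees the package. This is only a sketch using the standard library; the function name textblob_status is hypothetical, and the check does not verify that the corpora were downloaded.

```python
import importlib.util
import sys


def textblob_status():
    """Report the running Python version and whether textblob is importable."""
    spec = importlib.util.find_spec("textblob")
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "textblob_installed": spec is not None,
    }


status = textblob_status()
print(status)
```

If textblob_installed is False, pip most likely installed into a different interpreter than the one running the script; invoking it as python -m pip install -U textblob avoids that mismatch.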

Examples

See more examples at the Quickstart guide.

Documentation

Full documentation is available at https://textblob.readthedocs.io/.

Requirements

Python >= 2.7 or >= 3.5

Project Links

License

MIT licensed. See the bundled LICENSE file for more details.
