Useful development packages for information retrieval (continuously updated)

Word Vector Tool

http://sourceforge.jp/projects/sfnet_wvtool/

The Word Vector Tool is a simple but flexible Java library for creating word vector representations of text documents. Word vectors can be used for various text processing tasks, such as text classification, text clustering or information retrieval.
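To make the idea of a word vector concrete, here is a minimal sketch in plain Python. It does not use the Word Vector Tool API. The helper names `term_vector` and `cosine` and the sample documents are assumptions made for illustration. Each document becomes a sparse term-frequency vector, and documents are ranked against a query by cosine similarity, which is one common way such vectors are used in retrieval.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a document to a sparse term-frequency vector (token -> count)."""
    # Whitespace tokenization only; a real pipeline would also stem and filter.
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, for illustration only.
docs = [
    "text classification with word vectors",
    "clustering text documents with word vectors",
    "cooking pasta at home",
]
vectors = [term_vector(d) for d in docs]
query = term_vector("word vectors for text clustering")

# Rank document indices by similarity to the query, best match first.
ranked = sorted(range(len(docs)), key=lambda i: cosine(query, vectors[i]), reverse=True)
print([docs[i] for i in ranked])
```

Raw counts are used here for brevity. Weighting schemes such as TF-IDF follow the same pattern with different vector values.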


FudanNLP

http://code.google.com/p/fudannlp/

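Functions

  1. Information retrieval: text classification, news clustering
  2. Chinese processing: Chinese word segmentation, part-of-speech tagging, named entity recognition, keyword extraction, dependency parsing, time phrase recognition
  3. Structured learning: online learning, hierarchical classification, clustering, exact inference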
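Chinese word segmentation, listed above, can be illustrated with a classic dictionary-based baseline, forward maximum matching. This is only a sketch of that baseline and not necessarily the method the toolkit uses. The function name `forward_max_match`, the `max_len` parameter and the toy lexicon are all assumptions made for illustration.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation: take the longest lexicon word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, falling back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy lexicon, for illustration only.
lexicon = {"信息", "检索", "信息检索", "中文", "分词", "工具"}
print(forward_max_match("中文分词工具", lexicon))
```

Characters that match no lexicon entry come out as single-character tokens, so the function always consumes the whole input.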


Using Stanford NLP for Chinese

http://www.zhizhihu.com/html/y2011/3060.html


Mallet: a natural language processing toolkit

http://www.zhizhihu.com/html/y2010/2199.html


http://blog.mashape.com/post/48946187179/20-natural-language-processing-apis

Natural Language Processing, or NLP, is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages.

Here are useful APIs that help bridge the human-computer interaction:

 

  1. Text Processing - The WebKnox text processing API lets you process (natural) language texts. You can detect the text’s language, the quality of the writing, find entity mentions, tag part-of-speech, extract dates, extract locations, or determine the sentiment of the text.
  2. Question-Answering - The WebKnox question-answering API allows you to find answers to natural language questions. These questions can be factual such as “What is the capital of Australia” or more complex.
  3. Jeannie - Jeannie (Voice Actions) is a virtual assistant with over two million downloads, now also available via API. The objective of this service is to provide you and your robot with the smartest answer to any natural language question, just like Siri.
  4. Diffbot - Diffbot extracts data from web pages automatically and returns structured JSON. For example, our Article API returns an article’s title, author, date and full-text. Use the web as your database! We use computer vision, machine learning and natural language processing to add structure to just about any web page.
  5. nlpTools - Text processing framework to analyse Natural Language. It is especially focused on text classification and sentiment analysis of online news media (general-purpose, multiple topics).
  6. Speech2Topics - Yactraq Speech2Topics is a cloud service that converts audiovisual content into topic metadata via speech recognition and natural language processing. Customers use Yactraq metadata to target ads, build UX features like content search and discovery, and mine YouTube videos for brand sentiment.
  7. Stremor Automated Summary and Abstract Generator - Language Heuristics goes a step beyond Natural Language Processing to extract intent from text. Summaries are created through extraction, but maintain readability by keeping sentence dependencies intact.
  8. Repustate Sentiment and Social Media Analytics - Repustate’s sentiment analysis and social media analytics API allows you to extract key words and phrases and determine social media sentiment in one of many languages. These languages include English, Arabic, German, French and Spanish. Monitor social media as well using our API and retrieve your data all with simple API calls.
  9. Sentiment Analysis for Social Media - The multilingual sentiment analysis API (with exceptional accuracy, 83.4% as opposed to industry standard of 65.4%, and available in Mandarin) from Chatterbox classifies social media texts as positive or negative, with a free daily allowance to get you started. The system uses advanced statistical models (machine learning & NLP) trained on social data, meaning the detection can handle slang, common misspellings, emoticons, hashtags, etc.
  10. Skyttle 2.0 - Skyttle API extracts topical keywords (single words and multiword expressions) and sentiment (positive or negative) expressed in text. Languages supported are English, French, German, Russian.
  11. Text-Processing - Sentiment analysis, stemming and lemmatization, part-of-speech tagging and chunking, phrase extraction and named entity recognition.
  12. Stemmer - This API takes a paragraph and returns the text with each word stemmed using the Porter stemmer, the Snowball stemmer or the UEA stemmer.
  13. SpringSense Meaning Recognition - The fastest and most accurate Meaning Recognition (Word Sense Disambiguation) API in the world. Recognises any nouns in a body of text and allows you to provide a rich user-interface with meaning definitions.
  14. LanguageTool - Style and grammar checking / proofreading for more than 25 languages, including English, French, Polish, Spanish and German.
  15. DuckDuckGo - DuckDuckGo Zero-click Info includes topic summaries, categories, disambiguation, official sites, !bang redirects, definitions and more. You can use this API for many things, e.g. to define people, places, things, words and concepts; to get direct links to other services (via !bang syntax); to list related topics; and to find official sites when available.
  16. Jetlore Semantic Text Processing - Semantic Text Processing API extracts named entities from English text, including social media posts, user comments, product reviews, picture captions, email content, news articles, and web pages. We guarantee exceptional accuracy of over 90% precision at over 60% recall. The API handles slang, common misspellings, understands hashtags, and auto-fetches embedded URLs making it ideal for processing any user-generated content and social media.

  17. ESA Semantic Relatedness - Calculates the semantic relatedness between pairs of text excerpts based on the likeness of their meaning or semantic content.
  18. AlchemyAPI - AlchemyAPI provides advanced cloud-based and on-premise text analysis infrastructure that eliminates the expense and difficulty of integrating natural language processing systems into your application, service, or data processing pipeline.
  19. Sentence Recognition - The Sentence Recognition API matches strings of text based on the meaning of the sentences. Its NLP engine uses a semantic network to understand the text presented.
  20. Machine Linking - We develop a multilingual SaaS platform performing semantic analysis of textual documents: by interfacing with our API, developers can connect unstructured documents, written in different languages, to resources in the Linked Open Data cloud such as DBPedia or Freebase.
  21. TextTeaser - TextTeaser is an automatic summarization API. It extracts the most important sentences of an article. The purpose of the API is to provide a preview of what the article is all about.
  22. Textalytics Media Analytics - The Textalytics Media Analysis API analyzes mentions, topics, opinions and facts in all types of media. It provides the following services:
      - Sentiment analysis: extracts positive and negative opinions according to the context.
      - Entity extraction: identifies persons, companies, brands, products, etc. and provides a canonical form that unifies different mentions (IBM, International Business Machines Corporation, etc.).
      - Topic and keyword extraction.
      - Facts and other key information: dates, URLs, addresses, user names, e-mails and money amounts.
      - Thematic classification: organizes information by topic using the IPTC standard classification (more than 200 categories, hierarchically structured).
      It is configured for different types of media: microblogging and social networks, blogs and news.
  23. Wit.ai - Wit.ai enables developers to add a Siri-like modern natural language interface to their app or device with minimal effort. It integrates well with Android’s speech-to-text engine.
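Several entries in the list describe extractive summarization, that is, picking the most important sentences of a text. The sketch below shows one simple way to do that, by scoring each sentence by the average corpus frequency of its words. It is an illustration under stated assumptions, not the algorithm used by any of the listed services. The function name `summarize`, its `max_sentences` parameter and the sample article are hypothetical.

```python
import re
from collections import Counter


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    top = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(top[:max_sentences])
    return [sentences[i] for i in keep]


# Hypothetical sample text, for illustration only.
article = (
    "Summarization APIs extract key sentences. "
    "Key sentences carry the most frequent words. "
    "Lunch was late today."
)
print(summarize(article, max_sentences=2))
```

This sketch does no stop-word filtering, so on real text common function words would dominate the scores. Removing them before counting is the usual first refinement.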

You should also check out our other useful API lists for machine learning, summarizing text, sentiment analysis, SMS APIs, and face recognition APIs.

