NLP with Java---Overview

HoiDev

Category: NLP. Tags: NLP

Article link: https://blog.csdn.net/qq_33938256/article/details/52763423

Overview of text processing tasks

Finding Parts of Text --> splitting/tokenization
Finding Sentences --> Sentence Boundary Disambiguation (SBD)
Finding People and Things --> Named Entity Recognition (NER)
Detecting Parts of Speech --> POS tagging
Classification (with labels) / Clustering (without labels)
Extracting Relationships --> IR
Combined Approaches

Split/Tokenization --> Sentence (SBD) --> NER --> POS --> Classification/Clustering --> IR
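
As a rough illustration of the first two stages of this pipeline, here is a minimal sketch that uses only the JDK's java.text.BreakIterator. The class name PipelineSketch and the methods sentences and tokens are made up for this example. The sketch splits into sentences first and then tokenizes each sentence, which is just one possible ordering, and a rule-based BreakIterator is a stand-in for the trained models an NLP library would normally provide.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration only.
final class PipelineSketch {

    // Sentence boundary detection: cut the text at the boundaries
    // reported by the JDK's rule-based sentence iterator.
    static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String s = text.substring(start, end).trim();
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    // Tokenization: walk the word boundaries and keep only the spans
    // that start with a letter or digit (drops spaces and punctuation).
    static List<String> tokens(String sentence) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ENGLISH);
        it.setText(sentence);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String t = sentence.substring(start, end);
            if (Character.isLetterOrDigit(t.codePointAt(0))) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "The cat sat. The dog barked!";
        for (String s : sentences(text)) {
            System.out.println(s + " -> " + tokens(s));
        }
    }
}
```

The later stages (NER, POS tagging, classification) generally need a trained model, so they are not sketched here.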

The basic steps include:

Identifying the task
Selecting a model
- Understanding the problem domain and the required quality of results lets us select an appropriate model
Building and training the model
- Training a model means running an algorithm against a set of data to formulate the model, and then verifying it
- A set of labeled samples (a dataset) is called a corpus
Verifying the model
- Split the samples into a training set and a test set
- Often, only part of a corpus is used for training, while the other part is used for verification
Using the model
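
The verification step can be sketched in plain Java as follows. Everything here is an assumption made for illustration: the class CorpusSplit, the Sample and Split types, the 80/20 ratio used in main, and the idea of treating a "model" as a simple function from text to label. It only shows how part of a corpus is held back for verification, not how any particular library trains or evaluates a model.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

// Hypothetical helper class for illustration only.
final class CorpusSplit {

    // One labeled sample from the corpus.
    static final class Sample {
        final String text;
        final String label;
        Sample(String text, String label) {
            this.text = text;
            this.label = label;
        }
    }

    // The two parts of the corpus after splitting.
    static final class Split {
        final List<Sample> train;
        final List<Sample> test;
        Split(List<Sample> train, List<Sample> test) {
            this.train = train;
            this.test = test;
        }
    }

    // Shuffle a copy of the corpus with a fixed seed (so the split is
    // reproducible) and cut it at the requested training fraction.
    static Split split(List<Sample> corpus, double trainFraction, long seed) {
        if (trainFraction <= 0.0 || trainFraction >= 1.0) {
            throw new IllegalArgumentException("trainFraction must be between 0 and 1");
        }
        List<Sample> shuffled = new ArrayList<>(corpus);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainFraction);
        return new Split(
                new ArrayList<>(shuffled.subList(0, cut)),
                new ArrayList<>(shuffled.subList(cut, shuffled.size())));
    }

    // Verification: fraction of test samples whose predicted label
    // matches the label stored in the corpus.
    static double accuracy(Function<String, String> model, List<Sample> test) {
        if (test.isEmpty()) {
            throw new IllegalArgumentException("test set is empty");
        }
        int correct = 0;
        for (Sample s : test) {
            if (model.apply(s.text).equals(s.label)) {
                correct++;
            }
        }
        return (double) correct / test.size();
    }

    public static void main(String[] args) {
        List<Sample> corpus = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            corpus.add(new Sample("sample text " + i, i % 2 == 0 ? "pos" : "neg"));
        }
        Split s = split(corpus, 0.8, 42L);
        // Placeholder "model" that always answers "pos"; a real model
        // would be trained on s.train before being scored on s.test.
        System.out.println("train=" + s.train.size() + ", test=" + s.test.size());
        System.out.println("accuracy=" + accuracy(text -> "pos", s.test));
    }
}
```

Keeping the test samples out of training is the point of the split: accuracy measured on data the model has already seen says little about how it will behave on new text.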

The data to prepare includes both the data used for training and the data that needs to be processed.
