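java ansj custom dictionary_ansj_seg - ansj word segmentation: a true Java implementation of ICT. Segmentation quality and speed both exceed the open-source version of ICT. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries...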

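ansj_seg - ansj word segmentation: a true Java implementation of ICT. Segmentation quality and speed both exceed the open-source version of ICT. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries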


The best Java Chinese word segmenter!

https://github.com/NLPchina/ansj_seg

Dependencies:

org.nlpcn:nlp-lang:1.7.8

Related Projects

BaiduLac by Baidu: Baidu's open-source lexical analysis tool for Chinese, covering word segmentation, part-of-speech tagging, and named entity recognition.

nlp

chinese-nlp

Python

A deep learning NLP pipeline implemented on TensorFlow. Following the "simplicity" rule, this project uses TensorFlow to implement a new NLP pipeline. You can extend the project to train models on your own corpus or language. Pre-trained models for a Chinese corpus are distributed, and a free RESTful NLP API is also provided. Visit http://www.deepnlp.org/api/v1.0/pipeline for details. Downloading pre-trained models: if you install deepnlp via pip, the pre-trained models are not included because of size restrictions. You can download the full models for 'Segment', 'POS' (en and zh), 'NER' (zh, zh_entertainment, zh_o2o), and 'Textsum' by calling the download function.

C++

Davepy is a Chinese Pinyin Input Method Editor (IME) that supports smooth conversion from Chinese Pinyin to Chinese Hanzi. It introduces some statistical language modeling approaches, and more NLP techniques will be added over time.

Python

"Jieba" (Chinese for "to stutter") Chinese text segmentation: built to be the best Python Chinese word segmentation module.

natural-language-processing

nlp

Python

This code belongs to the "Implementing a CNN for Text Classification in Tensorflow" blog post. It is a slightly simplified implementation of Kim's "Convolutional Neural Networks for Sentence Classification" paper in TensorFlow.

text-classification

convolutional-neural-networks

tensorflow

cnn

deep-learning

chinese

nlp

Go

Efficient text segmentation in Go; supports English, Chinese, Japanese, and other languages. The dictionary is implemented with a double-array trie, and the segmenter algorithm is a word-frequency-based shortest path combined with dynamic programming.
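The word-frequency shortest-path idea described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it uses a plain Python dict instead of a double-array trie, the function name `segment` and the `TOY_FREQ` counts are hypothetical, and the fallback cost for unknown single characters is an assumption made so that a path always exists.

```python
import math


def segment(text, freq):
    """Split text so the summed -log(probability) of the chosen words is minimal."""
    total = sum(freq.values())
    max_len = max(len(w) for w in freq)
    n = len(text)
    # best[i] = (cost of the cheapest path covering text[:i], start index of its last word)
    best = [(0.0, 0)] + [(math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in freq:
                cost = -math.log(freq[word] / total)
            elif end - start == 1:
                # Assumption: an unknown single character is treated as a very rare word.
                cost = -math.log(0.5 / total)
            else:
                continue
            candidate = best[start][0] + cost
            if candidate < best[end][0]:
                best[end] = (candidate, start)
    # Follow the back-pointers from the end of the text to recover the words.
    words, end = [], n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Hypothetical toy frequencies, invented for illustration only.
TOY_FREQ = {"研究": 50, "研究生": 20, "生命": 40, "命": 5, "的": 200, "起源": 30, "生": 5}

print(segment("研究生命的起源", TOY_FREQ))  # ['研究', '生命', '的', '起源']
```

Minimizing the summed negative log-probabilities is the same as finding the shortest path through the graph of candidate words, which is why a single left-to-right dynamic-programming pass is enough.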

segment

nlp

gse

chinese

english

japanese

trie

Python

SegyMAT is a set of Matlab/Octave m-files to read and write SEG Y data following SEG Y Revision 0 and 1

This project segments text into tokens according to its context and semantics. The segmenter uses forward maximum matching and CRF algorithms to split text.
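As a rough illustration of the dictionary-matching half of that approach, here is a minimal sketch of forward maximum matching. It is not taken from the project: the function name `forward_max_match` and the `TOY_DICT` entries are hypothetical, and the CRF stage is not shown.

```python
def forward_max_match(text, dictionary):
    """Greedy left-to-right segmentation that always takes the longest dictionary word."""
    max_len = max([1] + [len(w) for w in dictionary])
    tokens, i = [], 0
    while i < len(text):
        # Try the longest window first and shrink until a dictionary word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan can move forward.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, invented for illustration only.
TOY_DICT = {"研究", "研究生", "生命", "起源"}

print(forward_max_match("研究生命的起源", TOY_DICT))  # ['研究生', '命', '的', '起源']
```

The output shows the well-known weakness of greedy matching: it takes "研究生" and leaves "命" stranded, the kind of ambiguity a statistical model such as a CRF can help resolve.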

chinese

crf

crfsharp

nlp

word-segment

wordbreaker

Clojure

Duckling is a Clojure library that parses text into structured data: “the first Tuesday of October” => {:value "2014-10-07T00:00:00.000-07:00" :grain :day}

See our [blog post announcement](https://wit.ai/blog/2014/10/01/open-source-parser-duckling) for more context. Duckling is shipped with modules that parse temporal expressions i

HTML

This document aims to track the progress in Natural Language Processing (NLP) and give an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets. It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. The main objective is to provide the reader with a quick overview of benchmark datasets and the state-of-the-art for their task of interest, which serves as a stepping stone for further research. To this end, if there is a place where results for a task are already published and regularly maintained, such as a public leaderboard, the reader will be pointed there.
