Language Detection Using ColdFusion / Java

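In the past, I've used character ranges to try to identify the language of a piece of text. While this seems to work for Russian, Chinese, Japanese, Turkish, Greek, Hebrew, Korean and Arabic, it is useless for Latin-script languages such as French, German and Spanish.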
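
To make the limitation concrete, here is a minimal Java sketch of a character-range check. It is not the code I originally used; the class name `ScriptGuess` and the method `dominantScript` are made up for illustration. It relies only on the JDK's `Character.UnicodeScript` and simply reports which script most of the letters belong to.

```java
import java.util.EnumMap;
import java.util.Map;

class ScriptGuess {

    // Hypothetical helper: returns the Unicode script used by most letters
    // in the text, or UNKNOWN when the text contains no letters at all.
    static Character.UnicodeScript dominantScript(String text) {
        Map<Character.UnicodeScript, Integer> counts =
                new EnumMap<>(Character.UnicodeScript.class);

        // Count letters only, so punctuation and spaces (script COMMON)
        // do not influence the result.
        text.codePoints()
            .filter(cp -> Character.isLetter(cp))
            .forEach(cp -> counts.merge(Character.UnicodeScript.of(cp), 1, Integer::sum));

        Character.UnicodeScript best = Character.UnicodeScript.UNKNOWN;
        int bestCount = 0;
        for (Map.Entry<Character.UnicodeScript, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(dominantScript("Как Вас зовут?"));      // CYRILLIC
        System.out.println(dominantScript("Πως σε λένε?"));        // GREEK
        System.out.println(dominantScript("คุณชื่ออะไร?"));           // THAI
        System.out.println(dominantScript("Quel est votre nom?")); // LATIN
        System.out.println(dominantScript("¿Cuál es tu nombre?")); // LATIN
    }
}
```

A script is not a language: French and Spanish both come back as `LATIN`, which is exactly why this approach breaks down and a profile-based detector is needed.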

Today, a question was asked on StackOverflow:

How do I identify sentences that contain other languages, such as Spanish?

Someone recommended the polyglot and pycld2 Python libraries, and this started me on my quest for a Java solution. I found Apache OpenNLP, but it seemed like overkill as I only needed language detection. Lingua looked promising, but the library was 30 MB and integration didn't seem very easy. On the Lingua page, the Optimaize Language Detector Java library was referenced. There was also a tag cloud at the top of the page where "language-detection" was listed. I followed it, filtered the language to "java" and received 23 public repositories. The kju2 language-detector library is a fork of Optimaize, it seemed more ColdFusion-friendly in terms of integration and usage, and the pre-compiled JAR file is only 1.2 MB (versus 131 MB for Lingua).

Installation

Copy the JAR file to your Java path.

Usage

Instantiate the languageDetector.cfc component.

var languageDetector = new languageDetector();

languageDetector.detect(text)

Returns a text string with the detected language.

languageDetector.detect("Quel est votre nom?")          // CATALAN (French?)
languageDetector.detect("Wie heißen Sie?")              // GERMAN
languageDetector.detect("¿Cuál es tu nombre?")          // SPANISH
languageDetector.detect("Πως σε λένε?")                 // GREEK
languageDetector.detect("آپ کا نام کیا ہے؟ ")          // URDU
languageDetector.detect("Как Вас зовут?")               // BELARUSIAN (Russian)
languageDetector.detect("คุณชื่ออะไร?")                    // THAI

Source

Download it from GitHub.

JamoCA / cf-language-detector

cf-language-detect

ColdFusion wrapper for kju2-forked "Language Detection Library for Java".

Installation

Install the JAR file to your existing Java path and restart the ColdFusion server. Any one of these options will get you the JAR:

  1. Download the source from https://github.com/kju2/language-detector and build the JAR file manually.
  2. Download the pre-compiled JAR from MvnRepository: https://mvnrepository.com/artifact/io.github.kju2.languagedetector/language-detector/1.0.5
  3. Use the included JAR file (v1.0.5).

Language Support

68 Built-in Language Profiles

  1. AFRIKAANS (af)
  2. ALBANIAN (sq)
  3. ARABIC (ar)

from: https://dev.to//gamesover/language-detection-using-coldfusion-java-40m1
