1-gram Chinese Word Segmentation

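This blog post presents a 1-gram Chinese word segmentation program written in Perl, made for fun and kept here as a backup of a homework assignment. The program is not copyrighted and is free to take.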

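This is a 1-gram implementation. There is a similar one in Python online; since I had homework to hand in, I wrote one in Perl, just for fun.

Backing it up here. No copyright; take it if you need it.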

#!/usr/bin/perl -w

# Attention please!
# This program should only be run on a UNIX-like platform.
# If it is run on Windows, unexpected problems may appear.
# In most cases, Windows text editors add special control bytes to your text file.
# Windows hides them, so you can hardly find them.
# What is worse, this is a Chinese segmentation program: a single unwanted byte can cause a disaster!
#===============================================================================
# Introduction:
# The algorithm used is 1-gram.
# Simple, isn't it? But I don't think so...
#
# Basically, three extra files are needed.
# 1. The one used to create the dictionary. Default: 199801q.txt
# 2. The input file. Default: input.txt
# 3. The output file. Default: output.txt
# * The program will create a temp file called "config" to speed itself up.
#=========
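
The listing above breaks off inside its header, so the following is not the original code. It is a minimal sketch of how 1-gram (unigram) segmentation is commonly done: count word frequencies in a pre-segmented corpus, then pick the split of each sentence that maximizes the sum of log word probabilities with dynamic programming. Everything here is an assumption made for illustration: the names `clean_line`, `build_counts` and `segment`, the corpus being plain whitespace-separated words (the real format of 199801q.txt is not shown), the maximum word length of 4, and the 0.5 pseudo-count for unknown single characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                                  # literals in this file are UTF-8
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical helper: drop a byte-order mark and LF/CRLF line endings,
# the kind of stray bytes the header comment warns about.
sub clean_line {
    my ($line) = @_;
    $line =~ s/^\x{FEFF}//;                # BOM as seen after decoding
    $line =~ s/\r?\n\z//;                  # Unix or Windows line ending
    return $line;
}

# Count words in already-segmented lines (assumed: words separated by spaces).
sub build_counts {
    my (@lines) = @_;
    my %count;
    my $total = 0;
    for my $line (@lines) {
        for my $word (split ' ', clean_line($line)) {
            $count{$word}++;
            $total++;
        }
    }
    return (\%count, $total);
}

# Split $text into the word sequence with the highest unigram probability.
sub segment {
    my ($text, $count, $total, $max_len) = @_;
    $max_len = 4 unless defined $max_len;  # assumed longest dictionary word
    my $n    = length $text;
    my @best = (0);                        # best[i]: best log-prob of first i chars
    my @prev = (0);                        # prev[i]: start of last word in that split
    for my $i (1 .. $n) {
        my $start = $i - $max_len;
        $start = 0 if $start < 0;
        for my $j ($start .. $i - 1) {
            my $word = substr($text, $j, $i - $j);
            my $c    = $count->{$word};
            # Unknown multi-character strings are not considered words;
            # unknown single characters get a small pseudo-count instead.
            next if !$c && $i - $j > 1;
            my $score = $best[$j] + log(($c || 0.5) / ($total + 1));
            if (!defined $best[$i] || $score > $best[$i]) {
                $best[$i] = $score;
                $prev[$i] = $j;
            }
        }
    }
    my @words;
    for (my $i = $n; $i > 0; $i = $prev[$i]) {
        unshift @words, substr($text, $prev[$i], $i - $prev[$i]);
    }
    return @words;
}

# Toy corpus standing in for the dictionary file.
my ($count, $total) = build_counts(
    "我们 在 北京\n",
    "我们 喜欢 北京\r\n",
    "北京 大学 在 北京\n",
);
print join(' / ', segment('我们在北京大学', $count, $total)), "\n";
```

With this toy corpus the demo prints `我们 / 在 / 北京 / 大学`. For real files, the input would need to be decoded on read (for example with an `:encoding(UTF-8)` layer on `open`), so that `length` and `substr` count characters rather than bytes.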