Finding Chinese text in a Unicode text file on an English-language system

A small tool I wrote for finding Chinese characters in Unicode text.

FindCnchar.java

import java.io.*;

/**
 * Application: FindCnchar
 * Author: Steven
 * Date: Monday, May 28, 2007
 * Time: 15:05:42
 * Purpose:
 * Find Chinese characters in text.
 *
 * Usage:
 * java FindCnchar filepath
 *
 * Description:
 * For now, this application simply picks out the lines that contain non-ASCII
 * characters, which of course includes Chinese characters.
 * The source text file must be encoded in "Unicode" (i.e. UTF-16).
 * Matching lines are written, prefixed with their line numbers, to a UTF-16
 * file named filepath + ".txt".
 * Good luck finding the Chinese information you are interested in on an English system.
 */
public class FindCnchar
{
    public static void main( String[] args ) throws Exception
    {
        // argument check: stop early instead of falling through to args[0]
        if ( args.length < 1 )
        {
            println( "not enough arguments!" );
            printUsage();
            return;
        }
        if (!( new File( args[0] ).exists() ) )
        {
            println("specified file does not exist!");
            return;
        }
       
        // parameters
        File srcfile = new File( args[0] );
        File destfile = new File( srcfile.getAbsolutePath() + ".txt" );
        BufferedReader fin = new BufferedReader( new InputStreamReader( new FileInputStream( srcfile ), "UTF-16" ) );
        //PrintWriter fout = new PrintWriter( new FileWriter( destfile ) );
        PrintWriter fout = new PrintWriter( new OutputStreamWriter( new FileOutputStream( destfile ), "UTF-16" ) );
       
        // line counter (1-based)
        int line_no = 1;
       
        // search
        String line = null;
        while( ( line = fin.readLine() ) != null )
        {
            for( int i=0; i<line.length(); ++i )
            {
                // ASCII covers 0..127, so any value above 127 is non-ASCII
                if ( line.charAt(i) > 127 )
                {
                    fout.println( line_no + ":" + line );
                    break;
                }
            }
            ++line_no;
        }
       
       
        fin.close();
        fout.close();
       
        // report
        println( "Finished search." );
        println( destfile.getAbsolutePath() );
    }
   
    /**
     * Returns whether ch is a Chinese character (not implemented yet).
     */
//    public boolean isCnchar( char ch )
//    {
//    }
   
   
    public static void println( Object o )
    {
        System.out.println( o );
    }
   
    public static void print( Object o )
    {
        System.out.print( o );
    }
   
    public static void printUsage()
    {
        println("Usage:");
        println("java FindCnchar file_path");
        println("");
    }   
}/*END OF CLASS FindCnchar*/
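
The isCnchar method above is left as an empty, commented-out stub, so the tool currently reports every line with a non-ASCII character, not only lines with Chinese text. One possible way to narrow the match is sketched below. It is only an illustration, not the author's implementation: the class name CnCharCheck and the helper containsCnchar are hypothetical, and the check assumes that "Chinese character" can be approximated by the Unicode Han script via the standard Character.UnicodeScript API (Java 7+). Han also covers Japanese kanji and Korean hanja, and it does not cover CJK punctuation, so the result is an approximation.

```java
// Sketch only: CnCharCheck and containsCnchar are hypothetical names,
// not part of the original FindCnchar tool.
final class CnCharCheck {

    /** Returns true if the code point belongs to the Unicode Han script. */
    static boolean isCnchar(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    /** Returns true if at least one code point in the line is a Han ideograph. */
    static boolean containsCnchar(String line) {
        // codePoints() (Java 8+) also handles ideographs outside the BMP,
        // which are stored as surrogate pairs in a Java String.
        return line.codePoints().anyMatch(CnCharCheck::isCnchar);
    }

    public static void main(String[] args) {
        // String literals use Unicode escapes so the source compiles under any file encoding.
        System.out.println(containsCnchar("hello"));        // prints false
        System.out.println(containsCnchar("caf\u00E9"));    // prints false (non-ASCII, but Latin)
        System.out.println(containsCnchar("\u4F60\u597D")); // prints true
    }
}
```

With a helper like this, the inner for loop in main could be replaced by a single call to containsCnchar( line ), so that lines containing only accented Latin letters or other non-Chinese symbols are no longer written to the output file.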
