Two helper tools for writing with LaTeX: counting Chinese characters and closing a PDF document in Acrobat

陈硕

Category: Typesetting with LaTeX & Word. Tags: tools, file, null, initialization, cmd, filenames

Article link: https://blog.csdn.net/Solstice/article/details/196229

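LaTeX has no built-in Chinese character count like Word does, and because LaTeX source files contain many control sequences, the file size does not tell you how many Chinese characters a file holds. So I wrote a small tool in C to count Chinese characters, named cwc, short for "chinese word counter". Only the count_files() function uses the Windows API, so the program can be ported to Linux/Unix with minor changes.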
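To make the point about file size concrete, here is a minimal sketch of the counting rule in isolation. It assumes the source is saved in a double-byte encoding such as GBK, where every byte above 127 starts a two-byte character. The helper name count_dbcs is hypothetical and is not part of cwc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of cwc): counts double-byte characters in
   a GBK-style buffer. Assumption: any byte above 127 is the lead byte of
   a two-byte character, so the byte after it is skipped. */
size_t count_dbcs(const unsigned char *buf, size_t len)
{
    size_t chars = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] > 127) {
            chars++;   /* one double-byte character */
            i++;       /* skip its trail byte */
        }
    }
    return chars;
}
```

For the GBK-encoded line `\section{中文}`, which is 14 bytes long, this returns 2: the ten bytes of markup contribute nothing to the count.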

#include <stdio.h>
#include <wchar.h>
#include <windows.h>
int total = 0; // total Chinese characters

// UNICODE version word counter
void word_count_u(FILE* pf)
{
    int w = 0, b = 2;   // b starts at 2: the caller already read the 2-byte BOM
    wint_t c;

    while((c = getwc(pf)) != WEOF) {
        b += 2;         // byte count
        if (c > 127) {  // Chinese (non-ASCII) character
            w++;        // char count
        }
    }

    printf("%10d \t %10d\n", w, b);
    total += w;
}

// word counter
void word_count(const char* file)
{
    int w = 0, b = 0;
    int c;
    int unicode = 0;

    FILE *pf = fopen(file, "rb");
    if (NULL == pf) {
        return;
    }
    printf ("%20s : ", file);

    // Check whether this is a UNICODE file (byte order mark FF FE)
    if ((c = getc(pf)) == 0xff) {
        int cc;
        if ((cc = getc(pf)) == 0xfe) {
            unicode = 1;
            printf("UNICODE");
            word_count_u(pf);
        }
        else {
            fseek(pf, 0, SEEK_SET);
        }
    }
    else {
        ungetc(c, pf);
    }

    if (!unicode) {
        printf("       ");
        while((c = getc(pf)) != EOF) {
            b++; // byte count
            if (c > 127) { // Chinese character (lead byte)
                w++; // char count
                b++; // each Chinese character takes two bytes
                if ((c = getc(pf)) == EOF)
                    break;
            }
        }

        printf("%10d \t %10d\n", w, b);
        total += w;
    }
    fclose(pf);
}

void count_files(const char* file)
{
WIN32_FIND_DATA FindFile
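
One porting detail beyond count_files(): word_count_u() relies on getwc() returning one 2-byte unit per call from a binary stream, which is how the Windows C runtime behaves. On Linux, wchar_t is normally 4 bytes and getwc() decodes according to the current locale, so that loop would not carry over unchanged. A minimal sketch of a byte-level alternative follows, assuming the data after the FF FE mark is little-endian UTF-16 already read into memory. The name count_utf16le is hypothetical and not part of cwc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of cwc): counts 16-bit little-endian units
   whose value is above 127, the same test word_count_u() applies.
   Assumption: buf holds the bytes that follow the FF FE byte order mark.
   A trailing odd byte is ignored. */
size_t count_utf16le(const unsigned char *buf, size_t len)
{
    size_t chars = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i + 1 < len; i += 2) {
        /* low byte first, then high byte */
        unsigned unit = buf[i] | ((unsigned)buf[i + 1] << 8);
        if (unit > 127) {
            chars++;
        }
    }
    return chars;
}
```

Like the original loop, this counts every non-ASCII unit, so full-width punctuation is included and a character outside the Basic Multilingual Plane would be counted twice.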
