[Simulation] I Want a Clean Unicode Conversion Table


ACM_Ted

Category: ACM. Tags: string, comments, stream, output, input, documentation

Article link: https://blog.csdn.net/ACM_Ted/article/details/7551508

I Want a Clean Unicode Conversion Table

http://icpc.ahu.edu.cn/OJ/Problem.aspx?id=416

Description

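According to the file history, WX studied Unicode in February 2008 and experimented with converting between Shift-JIS, GB2312, and Unicode. The official site is the natural place to look for material, so WX downloaded the official encoding mapping tables from unicode.org. The tables seem to have a few problems, though, and your job is to clean them up. In the file, everything after a # is a comment, and there may be blank lines or extra spaces; what I need is the properly formatted code-conversion part.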

For example, the body of the Shift-JIS to Unicode conversion file looks like this (every code has a comment):

0x7D	0x007D	# RIGHT CURLY BRACKET
0x7E	0x203E	# OVERLINE
0x8140	0x3000	# IDEOGRAPHIC SPACE
0x8141	0x3001	# IDEOGRAPHIC COMMA

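The first column is the Shift-JIS code, the second is the Unicode code, and the third is the comment. The columns are separated by Tab (ASCII code 9), not by spaces.
I need this converted, according to the requirements below, into the following format: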

0x7D	0x007D	Right Curly Bracket
0x7E	0x203E	Overline
0x8140	0x3000	Ideographic Space
0x8141	0x3001	Ideographic Comma
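The three-column layout described above can be pulled apart by splitting on the Tab character alone. The following is a minimal sketch, assuming the columns really are separated by single Tabs as the statement says; `splitOnTab` is a hypothetical helper name and does not appear in the solution further down, which tokenizes on any whitespace instead.

```cpp
#include <string>
#include <vector>

// Split one line on Tab (ASCII 9) only; spaces stay inside their field.
// splitOnTab is a hypothetical helper name used for illustration.
std::vector<std::string> splitOnTab(const std::string& line)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true)
    {
        std::string::size_type tab = line.find('\t', start);
        if (tab == std::string::npos)
        {
            // No more Tabs: the rest of the line is the last field.
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, tab - start));
        start = tab + 1;
    }
    return fields;
}
```

Calling `splitOnTab` on the first sample line should give three fields, with the comment field still carrying its leading `#` and its internal spaces.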

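The requirements are as follows. Remove whole-line comments and blank lines. Each line of the mapping table proper is formatted as (source code) Tab (target code) Tab (description). For the description, remove the # and any extra leading or trailing spaces, collapse runs of more than one space in the middle into a single space, and capitalize the first letter of each word (as the sample shows, the remaining letters become lowercase); the description contains no characters other than letters and spaces. Fields are still separated by Tab (it looks neat and is easy to process). As a special case, if the line is something like "0xE686 0x8ADE # <CJK>" (there are actually many of these), output only the codes.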
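The description rules above (drop the `#`, trim, collapse spaces, capitalize each word) can be sketched as one small function. This is an illustration under assumptions rather than the author's code: `cleanDescription` is a hypothetical name, and it leaves the `<CJK>` special case to the caller.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Normalize a raw comment field such as "#  RIGHT   CURLY BRACKET ".
// cleanDescription is a hypothetical helper name used for illustration.
std::string cleanDescription(const std::string& raw)
{
    std::string text = raw;
    std::string::size_type hash = text.find('#');
    if (hash != std::string::npos)
        text.erase(0, hash + 1);  // drop everything up to and including '#'

    // operator>> skips any run of whitespace, which trims both ends
    // and collapses repeated spaces between words.
    std::istringstream words(text);
    std::string word, result;
    while (words >> word)
    {
        for (std::string::size_type i = 0; i < word.size(); ++i)
        {
            unsigned char c = static_cast<unsigned char>(word[i]);
            word[i] = static_cast<char>(i == 0 ? std::toupper(c) : std::tolower(c));
        }
        if (!result.empty())
            result += ' ';
        result += word;
    }
    return result;
}
```

Reading words with `operator>>` is a design choice that makes trimming and space-collapsing fall out for free, at the cost of treating Tabs inside the comment like spaces.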

Input

The input is as described above and ends at EOF. Each line is at most 100 characters long and is either empty, starts with #, or starts with 0.
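The three kinds of input line named above can be told apart by looking at the first non-blank character. The sketch below assumes that a line holding only spaces or Tabs should count as blank and that leading whitespace before `#` is tolerated; the statement does not spell out either case. `LineKind` and `classifyLine` are hypothetical names.

```cpp
#include <string>

// The three line kinds the Input section allows.
enum class LineKind { Blank, Comment, Mapping };

// classifyLine is a hypothetical helper name used for illustration.
LineKind classifyLine(const std::string& line)
{
    // Assumption: whitespace-only lines are treated like empty lines.
    std::string::size_type pos = line.find_first_not_of(" \t\r");
    if (pos == std::string::npos)
        return LineKind::Blank;
    if (line[pos] == '#')
        return LineKind::Comment;
    // Per the statement, anything else starts with 0 and is a mapping line.
    return LineKind::Mapping;
}
```

Only lines classified as `LineKind::Mapping` produce output; the other two kinds are dropped.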

Output

The output is as described above.

Sample Input


#	Any comments or problems, contact <John_Jenkins@taligent.com>
#
0x20	0x0020	# SPACE
0x9D57	0x6294	# <CJK>

0x9D58	0x62D7	# <CJK>

Sample Output


0x20	0x0020	Space
0x9D57	0x6294
0x9D58	0x62D7

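Simulation problems are all about care and patience. Here it seems that whenever <CJK> appears, the comment must not be output; I misread the statement and got WA twice.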
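The `<CJK>` rule from the note above mostly affects how the output line is assembled: when the description is `<CJK>`, neither the description nor the Tab before it is printed. A minimal sketch, assuming the description has already had its `#` and surrounding spaces removed; `formatMapping` is a hypothetical name.

```cpp
#include <string>

// Build one output line from its three parts.
// formatMapping is a hypothetical helper name used for illustration.
std::string formatMapping(const std::string& src,
                          const std::string& dst,
                          const std::string& description)
{
    std::string out = src + '\t' + dst;
    // A <CJK> (or empty) description means only the two codes are printed,
    // with no trailing Tab.
    if (!description.empty() && description != "<CJK>")
        out += '\t' + description;
    return out;
}
```

This matches the sample output, where the two `<CJK>` lines end right after the second code.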

#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main()
{
    string line, src, dst, word;
    // getline replaces gets(), which is unsafe and was removed in C++14.
    while (getline(cin, line))
    {
        istringstream stream(line);
        // Skip blank lines and whole-line comments.
        if (!(stream >> src) || src[0] == '#')
            continue;
        stream >> dst;

        string desc;
        // The remaining tokens form the comment; operator>> already
        // collapses runs of spaces and trims both ends.
        while (stream >> word)
        {
            // The '#' may stand alone or be glued to the first word.
            if (desc.empty() && word[0] == '#')
                word.erase(0, 1);
            if (word.empty())
                continue;
            // Stop at <CJK>; a comment that is only <CJK> leaves desc empty,
            // so just the two codes are printed.
            if (word == "<CJK>")
                break;
            // Capitalize the first letter and lowercase the rest.
            for (string::size_type i = 0; i < word.size(); ++i)
            {
                unsigned char c = static_cast<unsigned char>(word[i]);
                word[i] = static_cast<char>(i == 0 ? toupper(c) : tolower(c));
            }
            // A Tab precedes the first word, a single space the others.
            desc += desc.empty() ? "\t" : " ";
            desc += word;
        }
        cout << src << '\t' << dst << desc << '\n';
    }
    return 0;
}
