XML4C中文兼容问题

  首 页 | 新 闻 | 技术中心 | 第二书店 | 《程序员》 | 《开发高手》 | 社 区 | 黄 页 | 人 才
移 动专 题SUNIBM微 软微 创精 华Donews人 邮
我的技术中心 
我的分类我的文档
全部文章发表文章
专栏管理使用说明

 RSS 订阅 
最新文档列表
Windows/.NET
.NET  (rss)    
Visual C++  (rss)    
Delphi  (rss)    
Visual Basic  (rss)    
ASP  (rss)    
JavaScript  (rss)    
Java/Linux
Java  (rss)    
Perl  (rss)    
综合
其他开发语言  (rss)    
文件格式  (rss)    
企业开发
游戏开发  (rss)    
网站制作技术  (rss)    
数据库
数据库开发  (rss)    
软件工程
其他  (rss)    

积极原创作者 
A8P8D3JK (1)
adamsun (3)
ahxu (2)
maoerzuozuo (6)
playyuer (30)
lgchao (2)
devercn (8)
foxmail (3)
netcasper (2)
cker (12)
CSDN - 文档中心 - Visual C++ 阅读:538   评论: 0    参与评论
标题  XML4C完美兼容中文的补充     flyelf [原作]
关键字  XML4C完美兼容中文的补充
出处 

 

XML 4C 完美兼容中文的补充

 

xml 4c 兼容中文的问题一直是大家比较头疼的问题,网上也有很多关于这方面的讨论,但是一直没有太好的结论。在IBM Developerworks的网站上,找到了邹月明先生的一篇文章《剖析XML 4c 源码,完美兼容中文XML》,该文章对Xml 4c 的源码进行了剖析,对xml 4c 的源码进行了修改,从而达到了对中文兼容的目的。我也针对Xml 4c 的源码按照文章中的说法进行了修改,这种方法在我的debug版本中确实完美的解决了xml 4c 对中文的完美兼容。但是,问题出现了在我的Release版本中,对中文的解析出现了混乱。

为了能够弄清楚问题的所在,决定对xml 4c release版本进行跟踪,结果发现问题就是出在 邹月明 先生的文章中所指出的修改之处。该代码对于DebugRelease返回的是不同的结果(首先需要声明的是,我已经在使用该库之前调用了setlocale ( LC_ALL, “Chinese-siimplified” ) ),只要字符串中包含有中文,在Release版本下calcRequiredSize ( const char* const srcText )函数种调用mbstowcs返回的结果就是不正确。下面是《剖析XML4C源码》修改的函数calcRequiredSize代码:

 

unsigned int Win32LCPTranscoder::calcRequiredSize(const char* const srcText)

{

/*

if(!srcText)

return 0;

unsigned charLen = ::mblen(srcText, MB_CUR_MAX);

if(charLen == -1)

return 0;

else if(charLen != 0 )

charLen = strlen(srcText)/charLen;

if(charLen == -1)

return 0;

return charLen;

*/

 

if(! srcText){

return 0;

}

unsigned int retVal = ::mbstowcs( 0, srcText, 0 );

if ( retVal == -1) {

return 0;

}

return retVal;

}

 

这段代码在Debug下正确运行没有问题,但是在Release下,计算长度就是会出现问题,得到的结果不正确。开始以为是在程序开始的设置的本地环境服务被修改了,可是跟踪代码的时候发现并没有被改变,这让人感到很迷惑,不知道是什么原因造成的。当在mbstowcs前设置当前的本地环境服务(setlocale(LC_ALL, NULL );)之后,函数mbstowcs就能够正确的工作了,如果有谁知道原因,可以来信告知(email: flyelfsky@hotmail.com)

于是,修改之后的代码就如下了:

unsigned int Win32LCPTranscoder::calcRequiredSize(const char* const srcText)

{

 

if(! srcText){

return 0;

}

#ifndef _DEBUG

// 确保在release下能够得到正确的结果

setlocale ( LC_ALL, NULL );

#endif

unsigned int retVal = ::mbstowcs( 0, srcText, 0 );

if ( retVal == -1) {

return 0;

}

return retVal;

}

 

当然了,不仅仅是这个函数需要添加这个,对于所有的mbstowcs函数的前面都要添加这个代码:setlocale(LC_ALL, NULL );下面列出的就是需要修改的函数:

unsigned int Win32LCPTranscoder::calcRequiredSize(const char* const srcText

                                                  , MemoryManager* const manager)

char* Win32LCPTranscoder::transcode(const XMLCh* const toTranscode);

char* Win32LCPTranscoder::transcode(const XMLCh* const toTranscode,

                                    MemoryManager* const manager)

XMLCh* Win32LCPTranscoder::transcode(const char* const toTranscode)

XMLCh* Win32LCPTranscoder::transcode(const char* const toTranscode,

                                     MemoryManager* const manager)

bool Win32LCPTranscoder::transcode( const   char* const     toTranscode

                                    ,       XMLCh* const    toFill

                                    , const unsigned int    maxChars

                                    , MemoryManager* const  manager)

bool Win32LCPTranscoder::transcode( const   XMLCh* const    toTranscode

                                    ,       char* const     toFill

                                    , const unsigned int    maxBytes

                                    , MemoryManager* const  manager)

 

 

使用函数mbstowcswcstombs进行转换,只不过是为了移植的方便,而在win32下已经提供了另外的api来进行转换:WideCharToMultiByteMultiByteToWideChar,在这些转换函数中,把mbstowcs替换为MultiByteToWideChar,把wcstombs替换为WideCharToMultiByte。这样重新编译后就不会出现这个问题了。

 

附录:修改之后的Win32TransService.cpp的部分寒暑内容

 

// ---------------------------------------------------------------------------

//  Win32LCPTranscoder: Implementation of the virtual transcoder interface

// ---------------------------------------------------------------------------

unsigned int Win32LCPTranscoder::calcRequiredSize(const char* const srcText

                                                  , MemoryManager* const manager)

{

     if ( ! srcText )

     {

            return 0;

     }

 

#ifdef _WIN32

     unsigned int retVal = ::MultiByteToWideChar ( CP_ACP,

                                                                                    0,

                                                                                    srcText,

                                                                                    -1,

                                                                                    NULL,

                                                                                    0 );

#else

#ifndef _DEBUG

     setlocale( LC_ALL, "" );

#endif

     unsigned int retVal = ::mbstowcs( 0, srcText, 0 );

#endif

     if ( retVal == -1 )

     {

            return 0;

     }

     return retVal;

}

 

char* Win32LCPTranscoder::transcode(const XMLCh* const toTranscode)

{

    if (!toTranscode)

        return 0;

 

    char* retVal = 0;

    if (*toTranscode)

    {

        // Calc the needed size

        const unsigned int neededLen = calcRequiredSize(toTranscode);

 

        // Allocate a buffer of that size plus one for the null and transcode

        retVal = new char[neededLen + 1];

#ifdef _WIN32

            ::WideCharToMultiByte ( CP_ACP,

                                                      0,

                                                      toTranscode,

                                                      -1,

                                                      retVal,

                                                      neededLen,

                                                      "",

                                                      NULL );

#else

#ifndef _DEBUG

     setlocale( LC_ALL, "" );

#endif

        ::wcstombs(retVal, toTranscode, neededLen + 1);

#endif

           

        // And cap it off anyway just to make sure

        retVal[neededLen] = 0;

    }

    else

    {

        retVal = new char[1];

        retVal[0] = 0;

    }

    return retVal;

}

 

char* Win32LCPTranscoder::transcode(const XMLCh* const toTranscode,

                                    MemoryManager* const manager)

{

    if (!toTranscode)

        return 0;

 

    char* retVal = 0;

    if (*toTranscode)

    {

        // Calc the needed size

        const unsigned int neededLen = calcRequiredSize(toTranscode, manager);

 

        // Allocate a buffer of that size plus one for the null and transcode

        retVal = (char*) manager->allocate((neededLen + 1) * sizeof(char)); //new char[neededLen + 1];

#ifdef _WIN32

            ::WideCharToMultiByte ( CP_ACP,

                                                      0,

                                                      toTranscode,

                                                      -1,

                                                      retVal,

                                                      neededLen,

                                                      "",

                                                      NULL );

#else

#ifndef _DEBUG

     setlocale( LC_ALL, "" );

#endif

        ::wcstombs(retVal, toTranscode, neededLen + 1);

#endif

 

        // And cap it off anyway just to make sure

        retVal[neededLen] = 0;

    }

    else

    {

        retVal = (char*) manager->allocate(sizeof(char)); //new char[1];

        retVal[0] = 0;

    }

 

    return retVal;

}

 

XMLCh* Win32LCPTranscoder::transcode(const char* const toTranscode)

{

    if (!toTranscode)

        return 0;

 

    XMLCh* retVal = 0;

    if (*toTranscode)

    {

        // Calculate the buffer size required

        const unsigned int neededLen = calcRequiredSize(toTranscode);

        if (neededLen == 0)

        {

            retVal = new XMLCh[1];

            retVal[0] = 0;

            return retVal;

        }

 

        // Allocate a buffer of that size plus one for the null and transcode

        retVal = new XMLCh[neededLen + 1];

#ifdef _WIN32

            ::MultiByteToWideChar ( CP_ACP,

                                                      0,

                                                      toTranscode,

                                                      -1,

                                                      retVal,

                                                      neededLen );

#else

#ifndef _DEBUG

     setlocale( LC_ALL, "" );

#endif  

        ::mbstowcs(retVal, toTranscode, neededLen + 1);

#endif

           

        // Cap it off just to make sure. We are so paranoid!

        retVal[neededLen] = 0;

    }

    else

    {

        retVal = new XMLCh[1];

        retVal[0] = 0;

    }

    return retVal;

}

 

XMLCh* Win32LCPTranscoder::transcode(const char* const toTranscode,

                                     MemoryManager* const manager)

{

    if (!toTranscode)

        return 0;

 

    XMLCh* retVal = 0;

    if (*toTranscode)

    {

        // Calculate the buffer size required

        const unsigned int neededLen = calcRequiredSize(toTranscode, manager);

        if (neededLen == 0)

        {

            retVal = (XMLCh*) manager->allocate(sizeof(XMLCh)); //new XMLCh[1];

            retVal[0] = 0;

            return retVal;

        }

 

        // Allocate a buffer of that size plus one for the null and transcode

        retVal = (XMLCh*) manager->allocate((neededLen + 1) * sizeof(XMLCh)); //new XMLCh[neededLen + 1];

#ifdef _WIN32

            ::MultiByteToWideChar ( CP_ACP,

                                                      0,

                                                      toTranscode,

                                                      -1,

                                                      retVal,

                                                      neededLen );

#else

#ifndef _DEBUG

     setlocale( LC_ALL, "" );

#endif

        ::mbstowcs(retVal, toTranscode, neededLen + 1);

#endif

           

        // Cap it off just to make sure. We are so paranoid!

        retVal[neededLen] = 0;

    }

    else

    {

        retVal = (XMLCh*) manager->allocate(sizeof(XMLCh)); //new XMLCh[1];

        retVal[0] = 0;

    }

    return retVal;

}

 

bool Win32LCPTranscoder::transcode( const   char* const     toTranscode

                                    ,       XMLCh* const    toFill

                                    , const unsigned int    maxChars

                                    , MemoryManager* const  manager)

{

    // Check for a couple of psycho corner cases

    if (!toTranscode || !maxChars)

    {

        toFill[0] = 0;

        return true;

    }

 

    if (!*toTranscode)

    {

        toFill[0] = 0;

        return true;

    }

 

    // This one has a fixed size output, so try it and if it fails it fails

 

#ifdef _WIN32

     size_t to_ = ::MultiByteToWideChar ( CP_ACP,

                                                                    0,

                                                                    toTranscode,

                                                                    -1,

                                                                    toFill,

                                                                    maxChars );

#else

#ifndef _DEBUG

     setlocale( LC_ALL, "" );

#endif

     size_t to_ = ::mbstowcs(toFill, toTranscode, maxChars + 1);

#endif

 

     return ( to_ != size_t(-1) );

     //

}

 

bool Win32LCPTranscoder::transcode( const   XMLCh* const    toTranscode

                                    ,       char* const     toFill

                                    , const unsigned int    maxBytes

                                    , MemoryManager* const  manager)

{

    // Watch for a couple of pyscho corner cases

    if (!toTranscode || !maxBytes)

    {

        toFill[0] = 0;

        return true;

    }

 

    if (!*toTranscode)

    {

        toFill[0] = 0;

        return true;

    }

 

    // This one has a fixed size output, so try it and if it fails it fails

 

     //

#ifdef _WIN32

     size_t to_ = ::WideCharToMultiByte ( CP_ACP,

                                                                    0,

                                                                    toTranscode,

                                                                    -1,

                                                                    toFill,

                                                                    maxBytes,

                                                                    "",

                                                                    NULL );

#else

#ifndef _DEBUG

     setlocale( LC_ALL, "" );

#endif

     size_t to_ = ::wcstombs(toFill, toTranscode, maxBytes + 1);

#endif

     if ( to_ == size_t(-1) )

     {

            return false;

     }

    

    // Cap it off just in case

    toFill[maxBytes] = 0;

    return true;

}

 

 


相关文章
对该文的评论


网站简介 - 广告服务 - 网站地图 - 帮助信息 - 联系方式 - English
北京百联美达美数码科技有限公司 版权所有 京ICP证020026号
Copyright © CSDN.NET, Inc. All Rights Reserved
<script> document.write(" "); </script>
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值