PHP, XML, DOM: how to make sure that the final XML file encoding is UTF-8?

Please help me.

Here is the detailed scenario.

I have an XML file containing XML tags, e.g.

data.xml, with the following content:

-----------------

some text

-------------------

I then upload this file to my translation service, where my code loads it. Here is the PHP code:

$dom = new DOMDocument('1.0', 'utf-8');
if ( !$dom->load($target_file) ) {
    echo "Cannot load file $target_file";
    exit;
}

Then my logic runs and replaces the node value with some accented characters, e.g. nënë. That works fine, and finally I save the file:

$dom->save($target_file);

The output should now look like this:

data.xml, with the following content:

-----------------

nënë

-------------------

But when I open the file, the output is as follows:

-------------------

n&#xEB;n&#xEB;

-------------------

Please help me. How can I make sure that the XML file is saved with UTF-8 encoding?

Waiting for your reply.

Source: https://stackoverflow.com/questions/9998131

Updated: 2020-02-20 14:27

Best answer

Don't know if you already solved it or not:

If your data is UTF-8 encoded and you discover that saveXML() turned all non-ASCII characters into numeric entities (e.g. ä -> &#xE4;):

Chances are that the XML declaration was missing when you loaded the source data. Try adding <?xml version="1.0" encoding="UTF-8"?> to the beginning of the document before you read it with load() or loadXML(). Then the non-ASCII characters should remain untouched. Worked for me.
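
To make the difference concrete, here is a minimal sketch using loadXML() on in-memory strings rather than a file. The element name <mytag> and the helper translateRoot() are made up for illustration (the tags in the question's data.xml did not survive), and the $forceEncoding option is an extra alternative that the answer itself does not mention: it sets the document's encoding property after loading instead of editing the source.

```php
<?php
// Hypothetical helper (not from the question): parse $xml, replace the
// root element's text with $text, and return the serialized document.
function translateRoot(string $xml, string $text, ?string $forceEncoding = null): string
{
    // The encoding passed to the constructor is replaced by whatever
    // the loaded document declares (nothing, if there is no declaration).
    $dom = new DOMDocument('1.0', 'utf-8');
    if (!$dom->loadXML($xml)) {
        throw new RuntimeException('Cannot parse XML');
    }
    if ($forceEncoding !== null) {
        // Alternative fix (an assumption, not part of the answer):
        // set the output encoding explicitly after loading.
        $dom->encoding = $forceEncoding;
    }
    $dom->documentElement->nodeValue = $text;
    return $dom->saveXML();
}

$word = 'nënë'; // this source file must itself be saved as UTF-8

// 1. No XML declaration: non-ASCII characters come out as numeric entities.
echo translateRoot('<mytag>some text</mytag>', $word);

// 2. Declaration with encoding="UTF-8": characters are written as-is.
echo translateRoot('<?xml version="1.0" encoding="UTF-8"?><mytag>some text</mytag>', $word);

// 3. No declaration, but encoding forced after loading.
echo translateRoot('<mytag>some text</mytag>', $word, 'UTF-8');
```

The first call should print the entity form (n&#xEB;n&#xEB;), while the second and third should print nënë unchanged. With files, the same idea applies to load() and save(): either make sure the uploaded file starts with the declaration, or set the encoding on the document before saving.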

2012-04-21
