Problem
In many PHP legacy products the function htmlspecialchars($string) is used to convert characters like < and > and quotes a.s.o to HTML-entities. That avoids the interpretation of HTML Tags and asymmetric quote situations.
Since PHP 5.4 for $string in htmlspecialchars($string) utf8 characters are expected if no charset is defined explicitly as third parameter in the function. Legacy products are mostly in Latin1 (alias iso-8859-1) what makes the functions htmlspecialchars(), htmlentites() and html_entity_decode() to return empty strings if a special character, e. g. a German Umlaut, is present in $string:
PHP<5.4
echo htmlspecialchars('Woermann') //Output: <b>Woermann<b>
echo htmlspecialchars('Wörmann') //Output: <b>Wörmann<b>
PHP=5.4
echo htmlspecialchars('Woermann') //Output: <b>Woermann<b>
echo htmlspecialchars('Wörmann') //Output: empty
Three alternative solutions
a) Not runnig legacy products on PHP 5.4
b) Change all find spots in your code from
htmlspecialchars($string) and *** to
htmlspecialchars($string, ENT_COMPAT | ENT_HTML401, 'ISO-8859-1')
c) Replace all htmlspecialchars() and *** with a new self-made function
*** The same is true for htmlentities() and html_entity_decode();
Solution c
1 Make Search and Replace in the concerned legacy project:
Search for: htmlspecialchars
Replace with: htmlXspecialchars
Search for: htmlentities
Replace with: htmlXentities
Search for: html_entity_decode
Replace with: htmlX_entity_decode
2a Copy and paste the following three functions into an existing already everywhere included PHP-file in your legacy project. (of course that PHP-file must be included only once per request, otherwise you will get a Redeclare Function Fatal Error).
function htmlXspecialchars($string, $ent=ENT_COMPAT, $charset='ISO-8859-1') {
return htmlspecialchars($string, $ent, $charset);
}
function htmlXentities($string, $ent=ENT_COMPAT, $charset='ISO-8859-1') {
return htmlentities($string, $ent, $charset);
}
function htmlX_entity_decode($string, $ent=ENT_COMPAT, $charset='ISO-8859-1') {
return html_entity_decode($string, $ent, $charset);
}
or 2b crate a new PHP-file containing the three functions mentioned above, let's say, z. B. htmlXfunctions.inc.php and include it on the first line of every PHP-file in your legacy product like this: require_once('htmlXfunctions.inc.php').