[#1]
simonsimcity at gmail dot com [2014-06-05 08:16:38]
Sorry for posting again, but I found a bug in my code:
If the input contains a character such as the Cyrillic ь (the soft sign, which has no sound of its own), "Any-Latin" transliterates it to a prime character, and "Latin-ASCII" leaves prime characters untouched. I therefore added a rule that removes every character in the range \u0100 to \u7fff.
Here's my new code, including an example:
var_dump(transliterator_transliterate('Any-Latin; Latin-ASCII; [\u0100-\u7fff] remove',
"A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. ﬁ"));
// string(50) "A ae Ubermensch pa hoyeste niva! I a lublu PHP! est. fi"
Another approach I found quite helpful, if you do not want to remove characters outright, is to pass the transliterated string through iconv() as well. With "ASCII//TRANSLIT//IGNORE" as the target encoding, the result should contain only ASCII characters.
See: http://stackoverflow.com/a/3542748/517914
Also an example here:
var_dump(iconv("UTF-8", "ASCII//TRANSLIT//IGNORE", transliterator_transliterate('Any-Latin; Latin-ASCII',
"A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. ﬁ")));
// string(50) "A ae Ubermensch pa hoyeste niva! I a lublu PHP! est'. fi"
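
If you want both steps in one place, here is a minimal sketch of a wrapper. The function name to_ascii() is my own invention, not part of PHP or the intl extension. Note that it swaps in a different last step than the two examples above: instead of the ICU "remove" rule or iconv(), it strips whatever is still non-ASCII with preg_replace(), so it is not limited to the \u0100-\u7fff range. It assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper: the name and the fallback behaviour are assumptions
// made for this sketch, not anything defined by PHP itself.
function to_ascii(string $text): string
{
    $latin = $text;

    // Step 1: transliterate to Latin script, then fold to ASCII where ICU can.
    // Guarded so the function still works when the intl extension is missing.
    if (function_exists('transliterator_transliterate')) {
        $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $text);
        if ($result !== false) {
            $latin = $result;
        }
    }

    // Step 2: drop every code point above U+007F that survived step 1
    // (for example the prime character produced for the soft sign).
    // preg_replace() returns null on invalid UTF-8; fall back to '' then.
    return preg_replace('/[^\x00-\x7F]/u', '', $latin) ?? '';
}

var_dump(to_ascii("A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. ﬁ"));
```

Characters that ICU cannot map are silently dropped here, so use the iconv() variant instead if you would rather get an approximation than lose them.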