How do I encode something in utf8mb4 in Python?
I have two sets of data: data I am migrating from Parse to my new MySQL database, and data going forward (which is written only to my new database). My database uses utf8mb4 in order to store emoji and accented letters.
The first set of data only shows up correctly (when emoji and accents are involved) when I have this in my Python script:
MySQLdb.escape_string(unicode(xstr(data.get('message'))).encode('utf-8'))
and when reading from the MySQL database in PHP:
$row["message"] = utf8_encode($row["message"]);
The second set of data only shows up correctly (when emoji and accents are involved) when I DON'T include the utf8_encode($row["message"]) portion. I am trying to reconcile these so that both sets of data are returned correctly to my iOS app. Please help!
Solution
MySQL's utf8mb4 encoding is just standard UTF-8.
MySQL had to add that name, however, to distinguish it from its older, broken utf8 character set (an alias for utf8mb3), which stores at most three bytes per character and therefore only supports BMP characters.
In other words, you should always encode to UTF-8 when talking to MySQL, but take into account that the database may not be able to handle Unicode codepoints beyond U+FFFF, unless you use utf8mb4 on the MySQL side.
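To make the U+FFFF boundary concrete, here is a minimal sketch, assuming Python 3 (where str is already Unicode, so the unicode() call from the question is not needed). The helper names needs_utf8mb4 and max_utf8_bytes_per_char are hypothetical, made up for illustration; they are not part of MySQLdb or any other library.

```python
def needs_utf8mb4(text):
    """Return True if text contains a code point beyond U+FFFF (outside the BMP)."""
    return any(ord(ch) > 0xFFFF for ch in text)


def max_utf8_bytes_per_char(text):
    """Return the largest number of UTF-8 bytes used by any single character."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


accented = "caf\u00e9"    # "café": U+00E9 is inside the BMP
emoji = "hi \U0001F600"   # U+1F600 (an emoji) is outside the BMP

# The bytes are plain UTF-8; Python has no separate "utf8mb4" codec.
encoded = emoji.encode("utf-8")

print(needs_utf8mb4(accented), max_utf8_bytes_per_char(accented))  # False 2
print(needs_utf8mb4(emoji), max_utf8_bytes_per_char(emoji))        # True 4
```

The accented letter fits in two bytes, so even the old three-byte character set can store it. The emoji needs four bytes, which is why it only survives in a utf8mb4 column.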