I have the string
u"Played Mirror's Edge\u2122"
Which should be shown as
Played Mirror's Edge™
But that is a separate issue. My immediate problem is that I'm putting it in a model and then trying to save it to the database, i.e.:
a = models.Achievement(name=u"Played Mirror's Edge\u2122")
a.save()
And I'm getting :
'ascii' codec can't encode character u'\u2122' in position 13: ordinal not in range(128)
Full stack trace (as requested) :
Traceback:
File "/var/home/ptarjan/django/mysite/django/core/handlers/base.py" in get_response
86. response = callback(request, *callback_args, **callback_kwargs)
File "/var/home/ptarjan/django/mysite/yourock/views/alias.py" in import_all
161. types.import_all(type, alias)
File "/var/home/ptarjan/django/mysite/yourock/types/types.py" in import_all
52. return modules[type].import_all(siteAlias, alias)
File "/var/home/ptarjan/django/mysite/yourock/types/xbox.py" in import_all
117. achiever = self.add_achievement(dict, siteAlias, alias)
File "/var/home/ptarjan/django/mysite/yourock/types/base_profile.py" in add_achievement
130. owner = siteAlias,
File "/var/home/ptarjan/django/mysite/django/db/models/query.py" in get
304. num = len(clone)
File "/var/home/ptarjan/django/mysite/django/db/models/query.py" in __len__
160. self._result_cache = list(self.iterator())
File "/var/home/ptarjan/django/mysite/django/db/models/query.py" in iterator
275. for row in self.query.results_iter():
File "/var/home/ptarjan/django/mysite/django/db/models/sql/query.py" in results_iter
206. for rows in self.execute_sql(MULTI):
File "/var/home/ptarjan/django/mysite/django/db/models/sql/query.py" in execute_sql
1734. cursor.execute(sql, params)
File "/var/home/ptarjan/django/mysite/django/db/backends/util.py" in execute
19. return self.cursor.execute(sql, params)
File "/var/home/ptarjan/django/mysite/django/db/backends/mysql/base.py" in execute
83. return self.cursor.execute(query, args)
File "/usr/lib/pymodules/python2.5/MySQLdb/cursors.py" in execute
151. query = query % db.literal(args)
File "/usr/lib/pymodules/python2.5/MySQLdb/connections.py" in literal
247. return self.escape(o, self.encoders)
File "/usr/lib/pymodules/python2.5/MySQLdb/connections.py" in string_literal
180. return db.string_literal(obj)
Exception Type: UnicodeEncodeError at /import/xbox:bob
Exception Value: 'ascii' codec can't encode character u'\u2122' in position 13: ordinal not in range(128)
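The codec error itself can be reproduced without Django or MySQL. A minimal sketch, assuming only the standard library (ascii_failure_position is a hypothetical helper name, and the "except ... as" syntax needs Python 2.6 or later):

```python
def ascii_failure_position(text):
    """Return the index of the first character the ASCII codec rejects, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        # exc.start is the "position" reported in the error message.
        return exc.start
    return None


name = u"Mirror's Edge\u2122"

# The trademark sign is outside ASCII, so an implicit ASCII encode fails there.
position = ascii_failure_position(name)

# Encoding explicitly as UTF-8 succeeds; the trademark sign takes three bytes.
utf8_bytes = name.encode("utf-8")
```

So the message means that something tried to encode the text with the default ASCII codec instead of UTF-8; the text itself is fine.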
And the pertinent part of the model :
class Achievement(MyBaseModel):
    name = models.CharField(max_length=100, help_text="A human readable achievement name")
I'm using a MySQL backend with this in my settings.py
DEFAULT_CHARSET = 'utf-8'
So basically, how the heck should I deal with all this unicode stuff? I was hoping it would all "just work" if I stayed away from funny character sets and stuck to UTF-8. Alas, it seems it's not that easy.
Solution
Thank you to everyone who posted here. It really helped my unicode knowledge (and hopefully other people learned something too).
We were all barking up the wrong tree, because I tried to simplify my problem and didn't give ALL the information. It turns out I wasn't using "REAL" unicode strings, but rather BeautifulSoup.NavigableString objects, which repr themselves as unicode strings. So all the printouts looked like unicode, but they weren't.
Somewhere deep in the MySQLdb library, these strings couldn't be handled.
This worked :
>>> Achievement.objects.get(name = u"Mirror's Edge\u2122")
On the other hand :
>>> b = BeautifulSoup(u"<span>Mirror's Edge\u2122</span>").span.string
>>> Achievement.objects.get(name = b)
... Exceptions ...
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2122' in position 13: ordinal not in range(128)
But this works :
>>> Achievement.objects.get(name = unicode(b))
So, thanks again for all the unicode help, I'm sure it will come in handy. But for now ...
WARNING : BeautifulSoup doesn't return REAL unicode strings. Coerce them with unicode() before doing anything meaningful with them.
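A minimal sketch of that coercion as a reusable helper, assuming the scraped value is an instance of a subclass of the text type (as NavigableString is). MarkupString and to_plain_text are hypothetical names for illustration; MarkupString only stands in for NavigableString so the snippet runs without BeautifulSoup installed:

```python
# text_type is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u"")


class MarkupString(text_type):
    """Hypothetical stand-in for BeautifulSoup's NavigableString."""


def to_plain_text(value):
    """Coerce text subclasses to the exact built-in text type.

    Anything that is not text (None, numbers, bytes) is returned unchanged.
    """
    if isinstance(value, text_type) and type(value) is not text_type:
        return text_type(value)
    return value


scraped = MarkupString(u"Mirror's Edge\u2122")
clean = to_plain_text(scraped)

# The repr of both values is identical, which is why printouts are misleading.
same_repr = repr(scraped) == repr(clean)

# Only the coerced value is an exact built-in text object.
is_exact = type(clean) is text_type
```

With a helper like this, the lookup would read Achievement.objects.get(name = to_plain_text(b)), and values that are already plain text pass through untouched.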