I have a list of store names, with a few thousand names, some of which have non-standard American English characters that are posing a problem.
For example, my input file looks like this:
store_name
yéché
Ázak
ótndle
I want the output file to actually look like this (I think Googledocs made this happen, btw):
store_name new_store_name
yéché yéché
Ázak Ãzak
ótndle ótndle
There are only about 10 such rules that convert the non-standard American English character into this format, so I went through and did control f in excel to make them. But I'd like to be able in the future to do things like this computationally, and was just wondering if there is a quick way of doing this using Python. To be clear, what I want to do is make:
é become é
Á become Ãi
解决方案print a
péché
Álak
óundle
print a.decode('latin9').encode('utf8'),
péché
Ãlak
óundle
I had to do the reverse...