In Python 2, the unicode() constructor has the signature unicode(string[, encoding, errors]). All of its arguments should be 8-bit strings. The first argument is converted to Unicode using the specified encoding; if you leave off the encoding argument, the ASCII encoding is used for the conversion, so any byte with a value greater than 127 is treated as an error.
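A minimal sketch of that failure, written with bytes.decode() so that it runs on both Python 2 and Python 3. As far as I know, raw.decode(enc, errors) does the same job as unicode(raw, enc, errors) for an 8-bit string in Python 2. The sample bytes are made up for illustration.

```python
# Sketch: bytes.decode(enc) stands in for Python 2's unicode(s, enc).
raw = b'\xbd\xe2'  # two bytes above 127; hypothetical sample data

try:
    raw.decode('ascii')  # strict ASCII decoding, like unicode(raw)
    failed = False
    message = ''
except UnicodeDecodeError as exc:
    failed = True
    message = str(exc)

print(message)

# The optional errors argument changes how bad bytes are handled.
replaced = raw.decode('ascii', 'replace')  # each bad byte becomes U+FFFD
ignored = raw.decode('ascii', 'ignore')    # bad bytes are dropped
```

With 'replace' or 'ignore' the call no longer raises, but the original characters are lost, so this only hides the problem rather than solving it.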
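So this is why I kept running into the error "UnicodeDecodeError: 'ascii' codec can't decode byte 0xbd in position 0: ordinal not in range(128)". From now on I will call unicode(s, 'encoding') with the appropriate encoding. If you don't know which encoding the characters in a file use, sys.getfilesystemencoding() can serve as a starting guess. Note, though, that it reports the encoding the system uses for file names, which often matches the encoding of file contents but is not guaranteed to.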
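A rough sketch of that habit: always pass an encoding, and fall back to sys.getfilesystemencoding() only as a guess. The helper name to_text and the GBK sample are my own assumptions for illustration, and decode() is used in place of unicode() so the snippet also runs on Python 3.

```python
import sys


def to_text(raw, encoding=None):
    """Decode a byte string, guessing the encoding if none is given.

    to_text is a hypothetical helper, not part of the standard library.
    """
    if encoding is None:
        # Only a guess: this is the encoding used for file names.
        # It may be None on some Python 2 systems, hence the fallback.
        encoding = sys.getfilesystemencoding() or 'ascii'
    # 'replace' keeps a wrong guess from raising UnicodeDecodeError.
    return raw.decode(encoding, 'replace')


text = u'\u4e2d\u6587'       # the two characters for "Chinese"
raw = text.encode('gbk')     # pretend these bytes came from a GBK file

decoded = to_text(raw, 'gbk')  # known encoding: exact round trip
guessed = to_text(b'abc')      # unknown encoding: fall back to the guess
```

When the encoding is known, passing it explicitly gives back the original text. The guessed path is safe for plain ASCII data, but a wrong guess will silently produce replacement characters for anything else.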