Reading an .xls file in Python: reading Unicode from an .xls file

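The author runs into an encoding problem when reading an .xls file that contains non-ASCII characters with Python's xlrd library. A UnicodeDecodeError is raised both when no encoding is specified and when encoding_override is set to utf-8. Different argument combinations for xlrd.open_workbook() have been tried, but none of them decode the special characters in the file.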

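I am trying to read an .xls file with Python. The file contains several non-ASCII characters (namely äöü). I have tried both openpyxl and xlrd (I had high hopes for xlrd, since it reads everything as Unicode anyway), but neither worked.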

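I have found several answers that deal with encoding/decoding errors when printing data from an xls file, but I can't even get that far. This script fails as soon as it tries to read the file: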

import xlrd

workbook = xlrd.open_workbook('export_data.xls')

Which results in:

Traceback (most recent call last):
  File "C:\Users\Administrator\workspace\tufinderxlstoxml\tufinderxlstoxml2.py", line 2, in <module>
    workbook = xlrd.open_workbook('export_data.xls')
  File "C:\Python27_32\lib\site-packages\xlrd\__init__.py", line 435, in open_workbook
    ragged_rows=ragged_rows,
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 119, in open_workbook_xls
    bk.get_sheets()
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 705, in get_sheets
    self.get_sheet(sheetno)
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 696, in get_sheet
    sh.read(self)
  File "C:\Python27_32\lib\site-packages\xlrd\sheet.py", line 796, in read
    strg = unpack_string(data, 6, bk.encoding or bk.derive_encoding(), lenlen=2)
  File "C:\Python27_32\lib\site-packages\xlrd\biffh.py", line 269, in unpack_string
    return unicode(data[pos:pos+nchars], encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 55: ordinal not in range(128)

WARNING *** OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero
*** No CODEPAGE record, no encoding_override: will use 'ascii'
*** No CODEPAGE record, no encoding_override: will use 'ascii'

I have also tried:

workbook = xlrd.open_workbook('export_data.xls', encoding_override="utf-8")

Which results in:

Traceback (most recent call last):
  File "C:\Users\Administrator\workspace\tufinderxlstoxml\tufinderxlstoxml2.py", line 2, in <module>
    workbook = xlrd.open_workbook('export_data.xls', encoding_override="utf-8")
  File "C:\Python27_32\lib\site-packages\xlrd\__init__.py", line 435, in open_workbook
    ragged_rows=ragged_rows,
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 119, in open_workbook_xls
    bk.get_sheets()
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 705, in get_sheets
    self.get_sheet(sheetno)
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 696, in get_sheet
    sh.read(self)
  File "C:\Python27_32\lib\site-packages\xlrd\sheet.py", line 796, in read
    strg = unpack_string(data, 6, bk.encoding or bk.derive_encoding(), lenlen=2)
  File "C:\Python27_32\lib\site-packages\xlrd\biffh.py", line 269, in unpack_string
    return unicode(data[pos:pos+nchars], encoding)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 55: invalid start byte

WARNING *** OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero
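
Both tracebacks fail on the same byte, 0x92. As a minimal sketch (standard library only, no xlrd involved), the snippet below shows why neither codec accepts that byte and what two single-byte Western encodings would make of it. That the file was actually written in cp1252 or latin-1 is an assumption here; nothing in the output above confirms it.

```python
# The byte that both tracebacks complain about.
raw = b"\x92"

def try_decode(data, encoding):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Candidate encodings: the two that failed above, plus two single-byte
# encodings that are only guesses for this file.
results = {}
for enc in ("ascii", "utf-8", "cp1252", "latin-1"):
    results[enc] = try_decode(raw, enc)

for enc in ("ascii", "utf-8", "cp1252", "latin-1"):
    print("%-8s -> %r" % (enc, results[enc]))
```

ASCII rejects every byte above 0x7F, and in UTF-8 0x92 can only appear as a continuation byte, which matches the "invalid start byte" message. In cp1252 the same byte maps to U+2019 (right single quotation mark), which would be plausible for a file exported from Windows software, but this remains a guess.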

I have also tried including various versions of this at the top of the script:

# -*- coding: utf-8 -*-

I am running this with Python 2.7 on a Windows Server 2008 machine.
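
One thing that might be worth trying, given that xlrd reports no CODEPAGE record, is passing a single-byte code page to encoding_override instead of utf-8. The sketch below is not a confirmed fix. The helper name open_with_fallback, its opener parameter, and the candidate list ("cp1252", "latin-1") are all hypothetical; only xlrd.open_workbook and its encoding_override argument come from the script above. Note also that the `# -*- coding: utf-8 -*-` line only declares the encoding of the script's own source text, so it would not be expected to change how xlrd decodes the workbook.

```python
def open_with_fallback(path, encodings=("cp1252", "latin-1"), opener=None):
    """Try each candidate encoding_override until one opens the workbook.

    Returns (workbook, encoding_used). `opener` defaults to
    xlrd.open_workbook; it is a parameter only so that this hypothetical
    helper can be exercised without xlrd or a real file.
    """
    if opener is None:
        import xlrd  # imported lazily; only needed for the real call
        opener = xlrd.open_workbook

    last_error = None
    for encoding in encodings:
        try:
            return opener(path, encoding_override=encoding), encoding
        except UnicodeDecodeError as exc:
            # This candidate could not decode some byte; try the next one.
            last_error = exc

    if last_error is None:
        raise ValueError("no candidate encodings given")
    raise last_error
```

With the file from the question, the call would look like `workbook, used = open_with_fallback('export_data.xls')`. latin-1 is listed last because it accepts every byte value, so it never raises, but it may yield the wrong characters if the file really uses a different code page.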
