python打开xls文件报错22_使用Python读取XLS文件时出现错误(little-endian)

I download a XLS file from the web using selenium.

I tried many options I found in stack-overflow and other websites to read the XLS file :

import pandas as pd

df = pd.read_excel('test.xls') # Read XLS file

Expected "little-endian" marker, found b'\xff\xfe'

And

df = pd.ExcelFile('test.xls').parse('Sheet1') # Read XLSX file

Expected "little-endian" marker, found b'\xff\xfe'

And again

from xlrd import open_workbook

book = open_workbook('test.xls')

CompDocError: Expected "little-endian" marker, found b'\xff\xfe'

I have tried different encoding: utf-8, ANSII, utf_16_be, utf16

I have even tried to get the encoding of the file from notepad or other applications.

Type of file : Microsoft Excel 97-2003 Worksheet (.xls)

I can open the file with Excel without any issue.

What's frustrating is that if I open the file with excel and just press save I then can read the file with of the previous python command.

I would be really grateful if someone could provide me other ideas I could try. I need to open this file with a python script only.

Thanks,

Max

Solution(Somewhat messy but simple) that could potentially work for any type of Excel file :

Called VBA from python to Open and save the file in Excel. Excel "clean-up" the file and then Python is able to read it with any read Excel type function

Solution inspired by @Serge Ballesta and @John Y comments.

## Open a file in Excel and save it to correct the encoding error

import win32com.client

import pandas

downloadpath="c:\\firefox_downloads\\"

filename="myfile.xls"

xl=win32com.client.Dispatch("Excel.Application")

xl.Application.DisplayAlerts = False # disables Excel pop up message (for saving the file)

wb = xl.Workbooks.Open(Filename=downloadpath+filename)

wb.SaveAs(downloadpath+filename)

wb.Close

xl.Application.DisplayAlerts = True # enables Excel pop up message for saving the file

df = pandas.ExcelFile(downloadpath+filename).parse('Sheet1') # Read XLSX file

Thank you all!

解决方案

What does pd mean?? What

pandas is made for data science. In my opinion, you have to use openpyxl (read and write only xlsx) or xlwt/xlrd (read xls... and write only xls).

from xlrd import open_workbook

book = open_workbook()

sheet =....

It has several examples with this on Internet...

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值