I'm having trouble figuring out how to import Excel files into my Python script. I'm only a few days into Python so I'm guessing it's something very obvious I'm missing. I'm using Python 3 and the tablib module. From the examples on the tablib site, I've worked out how to save files in xls format
def saveXLS(self, name, data):
# Form the dataset with the accompanying headers
dataTab = tablib.Dataset()
dataTab.headers = data[0][:]
for i in range(1,len(data)):
dataTab.append(data[i][:])
with open(self.saveDir + name + ".xls", 'wb') as f:
f.write(dataTab.xls)
(I know that loop is horrible and un-Pythonic, but it's important I get results at the moment as it's for work). At the moment, I open the Excel workbook and save it as a text file (I should point out that all my data is tab-delimited and consists of strings, even for numbers).
I open it like this
def loadTxt(self,name, fileType, data):
if( fileType == "txt"):
with open(self.currentWorkingDir + "\\" + name + ".txt",'r') as f:
reader=csv.reader(f,delimiter='\t')
for X in reader:
data.append(X)
I tried copying the "dbf" example on the tablib website (http://tablib.readthedocs.org/en/latest/api/) to get
def loadXLS(self):
self.data = tablib.Dataset()
self.data = open('Data.xlsx').read()
return self.datav
And I get an error (as I expected, as I pulled it from my ass)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 637: character maps to .
I really have no clue how to figure this out unfortunately, so any advice would be really appreciated.
解决方案
You've probably figured it out by now, but for the next person, you need to read the Excel file as binary:
my_input_stream = open("my_file.xlsx", "rb")
my_dataset = tablib.import_set(my_input_stream)
dataset[1:5]