Python: dates come back as numbers when reading an Excel sheet

Latest recommended article published on 2024-05-21 17:26:53

csamll

Tags: excel python

Article link: https://blog.csdn.net/csamll/article/details/132756746

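Problem: when reading an Excel file, a date sometimes comes back as a number such as 47123. This is caused by how the cells are formatted in the sheet: instead of a date, we read the date's serial number, which counts the days elapsed since Excel's reference date. We therefore need a function that converts the date serial back into a date.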
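To make the idea of a date serial concrete, here is a minimal sketch that uses only the standard library. It assumes the workbook uses Excel's default 1900 date system, in which serials after February 1900 count days from 1899-12-30; the names `EXCEL_EPOCH` and `serial_to_datetime` are made up for this illustration.

```python
from datetime import datetime, timedelta

# Assumption: default 1900 date system, so modern serials count days from 1899-12-30.
EXCEL_EPOCH = datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert an Excel date serial (days, possibly fractional) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=float(serial))


print(serial_to_datetime(45000).date())  # 2023-03-15
print(serial_to_datetime(47123).date())  # 2029-01-05
print(serial_to_datetime(45000.5))       # 2023-03-15 12:00:00
```

The fractional part of a serial carries the time of day, which is why 45000.5 lands at noon.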

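The xlrd library can handle both xlsx and xls files. Note that the version should be 1.2.0: newer releases (2.0 and later) can no longer read xlsx files.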
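One way to check the version constraint before opening a workbook is sketched below. It only inspects package metadata through the standard library, so it runs even when xlrd is not installed; the helper names `xlrd_supports_xlsx` and `installed_xlrd_version` are hypothetical and not part of xlrd.

```python
from importlib.metadata import PackageNotFoundError, version


def xlrd_supports_xlsx(version_string):
    """Return True for xlrd releases before 2.0, which can still open .xlsx files."""
    major = int(version_string.split('.')[0])
    return major < 2


def installed_xlrd_version():
    """Return the installed xlrd version string, or None if xlrd is not installed."""
    try:
        return version('xlrd')
    except PackageNotFoundError:
        return None


found = installed_xlrd_version()
if found is None:
    print('xlrd is not installed')
elif xlrd_supports_xlsx(found):
    print(f'xlrd {found} can read both .xls and .xlsx')
else:
    print(f'xlrd {found} reads .xls only; pin xlrd==1.2.0 for .xlsx')
```

If the last branch is reached, installing the pinned version (for example with `pip install xlrd==1.2.0`) should restore xlsx support.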

import re

import pandas as pd
import xlrd


filepath = 'your file path'
wb = xlrd.open_workbook(filepath)
ws = wb.sheet_by_name('Sheet1')

pattern = r'^[45]\d{4}$'  # matches five-digit integers that start with 4 or 5


# Convert a date serial to a date.
def data(para):
    # Excel's default 1900 date system effectively counts days from 1899-12-30.
    delta = pd.Timedelta(days=float(para))
    time = pd.to_datetime('1899-12-30') + delta
    return time


# Check that the converted date is plausible, and post-process it as needed.
def date_parse(cell_value):
    try:
        date = data(cell_value)
        date = date.replace(hour=0, minute=0, second=0)  # optional: the time does not have to be zeroed
        year_last_two = int(date.strftime("%y"))  # %y is the last two digits of the year, %Y the full year
        if year_last_two == 23 or year_last_two == 24:  # keep only dates in 2023 or 2024
            return date.strftime("%Y-%m-%d %H-%M-%S")  # return the formatted timestamp
        else:
            return cell_value
    except ValueError:
        return cell_value


# Walk every cell and convert the values that qualify.
for row in range(ws.nrows):
    for col in range(ws.ncols):
        cell = ws.cell(row, col)
        cell_value = cell.value
        ctype = cell.ctype  # the date was read as a number, so its cell type (ctype) is 2
        # If the cell is numeric and holds a five-digit integer, convert it to a date.
        if ctype == 2 and re.search(pattern, str(int(cell_value))):
            cell_value = date_parse(cell_value)
            print(cell_value)

The date serials are now converted to dates. The range of dates to keep can be adjusted inside the date_parse function.
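As one possible way to make that date range adjustable without editing the function body, the sketch below passes the range in as parameters. It is a standard-library variant, not the code above: `convert_if_in_range`, `EXCEL_EPOCH` and `SERIAL_PATTERN` are hypothetical names, and the default 1900 date system is assumed.

```python
import math
import re
from datetime import datetime, timedelta

EXCEL_EPOCH = datetime(1899, 12, 30)         # assumption: default 1900 date system
SERIAL_PATTERN = re.compile(r'^[45]\d{4}$')  # same five-digit pattern as in the post


def convert_if_in_range(value, start, end):
    """Return 'YYYY-MM-DD' if value looks like a serial within [start, end], else value unchanged."""
    # Leave non-numeric cells (text, booleans, empty strings) untouched.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return value
    # NaN and infinity cannot be converted to an integer day count.
    if isinstance(value, float) and not math.isfinite(value):
        return value
    if not SERIAL_PATTERN.match(str(int(value))):
        return value
    date = EXCEL_EPOCH + timedelta(days=int(value))  # the time of day is dropped
    if start <= date <= end:
        return date.strftime('%Y-%m-%d')
    return value


start, end = datetime(2023, 1, 1), datetime(2024, 12, 31)
print(convert_if_in_range(45000.0, start, end))  # 2023-03-15
print(convert_if_in_range(47123, start, end))    # 47123 (falls in 2029, outside the range)
print(convert_if_in_range('total', start, end))  # total
```

Passing `start` and `end` explicitly keeps the filtering rule visible at the call site, so widening the range to other years only requires changing the two arguments.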
