matlab读取Excel中文本型数字

这段代码演示了如何使用MATLAB批量读取指定文件夹下的所有Excel文件,处理含有特殊字符的数据,并进行分块操作。在读取过程中,代码还加入了错误处理机制,当遇到问题时会记录错误文件并继续执行。最后,将数据写入新的Excel文件。这在大数据处理中是一种有效的优化策略。
摘要由CSDN通过智能技术生成

关键代码内容:

clc;
clear;
Path=fullfile('D:\Data\20210128PM2.5');
File=dir(fullfile(Path,'*.xlsx'));
FileNames = {File.name};
Length_Names = size(FileNames,2);
for k = 1:Length_Names
    K_Trace = strcat(Path,'\' ,FileNames(k));
    [num,txt,raw] =  xlsread(char(K_Trace),'sheet1','G2:G1496');
    idx = find(strcmp(raw,'—'));
    nan = num2cell('0');
    raw(idx)=nan; 
    pmdata = cellfun(@str2num,raw(1:end));
    PM_All(:,k) = pmdata;
    fprintf('已读取第%d个文件\n',k)  
end
fprintf('共读取%d个文件\n',k)

代码解释:

  1. 读取文件夹下所有后缀为xlsx的文件
Path=fullfile('D:\Data\20210128PM2.5');
File=dir(fullfile(Path,'*.xlsx'));
FileNames = {File.name};
Length_Names = size(FileNames,2);
  1. 在Excel中以文本格式保存的数值,使用xlsread读取,只能在raw中获取。
  2. strcmp函数找内容为“—”的下标,使用raw(idx)=nan将其替换为0。
  3. cellfun为cell的函数,对cell里的元素执行str2num函数,最终得到double类型的矩阵
[num,txt,raw] =  xlsread(char(K_Trace),'sheet1','G2:G1496');
idx = find(strcmp(raw,'—'));
nan = num2cell('0');
raw(idx)=nan; 
pmdata = cellfun(@str2num,raw(1:end));

Excel公式对非零元素求均值

=AVERAGEIF(F2:EPZ2,"<>0")

插曲:
尝试分块处理

clc;
clear;
Path=fullfile('D:/Data/20210128PM2.5');
File=dir(fullfile(Path,'*.xlsx'));
FileNames = {File.name};
Length_Names = size(FileNames,2);
Length_Names_chunk = ceil(Length_Names/10);
for chunk = 1:10
    i = 1;
    if chunk == 10
        for k = Length_Names_chunk*(chunk-1)+1:Length_Names
            K_Trace = strcat(Path,'\' ,FileNames(k));
            [num,txt,raw] =  xlsread(char(K_Trace),'sheet1','G2:G1496');
            idx = find(strcmp(raw,'—'));
            nan = num2cell('0');
            raw(idx)=nan; 
            pmdata = cellfun(@str2num,raw(1:end));
            PM_All(:,i) = pmdata;
            i = i + 1;
            fprintf('已读取第%d个文件\n',k)  
        end
    else
        for k = Length_Names_chunk*(chunk-1)+1:Length_Names_chunk*chunk
            K_Trace = strcat(Path,'\' ,FileNames(k));
            [num,txt,raw] =  xlsread(char(K_Trace),'sheet1','G2:G1496');
            idx = find(strcmp(raw,'—'));
            nan = num2cell('0');
            raw(idx)=nan; 
            pmdata = cellfun(@str2num,raw(1:end));
            PM_All(:,i) = pmdata;
            i = i + 1;
            fprintf('已读取第%d个文件\n',k)  
        end
    end
    NewFile_Path = sprintf('D:/Data/20210128PM2.5/output/output%d.xlsx',chunk);
    xlswrite(NewFile_Path,PM_All)
    fprintf('第%d个分块读取结束\n',chunk)
    clearvars PM_All
end

避免报错,用try……catch处理

clc;
clear;
Path=fullfile('D:\Data\20210128PM2.5');
File=dir(fullfile(Path,'*.xlsx'));
FileNames = {File.name};
Length_Names = size(FileNames,2);
% All_PM_Data = zeros(1495,Length_Names);
error_num = 0;
for k = 1:Length_Names
    try 
        K_Trace = strcat(Path,'\' ,FileNames(k));
        [num,txt,raw] =  xlsread(char(K_Trace),'sheet1','G2:G1496');
        idx = find(strcmp(raw,'—'));
        nan = num2cell('0');
        raw(idx) = nan; 
        pmdata = cellfun(@str2num,raw(1:end));
        PM_All(:,k) = pmdata;
        fprintf('已读取第%d个文件\n',k) 
    catch
        fprintf('第%d个文件有问题!!!!!\n',k)
        error_text = fopen('D:\Data\20210128PM2.5\output\error_log.txt','a');
        fprintf(error_text,'%s\n',K_Trace);
        fclose(error_text);
        error_num = error_num+1;
        continue
    end
end
fprintf('共读取%d个文件\n',k)
fprintf('共%d个文件读取有问题\n',error_num)
NewFile_Path = 'D:\Data\20210128PM2.5\output\output.xlsx';
xlswrite(NewFile_Path,PM_All)
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值