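The MATLAB code below converts a dataset stored in .mat format to ARFF and to plain-text (.txt) format.
Note that each .mat file contains exactly one dataset with m+1 columns: the first m columns are features and the last column is the label.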
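As a minimal sketch of the layout described above, the following builds a toy dataset and saves it to a .mat file. The file name toy.mat, the variable name data, and all values are assumptions made only for illustration.

```matlab
% Hypothetical toy dataset: 4 samples, m = 3 features, 1 label column.
X = [0.1 0.2 0.3;
     0.4 0.5 0.6;
     0.7 0.8 0.9;
     1.0 1.1 1.2];        % features, one sample per row
y = [1; 2; 1; 2];         % class label of each sample
data = [X y];             % m+1 columns, label in the last column
save('toy.mat', 'data');  % the .mat file holds this single variable

[nSamples, nCols] = size(data);
fprintf('%d samples, %d features + 1 label column\n', nSamples, nCols - 1);
```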
Converting to ARFF: mat2arff.m
%
% This script converts the input data to the '.arff' file format,
% which is compatible with Weka ...
%
% Parameters:
% input_filename -- input file name; only '.mat', '.txt'
% or '.csv' files can be converted ...
% arff_filename -- the output '.arff' file ...
% NOTES:
% The input 'M*N' data must have the following layout:
% M: number of samples;
% N: sample features and label, "1:N-1" -- features, "N" -- sample label ...
% Read the file data ...
clear
clc
input_filename = 'GLIOMA-t.mat';
arff_filename = 'GLIOMA.arff';
% Choose the reader from the file extension
[~,~,ext] = fileparts(input_filename);
switch lower(ext)
    case '.mat'
        matdata = importdata(input_filename);
    case '.txt'
        matdata = textread(input_filename);
    case '.csv'
        matdata = csvread(input_filename);
    otherwise
        error('Unsupported input file type: %s',input_filename);
end
[row,col] = size(matdata);
f = fopen(arff_filename,'wt');
if (f < 0)
    % error() stops execution, so no return is needed afterwards
    error('Unable to open the file %s',arff_filename);
end
% Write the relation name and one numeric attribute per feature column
fprintf(f,'%s\n',['@relation ',arff_filename]);
for i = 1 : col - 1
    st = ['@attribute att_',num2str(i),' numeric'];
    fprintf(f,'%s\n',st);
end
% Write the last header line: the nominal class attribute
floatformat = '%.16g';
Y = matdata(:,col);
uY = unique(Y); % distinct label values
st = '@attribute label {';
for j = 1 : numel(uY) - 1
    st = [st sprintf([floatformat ','],uY(j))];
end
st = [st sprintf([floatformat '}'],uY(end))];
fprintf(f,'%s\n\n',st);
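% Write the data section, one sample per line ...
% Values are written space-separated. The ARFF format description
% specifies commas as the delimiter, so replace ' ' with ',' below
% if your parser requires it.
fprintf(f,'@data\n');
for i = 1 : row
    Xi = matdata(i,1:col-1);
    s = [sprintf([floatformat ' '],Xi) sprintf(floatformat,Y(i))];
    fprintf(f,'%s\n',s);
end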
fclose(f);
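The nominal label attribute in the header can also be built without a loop. This is a minimal sketch, assuming numeric labels; the label vector Y below is made-up example data, not taken from the GLIOMA dataset.

```matlab
Y = [2; 1; 2; 3];                 % hypothetical label column
uY = unique(Y);                   % sorted distinct labels: 1, 2, 3
vals = sprintf('%.16g,', uY);     % the format is reused for every element
vals(end) = [];                   % drop the trailing comma
labelLine = ['@attribute label {' vals '}'];
disp(labelLine)                   % @attribute label {1,2,3}
```

Because sprintf cycles through the format string for each element of uY, the loop and the special case for the last label are not needed.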
Converting to txt: mat2txt.m
You can also convert directly with save, but then every line of the output starts with two leading spaces.
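To check the leading whitespace yourself, the sketch below writes a small matrix with save in ASCII mode and inspects the first line. The matrix and the file name toy_save.txt are assumptions for illustration, and the exact amount of padding may differ between MATLAB versions.

```matlab
data = [1 2 3; 4 5 6];                  % hypothetical example matrix
save('toy_save.txt', 'data', '-ascii'); % plain-text output written by save

fid = fopen('toy_save.txt', 'rt');
firstLine = fgetl(fid);                 % first row exactly as save wrote it
fclose(fid);

% Count the whitespace characters at the start of the line
lead = regexp(firstLine, '^\s*', 'match', 'once');
nLead = numel(lead);
fprintf('Leading whitespace characters: %d\n', nLead);
```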
Note that the variable stored in dataName.mat must be named data.
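clc
clear
load('dataName.mat') % loads the variable 'data' into the workspace
fid = fopen('dataName.txt', 'wt');
if fid < 0
    error('Unable to open the file %s', 'dataName.txt');
end
% Note: '%e' keeps 6 digits after the decimal point;
% use '%.16e' if full double precision is needed.
for i = 1 : size(data, 1)
    % Write all but the last column, each followed by a space
    for j = 1 : size(data, 2) - 1
        fprintf(fid,'%e ',data(i, j));
    end
    % Write the last column (the label) and end the line
    fprintf(fid,'%e\n',data(i, size(data, 2)));
end
fclose(fid);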
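The nested loops can be replaced by a single fprintf call. This is a minimal sketch, not the original author's method; the matrix and the file name toy_vec.txt are assumptions for illustration. fprintf consumes its data argument column by column, so the matrix is transposed to emit it row by row.

```matlab
data = [1 2 3; 4 5 6];                       % hypothetical example matrix
nCols = size(data, 2);

% One '%e ' per column except the last, which ends the line
fmt = [repmat('%e ', 1, nCols - 1) '%e\n'];

fid = fopen('toy_vec.txt', 'wt');
if fid < 0
    error('Unable to open the file %s', 'toy_vec.txt');
end
fprintf(fid, fmt, data.');                   % transpose: write row by row
fclose(fid);

M = load('toy_vec.txt');                     % read back as a sanity check
```

For large matrices this avoids one fprintf call per element, and the output has no leading spaces.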