Matlab数据可视化（4）：一维数据绘图 II

最新推荐文章于 2024-08-23 13:39:07 发布

MoussaTintin

最新推荐文章于 2024-08-23 13:39:07 发布

阅读量9.4k

点赞数 5

分类专栏：原创可视化文章标签：图像处理 Matlab 数据分析可视化 visualization

本文链接：https://blog.csdn.net/jackytintin/article/details/9076519

版权

原创同时被 2 个专栏收录

31 篇文章 9 订阅

订阅专栏

可视化

5 篇文章 1 订阅

订阅专栏

五. 结点连接图（node link plot）

（源代码：NodeLinks.m）

有时，我们需要绘制出不同结点之间的连通关系，即结点连接图。以下以绘制美国128座城市之间的连通关系为例，介绍两种结点连接图的画法。

1) 定义每座城市与距它最近的城市连通，与其余视为不连通，然后根据连通性，利用gplot命令，直观的绘出结点连接图。（图9）

%% #1
%% 定义数据
[XYCoord] = xlsread('inter_city_distances.xlsx','Sheet3');
[intercitydist citynames] = xlsread('inter_city_distances.xlsx','Distances');

% 避免将自身判定为最近的城市
howManyCities = 128;
for i =1:howManyCities;intercitydist(i,i)=Inf;end

%% 定义临接矩阵
% n阶方阵，1表示相连，0表示不相连; 这是我们规定城市只与它最近的城市相连
adjacency = zeros(howManyCities,howManyCities);
for i = 1:howManyCities
    alls = find(intercitydist(i,:)==min(intercitydist(i,:)));    
    for j = 1:length(alls)
        adjacency(i,alls(j)) = 1;
        adjacency(alls(j),i) = 1;
    end    
    clear alls
end
figure('units','normalized','position',[ 0.2813    0.2676    0.3536    0.3889]);
plot(XYCoord(1:howManyCities,1),XYCoord(1:howManyCities,2),'ro');hold on;
title('距离最近的城市彼此相连');
gplot(adjacency,XYCoord);
xlabel('南北');
ylabel('东西');
set(gcf,'Color',[1 1 1],'paperpositionmode','auto');

图 9

2) 定义距离小于100英里的城市为相连，以更紧致的方式绘出连接情况。（图10）

%% 定义临接关系（距离小于100英里的城市之间认为是相邻的）
% 重新排列数据，以使距离短的城市大致靠近
for i = 1:howManyCities;  intercitydist(i,i) = 0; end
[balh I] = sort(intercitydist(1,:));
citynames = citynames(I);
XYCoord = XYCoord(I,:);
%% 重新计算距离矩阵
for i = 1:howManyCities;  
    for j = 1:howManyCities
        if i==j
            intercitydist(i,i) = Inf; 
        else
            intercitydist(i,j) = sqrt((XYCoord(i,1)-XYCoord(j,1))^2 + (XYCoord(i,2)-XYCoord(j,2))^2); 
        end
    end
end
%% 计算临接矩阵
adjacency = zeros(howManyCities,howManyCities);
for i = 1:howManyCities
    alls = find(intercitydist(i,:)<100);    
    for j = 1:length(alls)
        adjacency(i,alls(j)) = 1;
        adjacency(alls(j),i) = 1;
    end    
    clear alls
end

%% 图像大小和位置
figure('units','normalized','position',[0.0844    0.2259    0.8839    0.4324]);
axes('Position',[0.0371    0.2893    0.9501    0.6296]); 
xlim([1 howManyCities]);
ylim([0 100]);
hold on;
%% 在X轴上标注城市名称
set(gca,'xtick',1:howManyCities,'xticklabel',citynames,...
                                   'ticklength',[0.001 0]);
box on; 
rotateXLabels(gca,90);
%% 为不同弧线分配不同的颜色（距离越近，颜色越深）
m = colormap(pink(howManyCities+1));
cmin = min(min(intercitydist));
cmax = 150;
%% 绘制弧线
for i = 1:howManyCities
    for j = 1:howManyCities
        if adjacency(i,j)==1
            x=[i (i+j)/2 j]; 
            y=[0 intercitydist(i,j) 0];
            pol_camp=polyval(polyfit(x,y,2),linspace(i,j,25));
            plot(linspace(i,j,25),pol_camp,'Color',m(fix((intercitydist(i,j)-cmin)/(cmax-cmin)*howManyCities)+1,:),'linewidth',100/intercitydist(i,j));        
        end
    end
end
%% 标注
title('公路距离小于100英里的城市','fontsize',14);
ylabel('城市间距离');
set(gca,'Position',[0.0371    0.2893    0.9501    0.6296]);
%% 绘制水平网格以增加可读性
line(repmat(get(gca,'xlim'),9,1)',[linspace(10,90,9); linspace(10,90,9)],'Color',[.8 .8 .8]);
set(gcf,'Color',[1 1 1],'paperpositionmode','auto');

图 10

六. 日历热图（Heat Map）

（源代码：CalendarHeatMap.m）

数据大小可以用颜色的深浅直观的表示，这可以通过imagesc或pcolor命令获得（二者参考点的位置不同）。以下即是用Imagesc命令，绘制Google公司10到11年的股价数据。（图11）

%% Add path to utilities
addpath('.\utilities');


%% 数据预处理
[GOOG dateGOOG] = xlsread('GOOG_090784_012412.csv');
dateGOOG = datenum({dateGOOG{2:end,1}});
dateGOOG = dateGOOG(end:-1:1);
GOOG = GOOG(end:-1:1,:);
% 选择最后两年的数据
newData = [];
newDateData=[];
for i =1:numel(dateGOOG)-1
    if abs(dateGOOG(i+1)-dateGOOG(i))==1
        newData     =   [newData;   GOOG(i)];
        newDateData =   [newDateData dateGOOG(i)];
    else
        delta = abs(dateGOOG(i+1)-dateGOOG(i));
        for j = 1:delta
            newData     =   [newData;   NaN];
            newDateData =   [newDateData dateGOOG(i)+j-1];
        end
    end
end
newData     =   [newData;   GOOG(end)];
newDateData =   [newDateData dateGOOG(end)];
idx = find(newDateData<=datenum('12/31/2011')&newDateData>=datenum('1/1/2010'));
newData=newData(idx);
newDateData = newDateData(idx);


%% 日历布局
% 1行6个月，两年共4行
figure('units','normalized','Position',[ 0.3380    0.0889    0.6406    0.8157]);
colormap('summer');
xs = [0.03 .03+.005*1+1*.1525 0.03+.005*2+2*.1525 0.03+.005*3+3*.1525 0.03...
    +.005*4+4*.1525 0.03+.005*5+5*.1525];
ys = [0.14 .14+0.04*1+1*.165   .14+0.04*2+2*.165   .14+0.04*3+3*.165];


% 估计每月的天数
isthereALeapyear = find(~(mod(unique(str2num(datestr(newDateData,'yyyy'))),4)| ...
    mod(unique(str2num(datestr(newDateData,'yyyy'))),400)));
if isempty(isthereALeapyear)
    D = [31 28 31 30 31 30; 31 31 30 31 30 31;31 28 31 30 31 30; 31 31 30 31 30 31];
else
    if isthereALeapyear==1
        D = [31 29 31 30 31 30; 31 31 30 31 30 31;31 28 31 30 31 30; 31 31 30 31 30 31];
    else
        D = [31 28 31 30 31 30; 31 31 30 31 30 31;31 29 31 30 31 30; 31 31 30 31 30 31];
    end
end
 
%% 开始绘图
Dcnt=0;
for i = 1:4
    for j = 1:6
        % 绘制月视图
        axes('Position',[xs(j) ys(i) .1525    0.165]);         
        % 计算所在的月份
        idx = find(newDateData>=datenum([datestr(newDateData(1)+Dcnt,'mm') ...
            '/01/'  datestr(newDateData(1)+Dcnt,'yyyy')]) & ...
             newDateData<=datenum([datestr(newDateData(1)+Dcnt,'mm') '/31/' ...
             datestr(newDateData(1)+Dcnt,'yyyy')]));
        % 得到当月信息
        A = calendar(newDateData(1)+Dcnt);
        % 填入股价数据
        data = NaN(size(A));
        for k = 1:max(max(A))
            [xx yy] = find(A==k);
            data(xx,yy) = newData(idx(k));
        end
        % 上色
        imagesc(data); alpha(.4);hold on;set(gca,'fontweight','bold');
        xlim([.5 7.5]); ylim([0 6.5]);
        for m = 1:6
            for n= 1:7
                if A(m,n)~=0
                    text(n,m,num2str(A(m,n)));
                end
            end
        end     
        % 添加日历头
        text(.75,.25,'S','fontweight','bold'); text(1.75,.25,'M','fontweight','bold');
        text(2.75,.25,'T','fontweight','bold');
        text(3.75,.25,'W','fontweight','bold');text(4.75,.25,'R','fontweight','bold');
        text(5.75,.25,'F','fontweight','bold');text(6.75,.25,'S','fontweight','bold');
        
        title([datestr(newDateData(1)+Dcnt,'mmm')  datestr(newDateData(1)+Dcnt,'yy')]);
        set(gca,'xticklabel',[],'yticklabel',[],'ticklength',[0 0]);
        line([-.5:7.5; -.5:7.5], [zeros(1,9); 6.5*ones(1,9)],'Color',[.8 .8 .8]);
        line([zeros(1,9); 7.5*ones(1,9)],[-.5:7.5; -.5:7.5], 'Color',[.8 .8 .8]);
        box on;
        Dcnt=Dcnt+D(i,j);
    end
end


%%增加图标和标题
colorbar('Location','SouthOutside','Position',[ 0.1227    0.0613    0.7750    0.0263]);
alpha(.4);
annotation('textbox',[0.30 0.9354 0.8366 0.0571],...
    'String','2010年1月到2011年12月Google股价日记录',...
    'LineStyle','none','Fontsize',14);


 set(gcf,'Color',[1 1 1],'paperpositionmode','auto');

图 11

七. 数据分布的可视化分析

（源代码：DistributionAnalysis.m）

1) 首先利用散点图和直方图对数据进行初步观察。（图 12）

%% 加载数据
load distriAnalysisData;

%% 观测数据分布
% 绘制直方图
figure('units','normalized','position',[0.2099    0.6269    0.4354    0.2778]);
subplot(1,2,1);plot(sort(B),'.');xlabel('序号');ylabel('观察值');title('一维散点图');
subplot(1,2,2);hist(B);xlabel('间隔');ylabel('观测频率');title('直方图');
set(gcf,'Color',[1 1 1],'paperpositionmode','auto');

图 12

2) 进一步优化直方图，对数据进行更细致考查。（图13）

%% 优化直方图 
figure('units','normalized','position',[0.2099    0.6269    0.4354    0.2778]);
[N c] = hist(B,round(sqrt(length(B))));
bar(c,N);
title('间隔大小 = sqrt(n)');
xlabel('间隔');ylabel('观测频率');title('优化直方图间隔以发现隐藏的结构');
set(gcf,'Color',[1 1 1],'paperpositionmode','auto');

图 13

3) 利用样条插值，为直方图添加包络线。（图14）

%% 利用样条插值绘制直方图的包络
[N c] = hist(B,200);
% 用样条计算包络
env = interp1(c,N,c,'spline'); 
% 绘制归一化的包络
figure('units','normalized','position',[ 0.2099    0.6269    0.4354    0.2778]);
bar(c,N./max(N));hold;
plot(c,env./max(env),'r','Linewidth',3);
xlabel('间隔'); ylabel({'归一化后的包络','间隔数为200'});
title('直方图包络是数据经验分布建模的有力工具');
set(gcf,'Color',[1 1 1],'paperpositionmode','auto');

图 14

4) 绘制QQ图，观察数据分布的正态性。（图15）

%% MATLAB qqplot 命令
figure;qqplot(B);box on
title({get(get(gca,'title'),'String'),'数据点越接近曲线，数据分布的正态性越好'});
set(gcf,'Color',[1 1 1],'paperpositionmode','auto');

图 15

5) 残差分析，评估模型对数据真实分布的近似程度。（图16）

%% 分析残差
figure;
sigma_ampl = [79.267229 8.121365 5 6.254915 5.062882 11.117357 577.45966 ...
    531.38438 962.45674 1800 800 357.92132];
mu=[29 38 51 70 103 133];
% 高斯混合模型
f_sum=0;x=1:200;
for i=1:6
    f_sum=f_sum+sigma_ampl(i+6)./(sigma_ampl(i)).*exp(-(x-mu(i)).^2./(2*sigma_ampl(i).^2));
end
subplot(2,1,1);
clear h;
h(1)=plot(c,env,'Linewidth',1.5);hold on;
h(2)=plot(c,f_sum,'r','Linewidth',1.5); axis tight
legendflex(h,{'直方图轮廓','高斯混合模型'},'ref',gcf,'anchor',{'ne','ne'},'xscale',.5,'buffer',[-50 -50]);
title({'对数据分布建模后，通过残差分析评估','模型对数据描述的符合程度'});
subplot(2,1,2);
plot(c,env-f_sum,'.');axis tight;
title(['残差 = 观测值 - 拟合值, 均方误差 = ' num2str(sqrt(sum(abs(env-f_sum).^2)))]);
set(gcf,'Color',[1 1 1],'paperpositionmode','auto');

图 16

八. 时间序列分析

（源代码：TimeSeriesAnalysis.m）

1) 绘制散点图。（图17）

load timeseriesAnalysis;

%% 数据散点图
figure;
subplot(2,1,1);plot(x,ydata1);title('样本1实时心率（采样间隔0.5秒）');xlabel('时间（秒）');ylabel('心率（心跳次数/分钟）');
subplot(2,1,2);plot(x,ydata2);title('样本2实时心率（采样间隔0.5秒）');xlabel('时间（秒）');ylabel('心率（心跳次数/分钟）');
set(gcf,'color',[1 1 1],'paperpositionmode','auto');

图 17

2) 过滤信号中的线性成分（detrend）。（图18）

%% 过滤数据的线性趋势（detrend）
figure;
y_detrended1 = detrend(ydata1); 
y_detrended2 = detrend(ydata2); 
subplot(2,1,1);plot(x, ydata1,'-',x, ydata1-y_detrended1,'r');title('去除线性成分后的信号1');
legend({'信号','线性成分'});
xlabel('时间 （秒）');ylabel('心率（心跳次数/分钟）');
subplot(2,1,2);plot(x, ydata2,'-',x, ydata2-y_detrended2,'r');title('去除线性成分后的信号2');
legend({'信号','线性成分'});
xlabel('时间（秒）');ylabel('心率（心跳次数/分钟）');
set(gcf,'color',[1 1 1],'paperpositionmode','auto');

图 18

3) 计算自相关函数，绘制自相关图（correlogram）。（图19）

%% 自相关函数
figure;
y_autoCorr1 = acf(subplot(2,1,1),ydata1,100); 
set(get(gca,'title'),'String','心率数据自相关函数（样本1）');
set(get(gca,'xlabel'),'String','间期（秒）');
tt = get(gca,'xtick');
for i = 1:length(tt); ttc{i} = sprintf('%.2f ',0.5*tt(i)); end
set(gca,'xticklabel',ttc);
y_autoCorr2 = acf(subplot(2,1,2),ydata2, 100);
set(gca,'xticklabel',ttc);
set(get(gca,'title'),'String','心率数据自相关函数（样本2）');
set(get(gca,'xlabel'),'String','间期（秒）');
set(gcf,'color',[1 1 1],'paperpositionmode','auto');

图 19

4) 利用傅里叶变换寻找周期成分。（图20）

%% 利用傅里叶变换寻找周期成分
figure;

nfft = 2^(nextpow2(length(x)));
% 快速傅里叶变换（信号长度小于nfft时补零） 
ySpectrum1 = fft(y_detrended1,nfft);
ySpectrum2 = fft(y_detrended2,nfft);

NumUniquePts = ceil((nfft+1)/2);
% FFT is symmetric, throw away second half and use the magnitude of the coeeicients only
powerSpectrum1 = abs(ySpectrum1(1:NumUniquePts));
powerSpectrum2 = abs(ySpectrum2(1:NumUniquePts));
% Scale the fft so that it is not a function of the length of x
powerSpectrum1 = powerSpectrum1./max(powerSpectrum1);
powerSpectrum2 = powerSpectrum2./max(powerSpectrum2);
powerSpectrum1 = powerSpectrum1.^2;
powerSpectrum2 = powerSpectrum2.^2;
% Since we dropped half the FFT, we multiply the coeffixients we have by 2 to keep the same energy.
% The DC component and Nyquist component, if it exists, are unique and should not be multiplied by 2. 
if rem(nfft, 2) % odd nfft excludes Nyquist point 
     powerSpectrum1(2:end) = powerSpectrum1(2:end)*2;
     powerSpectrum2(2:end) = powerSpectrum2(2:end)*2;
else
     powerSpectrum1(2:end -1) = powerSpectrum1(2:end -1)*2;
     powerSpectrum2(2:end -1) = powerSpectrum2(2:end -1)*2;
end
% This is an evenly spaced frequency vector with NumUniquePts points.
% Sampling frequency
Fs = 1/(x(2)-x(1)); 
f = (0:NumUniquePts-1)*Fs/nfft;
plot(f,powerSpectrum1,'-',f,powerSpectrum2,'r');
title('心率信号能量谱');
xlabel('频率（赫兹）'); ylabel('能量');
xlim([0 .25]);
annotation_pinned('textarrow',[.15,.085],[.25,.03],'String',{'0.1Hz处的尖峰很可能','是该样本呼吸的频率'});
annotation_pinned('textarrow',[.1,.02],[.75,.87],'String',{'0.02 Hz处的尖峰很可能','是该样本呼吸的频率'});
set(gcf,'color',[1 1 1],'paperpositionmode','auto');

图 20

5) 对信号平滑处理，去掉频谱上系数较小的成分。（图21）

%% 去除能量较小的频率成分
figure;
% 不使用0垫位
ySpectrum1 = fft(ydata1);
ySpectrum2 = fft(ydata2);
% 去掉很小的系数
freqInd1=find(abs(ySpectrum1)<400); 
freqInd2=find(abs(ySpectrum2)<400); 
ySpectrum1(freqInd1)=0;
ySpectrum2(freqInd2)=0;
% 重建信号
y_cyclic1=ifft(ySpectrum1);
y_cyclic2=ifft(ySpectrum2);
subplot(2,1,1);
h(1)= plot(x,ydata1,'b');hold on;h(2)=plot(x,y_cyclic1,'r','linewidth',1.5);
title('心率信号1');axis tight;
legendflex(h,...                %handle to plot lines
    {'原始信号','平滑后信号'},... %corresponding legend entries
    'ref', gcf, ...             %which figure
    'anchor', {'e','e'}, ...  %location of legend box
    'buffer',[-10 0], ...         % an offset wrt the location
    'fontsize',8,...            %font size
    'xscale',.5);               %a scale factor for actual symbols    
xlabel('时间（秒）');ylabel('心率（心跳次数/分钟）');
subplot(2,1,2);
h(1) = plot(x,ydata2,'b');hold on;h(2)=plot(x,y_cyclic2,'r','linewidth',1.5);
title('心率信号2');axis tight;
xlabel('时间（秒）');ylabel('心率（心跳次数/分钟）');
set(gcf,'color',[1 1 1],'paperpositionmode','auto');