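Non-IID (non-independent and identically distributed) data covers many scenarios; this post discusses only the label distribution imbalance scenario.
For background on distributed automatic modulation recognition, interested readers can download the paper listed at the end of this post.
Let's get straight to it!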
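The dataset used is RadioML2018.01A [1].
Dataset download link: RF Datasets For Machine Learning | DeepSig
Before splitting it into sub-datasets, the dataset has to be regrouped by modulation type: for each modulation, the signals at all SNRs are combined and saved as a single mat file.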
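The regrouping step itself is not listed in this post. Below is a minimal sketch of one way to do it, assuming the frames are already in memory as an N×L×2 array `X` with a 1-based class index `y` per frame. `X`, `y`, `mod_names` and `out_dir` are hypothetical names, and tiny random data stands in for the real dataset; only the variable name `Data_after_part` and the one-file-per-modulation layout are taken from the script that follows.

```matlab
% Toy stand-in for the real data: N frames of length L with I/Q channels.
% X, y, mod_names and out_dir are hypothetical names for this illustration.
rng(0);                               % reproducible toy data
mod_names = ["BPSK", "QPSK", "8PSK"]; % shortened list for illustration
N = 30; L = 8;
X = randn(N, L, 2);                   % frames mixed across SNRs and modulations
y = randi(numel(mod_names), N, 1);    % class index of each frame (1-based)

out_dir = fullfile(tempdir, 'regrouped');
if ~exist(out_dir, 'dir'), mkdir(out_dir); end

for kk = 1:numel(mod_names)
    % Collect every frame of modulation kk, regardless of its SNR
    Data_after_part = X(y == kk, :, :);
    % One mat file per modulation, named after the modulation type
    fname = fullfile(out_dir, [char(mod_names(kk)) '.mat']);
    save(fname, 'Data_after_part');
end
```

With the real dataset the same loop would run over all 24 modulation names, and the labels would first have to be converted from whatever format the dataset stores them in to a class index.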
%author: Fu Xue
%date: 2023/06/13
clear all
close all
mod = ["OOK", "4ASK", "8ASK", "BPSK", "QPSK", "8PSK", "16PSK", "32PSK", "16APSK", "32APSK", "64APSK", "128APSK", "16QAM", "32QAM", "64QAM", "128QAM", "256QAM", "AM-SSB-WC", "AM-SSB-SC", "AM-DSB-WC", "AM-DSB-SC","FM", "GMSK", "OQPSK"];
n = 12; % number of sub-datasets
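% Column kk holds how the 79872 samples of modulation kk are split across the n sub-datasets
for kk = 1:size(mod,2)
    rand_original(:,kk) = rand(n,1); % random proportions
    % floor() keeps the counts integral, so a few samples per modulation may stay unassigned
    rand_original(:,kk) = floor(rand_original(:,kk)/sum(rand_original(:,kk))*79872);
end
% Store the sample counts so that the same split can be reloaded later
xlswrite('Unbalance_sample_number_report.xlsx',rand_original)
rand_original = xlsread('Unbalance_sample_number_report.xlsx');
sum_rand = sum(rand_original(:,:),2); % total number of samples in each sub-dataset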
% Assemble the dataset of each node
if ~exist('train','dir'), mkdir('train'); end % save() below needs this folder to exist
for k = 1:n
    Data_ED = [];
    Data_ED_labels = [];
    for kk = 1:size(mod,2)
        % Each <modulation>.mat file provides the variable Data_after_part
        train_part_path = strcat(mod(kk),'.mat');
        load(train_part_path)
        % Node k takes the rows right after those already taken by nodes 1..k-1
        % (for k = 1 the sum over the empty range 1:0 is 0, so the slice starts at row 1)
        idx_start = sum(rand_original(1:k-1,kk)) + 1;
        idx_end = sum(rand_original(1:k,kk));
        Data_ED = [Data_ED; Data_after_part(idx_start:idx_end,:,:)];
        % Labels are zero-based class indices (kk-1)
        Data_ED_labels = [Data_ED_labels; (kk-1)*ones(rand_original(k,kk),1)];
    end
    save(strcat('train/part',num2str(k),'.mat'),'Data_ED')
    save(strcat('train/part',num2str(k),'_labels.mat'),'Data_ED_labels')
end
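Once the code has finished running, you get 12 sub-datasets with unevenly distributed sample counts.
The figure below shows the sample-count distribution of the 12 sub-datasets from one of my runs.
For the performance of weight-averaging-based distributed automatic modulation recognition when the sample counts are uneven, see paper [2].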
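To see why the slices handed to different nodes never overlap, the row indexing used in the script can be replayed on made-up counts. This is a small sketch only: `counts` is a hypothetical stand-in for one column of `rand_original`, and `first_row`, `last_row` and `owner` are names chosen for this example.

```matlab
% Hypothetical sample counts of one modulation for n = 4 nodes
counts = [3; 0; 5; 2];
n = numel(counts);

% Node k takes the rows right after those taken by nodes 1..k-1
last_row  = cumsum(counts);           % last row index used by each node
first_row = last_row - counts + 1;    % first row index used by each node

owner = zeros(sum(counts), 1);        % which node each row ends up with
for k = 1:n
    rows = first_row(k):last_row(k);  % empty when counts(k) == 0
    assert(all(owner(rows) == 0));    % no row is handed out twice
    owner(rows) = k;
end
disp([first_row last_row]);
```

A node whose count is zero simply receives an empty slice, so it contributes no rows and no labels for that modulation.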
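A stacked bar chart is one way to draw such a distribution. The sketch below regenerates random counts with the same rule as the script above instead of reading the saved xlsx file, so its numbers will differ from my run; `counts`, `p`, `n_mod` and `total_per_mod` are names chosen for this example.

```matlab
% Regenerate counts with the same rule as the script: n nodes, 24 modulations
rng(1);
n = 12; n_mod = 24; total_per_mod = 79872;
p = rand(n, n_mod);                               % random proportions
counts = floor(p ./ sum(p, 1) * total_per_mod);   % rows: nodes, columns: modulations

node_totals = sum(counts, 2);                     % samples held by each node

figure('Visible', 'off');                         % remove 'Visible','off' to show the window
bar(counts, 'stacked');                           % one bar per node, one segment per modulation
xlabel('Sub-dataset (node) index');
ylabel('Number of samples');
title('Sample counts per sub-dataset');
```

To plot an actual split, `counts` could instead be loaded from `Unbalance_sample_number_report.xlsx`, as the script already does for `rand_original`.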
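As a rough illustration of weight averaging under uneven sample counts, each node's model parameters can be weighted by that node's share of the total samples. This is a generic sketch of sample-count-weighted averaging and not necessarily the exact scheme used in [2]; `node_weights` and `node_counts` are hypothetical toy values.

```matlab
% Toy setup: 3 nodes, each holding a parameter vector of length 4
node_weights = [1 2 3 4;      % parameters from node 1
                2 2 2 2;      % parameters from node 2
                0 4 0 4];     % parameters from node 3
node_counts = [100; 300; 600];            % samples held by each node

alpha = node_counts / sum(node_counts);   % averaging coefficients, sum to 1
global_weights = alpha' * node_weights;   % count-weighted average, 1x4
disp(global_weights);
```

Nodes holding more samples pull the average toward their parameters, whereas a plain mean would treat all three nodes equally.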
[1] T. J. O'Shea, T. Roy and T. C. Clancy, "Over-the-Air Deep Learning Based Radio Signal Classification," IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 168-179, Feb. 2018.
[2] X. Fu, G. Gui, Y. Wang, H. Gacanin and F. Adachi, "Automatic Modulation Classification Based on Decentralized Learning and Ensemble Learning," IEEE Transactions on Vehicular Technology, vol. 71, no. 7, pp. 7942-7946, July 2022, doi: 10.1109/TVT.2022.3164935.
If this post and the papers are helpful for your research or study, you are welcome to cite the relevant authors and our paper.