Implementations in MATLAB, Python, and several other languages can be downloaded from http://lvdmaaten.github.io/tsne/
I am most familiar with MATLAB, so I use the MATLAB version here.
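The main idea of t-SNE is:
- Convert the pairwise distances between the high-dimensional data points into probabilities that represent their similarities
- Project the data into two or three dimensions while preserving those similarities, so that it can be visualized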
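For reference, the two steps above can be written out using the standard t-SNE formulation. The symbols are notation introduced here for illustration and do not appear in the code: $x_i$ are the $N$ high-dimensional points, $y_i$ their low-dimensional counterparts, and $\sigma_i$ a per-point bandwidth chosen so that the neighborhood of $x_i$ matches the requested perplexity.

```latex
% Step 1: similarities in the high-dimensional space
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}

% Step 2: similarities in the low-dimensional map (Student-t kernel)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% The map points y_i are found by minimizing the KL divergence between P and Q
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

Points that are close in the original space get a large $p_{ij}$, and minimizing $C$ pushes the corresponding $q_{ij}$ to be large as well, which is why clusters in the features show up as clusters in the 2D or 3D plot.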
After classification training, the features form very distinct clusters, as the plots show.
[Figures: wiki_image_ori, wiki_text_ori, wiki_image, wiki_text, 3D example]
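% Load data (image features, text features, and labels)
clear;
load('./WIKI/test_img.mat');
load('./WIKI/test_txt.mat');
load('./WIKI/test_lab.mat');
% Shuffle the samples; using one permutation keeps image, text, and label rows aligned
ind = randperm(size(test_img, 1));
train_image = test_img(ind(1:693),:);
train_text = test_txt(ind(1:693),:);
% Offset the image labels by 10 so that each modality gets its own label groups
train_labels_text = test_lab(ind(1:693));
train_labels_image = test_lab(ind(1:693))+10;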
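% Set parameters
no_dims = 2;
initial_dims = 200;
perplexity = 30;
% Embed the image features into 300 dimensions so they can be stacked with the text features
% (the concatenation below requires train_text to have 300 columns as well)
mappedimage = tsne(train_image, train_labels_image, 300, initial_dims, perplexity);
mappedtext = train_text;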
% Stack the image rows first, then the text rows; the labels must follow the same order
train_X=[mappedimage;mappedtext];
train_labels=[train_labels_image;train_labels_text];
% Run t-SNE
mappedX = tsne(train_X, train_labels, no_dims, initial_dims, perplexity);
% Plot results
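% One color per class, reused for both modalities
colors = [0 0 0; 1 1 0; 1 0 0; 0 0 1; 0 1 1; 0 1 0; 0.8 0.4 0.1; 1 0.7 0.8; 1 0.9 0.8; 0.4 0.35 0.8];
% Diamonds for the first 10 label groups (text), dots for the last 10 (image, offset by 10)
markers = [repmat('D', 1, 10), repmat('.', 1, 10)];
gscatter(mappedX(:,1), mappedX(:,2), train_labels, [colors; colors], markers, 10);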
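The figures include a 3D example, while the listing above only produces the 2D plot. A minimal sketch of a 3D variant might look like the following, assuming the `tsne.m` from the toolbox linked above is on the MATLAB path. The synthetic two-cluster data and the names `X`, `labels`, and `mapped3` are made up for illustration and stand in for `train_X` and `train_labels`. Note that newer MATLAB releases ship a built-in `tsne` in the Statistics and Machine Learning Toolbox with a different, name-value calling convention, so this sketch applies only to the downloaded version.

```matlab
% Synthetic stand-in data: two Gaussian clusters in 50 dimensions (illustrative only)
rng(0);
X = [randn(100, 50); randn(100, 50) + 5];
labels = [ones(100, 1); 2 * ones(100, 1)];

% Parameters for the toolbox's tsne(X, labels, no_dims, initial_dims, perplexity)
no_dims = 3;         % embed into three dimensions instead of two
initial_dims = 30;   % dimensionality of the PCA preprocessing step
perplexity = 30;

% Passing [] as the labels is intended to skip the toolbox's intermediate plots
mapped3 = tsne(X, [], no_dims, initial_dims, perplexity);

% 3D scatter plot with one color per label
scatter3(mapped3(:,1), mapped3(:,2), mapped3(:,3), 10, labels, 'filled');
```

To apply this to the data in this post, replace `X` and `labels` with `train_X` and `train_labels`; `gscatter` only draws 2D plots, which is why `scatter3` is used here.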