20:Learning a Self-Expressive Network for Subspace Clustering

State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of other data points. Since the number of self-expressive coefficients grows quadratically with the number of data points, their ability to handle large-scale datasets is often limited. The Self-Expressive Network (SENet) can not only learn the self-expressive coefficients with desired properties on the training data, but also handle out-of-sample data.

SENet yields highly competitive performance on MNIST, Fashion MNIST and Extended MNIST and state-of-the-art performance on CIFAR-10.
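The self-expressive model described in the abstract can be illustrated with a small sketch. It is not the method of the paper: it assumes a ridge (squared l2) regularizer, and the function name `self_expressive_coefficients`, the parameter `lam`, and the toy data are hypothetical. It solves for the coefficients of each point over all other points directly, which makes the quadratic growth of the coefficient matrix visible.

```python
import numpy as np

def self_expressive_coefficients(X, lam=1e-3):
    """Express each column of X as a ridge-regularized combination of the other columns.

    X has shape (D, N). Returns C of shape (N, N) with zero diagonal,
    where column j holds the coefficients that reconstruct X[:, j].
    """
    N = X.shape[1]
    C = np.zeros((N, N))
    for j in range(N):
        others = np.delete(np.arange(N), j)      # a point may not represent itself
        A = X[:, others]
        # Ridge solution (assumed regularizer): (A^T A + lam I)^{-1} A^T x_j
        c = np.linalg.solve(A.T @ A + lam * np.eye(N - 1), A.T @ X[:, j])
        C[others, j] = c
    return C

rng = np.random.default_rng(0)
# Toy data: two 2-dimensional subspaces of R^10, 15 points each.
bases = [np.linalg.qr(rng.standard_normal((10, 2)))[0] for _ in range(2)]
X = np.hstack([B @ rng.standard_normal((2, 15)) for B in bases])

C = self_expressive_coefficients(X)
rel_err = np.linalg.norm(X - X @ C) / np.linalg.norm(X)
print(C.shape, rel_err)   # C stores N * N coefficients for N points
```

With N points the matrix C has N x N entries, so storing and optimizing it directly becomes expensive for datasets with tens of thousands of points.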

1. Introduction


With advances in data collection, storage, and processing technology, large-scale databases have become increasingly available in computer vision. Deep learning and other modern machine learning techniques have achieved great success in analyzing such big data, but these methods need a lot of labeled data, which is costly to acquire. Extracting patterns from, and clustering, unlabeled big data has therefore become an important open problem. We consider the problem of clustering large-scale unlabeled data under the assumption that each cluster is approximated by a low-dimensional subspace of a high-dimensional ambient space, namely subspace clustering. This problem has a wide range of applications, including image clustering, motion segmentation, hybrid system identification, cancer subtype clustering, and hyperspectral image segmentation.

We introduce the self-expressive network (SENet) to learn a self-expressive model for subspace clustering, which can be leveraged to handle out-of-sample data and large-scale data.

A SENet trained on a certain dataset can be used to produce self-expressive coefficients for another dataset drawn from the same data distribution, so the method can handle out-of-sample data effectively.
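To make the out-of-sample idea concrete, here is a minimal sketch of a function that maps a pair of points to a coefficient. The architecture is an assumption made for illustration only (two single-layer embeddings with random, untrained weights `W_query` and `W_key`, and a hypothetical helper `coefficients`); it is not the network from the paper. What it shows is structural: the number of parameters is fixed, and coefficients can be evaluated for points that were never seen during training.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 10, 16                                         # input and embedding sizes (illustrative)
W_query = rng.standard_normal((H, D)) / np.sqrt(D)    # untrained stand-in weights
W_key = rng.standard_normal((H, D)) / np.sqrt(D)

def coefficients(X_query, X_key):
    """Return c[i, j] = f(x_j, x_i) for every query column x_j and key column x_i."""
    U = np.tanh(W_query @ X_query)                    # (H, n_query) query embeddings
    V = np.tanh(W_key @ X_key)                        # (H, n_key) key embeddings
    return V.T @ U                                    # (n_key, n_query) pairwise inner products

X_train = rng.standard_normal((D, 100))
X_new = rng.standard_normal((D, 5))                   # points not in the training set

C_new = coefficients(X_new, X_train)                  # coefficients of new points over training points
X_new_hat = X_train @ C_new                           # each new point as a combination of training points
n_params = W_query.size + W_key.size                  # independent of the number of data points
```

Because the weights here are random, `X_new_hat` is not a good reconstruction; in a trained model the weights would be fitted so that the reconstruction error and a regularizer on the coefficients are small.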

Our experiments showcase the effectiveness of our method as summarized below:

We show that the self-expressive coefficients computed by a trained SENet closely approximate those computed by solving for them directly without the network. This illustrates the ability of SENet to approximate the desired self-expressive coefficients.

We show that a SENet trained on (part of) the training set of MNIST and Fashion MNIST can be used to produce self-expressive coefficients on the test set that give a good clustering performance. This illustrates the ability of SENet to handle out-of-sample data.

We show that SENet can be used to cluster datasets containing 70,000+ data points, such as MNIST, Fashion MNIST and Extended MNIST, very efficiently, achieving a performance that closely matches (for MNIST, Fashion MNIST and Extended MNIST) or surpasses (for CIFAR-10) the state of the art.

2. Related Work

Deep Clustering: Deep networks are used to extract features. In contrast, our work assumes that the input data already lie in linear subspaces, and focuses on computing the self-expressive coefficients.

Scalable Subspace Clustering: Several methods adopt a two-step approach for computing self-expressive coefficients:

1) construct a dictionary, either generated at random or learned/selected from data.

2) express each data point as a linear combination of the atoms in the dictionary.

However, the output dimension, and hence the scale of the optimization problem, increases at least quadratically with the size of the dictionary, so using a sufficiently large dictionary may be infeasible.
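The two-step approach listed above can be sketched as follows. This is an illustration under stated assumptions, not any specific published method: the dictionary is formed by picking data points uniformly at random, the coefficients are obtained by ridge regression, and the names `dictionary_coefficients`, `n_atoms` and `lam` are hypothetical.

```python
import numpy as np

def dictionary_coefficients(X, n_atoms, lam=1e-3, seed=0):
    """Two-step sketch: (1) select atoms from the data, (2) regress every point on them.

    X has shape (D, N). Returns the selected indices and C of shape (n_atoms, N).
    """
    rng = np.random.default_rng(seed)
    N = X.shape[1]
    # Step 1: build the dictionary by selecting data points at random (assumed strategy).
    atom_idx = rng.choice(N, size=n_atoms, replace=False)
    atoms = X[:, atom_idx]
    # Step 2: ridge regression of all points on the atoms (assumed regularizer).
    gram = atoms.T @ atoms + lam * np.eye(n_atoms)
    C = np.linalg.solve(gram, atoms.T @ X)
    return atom_idx, C

rng = np.random.default_rng(1)
# Toy data: two 2-dimensional subspaces of R^10, 50 points each.
bases = [np.linalg.qr(rng.standard_normal((10, 2)))[0] for _ in range(2)]
X = np.hstack([B @ rng.standard_normal((2, 50)) for B in bases])

atom_idx, C = dictionary_coefficients(X, n_atoms=20)
rel_err = np.linalg.norm(X - X[:, atom_idx] @ C) / np.linalg.norm(X)
print(C.shape, rel_err)
```

In this sketch the coefficient matrix has `n_atoms` x N entries rather than N x N, which is what makes the approach attractive for large N. Points that were selected as atoms can trivially represent themselves here; a real method would need to handle that case.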

Self-attention Models: The self-attention mechanism is used in Graph Attention Networks (GAT), Transformers, and Non-local Neural Networks, where each output feature is a linear combination of input features. Similar to SENet, the coefficients in the linear combination are computed with a neural network. However, unlike self-expressive models, which use the distance between the input features and output features to define a training loss in an unsupervised manner, self-attention methods impose a supervised learning loss on the output features. This leads to a difference in the design of the network architecture.
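For comparison, the following sketch shows how self-attention produces its combination coefficients. It uses the scaled dot-product form with a softmax, as in the Transformer; GAT and non-local networks use variants of this idea. The weight matrices are random placeholders and the function name `self_attention` is chosen for this example.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention on rows of X (shape (N, D)).

    Returns the output features and the (N, N) matrix of attention coefficients.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[1])            # pairwise query-key similarities
    scores = scores - scores.max(axis=1, keepdims=True)   # for numerical stability
    A = np.exp(scores)
    A = A / A.sum(axis=1, keepdims=True)              # softmax: each row sums to 1
    return A @ V, A                                   # outputs are combinations of value vectors

rng = np.random.default_rng(0)
N, D, H = 6, 8, 4
X = rng.standard_normal((N, D))
W_q, W_k, W_v = (rng.standard_normal((D, H)) for _ in range(3))
out, A = self_attention(X, W_q, W_k, W_v)
```

Note two differences from a self-expressive model that are visible in the sketch: the softmax forces the coefficients to be non-negative and to sum to one, and nothing ties the output `out` back to the input `X`, so the coefficients are shaped only by whatever loss is placed on the outputs.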

3. Self-Expressive Network




4. Experiments



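To further verify that the trained self-expressive network generalizes, we sample a subset of the MNIST and Fashion MNIST training sets to form the training set, and then use the separate MNIST and Fashion MNIST test sets, respectively, as test data. The subspace-preserving error decreases steadily as the training set grows, and the clustering accuracy obtained with spectral clustering is acceptable and likewise improves as the training set grows.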
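The two evaluation quantities mentioned here can be computed as in the sketch below. The definitions are assumptions, since these notes do not spell them out: the subspace-preserving error is taken as the average fraction of each coefficient vector's l1 norm that falls on points from other clusters, and clustering accuracy is taken as the best agreement over all relabelings of the predicted clusters. The brute-force search over permutations is only practical for a small number of clusters.

```python
import numpy as np
from itertools import permutations

def subspace_preserving_error(C, labels):
    """Mean fraction of l1 mass in each column of C that lies on other-cluster points."""
    labels = np.asarray(labels)
    A = np.abs(C)
    same = labels[:, None] == labels[None, :]         # same[i, j]: points i and j share a cluster
    total = A.sum(axis=0)
    wrong = (A * ~same).sum(axis=0)
    valid = total > 0                                 # skip all-zero columns
    return float(np.mean(wrong[valid] / total[valid]))

def clustering_accuracy(true_labels, pred_labels, k):
    """Best accuracy over all relabelings; labels are assumed to be integers in 0..k-1."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    best = 0.0
    for perm in permutations(range(k)):
        mapped = np.array(perm)[pred_labels]          # rename predicted cluster ids
        best = max(best, float(np.mean(mapped == true_labels)))
    return best

labels = [0, 0, 1, 1]
C_clean = np.array([[0, 1, 0, 0],
                    [1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=float)       # every point uses only its own cluster
C_mixed = C_clean.copy()
C_mixed[:, 0] = [0, 0.5, 0.5, 0]                      # point 0 puts half its mass on cluster 1

err_clean = subspace_preserving_error(C_clean, labels)
err_mixed = subspace_preserving_error(C_mixed, labels)
acc = clustering_accuracy(labels, [1, 1, 0, 0], k=2)
```

For larger numbers of clusters, the permutation search is usually replaced by an assignment solver such as the Hungarian algorithm.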


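Next, to demonstrate the potential of the self-expressive network for large-scale datasets, we select a subset of samples from each of MNIST, Fashion MNIST, CIFAR-10, and EMNIST to form the datasets. We train the self-expressive network and record its training time and the clustering accuracy it obtains. Our method reaches comparable or better accuracy in a relatively short time.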












