# Main Contribution
* Proposes a novel label confusion model (LCM) as an effective enhancement component for text classification. It can be applied to mainstream deep learning models without changing their structure and requires no extra computation cost in the prediction procedure.
* The proposed LCM works for both English and Chinese, and is especially useful when the dataset is noisy with many mislabelled samples.
# Method
## LCM
The method is simple: learn a label embedding for each class via a DNN, then take the dot product between the label embeddings and the corresponding content embedding produced by a deep encoder such as BERT, a CNN, or an RNN. A linear layer with softmax activation transforms these similarities into a label confusion distribution.
The original one-hot label vector, scaled by a controlling parameter, is then added to this distribution and renormalized to form the simulated label distribution (SLD), which serves as the training target.
![](./images/1638152727316.png)
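The steps above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function name, the toy dimensions, and the value of the controlling parameter `alpha` are assumptions for illustration.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def simulated_label_distribution(content_emb, label_emb, y_onehot, alpha=4.0):
    """Sketch of the LCM's simulated label distribution (SLD).

    content_emb: (d,)   content representation from an encoder (BERT/CNN/RNN)
    label_emb:   (C, d) learned label embeddings, one row per class
    y_onehot:    (C,)   one-hot true label
    alpha:       controlling parameter weighting the one-hot label
    """
    sim = label_emb @ content_emb          # similarity of content to each label
    label_conf = softmax(sim)              # label confusion distribution
    # mix the scaled one-hot label with the confusion distribution, renormalize
    return softmax(alpha * y_onehot + label_conf)
```

With a large `alpha` the SLD stays peaked at the true class but keeps small probability mass on similar (confusable) labels, which is the soft target the model trains against.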
## Training
During training, KL divergence is used as the loss function to measure the distance between the predicted label distribution and the simulated label distribution produced by the LCM.
![](./images/1638152871016.png)
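The training loss can be sketched as the KL divergence from the simulated label distribution to the model's predicted distribution. This is an assumed minimal version in NumPy; the direction of the divergence and the `eps` clipping are illustrative choices, not taken from the paper's code.

```python
import numpy as np

def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred) between two discrete distributions.

    target: the simulated label distribution (SLD) from the LCM
    pred:   the classifier's predicted softmax distribution
    """
    target = np.clip(target, eps, 1.0)
    pred = np.clip(pred, eps, 1.0)
    return float(np.sum(target * np.log(target / pred)))
```

Replacing the usual one-hot cross-entropy target with the SLD is what lets the classifier tolerate mislabelled samples: the loss no longer forces full probability onto a possibly wrong hard label.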
# Experiment
* Useful for almost all models
![](./images/1638152985745.png)
* Especially useful on noisy datasets
![](./images/1638153035305.png)