So today's work begins:
In this paper, the authors target the problem of Multimodal Named Entity Recognition (MNER) as an improvement on text-only NER.
The paper proposes a multimodal fusion method based on a heterogeneous graph of texts and images, aiming to make the representations of the two modalities more consistent and of higher quality. It first constructs the heterogeneous graph with dynamic links between textual and visual nodes, so that the model can pick out the fine-grained image regions that shed light on the entities in the text. On this heterogeneous, dynamic graph, the paper designs a simple and novel contrastive learning strategy that classifies the graph as an auxiliary task. This strategy can also alleviate the negative effects of images.
Strong Points:
1.The main problems of MNER, such as the negative effects introduced by images, are clearly pointed out; meanwhile, the model is designed specifically to address them and works, as the experiments show.
2.The proposed approach is quite novel, for example the two-stream graph transformer and the heterogeneous graph with dynamic links between textual and visual nodes.
3.Overall, this paper is clearly written and well organized.
4.The related work is well-rounded, and a sufficient number of baseline methods are compared.
Weak Points:
1.The Auxiliary Contrastive Learning part could be expanded so that readers can better understand the ablations.
2.The Tagging part should either be shortened or, if necessary, elaborated further.
3.The Abstract and Introduction mention the voice modality, but it is not mentioned again later. The authors could discuss future work as well as the difficulties encountered with the voice modality.
Details:
1.The "Martin Garrix" case could be used to explain the Two-Stream Mechanism part so that readers can understand it better.
2.The authors could introduce future work. The paper does not discuss the effect of other modalities such as voice and video.
3.Some case figures could be shown in the Auxiliary Contrastive Learning part and the detailed analysis part. Case figures should not appear only in the case studies part.
The whole morning and afternoon went into this, and tonight I still have to go to dinner with my boss. = =
After getting home, I read a bit of the "flower book" on deep learning.