Contents
[1] Embodied Language Grounding with 3D Visual Feature Representations
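- task: language grounding
- motivation: conventional methods tend to overfit
- contribution: applies 3D visual feature representations to language understanding and spatial reasoning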
[2] Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences
[3] PhraseCut: Language-based Image Segmentation in the Wild
Summary
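- [2] In conventional Video Object Grounding, would converting declarative sentences into interrogative ones, using counterfactual reasoning or a similar idea, improve performance?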