Work on using LLMs for scene understanding
Code: https://github.com/MIT-SPARK/llm_scene_understanding
The work leverages language to classify rooms in indoor environments based on the objects they contain, using three approaches:
(i) a zero-shot approach,
(ii) a feed-forward classifier approach, and
(iii) a contrastive classifier approach.
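In essence, it uses the 'common-sense' knowledge learned by language models to assign room categories, which also helps with discovering new and unknown classes.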
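As a rough illustration of the zero-shot idea, the sketch below picks the room label whose completed sentence a scorer rates highest. Everything here is an assumption made for illustration: the function names, the sentence template, and `toy_score`, which is a hand-written stand-in for a language-model likelihood and not the repository's code.

```python
from typing import Callable, Sequence


def zero_shot_room(query: str, labels: Sequence[str],
                   score: Callable[[str], float]) -> str:
    """Return the label whose completed sentence the scorer rates highest."""
    # Hypothetical template: append one candidate label per sentence.
    return max(labels, key=lambda label: score(f"{query} This room is called a {label}."))


# Toy stand-in for a language model: hand-picked object/room associations.
ASSOCIATIONS = {
    "bathroom": {"toilet", "sink", "bathtub"},
    "kitchen": {"oven", "sink", "fridge"},
}


def toy_score(sentence: str) -> float:
    """Count how many words in the sentence are associated with the named room."""
    words = set(sentence.lower().replace(",", " ").replace(".", " ").split())
    label = next((name for name in ASSOCIATIONS if name in words), None)
    return float(len(words & ASSOCIATIONS[label])) if label else 0.0
```

In practice `toy_score` would be replaced by a real language-model scoring function; the selection logic in `zero_shot_room` stays the same.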
Methods: summarize a room's contents in a query sentence, which is then given to the language model.
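A minimal sketch of how such a query sentence could be built from a list of detected objects. The function name `build_query`, the `max_objects` cap, and the exact wording of the template are assumptions for illustration, not taken from the paper or the repository.

```python
def build_query(objects, max_objects=3):
    """Summarize a room's contents in one sentence (hypothetical template)."""
    # De-duplicate and sort so the same objects always give the same sentence.
    names = sorted(set(objects))[:max_objects]
    if not names:
        return "This room is empty."
    if len(names) == 1:
        listing = names[0]
    else:
        listing = ", ".join(names[:-1]) + " and " + names[-1]
    return f"This room contains {listing}."
```

For example, `build_query(["sink", "bathtub", "toilet", "sink"])` gives "This room contains bathtub, sink and toilet." A real pipeline would likely choose which objects to keep by how informative they are instead of alphabetically.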