Reading Notes

Latest recommended article published on 2023-11-27 17:58:19

sophieag

Article link: https://blog.csdn.net/sophieag/article/details/50014279

What's Cookin'? Interpreting Cooking Videos using Text, Speech and Vision

Procedural knowledge is extracted from multiple modalities.

alignment

An HMM aligns each instructional step with the speech signal.

Data collection and preprocessing

Search YouTube, and also add the content of the expanded links.

Sentence classification: naive Bayes (recipe step, recipe ingredient, background)
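The three-way sentence classification noted above can be sketched with a small multinomial naive Bayes model. This is only an illustration under stated assumptions: the whitespace tokenizer, the Laplace smoothing value, and the toy training sentences in `TOY_EXAMPLES` are all made up here and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; the labels mirror the three classes in the notes.
TOY_EXAMPLES = [
    ("chop the onions finely", "step"),
    ("stir the sauce for five minutes", "step"),
    ("2 cups flour", "ingredient"),
    ("1 teaspoon salt", "ingredient"),
    ("this dish reminds me of my grandmother", "background"),
    ("thanks for watching my channel", "background"),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def train(examples, alpha=1.0):
    """Collect per-label word counts for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for sentence, label in examples:
        tokens = tokenize(sentence)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return {"word_counts": word_counts, "label_counts": label_counts,
            "vocab": vocab, "alpha": alpha}


def classify(model, sentence):
    """Return the label with the highest log posterior for the sentence."""
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_sentences in model["label_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + alpha * vocab_size
        score = math.log(n_sentences / total)  # log prior
        for token in tokenize(sentence):
            if token in model["vocab"]:  # skip words never seen in training
                score += math.log((counts[token] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_EXAMPLES)
print(classify(model, "stir the onions"))
```

With real data, the same structure would be trained on labelled sentences from recipe pages and video descriptions rather than on six toy lines.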

Parsing: POS tagging, entity chunking, and constituency parsing; the classified tree node must be a verb (v).

(Euclidean distance -> distance between words.) Stemming. If no explicit entity is found, a heuristic looks in the preceding sentence.
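The fallback heuristic in the note above (look in the preceding sentence when a step has no explicit entity) might look roughly like this. The `(verb, entity)` tuple representation and the function name `fill_missing_entities` are assumptions made for this sketch; the notes do not say how the paper represents parsed steps.

```python
def fill_missing_entities(steps):
    """Fill in missing entities from the preceding step.

    `steps` is a list of (verb, entity) pairs, where entity is None when the
    parser found no explicit entity. A missing entity is replaced by the
    entity of the step before it (which may itself have been filled in).
    """
    filled = []
    previous_entity = None
    for verb, entity in steps:
        if entity is None:
            entity = previous_entity  # heuristic: borrow from the previous step
        else:
            previous_entity = entity
        filled.append((verb, entity))
    return filled


parsed = [("chop", "onion"), ("fry", None), ("add", "salt")]
print(fill_missing_entities(parsed))
```

A step with no entity and no earlier step to borrow from keeps `None`, so downstream code can still tell that nothing was found.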

speech transcript

ASR system

Factored HMM (recipe step <-> ASR words), keyword confidence
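To make the step-to-transcript alignment concrete, here is a much simplified sketch: a plain left-to-right HMM decoded with Viterbi, not the factored HMM the notes mention. The hidden state is the index of the current recipe step, each ASR word is an observation, and a state may only stay or advance by one step. The values `p_stay`, `p_match`, and `p_other` are invented for illustration, and the emission scores are unnormalized shortcuts rather than a proper distribution.

```python
import math


def align(steps, words, p_stay=0.8, p_match=0.5, p_other=0.05):
    """Return, for each transcript word, the index of the aligned recipe step."""
    if not steps or not words:
        return []
    n = len(steps)
    step_vocab = [set(step.lower().split()) for step in steps]

    def emit(i, word):
        # Higher score when the word appears in the text of step i.
        return math.log(p_match if word.lower() in step_vocab[i] else p_other)

    log_stay, log_next = math.log(p_stay), math.log(1.0 - p_stay)
    neg_inf = -math.inf

    # score[i]: best log score of any path that ends in step i at this word.
    score = [neg_inf] * n
    score[0] = emit(0, words[0])  # the alignment must start at the first step
    backpointers = []
    for word in words[1:]:
        new_score = [neg_inf] * n
        pointer = [0] * n
        for i in range(n):
            stay = score[i] + log_stay
            move = score[i - 1] + log_next if i > 0 else neg_inf
            if move > stay:
                new_score[i], pointer[i] = move + emit(i, word), i - 1
            else:
                new_score[i], pointer[i] = stay + emit(i, word), i
        backpointers.append(pointer)
        score = new_score

    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path


steps = ["chop onions", "boil water", "add salt"]
words = "first chop the onions then boil some water and add salt".split()
print(list(zip(words, align(steps, words))))
```

Filler words between two steps (such as "then") score the same under either neighbouring step, so their assignment is arbitrary in this sketch; a keyword confidence term like the one in the notes would be one way to weight the observations more carefully.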

Visual detectors: a CNN classifier finds the direct object.
