This column collects computer vision papers. Date: December 1, 2021. Source: Paper Digest.
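64, TITLE: MultiPath++: Efficient Information Fusion and Trajectory Aggregation for Behavior Prediction
AUTHORS: Balakrishnan Varadarajan et al.
CATEGORY: cs.CV [cs.CV, cs.LG, cs.RO]
HIGHLIGHT: In this paper, we present MultiPath++, a future prediction model that achieves state-of-the-art performance on popular benchmarks.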
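65, TITLE: Multi-modal Text Recognition Networks: Interactive Enhancements Between Visual and Semantic Features
AUTHORS: Byeonghu Na ; Yoonsik Kim ; Sungrae Park
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper introduces a novel method, called Multi-modal Text Recognition Network (MATRN), that enables interactions between visual and semantic features for better recognition performance.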
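66, TITLE: Unsupervised Domain Generalization for Person Re-identification: A Domain-specific Adaptive Framework
AUTHORS: Lei Qi ; Lei Wang ; Yinghuan Shi ; Xin Geng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we instead study unsupervised domain generalization for ReID, assuming that no label is available for any source domain.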
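67, TITLE: Regularized Directional Representations for Medical Image Registration
AUTHORS: Vincent Jaouen ; Pierre-Henri Conze ; Guillaume Dardenne ; Julien Bert ; Dimitris Visvikis
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Following this research path, we propose a new approach for mono- and multimodal image registration based on the alignment of regularized vector fields derived from structural information such as gradient vector flow fields, an approach we call vector field similarity.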
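68, TITLE: ESL: Event-based Structured Light
AUTHORS: Manasi Muglikar ; Guillermo Gallego ; Davide Scaramuzza
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel structured-light system using an event camera to tackle the problem of accurate and high-speed depth sensing.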
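69, TITLE: SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches
AUTHORS: Yu Zeng ; Zhe Lin ; Vishal M. Patel
CATEGORY: cs.CV [cs.CV, cs.MM]
HIGHLIGHT: To this end, we investigate a new paradigm of sketch-based image manipulation: mask-free local image manipulation, which only requires sketch inputs from users and utilizes the entire original image.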
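70, TITLE: RADU: Ray-Aligned Depth Update Convolutions for ToF Data Denoising
AUTHORS: Michael Schelling ; Pedro Hermosilla ; Timo Ropinski
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose an iterative denoising approach operating in 3D space, designed to learn on 2.5D data by enabling 3D point convolutions to correct the points' positions along the view direction.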
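71, TITLE: Learning Multiple Dense Prediction Tasks from Partially Annotated Data
AUTHORS: Wei-Hong Li ; Xialei Liu ; Hakan Bilen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a label-efficient approach and look at jointly learning multiple dense prediction tasks on partially annotated data, which we call multi-task partially-supervised learning.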
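72, TITLE: Hole-robust Wireframe Detection
AUTHORS: Naejin Kong ; Kiwoong Park ; Harshith Goka
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We show qualitatively and quantitatively that our approach significantly outperforms previous works that are unable to handle holes, and also improves ordinary detection without holes.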
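73, TITLE: Leveraging The Topological Consistencies of Learning in Deep Neural Networks
AUTHORS: Stuart Synakowski ; Fabian Benitez-Quiroz ; Aleix M. Martinez
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we define a new class of topological features that accurately characterize the progress of learning while being quick to compute at run time.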
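74, TITLE: ARTSeg: Employing Attention for Thermal Images Semantic Segmentation
AUTHORS: Farzeen Munir ; Shoaib Azam ; Unse Fatima ; Moongu Jeon
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this work, we employ thermal cameras for semantic segmentation.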
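75, TITLE: Neural Attention for Image Captioning: Review of Outstanding Methods
AUTHORS: Zanyar Zohourianshahzadi ; Jugal K. Kalita
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this survey, we provide a review of the literature on attentive deep learning models for image captioning.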
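76, TITLE: A Face Recognition System's Worst Morph Nightmare, Theoretically
AUTHORS: Una M. Kelly ; Raymond Veldhuis ; Luuk Spreeuwers
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a method to create a third, different type of morph, which has the advantage of being easier to train.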
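77, TITLE: Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection
AUTHORS: Deepti Hegde ; Vishal M. Patel
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors that uses class prototypes to mitigate the effect of pseudo-label noise.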
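78, TITLE: CT-block: A Novel Local and Global Features Extractor for Point Cloud
AUTHORS: Shangwei Guo ; Jun Li ; Zhengchao Lai ; Xiantong Meng ; Shaokun Han
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel module, named CT-block, that can simultaneously extract and fuse local and global features.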
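79, TITLE: Hyperspectral Image Segmentation Based on Graph Processing Over Multilayer Networks
AUTHORS: Songyang Zhang ; Qinwen Deng ; Zhi Ding
CATEGORY: cs.CV [cs.CV, eess.SP]
HIGHLIGHT: Leveraging the recently developed graph signal processing over multilayer networks (M-GSP), this work proposes several approaches to HSI segmentation based on M-GSP feature extraction.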
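80, TITLE: PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images
AUTHORS: Stefano Zorzi ; Shabab Bazrafkan ; Stefan Habenschuss ; Friedrich Fraundorfer
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper introduces PolyWorld, a neural network that directly extracts building vertices from an image and connects them correctly to create precise polygons.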
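81, TITLE: Diffusion Autoencoders: Toward A Meaningful and Decodable Representation
AUTHORS: Konpat Preechakul ; Nattanat Chatthee ; Suttisak Wizadwongsa ; Supasorn Suwajanakorn
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Our key idea is to use a learnable encoder for discovering the high-level semantics, and a DPM as the decoder for modeling the remaining stochastic variations.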
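82, TITLE: Probabilistic Estimation of 3D Human Shape and Pose with A Semantic Local Parametric Model
AUTHORS: Akash Sengupta ; Ignas Budvytis ; Roberto Cipolla
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Instead, we present a method that (i) predicts distributions over local body shape in the form of semantic body measurements and (ii) uses a linear mapping to transform a local distribution over body measurements into a global distribution over SMPL shape parameters.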
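83, TITLE: HRNET: AI on Edge for Mask Detection and Social Distancing
AUTHORS: Kinshuk Sengupta ; Praveen Ranjan Srivastava
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: The purpose of this paper is to provide the community with an innovative emerging-technology framework to combat the epidemic.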
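84, TITLE: FMD-cGAN: Fast Motion Deblurring Using Conditional Generative Adversarial Networks
AUTHORS: Jatin Kumar ; Indra Deep Mastan ; Shanmuganathan Raman
CATEGORY: cs.CV [cs.CV, eess.IV, I.4.3; I.4.4]
HIGHLIGHT: In this paper, we present a Fast Motion Deblurring-Conditional Generative Adversarial Network (FMD-cGAN) that helps in blind motion deblurring of a single image.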
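85, TITLE: Unsupervised Domain Adaptation: A Reality Check
AUTHORS: Kevin Musgrave ; Serge Belongie ; Ser-Nam Lim
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we show via large-scale experimentation that 1) in the oracle setting, the difference in accuracy between UDA algorithms is smaller than previously thought, 2) state-of-the-art validation methods are not well-correlated with accuracy, and 3) differences between UDA algorithms are dwarfed by the drop in accuracy caused by validation methods.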
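86, TITLE: Using A GAN to Generate Adversarial Examples to Facial Image Recognition
AUTHORS: Andrew Merrigan ; Alan F. Smeaton
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we use a Generative Adversarial Network (GAN) to create adversarial examples that deceive facial recognition, and we achieve an acceptable success rate in fooling face recognition.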
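87, TITLE: Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup
AUTHORS: Siyuan Li ; Zicheng Liu ; Di Wu ; Zihan Liu ; Stan Z. Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome these limitations, we systematically study the objectives of two sub-tasks and propose Scenario-Agnostic Mixup for both SL and Self-supervised Learning (SSL) scenarios, named SAMix.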
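88, TITLE: GAN-CNMP: An Interactive Generative Drawing Tool
AUTHORS: S. Ece Ada ; M. Yunus Seker ; Pinar Yanardag
CATEGORY: cs.GR [cs.GR, cs.AI, cs.CV, cs.LG, cs.NE]
HIGHLIGHT: In this work, we present a novel framework, GAN-CNMP, that incorporates a novel adversarial loss on CNMP to increase sketch smoothness and consistency.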
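89, TITLE: Trust The Critics: Generatorless and Multipurpose WGANs with Initial Convergence Guarantees
AUTHORS: Tristan Milne ; Étienne Bilocq ; Adrian Nachman
CATEGORY: cs.LG [cs.LG, cs.CV, cs.NE, math.OC, 49Q22, I.3.3; I.4.4; I.4.3]
HIGHLIGHT: Inspired by ideas from optimal transport theory, we present Trust the Critics (TTC), a new algorithm for generative modelling.
90, TITLE: A Highly Effective Low-Rank Compression of Deep Neural Networks with Modified Beam-Search and Modified Stable Rank
AUTHORS: Moonjung Eo ; Suhyun Kang ; Wonjong Rhee
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this work, we propose a low-rank compression method that utilizes a modified beam-search for automatic rank selection and a modified stable rank for compression-friendly training.
91, TITLE: Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis
AUTHORS: Albert Tseng ; Jennifer J. Sun ; Yisong Yue
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: To reduce expert effort, we present AutoSWAP: a framework for automatically synthesizing data-efficient task-level labeling functions.
92, TITLE: LossPlot: A Better Way to Visualize Loss Landscapes
AUTHORS: Robert Bain ; Mikhail Tokarev ; Harsh Kothari ; Rahul Damineni
CATEGORY: cs.LG [cs.LG, cs.CV, cs.HC]
HIGHLIGHT: This work documents our user-driven approach to creating a platform for semi-automating this process.
93, TITLE: Exponentially Tilted Gaussian Prior for Variational Autoencoder
AUTHORS: Griffin Floto ; Stefan Kremer ; Mihai Nica
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: To alleviate this issue, we propose the exponentially tilted Gaussian prior distribution for the Variational Autoencoder (VAE).
94, TITLE: ColibriDoc: An Eye-in-Hand Autonomous Trocar Docking System
AUTHORS: Shervin Dehghani et al.
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: This work aims to reduce the complexity of robotic setup preparation prior to the surgical task and, consequently, to make the system's integration into the clinical workflow more intuitive.
95, TITLE: Improving The Segmentation of Pediatric Low-Grade Gliomas Through Multitask Learning
AUTHORS: Partoo Vafaeikia ; Matthias W. Wagner ; Uri Tabori ; Birgit B. Ertl-Wagner ; Farzad Khalvati
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We developed a segmentation model trained on magnetic resonance imaging (MRI) of pediatric patients with low-grade gliomas (pLGGs) from The Hospital for Sick Children (Toronto, Ontario, Canada).
96, TITLE: Localized Perturbations for Weakly-Supervised Segmentation of Glioma Brain Tumours
AUTHORS: Sajith Rajapaksa ; Farzad Khalvati
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: This work proposes the use of localized perturbations as a weakly-supervised solution to extract segmentation masks of brain tumours from a pretrained 3D classification model.
97, TITLE: Contrastive Learning for Local and Global Learning MRI Reconstruction
AUTHORS: Qiaosi Yi ; Jinhao Liu ; Le Hu ; Faming Fang ; Guixu Zhang
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: To address these issues, we propose a Contrastive Learning for Local and Global Learning MRI Reconstruction Network (CLGNet).
98, TITLE: Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography
AUTHORS: Natália Alves et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this study, state-of-the-art deep learning models were used to develop an automatic framework for PDAC detection, focusing on small lesions.
99, TITLE: Assessment of Data Consistency Through Cascades of Independently Recurrent Inference Machines for Fast and Robust Accelerated MRI Reconstruction
AUTHORS: D. Karkalousos ; S. Noteboom ; H. E. Hulst ; F. M. Vos ; M. W. A. Caan
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG, physics.med-ph]
HIGHLIGHT: This work proposes the Cascades of Independently Recurrent Inference Machines (CIRIM) to assess DC through unrolled optimization, implicitly by gradient descent and explicitly by a designed term.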
1, TITLE: EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs
AUTHORS: Guohao Ying ; Xin He ; Bin Gao ; Bo Han ; Xiaowen Chu
CATEGORY: cs.CV [cs.CV, cs.LG, cs.NE]
HIGHLIGHT: To alleviate the instability issue, we propose an efficient two-stage evolutionary algorithm (EA) based NAS framework to discover GANs, dubbed EAGAN.
2, TITLE: In-Bed Human Pose Estimation from Unseen and Privacy-Preserving Image Domains
AUTHORS: Ting Cao ; Mohammad Ali Armin ; Simon Denman ; Lars Petersson ; David Ahmedt-Aristizabal
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Motivated by the effectiveness of self-supervised methods in learning features directly from data, we propose a multi-modal conditional variational autoencoder (MC-VAE) capable of reconstructing features from missing modalities seen during training.
3, TITLE: Anonymization for Skeleton Action Recognition
AUTHORS: Myeonghyeon Kim ; Zhenyue Qin ; Yang Liu ; Dongwoo Kim
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We propose two variants of anonymization algorithms to protect the potential privacy leakage from the skeleton dataset.
4, TITLE: A Unified Pruning Framework for Vision Transformers
AUTHORS: Hao Yu ; Jianxin Wu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we design a unified framework for structural pruning of both ViTs and their variants, namely UP-ViTs.
5, TITLE: Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding
AUTHORS: Abdullah Hamdi ; Silvio Giancola ; Bernard Ghanem
CATEGORY: cs.CV [cs.CV, cs.LG, 68T45]
HIGHLIGHT: To this end, we introduce the concept of the multi-view point cloud (Voint cloud), representing each 3D point as a set of features extracted from several view-points.
6, TITLE: Pyramid Adversarial Training Improves ViT Performance
AUTHORS: Charles Herrmann et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we present Pyramid Adversarial Training, a simple and effective technique to improve ViT's overall performance.
7, TITLE: Seeking Salient Facial Regions for Cross-Database Micro-Expression Recognition
AUTHORS: Xingxun Jiang ; Yuan Zong ; Wenming Zheng
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: To deal with cross-database micro-expression recognition, we propose a novel domain adaption method called Transfer Group Sparse Regression (TGSR).
8, TITLE: A Dataset-Dispersion Perspective on Reconstruction Versus Recognition in Single-View 3D Reconstruction Networks
AUTHORS: Yefan Zhou ; Yiru Shen ; Yujun Yan ; Chen Feng ; Yaoqing Yang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Thus, we introduce the dispersion score, a new data-driven metric, to quantify this leading factor and study its effect on NNs.
9, TITLE: MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark
AUTHORS: Xiaotian Han et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we provide a large-scale densely-labeled multi-camera tracking dataset in five different environments with the help of an auto-annotation system.
10, TITLE: AirObject: A Temporally Evolving Graph Embedding for Object Identification
AUTHORS: Nikhil Varma Keetha ; Chen Wang ; Yuheng Qiu ; Kuan Xu ; Sebastian Scherer
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this context, we propose a novel temporal 3D object encoding approach, dubbed AirObject, to obtain global keypoint graph-based embeddings of objects.
11, TITLE: NeuSample: Neural Sample Field for Efficient View Synthesis
AUTHORS: Jiemin Fang et al.
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: We perform experiments on Realistic Synthetic 360° and Real Forward-Facing, two popular 3D scene sets, and show that NeuSample achieves better rendering quality than NeRF while enjoying a faster inference speed.
12, TITLE: Robust 3D Garment Digitization from Monocular 2D Images for 3D Virtual Try-On Systems
AUTHORS: Sahib Majithia et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we develop a robust 3D garment digitization solution that can generalize well on real-world fashion catalog images with cloth texture occlusions and large body pose variations.
13, TITLE: HEAT: Holistic Edge Attention Transformer for Structured Reconstruction
AUTHORS: Jiacheng Chen ; Yiming Qian ; Yasutaka Furukawa
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents a novel attention-based neural network for structured reconstruction, which takes a 2D raster image as an input and reconstructs a planar graph depicting an underlying geometric structure.
14, TITLE: Low-light Image Enhancement Via Breaking Down The Darkness
AUTHORS: Qiming Hu ; Xiaojie Guo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To seek results with satisfactory lighting, cleanliness, and realism from degraded inputs, this paper presents a novel framework inspired by the divide-and-rule principle, greatly alleviating the degradation entanglement.
15, TITLE: TridentAdapt: Learning Domain-invariance Via Source-Target Confrontation and Self-induced Cross-domain Augmentation
AUTHORS: Fengyi Shen ; Akhil Gurram ; Ahmet Faruk Tuna ; Onay Urfalioglu ; Alois Knoll
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel trident-like architecture that enforces a shared feature encoder to satisfy confrontational source and target constraints simultaneously, thus learning a domain-invariant feature space.
16, TITLE: DiffSDFSim: Differentiable Rigid-Body Dynamics With Implicit Shapes
AUTHORS: Michael Strecke ; Joerg Stueckler
CATEGORY: cs.CV [cs.CV, cs.GR, cs.LG, cs.RO]
HIGHLIGHT: In this paper, we propose a novel approach to differentiable physics with frictional contacts which represents object shapes implicitly using signed distance fields (SDFs).
17, TITLE: Generative Convolution Layer for Image Generation
AUTHORS: Seung Park ; Yong-Goo Shin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper introduces a novel convolution method, called generative convolution (GConv), which is simple yet effective for improving the generative adversarial network (GAN) performance.
18, TITLE: CRIS: CLIP-Driven Referring Image Segmentation
AUTHORS: Zhaoqing Wang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Inspired by the recent advance in Contrastive Language-Image Pretraining (CLIP), in this paper, we propose an end-to-end CLIP-Driven Referring Image Segmentation framework (CRIS).
19, TITLE: Morph Detection Enhanced By Structured Group Sparsity
AUTHORS: Poorya Aghdaie ; Baaria Chaudhary ; Sobhan Soleymani ; Jeremy Dawson ; Nasser M. Nasrabadi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we consider the challenge of face morphing attacks, which substantially undermine the integrity of face recognition systems such as those adopted for use in border protection agencies.
20, TITLE: CLIP Meets Video Captioners: Attribute-Aware Representation Learning Promotes Accurate Captioning
AUTHORS: Bang Yang ; Yuexian Zou
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Specifically, our empirical study on INP vs. CLIP shows that INP makes it tricky for video caption models to capture attributes' semantics and makes them sensitive to irrelevant background information.
21, TITLE: Equitable Modelling of Brain Imaging By Counterfactual Augmentation with Morphologically Constrained 3D Deep Generative Models
AUTHORS: Guilherme Pombo et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We describe Countersynth, a conditional generative model of diffeomorphic deformations that induce label-driven, biologically plausible changes in volumetric brain images.
22, TITLE: EPose: Let's Make EfficientPose More Generally Applicable
AUTHORS: Austin Lally ; Robert Bain ; Mazen Alotaibi
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper we try to improve on EfficientPose by giving it the ability to infer an object's size, and by simplifying both the data collection and loss calculations.
23, TITLE: FENeRF: Face Editing in Neural Radiance Fields
AUTHORS: Jingxiang Sun et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome these limitations, we propose FENeRF, a 3D-aware generator that can produce view-consistent and locally-editable portrait images.
24, TITLE: How Facial Features Convey Attention in Stationary Environments
AUTHORS: Janelle Domantay
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper aims to extend previous research on distraction detection by analyzing which visual features contribute most to predicting awareness and fatigue.
25, TITLE: Robust Partial-to-Partial Point Cloud Registration in A Full Range
AUTHORS: Liang Pan ; Zhongang Cai ; Ziwei Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose Graph Matching Consensus Network (GMCNet), which estimates pose-invariant correspondences for full-range Partial-to-Partial point cloud Registration (PPR).
26, TITLE: PlantStereo: A Stereo Matching Benchmark for Plant Surface Dense Reconstruction
AUTHORS: Qingyu Wang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we aim to address the issue between datasets and models and propose a large scale stereo dataset with high accuracy disparity ground truth named PlantStereo.
27, TITLE: Shunted Self-Attention Via Multi-Scale Token Aggregation
AUTHORS: Sucheng Ren ; Daquan Zhou ; Shengfeng He ; Jiashi Feng ; Xinchao Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address this issue, we propose a novel and generic strategy, termed shunted self-attention (SSA), that allows ViTs to model the attentions at hybrid scales per attention layer.
28, TITLE: LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies
AUTHORS: Sandro Lombardi et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a novel neural implicit representation for the human body, which is fully differentiable and optimizable with disentangled shape and pose latent spaces.
29, TITLE: Semi-Supervised 3D Hand Shape and Pose Estimation with Label Propagation
AUTHORS: Samira Kaviani ; Amir Rahimi ; Richard Hartley
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle this issue in the context of semi-supervised 3D hand shape and pose estimation, we propose the Pose Alignment network to propagate 3D annotations from labelled frames to nearby unlabelled frames in sparsely annotated videos.
30, TITLE: Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction
AUTHORS: Lingbo Liu et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this work, we focus on a challenging task of land analysis, i.e., automatic extraction of traffic roads from remote sensing data, which has widespread applications in urban development and expansion estimation.
31, TITLE: The MIS Check-Dam Dataset for Object Detection and Instance Segmentation Tasks
AUTHORS: Chintan Tundia ; Rajiv Kumar ; Om Damani ; G. Sivakumar
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce MIS Check-Dam, a new dataset of check-dams from satellite imagery for building an automated system for the detection and mapping of check-dams, focusing on the importance of irrigation structures used for agriculture.
32, TITLE: Zero-Shot Semantic Segmentation Via Spatial and Multi-Scale Aware Visual Class Embedding
AUTHORS: Sungguk Cha ; Yooseung Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we address the fact that L-ZSSS is limited in generalization, which is a virtue of zero-shot learning.
33, TITLE: Hallucinated Neural Radiance Fields in The Wild
AUTHORS: Xingyu Chen et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: To solve this problem, we present an end-to-end framework for constructing a hallucinated NeRF, dubbed as H-NeRF.
34, TITLE: Automatic Tracing of Mandibular Canal Pathways Using Deep Learning
AUTHORS: Mrinal Kanti Dhar ; Zeyun Yu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Here, we propose a deep learning-based framework to detect mandibular canals from CBCT data.
35, TITLE: SamplingAug: On The Importance of Patch Sampling Augmentation for Single Image Super-Resolution
AUTHORS: Shizun Wang et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we present a simple yet effective data augmentation method.
36, TITLE: MapReader: A Computer Vision Pipeline for The Semantic Exploration of Maps at Scale
AUTHORS: Kasra Hosseini ; Daniel C. S. Wilson ; Kaspar Beelen ; Katherine McDonough
CATEGORY: cs.CV [cs.CV, cs.LG, cs.SE]
HIGHLIGHT: We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital).
37, TITLE: The Devil Is in The Margin: Margin-based Label Smoothing for Network Calibration
AUTHORS: Bingyuan Liu ; Ismail Ben Ayed ; Adrian Galdran ; Jose Dolz
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Following our observations, we propose a simple and flexible generalization based on inequality constraints, which imposes a controllable margin on logit distances.
38, TITLE: Human Imperceptible Attacks and Applications to Improve Fairness
AUTHORS: Xinru Hua ; Huanzhong Xu ; Jose Blanchet ; Viet Nguyen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We provide a Distributionally Robust Optimization (DRO) framework which integrates human-based image quality assessment methods to design optimal attacks that are imperceptible to humans but significantly damaging to deep neural networks.
39, TITLE: Reconstruction Student with Attention for Student-Teacher Pyramid Matching
AUTHORS: Shinji Yamada ; Kazuhiro Hotta
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: Here, we propose a powerful method that compensates for the shortcomings of STPM.
40, TITLE: Semi-Local Convolutions for LiDAR Scan Processing
AUTHORS: Larissa T. Triess ; David Peter ; J. Marius Zöllner
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Therefore, we propose semi-local convolution (SLC), a convolution layer with a reduced amount of weight-sharing along the vertical dimension.
41, TITLE: ConDA: Unsupervised Domain Adaptation for LiDAR Segmentation Via Regularized Domain Concatenation
AUTHORS: Lingdong Kong ; Niamul Quader ; Venice Erin Liong
CATEGORY: cs.CV [cs.CV, cs.LG, cs.RO]
HIGHLIGHT: In this work, we improve and extend on this aspect.
42, TITLE: Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes
AUTHORS: Jon Donnelly ; Alina Jade Barnett ; Chaofan Chen
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this paper, we address this shortcoming by proposing a case-based interpretable neural network that provides spatially flexible prototypes, called a deformable prototypical part network (Deformable ProtoPNet).
43, TITLE: AdaViT: Adaptive Vision Transformers for Efficient Image Recognition
AUTHORS: Lingchen Meng et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we argue that due to the large variations among images, their need for modeling long-range dependencies between patches differs.
44, TITLE: 360MonoDepth: High-Resolution 360° Monocular Depth Estimation
AUTHORS: Manuel Rey-Area ; Mingze Yuan ; Christian Richardt
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a flexible framework for monocular depth estimation from high-resolution 360° images using tangent images.
45, TITLE: HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing
AUTHORS: Yuval Alaluf ; Omer Tov ; Ron Mokady ; Rinon Gal ; Amit H. Bermano
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we introduce this approach into the realm of encoder-based inversion.
46, TITLE: ZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds
AUTHORS: Georg Bökman ; Fredrik Kahl ; Axel Flinth
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we are concerned with rotation equivariance on 2D point cloud data.
47, TITLE: ATS: Adaptive Token Sampling For Efficient Vision Transformers
AUTHORS: Mohsen Fayyaz et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we, therefore, introduce a differentiable parameter-free Adaptive Token Sampling (ATS) module, which can be plugged into any existing vision transformer architecture.
48, TITLE: Spatio-Temporal Multi-Flow Network for Video Frame Interpolation
AUTHORS: Duolikun Danier ; Fan Zhang ; David Bull
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this context, we present a novel deep learning based VFI method, ST-MFNet, based on a Spatio-Temporal Multi-Flow architecture.
49, TITLE: Automated Damage Inspection of Power Transmission Towers from UAV Images
AUTHORS: Aleixo Cambeiro Barreiro ; Clemens Seibold ; Anna Hilsmann ; Peter Eisert
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Our main contributions are the development of a system for damage detection on remotely acquired drone images, applying techniques to overcome the issue of data scarcity and ambiguity, as well as the evaluation of the viability of such an approach to solve this particular problem.
50, TITLE: MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning
AUTHORS: Sara Atito ; Muhammad Awais ; Ammarah Farooq ; Zhenhua Feng ; Josef Kittler
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: This study aims to investigate the possibility of modelling all the concepts present in an image without using labels.
51, TITLE: Large-Scale Video Analytics Through Object-Level Consolidation
AUTHORS: Daniel Rivas ; Francesc Guim ; Jordà Polo ; David Carrera
CATEGORY: cs.CV [cs.CV, cs.NI]
HIGHLIGHT: In this paper, we present FoMO (Focus on Moving Objects).
52, TITLE: Point Cloud Instance Segmentation with Semi-supervised Bounding-Box Mining
AUTHORS: Yongbin Liao et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we introduce the first semi-supervised point cloud instance segmentation framework (SPIB) using both labeled and unlabelled bounding boxes as supervision.
53, TITLE: Affect-DML: Context-Aware One-Shot Recognition of Human Affect Using Deep Metric Learning
AUTHORS: Kunyu Peng et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we conceptualize one-shot recognition of emotions in context, a new problem aimed at recognizing human affect states at a finer level of granularity from a single support sample.
54, TITLE: Image Denoising By Super Neurons: Why Go Deep?
AUTHORS: Junaid Malik ; Serkan Kiranyaz ; Moncef Gabbouj
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: As the integration of non-local information is known to benefit denoising, in this work we investigate the use of super neurons for both synthetic and real-world image denoising.
55, TITLE: AssistSR: Affordance-centric Question-driven Video Segment Retrieval
AUTHORS: Stan Weixian Lei ; Yuxuan Wang ; Dongxing Mao ; Difei Gao ; Mike Zheng Shou
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In contrast, we present a new task called Affordance-centric Question-driven Video Segment Retrieval (AQVSR).
56, TITLE: Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning
AUTHORS: Hanbyel Cho ; Yooshin Cho ; Jaemyung Yu ; Junmo Kim
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a simple yet effective model for 3D human pose estimation in video that can quickly adapt to any distortion environment by utilizing MAML, a representative optimization-based meta-learning algorithm.
57, TITLE: NeeDrop: Self-supervised Shape Representation from Sparse Point Clouds Using Needle Dropping
AUTHORS: Alexandre Boulch ; Pierre-Alain Langlois ; Gilles Puy ; Renaud Marlet
CATEGORY: cs.CV [cs.CV, cs.CG, cs.LG]
HIGHLIGHT: In contrast, we introduce NeeDrop, a self-supervised method for learning shape representations from possibly extremely sparse point clouds.
58, TITLE: Revisiting Temporal Alignment for Video Restoration
AUTHORS: Kun Zhou ; Wenbo Li ; Liying Lu ; Xiaoguang Han ; Jiangbo Lu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we present a novel, generic iterative alignment module which employs a gradual refinement scheme for sub-alignments, yielding more accurate motion compensation.
59, TITLE: Two-stage Temporal Modelling Framework for Video-based Depression Recognition Using Graph Representation
AUTHORS: Jiaqi Xu ; Siyang Song ; Keerthy Kusumam ; Hatice Gunes ; Michel Valstar
CATEGORY: cs.CV [cs.CV, 68T40, I.2.1]
HIGHLIGHT: In this sense, we propose a two-stage framework that models depression severity from multi-scale short-term and video-level facial behaviours.
60, TITLE: Adaptive Gating for Single-Photon 3D Imaging
AUTHORS: Ryan Po ; Adithya Pediredla ; Ioannis Gkioulekas
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose an adaptive gating scheme built upon Thompson sampling.
61, TITLE: NeRFReN: Neural Radiance Fields with Reflections
AUTHORS: Yuan-Chen Guo ; Di Kang ; Linchao Bao ; Yu He ; Song-Hai Zhang
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: Specifically, we propose to split a scene into transmitted and reflected components, and model the two components with separate neural radiance fields.
62, TITLE: DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation
AUTHORS: Lukas Hoyer ; Dengxin Dai ; Luc Van Gool
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: As acquiring pixel-wise annotations of real-world images for semantic segmentation is a costly process, a model can instead be trained with more accessible synthetic data and adapted to real images without requiring their annotations.
63, TITLE: EdiBERT, A Generative Model for Image Editing
AUTHORS: Thibaut Issenhuth ; Ugo Tanielian ; Jérémie Mary ; David Picard
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we aim at making a step towards a unified approach for image editing.
64, TITLE: MultiPath++: Efficient Information Fusion and Trajectory Aggregation for Behavior Prediction
AUTHORS: Balakrishnan Varadarajan et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, cs.RO]
HIGHLIGHT: In this paper, we present MultiPath++, a future prediction model that achieves state-of-the-art performance on popular benchmarks.
65, TITLE: Multi-modal Text Recognition Networks: Interactive Enhancements Between Visual and Semantic Features
AUTHORS: Byeonghu Na ; Yoonsik Kim ; Sungrae Park
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper introduces a novel method, called Multi-modAl Text Recognition Network (MATRN), that enables interactions between visual and semantic features for better recognition performance.
66, TITLE: Unsupervised Domain Generalization for Person Re-identification: A Domain-specific Adaptive Framework
AUTHORS: Lei Qi ; Lei Wang ; Yinghuan Shi ; Xin Geng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we turn to investigate unsupervised domain generalization for ReID, by assuming that no label is available for any source domains.
67, TITLE: Regularized Directional Representations for Medical Image Registration
AUTHORS: Vincent Jaouen ; Pierre-Henri Conze ; Guillaume Dardenne ; Julien Bert ; Dimitris Visvikis
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Following this research path, we propose a new method for mono- and multimodal image registration based on the alignment of regularized vector fields derived from structural information such as gradient vector flow fields, a technique we call vector field similarity.
68, TITLE: ESL: Event-based Structured Light
AUTHORS: Manasi Muglikar ; Guillermo Gallego ; Davide Scaramuzza
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel structured-light system using an event camera to tackle the problem of accurate and high-speed depth sensing.
69, TITLE: SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches
AUTHORS: Yu Zeng ; Zhe Lin ; Vishal M. Patel
CATEGORY: cs.CV [cs.CV, cs.MM]
HIGHLIGHT: To this end, we investigate a new paradigm of sketch-based image manipulation: mask-free local image manipulation, which only requires sketch inputs from users and utilizes the entire original image.
70, TITLE: RADU: Ray-Aligned Depth Update Convolutions for ToF Data Denoising
AUTHORS: Michael Schelling ; Pedro Hermosilla ; Timo Ropinski
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose an iterative denoising approach operating in 3D space, that is designed to learn on 2.5D data by enabling 3D point convolutions to correct the points' positions along the view direction.
71, TITLE: Learning Multiple Dense Prediction Tasks from Partially Annotated Data
AUTHORS: Wei-Hong Li ; Xialei Liu ; Hakan Bilen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a label-efficient approach and look at jointly learning multiple dense prediction tasks on partially annotated data, which we call multi-task partially-supervised learning.
72, TITLE: Hole-robust Wireframe Detection
AUTHORS: Naejin Kong ; Kiwoong Park ; Harshith Goka
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We show qualitatively and quantitatively that our approach significantly outperforms previous works, which are unable to handle holes, and also improves ordinary detection when no holes are present.
73, TITLE: Leveraging The Topological Consistencies of Learning in Deep Neural Networks
AUTHORS: Stuart Synakowski ; Fabian Benitez-Quiroz ; Aleix M. Martinez
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we define a new class of topological features that accurately characterize the progress of learning while being quick to compute during running time.
74, TITLE: ARTSeg: Employing Attention for Thermal Images Semantic Segmentation
AUTHORS: Farzeen Munir ; Shoaib Azam ; Unse Fatima ; Moongu Jeon
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this work, we have employed the thermal camera for semantic segmentation.
75, TITLE: Neural Attention for Image Captioning: Review of Outstanding Methods
AUTHORS: Zanyar Zohourianshahzadi ; Jugal K. Kalita
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this survey, we provide a review of literature related to attentive deep learning models for image captioning.
76, TITLE: A Face Recognition System's Worst Morph Nightmare, Theoretically
AUTHORS: Una M. Kelly ; Raymond Veldhuis ; Luuk Spreeuwers
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a method to create a third, different type of morph, that has the advantage of being easier to train.
77, TITLE: Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection
AUTHORS: Deepti Hegde ; Vishal Patel
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors that uses class prototypes to mitigate the effect of pseudo-label noise.
78, TITLE: CT-block: A Novel Local and Global Features Extractor for Point Cloud
AUTHORS: Shangwei Guo ; Jun Li ; Zhengchao Lai ; Xiantong Meng ; Shaokun Han
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel module, named CT-block, that can simultaneously extract and fuse local and global features.
79, TITLE: Hyperspectral Image Segmentation Based on Graph Processing Over Multilayer Networks
AUTHORS: Songyang Zhang ; Qinwen Deng ; Zhi Ding
CATEGORY: cs.CV [cs.CV, eess.SP]
HIGHLIGHT: Leveraging the recently developed graph signal processing over multilayer networks (M-GSP), this work proposes several approaches to HSI segmentation based on M-GSP feature extraction.
80, TITLE: PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images
AUTHORS: Stefano Zorzi ; Shabab Bazrafkan ; Stefan Habenschuss ; Friedrich Fraundorfer
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper introduces PolyWorld, a neural network that directly extracts building vertices from an image and connects them correctly to create precise polygons.
81, TITLE: Diffusion Autoencoders: Toward A Meaningful and Decodable Representation
AUTHORS: Konpat Preechakul ; Nattanat Chatthee ; Suttisak Wizadwongsa ; Supasorn Suwajanakorn
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Our key idea is to use a learnable encoder for discovering the high-level semantics, and a DPM as the decoder for modeling the remaining stochastic variations.
82, TITLE: Probabilistic Estimation of 3D Human Shape and Pose with A Semantic Local Parametric Model
AUTHORS: Akash Sengupta ; Ignas Budvytis ; Roberto Cipolla
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In contrast, we present a method that (i) predicts distributions over local body shape in the form of semantic body measurements and (ii) uses a linear mapping to transform a local distribution over body measurements to a global distribution over SMPL shape parameters.
83, TITLE: HRNET: AI on Edge for Mask Detection and Social Distancing
AUTHORS: Kinshuk Sengupta ; Praveen Ranjan Srivastava
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: The purpose of the paper is to provide an innovative emerging-technology framework for communities to combat epidemic situations.
84, TITLE: FMD-cGAN: Fast Motion Deblurring Using Conditional Generative Adversarial Networks
AUTHORS: Jatin Kumar ; Indra Deep Mastan ; Shanmuganathan Raman
CATEGORY: cs.CV [cs.CV, eess.IV, I.4.3; I.4.4]
HIGHLIGHT: In this paper, we present a Fast Motion Deblurring-Conditional Generative Adversarial Network (FMD-cGAN) that helps in blind motion deblurring of a single image.
85, TITLE: Unsupervised Domain Adaptation: A Reality Check
AUTHORS: Kevin Musgrave ; Serge Belongie ; Ser-Nam Lim
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we show via large-scale experimentation that 1) in the oracle setting, the difference in accuracy between UDA algorithms is smaller than previously thought, 2) state-of-the-art validation methods are not well-correlated with accuracy, and 3) differences between UDA algorithms are dwarfed by the drop in accuracy caused by validation methods.
86, TITLE: Using A GAN to Generate Adversarial Examples to Facial Image Recognition
AUTHORS: Andrew Merrigan ; Alan F. Smeaton
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we use a Generative Adversarial Network (GAN) to create adversarial examples that deceive facial recognition, and we achieve an acceptable success rate in fooling the face recognition system.
87, TITLE: Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup
AUTHORS: Siyuan Li ; Zicheng Liu ; Di Wu ; Zihan Liu ; Stan Z. Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome such limitations, we systematically study the objectives of two sub-tasks and propose Scenario-Agnostic Mixup for both Supervised Learning (SL) and Self-supervised Learning (SSL) scenarios, named SAMix.
88, TITLE: GAN-CNMP: An Interactive Generative Drawing Tool
AUTHORS: S. Ece Ada ; M. Yunus Seker ; Pinar Yanardag
CATEGORY: cs.GR [cs.GR, cs.AI, cs.CV, cs.LG, cs.NE]
HIGHLIGHT: In this work, we propose a new framework, GAN-CNMP, that incorporates a novel adversarial loss on CNMP to increase sketch smoothness and consistency.
89, TITLE: Trust The Critics: Generatorless and Multipurpose WGANs with Initial Convergence Guarantees
AUTHORS: Tristan Milne ; Étienne Bilocq ; Adrian Nachman
CATEGORY: cs.LG [cs.LG, cs.CV, cs.NE, math.OC, 49Q22, I.3.3; I.4.4; I.4.3]
HIGHLIGHT: Inspired by ideas from optimal transport theory we present Trust the Critics (TTC), a new algorithm for generative modelling.
90, TITLE: A Highly Effective Low-Rank Compression of Deep Neural Networks with Modified Beam-Search and Modified Stable Rank
AUTHORS: Moonjung Eo ; Suhyun Kang ; Wonjong Rhee
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this work, we propose a low-rank compression method that utilizes a modified beam-search for an automatic rank selection and a modified stable rank for a compression-friendly training.
91, TITLE: Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis
AUTHORS: Albert Tseng ; Jennifer J. Sun ; Yisong Yue
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: To reduce expert effort, we present AutoSWAP: a framework for automatically synthesizing data-efficient task-level labeling functions.
92, TITLE: LossPlot: A Better Way to Visualize Loss Landscapes
AUTHORS: Robert Bain ; Mikhail Tokarev ; Harsh Kothari ; Rahul Damineni
CATEGORY: cs.LG [cs.LG, cs.CV, cs.HC]
HIGHLIGHT: This work documents our user-driven approach to create a platform for semi-automating this process.
93, TITLE: Exponentially Tilted Gaussian Prior for Variational Autoencoder
AUTHORS: Griffin Floto ; Stefan Kremer ; Mihai Nica
CATEGORY: cs.LG [cs.LG, cs.CV, stat.ML]
HIGHLIGHT: To alleviate this issue, we propose the exponentially tilted Gaussian prior distribution for the Variational Autoencoder (VAE).
94, TITLE: ColibriDoc: An Eye-in-Hand Autonomous Trocar Docking System
AUTHORS: Shervin Dehghani et al.
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: The aim of this work is to reduce the complexity of robotic setup preparation prior to the surgical task and therefore, increase the intuitiveness of the system integration into the clinical workflow.
95, TITLE: Improving The Segmentation of Pediatric Low-Grade Gliomas Through Multitask Learning
AUTHORS: Partoo Vafaeikia ; Matthias W. Wagner ; Uri Tabori ; Birgit B. Ertl-Wagner ; Farzad Khalvati
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We developed a segmentation model trained on magnetic resonance imaging (MRI) of pediatric patients with low-grade gliomas (pLGGs) from The Hospital for Sick Children (Toronto, Ontario, Canada).
96, TITLE: Localized Perturbations For Weakly-Supervised Segmentation of Glioma Brain Tumours
AUTHORS: Sajith Rajapaksa ; Farzad Khalvati
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: This work proposes the use of localized perturbations as a weakly-supervised solution to extract segmentation masks of brain tumours from a pretrained 3D classification model.
97, TITLE: Contrastive Learning for Local and Global Learning MRI Reconstruction
AUTHORS: Qiaosi Yi ; Jinhao Liu ; Le Hu ; Faming Fang ; Guixu Zhang
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: To address these problems, we propose a Contrastive Learning for Local and Global Learning MRI Reconstruction Network (CLGNet).
98, TITLE: Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography
AUTHORS: Natália Alves et al.
CATEGORY: eess.IV [eess.IV, cs.CV, q-bio.QM]
HIGHLIGHT: In this study, state-of-the-art deep learning models were used to develop an automatic framework for PDAC detection, focusing on small lesions.
99, TITLE: Assessment of Data Consistency Through Cascades of Independently Recurrent Inference Machines for Fast and Robust Accelerated MRI Reconstruction
AUTHORS: D. Karkalousos ; S. Noteboom ; H. E. Hulst ; F. M. Vos ; M. W. A. Caan
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG, physics.med-ph]
HIGHLIGHT: This work proposes the Cascades of Independently Recurrent Inference Machines (CIRIM) to assess DC through unrolled optimization, implicitly by gradient descent and explicitly by a designed term.