Hello fellow grinders, I'm rumor.
The ACL22 paper list is finally, finally, FINALLY out!
https://www.2022.aclweb.org/papers
This year ACL adopted the ARR (ACL Rolling Review) mechanism, which no doubt delighted some and dismayed others. The new rules were well-intentioned for both authors and reviewers: authors get extra rounds of revision, and reviewers avoid duplicated work. But there's one unavoidable problem: most people procrastinate until the deadline, so ARR submissions exploded right before this ACL. Many reviewers reported being utterly swamped, which means some papers may not have been scrutinized in depth. Zhiyuan Liu also noted on Zhihu that reviews written a few months before the deadline were of noticeably higher quality:
So... we'll definitely fix our procrastination next time, right?
Enough of this painful topic. Let's hold our annual Weird Title Awards! (The ACL21 edition is here.)
Fair warning: this award show reflects strongly subjective tastes. If you feel offended, add me on WeChat at leerumorrrr and I'll apologize in person with a red packet.
Your Model Name Is So Pretty
According to my increasingly unreliable memory, a high-impact work very likely comes with a catchy model name: ideally two syllables, like BERT, T5, ERNIE, or MASS; or a three-syllable acronym, like GPT or WWM; or something with an easy association, like RoBERTa, ALBERT, or UNILM. Hence today's most common title formula: name + colon + one-line method description. Let's see this year's variations.
Tier 1: The Formula Followers
First up, the BERT family:
AlephBERT: Language Model Pre-training and Evaluation from Sub-Word to Sentence Level
KinyaBERT: a Morphology-aware Kinyarwanda Language Model
LinkBERT: Pretraining Language Models with Document Links
Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection
RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining
SkipBERT: Efficient Inference with Shallow Layer Skipping
XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding
Dict-BERT: Enhancing Language Model Pre-training with Dictionary
bert2BERT: Towards Reusable Pretrained Language Models
Besides BERT, LM is another high-frequency suffix:
Prix-LM: Pretraining for Multilingual Knowledge Base Construction
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
HOLM: Hallucinating Objects with Language Models for Referring Expression Recognition in Partially-Observed Scenes
MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding
MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
CoCoLM: Complex Commonsense Enhanced Language Model with Discourse Relations
Tier 2: Very Memorable
This tier clearly tried every possible combination of the words in the title; you can tell the authors put real effort in.
For instance, everyday words:
ABC: Attention with Bounded-memory Control
PPT: Pre-trained Prompt Tuning for Few-shot Learning
Then there's the food family:
CAKE: A Scalable Commonsense-Aware Framework For Multi-View Knowledge Graph Completion
EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multi-lingual Neural Machine Translation
BBQ: A hand-built bias benchmark for question answering
And some with reduplicated syllables:
ConTinTin: Continual Learning from Task Instructions
MIMICause: Representation and automatic extraction of causal relation types from clinical notes
CoCoLM: Complex Commonsense Enhanced Language Model with Discourse Relations
Tier 3: If You Can Remember These, I Lose
Then there's a class of names I just cannot make sense of. Maybe there's an in-joke I failed to get, and the authors surely tried their best, but I genuinely cannot pronounce these smoothly. If you can remember these names, that must be true love.
OIE@OIA: an Adaptable and Efficient Open Information Extraction Framework
CipherDAug: Ciphertext based Data Augmentation for Neural Machine Translation
SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures
WatClaimCheck: A new Dataset for Claim Entailment and Inference
LM-BFF-MS: Improving Few-Shot Fine-tuning of Language Models based on Multiple Soft Demonstration Memory
WLASL-LEX: a Dataset for Recognising Phonological Properties in Sign Language
SyMCoM - Syntactic Measure of Code Mixing A Study Of English-Hindi Code-Mixing
BiSyn-GAT+: Bi-Syntax Aware Graph Attention Network for Aspect-based Sentiment Analysis
Remember the names above; there will be a quiz shortly.
Your Question Strikes Straight at My Heart
If a fancy method name isn't striking enough, then you must ask a thought-provoking question: plant the suspense first to guarantee the click-through rate, then win readers over step by step with the content. For example:
Does BERT really agree? Fine-grained Analysis of Lexical Dependence on a Syntactic Task
What Makes Reading Comprehension Questions Difficult?
Your Answer is Incorrect... Would you like to know why? Introducing a Bilingual Short Answer Feedback Dataset
When did you become so smart, oh wise one?! Sarcasm Explanation in Multi-modal Multi-party Dialogues
Good Night at 4 pm?! Time Expressions in Different Cultures
Hey AI, Can You Solve Complex Tasks by Talking to Agents?
Why don’t people use character-level machine translation?
To be or not to be an Integer? Encoding Variables for Mathematical Text
Relevant CommonSense Subgraphs for What if... Procedural Reasoning
Question marks, exclamation points, ellipses: these authors are not only first-rate technically, they have also mastered English sentence patterns inside and out. I'm genuinely afraid they'll churn out a poem next.
Ah!!!
These titles clearly know the tricks of the self-media trade: open with a profound musing or a grabby conclusion to draw readers in, then slowly unfold:
So Different Yet So Alike! Constrained Unsupervised Text Style Transfer
That Is a Suspicious Reaction!: Interpreting Logits Variation to Detect NLP Adversarial Attacks
That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory
One Agent To Rule Them All: Towards Multi-agent Conversational AI
First the Worst: Finding Better Gender Translations During Beam Search
My Personal Favorites
Time once again to pick my personal favorites; this time I actually picked three. Getting through the 55-page title list was no small feat, but these three still grabbed hold of me.
Dim Wihl Gat Tun: The Case for Linguistic Expertise in NLP for Under-Documented Languages
Honestly, I have no idea what Dim Wihl Gat Tun means; even Google doesn't know:
ECO v1: Towards Event-Centric Opinion Mining
This brave author has dug a "pit" in front of every NLPer. Remember to publish ECO v2 at ACL23 next year, dear.
Human Language Modeling
This paper hasn't been released yet, but the title hooked me instantly, and even triggered some self-doubt: if you're doing human language modeling, then what on earth have I been doing??? Once it's out, I'm definitely reading it to see what the story is.
Don't Hit Me
And with that, the ACL22 Title Awards come to a successful close. These were just one person's comments on fun titles, not on the papers' content.
Sincere congratulations to every author, and thank you for exploring the boundaries of NLP. Best wishes to everyone in front of the screen, and to myself, to keep working hard. Hope to see your titles at ACL23 next year!!!!!!
I'm rumor, a punk yet geeky AI algorithm gal.
BS & MS from Beihang University, NLP algorithm engineer, Google Developer Expert.
Follow me, and I'll take you along as we learn and grind together.
Let's spin, jump, and blink our way through the AI era.