What we (I) do on the surface, what we (I) do by heart

Emerging trends: A tribute to Charles Wayne

— Kenneth Ward Church, 2017, Natural Language Engineering


Charles Wayne restarted funding in speech and language in the mid-1980s, after a funding winter brought on by Pierce’s criticisms of “glamour and deceit” in the ALPAC report and Whither Speech Recognition. Wayne introduced a new glamour-and-deceit-proof idea: an emphasis on evaluation (the kind of evaluation that would make sponsors happy). These days shared tasks and leaderboards have become commonplace. Wayne did much more than merely run competitions, but he did it in a subtle, Columbo-like way.

Those of us with research to sell need to find more ways to be relevant to potential sponsors in this new world order. The technology has progressed to the point where industry is prepared to take the lead and to invest in speech and language services and products at levels that go well beyond what typical government grants can support. That said, industrial investments tend to be more product-focused, with less discretion for speculative long-term research.


Mark Liberman has given a number of talks on related topics (MT, speech, AI). He uses the term “common task” for the scheme in which a shared task is released and then evaluated. This scheme sustains a self-correcting, self-regulating community, because participants must reveal their methods to one another when their evaluation results are presented. He ends his discussion of the events leading up to the funding winter with a slide titled “Tell us what you really think, John”. Many Piercian engineers were skeptical: “you can’t turn water into gasoline, no matter what you measure.”

  • “Common Task” Structure
    – A detailed task definition and eval plan developed in consultation with researchers and published as the first step in the project.
    – Auto eval software written and maintained by NIST and published at the start of the project.
    – Shared data: training and dev data are published at the start of the project; the eval test data is withheld for periodic public evaluations.
  • A less obvious benefit of this structure is that it enabled hill climbing.
    – because the eval metrics were automatic
    – and the eval code was public
  • An even less obvious benefit, according to Liberman, was the culture.
    – because researchers share methods and results on shared data with a common metric
    – participation in this culture became so valuable that many research groups joined without funding
  • The common task method created a positive feedback loop
    – When everyone’s program has to interpret the same ambiguous evidence, ambiguity resolution becomes a sort of gambling game, which rewards the use of statistical methods, and has led to the flowering of ML.
    – Given the nature of speech and language, statistical methods need the largest possible training set, which reinforces the value of shared data.
    – Iterated train-and-test cycles on this gambling game are addictive; they create “simple, clear, sure knowledge”, which motivates participation in the common-task culture.
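
The structure listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NIST's scoring software: the class `CommonTask`, its method names, and the toy data format are hypothetical, and word error rate is used simply as one familiar example of an automatic metric.

```python
from typing import Callable, List, Tuple

# Hypothetical toy format: (system input, reference transcript).
Example = Tuple[str, str]


def word_edit_distance(reference: str, hypothesis: str) -> int:
    """Minimum number of word insertions, deletions and substitutions."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = distance between the reference words read so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]


def word_error_rate(pairs: List[Tuple[str, str]]) -> float:
    """Total word errors divided by total reference words."""
    errors = sum(word_edit_distance(ref, hyp) for ref, hyp in pairs)
    words = sum(len(ref.split()) for ref, _ in pairs)
    return errors / max(words, 1)


class CommonTask:
    """Train and dev data are public; the test set stays with the organizer."""

    def __init__(self, train: List[Example], dev: List[Example],
                 test: List[Example]) -> None:
        self.train = train   # published at the start of the project
        self.dev = dev       # published, so anyone can hill-climb on it
        self._test = test    # withheld for the periodic evaluations

    @staticmethod
    def _score(system: Callable[[str], str], data: List[Example]) -> float:
        return word_error_rate([(ref, system(x)) for x, ref in data])

    def dev_score(self, system: Callable[[str], str]) -> float:
        """The public metric: participants can run this as often as they like."""
        return self._score(system, self.dev)

    def official_eval(self, system: Callable[[str], str]) -> float:
        """Run by the organizer only, on the withheld test data."""
        return self._score(system, self._test)
```

Because `dev_score` is automatic and public, a participant can call it after every small change to a system, which is the hill climbing described above; only `official_eval` touches the withheld data.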
     

Wayne’s idea helped sell the research program to potential sponsors. … and researchers who had objected to being tested twice a year began testing themselves every hour … The “Common Task Method” has become the standard research paradigm in experimental computational science.

  • The General Experience
    – Error rates decline by a fixed percentage each year, to an asymptote depending on task and data quality.
    – Progress usually comes from many small improvements (with only occasional larger jumps); improvement by 1% is a reason to break out the champagne.
    – Shared data plays a crucial role – and is re-used in unexpected ways.
    – Glamour and deceit have been mostly avoided.
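
One way to read the first point above is as a geometric decay toward a floor. The sketch below assumes that the gap between the current error rate and the asymptote shrinks by the same fraction every year; that reading, the function name `projected_error`, and the numbers in the usage line are all illustrative assumptions, not figures from Liberman's talk.

```python
def projected_error(initial: float, asymptote: float,
                    yearly_reduction: float, years: int) -> float:
    """Error rate after `years`, assuming the gap above the asymptote
    shrinks by `yearly_reduction` (a fraction between 0 and 1) each year."""
    gap = initial - asymptote
    return asymptote + gap * (1.0 - yearly_reduction) ** years


# Illustrative numbers only: start at 40% error, floor at 10%,
# and close 20% of the remaining gap each year.
for year in range(0, 11, 2):
    print(year, round(projected_error(40.0, 10.0, 0.20, year), 2))
```

Under this reading the absolute gain shrinks every year, which is consistent with the remark that a 1% improvement is worth celebrating.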
     

[Liberman’s Talk 2015]


As a practical matter, the sponsor has considerable leverage to guide our research. More generally, those of us with research to sell need to find more ways to be relevant to potential sponsors, in a new world order where government and enterprise markets have been eclipsed by consumer markets.

On one side, researchers are sellers. They earn a living by selling what they produce, so they need to find arguments that work with the buyers. On the other side, some of them care passionately about something beyond that. They have to make sure they do not stray too far from the scientific path when pursuing practical needs (i.e. products and grants), so that they save part of their extraordinary minds for digging out the dirty things and refreshing the world order.
