Explainability & Reviewing: The responsibility ultimately falls back on the audience, i.e. EVERYONE

Emerging trends: I did it, I did it, I did it, but …

— Kenneth Ward Church, 2017, Natural Language Engineering


A call for Explanation — Insights should be valued much more than numbers

It is considered a feature that ML has become so powerful (and so opaque) that it is no longer necessary (or even relevant) to talk about how it works. [Church & Hestness 2019 A Survey of 25 Years of Evaluation]

Does it make it ok for machines to do bad things if no one knows what's happening and why, including those of us who created the machines?

There has been a trend for publications to report better and better numbers, but less and less insight.

Years ago, someone from an industrial lab presented a talk at a conference saying, basically, "I did it, I did it, I did it, but I'll be damned if I'll tell you how!" I had a strong allergic reaction to this talk because I was worried that my employer might ask me to publish similar papers so they could take credit for my results while protecting their intellectual property as trade secrets. Since then, I have often argued that we need to reject papers that try to pull this kind of stunt. We can't afford papers that report results without insights.

It reminds me of [Bowman 2021 What will it take to fix benchmarking in NLU]: "Ultimately, the community needs to compare the cost of making serious investments in better benchmarks to the cost of wasting researcher time and computational resources due to our inability to measure progress." If insight-free papers with better numbers keep being chosen for publication, then progress could indeed be "hard to gauge".

O’Neil argues in Weapons of Math Destruction that big data increases inequality and threatens democracy largely because of opacity. Numbers offer the sheen of objectivity; algorithms seem to "transcend morality", as O’Neil puts it.

No one misses the old days: it is no longer necessary to think about degrees of freedom as we did in the bad old days, when we used to worry about feature selection. It used to be considered necessary and desirable to have more observations than parameters, but these days, with modern neural nets, it is no longer considered necessary to worry about such details.
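The old worry is easy to demonstrate with a toy example (mine, not from the article): when a model has more parameters than observations, the system is underdetermined, so a "perfect" training fit is guaranteed and tells us nothing. A minimal numpy sketch, fitting 10 polynomial coefficients to 4 data points:

```python
import numpy as np

# Toy illustration: more parameters (10) than observations (4).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 0.0, 1.0])

# Vandermonde design matrix: 4 rows (observations) x 10 columns (parameters).
X = np.vander(x, N=10, increasing=True)

# The system is underdetermined, so lstsq returns the minimum-norm
# solution and the fit passes through every training point exactly.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
train_error = np.max(np.abs(X @ coef - y))
print(train_error)  # effectively zero: a perfect fit is automatic, not informative
```

Zero training error here is a property of the dimensions alone, which is exactly why the classical advice demanded more observations than parameters.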

How could ML work if there really are more degrees of freedom than observations? Modern optimizations are so complicated that it is hard to address traditional questions like degrees of freedom, the significance of each parameter, and ANOVA. Such questions were well understood for simple optimization methods such as regression, but the literature has less to say about them for more modern optimization methods, though there are a few suggestions, such as [LeCun 1989 Optimal Brain Damage], which proposes ways to measure the "significance" of a parameter in a network, or a network's "information content", in order to move beyond the notion that "complexity = the number of free parameters".
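Optimal Brain Damage estimates the significance (saliency) of each weight as the second-order approximation of the loss increase from deleting it, s_k = ½ H_kk w_k², where H_kk is a diagonal entry of the Hessian. A minimal sketch of that idea, using linear regression (my choice, for simplicity) where the Hessian of the squared loss is available in closed form as XᵀX:

```python
import numpy as np

# OBD-style saliency on a linear model: for L(w) = 0.5 * ||Xw - y||^2,
# the Hessian is X.T @ X, so its diagonal is the column-wise sum of squares.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
true_w = np.array([2.0, 0.01, -1.5, 0.8, 3.0])  # index 1 barely matters
y = X @ true_w

# Fit by least squares.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# OBD saliency per weight: s_k = 0.5 * H_kk * w_k^2.
H_diag = np.sum(X**2, axis=0)        # diagonal of X.T @ X
saliency = 0.5 * H_diag * w**2

# The least "significant" parameter is the one cheapest to delete.
least = int(np.argmin(saliency))
print(least)  # index 1: the near-zero coefficient
```

Ranking parameters this way is exactly the kind of move beyond "complexity = number of free parameters" that the paragraph above points to: two nets with the same parameter count can carry very different amounts of information.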

Neural nets are great for many tasks, but they haven’t yet automated researchers out of a job. Research is harder than just pushing a button and waiting for the optimization to converge on a publishable result.


I worry that the literature may be turning into a giant leaderboard.

As reviewing burdens continue to become more and more onerous, reviewers are looking for easier and easier ways to discharge their responsibility. Leaderboards provide a useful service by helping the audience figure out how a proposed solution stacks up against the competition, but that should be just a starting point to motivate a more interesting discussion of why the proposed solution works as well as it does.

I prefer to believe that cheating is rarely caught because there is so little to catch. In any case, there are lots of standard tricks that we have all seen way too often: weak baselines, mindless metrics, lack of transparency, etc.

It is nice to see the field come together as it has, but we may have been too successful.


Final Words: Whatever you measure, you get

The Writer, The Reviewer and the Audience are the same group of people

The work tends to be better when authors advocate positions they care passionately about for reasons that go beyond personal gain. That may not always be the case, at least in the short term. But at the end of the day, the ultimate satisfaction comes from meeting (and exceeding) audience expectations. The audience must demand more than merely good numbers. It is the responsibility of the audience to expect insights as well as good numbers, and to vote early and often with citations.
