Designing and Interpreting Probes with Control Tasks [[Hewitt & Liang 2019](https://arxiv.org/abs/1909.03368)]
tl;dr: A good probe should be selective, achieving high linguistic task acc and low control task acc.
This is probably the most typical paper arguing for simple probes; another paper mentioned in the text takes the same view: [Alain and Bengio: Understanding intermediate layers using linear classifier probes]: the task of a deep neural network classifier is to come up with a representation at the final layer that can be easily fed to a linear classifier (i.e., the most elementary form of useful classifier).
Motivation for control tasks
- Favor ‘ease of extraction’
- Discourage probes that learn the task by themselves
As long as a representation is a lossless encoding, a sufficiently expressive probe with enough training data can learn any task on top of it. Such expressive probes lack the discriminative power to measure the goodness of a representation. This contrasts with [Pimentel 2020], which argues there is no difference between the probe learning the task and the representation encoding the information.
Desiderata for control tasks
- Control tasks have the same input and output space as a linguistic task (e.g., POS tagging) but can only be learned if the probe memorizes the mapping, since the inputs are no longer predictive of their labels. This corresponds to the label-shuffled setting in [Pareto Probing], which has been criticized for only corrupting the input-output mapping without corrupting the input structure. The structured input might yield representations from which info is easier to extract.
- The more a probe is able to make task output decisions independently of the linguistic properties of a representation, the less its acc on a linguistic task necessarily reflects the properties of the representation.
- Control tasks must have two properties at a high level. a) Structure: the output for a word token is a deterministic function of the word type. [Pimentel 2020] criticizes this as well, and it is indeed quite unrealistic. b) Randomness: the output for each word type is sampled independently at random.
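The two properties above can be sketched in a few lines. A minimal, hypothetical sketch (toy tag set and function names are my own, not from the paper): each word type gets one label drawn independently at random, every token of that type inherits it, and selectivity is the gap between linguistic-task and control-task accuracy.

```python
import random

# Toy POS tag set for illustration only.
POS_TAGS = ["NOUN", "VERB", "ADJ", "DET", "ADP"]

def make_control_mapping(vocab, labels, seed=0):
    # Randomness: each word TYPE is assigned a label sampled
    # independently at random from the linguistic task's label set.
    rng = random.Random(seed)
    return {word: rng.choice(labels) for word in vocab}

def control_labels(tokens, mapping):
    # Structure: every occurrence of a word type gets the same
    # (randomly chosen) label, so the task is pure memorization.
    return [mapping[tok] for tok in tokens]

def selectivity(linguistic_acc, control_acc):
    # Selectivity: how much better the probe does on the real
    # linguistic task than on the memorization-only control task.
    return linguistic_acc - control_acc
```

A probe with high linguistic accuracy but also high control accuracy is doing the work itself; a selective probe keeps the control accuracy low.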