ID = voxsrc20_ask_00
Status: closed
Question
How should these two distances (intra-class and inter-class) be understood?
Since open-set speaker recognition is essentially a metric learning problem, the key is to learn features that
have small intra-class and large inter-class distance.[1]
Answer
- Each speaker is a class.
- Each utterance is an object.
- Small intra-class distance: all utterances of a speaker form that speaker's space; this space should be compact, small and discriminative.
- Large inter-class distance: the spaces of different speakers should not overlap, so the distance between them should be large.
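The two distances can be made concrete with a small numeric sketch. The example below is illustrative only: the 2-D embeddings are made up, the speaker names and function names are hypothetical, and cosine distance is an assumed choice of metric (the quoted sentence does not name one). It averages the distance over utterance pairs from the same speaker (intra-class) and over pairs from different speakers (inter-class).

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mean_intra_class_distance(embeddings_by_speaker):
    """Average distance over all utterance pairs that share a speaker."""
    dists = [
        cosine_distance(a, b)
        for utterances in embeddings_by_speaker.values()
        for a, b in combinations(utterances, 2)
    ]
    return sum(dists) / len(dists)


def mean_inter_class_distance(embeddings_by_speaker):
    """Average distance over all utterance pairs from different speakers."""
    dists = [
        cosine_distance(a, b)
        for utts_a, utts_b in combinations(embeddings_by_speaker.values(), 2)
        for a in utts_a
        for b in utts_b
    ]
    return sum(dists) / len(dists)


# Made-up 2-D utterance embeddings: each speaker (class) has three utterances (objects).
toy_embeddings = {
    "spk_a": [(1.0, 0.1), (0.9, 0.2), (1.0, 0.0)],
    "spk_b": [(0.1, 1.0), (0.0, 0.9), (0.2, 1.0)],
}

intra = mean_intra_class_distance(toy_embeddings)
inter = mean_inter_class_distance(toy_embeddings)
print(f"mean intra-class distance: {intra:.3f}")
print(f"mean inter-class distance: {inter:.3f}")
```

In this toy data each speaker's utterances point in nearly the same direction, so the intra-class value is close to zero while the inter-class value is much larger, which is the property the quoted sentence asks the learned features to have.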
Reference
[1] In defence of metric learning for speaker recognition, arXiv:2003.11982