Transfer learning uses relatively abundant source data to train a better model when the target data is limited.
Firstly, use the source data to train an initial model.
Secondly, use the target data to re-train the model; this re-training step is essentially fine-tuning.
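As a minimal sketch of this two-step recipe, assuming a toy PyTorch encoder and random stand-in data (the network shape, speaker counts, and hyperparameters below are all illustrative, not from any particular system):

```python
import torch
import torch.nn as nn

# Hypothetical speaker encoder: 40-dim input features -> 128-dim embedding.
encoder = nn.Sequential(nn.Linear(40, 256), nn.ReLU(), nn.Linear(256, 128))

def train(model, head, feats, labels, lr, steps):
    """Train encoder + classification head with cross-entropy."""
    params = [p for p in list(model.parameters()) + list(head.parameters())
              if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(head(model(feats)), labels)
        loss.backward()
        opt.step()

# Step 1: train the initial model on abundant source data
# (random tensors stand in for real features and speaker labels).
src_x, src_y = torch.randn(512, 40), torch.randint(0, 1000, (512,))
src_head = nn.Linear(128, 1000)  # 1000 source speakers (assumed)
train(encoder, src_head, src_x, src_y, lr=1e-3, steps=100)

# Step 2: fine-tune on the small target set with a lower learning rate,
# freezing the first layer to reduce overfitting (a common heuristic).
for p in encoder[0].parameters():
    p.requires_grad = False
tgt_x, tgt_y = torch.randn(32, 40), torch.randint(0, 10, (32,))
tgt_head = nn.Linear(128, 10)  # only 10 target speakers (assumed)
train(encoder, tgt_head, tgt_x, tgt_y, lr=1e-4, steps=20)
```

Freezing early layers and lowering the learning rate in the second step are common heuristics that keep the small target set from overwriting what was learned from the source data.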
A typical case of transfer learning is speaker verification.
The target data is usually very limited (few-shot, or even one-shot), while the source data is enormous. The sensible approach is therefore to train the encoder on the large quantity of source data together with its augmented versions.
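As a rough illustration of waveform-level augmentation (the noise level and gain range here are arbitrary choices; real systems often also use reverberation, speed perturbation, and similar effects):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(wave: np.ndarray) -> np.ndarray:
    """Return a perturbed copy of a waveform: additive noise plus random gain."""
    noisy = wave + 0.005 * rng.standard_normal(wave.shape)  # additive noise
    return rng.uniform(0.8, 1.2) * noisy                    # random gain

# Each source utterance yields several augmented copies, multiplying the
# effective size of the source training set for the encoder.
utterance = rng.standard_normal(16000)  # 1 s of stand-in audio at 16 kHz
training_pool = [utterance] + [augment(utterance) for _ in range(4)]
```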
Some also point out that speaker verification is really meta learning, or metric learning. I would argue it is better viewed as a manifold learning case: the enrollment process is not a learning process but a centralization process, i.e., finding the center of the enrolled utterance embeddings. And metric learning alone does not fit, because the embedding must be treated as a vector rather than a scalar; its direction carries information too.
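A minimal sketch of this view, assuming 128-dim embeddings from an already-trained encoder and an illustrative decision threshold: enrollment only averages the length-normalized embeddings to find the center, and verification compares directions via cosine similarity.

```python
import numpy as np

def enroll(embeddings: np.ndarray) -> np.ndarray:
    """Enrollment as centralization, not learning: average the
    length-normalized embeddings of the enrolled utterances."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return unit.mean(axis=0)

def verify(centroid: np.ndarray, test: np.ndarray, threshold: float = 0.7) -> bool:
    """Score by cosine similarity, so the direction of the embedding
    vector is what matters, not a scalar magnitude alone."""
    score = centroid @ test / (np.linalg.norm(centroid) * np.linalg.norm(test))
    return bool(score > threshold)

rng = np.random.default_rng(0)
enrolled = rng.standard_normal((3, 128))  # few-shot: 3 enrollment embeddings
centroid = enroll(enrolled)
print(verify(centroid, rng.standard_normal(128)))  # unrelated vector: likely False
```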