Connectionist Temporal Classification (CTC) is an algorithm used to train deep neural networks for speech recognition, handwriting recognition, and other sequence problems.
1. Problem
- Given a dataset of audio clips and corresponding transcripts, we don’t know how the characters in the transcript align to the audio.
- People’s rates of speech vary, so no fixed rule maps each character to a set number of input frames.
- Aligning each character by hand takes a lot of time.
- The same problem arises in speech recognition, handwriting recognition from images or sequences of pen strokes, and action labelling in videos.
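To make the alignment problem concrete, here is a minimal sketch of how several frame-level labelings can stand for the same transcript. It assumes the usual CTC collapsing rule (merge consecutive repeats, then drop a special blank symbol). The blank character `_`, the function name `collapse`, and the example strings are choices made for this illustration, not part of these notes.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol, chosen only for this sketch


def collapse(alignment: str) -> str:
    """Merge consecutive repeated labels, then remove blanks."""
    merged = (label for label, _ in groupby(alignment))
    return "".join(label for label in merged if label != BLANK)


# One label per input frame; a slower speaker produces more frames.
fast = "cat"
slow = "cc_aaa_tt"

# Both frame-level labelings stand for the same transcript.
print(collapse(fast), collapse(slow))

# A blank between the two l's keeps a genuine double letter from merging.
print(collapse("h_e_l_lo"))
```

Because many different frame-level labelings collapse to one transcript, the training data does not need to say which one is correct, which is what removes the need for hand alignment.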
2. Problem Definition
when mapping input seq