How do we represent meaning in a computer?
Discrete Representation:
One-Hot Representation
But the one-hot representation has a problem: it is hard to compute similarity, because any two distinct one-hot vectors are orthogonal.
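A minimal sketch of the similarity problem, assuming a toy three-word vocabulary (the words and helper names are made up for illustration): the dot product between any two different one-hot vectors is 0, so "hotel" looks no closer to "motel" than to "banana".

```python
# Toy vocabulary (hypothetical, for illustration only).
vocab = ["hotel", "motel", "banana"]

def one_hot(word, vocab):
    """Return a list with a single 1 at the word's vocabulary index."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

hotel, motel, banana = (one_hot(w, vocab) for w in vocab)

print(dot(hotel, motel))   # 0: related words, yet no similarity
print(dot(hotel, banana))  # 0: unrelated words, same score
print(dot(hotel, hotel))   # 1: a word only matches itself
```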
Distributional Representation:
Full Document & Window Based
Full document, like a word-document co-occurrence matrix -> LDA -> suitable for text classification.
Window based, like the following:
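A rough sketch of a window-based co-occurrence matrix, assuming a toy three-sentence corpus, whitespace tokenization, and a window size of 1 (all of these are illustrative choices, not taken from the notes). Each cell counts how often two words appear within the window of each other.

```python
# Toy corpus and window size (assumptions for illustration).
corpus = ["I like deep learning", "I like NLP", "I enjoy flying"]
window = 1

tokens = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# counts[i][j] = number of times word j appears within `window` of word i.
counts = [[0] * len(vocab) for _ in vocab]
for sent in tokens:
    for i, w in enumerate(sent):
        lo = max(0, i - window)
        hi = min(len(sent), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[index[w]][index[sent[j]]] += 1

print(vocab)
for w in vocab:
    print(w, counts[index[w]])
```

The matrix has one row and one column per vocabulary word, so its size grows quadratically with the vocabulary.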
But the window-based method also has a problem: the matrix dimension is too high, since it grows with the vocabulary size.
Solution: SVD (what is singular value decomposition?)
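A minimal sketch of the SVD step using NumPy, assuming a small made-up symmetric count matrix `X` and a target dimension `k = 2` (both hypothetical). Keeping the first `k` left singular vectors, here scaled by their singular values (one common choice, not necessarily the one intended in these notes), gives a low-dimensional vector per word.

```python
import numpy as np

# Hypothetical 4x4 co-occurrence counts (rows/columns = words).
X = np.array([
    [0, 2, 1, 0],
    [2, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

# Reduced SVD: X = U @ diag(s) @ Vt, singular values sorted descending.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2  # target dimension (assumption)
embeddings = U[:, :k] * s[:k]  # one k-dimensional row per word

print(embeddings.shape)  # (4, 2)
print(embeddings)
```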
But SVD is computationally expensive!
So we consider learning low-dimensional vectors directly.
Warm Up: Gradient Descent