I have seen two broad uses of matrix factorization in recommenders. Both involve low-rank approximate factorizations. The first is really a generic approach which can be combined with many factorizations: alternating least squares (here's my summary: Big, Practical Recommendations with Alternating Least Squares), and the second is one specific factorization by itself, the singular value decomposition (SVD).
Alternating least squares is flexible but less precise. It refers to any means of factoring A ≈ X_k Y_k^T, where X_k and Y_k are low-rank. "Approximate" means minimizing some squared-error difference with the input A, but here you can customize exactly what goes into the loss function. For example, you can ignore missing values (crucial) or weight different A_ij differently. The price is that you don't get many guarantees about the two factors; they are not necessarily orthonormal. In practice that doesn't help, but it doesn't hurt much either.
The factorization here just involves alternately solving for X_k and Y_k while fixing the other. With one factor fixed, the problem has a direct analytical solution, one that is entirely parallelizable (also important). You can plug several decompositions into this phase; I use a QR decomposition because it's fast and can 'detect' when even the requested rank is too low.
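To make the alternation concrete, here's a minimal numpy sketch. The function name, regularization term, and iteration count are my own choices for illustration, not anything from the answer above; note how each row of X (and of Y) is an independent small least-squares solve, which is exactly why the step parallelizes, and how the mask lets us ignore missing entries.

```python
import numpy as np

def als(A, mask, k=2, n_iters=20, reg=0.1, seed=0):
    """Alternating least squares on observed entries only.

    A: ratings matrix; mask: 1 where observed, 0 where missing.
    Returns low-rank factors X (m x k) and Y (n x k) with A ~ X @ Y.T.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = rng.standard_normal((m, k))
    Y = rng.standard_normal((n, k))
    for _ in range(n_iters):
        # Fix Y, solve a small regularized normal-equations system per row of X.
        for i in range(m):
            obs = mask[i] > 0
            Yo = Y[obs]
            X[i] = np.linalg.solve(Yo.T @ Yo + reg * np.eye(k), Yo.T @ A[i, obs])
        # Fix X, solve per row of Y; every solve is independent (parallelizable).
        for j in range(n):
            obs = mask[:, j] > 0
            Xo = X[obs]
            Y[j] = np.linalg.solve(Xo.T @ Xo + reg * np.eye(k), Xo.T @ A[obs, j])
    return X, Y
```

(For speed I use the direct normal-equations solve here; a QR-based solve as mentioned above would be more numerically robust.)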
In contrast, the SVD is a particular decomposition that gives more guarantees about its factorization A = UΣV^T. The two outside factors are orthonormal, for example, and Σ will even help show you how big k should be.
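A small numpy illustration of those guarantees (the data here is made up: a rank-3 matrix plus a little noise). The singular values in Σ drop sharply after the true rank, which is how they "show you how big k should be", and the outer factors come back orthonormal for free.

```python
import numpy as np

rng = np.random.default_rng(0)
# Build a rank-3 matrix, then add a little noise.
A = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))
A += 0.01 * rng.standard_normal(A.shape)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# The first three singular values are large; the rest are near the
# noise floor, suggesting k = 3 is enough.
print(np.round(s[:5], 3))

# U's columns (and Vt's rows) are orthonormal, unlike ALS factors.
print(np.allclose(U.T @ U, np.eye(U.shape[1])))
```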
The cost is speed and flexibility. The SVD is relatively more computationally expensive and harder to parallelize. There is also no good way to deal with missing values or weighting; you have to assume that the missing values in your sparse input equal a mean value like 0. (Someone can correct me if these assumptions are mitigable.)
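To show what that assumption looks like in practice, here is one common workaround sketched in numpy (the tiny ratings matrix is invented for illustration): center the observed ratings so the missing entries, filled with 0, implicitly sit at the mean, then take a truncated SVD.

```python
import numpy as np

# 0 marks "missing", not a real rating.
A = np.array([[5., 4., 0.],
              [4., 0., 1.],
              [0., 2., 5.]])
mask = A > 0

# Center observed ratings; missing entries become 0 == the mean,
# which is exactly the assumption the SVD forces on us.
mean = A[mask].mean()
A_filled = np.where(mask, A - mean, 0.0)

U, s, Vt = np.linalg.svd(A_filled, full_matrices=False)
k = 2
A_hat = (U[:, :k] * s[:k]) @ Vt[:k] + mean  # rank-k reconstruction
```

Every missing cell in `A_hat` ends up pulled toward the global mean, whereas ALS would simply have left those cells out of the loss.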
https://www.quora.com/What-is-the-difference-between-SVD-and-matrix-factorization-in-context-of-recommendation-engine