https://stackoverflow.com/questions/23838056/what-is-the-difference-between-transform-and-fit-transform-in-sklearn
In the scikit-learn estimator API:
fit(): learns the model parameters from the training data.
transform(): applies the parameters learned by fit() to a data set to produce the transformed data.
fit_transform(): combines fit() and transform() in a single call on the same data set.
Check out Chapter 4 of this book and this answer on Stack Exchange for more clarity.
A further explanation follows, using an example to illustrate the meaning of fit() and fit_transform():
To standardize the data (make it have zero mean and unit standard deviation), you subtract the mean and then divide the result by the standard deviation.
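As a rough sketch of that arithmetic, here it is done by hand with only the Python standard library. The values are made up for illustration, and pstdev (the population standard deviation, which divides by n) is used because that matches what StandardScaler computes.

```python
from statistics import fmean, pstdev

# Hypothetical toy feature values, chosen only for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu = fmean(values)      # mean of the feature
sigma = pstdev(values)  # population standard deviation (divides by n)

# Standardize: subtract the mean, then divide by the standard deviation.
standardized = [(v - mu) / sigma for v in values]

print(mu, sigma)
print(standardized)
```

After this step the values have mean 0 and standard deviation 1.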
Hence, the fit() method of every sklearn transformer just calculates the parameters (e.g. μ and σ in the case of StandardScaler) and saves them as internal object state. Afterwards, you can call its transform() method to apply the transformation to any particular set of examples.
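A minimal sketch of the two separate steps with StandardScaler might look like this. The array X_train is a made-up toy data set. mean_ and scale_ are the attributes where StandardScaler stores the fitted μ and σ.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy training data: 4 samples, 2 features.
X_train = np.array([[1.0, 10.0],
                    [2.0, 20.0],
                    [3.0, 30.0],
                    [4.0, 40.0]])

scaler = StandardScaler()
scaler.fit(X_train)    # only computes and stores the statistics

print(scaler.mean_)    # per-feature mean (mu)
print(scaler.scale_)   # per-feature standard deviation (sigma)

# transform() applies (x - mu) / sigma using the stored statistics.
X_train_scaled = scaler.transform(X_train)
print(X_train_scaled)
```

Note that fit() does not change X_train. It only updates the state of the scaler object.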
fit_transform() joins these two steps and is used for the initial fitting of parameters on the training set x, while also returning the transformed x′. Internally, the default implementation just calls fit() and then transform() on the same data.
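To illustrate, the sketch below checks that the one-step and two-step forms give the same result for StandardScaler. It then shows a common follow-up that the text above only implies: reusing the statistics fitted on the training set to transform other data. X_train and X_test are hypothetical toy arrays.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: one feature, separate train and test samples.
X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[2.0], [4.0]])

# One call versus two calls on the same data.
one_step = StandardScaler().fit_transform(X_train)
two_step = StandardScaler().fit(X_train).transform(X_train)
same = np.allclose(one_step, two_step)
print(same)

# Fit on the training set only, then reuse those statistics elsewhere.
scaler = StandardScaler().fit(X_train)
X_test_scaled = scaler.transform(X_test)
print(X_test_scaled)
```

Calling fit_transform() on X_test instead would recompute μ and σ from the test samples, so the two sets would no longer be scaled consistently.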