Q1. How would you define Machine Learning?
A1: The science to programming computers to learn from data.
Q2. Can you name four types of problems where it shines?
A2: Weather prediction; customer segmentation; speech recognition; image recognition.
Q3. What is a labeled training set?
A3: A labeled training set is a training set that contains the desired solution (the label) for each instance.
Q4. What are the two most common supervised tasks?
A4: Regression; Classification.
Q5. Can you name four common unsupervised tasks?
A5: Dimensionality reduction; anomaly detection; clustering; association rule learning.
Q6. What type of Machine Learning algorithm would you use to allow a robot to walk in various unknown terrains?
A6: Reinforcement learning.
Q7. What type of algorithm would you use to segment your customers into multiple groups?
A7: Clustering.
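As a rough illustration of clustering, here is a minimal k-means sketch on one-dimensional data. The `spend` values, the function name `kmeans_1d`, and the starting centroids are all made up for this example; a real segmentation would use several features and a library implementation.

```python
def kmeans_1d(values, centroids, n_iters=10):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)

    def nearest(v):
        # Index of the centroid closest to v.
        return min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))

    for _ in range(n_iters):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            clusters[nearest(v)].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]

    labels = [nearest(v) for v in values]
    return centroids, labels


# Hypothetical annual spend per customer (no labels are provided).
spend = [12, 15, 14, 90, 95, 88, 300, 310]
centroids, labels = kmeans_1d(spend, centroids=[0, 100, 200])
print(labels)
```

The algorithm never sees a "correct" group for any customer; the groups emerge only from how close the values are to each other.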
Q8. Would you frame the problem of spam detection as a supervised learning problem or an unsupervised learning problem?
A8: Supervised learning.
Q9. What is an online learning system?
A9: A system that is trained incrementally by feeding it data instances sequentially, either one at a time or in small groups (mini-batches).
Q10. What is out-of-core learning?
A10: Training a system on a huge dataset that cannot fit in one machine's main memory. An out-of-core learning algorithm loads part of the data, runs a training step on that part, and repeats the process until it has run on all of the data.
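The load-a-part, train-a-step loop can be sketched as follows. This is only an illustration: `read_chunks` is a hypothetical stand-in for reading a large file piece by piece (here it generates synthetic data following y = 3x + 2), and `partial_fit` is a hand-written gradient step for a linear model, not a library call.

```python
import random


def read_chunks(n_chunks, chunk_size, seed=0):
    """Stand-in for a large on-disk dataset: yields one small chunk at a time."""
    rng = random.Random(seed)
    for _ in range(n_chunks):
        chunk = []
        for _ in range(chunk_size):
            x = rng.uniform(-1.0, 1.0)
            y = 3.0 * x + 2.0 + rng.gauss(0.0, 0.05)  # synthetic relation: y = 3x + 2
            chunk.append((x, y))
        yield chunk


def partial_fit(w, b, chunk, learning_rate):
    """One gradient step on the mean squared error of a single chunk."""
    n = len(chunk)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in chunk) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in chunk) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


w, b = 0.0, 0.0
# Only one chunk is held in memory at any time.
for chunk in read_chunks(n_chunks=2000, chunk_size=32):
    w, b = partial_fit(w, b, chunk, learning_rate=0.1)

print(w, b)  # should end up close to the synthetic slope and intercept
```

Because each step needs only the current chunk, the same loop works whether the data arrives from disk, a network stream, or a generator as in this sketch.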
Q11. What type of learning algorithm relies on a similarity measure to make predictions?
A11: Instance-based learning.
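A k-nearest-neighbors classifier is a common example of instance-based learning. The sketch below assumes Euclidean distance as the similarity measure and uses invented email features (number of links, number of exclamation marks); the function name `knn_predict` is also just for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training instances."""
    # Similarity measure: Euclidean distance (smaller means more similar).
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical emails described by (number of links, number of exclamation marks).
train = [
    ((0, 1), "ham"), ((1, 0), "ham"), ((1, 2), "ham"),
    ((8, 9), "spam"), ((9, 7), "spam"), ((7, 8), "spam"),
]

print(knn_predict(train, (8, 8)))  # spam
print(knn_predict(train, (1, 1)))  # ham
```

There is no training step here: the system simply memorizes the examples and compares each new instance to them at prediction time.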
Q12. What is the difference between a model parameter and a learning algorithm's hyperparameter?
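A12: A model parameter is learned from the training data and determines what the model predicts for a new instance (e.g., the slope of a linear model). A hyperparameter is a parameter of the learning algorithm itself, not of the model; it is set before training and stays constant during training (e.g., the amount of regularization to apply).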
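One way to see the distinction is a one-feature ridge regression without an intercept, chosen here only because it has a short closed-form solution. In this sketch `alpha` (the regularization strength) is the hyperparameter we pick before training, and the returned slope `w` is the model parameter learned from the data. The data and the function name are illustrative.

```python
def fit_ridge_slope(xs, ys, alpha):
    """Closed-form ridge fit of y ≈ w * x (no intercept).

    Minimizes sum((w*x - y)^2) + alpha * w^2 over w.
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# Same data, same algorithm; only the hyperparameter alpha differs.
w_plain = fit_ridge_slope(xs, ys, alpha=0.0)   # 2.0
w_ridge = fit_ridge_slope(xs, ys, alpha=14.0)  # 1.0
print(w_plain, w_ridge)
```

A larger `alpha` pulls the learned slope toward zero, which is how a hyperparameter shapes the result without being part of the model itself.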
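Q13. What do model-based learning algorithms search for? What is the most common strategy they use to succeed? How do they make predictions?
A13: They search for the values of the model parameters that let the model generalize well to new instances; they usually do this by minimizing a cost function that measures how badly the model fits the training data; they make predictions by feeding the new instance's features into the model's prediction function, using the parameter values found during training.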
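The three parts of this answer (search for parameters, minimize a cost function, predict) can be sketched with a one-feature linear model. The example assumes mean squared error as the cost function and uses the closed-form least-squares solution; the data and the names `theta0`, `theta1`, `fit_line`, and `predict` are illustrative.

```python
def mse(theta0, theta1, xs, ys):
    """Cost function: mean squared error of the line on the training data."""
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Return the (theta0, theta1) that minimize the MSE (closed-form least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    theta1 = cov / var
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


def predict(theta0, theta1, x_new):
    """Prediction function: plug the new instance into the fitted model."""
    return theta0 + theta1 * x_new


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

theta0, theta1 = fit_line(xs, ys)
print(theta0, theta1, mse(theta0, theta1, xs, ys))
print(predict(theta0, theta1, 4.0))
```

Any other parameter values give a higher cost on this training set, which is what "optimal" means for a model-based learner.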
Q14. Can you name four of the main challenges in Machine Learning?
A14: Poor-quality data; insufficient quantity of training data; nonrepresentative training data; irrelevant features.
Q15. If your model performs great on the training data but generalizes poorly to new instances, what is happening? Can you name three possible solutions?
A15: The model is overfitting the training data. Three possible solutions: gather more training data; simplify the model (choose a simpler algorithm, reduce the number of parameters or features, or regularize it); reduce the noise in the training data.
Q16. What is a test set and why would you want to use it?
A16: A test set is a part of the data that is held out from training. You use it to estimate the generalization error the model will make on new instances before it is launched in production.
Q17. What is the purpose of a validation set?
A17: A validation set is used to compare candidate models: it lets you select the best model and tune its hyperparameters without touching the test set.
Q18. What can go wrong if you tune hyperparameters using the test set?
A18: You risk overfitting the test set, so the generalization error you measure will be too optimistic, and the model may perform worse than expected on new data.
Q19. What is cross-validation and why would you prefer it to a validation set?
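A19: Cross-validation splits the training set into complementary subsets; each model is trained on a different combination of these subsets and validated on the remaining part. It lets you compare models and tune hyperparameters without setting aside a separate validation set, so it avoids wasting training data.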
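A minimal k-fold sketch of this idea follows. To keep it short, the "model" is a toy that always predicts the mean of its training targets, and the folds are taken in order without shuffling (in practice you would usually shuffle first). The helper names `k_fold_indices` and `cross_val_mse` and the data are illustrative.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is used for validation exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def cross_val_mse(ys, k):
    """Average validation MSE of a toy model that always predicts the training mean."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(ys), k):
        # "Train" on k-1 folds.
        prediction = sum(ys[i] for i in train_idx) / len(train_idx)
        # Validate on the held-out fold.
        fold_mse = sum((ys[i] - prediction) ** 2 for i in val_idx) / len(val_idx)
        scores.append(fold_mse)
    return sum(scores) / len(scores)


ys = [3.0, 5.0, 4.0, 6.0, 2.0, 4.0]
score = cross_val_mse(ys, k=3)
print(score)
```

Every instance serves as training data in some folds and as validation data in exactly one, which is why no data has to be permanently reserved for validation.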