Welcome to Kaggle Learn’s Intermediate Machine Learning micro-course!
If you have some background in machine learning and you’d like to learn how to quickly improve the quality of your models, you’re in the right place! In this micro-course, you will accelerate your machine learning expertise by learning how to:
- handle the messy data often found in real-world datasets (missing values, categorical variables),
- design pipelines to improve the quality of your machine learning code,
- use advanced techniques for model validation (cross-validation),
- build state-of-the-art models that are widely used to win Kaggle competitions (XGBoost), and
- avoid common and costly data science mistakes (data leakage).
Along the way, you’ll cement your knowledge by completing a hands-on exercise with real-world data for each new topic. The hands-on exercises use data from the Housing Prices Competition for Kaggle Learn Users, where you’ll use 79 different explanatory variables (such as the type of roof, number of bedrooms, and number of bathrooms) to predict home prices. You’ll measure your progress by submitting predictions to this competition and watching your position rise on the leaderboard!
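To give you a feel for how these topics fit together, here is a minimal sketch using scikit-learn. It is not the course's own code: the data is synthetic, the column names (`RoofStyle`, `Bedrooms`, `Bathrooms`) and all values are made up for illustration, and a random forest stands in for the final model. It fills in missing values, one-hot encodes a categorical column, bundles everything into a pipeline, and scores the result with cross-validation.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data: column names and values are invented for
# illustration and are NOT taken from the competition dataset.
rng = np.random.default_rng(0)
n_rows = 60
X = pd.DataFrame({
    "RoofStyle": rng.choice(["Gable", "Hip", "Flat"], size=n_rows),
    "Bedrooms": rng.integers(1, 6, size=n_rows).astype(float),
    "Bathrooms": rng.integers(1, 4, size=n_rows).astype(float),
})
# Made-up target: price depends on room counts plus random noise.
y = (50_000 * X["Bedrooms"] + 30_000 * X["Bathrooms"]
     + rng.normal(0, 10_000, size=n_rows))

# Blank out a few entries to simulate missing values.
X.loc[[3, 17, 42], "Bedrooms"] = np.nan

# Impute numeric columns with the median; one-hot encode the categorical one.
preprocessor = ColumnTransformer(transformers=[
    ("num", SimpleImputer(strategy="median"), ["Bedrooms", "Bathrooms"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["RoofStyle"]),
])

# Keeping preprocessing inside the pipeline means it is re-fit on the
# training part of every fold, so validation rows never influence it.
model = Pipeline(steps=[
    ("preprocess", preprocessor),
    ("regressor", RandomForestRegressor(n_estimators=50, random_state=0)),
])

# scikit-learn reports negated MAE, so flip the sign to get positive errors.
scores = -cross_val_score(model, X, y, cv=5,
                          scoring="neg_mean_absolute_error")
mean_mae = scores.mean()
print("MAE per fold:", scores.round(0))
print("Average MAE:", round(mean_mae))
```

Because the imputer and encoder live inside the pipeline, they are fit only on each fold's training rows, which guards against one common form of leakage. A gradient-boosted model, such as `XGBRegressor` from the `xgboost` package, could be swapped in for the random forest without changing the rest of the sketch.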
Prerequisites
You’re ready for this micro-course if you’ve built a machine learning model before, and you’re familiar with topics such as model validation, underfitting and overfitting, and random forests.
If you’re completely new to machine learning, please check out our introductory micro-course, which covers everything you need to prepare for this intermediate micro-course.
Your Turn
Continue to the first exercise to learn how to submit predictions to a Kaggle competition and determine what you might need to review before getting started.