Machine learning practitioners come from a tradition of algorithms, with a pragmatic focus on results and model skill above other concerns such as model interpretability.
Statisticians work on much the same type of modeling problems under the names of applied statistics and statistical learning. Coming from a mathematical background, they focus more on the behavior of models and the explainability of predictions.
The statistician's need to consider algorithmic methods was called out in the classic "two cultures" paper.
Machine learning practitioners must also take heed, keep an open mind, and learn both the terminology and relevant methods from applied statistics.
After reading this post, you will know:
- Machine learning and predictive modeling are a computer science perspective on modeling data with a focus on algorithmic methods and model skill.
- Statistics and statistical learning are a mathematical perspective on modeling data with a focus on data models and on goodness of fit.
- Machine learning practitioners must keep an open mind, leverage the methods, and understand the terminology of the closely related fields of applied statistics and statistical learning.
1.1 Machine Learning
Machine learning is a subfield of artificial intelligence and is related to the broader field of computer science. When it comes to developing machine learning models in order to make predictions, there is a heavy focus on algorithms, code, and results.
The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience.
1.2 Predictive Modeling
The useful part of machine learning for the practitioner may be called predictive modeling. This explicitly ignores distinctions between statistics and machine learning. It also shucks off the broader objectives of statistics (understanding data) and machine learning (understanding learning in software) and only concerns itself, as its name suggests, with developing models that make predictions.
Predictive modeling provides a laser focus on developing models with the objective of getting the best possible results with regard to some measure of model skill. This pragmatic approach often means that results in the form of maximum skill or minimum error are sought at the expense of almost everything else.
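To make "some measure of model skill" concrete, here is a minimal sketch of the predictive modeling workflow: fit several candidate models, estimate each one's error on held-out data, and keep whichever scores best. Everything in it is an assumption chosen for illustration and not taken from any particular library or study: the synthetic dataset, the two candidate models, the use of mean absolute error (MAE) as the skill measure, and the helper names `make_data`, `fit_mean`, `fit_line`, and `cross_val_mae`.

```python
import random
import statistics


def make_data(n=200, seed=1):
    """Synthetic data (an assumption for illustration): y = 3x + 2 + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean of y."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Simple least-squares line y = a*x + b."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def cross_val_mae(fit, xs, ys, k=5):
    """Mean absolute error averaged over k held-out folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th example (offset by the fold number) is held out.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        errors = [abs(ys[i] - model(xs[i])) for i in test_idx]
        fold_errors.append(statistics.fmean(errors))
    return statistics.fmean(fold_errors)


xs, ys = make_data()
candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
scores = {name: cross_val_mae(fit, xs, ys) for name, fit in candidates.items()}

# The selection rule looks only at the skill estimate, nothing else.
best = min(scores, key=scores.get)

for name, mae in scores.items():
    print(f"{name}: MAE = {mae:.3f}")
print("selected:", best)
```

Note that the selection step never asks why a model works or whether its coefficients are interpretable. It only compares held-out error, which is the pragmatic trade-off described above.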
1.3 Statistical Learning
The process of working with a dataset and developing a predictive model is also a task in statistics. A statistician may have traditionally referred to the activity as applied statistics. Statistics is a subfield of mathematics, and this heritage gives it a focus on well-defined, carefully chosen methods.
Statistical learning refers to a set of tools for modeling and understanding complex datasets. It is a recently developed area in statistics and blends with parallel developments in computer science and, in particular, machine learning.