DS Wannabe Prep Study Notes: Machine Learning Algo 1

This article introduces the concepts of model underfitting and overfitting, emphasizing the role of regularization in preventing overfitting, including the differences between L1 regularization, L2 regularization, and Elastic Net. It also discusses methods for handling imbalanced data, such as data augmentation, oversampling, and undersampling, as well as the different application scenarios of supervised, unsupervised, and reinforcement learning.


First, a review of the basics.

Defining Model Underfitting and Overfitting

Type: Underfitting
Definition: The model isn't able to capture the relationship between the dataset's independent variables (e.g., weight, height, etc.) and the dependent variables (e.g., price).
How to reduce:

1. Add more variables or model features to help the model learn more patterns from the training data.

2. Increase the number of iterations the model trains for before training is stopped.

Type: Overfitting
Definition: The model fits the training data too closely, finding patterns that happen to be in the training set but not elsewhere.
How to reduce: Regularization (covered in the next section). A short sketch contrasting the two follows.
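The contrast is easy to see empirically by varying model capacity. A minimal sketch, assuming scikit-learn is available; the sine target, noise level, and degree values are illustrative choices, not from the original notes:

```python
# Underfitting vs. overfitting: fit polynomials of increasing degree
# to noisy synthetic data and compare train vs. test scores.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)  # noisy nonlinear target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # too simple / reasonable / too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree={degree:2d}  train R^2={model.score(X_train, y_train):.2f}"
          f"  test R^2={model.score(X_test, y_test):.2f}")

# Expected pattern: degree 1 scores poorly on both splits (underfitting);
# degree 15 scores well on train but worse on test (overfitting).
```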

Regularization

Regularization in machine learning is a technique used to prevent a model from overfitting. Overfitting occurs when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data. This means that the model becomes too complex, capturing patterns that may not be present in the test data or in new data it encounters after deployment.

Here are the key points about regularization:

  1. Purpose: Regularization techniques are used to simplify models without substantially decreasing their accuracy. They do this by adding some form of penalty or constraint to the model optimization process.

  2. Types of Regularization:

    • L1 Regularization (Lasso):  Adds a penalty equivalent to the absolute value of the magnitude of coefficients. This can lead to some coefficients being zero, which is useful for feature selection.
    • L2 Regularization (Ridge): Adds a penalty equivalent to the square of the magnitude of coefficients. This doesn't reduce coefficients to zero but makes them smaller, leading to a less complex model.
    • Elastic Net: Combines L1 and L2 regularization and can be used to balance between feature selection (L1) and feature shrinkage (L2); a scikit-learn sketch of all three follows this list.
  3. Effect on Model Complexity: Regularization typically leads to a decrease in model complexity, which can reduce overfitting. This is done by penalizing the weights of the model, thereby discouraging overly complex models that fit the noise in the training data.

  4. Choosing the Regularization Term: The strength of the regularization is controlled by a hyperparameter, often denoted as lambda (λ) or alpha. The higher the value of this hyperparameter, the stronger the regularization effect. Selecting the right value is critical and is usually done using cross-validation.

  5. Bias-Variance Tradeoff: Regularization is a key technique in managing the bias-variance tradeoff in machine learning. By adding regularization, we increase the bias but decrease the variance, hopefully leading to a better overall model performance on unseen data.

  6. Application in Different Algorithms: While regularization is most commonly talked about in the context of linear models (like linear regression and logistic regression), it's also applicable to other algorithms, including neural networks, where techniques like dropout and weight decay are forms of regularization.
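The three penalties above map directly onto scikit-learn estimators. A minimal sketch, assuming scikit-learn; make_regression stands in for real data, and alpha=1.0 / l1_ratio=0.5 are illustrative values. It also picks the strength by cross-validation, as point 4 recommends:

```python
# L1 (Lasso), L2 (Ridge), and Elastic Net on the same synthetic dataset,
# plus cross-validated selection of the regularization strength alpha.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge, ElasticNet, LassoCV

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

l1 = Lasso(alpha=1.0).fit(X, y)                     # L1: some coefficients become exactly 0
l2 = Ridge(alpha=1.0).fit(X, y)                     # L2: coefficients shrink but stay nonzero
en = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)  # blend of the two penalties

print("coefficients at exactly zero -",
      "lasso:", int(np.sum(l1.coef_ == 0)),
      "ridge:", int(np.sum(l2.coef_ == 0)),
      "elastic net:", int(np.sum(en.coef_ == 0)))

# Point 4 above: choose the strength (alpha) by cross-validation, not by hand.
cv = LassoCV(alphas=np.logspace(-3, 1, 30), cv=5).fit(X, y)
print("alpha chosen by 5-fold CV:", cv.alpha_)
```

Counting the zero coefficients makes the practical difference concrete: lasso typically zeroes out the uninformative features, while ridge keeps all twenty with smaller weights.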

Interview question 3-1: What is L1 versus L2 regularization?

Example answer

L1 regularization, also known as lasso regularization, is a type of regularization that shrinks model parameters toward zero and can set some of them to exactly zero, which effectively performs feature selection. L2 regularization, also known as ridge regularization, penalizes the squared magnitude of the parameters instead: it shrinks all coefficients but rarely drives any of them to exactly zero, producing a less complex model that still uses every feature.
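Written out (a sketch using linear regression's squared-error loss as the base objective; w are the model coefficients and λ is the regularization strength from point 4 above):

```latex
% L1 (lasso): squared-error loss plus the sum of absolute coefficient values
\min_{w}\ \frac{1}{n}\sum_{i=1}^{n}\left(y_i - w^{\top}x_i\right)^2 + \lambda\sum_{j=1}^{p}\lvert w_j\rvert

% L2 (ridge): squared-error loss plus the sum of squared coefficient values
\min_{w}\ \frac{1}{n}\sum_{i=1}^{n}\left(y_i - w^{\top}x_i\right)^2 + \lambda\sum_{j=1}^{p} w_j^{2}
```

The absolute-value penalty keeps a constant gradient near zero, which is why lasso can push coefficients all the way to zero; the squared penalty's gradient vanishes near zero, so ridge only shrinks them.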
