Mean Squared Error, Cross Entropy, and why the softmax function (multi-class classification) restricted to two classes equals the sigmoid function (binary classification)

Cross entropy is an important concept in Shannon's information theory; it measures the difference between two probability distributions.
https://www.baidu.com/baidu?tn=68018901_12_oem_dg&ie=utf-8&word=交叉熵
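For reference, the standard definition: for a true distribution $p$ and an estimated distribution $q$ over the same events,

$$H(p, q) = -\sum_{x} p(x)\log q(x)$$

The closer $q$ is to $p$, the smaller the cross entropy; it attains its minimum, the entropy $H(p)$, exactly when $q = p$.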

MSE and the sigmoid function are a poor combination: the MSE gradient with respect to the pre-activation carries a factor of the sigmoid derivative, which vanishes when the output saturates near 0 or 1, so a confidently wrong prediction learns almost nothing.
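A minimal numpy sketch of this effect, using the standard gradients for a single sigmoid output with target y = 1:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gradients of the loss w.r.t. the pre-activation z, for target y = 1:
#   MSE:           dL/dz = (sigmoid(z) - y) * sigmoid(z) * (1 - sigmoid(z))
#   Cross entropy: dL/dz = sigmoid(z) - y
y = 1.0
for z in [-10.0, -2.0, 0.0, 2.0, 10.0]:
    s = sigmoid(z)
    grad_mse = (s - y) * s * (1.0 - s)  # vanishes when the sigmoid saturates
    grad_ce = s - y                     # stays large while the prediction is wrong
    print(f"z={z:6.1f}  MSE grad={grad_mse:+.6f}  CE grad={grad_ce:+.6f}")
```

At z = -10 the prediction is confidently wrong, yet the MSE gradient is on the order of 1e-5 while the cross-entropy gradient is close to -1.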
Discriminative model: the logistic model;
Generative model: e.g., naive Bayes (see below).


Naive Bayes assumes the samples' features are independent, ignoring correlations between different dimensions, because the data samples are not numerous enough to estimate those correlations;
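Concretely, the independence assumption factorizes the class-conditional likelihood across the $d$ dimensions:

$$p(x \mid c) = \prod_{j=1}^{d} p(x_j \mid c)$$

so each one-dimensional density can be estimated from far fewer samples than the full joint distribution would require.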

A generative model makes an assumption: it assumes the data come from some probability model;
when data are scarce, the generative model beats the discriminative model.
Splitting the formulation into priors and class-dependent probabilities means the two parts can come from different sources.

A discriminative model makes no such assumption, so its performance depends directly on the data. A comparison is sketched below.
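A hedged illustration of the small-data claim, using scikit-learn's GaussianNB as the generative model and LogisticRegression as the discriminative one on synthetic Gaussian data (a favorable case for the generative model, since its assumption is correct here; all names and sizes are illustrative):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n_per_class):
    # Two 2-D Gaussian classes centered at (-1,-1) and (+1,+1)
    x0 = rng.normal(loc=-1.0, scale=1.0, size=(n_per_class, 2))
    x1 = rng.normal(loc=+1.0, scale=1.0, size=(n_per_class, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X_test, y_test = make_data(5000)
for n in [5, 20, 100, 1000]:
    X_train, y_train = make_data(n)
    gen = GaussianNB().fit(X_train, y_train)          # generative
    dis = LogisticRegression().fit(X_train, y_train)  # discriminative
    print(f"n={n:4d}  generative acc={gen.score(X_test, y_test):.3f}  "
          f"discriminative acc={dis.score(X_test, y_test):.3f}")
```

The typical pattern is that the generative model is more accurate at the smallest training sizes, and the gap closes as n grows.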

Softmax reinforces the largest value.
Softmax is computed with exponentiation, which pulls large and small values further apart;
it can also be motivated from Gaussians: from the generative-model perspective, if the classes are Gaussian distributions of the same form sharing one covariance matrix, the posterior probability comes out exactly in softmax form.
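A minimal numpy sketch contrasting softmax with plain normalization:

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([3.0, 1.0, 0.2])
print(softmax(logits))        # exponentiation widens the gap: ~[0.84, 0.11, 0.05]
print(logits / logits.sum())  # plain normalization stays closer: ~[0.71, 0.24, 0.05]
```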

Minimizing cross entropy requires a distributional assumption; under that assumption it is equivalent to maximum likelihood.
(Maximum likelihood applies when we know nothing about the prior over hypotheses, or know all hypotheses are equally likely, i.e., $p(h(x_i))$ is uniform. The uniform distribution is the maximum-entropy distribution; given a known mean, the exponential distribution is the maximum-entropy distribution; given known mean and variance, the normal distribution is the maximum-entropy distribution.)
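To make the equivalence concrete: with $\hat{p}$ the empirical distribution of $N$ i.i.d. samples, the average negative log-likelihood of a model $q$ is exactly the cross entropy between $\hat{p}$ and $q$:

$$-\frac{1}{N}\sum_{i=1}^{N}\log q(x_i) = -\sum_{x}\hat{p}(x)\log q(x) = H(\hat{p}, q)$$

so maximizing the likelihood and minimizing the cross entropy select the same parameters.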

Logistic regression is equivalent to maximum entropy.

Minimizing cross entropy is, with distribution A known, the process of minimizing the uncertainty of distribution B relative to A, i.e., the process of making the two distributions as consistent as possible.
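This follows from the decomposition

$$H(A, B) = H(A) + D_{\mathrm{KL}}(A \| B)$$

Since $H(A)$ is fixed once A is known, minimizing the cross entropy minimizes the KL divergence, which reaches zero exactly when B matches A.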

The functional form of the maximum entropy model satisfies the second half of the maximum entropy principle; training a maximum entropy model is the process of fitting parameters so that the model satisfies the known constraints.

The difference between minimum cross entropy and maximum entropy

Logistic regression and the maximum entropy model

Log-linear model
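Both are instances of this family: with feature functions $f_i(x, y)$ and weights $w_i$, a log-linear model has the form

$$p(y \mid x) = \frac{\exp\left(\sum_i w_i f_i(x, y)\right)}{\sum_{y'} \exp\left(\sum_i w_i f_i(x, y')\right)}$$

Logistic regression is the two-class special case.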
Video:
https://www.bilibili.com/video/BV1Ht411g7Ef?p=11

Two ways to compute the loss and compare the error:
Mean Squared Error, Cross Entropy

MSE and the sigmoid function are a poor combination;
Cross Entropy pairs with the sigmoid function (binary classification);
Cross Entropy pairs with the softmax function (multi-class classification); see the sketch below.
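A minimal tf.keras sketch of the two pairings (the input shape and layer sizes are illustrative):

```python
import tensorflow as tf

# Binary classification: sigmoid output + binary cross-entropy
binary_model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation="sigmoid", input_shape=(10,))
])
binary_model.compile(optimizer="adam", loss="binary_crossentropy")

# Multi-class classification: softmax output + categorical cross-entropy
multi_model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation="softmax", input_shape=(10,))
])
multi_model.compile(optimizer="adam", loss="categorical_crossentropy")
```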
!!!
Cross-entropy loss (Cross Entropy Error Function) vs. mean-squared-error loss (Mean Squared Error)

The two-class special case of the softmax function (multi-class classification) equals the sigmoid function (binary classification).
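The reduction is a one-line identity: with two logits $z_1, z_2$,

$$\mathrm{softmax}(z_1, z_2)_1 = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}} = \sigma(z_1 - z_2)$$

so a two-logit softmax is a sigmoid applied to the difference of the logits.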

Maximum entropy is exactly the same as logistic regression;
softmax = normalization + widening the gap between the largest and smallest values.

A discriminative model needs no distributional assumption, so cross entropy itself needs no distributional assumption. But treating minimum cross entropy as maximum likelihood does assume a distribution, in the spirit of a generative model: for example a Bernoulli distribution (logistic regression, maximum likelihood) or a Gaussian distribution (linear regression). Both can be viewed as maximum likelihood, and they share the same functional form.
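The Bernoulli case makes the link explicit: with label $y \in \{0, 1\}$ and prediction $\hat{y} = \sigma(w \cdot x + b)$, one sample's likelihood is $\hat{y}^{\,y}(1-\hat{y})^{\,1-y}$, so the negative log-likelihood over the dataset is

$$-\log L = -\sum_{i}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$$

which is exactly the binary cross-entropy loss.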

The Bernoulli distribution is still what assigns the probability; but when logistic regression (maximum likelihood) uses the sigmoid to compute the cross-entropy loss and updates w and b directly, it can be viewed as a discriminative model;
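A minimal numpy sketch of those direct updates on toy data (learning rate and step count are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: one feature, binary labels
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w, b, lr = np.zeros(1), 0.0, 0.5
for step in range(200):
    y_hat = sigmoid(X @ w + b)
    # Gradient of the mean cross-entropy loss: just (prediction - label) times input
    w -= lr * (X.T @ (y_hat - y)) / len(y)
    b -= lr * np.mean(y_hat - y)

print(w, b, sigmoid(X @ w + b))  # predictions approach [0, 0, 1, 1]
```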

[Logistic Regression (2): Maximum Likelihood, on Bilibili]

Logistic regression cross-entropy loss

Loss function


What exactly is the difference between cross-entropy loss (Cross-Entropy) and squared loss (MSE)?

### YOLOv4 Loss Function Explanation and Issues

In object detection models such as YOLOv4, the loss function plays a crucial role in training performance. The overall objective is to minimize localization error while keeping classification accurate. The total loss \( L \) in YOLOv4 consists of three main components:

1. **Localization loss**: measures how accurately the predicted bounding boxes match object positions.
2. **Confidence loss**: evaluates whether each grid cell contains an object.
3. **Classification loss**: measures prediction accuracy for the class labels of detected objects.

Cross-entropy serves as the foundation for these losses, modified for better convergence during training.

#### Localization Loss Calculation

Bounding-box coordinates are predicted relative to the location on the feature map, with sigmoid functions keeping the values between 0 and 1. Widths and heights are log-transformed before applying mean squared error (MSE).

```python
import tensorflow as tf

def calculate_localization_loss(pred_boxes, true_boxes):
    # MSE over the (sigmoid- and log-transformed) box coordinates
    mse = tf.keras.losses.MeanSquaredError()
    return mse(true_boxes, pred_boxes)
```

#### Confidence Loss Implementation

This term penalizes predictions that wrongly estimate whether any object lies inside a cell. It uses binary cross-entropy, with positive examples weighted more heavily than negative ones because of the class imbalance common in detection datasets such as COCO or Pascal VOC.

```python
def confidence_loss(predicted_confidences, actual_confidences):
    # Binary cross-entropy on the objectness scores
    bce = tf.keras.losses.BinaryCrossentropy()
    return bce(actual_confidences, predicted_confidences)
```

#### Classification Component Details

With multiple classes per image, softmax activation combined with categorical cross-entropy yields a proper probability distribution over all categories, without overlap among the probabilities assigned by the model outputs.

```python
def classification_loss(class_predictions, ground_truth_classes):
    # Categorical cross-entropy, taking raw logits as input
    cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
    return cce(ground_truth_classes, class_predictions)
```

Despite stability improvements such as the IoU-aware mechanisms introduced after v3, effectively reducing the loss remains challenging, especially under the large scale variation found in real-world images; satisfactory results may require substantial tuning beyond the default configuration.
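A hedged sketch of combining the three components into one training loss; the dictionary keys and the weight lambda_coord are illustrative placeholders (YOLOv4 itself replaces the MSE box term with an IoU-based loss such as CIoU), not the model's actual configuration:

```python
def total_loss(pred, target, lambda_coord=5.0):
    # Weighted sum of the three components defined above; lambda_coord
    # (a placeholder value) rebalances localization against the other terms.
    return (lambda_coord * calculate_localization_loss(pred["boxes"], target["boxes"])
            + confidence_loss(pred["conf"], target["conf"])
            + classification_loss(pred["classes"], target["classes"]))
```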