Big Data Analytics Notes 1

Cross Validation

  • random subsampling
  • k-fold
  • leave-one-out
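The three schemes above all reduce to choosing train/test index splits. A minimal pure-Python sketch of a k-fold splitter (the helper name `k_fold_splits` is hypothetical, assuming the data is shuffled once up front):

```python
import random

def k_fold_splits(n, k, seed=0):
    """Partition indices 0..n-1 into k folds; yield (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # round-robin keeps fold sizes balanced
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# Leave-one-out is the special case k = n; random subsampling
# corresponds to redrawing a fresh shuffled split each repetition.
for train, test in k_fold_splits(10, 5):
    assert len(test) == 2 and len(train) == 8
```

Each of the 10 indices appears in exactly one test fold, so every observation is held out once.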

Cost Function

The goal is to reduce the prediction error, i.e. the difference between predicted and true values; this difference is usually quantified with an error metric.
Error Metric:

  • cost function
  • loss function (machine learning)
  • objective function (optimization)
  • utility function (equal to negative cost, used in decision theory)

Common cost functions: MSE, RMSE, MAE, FP & FN counts, F1 score

$$\mathrm{MSE}(\text{approximation}) = \frac{1}{n} \sum^{n}_{i=1}{(\text{true} - \text{approximation})}^2 = \frac{1}{n_d}\sum^{n_d}_{i=1}(y_i - x_i^\mathsf{T}\hat{\theta})^2$$

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$$
$$\mathrm{MAE}(\hat{y}) = \frac{1}{n} \sum^{n}_{i=1}{|y_i - \hat{y}_i|}$$
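The three regression metrics above are straightforward to compute directly from the definitions; a minimal sketch (function names are my own):

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error: less sensitive to outliers than MSE."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y, y_hat = [3.0, 1.0, 4.0], [2.0, 1.0, 6.0]   # residuals: 1, 0, -2
print(mse(y, y_hat))   # (1 + 0 + 4) / 3 ≈ 1.667
print(mae(y, y_hat))   # (1 + 0 + 2) / 3 = 1.0
```

Note how the squared term in MSE weights the residual of 2 four times as heavily as MAE does, which is why MSE penalizes outliers more.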
TP: true positive; FP: false positive; TN: true negative; FN: false negative

| $P(\hat{y}\mid y)$ | $y=1$ | $y=0$ |
| --- | --- | --- |
| $\hat{y}=1$ | TP | FP |
| $\hat{y}=0$ | FN | TN |

$$\mathrm{Recall} = \mathrm{Sensitivity} = \frac{TP}{TP+FN}$$
$$\mathrm{Specificity} = \frac{TN}{FP+TN}$$

$$\mathrm{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN}$$
$$\mathrm{True~positive~rate~(TPR)} = \mathrm{Sensitivity}$$
$$\mathrm{False~positive~rate~(FPR)} = 1 - \mathrm{Specificity}$$
If the data is skewed (class-imbalanced):
$$\mathrm{Precision} = \frac{TP}{TP+FP}$$
$$\mathrm{F_1~score} = 2\left(\frac{\mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}\right)$$
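All of these classification metrics derive from the four confusion-matrix counts. A minimal sketch (the function name and the example counts are illustrative), using a skewed dataset to show why accuracy alone can mislead:

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the standard metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)                      # = sensitivity = TPR
    specificity = tn / (fp + tn)                 # FPR = 1 - specificity
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "specificity": specificity,
            "accuracy": accuracy, "precision": precision, "f1": f1}

# Skewed example: 90 negatives vs 10 positives.
m = classification_metrics(tp=5, fp=5, tn=85, fn=5)
print(m["accuracy"])   # 0.9 — looks good
print(m["f1"])         # 0.5 — reveals the weak positive-class performance
```

On imbalanced data a classifier can score high accuracy by favoring the majority class, while precision, recall, and F1 expose how poorly it handles the rare class.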
