
Original · Curriculum adversarial training

Weakness of adversarial training: it overfits to the attack in use and hence does not generalize to test data. Idea of curriculum adversarial training: train the model from weak attacks to strong attacks. Method: let l de...

2020-04-16 19:59:41 446

Original · Dropout network, DropConnect network

Mathematical form of the Dropout network and the DropConnect network. DropConnect: network structure, model, inference.

2020-04-16 16:32:24 270

Original · Notes on tuning hyperparameters (lr, regularization parameter)

Learning rate: I have seen optimal values anywhere from the order of 1e-4 to 1e-1; the rough rule is that the simpler the model, the larger the learning rate can be. [https://blog.csdn.net/weixin_44070747/article/details/94339089] Other: the strategy of increasing the batch size to keep the learning rate fixed ["Abandon learning rate decay" https://www.sohu....

2020-04-09 04:46:50 1465

Original · Group sparsity

Regularization for categorical variables.

2020-03-01 00:19:36 878

Original · Tuning: random initialization

Big picture on why we need randomness in stochastic algorithms. Randomness during initialization: the structure of the search space is unknown. Randomness during the progression of the search: avoid...

2020-02-05 02:03:43 223

Original · Tuning: learning rate

The learning rate is perhaps the most important hyperparameter. If you have time to tune only one hyperparameter, tune the learning rate. – Page 429, Deep Learning, 2016 ...

2020-02-05 01:23:12 1218

Original · Uniform convergence may be unable to explain generalization in deep learning

Value of this paper: understand the limitations of u.c.-based bounds / cast doubt on the power of u.c. bounds to fully explain generalization in DL. Highlights that explaining the training-set-size dependence of the g...

2020-01-25 21:19:33 624 1

Original · Adversarial Robustness

Motivation: a limitation of the (supervised) ML framework...

2020-01-16 00:55:05 5162

Original · Principal component analysis

Derivation (method of Lagrange multipliers). First step: find $\bm\alpha'_k \bm x$ that maximises $\text{var}(\bm\alpha'_k \bm x)$. Choose normalisation constrai...

2020-01-15 00:47:58 148

Original · Feature normalization

Question (2): Why do numerical features need normalization? Angles for the answer: normalization methods; what normalization achieves; pros and cons of each method. Data types: structured data: numerical, categorical (ordinal, nominal); unstructured data: carries information that cannot be represented by a single number, and each record differs in size. Normalization methods: min-max scaling. Pro: may be useful where all p...

2019-12-31 19:22:34 167
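The min-max scaling this excerpt mentions can be sketched in a few lines (the function name is my own, not from the post):

```python
def min_max_scale(xs):
    """Min-max scaling: rescale values linearly to the [0, 1] range."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        # Constant feature: no spread to normalize, so map everything to 0.
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]

print(min_max_scale([1, 2, 2, 4, 5]))  # → [0.0, 0.25, 0.25, 0.75, 1.0]
```

One con the post presumably weighs: a single outlier stretches the denominator and squeezes all other values together, which z-score standardization handles more gracefully.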

Original · Convex and non-convex optimization problems in machine learning

Question (145): Among optimization problems in machine learning, which are convex and which are non-convex? Give one example of each. - Definition of convex optimization: formula, geometric insight - Convex problem: logistic regression; verified via positive semi-definiteness of the Hessian matrix; a local optimum is equivalent to a global optimum - Non-convex problem: PCA; how PCA is solved. Convex problem: logistic regression, L_i(θ) = lo...

2019-12-28 19:08:07 2455

Original · CSDN-markdown cheatsheet

Typesetting, math formulas.

2019-12-28 18:37:27 101

Original · The L1 regularizer and sparsity

Question (164): Why does L1 regularization make model parameters sparse? Angles for the answer: geometry, i.e. the shape of the feasible region; calculus, differentiating the objective under the L1 constraint; a Bayesian prior. Feasible-region shape: equivalence between the regularization term and the constraint; geometric shapes of the L1 and L2 norms. If the optimum of the unconstrained objective lies outside the feasible region, the constrained optimum must lie on the region's boundary. [Review KKT, complementary slackness]...

2019-12-24 05:36:49 546

Original · Deep Learning concepts

Epoch: one epoch is when an ENTIRE dataset is passed forward and backward through the neural network only ONCE [1]. Iteration: the number of batches needed to complete one epoch [1]...

2019-12-19 04:13:05 179
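The epoch/iteration relationship in the excerpt is simple arithmetic; a minimal sketch (names are illustrative):

```python
import math

def iterations_per_epoch(n_samples, batch_size):
    """Number of batches needed to pass the entire dataset once, i.e. one epoch."""
    return math.ceil(n_samples / batch_size)

print(iterations_per_epoch(2000, 64))  # → 32; the final batch holds only 16 samples
```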

Original · Gradient descent, stochastic gradient descent, and their improvements

GD, SGD, batch GD; improved algorithms: momentum, Adam, etc.

2019-12-18 17:32:14 715
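One of the improvements the post lists, momentum, reduces to a single extra state variable per parameter. A sketch of one common formulation (the post may use a different variant; names are my own):

```python
def momentum_step(w, grad, velocity, lr=0.1, beta=0.9):
    """One SGD-with-momentum update: v <- beta*v + g, then w <- w - lr*v."""
    v_new = [beta * vi + gi for vi, gi in zip(velocity, grad)]
    w_new = [wi - lr * vi for wi, vi in zip(w, v_new)]
    return w_new, v_new

# Minimize f(w) = sum(w_i^2), whose gradient is 2*w.
w, v = [1.0, -2.0], [0.0, 0.0]
for _ in range(50):
    w, v = momentum_step(w, [2 * wi for wi in w], v)
```

The accumulated velocity smooths noisy stochastic gradients and speeds movement along shallow ravines, at the cost of possible overshoot.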

Original · Solving unconstrained optimization problems

The relationship between first-order / second-order algorithms and Taylor expansion.

2019-12-17 17:47:02 485

Original · Verifying the correctness of a gradient

Question (152): How to verify that the gradient of the objective function is computed correctly? Key points: calculus, Taylor-expansion approximation. (Calculus) By the definition of the partial derivative, $\frac{\partial L(\bm\theta)}{\partial \theta_i} \approx \frac{L(\theta_1,\cdots,\theta_i+h,\cdots,\theta_p) - L(\theta_1,\cdots,\theta_i-h,\cdots,\theta_p)}{2h}$...

2019-12-16 17:25:20 397
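The central-difference check in the excerpt can be sketched as follows (function name is illustrative):

```python
def numerical_grad(f, theta, h=1e-5):
    """Central-difference approximation of each partial derivative of f at theta."""
    grad = []
    for i in range(len(theta)):
        plus, minus = theta[:], theta[:]
        plus[i] += h
        minus[i] -= h
        grad.append((f(plus) - f(minus)) / (2 * h))
    return grad

# Compare against the analytic gradient of f(t) = t0^2 + 3*t1, i.e. [2*t0, 3].
f = lambda t: t[0] ** 2 + 3 * t[1]
approx = numerical_grad(f, [2.0, 1.0])  # ≈ [4.0, 3.0]
```

By Taylor expansion, the central difference has O(h²) error, versus O(h) for the one-sided difference, which is why it is the usual choice for gradient checking.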

Original · Loss functions

Question (142): Which loss functions are used in supervised learning? List them and briefly describe their characteristics. Angle: loss functions for different label types: categorical (binary/multi-class classification), ordinal (ordinal classification), continuous (regression). Classification: 0-1 loss and its ...

2019-12-16 01:29:20 419

Original · Line Search Methods

Key point: an intuitive understanding of the Armijo condition. The step-length problem: too large or too small. Backtracking line search: 1. Initialization: alpha (=1), tau (decay rate) 2. while f(x^t + alpha p^t) > f(x^t): alpha = tau*a...

2019-12-14 05:29:33 379
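The backtracking loop in the excerpt can be made runnable by adding the Armijo sufficient-decrease term (c is a small constant, commonly 1e-4; the excerpt's simpler condition omits it; names are my own):

```python
def backtracking(f, grad_f, x, p, alpha=1.0, tau=0.5, c=1e-4):
    """Shrink alpha by factor tau until the Armijo condition holds:
    f(x + alpha*p) <= f(x) + c * alpha * <grad f(x), p>."""
    fx = f(x)
    slope = sum(gi * pi for gi, pi in zip(grad_f(x), p))  # directional derivative
    while f([xi + alpha * pi for xi, pi in zip(x, p)]) > fx + c * alpha * slope:
        alpha *= tau
    return alpha

# 1-D example: f(x) = x^2, start at x = 1, steepest-descent direction p = -2.
f = lambda x: x[0] ** 2
g = lambda x: [2 * x[0]]
step = backtracking(f, g, [1.0], [-2.0])  # → 0.5 (alpha = 1 overshoots to x = -1)
```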

Original · Python basics

Data types: List (ordered objects), Dictionary (unordered key-value mapping), Set. List comprehensions: a = [1, 2, 2, 4, 5]; myList = [item*4 for item in a]; myList = [item*4 for item in a if item > 2]. Reference: Machine Learning in Action, Appendix A, Python primer...

2019-12-13 17:23:50 92

Original · Multi-task Learning

Multi-task learning and its definition; linear MTL; regularisers for linear MTL (quadratic regulariser, structured sparsity); clustered MTL; further topics (transferring to new tasks).

2016-11-30 22:11:19 599

Original · Reading and writing files in Python

1. Opening a file: f = open("D:\\test.txt", "r"). The first argument is the file name, including the path; the second is the open mode: 'r': read-only; 'w': write-only (creates the file if it does not exist, overwrites it if it does); 'a': append to the end of the file; 'r+': read and write. To open a file in binary mode, append the character "b" to the mode, e.g. "rb", "wb". 2. Reading content: the currently common...

2016-09-06 20:04:32 336

Original · Handling Chinese in Python: reading Chinese text

1. Import a Chinese txt file and convert it to unicode · s.decode(encoding); u.encode(encoding) · BOM header. 2. Import a py file that contains Chinese.

2016-09-05 14:59:15 25483 1
