zealscott - CSDN Blog

[Original] CMU 11-785 L23 Variational Autoencoders

EM for PCA. With complete information: if we knew $z$ for each $x$, estimating $A$ and $D$ would be simple. $x = Az + E$, $P(x \mid z) = N(Az, D)$. Given complete information $(x_{1}, z_{1}), (x_{2}, z_{2})$...

2021-03-08 17:18:45 299

[Original] CMU 11-785 L22 Revisiting EM algorithm and generative models

Key points. EM: an iterative technique to estimate probability models for data with missing components or information, by iteratively “completing” the data and reestimating parameters. PCA: is actually a generative model for Gaussian data. Data lie close...

2021-02-25 21:15:01 344

[Original] CMU 11-785 L21 Boltzmann machines 2

The Hopfield net as a distribution. The Helmholtz free energy of a system: at any time, the probability of finding the system in state $s$ at temperature $T$ is $P_T(s)$. At each state it has a potential energy $E_s$. The internal energy of t...

2021-01-21 21:18:24 325

[Original] CMU 11-785 L20 Boltzmann machines 1

Training Hopfield nets. Geometric approach: $\mathbf{W}=\mathbf{Y}\mathbf{Y}^{T}-N_{p}\mathbf{I}$, $\mathbf{E}(\mathbf{y})=\mathbf{y}^{T}\mathbf{W}\mathbf{y}$. Since $\mathbf{y}^{T}\left(\mathbf{Y}\mathbf{Y}^{T}-N_{p}\mathbf{I}\right)\mathbf{y}=\mathbf{y}^{T}\mathbf{Y}\mathbf{Y}^{T}\mathbf{y}-N N_{p}$...

2020-12-16 17:46:42 155

[Original] CMU 11-785 L19 Hopfield network

Hopfield net. So far, neural networks for computation are all feedforward structures. Loopy network: each neuron is a perceptron with +1/-1 output; every neuron receives input from every other neuron; every neuron outputs signals to every other neuron...

2020-11-07 17:51:01 204

[Original] CMU 11-785 L18 Representation

Logistic regression. This is the perceptron with a sigmoid activation. It actually computes the probability that the input belongs to class 1. Decision boundaries may be obtained by comparing the probability to a threshold. These boundaries will be lines (hype...

2020-11-07 17:48:11 116

[Original] CMU 11-785 L17 Seq2seq and attention model

Generating language. Synthesis. Input: symbols as one-hot vectors. Dimensionality of the vector is the size of the “vocabulary”. Projected down to lower-dimensional “embeddings”. The hidden units are (one or more layers of) LSTM units. Output at each time:...

2020-08-06 16:39:21 230

[Original] CMU 11-785 L16 Connectionist Temporal Classification

Sequence to sequence: sequence goes in, sequence comes out. No notion of “time synchrony” between input and output. May not even maintain the order of symbols (from one language to another). With order synchrony: the input and output sequences happen in the...

2020-08-06 16:36:46 211

[Original] CMU 11-785 L15 Divergence of RNN

Variants on recurrent nets. Architectures: how to train recurrent networks of different architectures. Synchrony: the target output is time-synchronous with the input; the target output is order-synchronous, but not time-synchronous. One to one: no rec...

2020-05-30 21:34:56 390

[Original] CMU 11-785 L14 Stability analysis and LSTMs

Stability. Will this necessarily be “Bounded Input Bounded Output”? Guaranteed if output and hidden activations are bounded. But will it saturate? Analyzing recursion: sufficient to analyze the behavior of the hidden layer since it carries the relevant...

2020-05-25 19:50:18 439

[Original] CMU 11-785 L13 Recurrent Networks

Modelling series. In many situations one must consider a series of inputs to produce an output. Outputs too may be a series. Finite response model: can use a convolutional neural net applied to series data (slide). Also called a time-delay neural network...

2020-05-20 23:35:11 282

[Original] CMU 11-785 L12 Back propagation through a CNN

Convolution. Each position in $z$ consists of the convolution result in the previous map. Ways of shrinking the maps: stride greater than 1; downsampling (not necessary), typically performed with strides > 1. Pooling. Max pooling. Note: keep track of loc...

2020-05-19 19:37:00 250

[Original] CMU 11-785 L10 CNN architecture

Architecture. A convolutional neural network comprises “convolutional” and “downsampling” layers. Convolutional layers comprise neurons that scan their input for patterns. Downsampling layers perform max operations on groups of outputs from the convolutio...

2020-05-19 19:28:29 168

[Original] CMU 11-785 L09 Cascade-Correlation and Deep Learning

Cascade-Correlation algorithm. Start with direct I/O connections only. No hidden units. Train output-layer weights using BP or Quickprop. If error is now acceptable, quit. Else, create one new hidden unit offline. Create a pool of candidate units. Each ge...

2020-05-19 19:23:48 239

[Original] CMU 11-785 L08 Motivation of CNN

Motivation. Find a word in a signal or find an item in a picture. The need for shift invariance: the location of a pattern is not important, so we can scan with the same MLP for the pattern. Just one giant...

2020-05-07 22:24:15 164

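[Original] A Nodejs Douban Crawler in Practice

Use Nodejs to crawl posts from a Douban group and filter them. Front end. Page parsing. Page structure: open a Douban group page, for example https://www.douban.com/group/16473/, and inspect it with F12. Each post is made up of an a tag whose title attribute holds the post title. We need to extract the title, URL, and time information, so we can extract them directly with the request and cheerio packages: request(opt, f...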

2020-05-07 10:04:00 762

[Original] CMU 11-785 L07 Optimizers and regularizers

Optimizers. Momentum and Nesterov’s method improve convergence by normalizing the mean (first moment) of the derivatives. Considering the second moments: RMS Prop / Adagrad / AdaDelta / ADAM. Simple ...

2020-05-03 15:01:45 181

[Original] CMU 11-785 L06 Optimization

Problems. Decaying learning rates provide a good compromise between escaping poor local minima and convergence. Many of the convergence issues arise because we force the same learning rate on all parame...

2020-05-03 15:01:26 247

[Original] CMU 11-785 L05 Convergence

Backpropagation. The divergence function minimized is only a proxy for classification error (like Softmax). Minimizing divergence may not minimize classification error. Does not separate the points even...

2020-04-23 23:24:45 161

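[Original] Deploying Jupyter Notebook on a Server

Install Anaconda. Install from the command line with wget: wget https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-x86_64.sh. Note: when choosing the install path, install under the usr/local/ananconda3 directory if you want all users to be able to use it. Note: edit conda.sh under /etc/profile.d to set the environment variables (when logging in as another user, it will...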

2020-04-23 23:21:33 250

[Original] CMU 11-785 L03.5 A brief note on derivatives

What are derivatives? A derivative of a function at any point tells us how much a minute increment to the argument of the function will increment the value of the function. To be clear, what we want ...

2020-04-21 20:02:06 184

[Original] CMU 11-785 L04 Backpropagation

Problem setup. Input-output pairs: not to mention. Representing the output: one-hot vector. $y_{i}=\frac{\exp\left(z_{i}\right)}{\sum_{j}\exp\left(z_{j}\right)}$...

2020-04-21 19:56:29 173
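
The softmax expression quoted in the L04 excerpt can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the post: the function name `softmax` and the max-subtraction step (a common trick for numerical stability) are assumptions added here.

```python
import math

def softmax(z):
    """Return y_i = exp(z_i) / sum_j exp(z_j) for a list of scores z."""
    # Assumption: shift by max(z) first. The shift cancels in the ratio,
    # so the result is unchanged, but exp() no longer overflows.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)  # three positive values that sum to 1
```

The outputs are positive and sum to 1, which is what lets them be compared against a one-hot target vector.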

[Original] CMU 11-785 L03 Learning the network

Preliminary. The bias can also be viewed as the weight of another input component that is always set to 1: $z=\sum_{i} w_{i} x_{i}$. What we learn: the parameters of the network...

2020-03-16 16:08:36 194

[Original] CMU 11-785 L02 What can a network represent

Preliminary. Perceptron: threshold unit, “fires” if the weighted sum of inputs exceeds a threshold. Soft perceptron: using a sigmoid function instead of a threshold at the output. Activation: the functio...

2020-03-04 11:01:50 253

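[Original] Continuous-Integration Writing and Syncing with Hugo

We usually write Markdown files on a local computer and then use Hugo to build a static blog site. We therefore need a way to sync local files to the server while also integrating with GitHub, to keep the site maintainable. I used Git hooks for syncing and integration. On the server. Update: yum update; yum install nginx; yum install git. Create a hugo user: adduser hugo; pass...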

2020-03-03 16:40:08 631

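[Original] Hive Optimization

Simple Hive optimization and periodic ETL. Hive optimization: Hive execution depends on the underlying MapReduce jobs, so optimizing Hadoop jobs or tuning MapReduce jobs is the foundation for improving Hive performance. In most cases, users do not need to understand how Hive works internally. But as you gain more experience with Hive, learning some of its underlying implementation details and optimization techniques lets you use it more efficiently. Without proper tuning, even querying a...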

2019-06-14 18:35:47 184

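[Original] Initial Load

Initial load. Before a data warehouse can be used, historical data must be loaded. This historical data is the first data set imported into the warehouse. The first load is called the initial load and is generally a one-time job. The end users decide how much historical data goes into the warehouse. For example, if the warehouse goes into use on March 1, 2015 and users want two years of historical data loaded, the initial load should cover source data from March 1, 2013 to February 28, 2015. On March 2, 2015, the data for March 1, 2015 is loaded (assuming the run frequency is every...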

2019-06-06 09:52:29 260

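[Original] Derivation of the PCA Algorithm

Understanding and applying PCA. Motivation: PCA is very similar to factor analysis; both are mainly used to reduce data dimensions. But the idea behind PCA is simpler than factor analysis, and it is more intuitive and easier to implement (you only need to compute eigenvalues). PCA tries to identify the subspace in which the data approximately li...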

2019-05-26 22:03:23 1826

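[Original] Timestamp-Based Concurrency Control

Implementing a prototype of timestamp-based transaction processing. TO algorithm flow: maintain several timestamps. Transaction timestamp: the transaction's start time identifies the order of transactions, written ts(T). Data item read/write timestamps: record the timestamp of the latest transaction that read or wrote the item, written r_ts(X), w_ts(X). In addition, each data item x has three queues: the read queue dm_read(x), the write queue dm_write(x), and the pre-write queue dm_pre(x). min_R_ts(x) and min_P_ts(x) are respectively...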

2019-05-22 21:02:28 5039

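[Original] Lock-Based Concurrency Control

Implementing a prototype of 2PL-based transaction processing. Basic concepts. Short duration lock: acquire the lock before the action starts and release it as soon as the action ends. Long duration lock: acquire the lock before the action starts and keep holding it after the action ends. The idea of 2PL: from the locking perspective, a transaction is divided into a locking phase and an unlocking phase. Growing (locking) phase: the transaction only acquires locks and does not release any. Shrinking (unlocking) phase: the transaction can only release locks and cannot acquire new ones...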

2019-05-22 21:01:34 1148
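
The growing/shrinking rule described in the 2PL excerpt can be illustrated with a small Python sketch. It is not the post's prototype: the class name `TwoPhaseTransaction` and its methods are hypothetical, and the sketch only enforces the two-phase rule inside a single transaction, ignoring lock conflicts between transactions.

```python
class TwoPhaseTransaction:
    """Hypothetical helper that enforces the two-phase locking rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()        # data items currently locked
        self.shrinking = False   # becomes True after the first release

    def acquire(self, item):
        # Growing phase only: once any lock has been released,
        # acquiring a new lock would violate 2PL.
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {item!r} in shrinking phase")
        self.held.add(item)

    def release(self, item):
        # The first release moves the transaction into the shrinking phase.
        self.held.remove(item)
        self.shrinking = True


t = TwoPhaseTransaction("T1")
t.acquire("x")
t.acquire("y")
t.release("x")       # T1 is now in its shrinking phase
try:
    t.acquire("z")   # rejected: new locks are not allowed any more
except RuntimeError as err:
    print(err)
```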

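[Original] Factor Analysis

This is probably the most painful algorithm I have derived since I started learning ML, so I want to first describe in intuitive terms what factor analysis is. Factor analysis is a data-simplification technique. By studying the internal dependencies among many variables, it explores the basic structure in the observed data and represents that structure with a small number of hypothetical variables. These hypothetical variables capture the main information in the original variables. The original variables are observable manifest variables, while the hypothetical variables are unobservable latent variables, called factors...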

2019-05-14 22:27:00 221

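[Original] Using Hive

Using Hadoop and Hive. First, Hive is middleware that uses the MapReduce engine and HDFS storage; its metadata is stored in MySQL. Hive only makes querying convenient; the data in its databases all lives in HDFS. Installing Hadoop and Hive: Hadoop was already installed in the earlier distributed systems work; for the detailed steps, see the linked tutorial. Note that on Ubuntu, putting environment variables in ~/.bash_profile is not a good choice, because every new ter...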

2019-05-09 20:49:50 323

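[Original] Derivation of the EM Algorithm

Deriving the EM algorithm and proving its convergence. Jensen’s inequality. Theorem: if $f$ is a convex function and $X$ is a random variable, then $\mathrm{E}[f(X)] \geq f(\mathrm{E} X)$. If $f$ is strictly convex, that is, $f'' > 0$ always holds, ...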

2019-04-18 21:05:12 258

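[Original] The Idea Behind EM

Explaining the idea behind EM, using K-means and GMM as examples. K-means: K-means is a fairly simple and intuitive clustering algorithm with two main steps. For each point, choose the nearest cluster center as its class: $c^{(i)} := \arg\min_{j}\left\|x^{(i)}-\mu_{j}\right\|^{2}$...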

2019-04-18 11:40:26 223
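
The assignment step quoted in the K-means excerpt, $c^{(i)} := \arg\min_{j}\|x^{(i)}-\mu_{j}\|^{2}$, can be sketched as follows. This is a minimal illustration and not code from the post: the function name `assign_clusters` and the toy data are assumptions, and the second K-means step (recomputing the centers) is left out because the excerpt is truncated before it.

```python
def assign_clusters(points, centers):
    """For each point x, return the index j that minimizes ||x - mu_j||^2."""
    def sq_dist(a, b):
        # Squared Euclidean distance between two equal-length tuples.
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return [
        min(range(len(centers)), key=lambda j: sq_dist(x, centers[j]))
        for x in points
    ]

# Toy data (assumed for illustration): two points near the origin, one far away.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]
print(assign_clusters(points, centers))  # → [0, 0, 1]
```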

[Original] Advice for applying Machine Learning -- Andrew Ng

Key ideas: Diagnostics for debugging learning algorithms. Error analyses and ablative analysis. How to get started on a machine learning problem. Premature (statistical) optimization. Debugging ...

2019-04-16 21:08:01 167

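[Original] Regularization & feature selection

Cross validation / mutual information / Bayesian statistics and regularization. Earlier we discussed minimizing the risk function, but in many cases this does not work well because of the bias-variance trade-off. We therefore need model selection to automatically choose the most suitable model. Cross validation: suppose we have a finite set of models; how do we choose...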

2019-04-15 15:06:25 166

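[Original] Convex Formulation for Learning from Positive and Unlabeled Data

Unbiased PU learning. Building on earlier PU learning work that used non-convex loss functions, this paper uses different convex losses for positive and unlabeled samples, turning the problem into a convex optimization problem. The results show that this loss (double hinge loss) achieves almost the same accuracy as the non-convex loss (ramp) while greatly reducing computation. Introduction. Background: the paper first stresses the importance of the PU problem and gives several examples...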

2019-04-03 13:45:44 846 2

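[Original] Analysis of Learning from Positive and Unlabeled Data

Notes on a PU learning paper. Starting from the basic classification loss, the paper derives that PU classification is in fact a form of cost-sensitive classification. It also shows experimentally that using a convex loss function such as hinge loss leads to a wrong (biased) classification boundary, so a non-convex function such as ramp loss is needed. The paper also discusses the case where the prior $\pi$ is biased, showing that if the sample...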

2019-04-03 13:45:02 843

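[Original] Learning Classifiers from Only Positive and Unlabeled Data

A classic PU learning paper. Under the SCAR assumption, the paper proves that an ordinary classifier and a PU classifier differ only by a constant factor, so ordinary classifier methods can be used to estimate $p(s \mid x)$ and from it obtain $p(y \mid x)$. It also provides three methods for estimating this constant and, finally, offers ideas for estimating the prior $p(y)$. Learning a traditional class...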

2019-04-03 13:41:46 402

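[Original] Log Analysis

Creating an external table. First start the gpfdist service: nohup gpfdist -d /home/dyt/PJ4 -p 9058 -l /home/dyt/PJ4/gpfdist.log &. Check whether it started successfully: ps -ef | grep gpfdist. Create the external table. Rows: 1,123432423,2019-03-15 23:12:25,zsl; 2,123657567,2019-03-15 23:12...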

2019-03-21 09:35:29 223
