Understanding Deep Learning


I. word2vec
1. Hierarchical softmax
The final layer of a conventional softmax must compute a probability for every word in the vocabulary, which is too slow. Hierarchical softmax was proposed as a replacement.

Hierarchical softmax rests on this idea: instead of modeling P(y|x) directly, we can first define a partition function c() that assigns y to a region C, and then:
P(y|x) = P(C = c(y) | x) · P(y | C = c(y), x)
That is, to compute the probability of y given x, first compute the probability (given x) of the region y belongs to, then the probability of y within that region. The scheme nests: region C can itself be partitioned further. By repeatedly splitting the sample space (the vocabulary) into two complementary halves of equal size, each step halves the space, so the target word is reached after roughly log₂|V| steps.
References: https://zhuanlan.zhihu.com/p/57028381
https://www.zhihu.com/question/43378064
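The hierarchical decomposition described above can be sketched in Python for the common special case of a binary tree (as word2vec uses, with a Huffman tree): the probability of a word is the product of binary left/right decisions along the root-to-leaf path. The function and argument names below are illustrative, not taken from any particular implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hs_probability(h, path_vectors, path_codes):
    """Probability of one word under binary hierarchical softmax.

    h            -- hidden/context vector (list of floats)
    path_vectors -- parameter vector of each internal node on the
                    root-to-leaf path of the word
    path_codes   -- the word's binary code: 0 = go left, 1 = go right
    """
    p = 1.0
    for v, code in zip(path_vectors, path_codes):
        s = sigmoid(sum(hi * vi for hi, vi in zip(h, v)))
        # one branch gets sigma(v.h), the complementary branch 1 - sigma(v.h)
        p *= s if code == 0 else (1.0 - s)
    return p
```

Because each internal node splits its probability mass between exactly two children, the probabilities of all leaves sum to 1 by construction, and evaluating one word costs only O(log |V|) sigmoids instead of a |V|-way softmax.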

2. Negative sampling
Negative samples are drawn from some distribution, and the loss is the binary cross-entropy: -log σ(s_pos) - Σ_neg log(1 - σ(s_neg)).
For the theory and source code, see my other blog post.
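The loss above can be sketched as follows. Here pos_score and neg_scores stand for the dot products between the hidden vector and the output vectors of the positive word and the k sampled negative words; the names are mine, chosen for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neg_sampling_loss(pos_score, neg_scores):
    """Binary cross-entropy over one positive pair and k negative pairs."""
    # positive pair is labeled 1: penalize -log sigma(score)
    loss = -math.log(sigmoid(pos_score))
    # negative pairs are labeled 0: penalize -log(1 - sigma(score)),
    # which equals -log(sigma(-score))
    for s in neg_scores:
        loss -= math.log(1.0 - sigmoid(s))
    return loss
```

Minimizing this pushes the positive score up and the negative scores down, turning the |V|-way softmax into k+1 independent binary classifications.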
Here the focus is the sampling procedure itself. In the TensorFlow implementation of word2vec, candidate_sampling_ops.py says it samples from "an approximately log-uniform
or Zipfian distribution", i.e. the formula: P(class) = (log(class + 2) - log(class + 1)) / log(range_max + 1) (this requires the vocabulary to be sorted by descending frequency).
But the paper and this tutorial say sampling follows the smoothed unigram distribution:
P(w_i) = f(w_i)^{3/4} / Σ_j f(w_j)^{3/4}
So I am confused: which of the two is more appropriate?
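The log-uniform formula quoted above can be checked directly: the sum over class = 0 … range_max − 1 telescopes to log(range_max + 1) / log(range_max + 1) = 1, which is exactly why TensorFlow requires word ids sorted by descending frequency — rank 0 must be the most frequent word for the Zipfian shape to match real frequencies.

```python
import math

def log_uniform_prob(class_id, range_max):
    """P(class) under TF's approximately log-uniform (Zipfian) sampler.

    class_id -- 0-based frequency rank of the word (0 = most frequent)
    """
    return (math.log(class_id + 2) - math.log(class_id + 1)) / math.log(range_max + 1)
```

Note this distribution depends only on the rank, while the paper's unigram^{3/4} distribution depends on the actual counts; the two agree only insofar as the corpus is genuinely Zipfian.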

Sampling as implemented in the original C code

A unigram table is built: an array of 100 million (table_size) elements filled with the indices of the vocabulary words, with repetition — a word's index can appear many times. The number of times word w_i's index appears in the array is given by P(w_i) * table_size; in other words, sampling probability × 100 million = that word's occurrence count in the table.
My understanding of the procedure: (1) for each word, compute P(w_i) * table_size, the number of slots it should occupy; (2) append that many copies of the word's index to the table; after repeating this for every word, the table should be exactly full (because the probabilities sum to 1); (3) draw a uniform random integer in [0, table_size); the value stored at that position is the index of the sampled word.
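The three steps can be sketched in Python. A toy table_size replaces the real 10^8; the 3/4 power matches the original C implementation; the function names are mine.

```python
import random

def build_unigram_table(word_counts, table_size=1000, power=0.75):
    """Fill a table with word indices, each appearing round(P(w) * table_size)
    times, where P(w) is proportional to count^0.75."""
    norm = sum(c ** power for c in word_counts)
    table = []
    for idx, c in enumerate(word_counts):
        slots = int(round((c ** power) / norm * table_size))
        table.extend([idx] * slots)          # step (2): put the index in 'slots' cells
    return table                             # full (up to rounding), since sum(P) = 1

def sample_negative(table):
    # step (3): a uniform position in the table yields word w with probability P(w)
    return table[random.randrange(len(table))]
```

The point of the table is that drawing from the non-uniform distribution P(w) reduces to one uniform random index, at the cost of memory paid once up front.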


II. Loss
1. Cross entropy for softmax
How close is the predicted distribution to the true distribution? That is what the cross-entropy loss measures. See https://stackoverflow.com/questions/41990250/what-is-cross-entropy
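As a minimal sketch: cross-entropy is H(p, q) = -Σ_i p_i log q_i, and with a one-hot true distribution p it reduces to the negative log-probability the model assigns to the correct class. The function name is mine.

```python
import math

def cross_entropy(true_dist, pred_dist):
    """H(p, q) = -sum_i p_i * log(q_i); smaller when q is closer to p.

    Terms with p_i = 0 contribute nothing and are skipped, which also
    avoids log(0) for classes the true distribution rules out.
    """
    return -sum(p * math.log(q) for p, q in zip(true_dist, pred_dist) if p > 0)
```

With p = [0, 1, 0], a prediction that puts more mass on class 1 yields a strictly lower loss, which is exactly the "how close" measure described above.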
