Question 1
A Boltzmann Machine is different from a Feed Forward Neural Network in the sense that:
Question 2
Throughout the lecture, when talking about Boltzmann Machines, why do we talk in terms of computing the expected value of $s_i s_j$ and not the value of $s_i s_j$ itself?
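As a concrete illustration of why only the expectation is meaningful, here is a minimal NumPy sketch; the joint distribution over the two binary units is made up purely for illustration. Each individual sample of $s_i s_j$ is a random 0 or 1, and only the average $\langle s_i s_j \rangle$ over many samples is a stable quantity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up joint distribution over two stochastic binary units s_i and s_j,
# standing in for the behaviour of units in a Boltzmann Machine.
states = [(0, 0), (0, 1), (1, 0), (1, 1)]
probs = [0.4, 0.1, 0.1, 0.4]

# Draw many joint samples; each individual product s_i * s_j fluctuates,
# so only the expectation <s_i s_j> is a stable, learnable quantity.
idx = rng.choice(len(states), size=10_000, p=probs)
products = np.array([states[k][0] * states[k][1] for k in idx])

print("individual samples of s_i * s_j:", products[:10])
print("estimated <s_i s_j>:", products.mean())   # close to 0.4
```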
Question 3
When learning an RBM, we decrease the energy of data particles and increase the energy of fantasy particles. Brian insists that the latter is not needed. He claims that it should be sufficient to just decrease the energy of data particles, since the energy of all other regions of state space would then have increased relative to it. This would also save us the trouble of sampling from the model distribution. What is wrong with this intuition?
Question 4
Restricted Boltzmann Machines are easier to learn than Boltzmann Machines with arbitrary connectivity. Which of the following is a contributing factor?
Question 5
PCD is a better algorithm than CD1 when it comes to training a good generative model of the data. This means that samples drawn from a freely running Boltzmann Machine trained with PCD (after enough time) are likely to look more realistic than samples drawn from the same model trained with CD1. Why does this happen?
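A minimal sketch of where the two algorithms differ, assuming a small binary RBM with NumPy arrays and toy random data (all sizes and values here are illustrative, not part of the course material): CD1 restarts its negative-phase (fantasy) chain at the data on every update, while PCD keeps one persistent chain running across updates, so the persistent chain can wander into low-energy regions far from the data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, a, b, rng):
    """One full Gibbs step v -> h -> v' for a binary RBM."""
    h = (rng.random(b.shape) < sigmoid(v @ W + b)).astype(float)
    v_new = (rng.random(a.shape) < sigmoid(h @ W.T + a)).astype(float)
    return v_new

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))
a, b = np.zeros(n_vis), np.zeros(n_hid)
data_v = rng.integers(0, 2, size=n_vis).astype(float)   # toy data vector

# CD1: the fantasy particle is re-initialised at the data on every update,
# so it never gets to explore regions far from the data distribution.
cd1_fantasy = gibbs_step(data_v, W, a, b, rng)

# PCD: the fantasy particle persists across updates and keeps being
# refreshed with Gibbs steps, so it can find (and raise the energy of)
# spurious low-energy regions far from the data.
persistent_v = rng.integers(0, 2, size=n_vis).astype(float)
for _ in range(100):                      # many parameter updates later...
    persistent_v = gibbs_step(persistent_v, W, a, b, rng)
```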
Question 6
It's time for some math now!
In RBMs, the energy of any configuration is a linear function of the state.
$$E(v,h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i h_j W_{ij}$$
and this eventually leads to
$$\Delta W_{ij} \propto \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}$$
If the energy were non-linear, such as
$$E(v,h) = -\sum_i a_i f(v_i) - \sum_j b_j g(h_j) - \sum_{i,j} f(v_i)\, g(h_j)\, W_{ij}$$
for some non-linear functions $f$ and $g$, which of the following would be true?
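For concreteness, here is a hedged sketch of the update that the linear energy leads to, assuming a small binary RBM and using a single placeholder configuration in place of a proper Gibbs sample for the model (negative) phase: because the energy is linear in $v_i h_j$, the sufficient statistic for $W_{ij}$ is just the product $v_i h_j$, averaged under the data and under the model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b = np.zeros(n_hid)
lr = 0.1

# Positive phase: hidden probabilities given a (toy) data vector.
v_data = rng.integers(0, 2, size=n_vis).astype(float)
h_data = sigmoid(v_data @ W + b)
positive = np.outer(v_data, h_data)      # <v_i h_j>_data

# Negative phase: statistics from a model (fantasy) configuration; here a
# random placeholder stands in for a sample drawn by running the Markov chain.
v_model = rng.integers(0, 2, size=n_vis).astype(float)
h_model = sigmoid(v_model @ W + b)
negative = np.outer(v_model, h_model)    # <v_i h_j>_model

W += lr * (positive - negative)          # Delta W_ij proportional to the difference
```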
Question 7
In RBMs, the energy of any configuration is a linear function of the state.
$$E(v,h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i h_j W_{ij}$$
and this eventually leads to
$$P(h_j = 1 \mid v) = \frac{1}{1 + \exp\!\left(-\sum_i W_{ij} v_i - b_j\right)}$$
If the energy were non-linear, such as
$$E(v,h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} f(v_i, h_j)\, W_{ij}$$
for some non-linear function $f$, which of the following would be true?
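A short sketch of that conditional, assuming a small binary RBM with NumPy arrays (the weights and visible vector are random placeholders): because the energy is linear in $h_j$, the energy gap between $h_j = 1$ and $h_j = 0$ is just $-(\sum_i W_{ij} v_i + b_j)$, so the conditional reduces to a logistic sigmoid of each hidden unit's total input.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b = np.zeros(n_hid)
v = rng.integers(0, 2, size=n_vis).astype(float)

# Linearity in h_j makes the conditional a logistic function of the
# total input sum_i W_ij v_i + b_j to each hidden unit.
total_input = v @ W + b
p_h_given_v = 1.0 / (1.0 + np.exp(-total_input))
print(p_h_given_v)   # P(h_j = 1 | v) for each hidden unit j
```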