Deriving contrastive divergence
For a long time, I did not get how contrastive divergence (CD) works. I was stumped by the bracket notation, and by “maximizing the log probability of the data”.
This made everything clearer: http://www.robots.ox.ac.uk/~ojw/files/NotesOnCD.pdf.
Local copy here (in case website is down)
The only math needed is integrals, partial derivatives, sums, products, and the derivative of the log of an arbitrary function: (log u)' = u' / u.
Just so the math fits on one screen, I’ll put the equations below.
x is a data point. f(x; Θ) is our function. Θ is a vector of model parameters.
$$p(x; \Theta) = \frac{1}{Z(\Theta)} f(x; \Theta) \qquad (1)$$
Z(Θ) is the partition function.
$$Z(\Theta) = \int f(x; \Theta) \, dx \qquad (2)$$
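To get a feel for what Z(Θ) does, here is a small numerical sketch. The model f(x; Θ) = exp(-Θ x²) is my own toy choice, not something from the notes; I picked it because its partition function has the known closed form sqrt(π / Θ). The helper names `integrate`, `partition` and `p` are made up for this example.

```python
import math

def f(x, theta):
    # Toy unnormalized model (an assumption for illustration): exp(-theta * x^2)
    return math.exp(-theta * x * x)

def integrate(g, lo=-10.0, hi=10.0, n=20001):
    # Trapezoidal rule on [lo, hi]; the tails of f are negligible outside it
    h = (hi - lo) / (n - 1)
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n - 1):
        total += g(lo + i * h)
    return total * h

def partition(theta):
    # Z(theta) = integral of f(x; theta) dx, see (2)
    return integrate(lambda x: f(x, theta))

theta = 2.0
Z = partition(theta)

def p(x):
    # p(x; theta) = f(x; theta) / Z(theta), see (1)
    return f(x, theta) / Z

total_prob = integrate(p)
print(Z, math.sqrt(math.pi / theta))  # numerical Z next to the closed form
print(total_prob)                     # should be very close to 1
```

Dividing by Z is what turns an arbitrary positive function into a probability density. In real models the integral is usually intractable, which is why the derivation avoids computing Z directly.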
We learn our model parameters, Θ, by maximizing the probability of a training set of data, X = {x_1, ..., x_K}, given as
$$p(X; \Theta) = \prod_{k=1}^{K} \frac{1}{Z(\Theta)} f(x_k; \Theta)$$
This is the same as minimizing the energy E(X; Θ), which is the negative log of that probability divided by K:
$$E(X; \Theta) = \log Z(\Theta) - \frac{1}{K} \sum_{k=1}^{K} \log f(x_k; \Theta)$$
We differentiate the energy with respect to the model parameters, Θ.
Quick explanation for (7): for the first bit, log Z(Θ), we just put the d/dΘ in front of it. For the second bit, we move the d/dΘ inside the sum.
Quick explanation for (8): The sum is rewritten with the bracket notation, where <g(x)>_X is simply the average of g(x) over the data points in X.
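$$
\begin{aligned}
\frac{\partial E(X; \Theta)}{\partial \Theta} &= \frac{\partial \log Z(\Theta)}{\partial \Theta} - \frac{1}{K} \sum_{k=1}^{K} \frac{\partial \log f(x_k; \Theta)}{\partial \Theta} && (7) \\
&= \frac{\partial \log Z(\Theta)}{\partial \Theta} - \left\langle \frac{\partial \log f(x; \Theta)}{\partial \Theta} \right\rangle_{X} && (8)
\end{aligned}
$$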
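The bracket in (8) is nothing more than a mean over the training set, which is easy to see in code. This is a rough sketch under my own assumptions: the toy model f(x; Θ) = exp(-Θ x²), whose partition function is sqrt(π / Θ) in closed form, and a made-up four-point data set. None of the names below come from the notes.

```python
import math

X = [0.3, -1.2, 0.8, 0.1]  # made-up training set, for illustration only

def log_f(x, theta):
    # log of the toy model f(x; theta) = exp(-theta * x^2)
    return -theta * x * x

def dlog_f(x, theta):
    # derivative of log f with respect to theta
    return -x * x

def log_Z(theta):
    # closed form for this toy model only: Z(theta) = sqrt(pi / theta)
    return 0.5 * math.log(math.pi / theta)

def mean_over_data(g):
    # the bracket notation <g(x)>_X: average of g over the data points
    return sum(g(x) for x in X) / len(X)

def energy(theta):
    # E(X; theta) = log Z(theta) - <log f(x; theta)>_X
    return log_Z(theta) - mean_over_data(lambda x: log_f(x, theta))

def energy_grad(theta):
    # (8): d log Z / d theta minus the data average of d log f / d theta
    dlog_Z = -0.5 / theta  # derivative of the closed-form log Z above
    return dlog_Z - mean_over_data(lambda x: dlog_f(x, theta))

theta = 2.0
eps = 1e-5
numeric = (energy(theta + eps) - energy(theta - eps)) / (2 * eps)
analytic = energy_grad(theta)
print(analytic, numeric)  # the two gradients should agree closely
```

The finite-difference gradient is only there as a sanity check on the formula. Here the derivative of log Z is available in closed form; in general it is not, which is what the next part of the derivation deals with.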
Here we calculate the first bit of the energy derivative in (7), the derivative of log Z(Θ).
(9): We use (log u)' = u' / u
(9 -> 10): We use the definition of Z. See (2)
(10 -> 11): The derivative moves inside the integral.
(11 -> 12): We use (log u)' = u' / u again, this time read backwards as u' = u (log u)'.
(12 -> 13): We use the definition of p. See (1)
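$$
\begin{aligned}
\frac{\partial \log Z(\Theta)}{\partial \Theta} &= \frac{1}{Z(\Theta)} \frac{\partial Z(\Theta)}{\partial \Theta} && (9) \\
&= \frac{1}{Z(\Theta)} \frac{\partial}{\partial \Theta} \int f(x; \Theta) \, dx && (10) \\
&= \frac{1}{Z(\Theta)} \int \frac{\partial f(x; \Theta)}{\partial \Theta} \, dx && (11) \\
&= \frac{1}{Z(\Theta)} \int f(x; \Theta) \frac{\partial \log f(x; \Theta)}{\partial \Theta} \, dx && (12) \\
&= \int p(x; \Theta) \frac{\partial \log f(x; \Theta)}{\partial \Theta} \, dx && (13)
\end{aligned}
$$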
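The chain from (9) to (13) says that the derivative of log Z(Θ) equals the average of the derivative of log f under the model distribution p. A quick numerical check helped me trust it. This is only a sketch under my own assumptions: the toy model f(x; Θ) = exp(-Θ x²), plain trapezoidal integration, and helper names I made up.

```python
import math

def f(x, theta):
    # Toy unnormalized model (an assumption for illustration): exp(-theta * x^2)
    return math.exp(-theta * x * x)

def dlog_f(x, theta):
    # derivative of log f = -theta * x^2 with respect to theta
    return -x * x

def integrate(g, lo=-10.0, hi=10.0, n=20001):
    # Trapezoidal rule on [lo, hi]; f is negligible outside this range
    h = (hi - lo) / (n - 1)
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n - 1):
        total += g(lo + i * h)
    return total * h

def Z(theta):
    # partition function, see (2)
    return integrate(lambda x: f(x, theta))

theta = 2.0
eps = 1e-5

# Left-hand side of (9): d log Z / d theta, by central finite differences
lhs = (math.log(Z(theta + eps)) - math.log(Z(theta - eps))) / (2 * eps)

# Right-hand side, as in (13): integral of p(x) * d log f / d theta
z = Z(theta)  # computed once so the integrand stays cheap
rhs = integrate(lambda x: f(x, theta) / z * dlog_f(x, theta))

print(lhs, rhs)  # both should be close to -1 / (2 * theta) for this toy model
```

For this particular f both sides can also be worked out by hand as -1 / (2Θ), so there are three numbers to compare. The point of the identity is that the right-hand side is an expectation, so it can be estimated from samples of p even when Z itself cannot be computed.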