Preliminary
- Unbiased Estimator. Remember that estimators are random variables; an estimator is unbiased if its expected value equals the true value of the parameter being estimated. Regression gives a concrete example. Suppose you measure two variables x and y whose true (linear) relationship is y = 5*x + 2. Any sample you draw will contain noise, so you must estimate the true slope and intercept. Suppose you draw a thousand samples of (x, y) and compute the least squares estimators for each sample (assuming the noise is normally distributed). As you do that, you'll notice two things:
- All of your estimates are different (because the data is noisy)
- The mean of all of those estimates starts to converge on the true values (5 and 2)
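The two observations above can be checked with a quick simulation. This is a minimal sketch (the true values 5 and 2, the sample size, and the noise level are taken from or assumed for the example above): fit a least squares line to each of 1000 noisy samples and average the estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
true_slope, true_intercept = 5.0, 2.0

slopes, intercepts = [], []
for _ in range(1000):  # 1000 independent samples (datasets)
    x = rng.uniform(0.0, 10.0, size=50)
    # y = 5*x + 2 plus normally distributed noise
    y = true_slope * x + true_intercept + rng.normal(0.0, 1.0, size=50)
    slope, intercept = np.polyfit(x, y, deg=1)  # least squares fit
    slopes.append(slope)
    intercepts.append(intercept)

# Each individual estimate differs, but the mean converges on (5, 2)
print(np.mean(slopes))      # close to 5
print(np.mean(intercepts))  # close to 2
```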
- Markov Chain. A Markov chain is a mathematical system that undergoes transitions from one state to another among a finite or countable number of possible states. It is a random process usually characterized as memoryless: the next state depends only on the current state, not on the sequence of events that preceded it. This specific kind of memorylessness is called the Markov property.
- Mixing: no matter which state (node) the random process starts from, it eventually stabilizes at a stationary distribution; this convergence is known as mixing.
- The mixing time of a Markov chain is the time until the chain is "close" to its steady-state distribution, i.e. the stationary distribution π.
- Rapid/fast mixing means that the mixing time grows at most polynomially in log(n), where n is the number of states of the chain. Tools for proving rapid mixing include arguments based on conductance and the method of coupling.
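Mixing is easy to see numerically. This is an illustrative sketch with a made-up 3-state transition matrix P (not from the source): iterating d ← dP from two very different starting distributions drives both to the same stationary distribution π, which satisfies πP = π.

```python
import numpy as np

# A small 3-state Markov chain; each row of P sums to 1
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.3, 0.5]])

# Two different initial distributions: start in state 0 vs. state 2
d1 = np.array([1.0, 0.0, 0.0])
d2 = np.array([0.0, 0.0, 1.0])

for _ in range(50):  # run the chain: d <- d P
    d1 = d1 @ P
    d2 = d2 @ P

print(d1)  # both converge to the same stationary distribution pi
print(d2)
print(d1 @ P)  # stationarity: pi P = pi
```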
Monte Carlo Methods
- Monte Carlo is the art of approximating an expectation by the sample mean of a function of simulated random variables. (Eric C. Anderson, 1999).
- Monte Carlo is about invoking laws of large numbers to approximate expectations.
- Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational algorithms that rely on repeated random sampling (i.e., simulations) to obtain numerical results (in order to determine the properties of some phenomenon or behavior).
- Monte Carlo methods are mainly used in three distinct problems: optimization, numerical integration and generation of samples from a probability distribution.
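The definition above (Monte Carlo as approximating an expectation by a sample mean) can be sketched in a few lines. The choice of g(x) = x² and X ~ Uniform(0, 1) is an illustrative assumption, not from the source; the exact expectation is E[X²] = 1/3, so the sample mean should land near that value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Approximate E[g(X)] for g(x) = x^2, X ~ Uniform(0, 1),
# by the sample mean of g over simulated draws.
# The exact value is 1/3.
n = 1_000_000
x = rng.uniform(0.0, 1.0, size=n)
estimate = np.mean(x ** 2)
print(estimate)  # close to 1/3
```

By the law of large numbers, the estimate converges to 1/3 as n grows; the error shrinks at the usual O(1/sqrt(n)) Monte Carlo rate.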
Importance Sampling
- Importance sampling is a Monte Carlo scheme that does not involve a Markov chain; every sample yields an unbiased estimator.
- It is used when one cannot sample from P but has a proposal distribution Q.
- Importance sampling is used for the purpose of numerical integration.
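A minimal sketch of the scheme described above: draw from a proposal Q, then reweight each sample by p(x)/q(x) to get an unbiased estimate of an expectation under the target P. The concrete choices here (target P = N(0, 1), proposal Q = N(0, 2), estimating E_P[X²] = 1) are illustrative assumptions, not from the source.

```python
import numpy as np

rng = np.random.default_rng(2)

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Target P = N(0, 1); proposal Q = N(0, 2) that we can sample from
n = 200_000
x = rng.normal(0.0, 2.0, size=n)                    # draw from Q
w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # weights p(x)/q(x)

# Unbiased estimate of E_P[X^2] = 1 (the variance of P)
estimate = np.mean(w * x ** 2)
print(estimate)  # close to 1
```

The estimator is unbiased because E_Q[w(X) g(X)] = ∫ q(x) (p(x)/q(x)) g(x) dx = E_P[g(X)], which is why the method doubles as a numerical integration technique.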
References
- Eric C. Anderson, Monte Carlo Methods and Importance Sampling, Statistical Genetics, 1999.
- Bengio and Senecal, Quick training of probabilistic neural nets by importance sampling, 2003.
- Sampling: http://en.wikipedia.org/wiki/Sampling_(statistics)
- Monte Carlo method: http://en.wikipedia.org/wiki/Monte_Carlo_method