Importance Sampling

Tags: Methods

Importance sampling lets you compute an expectation under one distribution P using samples drawn from a different distribution Q, as long as you can evaluate the densities P(x) and Q(x) at the sampled points.
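The identity behind this can be sketched as follows. Here f stands for the function whose expectation is wanted, and P and Q are treated as densities; the symbol f and the continuous setting are assumptions made for illustration (for discrete distributions the integrals become sums).

```latex
\mathbb{E}_{x \sim P}[f(x)]
  = \int P(x)\, f(x)\, dx
  = \int Q(x)\, \frac{P(x)}{Q(x)}\, f(x)\, dx
  = \mathbb{E}_{x \sim Q}\left[ \frac{P(x)}{Q(x)}\, f(x) \right]
```

The middle step only multiplies and divides by Q(x), which is why Q(x) must be nonzero wherever the integrand is nonzero.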

We call P(x)/Q(x) the importance sampling weight. A key restriction is that the support of Q must contain the support of P, that is, Q(x) > 0 wherever P(x) > 0; otherwise the weight requires a division by zero in regions that P can reach.

Intuitively, importance sampling reshapes one distribution into another by reweighting its samples. The resulting estimate is unbiased, but it can have high variance.
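As a minimal sketch of the reweighting idea, the snippet below estimates an expectation under P using only samples from Q. Everything concrete in it is an assumption chosen for illustration: the target P = N(0, 1), the wider proposal Q = N(0, 2^2), the test function f(x) = x^2 (whose true expectation under P is 1), and the helper names `normal_pdf`, `importance_sampling_estimate`, and `demo`.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling_estimate(f, p_pdf, q_pdf, q_sample, n, rng):
    """Average of (P(x) / Q(x)) * f(x) over n samples x drawn from Q."""
    total = 0.0
    for _ in range(n):
        x = q_sample(rng)               # sample from the proposal Q
        weight = p_pdf(x) / q_pdf(x)    # importance sampling weight
        total += weight * f(x)
    return total / n


def demo(n=100_000, seed=0):
    """Estimate E_P[x^2] for P = N(0, 1) using samples from Q = N(0, 2^2)."""
    rng = random.Random(seed)
    return importance_sampling_estimate(
        f=lambda x: x * x,
        p_pdf=lambda x: normal_pdf(x, 0.0, 1.0),
        q_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
        q_sample=lambda r: r.gauss(0.0, 2.0),
        n=n,
        rng=rng,
    )


estimate = demo()
print(f"importance sampling estimate of E_P[x^2]: {estimate:.4f}")
```

The proposal here is deliberately wider than the target, so Q(x) is never small where P(x) is large and the weights stay bounded.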

Variance of importance sampling

We can compute the variance of the importance-weighted term under Q.
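A sketch of that computation for a single weighted sample, writing f for the function whose expectation is wanted and \mu for its expectation under P (both symbols are introduced here for illustration), and using the fact that the weighted term has mean \mu under Q:

```latex
\begin{aligned}
\mathrm{Var}_{x \sim Q}\left[ \frac{P(x)}{Q(x)}\, f(x) \right]
  &= \mathbb{E}_{x \sim Q}\left[ \left( \frac{P(x)}{Q(x)} \right)^{2} f(x)^{2} \right] - \mu^{2} \\
  &= \int Q(x)\, \frac{P(x)^{2}}{Q(x)^{2}}\, f(x)^{2}\, dx - \mu^{2} \\
  &= \mathbb{E}_{x \sim P}\left[ \frac{P(x)}{Q(x)}\, f(x)^{2} \right] - \mu^{2}
\end{aligned}
```

Averaging n independent samples divides this quantity by n, but the leftover factor of P(x)/Q(x) inside the expectation remains.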

The trick in this computation is to push the expectation back onto the original distribution: one factor of P(x)/Q(x) converts an expectation under Q into an expectation under P. Because the weight is squared, however, a second factor of P(x)/Q(x) is left over as a residual. If there are regions where P(x) is likely but Q(x) is not, this P(x)/Q(x) term can explode. The variance can therefore be very large, and the estimate correspondingly unstable.
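The effect can be illustrated with a small experiment. This is only a sketch under assumed choices: the target is P = N(0, 1), the function is f(x) = x^2, and two hypothetical proposals are compared, one wider than P (std 2.0) and one narrower (std 0.5). The helper names `normal_pdf` and `weighted_sample_variance` are made up for this example.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def weighted_sample_variance(q_std, n=50_000, seed=0):
    """Empirical variance of w(x) * f(x) for x ~ Q = N(0, q_std^2),
    with target P = N(0, 1), weight w = P/Q, and f(x) = x^2."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        x = rng.gauss(0.0, q_std)
        w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, q_std)
        values.append(w * x * x)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


wide = weighted_sample_variance(q_std=2.0)    # Q covers the tails of P
narrow = weighted_sample_variance(q_std=0.5)  # Q is unlikely where P is likely
print(f"Q wider than P:    sample variance = {wide:.3f}")
print(f"Q narrower than P: sample variance = {narrow:.3f}")
```

With the narrow proposal, the tails of P are rarely sampled, and the few samples that land there carry enormous weights. In this particular setup the true variance under the narrow proposal is infinite, so the printed number for it does not settle down as n grows; it is dominated by whichever extreme samples happened to be drawn.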