Computes the Kullback-Leibler (KL) divergence of the posterior distribution
from the prior, which measures "surprise" in the Bayesian framework: how far
the data have moved beliefs away from the prior.
Usage
kl_divergence(posterior, prior, base = 2)
Arguments
- posterior
Numeric vector of posterior probabilities
- prior
Numeric vector of prior probabilities (same length as posterior)
- base
Base of logarithm (default: 2 for bits)
Value
Numeric scalar: the KL-divergence value (always non-negative)
Details
KL-divergence is defined as:
$$D_{KL}(P || Q) = \sum_i P_i \log(P_i / Q_i)$$
where P is the posterior and Q is the prior. The divergence is 0 when
posterior equals prior (no surprise), and increases as they differ.
Zero probabilities in the posterior are handled by excluding those terms
(using the convention that 0 * log(0) = 0). If the prior assigns zero
probability where the posterior does not, the divergence is infinite.
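The definition and zero-handling convention above can be sketched directly in R. This is a minimal illustrative implementation matching the Usage section, not necessarily the package's own code; it assumes both inputs are plain numeric probability vectors of equal length.

```r
# Illustrative sketch of kl_divergence, assuming posterior and prior are
# numeric probability vectors of the same length.
kl_divergence <- function(posterior, prior, base = 2) {
  stopifnot(length(posterior) == length(prior))
  # Drop terms where the posterior is zero (0 * log(0) = 0 convention).
  keep <- posterior > 0
  p <- posterior[keep]
  q <- prior[keep]
  # Sum P_i * log(P_i / Q_i); a zero in q with nonzero p yields Inf.
  sum(p * log(p / q, base = base))
}
```

With `base = 2` the result is in bits; passing `base = exp(1)` would give nats instead.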
Examples
# No surprise when prior equals posterior
kl_divergence(c(0.5, 0.5), c(0.5, 0.5))
#> [1] 0
# High surprise when distributions differ
kl_divergence(c(0.9, 0.1), c(0.5, 0.5))
#> [1] 0.5310044
# Maximum surprise when posterior is certain
kl_divergence(c(1.0, 0.0), c(0.5, 0.5))
#> [1] 1