Computes the KL-divergence from the prior to the posterior distribution, which measures "surprise" in the Bayesian framework.

Usage

kl_divergence(posterior, prior, base = 2)

Arguments

posterior

Numeric vector of posterior probabilities

prior

Numeric vector of prior probabilities (same length as posterior)

base

Base of logarithm (default: 2 for bits)

Value

Numeric scalar: the KL-divergence value (always non-negative)

Details

KL-divergence is defined as: $$D_{KL}(P || Q) = \sum_i P_i \log(P_i / Q_i)$$

where P is the posterior and Q is the prior. The divergence is 0 when posterior equals prior (no surprise), and increases as they differ.

Zero probabilities are handled by excluding those terms (convention that 0 * log(0) = 0).
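The formula and the zero-handling convention above can be sketched in a few lines of R. This is an illustration of the documented behavior, not the package's actual source:

```r
# Minimal sketch of kl_divergence(), assuming the definition and
# zero-probability convention described in Details.
kl_divergence <- function(posterior, prior, base = 2) {
  stopifnot(length(posterior) == length(prior))
  # Exclude terms with zero posterior mass (convention: 0 * log(0) = 0)
  keep <- posterior > 0
  p <- posterior[keep]
  q <- prior[keep]
  sum(p * log(p / q, base = base))
}

kl_divergence(c(0.9, 0.1), c(0.5, 0.5))
#> [1] 0.5310044
```

Note that filtering on `posterior > 0` handles the 0 * log(0) case, but a zero in `prior` where the posterior is positive still yields `Inf`, which matches the definition: an outcome the prior ruled out is infinitely surprising.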

Examples

# No surprise when prior equals posterior
kl_divergence(c(0.5, 0.5), c(0.5, 0.5))
#> [1] 0

# High surprise when distributions differ
kl_divergence(c(0.9, 0.1), c(0.5, 0.5))
#> [1] 0.5310044

# Maximum surprise when posterior is certain
kl_divergence(c(1.0, 0.0), c(0.5, 0.5))
#> [1] 1