Creates a model that compares observed events to the rates expected from a known baseline (e.g., population). This addresses "base rate bias", where patterns in visualizations are dominated by underlying factors such as population density.
Details
Under the base rate model, expected proportions are defined by the
expected vector. The likelihood measures how well observed data
matches these expected proportions:
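$$P(D|BaseRate) = 1 - \frac{1}{2} \sum_i |O_i - E_i|$$
Here $O_i$ and $E_i$ are the observed and expected proportions for region $i$. When both vectors sum to 1, the likelihood lies between 0 and 1, and equals 1 when the observed proportions match the expected proportions exactly.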
For example, if region A has 10% of the population, we expect 10% of events. Regions with event rates matching their population share show low surprise; regions with disproportionate rates show high surprise.
This is the primary tool for de-biasing choropleth maps.
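The formula above can be checked by hand with a few lines of base R. This is a minimal sketch, not the package's implementation: `baserate_likelihood()` is a hypothetical helper written for illustration, the event counts are made up, and it assumes both vectors are normalized to proportions before comparison.

```r
# Hypothetical helper (not part of the package API): evaluates
# 1 - 0.5 * sum(|O_i - E_i|) after normalizing both inputs to proportions.
baserate_likelihood <- function(observed, expected) {
  stopifnot(length(observed) == length(expected))
  o <- observed / sum(observed)  # observed proportions O_i
  e <- expected / sum(expected)  # expected proportions E_i
  1 - 0.5 * sum(abs(o - e))
}

population <- c(10000, 50000, 100000, 25000)

# Illustrative counts: events spread exactly in proportion to population
events_matching <- population / 100
baserate_likelihood(events_matching, population)  # 1 (perfect match)

# Illustrative counts: events concentrated in the smallest region
events_skewed <- c(900, 300, 400, 250)
baserate_likelihood(events_skewed, population)    # about 0.568
```

In the skewed case, the first region holds about 5% of the population but about 49% of the events, so it contributes most of the mismatch. The last region's share of events equals its share of population, so it contributes nothing.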
Examples
# Population-weighted base rate
population <- c(10000, 50000, 100000, 25000)
model <- bs_model_baserate(population)
# Use in model space
space <- model_space(
  bs_model_uniform(),
  bs_model_baserate(population)
)