Create a Base Rate Model — bs_model_baserate • bayesiansurpriser

Creates a model that compares observed events with the rates expected from a known baseline (e.g., population). This addresses "base rate bias", where patterns in a visualization are dominated by underlying factors such as population density.

Usage

bs_model_baserate(expected, normalize = TRUE, name = NULL)

Arguments

expected: Numeric vector of expected values or proportions, e.g., population counts, area sizes, or any other prior expectation.
normalize: Logical; should expected be normalized to sum to 1? Defaults to TRUE.
name: Optional name for the model

Value

A bs_model_baserate object

Details

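Under the base rate model, the expected proportions are given by the expected vector. The likelihood measures how closely the observed proportions match these expected proportions:

$$P(D|BaseRate) = 1 - \frac{1}{2} \sum_i |O_i - E_i|$$

Here O_i is the observed proportion of events in region i and E_i is the expected proportion for that region. The subtracted term is the total variation distance between the two distributions, so the likelihood equals 1 for a perfect match and decreases toward 0 as the distributions diverge.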

For example, if region A has 10% of the population, we expect 10% of events. Regions with event rates matching their population share show low surprise; regions with disproportionate rates show high surprise.
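As a rough illustration of the formula, the likelihood can be evaluated by hand in base R. This is only a sketch of the arithmetic and not necessarily how the package computes it internally: baserate_likelihood is a hypothetical helper, and the event counts are made up for the example.

```r
# Hypothetical helper, not part of bayesiansurpriser: evaluates
# 1 - 1/2 * sum(|O_i - E_i|) directly from raw counts.
baserate_likelihood <- function(observed, expected) {
  o <- observed / sum(observed)  # observed proportions O_i
  e <- expected / sum(expected)  # expected proportions E_i
  1 - 0.5 * sum(abs(o - e))
}

population <- c(10000, 50000, 100000, 25000)

# Events split exactly in proportion to population
matched <- baserate_likelihood(population / 100, population)

# Made-up counts with events over-represented in the smallest region
skewed <- baserate_likelihood(c(400, 500, 600, 350), population)

print(c(matched = matched, skewed = skewed))
```

With these inputs, matched is 1 and skewed is 29/37 (about 0.78): the first region holds 2/37 of the population but 8/37 of the events, which pulls the likelihood down.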

This is the primary tool for de-biasing choropleth maps.

Examples

library(bayesiansurpriser)

# Population-weighted base rate
population <- c(10000, 50000, 100000, 25000)
model <- bs_model_baserate(population)

# Use in model space
space <- model_space(
  bs_model_uniform(),
  bs_model_baserate(population)
)