Creates a model that normalizes observations by their expected standard error, accounting for varying sample sizes. This addresses "sampling error bias" where regions with small sample sizes show artificially high variability.
Arguments
- sample_size
Numeric vector of sample sizes (e.g., population)
- target_rate
Target rate/proportion. If NULL, estimated from data.
- type
Type of data: "count" (Poisson) or "proportion" (binomial)
- formula
Formula for likelihood computation:
"paper" (default): Uses the funnel score from the paper's unemployment data (dM = Z * sqrt(pop_frac)) and converts it to a two-tailed normal tail probability.
"poisson": Uses Poisson-based standard error and converts the resulting z-score to a two-tailed normal tail probability.
- control_limits
Numeric vector of control limits (in SDs) for funnel plot. Default is c(2, 3) for warning and control limits.
- name
Optional name for the model
Details
The de Moivre funnel model uses the insight that sampling variability decreases with sample size according to de Moivre's equation: $$SE = \sigma / \sqrt{n}$$
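As a rough illustration of de Moivre's equation, the base-R sketch below shows how the standard error shrinks with the square root of the sample size. It does not reflect the package's internals, and `sigma` and `n` are made-up values.

```r
# Illustrative values only: a hypothetical per-unit standard deviation
# and a few sample sizes.
sigma <- 0.3
n <- c(100, 400, 10000)

# de Moivre's equation: SE = sigma / sqrt(n)
se <- sigma / sqrt(n)
se
```

Quadrupling the sample size from 100 to 400 halves the standard error, which is why small regions scatter much more widely around the overall rate.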
With formula = "paper": The model uses the formula that matches the paper's unemployment reference data: $$Z = (rate - mean\_rate) / stddev\_rate$$ $$dM = Z \times \sqrt{population / total\_population}$$ $$P(D|M) = 2 \times \Phi(-|dM|)$$
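The three steps above can be sketched in base R as follows. This is only a sketch under stated assumptions: the rates are invented, and the use of an unweighted mean and sample standard deviation across regions is a guess at how mean_rate and stddev_rate are defined, not a confirmed detail of the package.

```r
# Hypothetical regional data (illustrative values, not from the paper)
rate <- c(0.08, 0.05, 0.06, 0.12)
population <- c(10000, 50000, 100000, 25000)

# Standardize each rate against the mean and SD across regions
# (unweighted mean and sample SD are assumptions of this sketch)
z <- (rate - mean(rate)) / sd(rate)

# Scale by the square root of each region's share of the total population
dm <- z * sqrt(population / sum(population))

# Two-tailed normal tail probability: P(D|M) = 2 * Phi(-|dM|)
p_d_given_m <- 2 * pnorm(-abs(dm))
p_d_given_m
```

Because every population share is below 1, each dM is smaller in magnitude than its Z, and the shrinkage is strongest for the smallest regions.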
With formula = "poisson": For count data (Poisson), the model computes z-scores as: $$z = (observed - expected) / \sqrt{expected}$$
For proportion data (binomial): $$z = (observed - expected) / \sqrt{p(1-p)/n}$$
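A minimal base-R sketch of both z-score formulas, assuming a known target rate and invented counts (the variable names are illustrative and are not the package's internals):

```r
# Hypothetical counts and populations (illustrative values only)
observed <- c(14, 48, 95, 40)
population <- c(10000, 50000, 100000, 25000)
target_rate <- 0.001  # assumed known for this sketch

# Count data (Poisson): the variance equals the expected count
expected <- target_rate * population
z_count <- (observed - expected) / sqrt(expected)

# Proportion data (binomial): compare observed and expected proportions
p_obs <- observed / population
z_prop <- (p_obs - target_rate) /
  sqrt(target_rate * (1 - target_rate) / population)

# Two-tailed normal tail probability for each count z-score
p_count <- 2 * pnorm(-abs(z_count))
```

With these numbers, the first region's rate is 40% above the target but its z-score is only about 1.26, whereas the fourth region reaches a z-score of 3. For rare events the two formulas nearly coincide, since they differ only by a factor of sqrt(1 - p).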
Observations with large z-scores (far from expected after accounting for sample size) are genuinely surprising, while high rates in small regions are discounted as expected variation.
This model is essential for:
- De-biasing per-capita rate maps
- Creating funnel plots
- Identifying genuine outliers vs. sampling noise
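To make the funnel-plot use case concrete, here is a hedged base-R sketch of how warning and control limits could be drawn around a target rate. It assumes binomial standard errors and uses invented inputs; it is not the package's plotting code.

```r
# Assumed inputs: a target rate and a grid of sample sizes (illustrative)
target_rate <- 0.001
n <- seq(1000, 100000, by = 1000)
control_limits <- c(2, 3)  # warning and control limits, in SDs

# Binomial standard error of a proportion at each sample size
se <- sqrt(target_rate * (1 - target_rate) / n)

# One column of upper and lower limits per control limit
upper <- sapply(control_limits, function(k) target_rate + k * se)
lower <- sapply(control_limits, function(k) pmax(target_rate - k * se, 0))

# Draw the funnel: dashed lines at 2 SDs, solid lines at 3 SDs
matplot(n, cbind(upper, lower), type = "l", lty = c(2, 1, 2, 1),
        col = "grey40", xlab = "Sample size", ylab = "Rate")
abline(h = target_rate)
```

The limits narrow toward the target rate as the sample size grows, so a region is flagged only when its rate falls outside the band for its own sample size.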
Examples
# Population sizes for regions
population <- c(10000, 50000, 100000, 25000)
# Funnel model using the paper's unemployment-reference formula
model <- bs_model_funnel(population, formula = "paper")
# Funnel model with known target rate
model <- bs_model_funnel(population, target_rate = 0.001)
# For proportion data with Poisson-based formula
model <- bs_model_funnel(population, type = "proportion", formula = "poisson")