Creates a non-parametric model using kernel density estimation (KDE). This model is built from a sample of the data and can detect when subsequent observations deviate from the pattern established by early data.

Usage

bs_model_sampled(
  sample_frac = NULL,
  kernel = "gaussian",
  bandwidth = "nrd0",
  n_grid = 512,
  sample_indices = NULL,
  name = NULL
)

Arguments

sample_frac

Fraction of the data used to build the prior; must satisfy 0 < sample_frac < 1. If NULL, all data are used for density estimation.

kernel

Kernel type for density estimation. One of: "gaussian", "epanechnikov", "rectangular", "triangular", "biweight", "cosine", "optcosine"

bandwidth

Bandwidth selection method or numeric value. If character, one of: "nrd0", "nrd", "ucv", "bcv", "SJ". If numeric, used directly as bandwidth.
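The selector names listed above match those accepted by base R's stats::density(). Assuming the model forwards bandwidth to that function (the source does not state this), the sketch below shows how a character selector resolves to a numeric bandwidth and how passing that number directly gives the same result. The data vector x is made up for illustration.

```r
# Illustrative base-R sketch; not the package's internal code.
set.seed(1)
x <- rnorm(100)  # hypothetical data

# A character selector is resolved to a number by the matching bw.* function.
bw_rule <- bw.nrd0(x)

# Passing the selector name or the resolved number yields the same bandwidth.
d_named   <- density(x, bw = "nrd0")
d_numeric <- density(x, bw = bw_rule)
same_bw   <- isTRUE(all.equal(d_named$bw, d_numeric$bw))

# Other selectors generally pick a different bandwidth for the same data.
bw_sj <- density(x, bw = "SJ")$bw
```

A numeric bandwidth is convenient when the same smoothing must be reused across data sets, since data-driven selectors recompute it for each sample.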

n_grid

Number of points in the density estimation grid (default: 512). Higher values give a finer evaluation grid but use more memory; the smoothness of the estimate itself is controlled by bandwidth.

sample_indices

Integer vector of specific observation indices to use for building the prior. If provided, overrides sample_frac.

name

Optional name for the model

Value

A bs_model_sampled object

Details

The sampled model builds a density estimate from a subset of observations (typically early observations in temporal data) and measures surprise as deviation from this learned distribution.

This is useful for:

  • Detecting temporal changes in distribution

  • Building a "post hoc" model from initial observations

  • Detecting emerging patterns in streaming data

The likelihood for each observation is the density at that point under the KDE built from the sample.
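The likelihood computation described above can be sketched with base R alone. This is a minimal illustration, not the package's implementation: it assumes the KDE is built with stats::density() and evaluated by linear interpolation over the grid, and the data, the 20% training share, and the floor value floor_density are all hypothetical choices.

```r
# Illustrative base-R sketch; not the package's internal code.
set.seed(42)
# Hypothetical series: 200 baseline values, then 50 values shifted upward.
x <- c(rnorm(200, mean = 0), rnorm(50, mean = 4))

# Build the KDE from the first 20% of observations (assumed training share).
train_idx <- seq_len(floor(0.2 * length(x)))
kde <- density(x[train_idx], kernel = "gaussian", bw = "nrd0", n = 512)

# Evaluate the estimated density at every observation by linear interpolation
# over the KDE grid; points outside the grid receive a small floor value.
floor_density <- 1e-10
lik <- approx(kde$x, kde$y, xout = x,
              yleft = floor_density, yright = floor_density)$y

# Average likelihood before and after the shift.
lik_baseline <- mean(lik[1:200])
lik_shifted  <- mean(lik[201:250])
```

Observations from the shifted regime fall in the tail of (or outside) the density learned from the early data, so their likelihoods are low, which is what signals a departure from the established pattern.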

Examples

# KDE model using first 10% of data
model <- bs_model_sampled(sample_frac = 0.1)

# KDE with a fixed numeric bandwidth
model <- bs_model_sampled(bandwidth = 5)

# Use specific observations for training
model <- bs_model_sampled(sample_indices = 1:10)