
Fits a robust state space model to multivariate time series data using iterative parameter estimation and outlier detection. This procedure is inspired by the Iterative Procedure for Outlier Detection (IPOD) algorithm of She and Owen (2011) and is applied over a sequence of regularization parameters (\(\lambda\)'s), identifying outliers via Mahalanobis residuals and re-fitting the model iteratively.

Usage

roams_SSM(
  y,
  init_par,
  build,
  num_lambdas = 20,
  custom_lambdas = NA,
  cores = 1,
  B = 50,
  lower = NA,
  upper = NA,
  tol = 1e-04,
  lambda_min = 2,
  excessive_outliers_iter_limit = 1,
  control = list(parscale = init_par)
)

Arguments

y

A numeric matrix of observations, with each row corresponding to a time point.

init_par

A numeric vector of initial parameter values for optimization.

build

A function that accepts a parameter vector and returns a dlm model (as used in dlm::dlmMLE()). The specify_SSM function can be used to create this build function.
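
As a minimal sketch of what such a build function might look like, the following defines a local level model with dlm::dlmModPoly(). The name build_local_level, the log-variance parameterization, and the starting values are illustrative assumptions rather than part of this package, and the dlm package is assumed to be installed.

```r
library(dlm)

# Hypothetical build function: a local level (random walk plus noise) model.
# par[1] and par[2] are log-variances, so exp() keeps both variances positive.
build_local_level <- function(par) {
  dlmModPoly(order = 1, dV = exp(par[1]), dW = exp(par[2]))
}

# Illustrative starting values on the log scale. They are kept nonzero here
# because the default control sets parscale = init_par.
init_par <- c(1, 1)

# Calling the build function returns a dlm object.
mod <- build_local_level(init_par)

# The build function would then be passed along with the data, e.g.:
# fit <- roams_SSM(y, init_par = init_par, build = build_local_level)
```

Parameterizing variances on the log scale is one common way to avoid needing explicit lower bounds; bounded parameterizations via lower and upper are an alternative.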

num_lambdas

Integer. The number of \(\lambda\) values to evaluate. Ignored if custom_lambdas is specified. Default is 20.

custom_lambdas

Optional numeric vector. If supplied, these are the exact \(\lambda\) values used for model fitting. If not provided or set to NA, then num_lambdas \(\lambda\)'s are automatically chosen.

cores

Integer. Number of CPU cores to use for parallel processing. Default is 1 (sequential execution).

B

Integer. Maximum number of iterations per \(\lambda\). Default is 50.

lower

Optional numeric vector of lower bounds for optimization. If NA, defaults to -Inf for all parameters. Must be the same length as init_par.

upper

Optional numeric vector of upper bounds for optimization. If NA, defaults to Inf for all parameters. Must be the same length as init_par.

tol

Tolerance level for checking convergence of the ROAMS procedure. Default is 1e-4.

lambda_min

Minimum \(\lambda\) value to consider when constructing the sequence of \(\lambda\)'s. Ignored if custom_lambdas is specified. Default is 2.

excessive_outliers_iter_limit

Integer. Maximum number of iterations allowed in which \(\ge 50\%\) of time points are flagged as outliers. This many outliers suggests that \(\lambda\) is too low, so capping these iterations lets ROAMS move through such \(\lambda\)'s more quickly. Default is 1.

control

A named list of control options to pass to optim via dlm::dlmMLE(). Default is list(parscale = init_par), which can help the optimizer if parameters are on vastly different scales.

Value

If more than one \(\lambda\) value is used, returns an object of class roams_SSM_list, a list containing a roams_SSM model for each \(\lambda\). If only one \(\lambda\) value is used (i.e. custom_lambdas is manually specified as a single value), returns a single roams_SSM object.

Each roams_SSM object includes:

  • lambda - The \(\lambda\) value used.

  • prop_outlying - Proportion of non-missing time points identified as outliers.

  • BIC - Bayesian Information Criterion of the final model.

  • loglik - Log-likelihood of the fitted model.

  • RSS - Residual sum of squares.

  • gamma - Matrix of estimated outlier adjustments.

  • iterations - Number of iterations performed.

  • Optimization output from dlm::dlmMLE() from the final iteration.

  • y - The original data matrix.

  • build - The original build function used to specify the model.
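
One way the per-\(\lambda\) results might be compared is by their BIC. The sketch below uses a mock list as a stand-in for a roams_SSM_list, with made-up values for the lambda, BIC and prop_outlying fields listed above. Choosing the fit with the smallest BIC is a common convention and an assumption here, not necessarily the selection rule this package intends.

```r
# Mock stand-in for a roams_SSM_list: one element per lambda.
# All numbers are invented purely for illustration.
fits <- list(
  list(lambda = 2, BIC = 310.4, prop_outlying = 0.42),
  list(lambda = 3, BIC = 285.1, prop_outlying = 0.08),
  list(lambda = 4, BIC = 292.7, prop_outlying = 0.01)
)

# Collect the BIC of every fit and keep the one that minimizes it.
bic  <- vapply(fits, function(f) f$BIC, numeric(1))
best <- fits[[which.min(bic)]]

best$lambda
```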

Details

The ROAMS procedure alternates between estimating model parameters via maximum likelihood and identifying outlying observations based on Mahalanobis distance of residuals. For each iteration:

  1. A dlm model is fit using dlm::dlmMLE().

  2. The Mahalanobis distances of the residuals (Mahalanobis residuals) are computed.

  3. Observations with Mahalanobis residual (plus a \(\log(|\mathbf{S}_{t|t-1}|)\) adjustment) above the current \(\lambda\) threshold are treated as missing in the next iteration.
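
The flagging in steps 2 and 3 can be sketched in base R as follows. The residuals, the covariance matrix S (held constant over time for simplicity), and the choice to compare the adjusted squared distance against \(\lambda^2\) are all assumptions made for illustration; the exact scaling used internally may differ.

```r
# One-step-ahead prediction residuals for 5 time points of a bivariate
# series (rows = time points). Values are illustrative only.
resid <- rbind(
  c( 0.1, -0.2),
  c( 0.3,  0.1),
  c( 4.0, -3.5),  # a gross outlier
  c(-0.2,  0.2),
  c( 0.0,  0.1)
)

# Assumed prediction covariance S_{t|t-1}, the same at every time point here.
S <- matrix(c(1.0, 0.3,
              0.3, 1.0), nrow = 2)
lambda <- 3

# Squared Mahalanobis distance of each residual from zero.
d2 <- mahalanobis(resid, center = c(0, 0), cov = S)

# Add the log-determinant adjustment and compare with the threshold
# (assumed here to be lambda^2 on the squared-distance scale).
score   <- d2 + log(det(S))
flagged <- score > lambda^2

# Time points that would be treated as missing in the next iteration.
which(flagged)
```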

The algorithm stops when the changes in the parameter and outlier estimates are sufficiently small, or when too many outliers are detected (more than 50% of complete observations).
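
A stopping check of this kind might look like the sketch below. The helper should_stop, its arguments, and the use of the largest absolute change as the convergence measure are hypothetical; the package may measure change differently and may allow several high-outlier iterations before stopping (see excessive_outliers_iter_limit).

```r
# Hypothetical helper: decide whether the iterations should stop.
# par_* are parameter vectors; gamma_* are matrices of outlier adjustments
# with one row per time point; n_complete is the number of complete
# observations.
should_stop <- function(par_old, par_new, gamma_old, gamma_new,
                        n_complete, tol = 1e-4) {
  # Largest absolute change in the parameters and in the outlier adjustments.
  delta_par   <- max(abs(par_new - par_old))
  delta_gamma <- max(abs(gamma_new - gamma_old))
  converged   <- max(delta_par, delta_gamma) < tol

  # A time point counts as outlying if any of its adjustments is nonzero.
  n_outlying <- sum(rowSums(gamma_new != 0) > 0)
  too_many   <- n_outlying / n_complete > 0.5

  converged || too_many
}
```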

References

She, Y., & Owen, A. B. (2011). Outlier Detection Using Nonconvex Penalized Regression. Journal of the American Statistical Association, 106(494), 626–639. https://doi.org/10.1198/jasa.2011.tm10390