State Space Model (SSM)¶
State-space models are appealing for loss ratio forecasting because they allow separation of observation noise from the true latent variability in the underlying ultimate loss ratios. The distinction is important, particularly in cases where the premium volume of a program changes significantly over time, or when extending the model to capture effects that influence the latent (but not observation) process.
Our State Space Model (SSM) can be viewed as a latent random walk with noise, with an AR(1) and optional MA(1) component. By default, the model is a latent ARMA(1, 1). The SSM is mathematically expressed as:
where \(\mathrm{LR}_i\) indicates the observed loss ratio for accident period \(i\), \(\eta_i\) is the latent log loss ratio for the same accident period, and \(\mathrm{UEP}_i\) is the used earned premium for the same accident period (see details below). The state space component of the model captures how the latent log loss ratio (\(\eta_i\)) evolves over time. The evolution of \(\eta_i\) is controlled by a reversion parameter (\(\phi_{\text{reversion}}\)), a momentum parameter (\(\gamma_{\text{momentum}}\)), and a latent noise parameter (\(\epsilon_{\text{latent}}\)). The reversion parameter controls how much the latent log loss ratio reverts to a target loss ratio (\(\mathrm{LR}_{\text{target}}\)) each period. The momentum parameter controls how much the latent log loss ratio is influenced by the previous period’s latent change, and the latent noise parameter controls how much latent change occurs each period.
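As a hedged sketch of the structure this paragraph describes, and not necessarily the exact specification, the observation model and the latent ARMA(1, 1) dynamics can be written as below. The innovation symbol \(\xi_i\), its Normal distribution, and the placement of \(\phi_{\text{reversion}}\) as the weight on the previous latent value (so that \(1 - \phi_{\text{reversion}}\) is the pull toward the target) are assumptions made for illustration.

\[
\begin{aligned}
\mathrm{LR}_i &\sim \mathrm{Gamma}\left(\exp(\eta_i), \sigma_i^2\right) \\
\eta_i &= \left(1 - \phi_{\text{reversion}}\right) \log \mathrm{LR}_{\text{target}} + \phi_{\text{reversion}} \, \eta_{i-1} + \gamma_{\text{momentum}} \, \xi_{i-1} + \xi_i \\
\xi_i &\sim \mathrm{Normal}\left(0, \epsilon_{\text{latent}}^2\right)
\end{aligned}
\]

Under these assumptions, setting \(\gamma_{\text{momentum}} = 0\) removes the MA(1) term and leaves a latent AR(1) process around \(\log \mathrm{LR}_{\text{target}}\).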
The SSM is specified such that \(\exp(\eta_i)\) is the expected loss ratio for each accident period, and the observed loss ratios are then assumed to be Gamma distributed, where \(\mathrm{Gamma}(\exp(\eta_i), \sigma_{i}^2)\) is the mean-variance parameterization of the Gamma distribution.
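Because most libraries parameterize the Gamma distribution by shape and scale rather than mean and variance, the conversion can be sketched as follows. The helper name `gamma_shape_scale` and the numeric values are hypothetical and chosen only for illustration; the identities used are shape = mean² / variance and scale = variance / mean.

```python
import math
import random


def gamma_shape_scale(mean, variance):
    """Convert a mean-variance Gamma parameterization to (shape, scale)."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    shape = mean**2 / variance  # k = mu^2 / sigma^2
    scale = variance / mean     # theta = sigma^2 / mu
    return shape, scale


# Hypothetical values for a single accident period.
eta = math.log(0.6)  # latent log loss ratio
sigma2 = 0.04        # observation variance

shape, scale = gamma_shape_scale(math.exp(eta), sigma2)

# Simulate observed loss ratios; random.gammavariate takes (shape, scale).
rng = random.Random(0)
simulated_lr = [rng.gammavariate(shape, scale) for _ in range(10_000)]
```

The simulated loss ratios are centered on \(\exp(\eta_i)\), and a larger \(\sigma_{i}^2\) widens their spread without moving the mean.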
Model Fit Configuration¶
The SSM model is fit using the following API call:
model = client.forecast_model.create(
    triangle=...,
    name="example_name",
    model_type="SSM",
    config={  # default model_config
        "loss_definition": "reported",
        "loss_family": "gamma",
        "include_momentum": True,
        "use_cape_cod": True,
        "use_measurement_error": False,
        "period_years": 1.0,
        "line_of_business": None,
        "informed_priors_version": None,
        "priors": None,  # see defaults below
        "recency_decay": 1.0,
        "seed": None,
    },
)
The SSM model accepts the following configuration parameters in config:
loss_definition: Name of the loss field to model in the underlying triangle (e.g., "reported", "paid", or "incurred"). Defaults to "reported".
loss_family: Outcome distribution family (e.g., "gamma", "lognormal", or "normal"). Defaults to "gamma".
include_momentum: Whether to include a momentum parameter in the model. Defaults to True, resulting in the ARMA(1, 1) model described above. If set to False, the momentum parameter is set to 0 and the momentum term drops out, leaving only the latent AR(1) process in the model.
use_cape_cod: Whether to use the Cape Cod method to down-weight more recent, greener accident periods based on their age-to-ultimate. Defaults to True, which estimates the used earned premium (UEP) by scaling the earned premium for each accident period by the ratio of observed losses to the ultimate losses estimated for that accident period. UEP is lower for more recent accident periods, which increases the \(\sigma_{i}^2\) term for those observations, effectively down-weighting them in the likelihood. If set to False, the model assumes that the UEP is equal to the input earned premium for each accident period. See Korn, 2021 ([1]) for more details on how the Cape Cod method functions in the context of an SSM on loss ratios.
use_measurement_error: Whether to include measurement error in the model. Defaults to False, which assumes that the means of the input ultimate loss ratios are the true loss ratios. If set to True, for losses in the input triangle that have associated uncertainty (i.e., posterior predictive distributions from a loss development or tail model), "true" ultimates are estimated given the mean and standard deviation of the observed/predicted ultimates, and these true ultimates are used as the outcome variable in the model (i.e., \(\mathrm{LR}_i\)) instead of the observed ultimates. Note that use_cape_cod and use_measurement_error should not both be set to True, as they are different ways to account for uncertainty in the ultimate loss ratios.
period_years: Number of years in each accident period. Defaults to 1.0, which means that each period is one year. If set to a value less than 1.0, the model treats each period as a fraction of a year (e.g., period_years=0.25 indicates that each \(i\) is an accident quarter). It is critical that this parameter is set correctly, because it is used to scale the priors in the model, all of which are scaled by default to accident years.
line_of_business: Line of business that the input triangle belongs to. If specified, backtest-informed priors leveraging industry data are used to fit the model. Must be provided if informed_priors_version is specified. Otherwise, defaults to None and the default priors below are used. Supported lines include: ["CA", "MC", "MO", "OO", "PC", "PO", "PP", "SL", "WC"]. Abbreviations map to the following lines:
{
    "CA": "Commercial Auto Liability",
    "MC": "Medical Liability: Claims Made",
    "MO": "Medical Liability: Occurrence",
    "OO": "Other Liability: Occurrence",
    "PC": "Product Liability: Claims Made",
    "PO": "Product Liability: Occurrence",
    "PP": "Private Passenger Auto",
    "SL": "Special Liability",
    "WC": "Workers' Compensation"
}
informed_priors_version: Version of the industry-informed priors to use when fitting the model. Supported versions currently only include: "2022". Specify as "latest" to always use the most up-to-date priors available. Defaults to "latest".
priors: Dictionary of prior distributions to use for model fitting. Default priors are:
{
    "target_log_lr_loc": -0.5,
    "target_log_lr_scale": 1.0,
    "reversion_logit_loc": 1.5,
    "reversion_logit_scale": 1.0,
    "latent_log_noise_loc": -2.0,
    "latent_log_noise_scale": 1.0,
    "obs_log_noise_loc": -1.0,
    "obs_log_noise_scale": 1.0,
    "base_log_noise_loc": -5.0,
    "base_log_noise_scale": 1.0,
    "momentum_logit_loc": -1.0,
    "momentum_logit_scale": 1.0,
}
recency_decay: Likelihood weight decay to down-weight older experience periods. Defaults to 1.0, which means no decay. If set to a value between 0.0 and 1.0, the likelihood of older experience periods will be down-weighted by a geometric decay function with factor recency_decay. See Geometric decay weighting for more information.
seed: Random seed for model fitting.
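To make the effect of use_cape_cod and recency_decay more concrete, the following minimal sketch computes used earned premium and geometric likelihood weights by hand. The function names and inputs are hypothetical and are not part of the API. The sketch assumes that the share of losses observed so far equals 1 divided by the age-to-ultimate factor, and that the most recent period receives weight 1 with each older period multiplied by one more factor of recency_decay.

```python
def used_earned_premium(earned_premium, age_to_ultimate):
    """Scale earned premium by the share of ultimate losses observed so far.

    Assumes the observed share is 1 / (age-to-ultimate factor).
    """
    return [ep / atu for ep, atu in zip(earned_premium, age_to_ultimate)]


def recency_weights(n_periods, recency_decay=1.0):
    """Geometric likelihood weights, oldest period first.

    Assumes weight = recency_decay ** (periods before the most recent one).
    """
    if not 0.0 < recency_decay <= 1.0:
        raise ValueError("recency_decay must be in (0, 1]")
    return [recency_decay ** (n_periods - 1 - i) for i in range(n_periods)]


# Hypothetical inputs, ordered from oldest to most recent accident period.
earned_premium = [1000.0, 1000.0, 1000.0]
age_to_ultimate = [1.0, 1.25, 2.0]

uep = used_earned_premium(earned_premium, age_to_ultimate)
weights = recency_weights(3, recency_decay=0.5)

print(uep)      # [1000.0, 800.0, 500.0]
print(weights)  # [0.25, 0.5, 1.0]
```

In this example the greenest accident period keeps only half of its earned premium as UEP, which raises its observation variance, while recency_decay independently reduces the likelihood weight of the oldest period.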
Model Predict Configuration¶
The SSM model is used to predict future losses using the following API call:
predictions = model.forecast_model.predict(
    triangle=...,
    config={  # default config
        "include_process_noise": True,
    },
    target_triangle=None,
)
Above, triangle is the triangle from which predictions start, and target_triangle is the triangle to make predictions on. For most use cases, triangle will be the same triangle that was used in model fitting, and target_triangle should be specified to include the future accident periods (including earned premium values) to be forecast.
The SSM prediction behavior can be further changed with configuration parameters in config:
include_process_noise: Whether to include process noise in the predictions. Defaults to True, which generates posterior predictions from the mathematical model as specified above. If set to False, the model will generate predictions without adding process noise to the predicted losses. Referring to the mathematical expression above, this equates to obtaining the expectation \(\exp(\eta_{i})\) as predictions as opposed to \(\mathrm{LR}_{i}\).
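The difference between the two settings can be illustrated with a small standalone sketch. This is not the library's implementation: the function `predict_loss_ratios`, the sample values, and the use of a single shared variance are hypothetical, and the Gamma draw uses the mean-variance to shape-scale conversion.

```python
import math
import random


def predict_loss_ratios(eta_samples, sigma2, include_process_noise=True, seed=None):
    """Turn posterior samples of the latent log loss ratio into predictions."""
    rng = random.Random(seed)
    predictions = []
    for eta in eta_samples:
        mean = math.exp(eta)  # expected loss ratio, exp(eta_i)
        if include_process_noise:
            # Draw LR_i from a Gamma with this mean and variance sigma2.
            shape = mean**2 / sigma2
            scale = sigma2 / mean
            predictions.append(rng.gammavariate(shape, scale))
        else:
            # Return the expectation only.
            predictions.append(mean)
    return predictions


# Hypothetical posterior samples of eta for one future accident period.
eta_samples = [math.log(0.55), math.log(0.60), math.log(0.65)]

expected_only = predict_loss_ratios(eta_samples, sigma2=0.01, include_process_noise=False)
with_noise = predict_loss_ratios(eta_samples, sigma2=0.01, seed=1)
```

With process noise excluded, the spread of the predictions reflects only uncertainty in \(\eta_i\); with it included, the predictions also carry the observation-level variability.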