Uncertainty

Model-level uncertainty

We compute the following uncertainty metrics for each model:

Count of observations (n)
Mean prediction residual (mean bias)
Standard deviation of prediction error (std)
Root mean squared error (rmse)
Coefficient of determination, i.e., share of variance explained (r2: \(R^2\))
Skew (skew)
Kurtosis (kurtosis)
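The metrics listed above can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: the function name, the dictionary keys, the sign convention (error = predicted minus observed), the use of population rather than sample moments, and the choice of excess kurtosis are all assumptions made for the example.

```python
import math


def model_uncertainty_metrics(observed, predicted):
    """Summarize prediction errors for one model.

    Assumed convention: error = predicted - observed.
    Moments are population moments (divide by n), also an assumption.
    """
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    mean_bias = sum(errors) / n

    # Central moments of the error distribution
    m2 = sum((e - mean_bias) ** 2 for e in errors) / n
    m3 = sum((e - mean_bias) ** 3 for e in errors) / n
    m4 = sum((e - mean_bias) ** 4 for e in errors) / n

    # Sums of squares for R^2 = 1 - SS_res / SS_tot
    obs_mean = sum(observed) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)

    return {
        "n": n,
        "mean_bias": mean_bias,
        "std": math.sqrt(m2),
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis (normal = 0)
    }
```

With population moments, the three error metrics are linked by rmse² = std² + (mean bias)², so a model with zero bias has identical std and rmse.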

Cross-validation

Model-level uncertainty metrics are derived via cross-validation.

We mainly use three blocking strategies, each reflecting a different prediction task:

random folds (r)

blocking by census block groups (bg)

blocking by census block groups, next-year forecasting (tbg)

Variations include:

next-year forecasting (t)

spatially blocked (s)

blocked by census tracts (tract)
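To illustrate how blocking and next-year forecasting combine (the tbg strategy), the sketch below assigns whole block groups to folds and then builds splits in which the test set holds one fold's sales in a given year and the training set holds the other folds' sales from earlier years. The record layout, the field names `block_group` and `year`, the round-robin fold assignment, and the choice to train on all earlier years (rather than only the preceding one) are assumptions made for this example.

```python
def blocked_folds(records, k, group_key="block_group"):
    """Assign each record a fold so that no group is split across folds.

    Groups are assigned round-robin in sorted order (deterministic);
    a real pipeline would likely shuffle groups first.
    """
    groups = sorted({r[group_key] for r in records})
    fold_of_group = {g: i % k for i, g in enumerate(groups)}
    return [fold_of_group[r[group_key]] for r in records]


def next_year_blocked_splits(records, k, group_key="block_group",
                             year_key="year"):
    """Yield (train_idx, test_idx) pairs for blocked next-year forecasting.

    Test: records of fold f observed in year y.
    Train: records of all other folds observed before year y.
    """
    folds = blocked_folds(records, k, group_key)
    years = sorted({r[year_key] for r in records})
    for y in years[1:]:  # the first year has no earlier data to train on
        for f in range(k):
            test = [i for i, r in enumerate(records)
                    if r[year_key] == y and folds[i] == f]
            train = [i for i, r in enumerate(records)
                     if r[year_key] < y and folds[i] != f]
            if train and test:
                yield train, test
```

Dropping the year condition gives plain spatial blocking (bg), and dropping the fold condition gives plain next-year forecasting (t).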

Parcel-level uncertainty

We compute the following uncertainty metrics for each parcel:

Statistical support

Parcel-level metrics that give users a measure of statistical support. They answer the question: to what extent is each parcel for which we develop FMV estimates similar to the parcels in the training data used to fit the corresponding model?

Area of Applicability (AOA)

An indicator of how “different” a given parcel is from the sample of parcel sales that were used to train the model (training data).

We use the “Area of Applicability” (AOA) metric proposed by Meyer & Pebesma (2021) [1]. The AOA identifies observations for which prediction errors are expected to fall within the empirically derived prediction uncertainties obtained through cross-validation.

The purpose of the AOA is to help analysts identify parcels whose predictor set is so different (“far away”) from the predictor sets of the training data (sales data) that we cannot assume that our approach to estimating parcel-level prediction uncertainties (prediction errors in cross-validation) works for them. In other words, estimates for such parcels are distant extrapolations, have weak statistical support, and should be considered speculative.

For our AOA estimates, we compute Meyer & Pebesma’s dissimilarity index and divide it by the threshold they propose for defining the (binary) Area of Applicability: the outlier-removed upper dissimilarity index of the training data.

This means that an AOA of ≤1 (equivalently, a ln(AOA) of ≤0) marks predictions that are considered sufficiently “similar” to the training sample to trust the estimated uncertainties, whereas an AOA of >1 identifies estimates for which the predictive model might be more biased or imprecise than our performance statistics suggest.
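The ratio described above can be sketched as follows. This is a simplified illustration rather than the method as implemented: predictors are standardized but not weighted by variable importance, the training dissimilarity index uses the nearest other training point (leave-one-out) instead of points from other cross-validation folds, and the threshold is taken as the largest training value not exceeding the boxplot upper fence (Q3 + 1.5 × IQR), which is one reading of “outlier-removed upper dissimilarity index”. The function names are hypothetical, and the training points are assumed to be distinct.

```python
import math
import statistics


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_aoa(train):
    """Precompute scaling, mean pairwise distance, and the DI threshold."""
    n, p = len(train), len(train[0])
    means = [sum(row[j] for row in train) / n for j in range(p)]
    # Fall back to 1.0 for constant predictors to avoid division by zero
    sds = [statistics.pstdev([row[j] for row in train]) or 1.0
           for j in range(p)]
    scaled = [[(row[j] - means[j]) / sds[j] for j in range(p)]
              for row in train]

    # Average distance between all pairs of training points
    pairs = [_dist(scaled[i], scaled[j])
             for i in range(n) for j in range(i + 1, n)]
    d_bar = sum(pairs) / len(pairs)

    # Training DI: distance to the nearest *other* training point
    # (assumption: leave-one-out stands in for "other CV folds")
    train_di = [min(_dist(scaled[i], scaled[j])
                    for j in range(n) if j != i) / d_bar
                for i in range(n)]

    # Threshold: largest training DI at or below the upper fence
    q1, _, q3 = statistics.quantiles(train_di, n=4, method="inclusive")
    upper_fence = q3 + 1.5 * (q3 - q1)
    threshold = max(d for d in train_di if d <= upper_fence)
    return means, sds, scaled, d_bar, threshold


def aoa_ratio(point, fitted):
    """Dissimilarity index of `point` divided by the threshold."""
    means, sds, scaled, d_bar, threshold = fitted
    z = [(point[j] - means[j]) / sds[j] for j in range(len(means))]
    di = min(_dist(z, s) for s in scaled) / d_bar
    return di / threshold
```

A parcel whose predictors coincide with a training sale gets a ratio of 0, and the ratio grows as the parcel moves away from every training sale in standardized predictor space; values above 1 flag extrapolation.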