Predictor sets

Predictor sets are the sets of predictors (independent variables) that a model sees.

Changing predictor sets across models allows us to study how predictor selection affects model performance and biases.
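As a minimal sketch of how two predictor sets can be compared, the snippet below copies the main set X from the specification table in this document and derives the "Main - climate/soil" variant by dropping every variable that starts with clim_ or f_soil_. The helper names (drop_prefixes, compare) are hypothetical and only illustrate the idea; they are not taken from the project's code.

```python
# Main predictor set "X", copied from the specification table.
X_MAIN = [
    "clim_ppt_summer", "clim_ppt_winter", "clim_tmean_summer", "clim_tmean_winter",
    "cst_2500", "cst_50", "elev",
    "f_soil_farmlandofuniqueimportance", "f_soil_local", "f_soil_prime", "f_soil_statewide",
    "fld_fr_fath_f100", "fld_fr_fath_p100", "hh_inc_med_bg_2012-2016",
    "lake_dist_asinh", "lake_exposure", "lat_id", "lat_id_rot45",
    "ln_bld_pop_exp_c4", "ln_ha", "long_id", "long_id_rot45",
    "m2_bld_fp_ihs", "n_bld_fp_ihs",
    "p_barren", "p_bld_fp_500", "p_bld_fp_5000", "p_crops", "p_forest",
    "p_grassland", "p_pasture", "p_prot_2010_5000", "p_shrub", "p_wet",
    "rd_dist_pvd+", "river_exposure", "slope", "travel", "year_cont",
]


def drop_prefixes(predictors, prefixes):
    """Return the predictors whose names start with none of the given prefixes."""
    return [p for p in predictors if not p.startswith(tuple(prefixes))]


def compare(base, other):
    """Report which predictors were added to or removed from `base` to get `other`."""
    return {
        "added": sorted(set(other) - set(base)),
        "removed": sorted(set(base) - set(other)),
    }


# "Main - climate/soil" (X-nosoilclim): remove climate and soil variables.
X_NOSOILCLIM = drop_prefixes(X_MAIN, ["clim_", "f_soil_"])

# "Main + irrigation" (X-irr): add the single irrigation variable.
X_IRR = X_MAIN + ["irr_2000_2020"]

print(compare(X_MAIN, X_NOSOILCLIM)["removed"])
print(compare(X_MAIN, X_IRR)["added"])
```

Comparing sets this way makes explicit which variables a model gains or loses relative to the main set, which is the information needed to attribute a change in performance to predictor selection.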

Predictor set specifications

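The table below lists every predictor set used in model specifications. Each entry gives the set's identifier, its label, its continuous predictors, and, where applicable, its dummied predictors.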

For the meaning of each variable, see the full list of predictors.

Predictors in the continuous column are used as is. Predictors in the dummied column are first transformed into categorical dummies (e.g., for year-quarter or regional fixed effects). Stacked (second-level) models use the predictions of first-level models as predictors, named lnusd-ha_<model identifier>.
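The difference between continuous and dummied predictors can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the function names, the toy rows, the "2010Q1" format of year_quarter values, and the "name=level" naming of dummy columns are all hypothetical.

```python
def one_hot(values):
    """Map a list of category values to 0/1 indicator columns, one per level."""
    levels = sorted(set(values))
    return {level: [1 if v == level else 0 for v in values] for level in levels}


def build_design_matrix(rows, continuous, dummied=()):
    """Build (column_names, matrix) from a list of row dicts.

    Continuous predictors pass through unchanged; each dummied predictor
    is expanded into one indicator column per observed level.
    """
    columns = {name: [row[name] for row in rows] for name in continuous}
    for name in dummied:
        for level, indicator in one_hot([row[name] for row in rows]).items():
            columns[f"{name}={level}"] = indicator  # hypothetical naming scheme
    names = list(columns)
    matrix = [[columns[n][i] for n in names] for i in range(len(rows))]
    return names, matrix


# Toy data; the values are made up for illustration only.
rows = [
    {"ln_ha": 1.2, "slope": 2.0, "year_quarter": "2010Q1"},
    {"ln_ha": 3.4, "slope": 5.5, "year_quarter": "2010Q2"},
    {"ln_ha": 0.5, "slope": 1.0, "year_quarter": "2010Q1"},
]
names, matrix = build_design_matrix(rows, ["ln_ha", "slope"], ["year_quarter"])
print(names)
for row in matrix:
    print(row)
```

The sketch keeps an indicator for every level. A linear model with an intercept would typically drop one reference level to avoid perfect collinearity; whether the models documented here do so is not stated in this section.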

identifier

label

continuous

dummied

X

Main predictor set

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-soilif

Main + soil “if”

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_primeif, f_soil_statewide, f_soil_statewideif, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-soildetail

Main + all soils

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_primedrain, f_soil_primeirrigate, f_soil_primeprotect, f_soil_statewide, f_soil_statewidedrain, f_soil_statewideirrigate, f_soil_statewideprotect, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-nosoilclim

Main - climate/soil

cst_2500, cst_50, elev, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-pnas

Nolte 2020 PNAS

cst_2500, cst_50, elev, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lat_id, lat_id_rot45, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, slope, travel, water_exposure, year_cont

X-irr

Main + irrigation

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, irr_2000_2020, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-bare

Nothing

lat_id, lat_id_rot45, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, year_cont

X-people

People

lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, year_cont

X-buildings

Buildings

lat_id, lat_id_rot45, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_bld_fp_500, p_bld_fp_5000, year_cont

X-wealth

Wealth

hh_inc_med_bg_2012-2016, lat_id, lat_id_rot45, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, year_cont

X-peoplewealth

People & wealth

hh_inc_med_bg_2012-2016, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, year_cont

X-peoplebuildings

People & buildings

lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_bld_fp_500, p_bld_fp_5000, year_cont

X-popbldwealth

People, buildings, wealth

hh_inc_med_bg_2012-2016, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_bld_fp_500, p_bld_fp_5000, year_cont

X-fp

Main predictor set

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, n_bld_fp_per_ha, p_barren, p_bld_fp, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-latlong

Main predictor set

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, ln_bld_pop_exp_c4, ln_ha, long_id, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-nonspatial

Main predictor set

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, ln_bld_pop_exp_c4, ln_ha, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-xy45

Main predictor set

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, ln_bld_pop_exp_c4, ln_ha, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, x, x45, y, y45, year_cont

X-blackhispanic

Main predictor set

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_black_bg_2012-2016, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_hispanic_bg_2012-2016, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-diverse

Main predictor set

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_asian_bg_2012-2016, p_barren, p_black_bg_2012-2016, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_hispanic_bg_2012-2016, p_mixed_bg_2012-2016, p_native_bg_2012-2016, p_pacific_bg_2012-2016, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-yha

Main predictor set, yha

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-yhac

Main predictor set, yhac

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel, year_cont

X-yqc

Main predictor set

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, hh_inc_med_bg_2012-2016, lake_dist_asinh, lake_exposure, lat_id, lat_id_rot45, ln_bld_pop_exp_c4, ln_ha, long_id, long_id_rot45, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel

X-main-lm_latlong_Xha^2_Xyqd

Main, curvature, dummies, Xha^2

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, lake_dist_asinh, lake_exposure, lat_id, ln_bld_pop_exp_c4, ln_bld_pop_exp_c4^2, ln_ha, ln_ha^2, ln_hh_inc_med_bg_2012-2016, ln_hh_inc_med_bg_2012-2016^2, long_id, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel

year_quarter

X-main-lm_latlong_Xha^2_Xyrd

Main, curvature, dummies, Xha^2

clim_ppt_summer, clim_ppt_winter, clim_tmean_summer, clim_tmean_winter, cst_2500, cst_50, elev, f_soil_farmlandofuniqueimportance, f_soil_local, f_soil_prime, f_soil_statewide, fld_fr_fath_f100, fld_fr_fath_p100, lake_dist_asinh, lake_exposure, lat_id, ln_bld_pop_exp_c4, ln_bld_pop_exp_c4^2, ln_ha, ln_ha^2, ln_hh_inc_med_bg_2012-2016, ln_hh_inc_med_bg_2012-2016^2, long_id, m2_bld_fp_ihs, n_bld_fp_ihs, p_barren, p_bld_fp_500, p_bld_fp_5000, p_crops, p_forest, p_grassland, p_pasture, p_prot_2010_5000, p_shrub, p_wet, rd_dist_pvd+, river_exposure, slope, travel

year

stack1

Stack1

lnusd-ha_region-nb, lnusd-ha_region-nb_lm, lnusd-ha_region-nb_hgb, lnusd-ha_region-nb_mv, lnusd-ha_region, lnusd-ha_county_lm, ln_ha, lat_id, long_id, lat_id_rot45, long_id_rot45, year_cont

stack2

Stack2

lnusd-ha_region-nb, lnusd-ha_region-nb_lm, lnusd-ha_region-nb_hgb, lnusd-ha_region-nb_bare, lnusd-ha_region-nb_mv, lnusd-ha_region-nb_mv_lm, lnusd-ha_conus_lm_Xyqd_high, ln_ha, lat_id, long_id, lat_id_rot45, long_id_rot45, year_cont
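The stacked sets follow the lnusd-ha_<model identifier> naming pattern described in this section. The sketch below rebuilds the stack1 and stack2 predictor lists from their first-level model identifiers and shows which first-level models each stack uses. It assumes that everything after the lnusd-ha_ prefix is a model identifier; the constant and function names are hypothetical.

```python
PREFIX = "lnusd-ha_"  # prefix of first-level prediction columns

# Model identifiers read off the stack1 and stack2 rows of the table.
STACK1_MODELS = ["region-nb", "region-nb_lm", "region-nb_hgb",
                 "region-nb_mv", "region", "county_lm"]
STACK2_MODELS = ["region-nb", "region-nb_lm", "region-nb_hgb", "region-nb_bare",
                 "region-nb_mv", "region-nb_mv_lm", "conus_lm_Xyqd_high"]

# Non-prediction predictors shared by both stacked sets.
STACK_EXTRA = ["ln_ha", "lat_id", "long_id", "lat_id_rot45", "long_id_rot45", "year_cont"]


def stacked_predictors(model_ids, extra):
    """Prediction columns of the first-level models, followed by extra predictors."""
    return [PREFIX + m for m in model_ids] + list(extra)


def first_level_models(predictors):
    """Recover model identifiers from a stacked predictor list."""
    return [p[len(PREFIX):] for p in predictors if p.startswith(PREFIX)]


stack1 = stacked_predictors(STACK1_MODELS, STACK_EXTRA)
stack2 = stacked_predictors(STACK2_MODELS, STACK_EXTRA)

# First-level models used by only one of the two stacks.
only_in_stack1 = sorted(set(STACK1_MODELS) - set(STACK2_MODELS))
only_in_stack2 = sorted(set(STACK2_MODELS) - set(STACK1_MODELS))
print(only_in_stack1)
print(only_in_stack2)
```

Keeping the model identifiers separate from the column prefix makes it easy to check that every first-level model a stack depends on has produced predictions before the second-level model is fitted.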