--- title: "1. Spatial disease mapping with SDALGCP2" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{1. Spatial disease mapping with SDALGCP2} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set(collapse = TRUE, comment = "#>", fig.width = 5, fig.height = 4.2, fig.align = "center", dpi = 100) ``` This tutorial fits a spatial disease-mapping model end to end. Every code block runs as shown, using the bundled example dataset `sdalgcp_data` so you can copy-paste and reproduce it exactly. ## The model We observe disease counts \(Y_i\) aggregated over areal units \(A_i\) (\(i=1,\dots,N\)) with an offset \(m_i\) (the expected count, e.g. population times a baseline rate). SDALGCP2 fits a **spatially discrete approximation to a log-Gaussian Cox process**: \[ Y_i \mid S \;\sim\; \mathrm{Poisson}\!\big(m_i\, e^{\eta_i}\big), \qquad \eta_i \;=\; d_i^\top\beta \;+\; S_i, \] where \(d_i\) are area-level covariates and \(S=(S_1,\dots,S_N)\) is a Gaussian spatial random effect. \(S_i\) is the average over area \(A_i\) of a continuous Gaussian process \(S(x)\) with exponential covariance \(\mathrm{Cov}\{S(x),S(x')\}=\sigma^2\exp(-\lVert x-x'\rVert/\phi)\); aggregating this process over the areas gives an \(N\times N\) covariance built from candidate points inside each region (see Tutorial 4). The parameters are the coefficients \(\beta\), the spatial variance \(\sigma^2\) and the range \(\phi\). Two quantities are reported for every area: * **Relative risk** \(\mathrm{RR}_i=e^{\eta_i}=e^{d_i^\top\beta+S_i}\) — the full relative risk, including the covariate effect (the `relative_risk` column); * **Covariate-adjusted relative risk** \(\mathrm{RR}^{\mathrm{adj}}_i=e^{S_i}\) — the residual spatial relative risk *after* adjusting for covariates (where is risk high/low beyond what the covariates explain?) — the `adjusted_rr` column. ## The data `sdalgcp()` takes an `sf` object whose columns hold the response, covariates and offset. The package ships a small simulated example, `sdalgcp_data`: 64 regions with a disease count (`cases`), a covariate (`x1`), and a population offset (`pop`). It was generated with a true covariate effect of `0.6` and a baseline log-rate of `-6`, so we can check the model recovers them. ```{r} library(SDALGCP2) library(sf) data(sdalgcp_data) head(sdalgcp_data) # crude standardised incidence ratio (SIR): observed / expected-at-overall-rate rate <- sum(sdalgcp_data$cases) / sum(sdalgcp_data$pop) sdalgcp_data$SIR <- sdalgcp_data$cases / (sdalgcp_data$pop * rate) ``` The crude SIR is noisy and over-fits sparsely populated areas — exactly what a model smooths: ```{r data-maps, fig.width = 7, fig.height = 3.4} plot(sdalgcp_data["SIR"], main = "Crude SIR (the data)") plot(sdalgcp_data["x1"], main = "Covariate x1") ``` ## Fit One call. Candidate-point spacing, the spatial range and MCMC settings are chosen automatically; `reanchor` re-simulates the latent field at the optimum a couple of times for reliable variance estimates. (We set a seed and a shorter MCMC run here so the vignette is quick and reproducible; the defaults are longer.) ```{r fit} set.seed(2024) fit <- sdalgcp(cases ~ x1 + offset(log(pop)), data = sdalgcp_data, control = sdalgcp_control(n_sim = 4000, burnin = 1000, thin = 5, reanchor = 1)) summary(fit) ``` The covariate effect `x1` is estimated close to its true value of 0.6, the spatial range `phi` and variance `sigma^2` describe the residual spatial structure, and the importance-sampling effective sample size reports how reliable the Monte Carlo likelihood is. ## Map the two relative risks ```{r risk-maps} plot(fit, "relative_risk") # relative risk exp(d'beta + S) plot(fit, "adjusted_rr") # covariate-adjusted relative risk exp(S) ``` `relative_risk` (left) is the overall pattern of risk; `adjusted_rr` (right) strips out the covariate contribution and shows the *unexplained* spatial signal — useful for spotting hotspots that the covariates do not account for. ## Uncertainty and exceedance Every quantity comes with a standard error, and you can ask for the probability that risk exceeds a policy threshold: ```{r exceedance-maps} plot(fit, "adjusted_rr_se") # standard error of the adjusted RR plot(fit, "exceedance", threshold = 1.5) # P(adjusted RR > 1.5) ``` The exceedance map is usually the most decision-relevant output: it flags areas that are confidently above the threshold rather than high by chance. By default the exceedance is computed for the covariate-adjusted relative risk (`which = "adjusted_rr"`); pass `which = "relative_risk"` for the full relative risk instead. ## A continuous surface Risk can also be predicted on a fine grid (change-of-support), giving a smooth surface instead of a choropleth: ```{r continuous} pc <- predict(fit, type = "continuous", sampler = "laplace", cellsize = 1) plot(pc, "adjusted_rr", bound = sdalgcp_data) ``` `predict()` returns an `sf` with all four quantities (`relative_risk`, `adjusted_rr` and their `_se`) for both `type = "discrete"` and `type = "continuous"`, and `exceedance()` works on either. ## Model checking Finally, check that the spatial term has absorbed the spatial structure: the Pearson residuals should show no leftover spatial autocorrelation. ```{r modelcheck} chk <- model_check(fit) chk$moran # residual Moran's I and its permutation p-value ``` A non-significant residual Moran's I indicates the model has captured the spatial pattern, and the observed-vs-fitted points lie around the identity line. ## Real data `sdalgcp_data` is simulated so we know the truth. For a real example, the package also ships `liver` — incident primary biliary cirrhosis counts by LSOA in North East England (Johnson et al. 2019), which you can fit the same way: ```{r liver, eval = FALSE, purl = FALSE} data(liver) fit_liver <- sdalgcp(cases ~ IMD + offset(log(pop)), data = liver) plot(fit_liver, "relative_risk") ``` ## Next - [Raster predictors](raster-covariates.html) — continuous covariates done right. - [Spatio-temporal](spatio-temporal.html) — space–time relative risk. - [Estimating the scale](scale-grid-vs-continuous.html) — what \(\phi\) is and how it is estimated. - [Spatial confounding](spatial-confounding.html) and [misaligned covariates](misaligned-covariates.html).