---
title: "1. Spatial disease mapping with SDALGCP2"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{1. Spatial disease mapping with SDALGCP2}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", fig.width = 5,
                      fig.height = 4.2, fig.align = "center", dpi = 100)
```

This tutorial fits a spatial disease-mapping model end to end. Every code block
runs as shown, using the bundled example dataset `sdalgcp_data` so you can
copy-paste and reproduce it exactly.

## The model

We observe disease counts \(Y_i\) aggregated over areal units \(A_i\)
(\(i=1,\dots,N\)) with an offset \(m_i\) (the expected count, e.g. population times
a baseline rate). SDALGCP2 fits a **spatially discrete approximation to a
log-Gaussian Cox process**:
\[
Y_i \mid S \;\sim\; \mathrm{Poisson}\!\big(m_i\, e^{\eta_i}\big),
\qquad
\eta_i \;=\; d_i^\top\beta \;+\; S_i,
\]
where \(d_i\) are area-level covariates and \(S=(S_1,\dots,S_N)\) is a Gaussian
spatial random effect. \(S_i\) is the average over area \(A_i\) of a continuous
Gaussian process \(S(x)\) with exponential covariance
\(\mathrm{Cov}\{S(x),S(x')\}=\sigma^2\exp(-\lVert x-x'\rVert/\phi)\); aggregating
this process over the areas gives an \(N\times N\) covariance built from candidate
points inside each region (see Tutorial&nbsp;4). The parameters are the
coefficients \(\beta\), the spatial variance \(\sigma^2\) and the range \(\phi\).

Two quantities are reported for every area:

* **Relative risk** \(\mathrm{RR}_i=e^{\eta_i}=e^{d_i^\top\beta+S_i}\) — the full
  relative risk, including the covariate effect (the `relative_risk` column);
* **Covariate-adjusted relative risk** \(\mathrm{RR}^{\mathrm{adj}}_i=e^{S_i}\) —
  the residual spatial relative risk *after* adjusting for covariates (where is
  risk high/low beyond what the covariates explain?) — the `adjusted_rr` column.

## The data

`sdalgcp()` takes an `sf` object whose columns hold the response, covariates and
offset. The package ships a small simulated example, `sdalgcp_data`: 64 regions
with a disease count (`cases`), a covariate (`x1`), and a population offset
(`pop`). It was generated with a true covariate effect of `0.6` and a baseline
log-rate of `-6`, so we can check the model recovers them.

```{r}
library(SDALGCP2)
library(sf)

data(sdalgcp_data)
head(sdalgcp_data)

# crude standardised incidence ratio (SIR): observed / expected-at-overall-rate
rate <- sum(sdalgcp_data$cases) / sum(sdalgcp_data$pop)
sdalgcp_data$SIR <- sdalgcp_data$cases / (sdalgcp_data$pop * rate)
```

The crude SIR is noisy and over-fits sparsely populated areas — exactly what a
model smooths:

```{r data-maps, fig.width = 7, fig.height = 3.4}
plot(sdalgcp_data["SIR"], main = "Crude SIR (the data)")
plot(sdalgcp_data["x1"],  main = "Covariate x1")
```

## Fit

One call. Candidate-point spacing, the spatial range and MCMC settings are chosen
automatically; `reanchor` re-simulates the latent field at the optimum a couple of
times for reliable variance estimates. (We set a seed and a shorter MCMC run here
so the vignette is quick and reproducible; the defaults are longer.)

```{r fit}
set.seed(2024)
fit <- sdalgcp(cases ~ x1 + offset(log(pop)), data = sdalgcp_data,
               control = sdalgcp_control(n_sim = 4000, burnin = 1000, thin = 5,
                                         reanchor = 1))
summary(fit)
```

The covariate effect `x1` is estimated close to its true value of 0.6, the spatial
range `phi` and variance `sigma^2` describe the residual spatial structure, and the
importance-sampling effective sample size reports how reliable the Monte Carlo
likelihood is.

## Map the two relative risks

```{r risk-maps}
plot(fit, "relative_risk")   # relative risk exp(d'beta + S)
plot(fit, "adjusted_rr")     # covariate-adjusted relative risk exp(S)
```

`relative_risk` (left) is the overall pattern of risk; `adjusted_rr` (right) strips
out the covariate contribution and shows the *unexplained* spatial signal — useful
for spotting hotspots that the covariates do not account for.

## Uncertainty and exceedance

Every quantity comes with a standard error, and you can ask for the probability
that risk exceeds a policy threshold:

```{r exceedance-maps}
plot(fit, "adjusted_rr_se")                  # standard error of the adjusted RR
plot(fit, "exceedance", threshold = 1.5)     # P(adjusted RR > 1.5)
```

The exceedance map is usually the most decision-relevant output: it flags areas
that are confidently above the threshold rather than high by chance. By default the
exceedance is computed for the covariate-adjusted relative risk (`which =
"adjusted_rr"`); pass `which = "relative_risk"` for the full relative risk instead.

## A continuous surface

Risk can also be predicted on a fine grid (change-of-support), giving a smooth
surface instead of a choropleth:

```{r continuous}
pc <- predict(fit, type = "continuous", sampler = "laplace", cellsize = 1)
plot(pc, "adjusted_rr", bound = sdalgcp_data)
```

`predict()` returns an `sf` with all four quantities (`relative_risk`,
`adjusted_rr` and their `_se`) for both `type = "discrete"` and
`type = "continuous"`, and `exceedance()` works on either.

## Model checking

Finally, check that the spatial term has absorbed the spatial structure: the
Pearson residuals should show no leftover spatial autocorrelation.

```{r modelcheck}
chk <- model_check(fit)
chk$moran   # residual Moran's I and its permutation p-value
```

A non-significant residual Moran's I indicates the model has captured the spatial
pattern, and the observed-vs-fitted points lie around the identity line.

## Real data

`sdalgcp_data` is simulated so we know the truth. For a real example, the package
also ships `liver` — incident primary biliary cirrhosis counts by LSOA in North
East England (Johnson et al. 2019), which you can fit the same way:

```{r liver, eval = FALSE, purl = FALSE}
data(liver)
fit_liver <- sdalgcp(cases ~ IMD + offset(log(pop)), data = liver)
plot(fit_liver, "relative_risk")
```

## Next

- [Raster predictors](raster-covariates.html) — continuous covariates done right.
- [Spatio-temporal](spatio-temporal.html) — space–time relative risk.
- [Estimating the scale](scale-grid-vs-continuous.html) — what \(\phi\) is and how
  it is estimated.
- [Spatial confounding](spatial-confounding.html) and
  [misaligned covariates](misaligned-covariates.html).