EDS 222: Week 10: In-class Lab

{ANSWER KEY}

2023-12-04

Note: This lab is built from an exercise provided by Soren Wilke (original post has been taken down).

Section 0: Getting set up

  1. Store this file in a Labs/ folder where you store all your lab materials for EDS 222.

  2. Install the following new (new to EDS 222) packages:1 Hint: for gstat you do not want to install from source.

  3. Load the following packages:

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(sf)
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(raster)
## Loading required package: sp
## 
## Attaching package: 'raster'
## 
## The following object is masked from 'package:dplyr':
## 
##     select
library(gstat)
library(automap)
library(patchwork)
## 
## Attaching package: 'patchwork'
## 
## The following object is masked from 'package:raster':
## 
##     area
library(viridis)
## Loading required package: viridisLite
options(scipen = 999) # disable scientific notation

# For labs, we want to see all our code
knitr::opts_chunk$set(echo = TRUE)
  1. Set your working directory
rootdir <- ("~/Dropbox/Teaching/UCSB/EDS_222/EDS222_data")

Section 1: Spatial point data

The data for this tutorial is a xyz-file giving us the depth of the groundwater table in a part of the German state of North Rhine-Westphalia in 1988.

We have measures of the groundwater level in 1988 at a limited number of monitoring wells, but our goal is ultimately to estimate the groundwater level throughout the surrounding region. This is often the case when studying groundwater, which is generally measured with a small set of monitoring wells.

Here let’s just take a look at our input data:

  1. Load the week-ten-data.RDS file into R using readRDS.

  2. Take a look – what is the format? How many dimensions? Why?

#wellobs <- readRDS("week-ten-data.RDS") #simple way, put in same directory

wellobs <- readRDS(file.path(rootdir, "labs", "10-week-ten", "week-ten-data.RDS"))
head(wellobs)
## # A tibble: 6 × 3
##        X       Y     Z
##    <dbl>   <dbl> <dbl>
## 1 407640 5732860  81.2
## 2 411040 5729860  93.8
## 3 411540 5735460  65.5
## 4 409840 5732460  87.6
## 5 404940 5733860  74.1
## 6 407840 5727160  66.2
  1. Use ggplot to plot the locations of the monitoring wells as points, with color indicating depth of the well. I’m going to use viridis because it’s aesthetically nice and theme_classic() to clean up the plot a bit.
wellobs %>% 
  ggplot(
  mapping = aes(x = X, y = Y, color = Z)) +
  geom_point(size = 3) + 
  scale_color_viridis(option = "B") + #optional
  theme_classic() #optional

  1. What do you notice about the spatial correlation in this plot? Does it seem reasonable to attempt spatial interpolation in this setting?

Answer: This looks like ideal data for spatial interpolation. Points are well distributed across the sample (i.e., no clusters with big open white spaces), and there appears to be spatial correlation indicating spatial information will be helpful in estimating depth in unmonitored locations.

Section 2: Creating a Variogram

As we discussed in class, kriging relies on an estimated or prescribed variogram, which is essentially a function describing the relationship between distance and covariance in “Z” in your point data. Here we are going to create an empirical variogram, which gives us the distribution of spatial dependencies observed in the data.

  1. First, we need to convert the data from a data.frame to an sf object, because that is usually the easiest way to save and work with spatial data. Use the st_as_sf function to do this, noting that the raw data come in the EPSG 25832 projection.

When you do this, make sure to save the coordinates “X” and “Y” in the resulting sf object so that you can use them later if needed (some forms of kriging rely on them directly).

wellobs_sf <- st_as_sf(wellobs, coords = c("X", "Y"), crs = 25832) %>% 
  cbind(st_coordinates(.))
  1. Second, use the function autofitVariogram from the automap package to compute semivariances as a function of distance and to fit a variogram model through those points. Recall from class that you can choose the type of variogram model you want to fit to the data, such as spherical or Gaussian. However, here we will let autofitVariogram choose the model that best fits the data for us.

The first argument of autofitVariogram is the variogram formula defining the response variable we’re interested in and any possible regressors which determine the mean. Here, we will assume we are only using our monitoring wells to interpolate, so we have no other regressors to include in our estimate of the mean. Therefore, use Z~1 as the formula. The second argument is our data (in sf format).

Save the column var_model from the output; lots of other things are also stored that we don’t want to worry about.

v_mod_full <- automap::autofitVariogram(Z~1, wellobs_sf)

v_mod <- v_mod_full$var_model
head(v_mod)
##   model    psill    range kappa
## 1   Nug   0.0000    0.000     0
## 2   Ste 155.5775 4858.866     5
  1. Plot this variogram model using plot() and passing the full model as an argument. Does it seem to fit the data well?
plot(v_mod_full)

# Note: text next to dots is the number of pairs used to estimate the semi variance!

Section 3: Define a target grid

In order to interpolate we need a target grid with a target set of coordinates (“X”, “Y”) for which the modeled variable (“Z”) to be interpolated. This means we come up with an empty grid or raster. Beware – the finer resolution the grid, the longer kriging will take.

  1. Define a grid based on the bounding box of our observations. I recommend using a 200 unit cell size (increasing cell size will make kriging run faster, but will lower the resolution of your output).
grd_sf <- st_make_grid(wellobs_sf, cellsize = 200)
plot(grd_sf)

Section 4: Ordinary Kriging

Kriging comes in several flavors, as discussed in class. We will here only focus on Ordinary Kriging, but other types are feasible in gstat.

Ordinary Kriging assumes that the mean in your target grid is constant and but not known to you.

In gstat, the main way to differentiate between different kinds of kriging (or any of the implemented algorithms) is controlled by the formula supplied. We have already seen that syntax when fitting the variogram – we use Z~1 to indicate Ordinary Kriging.

  1. Use the krige() function from gstat to generate a full set of predictions of groundwater levels at all locations across the region.
OK <- krige(
  Z~1, 
  wellobs_sf,
  grd_sf,
  model = v_mod
)
## [using ordinary kriging]

Investigate the dimensions of the OK object. krige() will save the predicted value in OK$var1.pred, and the estimated variance in OK$var1.var. Do these values look like they make sense to you?

dim(OK)
## [1] 2500    5
head(OK$var1.pred)
## [1] 87.20667 83.94198 80.78110 77.89202 75.46185 73.66812
head(OK$var1.var)
## [1] 0.293939503 0.166392995 0.091809313 0.047599558 0.021619267 0.007553611
  1. Take a look at your predictions (always a good idea) by plotting the raw monitoring data next to a raster of your kriging-based predictions. For the kriging output, use geom_raster().
# Plot the raw wells using geom_point()
p_raw = ggplot(
  data = wellobs,
  mapping = aes(x = X, y = Y, color = Z)
) +
  geom_point(size = 3) + 
  scale_color_viridis(option = "B",  limits = c(48, 101)) +
  ggtitle(label = "Observation Wells Sampled") +
  theme_void() +
    theme(
      plot.title = element_text(hjust = 0.5)
    )

# Plot the kriging output using geom_raster()
p_kriging <-  ggplot(OK, aes(x = x, y = y, fill = var1.pred)) +
    geom_raster() +
    ggtitle(label = "Ordinary Kriging") +
    scale_fill_viridis(option = "B", limits = c(48, 101)) +
    theme_void() +
    theme(
      plot.title = element_text(hjust = 0.5)
    ) + guides(fill=guide_legend(title="Z", reverse=TRUE))

p_raw + p_kriging

Based on this visual, how well do you think ordinary kriging worked? What other methods would you consider trying?