Note: This lab is built from an exercise provided by Soren Wilke (original post has been taken down).
Store this file in a Labs/ folder where you store all your lab materials for EDS 222.
Install the following new (new to EDS 222) packages:1 Hint: for gstat
you do not want to install from source.
Load the following packages:
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(sf)
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(raster)
## Loading required package: sp
##
## Attaching package: 'raster'
##
## The following object is masked from 'package:dplyr':
##
## select
library(gstat)
library(automap)
library(patchwork)
##
## Attaching package: 'patchwork'
##
## The following object is masked from 'package:raster':
##
## area
library(viridis)
## Loading required package: viridisLite
options(scipen = 999) # disable scientific notation
# For labs, we want to see all our code
knitr::opts_chunk$set(echo = TRUE)
rootdir <- ("~/Dropbox/Teaching/UCSB/EDS_222/EDS222_data")
The data for this tutorial is a xyz-file giving us the depth of the groundwater table in a part of the German state of North Rhine-Westphalia in 1988.
We have measures of the groundwater level in 1988 at a limited number of monitoring wells, but our goal is ultimately to estimate the groundwater level throughout the surrounding region. This is often the case when studying groundwater, which is generally measured with a small set of monitoring wells.
Here let’s just take a look at our input data:
Load the week-ten-data.RDS
file into R using readRDS
.
Take a look – what is the format? How many dimensions? Why?
#wellobs <- readRDS("week-ten-data.RDS") #simple way, put in same directory
wellobs <- readRDS(file.path(rootdir, "labs", "10-week-ten", "week-ten-data.RDS"))
head(wellobs)
## # A tibble: 6 × 3
## X Y Z
## <dbl> <dbl> <dbl>
## 1 407640 5732860 81.2
## 2 411040 5729860 93.8
## 3 411540 5735460 65.5
## 4 409840 5732460 87.6
## 5 404940 5733860 74.1
## 6 407840 5727160 66.2
ggplot
to plot the locations of the monitoring wells as points, with color indicating depth of the well. I’m going to use viridis
because it’s aesthetically nice and theme_classic()
to clean up the plot a bit.wellobs %>%
ggplot(
mapping = aes(x = X, y = Y, color = Z)) +
geom_point(size = 3) +
scale_color_viridis(option = "B") + #optional
theme_classic() #optional
Answer: This looks like ideal data for spatial interpolation. Points are well distributed across the sample (i.e., no clusters with big open white spaces), and there appears to be spatial correlation indicating spatial information will be helpful in estimating depth in unmonitored locations.
As we discussed in class, kriging relies on an estimated or prescribed variogram, which is essentially a function describing the relationship between distance and covariance in “Z” in your point data. Here we are going to create an empirical variogram, which gives us the distribution of spatial dependencies observed in the data.
data.frame
to an sf
object, because that is usually the easiest way to save and work with spatial data. Use the st_as_sf
function to do this, noting that the raw data come in the EPSG 25832 projection.When you do this, make sure to save the coordinates “X” and “Y” in the resulting sf
object so that you can use them later if needed (some forms of kriging rely on them directly).
wellobs_sf <- st_as_sf(wellobs, coords = c("X", "Y"), crs = 25832) %>%
cbind(st_coordinates(.))
autofitVariogram
from the automap
package to compute semivariances as a function of distance and to fit a variogram model through those points. Recall from class that you can choose the type of variogram model you want to fit to the data, such as spherical or Gaussian. However, here we will let autofitVariogram
choose the model that best fits the data for us.The first argument of autofitVariogram
is the variogram formula defining the response variable we’re interested in and any possible regressors which determine the mean. Here, we will assume we are only using our monitoring wells to interpolate, so we have no other regressors to include in our estimate of the mean. Therefore, use Z~1
as the formula. The second argument is our data (in sf
format).
Save the column var_model
from the output; lots of other things are also stored that we don’t want to worry about.
v_mod_full <- automap::autofitVariogram(Z~1, wellobs_sf)
v_mod <- v_mod_full$var_model
head(v_mod)
## model psill range kappa
## 1 Nug 0.0000 0.000 0
## 2 Ste 155.5775 4858.866 5
plot()
and passing the full model as an argument. Does it seem to fit the data well?plot(v_mod_full)
# Note: text next to dots is the number of pairs used to estimate the semi variance!
In order to interpolate we need a target grid with a target set of coordinates (“X”, “Y”) for which the modeled variable (“Z”) to be interpolated. This means we come up with an empty grid or raster. Beware – the finer resolution the grid, the longer kriging will take.
grd_sf <- st_make_grid(wellobs_sf, cellsize = 200)
plot(grd_sf)
Kriging comes in several flavors, as discussed in class. We will here only focus on Ordinary Kriging, but other types are feasible in gstat
.
Ordinary Kriging assumes that the mean in your target grid is constant and but not known to you.
In gstat
, the main way to differentiate between different kinds of kriging (or any of the implemented algorithms) is controlled by the formula supplied. We have already seen that syntax when fitting the variogram – we use Z~1
to indicate Ordinary Kriging.
krige()
function from gstat
to generate a full set of predictions of groundwater levels at all locations across the region.OK <- krige(
Z~1,
wellobs_sf,
grd_sf,
model = v_mod
)
## [using ordinary kriging]
Investigate the dimensions of the OK
object. krige()
will save the predicted value in OK$var1.pred
, and the estimated variance in OK$var1.var
. Do these values look like they make sense to you?
dim(OK)
## [1] 2500 5
head(OK$var1.pred)
## [1] 87.20667 83.94198 80.78110 77.89202 75.46185 73.66812
head(OK$var1.var)
## [1] 0.293939503 0.166392995 0.091809313 0.047599558 0.021619267 0.007553611
geom_raster()
.# Plot the raw wells using geom_point()
p_raw = ggplot(
data = wellobs,
mapping = aes(x = X, y = Y, color = Z)
) +
geom_point(size = 3) +
scale_color_viridis(option = "B", limits = c(48, 101)) +
ggtitle(label = "Observation Wells Sampled") +
theme_void() +
theme(
plot.title = element_text(hjust = 0.5)
)
# Plot the kriging output using geom_raster()
p_kriging <- ggplot(OK, aes(x = x, y = y, fill = var1.pred)) +
geom_raster() +
ggtitle(label = "Ordinary Kriging") +
scale_fill_viridis(option = "B", limits = c(48, 101)) +
theme_void() +
theme(
plot.title = element_text(hjust = 0.5)
) + guides(fill=guide_legend(title="Z", reverse=TRUE))
p_raw + p_kriging
Based on this visual, how well do you think ordinary kriging worked? What other methods would you consider trying?