Snapping Points (USGS Gages) to Lines (Rivers) and Calculate Distances Along Lines

This may seem a fairly straight forward task, and it does use some relatively simple geospatial operations, but collectively I found this to be a bit trickier to do (in R). I wanted to post something here that would provide some examples of how to do this, since I need to do things like this quite often. There are probably several ways to do this, but I’ll show one option, mainly using {sf}. There are some interesting and rather annoying precision issues that crop up when doing this, so you’ll see the work around I’ve used below. A fair amount of googling and checking SO found that there are many who have run into this precision issue, so be kind to yourself!

I’m using a couple great packages here that will help use do a few things (including calculate distances along a line…see the next tutorial for more info). Namely, {nhdplusTools}, which is an immensely useful and powerful package for any person working with hydrology/river/streams (or wanting to make a map with river lines on it). In addition, we’ll need some of the other usual suspects:

library(sf) # spatial operations
library(mapview) # html mapping
library(leaflet) # html mapping
library(ggplot2) # plotting
library(dplyr) # wrangling data
library(here) # setting directories safely
library(viridis) # color scheme
library(USAboundaries) # county/state boundaries
library(nhdplusTools) # USGS/NHD rivers data

Important, before proceeding, you’ll want to make sure you’re connected to the internet. In addition, there have been some issues with downloading {nhdplusTools} data from webservers, so if it doesn’t work, try again later. Note: if you want to skip downloading data via {nhdplusTools}, all required data from this post is here if you want to download and unzip. Skip to [Snapping Points to Lines]

Get Data: Using {nhdplusTools}

First, let’s pick a single point (lat/lon), and make it spatial, and then show how we can use that point to download river lines upstream and downstream of that point, and search for USGS gages along the streamlines associated with this point. Let’s pick a point in one of the most beautiful National Parks, Yosemite! Once we have a point that is now “spatial” (see class()), we can use the {nhhplusTools} package to look up the nearest comid, which is the NHD ID for a single stream segment.

library(nhdplusTools)

# create a point in yosemite valley
yose <- st_sfc(st_point(c(-119.60020, 37.73787)), crs = 4326)

# check class is "sfc" and "sfc_POINT"
class(yose)
## [1] "sfc_POINT" "sfc"
# now figure out the nearest stream segment ID to our point
(yose_comid <- discover_nhdplus_id(yose))
## [1] 21609461

Great, so we can use the comid=21609461 to look up and download streamlines (or “flowlines”) for our area of interest, both upstream and downstream. First, let’s use a handy function in {nhhplusTools} which allows us to look up and see what data is available for our comid.

# first make a list defining the sourcetype and ID
yose_list <- list(featureSource="comid", featureID=yose_comid)

# what data is available?
discover_nldi_navigation(yose_list)
## $upstreamMain
## [1] "https://labs.waterdata.usgs.gov/api/nldi/linked-data/comid/21609461/navigate/UM"
## 
## $upstreamTributaries
## [1] "https://labs.waterdata.usgs.gov/api/nldi/linked-data/comid/21609461/navigate/UT"
## 
## $downstreamMain
## [1] "https://labs.waterdata.usgs.gov/api/nldi/linked-data/comid/21609461/navigate/DM"
## 
## $downstreamDiversions
## [1] "https://labs.waterdata.usgs.gov/api/nldi/linked-data/comid/21609461/navigate/DD"

Download Flowline Data

Once we know there are data available upstream and downstream, and we have a starting comid, we can move forward and download some flowline data. Here we’ll start with all the upstream segments (everything that flows into the selected stream segment we identified). We can specify a distance we want to keep our search within (so only get flowlines that extend a set distance away from our point/segment), but don’t do that here. Importantly, we need to specify data_source = "" to force the function to search and download the data from the web. Here we download the upstream and downstream flowlines, but for downstream, we specify only the “mainstem”, which in this case is the Merced River.

  • DM = “Downstream Mainstem”, UM = “Upstream Mainstem”, UT = “Upstream Tributaries”

Finally, if we want to download this data as a geopackage that we can use locally, we can use a list of all the comids and download the data using the handy subset_nhdplus function. Be aware that if you are downloading large areas of data, this can take a second, and can be large in size. However, generally it’s much smaller and faster than the entire NHDPlus dataset which is over 15 GB in size for the entire USA.

# get upstream flowlines
yose_us_flowlines <- navigate_nldi(nldi_feature = yose_list,
                                   mode="UT",
                                   data_source = "")


# get downstream mainstem only (from our starting segment):
yose_ds_flowlines <- navigate_nldi(nldi_feature = yose_list, 
                             mode = "DM", 
                             #distance_km = 50,
                             data_source = "")


# make a list of all the comids we've identified:
all_comids <- c(yose_us_flowlines$nhdplus_comid, yose_ds_flowlines$nhdplus_comid)

# download all data and create a geopackage with the comid list
yose_gpkg <- subset_nhdplus(comids=all_comids,
                              simplified = TRUE,
                              overwrite = TRUE,
                              output_file = paste0(here::here(), "/data/yose_nhdplus.gpkg"),
                              nhdplus_data = "download",
                              return_data = FALSE)

# check layers in database:
st_layers(paste0(here::here(), "/data/yose_nhdplus.gpkg"))

# pull the flowlines back in
yose_streams <- read_sf(paste0(here::here(), "/data/yose_nhdplus.gpkg"), "NHDFlowline_Network")
## Driver: GPKG 
## Available layers:
##            layer_name        geometry_type features fields
## 1 NHDFlowline_Network 3D Multi Line String      639     91
## 2         CatchmentSP        Multi Polygon      631      8
## 3             NHDArea        Multi Polygon      100     16
## 4        NHDWaterbody        Multi Polygon      877     23
## 5       yose_us_gages                Point        1      7
## 6       yose_ds_gages                Point       13      7

Let’s take a look at what we’ve got so far, using some basic mapping options and the {sf} package.

# make a map
prettymapr::prettymap({
  rosm::osm.plot(project = FALSE, 
                 bbox = matrix(st_bbox(yose_streams), byrow = FALSE, ncol = 2, 
                               dimnames = list(c("x", "y"), c("min", "max"))), 
                 type = "cartolight", quiet = TRUE, progress = "none")
  plot(yose_streams$geom, col = "steelblue", lwd = (yose_streams$streamorde / 4), add=TRUE)
  plot(yose, add=TRUE, pch=21, bg="orange", cex=1.5)
  prettymapr::addnortharrow()
})
A map of the mainstem Merced River flowing into the San Joaquin, with all headwater tributaries that flow into Yosemite Valley at our selected comid river segment (orange dot). The line width of the streamlines have been scaled by the stream order

A map of the mainstem Merced River flowing into the San Joaquin, with all headwater tributaries that flow into Yosemite Valley at our selected comid river segment (orange dot). The line width of the streamlines have been scaled by the stream order

Find & Download Nearby USGS Gages

Now that we have flowlines, we can search along these flowlines for any USGS gage locations. We’ll use these to snap points to lines, and then calculate distances along these lines. First we look at upstream flowlines…and find one gage! Let’s use that gage to search downstream, just as an example of using gages or comids to search/download data.

# find upstream gages
yose_us_gages <- navigate_nldi(yose_list,
              mode = "UT",
              data_source = "nwissite")

# get downstream everything from our only upstream gage (Happy Isles)
usgs_point <- list(featureSource="nwissite", featureID = "USGS-11264500")

# find all downstream gages on the mainstem river (Merced/San Joaquin)
yose_ds_gages <- navigate_nldi(yose_list,
              mode = "DM",
              #distance_km = 50,
              data_source = "nwissite",
              )

# let's add these data to our geopackage as well
# remember it's best to have everything in the same projection
st_crs(yose_streams)==st_crs(yose_us_gages)

# write to geopackage: overwite the layer if it exists
st_write(yose_us_gages, dsn=paste0(here::here(),"/data/yose_nhdplus.gpkg"), 
         layer="yose_us_gages", append = FALSE, delete_layer = TRUE)

st_write(yose_ds_gages, dsn=paste0(here::here(),"/data/yose_nhdplus.gpkg"), 
         layer="yose_ds_gages", append = FALSE, delete_layer = TRUE)

# check layers:
st_layers(paste0(here::here(), "/data/yose_nhdplus.gpkg"))

Let’s take a quick look at our data now using an interactive map with the {mapview} package. Here flowline color is scaled by the stream order (stream size).

m1 <- mapview(yose, col.regions="black", cex=6, layer.name="Start Point") + 
  mapview(yose_streams, zcol="streamorde", legend=TRUE, layer.name="Stream <br> Order") + 
  mapview(yose_us_gages, col.regions="orange", layer.name="U/S Gage") +
    mapview(yose_ds_gages, col.regions="maroon", layer.name="D/S Gages")

# add a measurement tool
m1@map %>% leaflet::addMeasure(primaryLengthUnit = "kilometers") %>%
  leaflet.extras::addFullscreenControl(position = "topleft")