diff --git a/Rubric.md b/Rubric.md new file mode 100644 index 0000000..6993efe --- /dev/null +++ b/Rubric.md @@ -0,0 +1,15 @@ +### General Rubric +- 2 pt for use of git hub + - are both users pushing/pulling? + - have they successfully dealt with merge conflicts? + - have they updated their repository README with names and module details? +- 2 pts for travis + - is their build passing? + - is the travis badge included and displaying correctly on their .md documents +- 2 pts for Rmarkdown + - do they include a .md file along with their .Rmd assignment file? + - do their plots all show correctly? + - is there evidence that they have attempted to clean or format their markdown? +- 6 pts for Module Code Goals + - demonstrate comfort manipulating vector data in the sf library, maintaining geometry + - load and manipulate raster spatial data, including projections diff --git a/spatial/rasters/annual_npp.tif b/spatial/rasters/annual_npp.tif index a6fd12d..24bc625 100644 Binary files a/spatial/rasters/annual_npp.tif and b/spatial/rasters/annual_npp.tif differ diff --git a/spatial/shapefiles/.gitignore b/spatial/shapefiles/.gitignore deleted file mode 100644 index 9b34496..0000000 --- a/spatial/shapefiles/.gitignore +++ /dev/null @@ -1,2 +0,0 @@ -mpas.zip -mpa_inventory_2014_* diff --git a/spatial/shapefiles/US_wc_bbox.dbf b/spatial/shapefiles/US_wc_bbox.dbf deleted file mode 100644 index 69232e2..0000000 Binary files a/spatial/shapefiles/US_wc_bbox.dbf and /dev/null differ diff --git a/spatial/shapefiles/US_wc_bbox.prj b/spatial/shapefiles/US_wc_bbox.prj deleted file mode 100644 index a30c00a..0000000 --- a/spatial/shapefiles/US_wc_bbox.prj +++ /dev/null @@ -1 +0,0 @@ -GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]] \ No newline at end of file diff --git a/spatial/shapefiles/US_wc_bbox.shp b/spatial/shapefiles/US_wc_bbox.shp deleted file mode 100644 index 7a4b453..0000000 Binary files a/spatial/shapefiles/US_wc_bbox.shp and /dev/null differ diff --git a/spatial/shapefiles/US_wc_bbox.shx b/spatial/shapefiles/US_wc_bbox.shx deleted file mode 100644 index aff265d..0000000 Binary files a/spatial/shapefiles/US_wc_bbox.shx and /dev/null differ diff --git a/spatial/shapefiles/mpas_westcoast.dbf b/spatial/shapefiles/mpas_westcoast.dbf new file mode 100644 index 0000000..cc339a0 Binary files /dev/null and b/spatial/shapefiles/mpas_westcoast.dbf differ diff --git a/spatial/shapefiles/mpas_westcoast.prj b/spatial/shapefiles/mpas_westcoast.prj new file mode 100644 index 0000000..c13dada --- /dev/null +++ b/spatial/shapefiles/mpas_westcoast.prj @@ -0,0 +1 @@ +PROJCS["Albers",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137,298.257222101]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]],PROJECTION["Albers"],PARAMETER["standard_parallel_1",29.5],PARAMETER["standard_parallel_2",45.5],PARAMETER["latitude_of_origin",37.5],PARAMETER["central_meridian",-96],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["Meter",1]] \ No newline at end of file diff --git a/spatial/shapefiles/mpas_westcoast.shp b/spatial/shapefiles/mpas_westcoast.shp new file mode 100644 index 0000000..01dbaeb Binary files /dev/null and b/spatial/shapefiles/mpas_westcoast.shp differ diff --git a/spatial/shapefiles/mpas_westcoast.shx b/spatial/shapefiles/mpas_westcoast.shx new file mode 100644 index 0000000..e7e9cb4 Binary files /dev/null and b/spatial/shapefiles/mpas_westcoast.shx differ diff --git a/spatial/shapefiles/wc_regions.dbf b/spatial/shapefiles/wc_regions.dbf deleted file mode 100644 index 4a5c4fa..0000000 Binary files a/spatial/shapefiles/wc_regions.dbf and /dev/null differ diff --git a/spatial/shapefiles/wc_regions.prj b/spatial/shapefiles/wc_regions.prj deleted file mode 100644 index a30c00a..0000000 --- a/spatial/shapefiles/wc_regions.prj +++ /dev/null @@ -1 +0,0 @@ -GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]] \ No newline at end of file diff --git a/spatial/shapefiles/wc_regions.shp b/spatial/shapefiles/wc_regions.shp deleted file mode 100644 index 6ada918..0000000 Binary files a/spatial/shapefiles/wc_regions.shp and /dev/null differ diff --git a/spatial/shapefiles/wc_regions.shx b/spatial/shapefiles/wc_regions.shx deleted file mode 100644 index 8544ff9..0000000 Binary files a/spatial/shapefiles/wc_regions.shx and /dev/null differ diff --git a/spatial/spatial-assignment.Rmd b/spatial/spatial-assignment.Rmd index 0a412fc..3f8b957 100644 --- a/spatial/spatial-assignment.Rmd +++ b/spatial/spatial-assignment.Rmd @@ -1,12 +1,12 @@ --- -title: "Spatial Assignment" -author: +title: "Spatial Assignment Answers" +author: "Dana Seidel (GSI)" output: github_document always_allow_html: yes --- ```{r setup, include=FALSE} -knitr::opts_chunk$set(messages = FALSE, cache = TRUE) +knitr::opts_chunk$set(messages = FALSE, cache = FALSE) ``` # Exercise @@ -21,24 +21,24 @@ We will map potential areas of marine aquaculture for the super cute Pacific spi ![They have adhesive pelvic disks! How cute!](./images/lumpsucker.png) -We will answer this question by taking into consideration the following spatial data: +To do this we are going to need the following spatial data: **1. Sea Surface Temperature** (raster data) **2. Net Primary Productivity** (raster data) **3. Marine Protected Areas** (vector data) ```{r libraries, include= FALSE} + +devtools::install_github("tidyverse/ggplot2") #installing the developer version of ggplot for the geom_sf() function + library(mapview) #interactive maps, raster + vector layers library(raster) #Main raster library library(tidyverse) #our old friend library(sf) #to work with simple features data -#installing the developer version of ggplot for the geom_sf() function -devtools::install_github("tidyverse/ggplot2") -library(ggplot2) ``` -## Task 1: Play with Vector Data +## Exercise 1: Play with Vector Data So to figure out where we might find the lumpsucker fish, we need to know a little about it! @@ -53,72 +53,33 @@ Key information for optimal growth: - Net Primary Productivity between **2.6 and 3 mgC/m2/day** ### Task 1: Load and Visualize data -We'll start by a downloading data file of all Marine Protected Areas monitored by the US -Federal government: . Load this is and visualize it -using ggplot. +We'll start with a data file of Marine Protected Areas monitored by the US +Federal government on the west coast: `mpas_westcoast.shp`. This is found in +your `shapefiles` directory. Load this in using the `st_read` function. -```{r} -download.file("https://marineprotectedareas.noaa.gov/pdf/helpful-resources/inventory/mpa_inventory_2014_public_shp.zip", - "shapefiles/mpas.zip") -unzip("shapefiles/mpas.zip", exdir="shapefiles") -``` -Using the `sf` library, load the downloaded file (`mpa_inventory_2014_public_shp.shp`) -and visualize it. *It's large it may take a moment to plot* ```{r} -``` - -### Task 2: Filtering only the West Coast! -Woah!The US territories are a lot bigger than we sometimes think about! - -But since this species of lumpsuckers are only found on the Pacific, let's start by limiting this -shapefile to just the West Coast of the contiguous US. - -The nice thing about simple features data, is that it can be filtered and summarised -in the same way as non-spatial data using the `dplyr` functions we learned last -week. -First to make our lives easier, `select()` only the columns that we care about: -we'll want to know the Site_Label, Site_ID, and State at a minimum. Additionally write -code, to limit the MPAs we see to only those controlled by the states on the -west coast of the US. - -```{r} ``` -Now using `ggplot`, plot the west coast mpas, colored by their state. -```{r} -``` - -Seems like maybe we're missing a lot of Federally protected areas! How might we -get at those? - -### Task 3: Filter by Intersection! - -Above we filtered out those MPAs we weren't interested in using the `dplyr` -functions we learned last week, but with such a large spatial file, -the `sf` library may offer a better way, given a second .shp file offering a -bounding box for the area we are interested in: `US-wc_bbox.shp` - -Try loading in this new shape file, visualizing it (*hint*: try using the function `mapview()`), -and using the `st_intersection` function to filter only the polygon records of `mpas` on the west coast of the -continental US. +### Task 2 +Plot a map of these protected areas using `plot`, `ggplot` with the `geom_sf` command, or +`mapview`. Whichever approach you choose, color the protected areas according +to their "State" column. Keep in mind, spatial data is memory intensive - some +plots make take a little while to load. ```{r} + ``` -### Task 4 +### Task 3 How much protected marine habitat does each state or agency control? -Which controls the most? Try using the `st_area` command. - -Below, we've gotten you started: `st_buffer(0)` is necessary before summarise functions -with this particular dataset to avoid a self-intersection warning that occurs with -the mpas polygon dataset. +Which controls the most? Try using the `st_area` command to find out. The `group_by` +and `summarise` sequence you learned in previous modules may help! ```{r} -# your_dataset_here %>% -# st_buffer(0) %>% .... + ``` @@ -130,9 +91,9 @@ In the `rasters` folder, there are 5 `.tif` files with the naming pattern `avera ### Task 1: Read in raster data To create a single average Sea Surface Temperature layer, you'll first need to read in all 5 data files (try adapting the `list.files()` function). To read them all in, -you'll use the `raster` function. +you'll use the `raster` function. This is a good place to try the `map` function too! -```{r} +```{r sst} ``` @@ -168,29 +129,33 @@ It can stack either from filenames for rasters, or from the raster objects thems Produce a rasterstack of Average Sea Surface Temperature across all 5 years. Try using the `plot` function to visualize the stack. -```{r calc avg SST} +```{r} ``` ### Task 5: Raster calcuations -As we said earlier, we want this raster in Celsius. The conversion between Kelvin +We want to take this stack of 5 rasters and make one. Additionally for easier +interpretation, we want our averaged raster in Celsius. The conversion between Kelvin and Celsius is: +$C = K - 273.15$ -Calculate the mean value per cell and then convert to Celsius by subtracting 273.15. -You could do this in multiple steps or write a small custom function. +This collapse and conversion could be done in multiple steps, but since we like +to emphasize concision in our coding, let's instead write a +small custom function to take the mean and convert to Celsius. We will then apply +this function to our rasterstack to calculate a new raster we can use for analysis. -Write this as a custom R function. +Write your custom R function. ```{r} ``` -You can perform operations on a RasterStack by using the `calc()` function from +We can perform operations on a RasterStack by using the `calc()` function from the `raster` package. Use the `calc()` function to apply your conversion from above and then plot the resulting raster. -```{r pipes sst} +```{r} ``` @@ -203,47 +168,57 @@ and the NPP layer. #### Task 1: Read in NPP raster data -Read in the NPP data, using the `raster()` command and the "annual_npp_prepped.tif" -found in the rasters folder.This data is the net primary production (mgC/m2/day). -After readng in, plot this data. +Read in the NPP data, using the `raster()` command and the "annual_npp.tif" +found in the rasters folder. This data is the net primary production (mgC/m2/day). +After reading it in, plot this data. -```{r avg npp} +```{r} ``` ### Task 2: Reproject Try adding these two layers together using `stack` and you'll get an error because -these rasters are not in the same "projection" or of the same extent - pretty obvious from the plots. -In order to do analysis acrossmultiple spatial datasources, they must be using the +these rasters are not in the same "projection" - pretty obvious from the plots. +In order to do analysis across multiple spatial datasources, they must be using the same coordinate reference system. -Use the `crs()` command to see what coordinate system each of your rasters are using: +Use the `crs()` command to see what coordinate system each of your rasters are using and, for +good measure, use the `st_crs` command to investigate the crs of our vector data, `west_coast`, from above: ```{r} - ``` -Now, we can use `projectRaster()` from the raster package to reproject a RasterLayer from +Looks like `npp` is equal to our `west_coast` but our mean SST layer is different! +It's crucial when doing spatial +analysis that your projections across all layers are equal. We can +use `projectRaster()` from the raster package to reproject a RasterLayer from one projection to another. You will need to define what the new projection should be by setting a coordinate reference system using the argument `crs =`. -Project npp into sstAvg's coordinate reference system and prove to yourself they -are equal. - -```{r reproject} +Project your average SST layer into npp's coordinate reference system and prove to yourself they +are equal using the `identicalCRS` function. +```{r} ``` -*Note: keep in mind, different projections may not be the only thing different across your rasters. To do any sort of analysis using multiple rasters, they all need to be in the same extent, projection and cell resolution.* +You will get an error about non-missing arguments here, this is because in order +to have our two raster layers match in extent, our SST layer covers a lot of +missing values on its edges which `raster` is encountering +in the projection. We can ignore this error for now, but if you're curious about +the values of a raster you can always look at them using the `values` command. Try +`summary(values(SST_layer))` -- how many NAs does our raster have? -To crop and resample the npp layer to the extent and resolution of sstAvg, adapt and run the following line: -```{r} -# npp.fit <- resample(your_projected_npp, your_sst_raster, method="bilinear") %>% crop(., your_sst_raster) -``` +It's important to note that although here we have given you mostly prepared data, +in practice the projection may not be the only thing different across your rasters. +To do any sort of analysis using multiple rasters, they all need to be in the +same extent, projection and cell resolution. You can check this with the command +`all.equal()` in the `raster` library* + +Okay! now we're ready to get to some analysis. For convenience, stack the now +matching rasters together using the `stack` function and plot them. -Stack the now matching rasters together using the `stack` function and plot them. ```{r} ``` @@ -267,12 +242,12 @@ let's get back to vector data. ### Task 1: Sample Points & Extract values from Rasters Try using the `st_sample()` function, to sample 1000 points from the west coast -mpa polygons we filtered in task 1. - -Once sampled, your points will no longer be a simple features dataframe, -instead `sp_sample()` returns a single geometry object of class `sfc` -which is not a data.frame. To turn an `sfc` object back into an `sf` object use `st_sf()`. -Try using `st_sf()` and then `st_join()` to retrieve the MPAs info for each sampled point. +mpa polygons we filtered in task 1. Once sampled, you will have an `sfc` object, +or a "simple features collection". These collections represent spatial geometry, but +don't support attribute data. If we want to extract any data to these points we need to convert +them first to a full `sf` object which functions as a dataframe. +Good news, the `st_sf()` command can do just that! Convert your sampled points and then use +`st_join()` to retrieve the MPAs info (from `west_coast`) for each sampled point. ```{r} @@ -285,7 +260,7 @@ See the `st_sample()` documentation and explain. Use your sampled points to extract information from the rasters on sea surface temperature and net primary productivity, try using the `raster::extract` function. Remember `mutate` plays nicely -with `sf` objects +with `sf` objects. ```{r}