Reworked critical questions and added summary #13
base: gh-pages
I have some questions on what stats knowledge is expected for the course.
## Data exploration

Download the data to your computer and open your preferred R IDE to the directory of this tutorial.

-After downloading the data we begin with visualization. The data consists of all the Sentinel 2 bands at a spatial resolution of 20 m. We will also make use of training polygons for the land cover classification, which will be introduced later.
+After downloading the data we begin with visualization. The data consists of all the Sentinel 2 bands at a spatial resolution of 20 m, meaning that each pixel on the scene corresponds to a ground distance of 20 m by 20 m. We will also make use of training polygons for the land cover classification, which will be introduced later.
Added a little more description since students may not be familiar with the term "spatial resolution".
Yes, and in fact the preferred term is actually "pixel size"! The term "resolution" means the smallest recognisable object. Sentinel-2 resolution is actually 10 m (you can tell apart objects that are 10 m across), but in this example, we use bands aggregated to 20 m pixel size or pixel spacing. Likewise, we could disaggregate it to 5 m pixel size (e.g. using bilinear interpolation), but that won't make the resolution any finer, as you still won't be able to identify any objects that are smaller than 10 m across.
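To make the pixel size point concrete, here is a minimal sketch. It assumes the `terra` package and uses a made-up 3 x 3 raster (hypothetical values and extent, not the tutorial data). Disaggregating changes the pixel size, but the cell values are only interpolated from the 20 m cells, so no finer detail appears.

```r
library(terra)

# Hypothetical raster: 3 x 3 cells over a 60 m x 60 m extent, i.e. 20 m pixels
r20 <- rast(nrows = 3, ncols = 3, xmin = 0, xmax = 60, ymin = 0, ymax = 60,
            crs = "EPSG:32631")
values(r20) <- 1:9

# Disaggregate by a factor of 4 with bilinear interpolation: 5 m pixels
r5 <- disagg(r20, fact = 4, method = "bilinear")

res(r20)  # pixel size of the input: 20 20
res(r5)   # pixel size of the output: 5 5
```

The output has 16 times as many cells, yet every value in `r5` is derived from the nine values in `r20`.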
@@ -119,7 +119,7 @@ plot(ndvi)
Aside from the advantages of `app()` regarding memory usage, an additional advantage of this function is the fact that the result can be written immediately to the file by including the `filename = "..."` argument, which will allow you to write your results to file immediately, after which you can reload it in subsequent sessions without having to repeat your analysis.

```{block, type="alert alert-success"}
-> **Question 2**: What is the advantage of including the NDVI layer in the classification?
+> **Question 2**: What is the advantage of including the NDVI layer in the landcover classification? Hint: For information on NDVI, check out [this source](https://gisgeography.com/ndvi-normalized-difference-vegetation-index/).
Added a hint with more information about NDVI to help students understand this new term. This is a pretty open-ended question so it should be ok.
Yes, it should be good. Note that there should be a space in "land cover", though.
@@ -311,7 +323,7 @@ The mean decrease in accuracy indicates the amount by which the classification a
Since the NDVI layer scores relatively low according to the mean accuracy decrease criterion, try to construct an alternate Random Forest model as above, but excluding this layer, you can name it something like 'modelRF2'.

```{block, type="alert alert-success"}
-> **Question 4**: What effect does this have on the overall accuracy of the results (hint: compare the confusion matrices of the original and new outputs). What effect does leaving this variable out have on the processing time (hint: use `system.time()`)?
+> **Question 4**: What effect does this have on the accuracy of the results? Hint: Compare the overall accuracies (or the confusion matrices) of the original and new outputs. What effect does leaving this variable out have on the processing time? Hint: use `system.time()`?
The question assumes that students have worked with accuracies and confusion matrices, which may not be the case. I would make the question more flexible, allowing students to solely focus on the overall accuracy to answer the question if they do not feel comfortable with the confusion matrix. (I just noticed a typo at the end of the question, misplaced question mark.)
Yes, that's fine. And yes, there's a misplaced question mark :) It's also technically two questions, maybe good to split them?
Thanks, here are some suggestions for further improvement!
@@ -515,8 +527,33 @@ plot(forest, col = "dark green", legend = FALSE)

# Today's summary

We learned about:
Today you performed a supervised classification, you identified patches and sieve connected cells, and you learned to deal with thematic raster data. Some functions to retain:
"retain"? Maybe "remember" would be more usual here.

* `hist()`: Create a histogram for each layer of a `SpatRaster`.
* `pairs()`: Create a scatterplot for each pair of layers of a `SpatRaster`.
* `app()`: Apply a function to all pixels of a `SpatRaster` more efficiently.
Also apply a custom function, so perhaps add in parentheses "(custom)"

## Training data preparation

* `extract()`: Retrieve a value for the raster below a polygon.
But also point or line, so you can just say "a vector"

## Run a model on the data

* `predict()`: Predict raster values based on a predefined model.
Perhaps "pretrained" or "trained" would be more clear here.

## Applying a raster sieve by identifying patches

* `setValues()`: Assign a new value to a raster
This is not entirely clear; rather, `setValues()` sets all values of a raster to a certain value or certain values. This is equivalent to `MySpatRaster[] = MyValue`.
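A small sketch of that equivalence, assuming the `terra` package and a hypothetical empty 2 x 2 raster (the object names are made up for illustration):

```r
library(terra)

# Hypothetical 2 x 2 raster without values
r <- rast(nrows = 2, ncols = 2)

# setValues() returns a copy in which every cell holds the given value
a <- setValues(r, 0)

# The bracket form assigns to all cells of an existing object
b <- r
b[] <- 0

# Both approaches should yield the same cell values
identical(as.vector(values(a)), as.vector(values(b)))

# A vector with one value per cell sets each cell individually
d <- setValues(r, 1:4)
```

The practical difference is that `setValues()` returns a new object, while the bracket form modifies the variable it is applied to.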
@GreatEmerald @Timmarh