Adding rewiev changes
SofiaOtero committed Jan 31, 2025
1 parent 209acc3 commit 512d2a8
Showing 2 changed files with 18 additions and 22 deletions.
4 changes: 2 additions & 2 deletions vignettes/aedseo.Rmd
@@ -200,6 +200,6 @@ plot(
)
```

Using the `intensity_levels` method to define burden levels, the Outbreak is expected to fall within the low or medium
category. This is because the very low threshold is the disease-specific threshold, which must be surpassed for five
Using the `intensity_levels` method to define burden levels, the outbreak is expected to fall within the `low` or `medium`
category. This is because the `very low` threshold is the disease-specific threshold, which must be surpassed for five
consecutive weeks along with a significant positive growth rate.
36 changes: 16 additions & 20 deletions vignettes/burden_levels.Rmd
@@ -16,17 +16,14 @@ knitr::opts_chunk$set(

```{r setup, warning=FALSE, message=FALSE}
library(aedseo)
library(mem)
```

## Summary

To provide a concise overview of how the `seasonal_burden_levels()` algorithm operates, we utilize the same example data presented
in the `vignette("aedseo")`. The plot below illustrates the two `method` arguments available in the
`combined_seasonal_output()` function:

- **`intensity_levels`**: This method assesses burden levels by comparing it to observations from previous seasons.
- **`peak_levels`**: This method assesses burden levels by referencing only the highest-rate observations within each season.
- **`peak_levels`**: This method assesses burden levels by referencing only the highest observations within each season.

The disease-specific threshold is the `very low` threshold for both methods.

@@ -127,8 +124,8 @@ This is done by:

- Using `n` peak weekly observations from each season.
- Selecting only peak observations if they surpass the disease-specific threshold.
- Weightening the observations such recent observations have a greater impact than older observations.
- A proper distribution (log-normal, weibull and exponential are implemented) is used to fit the weighted
- Weighting the observations such that recent observations have a greater impact than older observations.
- A suitable distribution (log-normal, Weibull and exponential are implemented) is fitted to the weighted
`n` peak observations. The parameters of the selected distribution are then optimised to obtain the best fit.
- Burden levels can be defined by two methods:
- `intensity_levels` which models the risk compared to what has been observed in previous seasons.
@@ -142,30 +139,30 @@ In the following sections we will describe the arguments for the function and ho
`n_peak` observations are used to describe the highest observations that are observed each season.
The default of `n_peak` is `6` as we are only interested in the highest observations.
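
The selection of peak observations can be sketched in base R as follows. This is only an illustration of the idea and not the code used inside `seasonal_burden_levels()`; the toy data, the `disease_threshold` value and the variable names are assumptions made for the example.

```r
# Hypothetical toy data: five weekly observations for each of two seasons.
obs <- data.frame(
  season = rep(c("2022/2023", "2023/2024"), each = 5),
  observation = c(12, 40, 95, 60, 8, 15, 70, 130, 90, 20)
)
disease_threshold <- 50  # assumed value, for illustration only
n_peak <- 3

# Split the observations by season, keep the n_peak highest in each season,
# then drop any that do not surpass the disease-specific threshold.
by_season <- split(obs$observation, obs$season)
peaks <- lapply(by_season, function(x) {
  top <- head(sort(x, decreasing = TRUE), n_peak)
  top[top > disease_threshold]
})
peaks
```

In this toy example the third-highest observation of the first season (40) lies below the threshold, so that season contributes only two peak observations.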

#### Weightening
`decay_factor` is implemented due to more recent seasons often are more indicative of current and future trends.
#### Weighting
A `decay_factor` is implemented to give more weight to recent seasons, as they are often more indicative of current and future trends.
As time progresses, the relevance of older seasons may decrease due to changes in factors like population immunity,
virus mutations, or intervention strategies. Weighting older seasons less reflects this reduced relevance.
The default of `decay_factor` is `0.8`, allowing the model to be responsive to recent changes without being overly
sensitive to short-term fluctuations.
The optimal decay factor can vary depending on the variability and trends within the data. For datasets where seasonal
patterns are highly stable, a higher decay factor might be appropriate. Conversely, data that has changed a lot across
patterns are highly stable, a higher decay factor (i.e. longer memory) might be appropriate. Conversely, for data that has changed a lot across
seasons, a lower factor could improve predictions.
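
To make the effect of the decay factor concrete, the sketch below computes one weight per season. It assumes a simple exponential scheme in which a season lying `k` seasons before the most recent one receives the weight `decay_factor^k`; the exact scheme used inside `seasonal_burden_levels()` is not shown in this vignette, so treat this as an assumption.

```r
# Hypothetical season labels, ordered from oldest to most recent.
seasons <- c("2020/2021", "2021/2022", "2022/2023", "2023/2024")
decay_factor <- 0.8

# Assumption: a season k steps back from the most recent one gets the
# weight decay_factor^k, so the most recent season has weight 1.
seasons_back <- rev(seq_along(seasons)) - 1  # 3, 2, 1, 0
weights <- setNames(decay_factor^seasons_back, seasons)
weights
```

Under this assumption, lowering `decay_factor` makes the weights of older seasons shrink faster (shorter memory), while a value close to 1 treats all seasons almost equally.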

#### Distribution and optimisation
`family` is the argument used to select which distribution the `n_peak` observations should be fitted to. Users can
choose between `lnorm`, `weibull` and `exp` distributions. The log-normal distribution theoretically
aligns well with the nature of epidemic data, which often exhibits multiplicative growth patterns.
In our optimization process, we evaluated the distributions to determine their performance in fitting danish non-sentinel
In our optimisation process, we evaluated the distributions to determine their performance in fitting Danish non-sentinel
cases and hospitalisation data for RSV, SARS-CoV-2 and Influenza (A and B). All three distributions had comparable
objective function values during optimisation, hence we did not see any statistically significant difference in their performance.

The model uses the `fit_quantiles()` function which employes the `stats::optim` for optimisation of the distribution parameters.
The model uses the `fit_quantiles()` function, which employs `stats::optim` to optimise the distribution parameters.
The `optim_method` argument can be passed to `seasonal_burden_levels()`. The default is `Nelder-Mead`, but other methods can be selected,
see `?fit_quantiles`.
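
As a rough illustration of how a weighted distribution fit with `stats::optim` can work, the sketch below fits a log-normal distribution by minimising a weighted negative log-likelihood. It is not the implementation of `fit_quantiles()`: the objective function, the simulated data, the weights and the quantile levels are all assumptions chosen for the example.

```r
set.seed(1)

# Hypothetical peak observations (6 per season, 4 seasons) and weights.
x <- rlnorm(24, meanlog = 5, sdlog = 0.4)
w <- rep(c(0.512, 0.64, 0.8, 1), each = 6)

# Weighted negative log-likelihood of a log-normal distribution.
# par[2] holds log(sdlog) so the optimiser can search without constraints.
neg_loglik <- function(par, x, w) {
  -sum(w * dlnorm(x, meanlog = par[1], sdlog = exp(par[2]), log = TRUE))
}

fit <- stats::optim(
  par = c(mean(log(x)), log(sd(log(x)))),  # unweighted starting values
  fn = neg_loglik,
  x = x,
  w = w,
  method = "Nelder-Mead"
)

meanlog_hat <- fit$par[1]
sdlog_hat <- exp(fit$par[2])

# Quantiles of the fitted distribution; the levels here are arbitrary.
qlnorm(c(0.5, 0.9, 0.975), meanlog = meanlog_hat, sdlog = sdlog_hat)
```

Replacing `dlnorm()` and `qlnorm()` with the corresponding Weibull or exponential functions would give the analogous sketch for the other two families.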

*Note:* [mem](https://github.com/lozalojo/mem) uses the log-normal distribution, which allows for more straightforward benchmarking,
due to this the default is `lnorm`.
due to this, the default is `lnorm`.

#### Burden levels
`method` is the argument used to select one of the two methods `intensity_levels` (default) and `peak_levels`.
@@ -253,7 +250,7 @@ intensity_levels_n_neg_t <- seasonal_burden_levels(
```

### Use the `peak_levels` method
[mem](https://github.com/lozalojo/mem) uses the n highest values of each epidemic period to fit the parameters of the distribution,
[mem](https://github.com/lozalojo/mem) uses the `n` highest observations from each epidemic period to fit the parameters of the distribution,
where `n = 30/seasons`. The data has four seasons, so to align with mem we use `n_peak = 8` (30/4 = 7.5, rounded up).
```{r}
peak_levels_n <- seasonal_burden_levels(
@@ -518,7 +515,7 @@ mem_levels_df |>

Upon examining all methods and data combinations, it becomes clear that the `intensity_levels` approach establishes
levels covering the entire set of observations from previous seasons. In contrast, the `peak_levels` and `mem` methods
define levels solely based on the highest-rate observations within each season.
define levels solely based on the highest observations within each season.

The highest observations for the *2024/2025* season for each data set are:

@@ -546,14 +543,13 @@ increasing trend between seasons.
- As observations exponentially decrease between seasons (with the highest observation this season being 3,735),
we expect the burden levels to be lower. This expectation is met across all three methods. However, the weighting
of seasons in `intensity_levels` and `peak_levels` leads to older seasons having less impact on the burden levels as
we progress forward in time. On the other hand, `mem` includes all high-rate observations from the previous 10 seasons
without diminishing the importance of older seasons, which results in sustained very high burden levels.
we progress forward in time. On the other hand, `mem` includes all high observations from the previous 10 seasons
without diminishing the importance of older seasons, which results in sustained `very high` burden levels.

- Notably, in the `mem` method, the epidemic thresholds are positioned above the medium burden level.
This means that the epidemic period begins only when the burden reaches the range of highest-rate observations
- Notably, in the `mem` method, the epidemic thresholds are positioned above the `medium` burden level.
This means that the epidemic period begins only when the burden reaches the range of high observations
recorded in previous seasons.


In conclusion, using the `peak_levels` and `mem` methods does not allow us to assess the burden before the season
reaches the range of high-rate observations from previous seasons. In contrast, the intensity_levels method allows for
reaches the range of high observations from previous seasons. In contrast, the `intensity_levels` method allows for
continuous monitoring of the burden throughout the entire season.
