| 1 | +# hampel |
| 2 | + |
| 3 | +``` |
| 4 | +hampel(x; spread=mad, threshold=2) |
| 5 | +``` |
| 6 | + |
| 7 | +Identify outliers using the Hampel criterion. |
| 8 | + |
| 9 | +Given vector `x`, identify elements ``xₖ`` such that |
| 10 | + |
| 11 | +```math |
| 12 | +|xₖ - m| > t S, |
| 13 | +``` |
| 14 | + |
| 15 | +where ``m`` is the median of the elements, the dispersion scale ``S`` is provided by the function |
| 16 | +`spread`, and the parameter ``t`` is given by `threshold`. The return value is a `Bool` vector. |
| 17 | + |
| 18 | +By default, `spread` is \myreflink{mad} and `threshold` is 2. |
| 19 | + |
| 20 | + |
| 21 | +--------- |
| 22 | + |
| 23 | +``` |
| 24 | +hampel(x, K; spread=mad, threshold=2, boundary=:truncate) |
| 25 | +``` |
| 26 | + |
| 27 | +Apply a windowed Hampel filter to a time series. |
| 28 | + |
| 29 | +Given vector `x` and half-width `K`, apply a Hampel criterion within a sliding window of width |
| 30 | +``2K+1``. The median ``m`` of the window replaces the element ``xₖ`` at the center of the window if it satisfies |
| 31 | + |
| 32 | +```math |
| 33 | +|xₖ - m| > t S, |
| 34 | +``` |
| 35 | + |
| 36 | +where the dispersion scale ``S`` is provided by the function `spread` and the parameter |
| 37 | +``t`` is given by `threshold`. The window shortens near the beginning and end of the vector |
| 38 | +to avoid referencing fictitious elements. Larger values of ``t`` make the filter less aggressive, |
| 39 | +while ``t=0`` is the standard median filter. |
| 40 | + |
| 41 | +For recursive filtering, see `hampel!`. |
| 42 | + |
| 43 | +The value of `boundary` determines how the filter handles the boundaries of the vector: |
| 44 | + |
| 45 | +- `:truncate` (default): the window is shortened at the boundaries |
| 46 | +- `:reflect`: values are reflected across the boundaries |
| 47 | +- `:repeat`: end values are repeated as necessary |
| 48 | + |
| 49 | +--------- |
| 50 | + |
| 51 | +``` |
| 52 | +hampel(x, weights; ...) |
| 53 | +``` |
| 54 | + |
| 55 | +Apply a weighted Hampel filter to a time series. |
| 56 | + |
| 57 | +Given vector `x` and a vector `weights` of positive integers, each element in the window is |
| 58 | +repeated the number of times given by its corresponding weight before the criterion is |
| 59 | +computed. This is typically used to make the central element more influential than the others. |
| 60 | + |
| 61 | + |
| 62 | +### CREDITS |
| 63 | +This function is adapted from [HampelOutliers.jl](https://github.com/tobydriscoll/HampelOutliers.jl); |
| 64 | +consult the original source for more details and examples. The difference with respect to the original |
| 65 | +`HampelOutliers.jl` functions is that here `Hampel.identify` and `Hampel.filter` are implemented as |
| 66 | +separate methods of a single function, `hampel`, and multiple dispatch selects the right one. |
| 67 | + |
| 68 | +### Returns |
| 69 | +- A `Bool` vector indicating which points were identified as outliers, |
| 70 | +or |
| 71 | +- A filtered version of `x`. |
| 72 | + |
| 73 | +### Example |
| 74 | + |
| 75 | +\begin{examplefig}{} |
| 76 | +```julia |
| 77 | +using GMT |
| 78 | + |
| 79 | +t = (1:50) / 10; |
| 80 | +x = [1:2:40; 5t + (@. 6cos(t + 0.5(t)^2)); fill(40,20)]; |
| 81 | +x[12] = -10; x[50:52] .= -12; x[79:82] .= [-5, 50, 55, 0]; |
| 82 | +m = hampel(x, 4, threshold=0); |
| 83 | +y = hampel(x, 4); |
| 84 | + |
| 85 | +plot(x, marker=:point, mc=:blue, lc=:blue, label="Original", xlabel="k", ylabel="x_k") |
| 86 | +scatter!(m, ms="2p", mc=:red, MarkerEdgeColor=true, label="Median filter") |
| 87 | +scatter!(y, ms="2p", mc=:green, MarkerEdgeColor=true, label="Hampel filter", show=true) |
| 88 | +``` |
| 89 | +\end{examplefig} |
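
To make the criterion ``|xₖ - m| > t S`` described in the docstring concrete, here is a minimal stand-alone sketch of the identification step and of the windowed filter with a truncated window. It is not the GMT.jl or HampelOutliers.jl implementation: the names `sketch_mad`, `sketch_identify` and `sketch_filter` are hypothetical, and the 1.4826 scaling of the median absolute deviation is an assumption of this sketch. It also assumes an ordinary 1-indexed vector.

```julia
using Statistics  # standard library, provides `median`

# Scaled median absolute deviation. The 1.4826 factor (consistency with the
# standard deviation for normally distributed data) is an assumption here.
sketch_mad(x) = 1.4826 * median(abs.(x .- median(x)))

# Hampel criterion on the whole vector: true where |x_k - m| > t * S.
function sketch_identify(x; spread=sketch_mad, threshold=2)
    m = median(x)
    S = spread(x)
    return abs.(x .- m) .> threshold * S
end

# Windowed filter with half-width K; the window is truncated at both ends.
# Non-recursive: every window is taken from the original x, not from y.
function sketch_filter(x, K; spread=sketch_mad, threshold=2)
    y = float.(x)                       # new array, x is left untouched
    n = length(x)
    for k in 1:n
        w = @view x[max(1, k - K):min(n, k + K)]
        m = median(w)
        if abs(x[k] - m) > threshold * spread(w)
            y[k] = m                    # replace the flagged element by the window median
        end
    end
    return y
end

x = [1.0, 2.0, 3.0, 100.0, 5.0, 6.0, 7.0]
flags = sketch_identify(x)   # only the 4th element is flagged
y = sketch_filter(x, 2)      # the 4th element is replaced by its window median, 5.0
```

With `threshold=0` the test in `sketch_filter` succeeds whenever an element differs from its window median, so the sketch reduces to a plain median filter, as the docstring states for ``t=0``.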