Skip to content

Addition of variance function in stdlib_experimental_stats #144

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 18, 2020
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ set(fppFiles
stdlib_experimental_optval.fypp
stdlib_experimental_stats.fypp
stdlib_experimental_stats_mean.fypp
stdlib_experimental_stats_var.fypp
)


Expand Down
8 changes: 7 additions & 1 deletion src/Makefile.manual
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,8 @@ SRC = stdlib_experimental_ascii.f90 \
stdlib_experimental_kinds.f90 \
f18estop.f90 \
stdlib_experimental_stats.f90 \
stdlib_experimental_stats_mean.f90
stdlib_experimental_stats_mean.f90 \
stdlib_experimental_stats_var.f90

LIB = libstdlib.a

Expand Down Expand Up @@ -42,6 +43,11 @@ stdlib_experimental_stats_mean.o: \
stdlib_experimental_optval.o \
stdlib_experimental_kinds.o \
stdlib_experimental_stats.o
stdlib_experimental_stats_var.o: \
stdlib_experimental_optval.o \
stdlib_experimental_kinds.o \
stdlib_experimental_stats.o
stdlib_experimental_io.f90: stdlib_experimental_io.fypp
stdlib_experimental_stats.f90: stdlib_experimental_stats.fypp
stdlib_experimental_stats_mean.f90: stdlib_experimental_stats_mean.fypp
stdlib_experimental_stats_var.f90: stdlib_experimental_stats_var.fypp
29 changes: 29 additions & 0 deletions src/common.fypp
Original file line number Diff line number Diff line change
Expand Up @@ -113,4 +113,33 @@ ${prefix + joinstr.join([line.strip() for line in txt.split("\n")]) + suffix}$
$:"{0}_{1}_{2}{3}_{2}{3}".format(gname, rank, type[0], kind) if suffix == '' else "{0}_{1}_{2}{3}_{4}".format(gname, rank, type[0], kind, suffix)
#:enddef

#! Generates Fortran expressions.
#!
#! Args:
#! varname (str): Name of the variable to be used as origin
#! varname1 (str): Name of the variable to be used instead of varname
#! origrank (int): Rank of the original variable
#! dim (int): Index of the used expression varname1
#!
#! Returns:
#! Shape expression enclosed in braces, except for the index dim.
#!
#! E.g., (:, :, :, i, :, :)
#!

#:def rankindice(varname, varname1, origrank, dim)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably rename this macro to have a more descriptive name. If it is only used to select subarrays by reducing the dimension, we could have:

#:def select_subarray(origrank, selectors)
  #:assert origrank > 0
  #:set seldict = dict(selectors)
  #:call join_lines(joinstr=", ", prefix="(", suffix=")")
    #:for i in range(1, origrank + 1)
      $:seldict.get(i, ":")
    #:endfor
  #:endcall
#:enddef

and use it as

#!  -> x(:, i, :)
x${select_subarray(3, [(2, 'i')])}$

It could also be used, if we need to reduce more than one rank, e.g.

#!  -> x(:, :, i, j)
x${select_subarray(4, [(3, 'i'), (4, 'j')])}$

Also the description should be clarified a bit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented as suggested. The proposed macro is more general and better fit to its aim.
Could you have another review, please?

#:assert origrank > 0
#:if origrank > 0
#:call join_lines(joinstr=", ", prefix="(", suffix=")")
#:for i in range(1, origrank+1)
#:if i == dim
${varname1}$
#:else
${varname}$
#:endif
#:endfor
#:endcall
#:endif
#:enddef

#:endmute
104 changes: 103 additions & 1 deletion src/stdlib_experimental_stats.fypp
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ module stdlib_experimental_stats
implicit none
private
! Public API
public :: mean
public :: mean, var

interface mean
#:for k1, t1 in RC_KINDS_TYPES
Expand Down Expand Up @@ -104,4 +104,106 @@ module stdlib_experimental_stats

end interface mean


interface var

#:for k1, t1 in RC_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("var_all",rank, t1, k1)
module function ${RName}$(x, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
logical, intent(in), optional :: mask
${t1}$ :: res
end function ${RName}$
#:endfor
#:endfor

#:for k1, t1 in INT_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("var_all",rank, t1, k1, 'dp')
module function ${RName}$(x, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
logical, intent(in), optional :: mask
real(dp) :: res
end function ${RName}$
#:endfor
#:endfor

#:for k1, t1 in RC_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("var",rank, t1, k1)
module function ${RName}$(x, dim, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
integer, intent(in) :: dim
logical, intent(in), optional :: mask
${t1}$ :: res${reduced_shape('x', rank, 'dim')}$
end function ${RName}$
#:endfor
#:endfor

#:for k1, t1 in INT_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("var",rank, t1, k1, 'dp')
module function ${RName}$(x, dim, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
integer, intent(in) :: dim
logical, intent(in), optional :: mask
real(dp) :: res${reduced_shape('x', rank, 'dim')}$
end function ${RName}$
#:endfor
#:endfor


#:for k1, t1 in RC_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("var_mask_all",rank, t1, k1)
module function ${RName}$(x, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
logical, intent(in) :: mask${ranksuffix(rank)}$
${t1}$ :: res
end function ${RName}$
#:endfor
#:endfor


#:for k1, t1 in INT_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("var_mask_all",rank, t1, k1, 'dp')
module function ${RName}$(x, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
logical, intent(in) :: mask${ranksuffix(rank)}$
real(dp) :: res
end function ${RName}$
#:endfor
#:endfor


#:for k1, t1 in RC_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("var_mask",rank, t1, k1)
module function ${RName}$(x, dim, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
integer, intent(in) :: dim
logical, intent(in) :: mask${ranksuffix(rank)}$
${t1}$ :: res${reduced_shape('x', rank, 'dim')}$
end function ${RName}$
#:endfor
#:endfor


#:for k1, t1 in INT_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("var_mask",rank, t1, k1, 'dp')
module function ${RName}$(x, dim, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
integer, intent(in) :: dim
logical, intent(in) :: mask${ranksuffix(rank)}$
real(dp) :: res${reduced_shape('x', rank, 'dim')}$
end function ${RName}$
#:endfor
#:endfor

end interface var


end module stdlib_experimental_stats
53 changes: 53 additions & 0 deletions src/stdlib_experimental_stats.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
## Implemented

* `mean`
* `var`

## `mean` - mean of array elements

Expand Down Expand Up @@ -47,3 +48,55 @@ program demo_mean
reshape(x, [ 2, 3 ] ) > 3.) !returns [ NaN, 4.0, 5.5 ]
end program demo_mean
```

## `var` - variance of array elements

### Description

Returns the variance of all the elements of `array`, or of the elements of `array` along dimension `dim` if provided, and if the corresponding element in `mask` is `true`.

The variance is defined as the best unbiased estimator and is computed as:

```
var(x) = 1/(n-1) sum_i (array(i) - mean(array))^2
```

### Syntax

`result = var(array [, mask])`

`result = var(array, dim [, mask])`

### Arguments

`array`: Shall be an array of type `integer`, `real`, or `complex`.

`dim`: Shall be a scalar of type `integer` with a value in the range from 1 to n, where n is the rank of `array`.

`mask` (optional): Shall be of type `logical` and either by a scalar or an array of the same shape as `array`.

### Return value

If `array` is of type `real` or `complex`, the result is of the same type as `array`.
If `array` is of type `integer`, the result is of type `double precision`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will raise this issue elsewhere, but I do not agree with this API for the return type when the input is integer data. I only bring it up here because it is not quite correct to say that the return type is double precision, when in fact the type is real(real64). I'm not suggesting any changes now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nshaffer Thank you for your review.

I used double precision because they are declared as dp. But I agree they are actually real(real64). The issue with using real64 in the spec is that if the definition of dp
in stdlib_experimental_kinds changes (there has been already discussions on that), then we will need to modify the spec too.

Would it be better to write "..... the result is of type dp."?


If `dim` is absent, a scalar with the variance of all elements in `array` is returned. Otherwise, an array of rank n-1, where n equals the rank of `array`, and a shape similar to that of `ar
ray` with dimension `dim` dropped is returned.

If `mask` is specified, the result is the variance of all elements of `array` corresponding to `true` elements of `mask`. If every element of `mask` is `false`, the result is IEEE `NaN`.

### Example

```fortran
program demo_var
use stdlib_experimental_stats, only: var
implicit none
real :: x(1:6) = [ 1., 2., 3., 4., 5., 6. ]
print *, var(x) !returns 3.5
print *, var( reshape(x, [ 2, 3 ] )) !returns 3.5
print *, var( reshape(x, [ 2, 3 ] ), 1) !returns [0.5, 0.5, 0.5]
print *, var( reshape(x, [ 2, 3 ] ), 1,&
reshape(x, [ 2, 3 ] ) > 3.) !returns [0., NaN, 0.5]
end program demo_var
```

Loading