Skip to content

feat: add new package to the stats/incr/* namespace: @stdlib/stats/incr/nanmpcorrdist #6614

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from
190 changes: 190 additions & 0 deletions lib/node_modules/@stdlib/stats/incr/nanmpcorrdist/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,190 @@
<!--

@license Apache-2.0

Copyright (c) 2018 The Stdlib Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

-->

# incrmpcorrdist

> Compute a moving [sample Pearson product-moment correlation distance][pearson-correlation] incrementally, ignoring `NaN` values.

<section class="intro">

The [sample Pearson product-moment correlation distance][pearson-correlation] is defined as

<!-- <equation class="equation" label="eq:pearson_distance" align="center" raw="d_{x,y} = 1 - r_{x,y} = 1 - \frac{\operatorname{cov_n(x,y)}}{\sigma_x \sigma_y}" alt="Equation for the Pearson product-moment correlation distance."> -->

```math
d_{x,y} = 1 - r_{x,y} = 1 - \frac{\mathop{\mathrm{cov_n(x,y)}}}{\sigma_x \sigma_y}
```

<!-- <div class="equation" align="center" data-raw-text="d_{x,y} = 1 - r_{x,y} = 1 - \frac{\operatorname{cov_n(x,y)}}{\sigma_x \sigma_y}" data-equation="eq:pearson_distance">
<img src="https://cdn.jsdelivr.net/gh/stdlib-js/stdlib@49d8cabda84033d55d7b8069f19ee3dd8b8d1496/lib/node_modules/@stdlib/stats/incr/mpcorrdist/docs/img/equation_pearson_distance.svg" alt="Equation for the Pearson product-moment correlation distance.">
<br>
</div> -->
Comment on lines +35 to +38
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this, as this will be automatically populated.


<!-- </equation> -->

where `r` is the [sample Pearson product-moment correlation coefficient][pearson-correlation], `cov(x,y)` is the sample covariance, and `σ` corresponds to the sample standard deviation. As `r` resides on the interval `[-1,1]`, `d` resides on the interval `[0,2]`.

</section>

<!-- /.intro -->

<section class="usage">

## Usage

```javascript
var incrnanmpcorrdist = require( '@stdlib/stats/incr/nanmpcorrdist' );
```

#### incrmpcorrdist( window\[, mx, my] )

Returns an accumulator `function` which incrementally computes a moving [sample Pearson product-moment correlation distance][pearson-correlation] while ignoring `NaN` values. The `window` parameter defines the number of values over which to compute the moving [sample Pearson product-moment correlation distance][pearson-correlation].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Returns an accumulator `function` which incrementally computes a moving [sample Pearson product-moment correlation distance][pearson-correlation] while ignoring `NaN` values. The `window` parameter defines the number of values over which to compute the moving [sample Pearson product-moment correlation distance][pearson-correlation].
Returns an accumulator `function` which incrementally computes a moving [sample Pearson product-moment correlation distance][pearson-correlation], ignoring `NaN` values. The `window` parameter defines the number of values over which to compute the moving [sample Pearson product-moment correlation distance][pearson-correlation].


```javascript
var accumulator = incrnanmpcorrdist( 3 );
```

If means are already known, provide `mx` and `my` arguments.

```javascript
var accumulator = incrnanmpcorrdist( 3, 5.0, -3.14 );
```

#### accumulator( \[x, y] )

If provided input values `x` and `y`, the accumulator function returns an updated [sample Pearson product-moment correlation distance][pearson-correlation]. If not provided input values `x` and `y`, the accumulator function returns the current [sample Pearson product-moment correlation distance][pearson-correlation].

```javascript
var accumulator = incrnanmpcorrdist( 3 );

var r = accumulator();
// returns null

// Fill the window...
r = accumulator( 2.0, 1.0 ); // [(2.0, 1.0)]
// returns 1.0

r = accumulator( NaN, 3.0 ); // [(2.0, 1.0), (NaN, 3.0)]
// returns 1.0

r = accumulator( 4.0, NaN ); // [(2.0, 1.0), (NaN, 3.0), (4.0, NaN)]
// returns 1.0

// Window begins sliding...
r = accumulator( NaN, NaN ); // [(NaN, 3.0), (4.0, NaN), (NaN, NaN)]
// returns 1.0

r = accumulator( 5.0, 2.0 ); // [(4.0, NaN), (NaN, NaN), 5.0, 2.0]
// returns 0.0

r = accumulator();
// returns 0.0
```

</section>

<!-- /.usage -->

<section class="notes">

## Notes

- Input values are **not** type checked. If provided `NaN` or a value which, when used in computations, results in `NaN`, it will be ignored, and the accumulated value is the last one. If non-numeric inputs are possible, you are advised to type check and handle accordingly **before** passing the value to the accumulator function.
- As `W` (x,y) pairs are needed to fill the window buffer, the first `W-1` returned values are calculated from smaller sample sizes. Until the window is full, each returned value is calculated from all provided values.
- Due to limitations inherent in representing numeric values using floating-point format (i.e., the inability to represent numeric values with infinite precision), the [sample correlation distance][pearson-correlation] between perfectly correlated random variables may **not** be `0` or `2`. In fact, the [sample correlation distance][pearson-correlation] is **not** guaranteed to be strictly on the interval `[0,2]`. Any computed distance should, however, be within floating-point roundoff error.

</section>

<!-- /.notes -->

<section class="examples">

## Examples

<!-- eslint no-undef: "error" -->

```javascript
var randu = require( '@stdlib/random/base/randu' );
var incrnanmpcorrdist = require( '@stdlib/stats/incr/nanmpcorrdist' );

var accumulator;
var x;
var y;
var i;

// Initialize an accumulator:
accumulator = incrnanmpcorrdist( 5 );
var d;

// For each simulated datum, update the moving sample correlation distance...
for ( i = 0; i < 100; i++ ) {
if ( randu() < 0.2 ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need separate if-else statements for x and y each as the conditions are same so it can be done in a single if-else statement.

x = NaN;
}
else {
x = randu() * 100.0;
}
if ( randu() < 0.2 ) {
y = NaN;
}
else {
y = randu() * 100.0;
}
d = accumulator( x, y );
console.log( '%d\t%d\t%d', x.toFixed( 4 ), y.toFixed( 4 ), ( d === null ) ? NaN : d.toFixed( 4 ) );
}
```

</section>

<!-- /.examples -->

<!-- Section for related `stdlib` packages. Do not manually edit this section, as it is automatically populated. -->

<section class="related">

* * *

## See Also

- <span class="package-name">[`@stdlib/stats/incr/mpcorr`][@stdlib/stats/incr/mpcorr]</span><span class="delimiter">: </span><span class="description">compute a moving sample Pearson product-moment correlation coefficient incrementally.</span>
- <span class="package-name">[`@stdlib/stats/incr/pcorrdist`][@stdlib/stats/incr/pcorrdist]</span><span class="delimiter">: </span><span class="description">compute a sample Pearson product-moment correlation distance.</span>

</section>

<!-- /.related -->

<!-- Section for all links. Make sure to keep an empty line after the `section` element and another before the `/section` close. -->

<section class="links">

[pearson-correlation]: https://en.wikipedia.org/wiki/Pearson_correlation_coefficient

<!-- <related-links> -->

[@stdlib/stats/incr/mpcorr]: https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/stats/incr/mpcorr

[@stdlib/stats/incr/pcorrdist]: https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/stats/incr/pcorrdist

<!-- </related-links> -->

</section>

<!-- /.links -->
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
/**
* @license Apache-2.0
*
* Copyright (c) 2018 The Stdlib Authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

'use strict';

// MODULES //

var bench = require( '@stdlib/bench' );
var randu = require( '@stdlib/random/base/randu' );
var pkg = require( './../package.json' ).name;
var incrnanmpcorrdist = require( './../lib' );


// MAIN //

bench( pkg, function benchmark( b ) {
var f;
var i;
b.tic();
for ( i = 0; i < b.iterations; i++ ) {
f = incrnanmpcorrdist( (i%5)+1 );
if ( typeof f !== 'function' ) {
b.fail( 'should return a function' );
}
}
b.toc();
if ( typeof f !== 'function' ) {
b.fail( 'should return a function' );
}
b.pass( 'benchmark finished' );
b.end();
});

bench( pkg+'::accumulator', function benchmark( b ) {
var acc;
var v;
var i;

acc = incrnanmpcorrdist( 5 );

b.tic();
for ( i = 0; i < b.iterations; i++ ) {
v = acc( randu(), randu() );
if ( v !== v ) {
b.fail( 'should not return NaN' );
}
}
b.toc();
if ( v !== v ) {
b.fail( 'should not return NaN' );
}
b.pass( 'benchmark finished' );
b.end();
});

bench( pkg+'::accumulator,known_means', function benchmark( b ) {
var acc;
var v;
var i;

acc = incrnanmpcorrdist( 5, 3.0, -1.0 );

b.tic();
for ( i = 0; i < b.iterations; i++ ) {
v = acc( randu(), randu() );
if ( v !== v ) {
b.fail( 'should not return NaN' );
}
}
b.toc();
if ( v !== v ) {
b.fail( 'should not return NaN' );
}
b.pass( 'benchmark finished' );
b.end();
});
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading