Commit

update of plots
HDembinski committed May 9, 2023
1 parent fc04d9c commit f90e5ff
Showing 25 changed files with 18,766 additions and 16,547 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml

@@ -29,7 +29,7 @@ repos:
       - id: sort-simple-yaml
       - id: file-contents-sorter
       - id: trailing-whitespace
-        exclude: ^doc/_static/.*.svg
+        exclude: .*\.svg
 
   # Python linter (Flake8)
   - repo: https://github.com/PyCQA/flake8
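The widened `exclude` pattern above is worth a second look: the removed regex was anchored to `doc/_static/`, while the SVG files referenced elsewhere in this commit live under `docs/_static/`, so the old pattern never matched them. A quick sketch with Python's `re` module (to my understanding, pre-commit applies these patterns with `re.search` semantics; the sample path is taken from the README links):

```python
import re

old = re.compile(r"^doc/_static/.*.svg")  # removed pattern
new = re.compile(r".*\.svg")              # added pattern

path = "docs/_static/norm.pdf.svg"  # an SVG path from the README links

# The old pattern is anchored to "doc/_static/" (singular), so it never
# matches files under "docs/_static/"; the new one excludes SVGs anywhere.
print(old.search(path) is None)       # True
print(new.search(path) is not None)   # True
```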
10 changes: 6 additions & 4 deletions README.md

@@ -67,6 +67,10 @@ Note that this is only faster if `x` has sufficient length (about 1000 elements
 
 The following benchmarks were produced on an Intel(R) Core(TM) i7-8569U CPU @ 2.80GHz against SciPy-1.10.1. The dotted line on the right-hand figure shows the expected speedup (4x) from parallelization on a CPU with four physical cores.
 
+We see large speed-ups with respect to `scipy` for almost all distributions. Calls with short arrays also profit from `numba_stats`, thanks to the reduced call overhead. The functions `voigt.pdf` and `t.ppf` do not run faster than the `scipy` versions, because we call the respective `scipy` implementations written in FORTRAN. The advantage provided by `numba_stats` here is that you can call these functions from other `numba`-JIT'ed functions, which is not possible with the `scipy` implementations; `voigt.pdf` still profits from auto-parallelization.
+
+`bernstein.density` does not profit from auto-parallelization; on the contrary, it becomes much slower, so this should be avoided. This is a known issue: the internal implementation cannot easily be auto-parallelized.
+
 ![](docs/_static/norm.pdf.svg)
 ![](docs/_static/norm.cdf.svg)
 ![](docs/_static/norm.ppf.svg)
@@ -87,10 +91,8 @@
 ![](docs/_static/truncexpon.ppf.svg)
 ![](docs/_static/voigt.pdf.svg)
-
-The functions `voigt.pdf`, `t.cdf`, and `t.ppf` do not run faster than the `scipy` versions, because we call the respective `scipy` implementation written in FORTRAN. The advantage provided by `numba_stats` here is that you can call these functions from other `numba`-JIT'ed functions, which is not possible with the `scipy` implementations.
-
-The `bernstein.density` does not profit from auto-parallelization, on the contrary it becomes much slower. This is under investigation.
-![](docs/_static/bernstein.density.svg)
+![](docs/_static/bernstein.density.svg)
+![](docs/_static/truncexpon.pdf.plus.norm.pdf.svg)
 
 ## Documentation
 
238 changes: 11 additions & 227 deletions bench/plot.ipynb

Large diffs are not rendered by default.

35 changes: 35 additions & 0 deletions bench/test_stats.py
@@ -159,3 +159,38 @@ def method(x, beta, xmin, xmax):
     # warm-up JIT
     method(x, beta, xmin, xmax)
     benchmark(method, x, beta, xmin, xmax)
+
+
+@pytest.mark.parametrize("n", N)
+@pytest.mark.parametrize("lib", ("scipy", "ours", "ours:parallel,fastmath"))
+def test_speed_truncexpon_pdf_plus_norm_pdf(benchmark, lib, n):
+    x = np.linspace(0, 1, n)
+    rng = np.random.default_rng(1)
+    rng.shuffle(x)
+
+    xmin = np.min(x)
+    xmax = np.max(x)
+
+    if lib == "scipy":
+        from scipy.stats import norm, truncexpon
+
+        def method(x, z, mu, sigma, slope):
+            p1 = truncexpon.pdf(x, xmax, xmin, slope)
+            p2 = norm.pdf(x, mu, sigma)
+            return (1 - z) * p1 + z * p2
+
+    else:
+        from numba_stats import norm, truncexpon
+
+        def method(x, z, mu, sigma, slope):
+            p1 = truncexpon.pdf(x, xmin, xmax, 0.0, slope)
+            p2 = norm.pdf(x, mu, sigma)
+            return (1 - z) * p1 + z * p2
+
+    if lib == "ours:parallel,fastmath":
+        method = nb.njit(parallel=True, fastmath=True)(method)
+
+    # warm-up JIT
+    args = 0.5, 0.5, 0.1, 1.0
+    method(x, *args)
+    benchmark(method, x, *args)
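The benchmark above mixes the two densities as `(1 - z) * p1 + z * p2`; since both components are normalized on the interval, the mixture is itself a normalized density for any `z` in [0, 1]. A numpy-only sanity check of this construction, using two simple stand-in densities on [0, 1] (chosen for this illustration, not taken from the benchmark):

```python
import numpy as np

x = np.linspace(0, 1, 10001)
p1 = np.ones_like(x)  # uniform density on [0, 1], integrates to 1
p2 = 2 * x            # triangular density on [0, 1], integrates to 1
z = 0.3
mix = (1 - z) * p1 + z * p2

# Trapezoidal integration; both stand-ins are (piecewise) linear,
# so the result is exact up to floating-point rounding.
integral = np.sum(0.5 * (mix[1:] + mix[:-1]) * np.diff(x))
print(integral)  # ≈ 1.0
```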