Add Julia implementation using OhMyThreads.jl #25

Open: giordano wants to merge 1 commit into master from mg/julia-ohmythreads

@giordano giordano commented Nov 27, 2024

OhMyThreads.jl is a new-ish package that provides user-friendly threading constructs. For example, it comes with a multithreaded map-reduce (`tmapreduce`), which is a natural fit for this program. Compare the implementation in this PR:

function _picalc(numsteps)
    slice = 1 / numsteps
    return tmapreduce(+, 1:numsteps; ntasks=nthreads()) do i
        4.0 / (1.0 + ((i - 0.5) * slice) ^ 2)
    end * slice
end
with the current multithreaded implementation:
function _picalc(numsteps)
    slice = 1.0 / numsteps
    sums = zeros(Float64, nthreads())
    n = cld(numsteps, nthreads())
    Threads.@threads for i in 1:nthreads()
        sum_thread = 0.0
        @simd for j in (1 + (i - 1) * n):min(numsteps, i * n)
            x = (j - 0.5) * slice
            sum_thread += 4.0 / (1.0 + x ^ 2)
        end
        sums[threadid()] = sum_thread
    end
    return sum(sums) * slice
end
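
As a sanity check on both versions: each one applies the midpoint rule to the integral of 4 / (1 + x^2) over [0, 1], which equals π. A minimal serial sketch of the same sum, using only Base Julia, could look like the following. The name `picalc_serial` is hypothetical and not part of this PR or the repository; it is meant only as a single-threaded baseline for checking results at small `numsteps`.

```julia
# Serial baseline sketch (hypothetical helper, not part of the PR).
# Midpoint rule for the integral of 4 / (1 + x^2) over [0, 1], which equals pi.
function picalc_serial(numsteps)
    slice = 1 / numsteps                      # width of each slice
    total = mapreduce(+, 1:numsteps) do i
        x = (i - 0.5) * slice                 # midpoint of the i-th slice
        4.0 / (1.0 + x^2)
    end
    return total * slice
end

println(picalc_serial(1_000_000))
```

Base's `mapreduce` runs on a single thread, so this is only practical for inputs much smaller than the 10^12 slices used in the benchmarks.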

Some benchmarks on an Nvidia Grace-Grace system. With this implementation, using Julia v1.11.1, I get:

$ julia -t 144 pi.jl 1000000000000
  Activating project at `~/repo/pi_examples/julia_ohmythreads_dir`
  No Changes to `~/repo/pi_examples/julia_ohmythreads_dir/Project.toml`
  No Changes to `~/repo/pi_examples/julia_ohmythreads_dir/Manifest.toml`
  Warming up...done. [0.195s]

Calculating PI using:
  1000000000000 slices
  144 thread(s)
Obtained value of PI: 3.1415926535897936
Time taken: 5.458 seconds

With julia_threads_pi_dir (same version of Julia as above):

$ julia -t 144 pi.jl 1000000000000
  Warming up...done. [0.095s]

Calculating PI using:
  1000000000000 slices
  144 thread(s)
Obtained value of PI: 3.141592653588557
Time taken: 5.441 seconds

With c_omp_pi_dir:

$ make -B CC=gcc COPTS='-O2 -fopenmp -march=native' && OMP_NUM_THREADS=144 ./pi 1000000000000
gcc -O2 -fopenmp -march=native  -c pi.c -o pi.o
gcc -O2 -fopenmp -march=native  -o pi pi.o
Calculating PI using:
  1000000000000 slices
  144 thread(s)
Obtained value for PI: 3.14159265358896
Time taken:            5.77358439611271 seconds
$ gcc --version
gcc (GCC) 14.1.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

With fortran_omp_pi_dir:

$ make -Bf Makefile.gfortran FOPT='-O2 -fopenmp -march=native' && ./pi 1000000000000
gfortran -O2 -fopenmp -march=native -o pi pi.f90
Calculating PI using:
                     1000000000000 slices
                               144 OpenMP threads
Obtained value of PI: 3.1415926536
Time taken:                5.72090 seconds
$ gfortran --version
GNU Fortran (GCC) 14.1.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@giordano giordano force-pushed the mg/julia-ohmythreads branch 2 times, most recently from bdd72c1 to 5c57045 Compare November 27, 2024 02:04
@giordano giordano force-pushed the mg/julia-ohmythreads branch from 5c57045 to 10a517f Compare November 29, 2024 15:50