Add common alias cores (x) for grdlandmask #2944

Merged
merged 6 commits into from
Jan 4, 2024
Changes from 2 commits
2 changes: 2 additions & 0 deletions pygmt/src/grdlandmask.py
@@ -26,6 +26,7 @@
R="region",
V="verbose",
r="registration",
+ x="cores",
)
@kwargs_to_strings(I="sequence", R="sequence", N="sequence", E="sequence")
def grdlandmask(**kwargs):
@@ -82,6 +83,7 @@ def grdlandmask(**kwargs):
considered outside [Default is inside].
{verbose}
{registration}
+ {cores}

Returns
-------
2 changes: 1 addition & 1 deletion pygmt/tests/test_grdlandmask.py
@@ -49,7 +49,7 @@ def test_grdlandmask_no_outgrid(expected_grid):
"""
Test grdlandmask with no set outgrid.
"""
- result = grdlandmask(spacing=1, region=[125, 130, 30, 35])
+ result = grdlandmask(spacing=1, region=[125, 130, 30, 35], cores=2)
Member Author

The Linux runner for GitHub Actions has 2 CPU cores according to https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources, but I'm not sure whether we should set cores to 2 or 1 for consistent benchmarking.

Member

Using two cores is faster and can help check the OpenMP feature in GMT, so I prefer to use two cores if possible.

BTW, the macOS runner for GitHub Actions has 3 cores, so instead of cores=2, cores=True will use all available cores.
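
Since the runners expose different core counts, a small check can show what a fixed request such as cores=2 resolves to on a given machine. This is only a sketch using the standard library: `pick_cores` is a hypothetical helper, not part of PyGMT or GMT, and it assumes that `os.cpu_count()` reflects the cores the runner actually makes available.

```python
import os


def pick_cores(requested=2):
    """Hypothetical helper: clamp a requested core count to what the machine reports."""
    # os.cpu_count() can return None when the count cannot be determined.
    available = os.cpu_count() or 1
    # Never go below 1 core or above the number of available cores.
    return max(1, min(requested, available))


print(f"available: {os.cpu_count()}, would use: {pick_cores(2)}")
```

On a 2-core Linux runner and a 3-core macOS runner this would resolve to the same value, which is the property a fixed cores=2 gives the benchmark.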

Member Author

Is only 1 core being used if we don't set cores, or use cores=None? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says [default is to use all available cores], but is that only with cores=True (i.e. -x in GMT)?

The benchmark.yml workflow is running on Linux, but I guess we could set cores=True since this test would also run on macOS in ci_test.yml. My question was more around whether multi-threading with multiple-cores (2 or more) would provide a consistent benchmark compared to using only 1 core, but I guess we need to merge this PR to find out!

Member

> Is only 1 core being used if we don't set cores, or use cores=None? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says [default is to use all available cores], but is that only with cores=True (i.e. -x in GMT)?

Yes.

Member

> Is only 1 core being used if we don't set cores, or use cores=None? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says [default is to use all available cores], but is that only with cores=True (i.e. -x in GMT)?

Yes.

Actually it uses all available cores if -x is not used.

$ gmt grdlandmask -R0/10/0/10 -Gabc.nc -I1/1 -Vi
grdlandmask [INFORMATION]: Enable all available threads (up to xxx)
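
To summarise the cases discussed so far, here is a rough sketch of how a `cores` value could correspond to the GMT `-x` option. `cores_to_flag` is a hypothetical illustration and not PyGMT's actual implementation; it assumes that an unset value simply omits the option and that integers are appended to `-x` unchanged.

```python
def cores_to_flag(cores=None):
    """Hypothetical helper: render a `cores` value as a GMT-style -x option string."""
    if cores is None or cores is False:
        # Option omitted; per the log above, GMT then enables all available threads.
        return ""
    if cores is True:
        # Bare -x, i.e. cores=True in the discussion above.
        return "-x"
    # Any integer is appended as-is; interpreting it is left to GMT.
    return f"-x{int(cores)}"


for value in (None, True, 2, 1):
    print(repr(value), "->", repr(cores_to_flag(value)))
```

The `True` check comes before the integer branch because `bool` is a subclass of `int` in Python, so `True` would otherwise be rendered as `-x1`.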

Member Author

Maybe set cores=1 and see what the flame graph diff shows?

The test_grdlandmask_no_outgrid benchmark speedup went from 3x to 9.6x 🤣 It might be that there is some overhead with using threading (2 cores) vs no threading (1 core). Here's the flame graph diff from https://codspeed.io/GenericMappingTools/pygmt/branches/limit-cores again:

[Image: flame graph diff]

The extra gomp_team_barrier_wait_* calls are still there, but also some other stuff (in blue).

Member

This reminds me of the warning message I saw when running the tests in random order (see https://github.com/GenericMappingTools/pygmt/actions/runs/7396374248/job/20121388367?pr=2936):

pygmt/tests/test_session_management.py::test_session_multiprocessing
pygmt/tests/test_session_management.py::test_session_multiprocessing
  /home/runner/micromamba/envs/pygmt/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=2433) is multi-threaded, use of fork() may lead to deadlocks in the child.
    self.pid = os.fork()

There are some related discussions at https://discuss.python.org/t/concerns-regarding-deprecation-of-fork-with-alive-threads/33555 and related PR at python/cpython#100228.

> os.fork() in a multi-threaded application is a likely source of deadlocks on many platforms. We should raise a warning when people call os.fork() from a process that we know has other threads running.

Not sure if it's related to the issue here.

Member Author

pygmt/tests/test_session_management.py::test_session_multiprocessing
pygmt/tests/test_session_management.py::test_session_multiprocessing
  /home/runner/micromamba/envs/pygmt/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=2433) is multi-threaded, use of fork() may lead to deadlocks in the child.
    self.pid = os.fork()

That test_session_multiprocessing function is only running basemap and savefig, right? It shouldn't have any multi-threading going on, so I'm not sure what's happening.

I ran some benchmarks with cores ranging from -1 (all cores) through 0 (no multithreading) to 16 (all cores on my laptop), running grdlandmask with a spacing of 0.05 arc degrees. The results vary between runs but, in general, more cores means less time:

[Figure: benchmark_grdlandmask_0.05arc_degree]

Code to reproduce:

import time

import pandas as pd
import pygmt
import tqdm

# Time one grdlandmask run per cores setting, from -1 up to and including 16.
timings = {}
for cores in tqdm.tqdm(range(-1, 17)):  # range() excludes its end value
    tic = time.perf_counter()
    pygmt.grdlandmask(spacing="0.05d", region=[0, 180, 0, 90], cores=cores)
    toc = time.perf_counter()
    timings[cores] = toc - tic

# One row per cores value, with the elapsed time in seconds.
df = pd.DataFrame.from_dict(data=timings, orient="index", columns=["time"])

# %%
# Plot elapsed time against the number of cores.
fig = pygmt.Figure()
region = pygmt.info(data=df.reset_index(), per_column=True, spacing=0.05)
fig.plot(
    x=df.index,
    y=df["time"],
    region=region,
    pen="thick",
    frame=[
        "WSne+tBenchmark grdlandmask",
        "xaf+lCores",
        "yaf+lTime (s), lower is better",
    ],
)
fig.savefig("benchmark_grdlandmask.png")
fig.show()
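
Because the timings vary between runs, one way to get steadier numbers is to repeat each measurement and keep the fastest run. The sketch below is not what produced the figure above: `time_best_of` is a hypothetical helper, and the workload is a stand-in so the snippet runs without PyGMT installed.

```python
import time


def time_best_of(func, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` calls of `func`."""
    best = float("inf")
    for _ in range(repeats):
        tic = time.perf_counter()
        func()
        toc = time.perf_counter()
        best = min(best, toc - tic)
    return best


# Stand-in workload. In the benchmark it would be something like:
# lambda: pygmt.grdlandmask(spacing="0.05d", region=[0, 180, 0, 90], cores=cores)
elapsed = time_best_of(lambda: sum(range(100_000)), repeats=3)
print(f"best of 3: {elapsed:.6f} s")
```

Taking the minimum rather than the mean discards runs slowed by unrelated system activity, though it also hides any real run-to-run variation caused by thread scheduling.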

If I change the spacing from 0.05 arc degrees to 1 arc degree, more cores actually take more time:

[Figure: benchmark_grdlandmask_1arc_degree]

For the test_grdlandmask_no_outgrid unit test, which uses spacing=1 (i.e. 1 arc degree), I think we should just use cores=2, since cores=3 might not be much faster at such a low resolution. As long as we are benchmarking with a fixed number of cores, it should be ok?
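
For a sense of scale, the amount of work at each resolution can be estimated by counting grid nodes. This is a back-of-the-envelope sketch: `grid_node_count` is a hypothetical helper, and it assumes gridline registration (nodes at both edges, so span/spacing + 1 per axis) and that the 1 arc degree run used the same region as the 0.05 arc degree run.

```python
def grid_node_count(region, spacing):
    """Hypothetical helper: node count of a gridline-registered grid."""
    west, east, south, north = region
    # Gridline registration puts nodes on both edges, hence the +1 per axis.
    n_columns = round((east - west) / spacing) + 1
    n_rows = round((north - south) / spacing) + 1
    return n_columns * n_rows


unit_test_nodes = grid_node_count([125, 130, 30, 35], spacing=1)
coarse_nodes = grid_node_count([0, 180, 0, 90], spacing=1)
fine_nodes = grid_node_count([0, 180, 0, 90], spacing=0.05)
print(unit_test_nodes, coarse_nodes, fine_nodes)  # 36 16471 6485401
```

With only 36 nodes in the unit test, there is very little work to split between threads, which may explain why threading overhead outweighs any speedup at this resolution.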

Member

sounds ok

Member Author

Ok, back to using cores=2 in 97637d4.

# check information of the output grid
assert isinstance(result, xr.DataArray)
assert result.gmt.gtype == 1 # Geographic grid