Merge branch 'develop' into feature/decorators_simplify_examples
jlnav committed Jan 11, 2024
2 parents 9975f51 + 50bcefb commit a9c164d
Showing 51 changed files with 464 additions and 563 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/basic.yml
@@ -98,7 +98,7 @@ jobs:
pip install -r install/testing_requirements.txt
pip install -r install/misc_feature_requirements.txt
git clone --recurse-submodules -b refactor/pounders_API https://github.com/POptUS/IBCDFO.git
git clone --recurse-submodules -b develop https://github.com/POptUS/IBCDFO.git
pushd IBCDFO/minq/py/minq5/
export PYTHONPATH="$PYTHONPATH:$(pwd)"
echo "PYTHONPATH=$PYTHONPATH" >> $GITHUB_ENV
@@ -167,4 +167,4 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: crate-ci/typos@v1.16.25
- uses: crate-ci/typos@v1.17.0
4 changes: 2 additions & 2 deletions .github/workflows/extra.yml
@@ -166,7 +166,7 @@ jobs:
sed -i -e "s/pyzmq>=22.1.0,<23.0.0/pyzmq>=23.0.0,<24.0.0/" ./balsam/setup.cfg
cd balsam; pip install -e .; cd ..
git clone --recurse-submodules -b refactor/pounders_API https://github.com/POptUS/IBCDFO.git
git clone --recurse-submodules -b develop https://github.com/POptUS/IBCDFO.git
pushd IBCDFO/minq/py/minq5/
export PYTHONPATH="$PYTHONPATH:$(pwd)"
echo "PYTHONPATH=$PYTHONPATH" >> $GITHUB_ENV
@@ -250,4 +250,4 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: crate-ci/typos@v1.16.25
- uses: crate-ci/typos@v1.17.0
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
BSD 3-Clause License

Copyright (c) 2018-2023, UChicago Argonne, LLC and the libEnsemble Development Team
Copyright (c) 2018-2024, UChicago Argonne, LLC and the libEnsemble Development Team
All Rights Reserved.

Redistribution and use in source and binary forms, with or without
4 changes: 4 additions & 0 deletions README.rst
@@ -29,6 +29,10 @@
:target: https://github.com/psf/black
:alt: Code style: black

.. image:: https://joss.theoj.org/papers/10.21105/joss.06031/status.svg
:target: https://doi.org/10.21105/joss.06031
:alt: JOSS Status

|
.. after_badges_rst_tag
2 changes: 1 addition & 1 deletion docs/advanced_installation.rst
@@ -118,7 +118,7 @@ Further recommendations for selected HPC systems are given in the

On some platforms you may wish to run libEnsemble without ``mpi4py``,
using a serial PETSc build. This is often preferable if running on
the launch nodes of a three-tier system (e.g., Theta/Summit)::
the launch nodes of a three-tier system (e.g., Summit)::

spack install py-libensemble +scipy +mpmath +petsc4py ^py-petsc4py~mpi ^petsc~mpi~hdf5~hypre~superlu-dist

4 changes: 4 additions & 0 deletions docs/data_structures/libE_specs.rst
@@ -240,6 +240,10 @@ libEnsemble is primarily customized by setting options within a ``LibeSpecs`` cl
the equivalent ``persis_info`` settings, generators will be allocated this
many GPUs.

**use_tiles_as_gpus** [bool] = ``False``:
If ``True``, each GPU tile is treated as one GPU, assuming
``tiles_per_GPU`` is provided in ``platform_specs`` or detected.
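
A minimal sketch of enabling this option in a calling script (illustrative
only; any other settings are omitted)::

    from libensemble.specs import LibeSpecs

    libE_specs = LibeSpecs(use_tiles_as_gpus=True)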

**enforce_worker_core_bounds** [bool] = ``False``:
Permit submission of tasks with a
higher processor count than the CPUs available to the worker.
Expand Down
2 changes: 0 additions & 2 deletions docs/function_guides/simulator.rst
@@ -43,7 +43,6 @@ Writing a Simulator
return Output, persis_info
Most ``sim_f`` function definitions written by users resemble::

def my_simulation(Input, persis_info, sim_specs, libE_info):
@@ -85,7 +84,6 @@ If ``sim_specs`` was initially defined:
user={"batch_size": 128},
)
Then user parameters and a *local* array of outputs may be obtained/initialized like::

batch_size = sim_specs["user"]["batch_size"]
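
Putting these pieces together, a minimal complete ``sim_f`` might resemble the
sketch below. This is illustrative only; it assumes the input array carries a
field ``"x"`` and that ``sim_specs["out"]`` defines a float field ``"f"``::

    import numpy as np


    def my_simulation(Input, persis_info, sim_specs, libE_info):
        # User parameters supplied via sim_specs (read here for illustration)
        batch_size = sim_specs["user"]["batch_size"]

        # Local output array, one row per input point
        Output = np.zeros(len(Input), dtype=sim_specs["out"])

        # Placeholder computation: sum of squares of each input point
        Output["f"] = np.sum(Input["x"] ** 2, axis=1)

        return Output, persis_info
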
1 change: 0 additions & 1 deletion docs/introduction_latex.rst
@@ -52,7 +52,6 @@
.. _SWIG: http://swig.org/
.. _tarball: https://github.com/Libensemble/libensemble/releases/latest
.. _Tasmanian: https://tasmanian.ornl.gov/
.. _Theta: https://www.alcf.anl.gov/alcf-resources/theta
.. _tomli: https://pypi.org/project/tomli/
.. _tqdm: https://tqdm.github.io/
.. _user guide: https://libensemble.readthedocs.io/en/latest/programming_libE.html
1 change: 1 addition & 0 deletions docs/nitpicky
@@ -36,6 +36,7 @@ py:class <class 'int'>
py:class +ScalarType

# Internal paths that are verified importable but Sphinx can't find
py:class libensemble.resources.platforms.Aurora
py:class libensemble.resources.platforms.GenericROCm
py:class libensemble.resources.platforms.Crusher
py:class libensemble.resources.platforms.Frontier
112 changes: 112 additions & 0 deletions docs/platforms/aurora.rst
@@ -0,0 +1,112 @@
======
Aurora
======

Aurora_ is an Intel/HPE EX supercomputer located in the ALCF_ at Argonne
National Laboratory. Each compute node contains two Intel (Sapphire Rapids)
Xeon CPUs and six Intel X\ :sup:`e` GPUs (Ponte Vecchio) each with two tiles.

The PBS scheduler is used to submit jobs from login nodes to run on the
compute nodes.

Configuring Python and Installation
-----------------------------------

To obtain Python use::

module use /soft/modulefiles
module load frameworks

To obtain libEnsemble::

pip install libensemble

See :doc:`here<../advanced_installation>` for more information on advanced
options for installing libEnsemble, including using Spack.

Example
-------

This example runs the :doc:`forces_gpu<../tutorials/forces_gpu_tutorial>`
tutorial on Aurora.

To obtain the example, you can clone the libEnsemble repository, although only
the forces subdirectory is needed::

git clone https://github.com/Libensemble/libensemble
cd libensemble/libensemble/tests/scaling_tests/forces/forces_app

To compile forces (a C application with OpenMP target offload)::

mpicc -DGPU -O3 -fiopenmp -fopenmp-targets=spir64 -o forces.x forces.c

Now go to the ``forces_gpu`` directory::

cd ../forces_gpu

To make use of all available GPUs, open ``run_libe_forces.py`` and adjust
``exit_criteria`` to run more simulations. The following will run two
simulations for each worker::

# Instruct libEnsemble to exit after this many simulations
ensemble.exit_criteria = ExitCriteria(sim_max=nsim_workers*2)

Now grab an interactive session on two nodes (or use the batch script at
``../submission_scripts/submit_pbs_aurora.sh``)::

qsub -A <myproject> -l select=2 -l walltime=15:00 -lfilesystems=home -q EarlyAppAccess -I

Once in the interactive session, you may need to reload the frameworks module::

cd $PBS_O_WORKDIR
module use /soft/modulefiles
module load frameworks

Then in the session run::

python run_libe_forces.py --comms local --nworkers 13

This provides twelve workers for running simulations (one for each GPU across
two nodes). An extra worker is added to run the persistent generator. The
GPU settings for each worker simulation are printed.

Looking at ``libE_stats.txt`` will provide a summary of the runs.

Using tiles as GPUs
-------------------

If you wish to treat each tile as its own GPU, then add the *libE_specs*
option ``use_tiles_as_gpus=True``, so the *libE_specs* block of
``run_libe_forces.py`` becomes:

.. code-block:: python

    ensemble.libE_specs = LibeSpecs(
        num_resource_sets=nsim_workers,
        sim_dirs_make=True,
        use_tiles_as_gpus=True,
    )

Now you can run again with twice as many workers for running simulations (each
will use one GPU tile)::

python run_libe_forces.py --comms local --nworkers 25

Note that the *forces* example will automatically use the GPUs available to
each worker (with one MPI rank per GPU), so if fewer workers are provided,
more than one GPU will be used per simulation.
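
For reference, a sketch of how a simulation function can hand GPU assignment to
the libEnsemble executor is shown below. This is an illustration rather than the
exact *forces* code; it assumes an application registered under the name
``forces``, an input field ``"x"``, and an output array matching
``sim_specs["out"]``:

.. code-block:: python

    import numpy as np


    def run_forces(H, persis_info, sim_specs, libE_info):
        particles = str(int(H["x"][0][0]))

        exctr = libE_info["executor"]  # executor provided to the sim function
        task = exctr.submit(
            app_name="forces",
            app_args=particles,
            auto_assign_gpus=True,  # use the GPUs detected for this worker
            match_procs_to_gpus=True,  # one MPI rank per assigned GPU
        )
        task.wait()

        # Results would be read from the task's output files into this array
        Output = np.zeros(1, dtype=sim_specs["out"])
        return Output, persis_info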

Also see the ``forces_gpu_var_resources`` and ``forces_multi_app`` examples for
cases that use varying processor/GPU counts per simulation.

Demonstration
-------------

Note that a video demonstration_ of the *forces_gpu* example on *Frontier*
is also available. The workflow is identical when running on Aurora, except
for the compiler options and the number of workers (because the number of
GPUs on a node differs).

.. _ALCF: https://www.alcf.anl.gov/
.. _Aurora: https://www.alcf.anl.gov/support-center/aurorasunspot/getting-started-aurora
.. _demonstration: https://youtu.be/H2fmbZ6DnVc
12 changes: 6 additions & 6 deletions docs/platforms/example_scripts.rst
@@ -33,14 +33,14 @@ for submitting workflows to almost any system or scheduler.
:caption: /examples/libE_submission_scripts/bebop_submit_slurm_distrib.sh
:language: bash

.. dropdown:: Theta - On MOM Node with Multiprocessing

.. literalinclude:: ../../examples/libE_submission_scripts/theta_submit_mproc.sh
:caption: /examples/libE_submission_scripts/theta_submit_mproc.sh
:language: bash

.. dropdown:: Summit - On Launch Nodes with Multiprocessing

.. literalinclude:: ../../examples/libE_submission_scripts/summit_submit_mproc.sh
:caption: /examples/libE_submission_scripts/summit_submit_mproc.sh
:language: bash

.. dropdown:: Cobalt - Intermediate node with Multiprocessing

.. literalinclude:: ../../examples/libE_submission_scripts/cobalt_submit_mproc.sh
:caption: /examples/libE_submission_scripts/cobalt_submit_mproc.sh
:language: bash
4 changes: 2 additions & 2 deletions docs/platforms/platforms_index.rst
@@ -87,7 +87,7 @@ Some large systems have a 3-tier node setup. That is, they have a separate set o
(known as MOM nodes on Cray Systems). User batch jobs or interactive sessions run on a launch node.
Most such systems supply a special MPI runner that has some application-level scheduling
capability (e.g., ``aprun``, ``jsrun``). MPI applications can only be submitted from these nodes. Examples
of these systems include: Summit, Sierra, and Theta.
of these systems include Summit and Sierra.

There are two ways of running libEnsemble on these kinds of systems. The first, and simplest,
is to run libEnsemble on the launch nodes. This is often sufficient if the worker's simulation
@@ -209,13 +209,13 @@ libEnsemble on specific HPC systems.
:maxdepth: 2
:titlesonly:

aurora
bebop
frontier
perlmutter
polaris
spock_crusher
summit
theta
srun
example_scripts

1 change: 0 additions & 1 deletion docs/platforms/summit.rst
@@ -120,7 +120,6 @@ to execute on the launch nodes.

It is recommended to run libEnsemble on the launch nodes (assuming workers are
submitting MPI applications) using the ``local`` communications mode (multiprocessing).
In the future, Balsam may be used to run libEnsemble on compute nodes.

Interactive Runs
^^^^^^^^^^^^^^^^