
Commit 55da459

Tips for memory usage reduction (#2166)
1 parent 0179921 commit 55da459

2 files changed

+47
-0
lines changed

README.md

+1
@@ -72,6 +72,7 @@ For a general overview of our (O2) software, organization and processes, please
7272
* [Custom merging](doc/Advanced.md#custom-merging)
7373
* [QC with DPL Analysis](doc/Advanced.md#qc-with-dpl-analysis)
7474
* [Solving performance issues](doc/Advanced.md#solving-performance-issues)
75+
* [Understanding and reducing memory footprint](doc/Advanced.md#understanding-and-reducing-memory-footprint)
7576
* [CCDB / QCDB](doc/Advanced.md#ccdb--qcdb)
7677
* [Accessing objects in CCDB](doc/Advanced.md#accessing-objects-in-ccdb)
7778
* [Access GRP objects with GRP Geom Helper](doc/Advanced.md#access-grp-objects-with-grp-geom-helper)

doc/Advanced.md

+46
@@ -29,6 +29,8 @@ Advanced topics
2929
* [Dispatcher](#dispatcher)
3030
* [QC Tasks](#qc-tasks)
3131
* [Mergers](#mergers)
32+
* [Understanding and reducing memory footprint](#understanding-and-reducing-memory-footprint)
33+
* [Analysing memory usage with valgrind](#analysing-memory-usage-with-valgrind)
3234
* [CCDB / QCDB](#ccdb--qcdb)
3335
* [Accessing objects in CCDB](#accessing-objects-in-ccdb)
3436
* [Access GRP objects with GRP Geom Helper](#access-grp-objects-with-grp-geom-helper)
@@ -717,6 +719,50 @@ The following points might help avoid backpressure:
717719
- if an object has its custom Merge() method, check if it could be optimized
718720
- enable multi-layer Mergers to split the computations across multiple processes (config parameter "mergersPerLayer")
719721

722+
# Understanding and reducing memory footprint
723+
724+
When developing a QC module, please be considerate in terms of memory usage.
725+
Large histograms could optionally be enabled or disabled depending on the context in which the QC is run.
726+
Investigate whether reducing the storage size of each bin (e.g. TH2D to TH2F, which halves the bin content from 8 to 4 bytes) would still provide satisfactory results.
727+
Consider loading only the parts of detector geometry which are being used by a given task.
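
To get a feeling for the numbers before running any profiler, the size of a 2D histogram can be estimated from its binning. The sketch below is only a rough estimate: it counts the bin-content array alone (including the underflow and overflow bins ROOT adds on each axis) and ignores axes, names and optional error arrays. The function name and the 3000 x 3000 binning are illustrative, not taken from any QC module.

```python
# Bytes per bin for common ROOT 2D histogram types (size of the underlying C++ type).
BYTES_PER_BIN = {"TH2C": 1, "TH2S": 2, "TH2I": 4, "TH2F": 4, "TH2D": 8}


def th2_bin_bytes(kind: str, nbins_x: int, nbins_y: int) -> int:
    """Approximate size in bytes of the bin-content array of a 2D histogram.

    ROOT adds one underflow and one overflow bin per axis, hence the +2.
    Axis objects, names and optional Sumw2 arrays are ignored here.
    """
    return (nbins_x + 2) * (nbins_y + 2) * BYTES_PER_BIN[kind]


if __name__ == "__main__":
    # Hypothetical binning, chosen only to illustrate the scaling.
    for kind in ("TH2D", "TH2F", "TH2S"):
        size = th2_bin_bytes(kind, 3000, 3000)
        print(f"{kind}: {size / 1e6:.1f} MB")
```

Halving the number of bins on each axis reduces the estimate by roughly a factor of four, which is often a larger gain than changing the bin type.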
728+
729+
## Analysing memory usage with valgrind
730+
731+
0) Install valgrind, if not yet installed
732+
733+
1) Run the QC workflow with argument `--child-driver 'valgrind --tool=massif'` (as well as any file reader / processing workflow you need to obtain data in QC)
734+
735+
2) The workflow will run and save one file per process, named `massif.out.<pid>`
736+
737+
3) Generate a report for the file corresponding to the PID of the QC task:
738+
```
739+
ms_print massif.out.976329 > massif_abc_task.log
740+
```
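
If it is not obvious which PID belongs to the QC task, the header of each massif output file can help: it records the command line of the profiled process in a line starting with `cmd:`. The following is a minimal sketch that lists the files whose command line contains a given string. It assumes that the command line of the QC task contains something recognisable, such as the task name; the helper names are hypothetical.

```python
from pathlib import Path


def massif_commands(directory="."):
    """Map each massif.out.<pid> file in `directory` to its recorded command line."""
    commands = {}
    for path in sorted(Path(directory).glob("massif.out.*")):
        with path.open(errors="replace") as f:
            for line in f:
                # The massif header contains a line of the form "cmd: <command line>".
                if line.startswith("cmd: "):
                    commands[path.name] = line[len("cmd: "):].strip()
                    break
    return commands


def find_massif_files(needle, directory="."):
    """Return the names of massif files whose command line contains `needle`."""
    return [name for name, cmd in massif_commands(directory).items() if needle in cmd]
```

For example, `find_massif_files("AbcTask")` would return the candidate files in the current directory, assuming "AbcTask" appears in the command line of that process.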
741+
4) The generated report contains:
742+
- the command used to run the process
743+
- graph of the memory usage
744+
- grouped call stacks of all heap allocations (above a certain threshold) within certain time intervals.
745+
The left-most entry aggregates the allocations of all the calls which lead to it, listed indented to its right.
746+
For example, the call stack below means that the AbcTask created a TH2F histogram in the `initialize` method at the line
747+
AbcTask.cxx:82, which was 51,811,760B. In total, 130,269,568B worth of TH2F histograms were created in this time interval.
748+
```
749+
98.56% (256,165,296B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
750+
->50.12% (130,269,568B) 0xFCBD1A6: TArrayF::Set(int) [clone .part.0] (TArrayF.cxx:111)
751+
| ->50.12% (130,269,568B) 0xEC1DB1C: TH2F::TH2F(char const*, char const*, int, double, double, int, double, double) (TH2.cxx:3573)
752+
| ->19.93% (51,811,760B) 0x32416518: make_unique<TH2F, char const (&)[16], char const (&)[22], unsigned int const&, int, unsigned int const&, int, int, int> (unique_ptr.h:1065)
753+
| | ->19.93% (51,811,760B) 0x32416518: o2::quality_control_modules::det::AbcTask::initialize(o2::framework::InitContext&) (AbcTask.cxx:82)
754+
```
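
When reading such a tree, note that the percentages in this example are consistent with being fractions of the whole snapshot, not of the parent entry. The short calculation below checks this with the numbers above; it is only a sanity check of the arithmetic, and the function names are illustrative.

```python
def snapshot_total(bytes_reported: int, percent: float) -> float:
    """Estimate the total snapshot size from one (bytes, percent) pair."""
    return bytes_reported / (percent / 100.0)


def share(bytes_part: int, bytes_whole: float) -> float:
    """Percentage of `bytes_whole` taken by `bytes_part`."""
    return 100.0 * bytes_part / bytes_whole


if __name__ == "__main__":
    # Total derived from the first line of the example call stack.
    total = snapshot_total(256_165_296, 98.56)
    print(f"all TH2F: {share(130_269_568, total):.2f}% of the snapshot")
    print(f"AbcTask.cxx:82: {share(51_811_760, total):.2f}% of the snapshot")
    print(f"AbcTask.cxx:82: {share(51_811_760, 130_269_568):.2f}% of all TH2F memory")
```

The last line shows how to obtain the share of one histogram relative to its parent entry, which the report does not print directly.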
755+
5) For a shorter, more digestible output, consider filtering the massif report with the following command, which keeps only the calls within a QC module. This essentially tells you how much memory a given line allocates.
756+
```
757+
[O2PDPSuite/latest] ~/alice/test-rss $> grep quality_control_modules massif_abc_task.log | sed 's/^.*[0-9][0-9]\.[0-9][0-9]\% //g' | sort | uniq
758+
(242,371,376B) 0x324166B2: o2::quality_control_modules::det::AbcTask::initialize(o2::framework::InitContext&) (AbcTask.cxx:88)
759+
(4,441,008B) 0x3241633F: o2::quality_control_modules::det::AbcTask::initialize(o2::framework::InitContext&) (AbcTask.cxx:76)
760+
(4,441,008B) 0x32416429: o2::quality_control_modules::det::AbcTask::initialize(o2::framework::InitContext&) (AbcTask.cxx:79)
761+
(51,811,760B) 0x32416518: o2::quality_control_modules::det::AbcTask::initialize(o2::framework::InitContext&) (AbcTask.cxx:82)
762+
(51,811,760B) 0x324165EB: o2::quality_control_modules::det::AbcTask::initialize(o2::framework::InitContext&) (AbcTask.cxx:85)
763+
```
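
The same summary can be produced with a small script when sorting by size or further processing is needed. The sketch below is one possible approach, not part of the QC framework: it assumes the `ms_print` line layout shown in the examples above (percentage, byte count, address, function, source location) and keeps the largest byte count seen for each source line. The function name and the regular expression are illustrative.

```python
import re

# Matches ms_print tree lines such as
#   "| | ->19.93% (51,811,760B) 0x32416518: ns::Task::initialize(...) (AbcTask.cxx:82)"
LINE_RE = re.compile(
    r"->\s*(?P<pct>\d+\.\d+)% \((?P<bytes>[\d,]+)B\) "
    r"0x[0-9A-Fa-f]+: (?P<func>.*) \((?P<loc>[^()]+:\d+)\)\s*$"
)


def peak_bytes_per_location(lines, namespace="quality_control_modules"):
    """Return {source location: largest byte count seen} for calls in `namespace`."""
    peaks = {}
    for line in lines:
        if namespace not in line:
            continue
        match = LINE_RE.search(line)
        if match is None:
            continue
        size = int(match.group("bytes").replace(",", ""))
        loc = match.group("loc")
        peaks[loc] = max(size, peaks.get(loc, 0))
    return peaks


if __name__ == "__main__":
    # Two lines in the layout of the report above, used as sample input.
    sample = [
        "| | ->19.93% (51,811,760B) 0x32416518: o2::quality_control_modules::det::"
        "AbcTask::initialize(o2::framework::InitContext&) (AbcTask.cxx:82)",
        "| | ->01.71% (4,441,008B) 0x3241633F: o2::quality_control_modules::det::"
        "AbcTask::initialize(o2::framework::InitContext&) (AbcTask.cxx:76)",
    ]
    peaks = peak_bytes_per_location(sample)
    for loc, size in sorted(peaks.items(), key=lambda item: item[1], reverse=True):
        print(f"{size:>15,d} B  {loc}")
```

To run it on a real report, pass the lines of the `ms_print` output file (for example an open file object) instead of the sample list.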
764+
6) Consider reducing the size and number of the biggest histograms. Consider disabling histograms which will not be useful for async QC, i.e. neither allocate them nor call startPublishing for them.
765+
720766
# CCDB / QCDB
721767

722768
## Accessing objects in CCDB
