doc/Advanced.md
@@ -29,6 +29,8 @@ Advanced topics
 * [Dispatcher](#dispatcher)
 * [QC Tasks](#qc-tasks)
 * [Mergers](#mergers)
+* [Understanding and reducing memory footprint](#understanding-and-reducing-memory-footprint)
+* [Analysing memory usage with valgrind](#analysing-memory-usage-with-valgrind)
 * [CCDB / QCDB](#ccdb--qcdb)
 * [Accessing objects in CCDB](#accessing-objects-in-ccdb)
 * [Access GRP objects with GRP Geom Helper](#access-grp-objects-with-grp-geom-helper)
@@ -717,6 +719,50 @@ The following points might help avoid backpressure:
 - if an object has its custom Merge() method, check if it could be optimized
 - enable multi-layer Mergers to split the computations across multiple processes (config parameter "mergersPerLayer")
 
+# Understanding and reducing memory footprint
+
+When developing a QC module, please be considerate in terms of memory usage.
+Large histograms could be optionally enabled or disabled depending on the context in which the QC is run.
+Investigate if reducing the storage size of each bin (e.g. TH2D to TH2F, i.e. 8-byte doubles to 4-byte floats) would still provide satisfactory results.
+Consider loading only the parts of detector geometry which are being used by a given task.
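
To get a feeling for what the TH2D to TH2F switch saves, the bin-content array of a 2D histogram can be estimated with simple arithmetic. The sketch below is only an approximation: it counts the per-bin storage plus ROOT's underflow and overflow bins on each axis, and ignores Sumw2 arrays, axis objects and labels. The function name and the 1000 x 1000 binning are made up for illustration.

```python
# Bytes used to store the content of one bin in ROOT's TH2 variants.
BYTES_PER_BIN = {"TH2C": 1, "TH2S": 2, "TH2I": 4, "TH2F": 4, "TH2D": 8}


def th2_content_bytes(kind, nbins_x, nbins_y):
    """Approximate size in bytes of the bin-content array of a 2D histogram.

    ROOT adds one underflow and one overflow bin per axis, hence the +2.
    Sumw2 arrays, axis objects and bin labels are not counted.
    """
    return (nbins_x + 2) * (nbins_y + 2) * BYTES_PER_BIN[kind]


# Hypothetical binning, chosen only to illustrate the factor of two.
as_double = th2_content_bytes("TH2D", 1000, 1000)
as_float = th2_content_bytes("TH2F", 1000, 1000)
print(f"TH2D: {as_double:,} B, TH2F: {as_float:,} B")
```

Whether single precision is enough depends on the bin contents expected in the task, so the estimate only tells you what is at stake, not whether the change is safe.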
+
+## Analysing memory usage with valgrind
+
+0) Install valgrind, if not yet installed
+
+1) Run the QC workflow with the argument `--child-driver 'valgrind --tool=massif'` (as well as any file reader / processing workflow you need to obtain data in QC)
+
+2) The workflow will run and save files named `massif.out.<pid>`, one for each process run under valgrind
+
+3) Generate a report for the file corresponding to the PID of the QC task:
+```
+ms_print massif.out.976329 > massif_abc_task.log
+```
+4) The generated report contains:
+- the command used to run the process
+- a graph of the memory usage
+- grouped call stacks of all memory allocations on the heap (above a certain threshold) within certain time intervals.
+The left-most call aggregates all the calls which lead to it; those are shown to its right.
+For example, the call stack below means that the AbcTask created a TH2F histogram in the initialize method at the line
+AbcTask.cxx:82, which was 51,811,760B. In total, 130,269,568B worth of TH2F histograms were created in this time interval.
+```
+98.56% (256,165,296B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
+5) To get a lightweight and more digestible output, consider running the massif report through the following command to get a summary of only the calls within a QC module. This essentially tells you how much memory a given line allocates.
+6) Consider reducing the size and number of the biggest histograms. Consider disabling histograms which will not be useful for async QC (i.e. neither allocate them nor call startPublishing for them).
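
The kind of per-line summary described in step 5 can also be sketched in Python. This is not the command referred to in step 5, which is not reproduced here, but a rough illustration of the same idea. It assumes that call-stack entries in the `ms_print` report carry a size field such as `(51,811,760B)` and end with a location such as `(AbcTask.cxx:82)`, and that entries belonging to a QC module can be recognised by a substring of the function name. The function names and the `quality_control_modules` substring are assumptions made for this example.

```python
import re

# Size field "(51,811,760B)" followed, at the end of the line, by a
# "(File.cxx:82)" source location. The exact layout is an assumption.
ENTRY = re.compile(
    r"\((?P<bytes>[\d,]+)B\).*\((?P<file>[^():\s]+):(?P<line>\d+)\)\s*$"
)


def summarise(report_lines, needle):
    """Return {(file, line): largest size in bytes} for entries containing needle."""
    peaks = {}
    for text in report_lines:
        if needle not in text:
            continue  # not a call from the module of interest
        match = ENTRY.search(text)
        if match is None:
            continue  # no source location, e.g. a library-only frame
        key = (match["file"], int(match["line"]))
        size = int(match["bytes"].replace(",", ""))
        peaks[key] = max(peaks.get(key, 0), size)
    return peaks


def summarise_file(path, needle="quality_control_modules"):
    """Read an ms_print report from disk and print the largest entries first."""
    with open(path, encoding="utf-8", errors="replace") as report:
        peaks = summarise(report, needle)
    for (source, line), size in sorted(peaks.items(), key=lambda kv: -kv[1]):
        print(f"{size:>15,} B  {source}:{line}")
```

Calling `summarise_file("massif_abc_task.log")` on the report from step 3 would then list one line per allocating source line, with the largest allocation seen across all snapshots.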