Add cache_layer_indices to reduce wavefunction storage by selecting specific slice depths #95
Open
HaoranLMaoMao wants to merge 3 commits into h-walk:main from
Added cache_layer_indices parameter to control which slice layers are stored. Updated related documentation and logic to handle selective layer storage.
Updated comments for clarity and consistency.
Problem
When cache_levels=["slices"] is set, MultisliceCalculator stores exit wavefunctions for all nz propagation slices. For a typical simulation (440 slices × 4,096 MD frames × 25 probes × complex128) this produces ~4 TB per block. Users who only need EELS spectra at a few target thicknesses are forced to store and manage tens of terabytes of intermediate data, even though only a small fraction is ever used downstream.
Solution
This pull request adds a single optional parameter, cache_layer_indices, to setup(). When set, only the listed layers are FFT'd and stored in frame_data/wavefunction_data; the remaining layers are discarded immediately after propagation without any FFT or disk write. The full multislice propagation through all nz slices still runs (physically required). WFData.layer is updated to hold the actual layer indices instead of arange(nz).
Changes
- MultisliceCalculator.setup(): new parameter cache_layer_indices: Optional[List[int]] = None
- MultisliceCalculator.run(): introduces self._active_layers to replace the hardcoded range(self.n_layers) loop; allocation of wavefunction_data and frame_data uses len(self._active_layers) as the last dimension
- WFData.layer: now holds the actual slice indices (e.g. [43, 87, 175, 263, 351, 439]) instead of arange(nz), so downstream code can map a target depth back to its compact storage slot via list(wave.layer).index(layer_idx). When cache_layer_indices=None, wave.layer remains identical to the original arange(nz).
- cache_layer_indices=None (default) preserves existing behaviour exactly; no changes are required to existing scripts
Downstream usage example
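A minimal, self-contained sketch of the selective caching and slot lookup described in the change list. It is not the MultisliceCalculator implementation: propagate_and_cache, the constant phase step standing in for real propagation, and the 8 × 8 grid are hypothetical. Only cache_layer_indices, the arange(nz) default, and the list(wave.layer).index(layer_idx) lookup are taken from this PR.

```python
import numpy as np


def propagate_and_cache(psi0, nz, cache_layer_indices=None):
    """Toy stand-in for the caching logic (hypothetical helper, not the PR code)."""
    # None keeps every layer, mirroring the default behaviour described in the PR.
    if cache_layer_indices is None:
        active_layers = list(range(nz))
    else:
        active_layers = list(cache_layer_indices)
    slot_of = {layer: slot for slot, layer in enumerate(active_layers)}

    # The last dimension has one entry per cached layer, not one per slice.
    cached = np.zeros(psi0.shape + (len(active_layers),), dtype=np.complex128)

    psi = psi0.astype(np.complex128)
    for layer in range(nz):
        # Placeholder propagation: every slice is still traversed.
        psi = psi * np.exp(1j * 0.01)
        if layer in slot_of:
            # Only the requested layers are FFT'd and stored.
            cached[..., slot_of[layer]] = np.fft.fft2(psi)
    return cached, np.asarray(active_layers)


psi0 = np.ones((8, 8))
data, layer = propagate_and_cache(
    psi0, nz=440, cache_layer_indices=[43, 87, 175, 263, 351, 439]
)

# Map a target layer index back to its compact storage slot.
slot = list(layer).index(175)
exit_wave_k = data[..., slot]
```

The returned layer array plays the role of WFData.layer here: it carries the real slice indices, so the lookup keeps working whether or not a subset was requested.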
Users specify layer indices directly. The corresponding physical depths can be derived from slice_thickness for reference.
Impact
For the amorphous-Si benchmark (nz = 440, saving 6 layers):
wavefunction_data shape with all 440 layers stored: (25, 4096, 111, 111, 440)
wavefunction_data shape with 6 layers stored: (25, 4096, 111, 111, 6)
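The reduction implied by these two shapes, and the depth lookup mentioned in the downstream usage section, can be checked with a short calculation. This is a sketch under stated assumptions: the slice_thickness value is hypothetical (the PR does not give one), and the convention that layer i exits at depth (i + 1) × slice_thickness is assumed rather than confirmed.

```python
import math

full_shape = (25, 4096, 111, 111, 440)
compact_shape = (25, 4096, 111, 111, 6)

# Number of stored complex values in each case.
full_elems = math.prod(full_shape)
compact_elems = math.prod(compact_shape)

# Only the last dimension differs, so the ratio is 440 / 6.
reduction = full_elems / compact_elems

layer_indices = [43, 87, 175, 263, 351, 439]
slice_thickness = 1.0  # hypothetical value, in angstrom

# Assumed convention: layer i is the exit wave after (i + 1) slices.
depths = [(i + 1) * slice_thickness for i in layer_indices]
```

Because only the layer dimension shrinks, the stored element count drops by a factor of 440/6, roughly 73, independent of the dtype.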