Summary
When running sopflow at scale on Summit with HiOp PriDec on GPUs, the code crashes while writing the output files for each contingency (when this output is requested). It looks to me like ExaGo runs multiple instances of opflow again to convert the solution to PS format and runs out of GPU memory while doing so. Note that HiOp PriDec had already converged when this happened. In the end, I turned off the final output generation in order to collect the scaling data.
@abhyshr, would you mind elaborating on how the outputs are generated? Is there a way to get the converged solution directly?
cc @pelesh @maksud