-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guard some stream syncs #4351
base: development
Are you sure you want to change the base?
Guard some stream syncs #4351
Conversation
Src/AmrCore/AMReX_MFInterpolater.cpp
Outdated
@@ -202,7 +204,9 @@ MFCellConsLinInterp::interp (MultiFab const& crsemf, int ccomp, MultiFab& finemf | |||
}); | |||
} | |||
|
|||
Gpu::streamSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to keep this one because there is a local MultiFab in this block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I didn't pay attention to that. Should I use the async arena for the temporary memory or keep the stream sync?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to keep the stream sync so that the async arena does not use too much memory.
Src/AmrCore/AMReX_MFInterpolater.cpp
Outdated
@@ -373,7 +377,9 @@ MFCellConsLinMinmaxLimitInterp::interp (MultiFab const& crsemf, int ccomp, Multi | |||
} | |||
}); | |||
|
|||
Gpu::streamSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to keep this one because there is a local MultiFab in this block.
@@ -1031,7 +1033,9 @@ MLABecLaplacianT<MF>::Fsmooth (int amrlev, int mglev, MF& sol, const MF& rhs, in | |||
}); | |||
} | |||
} | |||
Gpu::streamSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at the end of Fsmooth we should add the following in case this is in nosync region.
if (Ax.local_size() > 0 && Gpu::inNoSyncRegion()) { Gpu::streamSynchronize(); }
@@ -193,7 +193,9 @@ void MLCurlCurl::restriction (int amrlev, int cmglev, MF& crse, MF& fine) const | |||
{ | |||
mlcurlcurl_restriction(idim,i,j,k,crsema[bno],finema[bno],dinfo); | |||
}); | |||
Gpu::streamSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a local temp MulitFab. So it is not safe to remove it.
@@ -228,7 +230,9 @@ void MLCurlCurl::interpolation (int amrlev, int fmglev, MF& fine, | |||
} | |||
}); | |||
} | |||
Gpu::streamSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should either move the sync into the for-loop or make MultiFab cfine
an array.
} else { | ||
ParallelFor( nmf, [=] AMREX_GPU_DEVICE(int bno, int i, int j, int k) | ||
{ | ||
mlcurlcurl_1D(i,j,k,ex[bno],ey[bno],ez[bno], | ||
rhsx[bno],rhsy[bno],rhsz[bno], | ||
b,adxinv,color,dinfo); | ||
}); | ||
Gpu::streamSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one and the one above are needed because there is a temp MulitFab.
@@ -450,7 +458,9 @@ void MLCurlCurl::smooth4 (int amrlev, int mglev, MF& sol, MF const& rhs, | |||
}); | |||
} | |||
} | |||
Gpu::streamSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also not safe to remove this one.
@@ -193,7 +197,9 @@ MLNodeABecLaplacian::restriction (int amrlev, int cmglev, MultiFab& crse, MultiF | |||
{ | |||
mlndlap_restriction(i,j,k,pcrse_ma[box_no],fine_ma[box_no],msk_ma[box_no]); | |||
}); | |||
Gpu::streamSynchronize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May not be safe.
@@ -225,7 +231,9 @@ MLNodeABecLaplacian::interpolation (int amrlev, int fmglev, MultiFab& fine, cons | |||
mlndlap_interpadd_aa(i, j, k, fine_ma[box_no], crse_ma[box_no], | |||
sig_ma[box_no], msk_ma[box_no]); | |||
}); | |||
Gpu::streamSynchronize(); | |||
if (!Gpu::inNoSyncRegion()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!Gpu::inNoSyncRegion()) { | |
if (cfine.local_size() > 0 || !Gpu::inNoSyncRegion()) { |
@@ -665,7 +669,9 @@ MLNodeLaplacian::interpolation (int amrlev, int fmglev, MultiFab& fine, const Mu | |||
mlndlap_semi_interpadd_aa(i, j, k, fine_ma[box_no], crse_ma[box_no], sig_ma[box_no], msk_ma[box_no], idir); | |||
}); | |||
} | |||
Gpu::streamSynchronize(); | |||
if (!Gpu::inNoSyncRegion()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!Gpu::inNoSyncRegion()) { | |
if (cfine.local_size() > 0 || !Gpu::inNoSyncRegion()) { |
@@ -701,7 +705,9 @@ MLNodeLaplacian::Fsmooth (int amrlev, int mglev, MultiFab& sol, const MultiFab& | |||
} | |||
} | |||
|
|||
Gpu::streamSynchronize(); | |||
if (!Gpu::inNoSyncRegion()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!Gpu::inNoSyncRegion()) { | |
if (Ax.local_size() > 0 || !Gpu::inNoSyncRegion()) { |
Summary
Additional background
Checklist
The proposed changes: