-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] XrdAdaptor: Use inactive sources first before trying to open a new source when handling a failed request #47593
base: master
Are you sure you want to change the base?
Conversation
cms-bot internal usage |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47593/44088 |
// similar logic as in checkSourcesImpl() | ||
// assume the "sort open delay" doesn't matter in case of a request failure | ||
auto bestInactiveSource = | ||
std::min_element(m_inactiveSources.begin(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the best quality really the lowest number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, yes. See e.g. existing code
cmssw/Utilities/XrdAdaptor/src/XrdRequestManager.cc
Lines 455 to 460 in eb45085
auto bestInactiveSource = | |
std::min_element(eligibleInactiveSources.begin(), | |
eligibleInactiveSources.end(), | |
[](const std::shared_ptr<Source> &s1, const std::shared_ptr<Source> &s2) { | |
return s1->getQuality() < s2->getQuality(); | |
}); |
There is some discussion about the quality metric in https://github.com/cms-sw/cmssw/blob/master/Utilities/XrdAdaptor/doc/multisource_algorithm_design.txt
and maybe it is computed in https://github.com/cms-sw/cmssw/blob/master/Utilities/XrdAdaptor/src/QualityMetric.cc, but I haven't tried to understand what is done there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks, sounds good! 👍
FYI @osschar |
@cmsbuild, please test |
+1 Size: This PR adds an extra 32KB to repository Comparison SummarySummary:
|
…r::requestFailure() According to Brian Bockelman this was the intention, but apparently was never coded.
27252f2
to
688b9dc
Compare
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47593/44101 |
Pull request #47593 was updated. |
With @bockjoo's test #43162 (comment) the present state of this PR results in the following actions with sources (with increased verbosity of XrdAdaptor printouts with process.MessageLogger.cerr.threshold = "INFO"
process.MessageLogger.cerr.XrdAdaptorInternal = dict()
process.MessageLogger.cerr.XrdAdaptorLvl1 = dict()
process.MessageLogger.cerr.XrdAdaptorLvl3 = dict() )
|
To summarize #47593 (comment)
At least this private test demonstrates the code in this PR has an impact. |
@cmsbuild, please test |
+1 Size: This PR adds an extra 32KB to repository Comparison SummarySummary:
|
PR description:
In the context of #43162 and #46086 (comment), this PR changes the XrdAdaptor behavior for request failures. Previously, after disabling the source that resulted the request failure, it passed on the request to another active source, and if there were no active sources, it tried to open a new source. This PR changes to logic to
The first commit generalizes the code where the failed source is removed from the active sources container. I found it confusing that the code checked only the first 2 active sources (even if there would only ever be at most 2 active sources). Admittedly, if there would be more than 2 active sources, the commit would change the behavior, but I would argue for the better.
Resolves cms-sw/framework-team#1306
PR validation:
Code compiles.
If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:
Possibly to be backported to 15_0_X.