-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix failed during restart in Ambari 2.5 #29
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@petroav Please note that this PR is also upgrading presto to 0.182. Think it should be split into 2 commits and update the metainfo.xml as well.
@teamsoo thanks for notes, I have noticed these details. I'll update this PR soon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR and apologies for the late review! Can you squash everything into one commit? When writing the commit message can you follow the guidelines here: https://chris.beams.io/posts/git-commit/?
You'll also have to rebase since I just pushed a change to fix the failing build.
config_properties['http-server.http.port']), | ||
all_hosts) | ||
# sleep some seconds for waiting coordinator start finished | ||
time.sleep(30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the sleep and the comment as they're unnecessary. The PrestoClient
code has a retry with timeout mechanism to ensure the server has started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I have seen it. To add this code just because some errors will be raised (just like ExecutionFailed mentioned blew) . And the retries will be interrupted, Never retry again.
Add some seconds for sleeping will skip the ExecutionFailed exception in most cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@petroav As I said above, I think I have fixed the root cause of the interruption of the error, so this code can now be removed. It may also be possible to keep this code, and the sleep time setting is shorter, just if there are other exceptions that are thrown and then interrupted? What is your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that you fixed the underlying problem yes, I think this sleep should be removed.
Execute('{0} status'.format(daemon_control_script)) | ||
except ExecutionFailed as ef: | ||
if ef.code == 3: | ||
raise ComponentIsNotRunning("ComponentIsNotRunning") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the advantage of doing this? Will Ambari recognize the exception and do something special as a result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Ambari only accept zero value as normal return, any other return value will cause an ExecutionFailed exception thrown. As presto status
will return the code: 3 .
Also In Ambari 2.5, Ambari will handle ComponentIsNotRunning exception as a normal logic procession during restart like below:
/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py (Ambari Libaray):
So. TO AVOID THE EXCEPTION
raise Fail("Stop command finished but process keep running.")
raise a ComponentIsNotRunning exception should be the right way.
I only test it in Ambari 2.5. I'm not sure is any older version play the same behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, that makes sense.
@@ -1,2 +1,10 @@ | |||
class ClientComponentHasNoStatus(Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing the unit tests! The Ambari SDK doesn't make it easy to test your code so I had to resort to this hacky solution.
I'm a newer for submitting on github. I will take a good look at this article, and then hope to do better |
@icanfly I took the liberty to incorporate my comments, squashed your commits and merged them to master. Thanks for the pull request! |
fix the issue: #28