Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix failed during restart in Ambari 2.5 #29

Closed
wants to merge 3 commits into from

Conversation

icanfly
Copy link
Contributor

@icanfly icanfly commented Aug 10, 2017

fix the issue: #28

@petroav petroav self-assigned this Aug 10, 2017
@petroav petroav self-requested a review August 10, 2017 14:10
Copy link
Contributor

@teamsoo teamsoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@petroav Please note that this PR is also upgrading presto to 0.182. Think it should be split into 2 commits and update the metainfo.xml as well.

@icanfly
Copy link
Contributor Author

icanfly commented Aug 11, 2017

@teamsoo thanks for notes, I have noticed these details. I'll update this PR soon

Copy link
Contributor

@petroav petroav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR and apologies for the late review! Can you squash everything into one commit? When writing the commit message can you follow the guidelines here: https://chris.beams.io/posts/git-commit/?

You'll also have to rebase since I just pushed a change to fix the failing build.

config_properties['http-server.http.port']),
all_hosts)
# sleep some seconds for waiting coordinator start finished
time.sleep(30)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the sleep and the comment as they're unnecessary. The PrestoClient code has a retry with timeout mechanism to ensure the server has started.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I have seen it. To add this code just because some errors will be raised (just like ExecutionFailed mentioned blew) . And the retries will be interrupted, Never retry again.
Add some seconds for sleeping will skip the ExecutionFailed exception in most cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@petroav As I said above, I think I have fixed the root cause of the interruption of the error, so this code can now be removed. It may also be possible to keep this code, and the sleep time setting is shorter, just if there are other exceptions that are thrown and then interrupted? What is your opinion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that you fixed the underlying problem yes, I think this sleep should be removed.

Execute('{0} status'.format(daemon_control_script))
except ExecutionFailed as ef:
if ef.code == 3:
raise ComponentIsNotRunning("ComponentIsNotRunning")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage of doing this? Will Ambari recognize the exception and do something special as a result?

Copy link
Contributor Author

@icanfly icanfly Aug 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Ambari only accept zero value as normal return, any other return value will cause an ExecutionFailed exception thrown. As presto status will return the code: 3 .

Also In Ambari 2.5, Ambari will handle ComponentIsNotRunning exception as a normal logic procession during restart like below:
/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py (Ambari Libaray):

image

So. TO AVOID THE EXCEPTION

raise Fail("Stop command finished but process keep running.")

raise a ComponentIsNotRunning exception should be the right way.

I only test it in Ambari 2.5. I'm not sure is any older version play the same behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, that makes sense.

@@ -1,2 +1,10 @@
class ClientComponentHasNoStatus(Exception):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the unit tests! The Ambari SDK doesn't make it easy to test your code so I had to resort to this hacky solution.

@petroav
Copy link
Contributor

petroav commented Aug 17, 2017

@teamsoo it looks like @icanfly reverted the change in a follow up commit. @icanfly if you'd like you can add a second commit to this PR that updates the Presto version to 0.182 in metainfo.xml as @teamsoo suggested as well as updating the links in download.ini.

@icanfly
Copy link
Contributor Author

icanfly commented Aug 21, 2017

I'm a newer for submitting on github. I will take a good look at this article, and then hope to do better

@icanfly icanfly changed the title Fix 'restart presto failed in Ambari 2.5' and Upgrade version of presto component Fix failed during restart in Ambari 2.5 Aug 21, 2017
@petroav
Copy link
Contributor

petroav commented Aug 21, 2017

@icanfly I took the liberty to incorporate my comments, squashed your commits and merged them to master. Thanks for the pull request!

@petroav petroav closed this Aug 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants