[ECE] Clarify the steps of identifying the best ZK leader candidate #1598
## Description

Previously, the doc said that users / customers could identify the best ZK leader candidate through logs.

Following a recent sync with @pfcoperez in internal tickets (https://github.com/elastic/sdh-control-plane/issues/9169#issuecomment-2752931242 and https://github.com/elastic/support-tech-lead/issues/1554#issuecomment-2777954862), we decided to rewrite this part.

Specifically, we now recommend that users / customers only collect the essential information. We (support) first check that the essential information has been collected, and then engage the dev team to verify it together.

The motivation is that if either step is handled incorrectly (identifying the best ZK leader candidate, or the recovery itself), this process can potentially corrupt users' data permanently.

In detail:

- We hide the steps to identify the ZK leader in the private section of the KB:
  - Private view: https://support.elastic.dev/knowledge/view/fa410d1f
  - Public view: https://support.elastic.co/knowledge/fa410d1f
- We only guide users / customers to collect the essential information, and ask them to reach out to support.
Some suggestions for you. I want to draw your attention to the idea of hinting at why this is a support-aided process, so people have clarity around the risks.
troubleshoot/deployments/cloud-enterprise/rebuilding-broken-zookeeper-quorum.md
…okeeper-quorum.md Co-authored-by: shainaraskas <[email protected]>
Thank you @shainaraskas. I didn't add it because we already mention this point at the beginning (my bad, I should have stated this more clearly in my initial description 🙏):
The TL;DR is: if the wrong ZK leader candidate is chosen, the whole ECE installation may break and the structure may be permanently lost. A bit more detail:
Previously, the page https://www.elastic.co/docs/troubleshoot/deployments/cloud-enterprise/rebuilding-broken-zookeeper-quorum was itself a KB article, but because it is a popular one, we promoted it to a public doc. Hope it's clear. That said, do you think we should add one more note to emphasize this?
Let me also attach the screenshot here to make it easier for dev friends to review: https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/1598/troubleshoot/deployments/cloud-enterprise/rebuilding-broken-zookeeper-quorum#ece_determine_the_zookeeper_leader
Thanks @kunisen - I did see that and then promptly forgot about the note at the end of my review. That's good enough for me.
looks good to me from a docs POV, but we'll wait on eng review as well
Thank you so much @shainaraskas! Asked internally here 😄
LGTM (not approving intentionally because I want my team to take a peek too). I wonder, though, if you'd like to document, either externally or internally, what to do with the acquired information too.
Thank you all! Will merge this.
Before / After PR merge
Before: [screenshot]
After:
In the public doc (the orange part will show up): [screenshot]
In the public KB, it will show almost the same thing:
https://support.elastic.co/knowledge/fa410d1f
In the KB private section, it will show the details:
https://support.elastic.dev/knowledge/view/fa410d1f
cc @mmahacek @pfcoperez