How many clusters the daemon can manage #70

Open
ut0f9t opened this issue Aug 29, 2019 · 8 comments

Comments

@ut0f9t

ut0f9t commented Aug 29, 2019

The daemon fails to start when more than 8 clusters are configured.
How can I work around that by using 2 daemons and 2 configuration files?

@lukeskaivolker

Strange, as I have monitored over 60 clusters with a single daemon running.
What error do you get when starting the daemon? How is the performance of the server the daemon runs on?

@ut0f9t
Author

ut0f9t commented Oct 16, 2019

Thanks for your confirmation... it looks like one cluster returns an error and the daemon hangs (the config has to be checked)

2019-10-16 09:18:31,920:urllib3.connectionpool:WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fa92b7f06a0>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /platform/3/cluster/config

but with the current configuration, after removing the cluster that was in error, we see this in the log:
2019-10-16 09:26:18,330 WARNING Connection pool is full, discarding connection:

@ut0f9t
Author

ut0f9t commented Oct 16, 2019

2019-10-16 09:28:57,264:urllib3.connectionpool:WARNING: Connection pool is full, discarding connection

@tenortim
Collaborator

tenortim commented Nov 6, 2019

This is a concern with the way the collector schedules collection from multiple clusters (one problematic cluster can have knock-on effects). I have rewritten the collector in Go and plan to push the new version to github shortly. I intend to update the Python collector to Python 3 and to leverage the async functionality, but currently, the Go collector is more reliable and better-performing.
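
To illustrate the kind of isolation meant here, a minimal asyncio sketch (not the collector's actual code): each cluster is polled in its own task with a timeout, so a failing or hanging cluster cannot stall the others. The names `collect_all` and `poll_cluster`, and the 5-second default timeout, are hypothetical and chosen only for this example.

```python
import asyncio


async def collect_all(clusters, poll_cluster, timeout=5.0):
    """Poll every cluster concurrently and return (results, errors) dicts.

    poll_cluster is a hypothetical coroutine function that takes a cluster
    name and returns its stats; it stands in for the real collection call.
    """
    async def guarded(name):
        # Bound each poll so a hung cluster cannot block the rest.
        return await asyncio.wait_for(poll_cluster(name), timeout)

    outcomes = await asyncio.gather(
        *(guarded(name) for name in clusters),
        return_exceptions=True,  # collect failures instead of aborting the batch
    )

    results, errors = {}, {}
    for name, outcome in zip(clusters, outcomes):
        if isinstance(outcome, Exception):
            errors[name] = outcome
        else:
            results[name] = outcome
    return results, errors
```

With this shape, an unresolvable or slow cluster ends up in `errors` while the remaining clusters are still collected in the same pass.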

@ut0f9t
Author

ut0f9t commented Nov 21, 2019

Thanks, looking forward to the new version in Go. Bye, and thanks again.

@tenortim
Collaborator

I still haven't posted the Go collector externally (I will, I promise), but as noted above, the Python collector is certainly not limited to 8 clusters (we're using it internally for close to 20). It is rather fragile wrt errors at startup however, and I need to fix that. In this case, the startup error is because it can't resolve the cluster name (but the urllib3 errors are terrible and don't actually tell you the name part of the URL). The connectionpool warnings are annoying (I opened a bug for those), but harmless in that they do not prevent the collector from working (it just means that there are more connections than the cached limit, so the collector is having to reconnect more frequently than we would like).
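
Since the urllib3 warning hides which hostname failed, one way to find the offending cluster is to check name resolution for each configured cluster before starting the daemon. This is only a sketch using the standard library: `unresolvable_hosts` is a hypothetical helper, not part of the collector, and port 8080 is an assumption about where the cluster API listens.

```python
import socket


def unresolvable_hosts(hosts, resolver=socket.getaddrinfo, port=8080):
    """Return {host: error message} for each host that fails name resolution.

    resolver defaults to socket.getaddrinfo; it is a parameter only so the
    check can be exercised without touching real DNS.
    """
    failures = {}
    for host in hosts:
        try:
            resolver(host, port)
        except socket.gaierror as exc:
            # Unlike the urllib3 warning, this keeps the hostname with the error.
            failures[host] = str(exc)
    return failures
```

Running it over the cluster addresses from the config file and printing the returned dictionary shows exactly which name produces the "Name or service not known" error.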

@ut0f9t
Author

ut0f9t commented Mar 3, 2020

Hello Tim
Many thanks. I had suspected a connection pool problem with Python 2; you provided a Python 3 version of the collector that I have never tried. I'll keep you informed.
Thanks again

@tenortim
Collaborator

tenortim commented Nov 3, 2021

FYI, the Golang collector is available at https://github.com/tenortim/gostats
