Deploy Status Page for various Pulsar webservices #147
In this PR I set up two things:
How it Works
Periodically, the `status-run` microservice will be triggered via an HTTP request within Google Cloud (just like `auth-state-cleanup` is), as often as we want. Each run checks our various services and, depending on their current status, updates a JSON file stored in Google Cloud Storage with the current status of everything.
Then, when a user navigates to `status.pulsar-edit.dev`, they are served `status-view` (which of course we will host on Cloudflare so any outage is separated from all of our other services), which reads the public-facing JSON document and displays its results.
Why do this instead of going with a packaged solution?
Honestly, for a short time I almost scrapped all of this and went with Status List, but in the end it turned out their free integration with DigitalOcean only allows three monitors, which didn't feel like enough.
But largely I looked around and couldn't find a solution that was both fully featured enough and free. Now technically, while we don't pay for Cloudflare Pages (`status-view`), we do pay for Cloud Run (`status-run`), but in all reality, at the frequency we run it I expect the cost to be just pennies, so I'm not too worried.
Plus, with everything custom and in-house we can do, literally, whatever we want.
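As a rough illustration of the `status-run` flow described under How it Works, here is a minimal sketch of the check-and-summarize step. Everything named here (`classify`, `assembleDocument`, `runChecks`, the `Probe` type, and the shape of the document) is an assumption for illustration; the actual `status.json` schema may differ, and the upload to Google Cloud Storage is deliberately left out.

```typescript
// Hypothetical shapes: the real status.json schema may look different.
type ServiceState = "up" | "down";

interface ServiceStatus {
  name: string;
  state: ServiceState;
  detail?: string;
}

interface StatusDocument {
  lastUpdated: string; // ISO 8601 timestamp of this run
  services: ServiceStatus[];
}

// A probe resolves to an HTTP status code, or rejects on a network failure.
type Probe = (url: string) => Promise<number>;

// Turn one probe outcome into a status entry: only 2xx codes count as "up".
function classify(name: string, outcome: number | Error): ServiceStatus {
  if (outcome instanceof Error) {
    return { name, state: "down", detail: outcome.message };
  }
  return outcome >= 200 && outcome < 300
    ? { name, state: "up" }
    : { name, state: "down", detail: `HTTP ${outcome}` };
}

// Wrap the per-service results with the time of the run.
function assembleDocument(services: ServiceStatus[], now: Date): StatusDocument {
  return { lastUpdated: now.toISOString(), services };
}

// Probe every target concurrently; a failed probe never aborts the whole run.
async function runChecks(
  targets: Record<string, string>,
  probe: Probe,
  now: Date = new Date(),
): Promise<StatusDocument> {
  const services = await Promise.all(
    Object.entries(targets).map(async ([name, url]) => {
      try {
        return classify(name, await probe(url));
      } catch (err) {
        return classify(name, err instanceof Error ? err : new Error(String(err)));
      }
    }),
  );
  return assembleDocument(services, now);
}
```

Passing the probe in as a parameter keeps the logic testable without network access; in a deployed service it could be as small as `(url) => fetch(url).then((r) => r.status)`.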
What Happens if `status-run` fails to update the status page?
While not implemented at the moment, I plan to have `status-view` check the last-updated timestamp of the `status.json` document. Once we determine how often we want to run this, a document that is out of date will mean the results are considered invalid and something must be wrong.
I'm testing `cloud-view` locally and everything is down!
One hard part about testing this locally is that it relies heavily on Google Cloud Storage, and I haven't found a way to test things properly. Part of the problem is that your `localhost` domain can't access GCS data due to CORS. So locally, every service will appear down with a CORS error.
What's left to do?
Setting up something so reliant on cloud services means I need to deploy it and test partly in the cloud. Luckily this is a brand new subdomain, so nobody is waiting on it. Ideally, someone could review the soundness of the code alone; I'd then deploy and test while keeping the PR open, and make the changes needed to actually get things working. Only then would a final review of the changes get this merged and deployed.
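One piece still open is the freshness check planned for `status-view`. A minimal sketch of what it might look like follows, assuming the `status.json` document carries an ISO 8601 timestamp (called `lastUpdated` here purely for illustration) and that the allowed age is picked once the run frequency is decided. The function name `isStale` and its parameters are hypothetical.

```typescript
// Hypothetical helper: decides whether a status document is too old to trust.
// `lastUpdated` is assumed to be an ISO 8601 string written by status-run.
// `maxAgeMs` is an assumption too; it depends on the run interval we choose.
function isStale(lastUpdated: string, maxAgeMs: number, now: Date = new Date()): boolean {
  const updatedAt = Date.parse(lastUpdated);
  // An unparseable timestamp cannot be trusted, so treat it as stale.
  if (Number.isNaN(updatedAt)) {
    return true;
  }
  return now.getTime() - updatedAt > maxAgeMs;
}
```

Treating a missing or malformed timestamp as stale errs on the side of telling visitors the data may be wrong rather than showing a possibly outdated "all good".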
Additionally, I'd prefer to create a proper `status` endpoint for the following services:
That way they can do much more thorough checks without us increasing the complexity here much at all, which should give us much better results.
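To make the idea concrete, here is one way a service-side `status` endpoint could summarize its own checks, so that `status-run` only needs to look at a single response. This is a sketch under assumptions: the `CheckResult` and `StatusResponse` shapes, the function name `buildStatusResponse`, and the choice of 200 versus 503 are all hypothetical, not something the services expose today.

```typescript
// Hypothetical: each named check reports whether one internal dependency is healthy.
type CheckResult = { ok: boolean; detail?: string };

interface StatusResponse {
  httpStatus: 200 | 503;
  body: { healthy: boolean; checks: Record<string, CheckResult> };
}

// Aggregate the individual checks: the service is healthy only if every check passes.
function buildStatusResponse(checks: Record<string, CheckResult>): StatusResponse {
  const healthy = Object.values(checks).every((check) => check.ok);
  return { httpStatus: healthy ? 200 : 503, body: { healthy, checks } };
}
```

With a response like this, the checker stays a simple HTTP status comparison while each service decides for itself what "healthy" means.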
So please feel free to give any suggestions; it's brand new and something I just wanted to play around with amidst half of the internet going out seemingly every week.