[Queue Time Histogram] Add Job Queue Time Lambda #6435
Are you able to also create some sort of breakdowns/aggregations?
Like Ephemeral vs NonEphemeral, meta-owned vs non-meta-owned, pet vs dynamic, Linux/Mac/Windows, etc.?
Added, I copied some of the file handling code from ci-pct.py.
Added, it will be stored in an array field as runner_labels.
arguments = parse_args()

# update environment variables for input parameters
os.environ["CLICKHOUSE_ENDPOINT"] = arguments.clickhouse_endpoint
Overwriting the original environment value with the CLI argument hides it, which is not a great engineering practice.
It makes more sense to default the argparse CLI arguments from os.environ. You could also use a dedicated Config class if you prefer.
kk, I can make it different!
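For reference, the suggestion in this thread (default the CLI arguments from the environment instead of writing them back into os.environ) could look roughly like the sketch below. The flag name `--clickhouse-endpoint` and the empty-string fallback are assumptions; only `CLICKHOUSE_ENDPOINT` and `arguments.clickhouse_endpoint` come from the diff.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse CLI arguments, using environment variables only as defaults."""
    parser = argparse.ArgumentParser(description="Queue time histogram (sketch)")
    parser.add_argument(
        "--clickhouse-endpoint",  # assumed flag name
        # The environment is read once to supply a default and is never
        # written back, so the original value stays visible to other code.
        default=os.environ.get("CLICKHOUSE_ENDPOINT", ""),
        help="ClickHouse endpoint (defaults to $CLICKHOUSE_ENDPOINT)",
    )
    return parser.parse_args(argv)
```

An explicit flag still wins over the environment, and downstream code reads `arguments.clickhouse_endpoint` rather than `os.environ`.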
runner_labels["other"].add(machine_type)


def create_runner_labels(
Can this piece of code be reused? So we can avoid having to update in multiple places?
Just today Nikita is planning to move macos-m2-15 from apple ownership to ours. So we'll need to update the metrics and aggregations...
If we create many places to update, we're setting ourselves up to make mistakes.
Sharing it with the lambda is a bit tricky. I think I can do it in a BE PR; right now my focus is getting this kicked off.
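As a rough illustration of the reuse concern raised above, one option is a single label table that every consumer imports, so moving a machine type between categories is a one-line change. This is only a sketch: `RUNNER_LABEL_TABLE` and `labels_for` are hypothetical names, the PR's actual `create_runner_labels` signature is not shown here, and every machine type except macos-m2-15 is invented.

```python
from typing import Dict, List, Set

# Hypothetical single source of truth: label -> machine types carrying it.
# "example-linux-runner" is a made-up machine type for illustration only.
RUNNER_LABEL_TABLE: Dict[str, Set[str]] = {
    "linux": {"example-linux-runner"},
    "dynamic": {"example-linux-runner"},
    "macos": {"macos-m2-15"},
}


def labels_for(machine_type: str,
               table: Dict[str, Set[str]] = RUNNER_LABEL_TABLE) -> List[str]:
    """Return the machine type followed by every label it belongs to."""
    labels = sorted(name for name, machines in table.items()
                    if machine_type in machines)
    if not labels:
        # Unknown machine types fall into a catch-all bucket.
        labels = ["other"]
    return [machine_type] + labels
```

With this shape, an ownership change such as the macos-m2-15 move would mean editing one entry in the table rather than several scripts.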
I believe that before we move forward with more details, we need to fix a few design details. Making sure this script is idempotent is critical for any processing pipeline. Please reach out to discuss if you want to :)
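The thread does not say how idempotency should be achieved. One common approach, sketched here under assumptions, is to derive the output location purely from the snapshot window, so a re-run for the same window overwrites the same object instead of adding a duplicate. The 30-minute interval, the key layout, and the name `snapshot_key` are all hypothetical.

```python
from datetime import datetime, timezone

SNAPSHOT_INTERVAL_MINUTES = 30  # assumed interval; must divide 60 evenly


def snapshot_key(now: datetime, prefix: str = "queue_time_snapshots") -> str:
    """Map any timestamp inside a window to that window's single output key."""
    # Expects a timezone-aware datetime; naive values are treated as local time.
    utc = now.astimezone(timezone.utc)
    # Floor to the start of the window so every run in it agrees on the key.
    floored = utc.minute - utc.minute % SNAPSHOT_INTERVAL_MINUTES
    bucket = utc.replace(minute=floored, second=0, microsecond=0)
    return f"{prefix}/{bucket:%Y-%m-%d}/{bucket:%H%M}.json"
```

Two invocations at 10:07 and 10:29 UTC would target the same key, so a retry or a manual backfill replaces the earlier result.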
Approving now, as we agreed to work on the Python script in a follow-up iteration to keep this PR from getting too big.
Fixed some nits, moving to the next PR.
Description
Add an AWS Lambda to generate the in-queue job histogram. Steps:
The snapshot data we generate includes:
{ queue_s, repo, workflow_name, job_name, htm_url, machine_type, time, runner_labels }
runner_labels includes the machine_type and other categories such as linux, dynamic, etc.
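To make the record shape above concrete, here is a minimal sketch of one snapshot row. The field names are copied from the description (including the `htm_url` spelling); the field types, the class name `QueuedJobSnapshot`, and the helper `to_row` are assumptions, not taken from the PR.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class QueuedJobSnapshot:
    """One in-queue job as captured at snapshot time (types are assumed)."""
    queue_s: int            # seconds the job has been waiting in the queue
    repo: str
    workflow_name: str
    job_name: str
    htm_url: str            # spelled as in the PR description
    machine_type: str
    time: str               # snapshot timestamp, format assumed
    runner_labels: List[str] = field(default_factory=list)


def to_row(snapshot: QueuedJobSnapshot) -> Dict[str, Any]:
    """Convert a snapshot into a plain dict ready for serialization."""
    return asdict(snapshot)
```

Keeping `runner_labels` as a list matches the review reply that the labels are stored in an array field.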
Design doc: doc
Working result in S3 (ran locally):
s3 link