Syncing from upstream odoo/runbot (17.0) #802
Open · bt-admin wants to merge 23 commits into brain-tec:17.0 from odoo:17.0
- Use f-strings where appropriate.
- Get the repo URL from GitHub instead of composing it by hand.
- Resolve the URI template for the contents endpoint instead of hand-building the URL. The URI template implementation is janky AF.
Also use the fork's response for the fork's own URL instead of composing it by hand.
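As an illustration of the kind of resolution involved (not the actual implementation, which the message itself calls janky), GitHub's repository payload exposes a `contents_url` field as an RFC 6570 template; a minimal Python expansion might look like this:

```python
import re

def expand(template: str, **values: str) -> str:
    # naive RFC 6570 expansion handling only the {var} / {+var} forms
    # used by GitHub's contents endpoint; illustrative, not the real code
    return re.sub(
        r'\{\+?([^}]+)\}',
        lambda m: values.get(m.group(1), ''),
        template,
    )

# repository payloads contain e.g.
# "contents_url": "https://api.github.com/repos/odoo/odoo/contents/{+path}"
url = expand(
    "https://api.github.com/repos/odoo/odoo/contents/{+path}",
    path="setup.py",
)
assert url == "https://api.github.com/repos/odoo/odoo/contents/setup.py"
```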
This should limit the risk of the fork containing outdated versions of the reference branches, which can leave branches and updates with a stale basis and lead to confusing situations.
For historical reasons pretty much all tests used the contexts legal/cla and ci/runbot. A few tests genuinely need the interaction of multiple contexts and that makes sense, but for the vast majority it is just extra traffic and noise in the test (from having to send multiple statuses unnecessarily). In fact, for the average PR where everything passes by default we could even remove the required statuses entirely...
Looks like `--force-with-lease` can work with multi-head updates, and this should allow a single push per repository for the entire sequence of updates, then committing the whole thing if it succeeds.
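A rough sketch of what such a single multi-ref push could look like (branch names are made up; the actual refspecs would come from the update being applied):

```python
import subprocess

# one push per repository for the whole sequence of updates; each target
# ref gets its own refspec, and --force-with-lease makes the server
# reject the push if any of the remote refs moved since we last saw them
subprocess.run(
    [
        "git", "push", "--force-with-lease", "origin",
        "staging-17.0:refs/heads/17.0",
        "staging-18.0:refs/heads/18.0",
    ],
    check=True,
)
```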
This is a bit of an odd case which was only noticed because of persistent forwardport.batches, which ended up producing a ton of related tracebacks in the logs (the mergebot kept trying to create forward ports from Jan 27th to Feb 10th; thankfully the errors happened in git so did not seem to eat through our API rate limiting).

The issue was triggered by the addition of odoo/enterprise#77876 to odoo/odoo#194818. This triggered a completion job which led to the creation of odoo/enterprise#77877 to odoo/enterprise#77880, so far so good. Except the bit of code responsible for creating completion jobs only checked whether the PR was being added to a batch with a descendant. That is the case of odoo/enterprise#77877 to odoo/enterprise#77879 (not odoo/enterprise#77880 because that's the end of the line). As a result, those triggered 3 more completion jobs, which kept failing in a loop because they tried pushing different commits to their next siblings (without forcing, leading git to reject the non-ff push, hurray).

A completion request should only be triggered by the addition of a new *source* (a PR without a source) to an existing batch with descendants, so add that to the check. This requires updating `_from_json` to create PRs in a single step (rather than one step to create based on GitHub's data and another for the hierarchical tracking), as we need the source to be set during `create`, not as a post-action.

Although there was a test which could have triggered this issue, it only had 3 branches, so it was not long enough to hit it:

- Initial PR 1 on branch A merged then forward-ported to B and C.
- Sibling PR 2 added to the batch in B.
- Completed to C.
- Ending there, as C(1) has no descendant batch, leading to no further completion request.

Adding a 4th branch surfaced the issue by providing room for a new completion request from the creation of C(2). Interestingly, even though the test harness attempts to run all triggered crons to completion there can be misses, so the test can fail in two different ways (both now checked for):

- there's a leftover forwardport.batch after we've created all our forwardports
- there's an extra PR targeting D, descending from C(2)
- in almost every case there's also a traceback in the logs, which does fail the build thanks to the `env` fixture's check
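A hedged sketch of the tightened condition (field names are guesses based on the message, not the actual schema):

```python
def should_create_completion_job(pr, batch) -> bool:
    # only a new *source* (a PR without a source of its own) joining a
    # batch which already has descendants warrants a completion request
    return not pr.source_id and bool(batch.descendants)
```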
That's a bit of a weird one: apparently the boolean_toggle widget has an `autosave` option which should be `true` by default, saving the row as soon as the toggle is toggled[^1]. But in 15.0 and 18.0 it seems to have no effect: the `boolean_toggle` always just stores the change in the parent form and that gets "committed" on save. In 16.0 and 17.0 however it does have an effect: toggling the control immediately saves its value *without going through the parent form*, so the override of `Project.write` managing new/existing branches is not called, thus `Project_followup_prs` is not called, and ultimately the followup forward ports are not created.

After contacting AAB to get more info (and grepping a bit):

- autosave was added (enabled by default) in 16.0 after the owl rewrite (odoo/odoo@28e6b7e)
- the toggle was added in 17.0 (odoo/odoo@a449b05)
- the feature was removed in 18.0 (odoo/odoo@6bd2c1f)

Which explains why the issue occurs in 16.0 and 17.0, and not in 15.0 or 18.0.

Fixes #1051

[^1]: but apparently not going through the parent form...
When an fw batch fails, log a message to its chatter so the reason for the failure doesn't necessarily have to be hunted down in the logs (although depending on the quality of the error that might still be an issue).

Also, if a batch keeps failing (fails for more than a day at a retry every hour, increased from the previous 30 minutes), stop retrying it and flag it in the list view: it's clearly not going to go through. This is because we hit an issue with a completion fw batch created on January 27th which kept failing until it was deleted on February 10th. Thankfully it failed on `git push`, and git operations apparently are not rate limited at all, but it's still not great stewardship to keep retrying a forwardport which keeps failing. Retrying in case of transient failure makes sense, but after 24 attempts over a day it's either not transient, or it's not working because GitHub is down and hammering it won't help.
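The retry policy described above could look roughly like this (model, field and method names are illustrative, not the actual runbot_merge schema):

```python
MAX_ATTEMPTS = 24  # one retry per hour for about a day

def process_batch(batch):
    try:
        batch._port()  # hypothetical: perform the forward-port
    except Exception as e:
        # surface the failure in the chatter instead of only in the logs
        batch.message_post(body=f"Forward-port failed: {e}")
        batch.retry_count += 1
        if batch.retry_count >= MAX_ATTEMPTS:
            # clearly not transient: stop retrying and flag it in the list view
            batch.blocked = True
```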
In the case where multiple batch completions are triggered (see 3 commits earlier) one of the issues is that the "already ported" check is incorrect, as it checks the (source, target) pair against the PR used to trigger the completion check. That PR *should* be the one PR which was just added to a batch, but if one of its descendants is used to re-trigger a port then the dupe check fails and we try to port the thing again. Remember to deref' the subject's source in order to perform the dupe check, as well as set the source for the forwardport we create. Fixes #1052
Hit issues where this didn't seem to work after deactivating 17.2 and trying to recover from the autoport being broken (see previous commits), but apparently this says everything works just fine so ╮( ̄▽ ̄"")╭ Fixes #1052
Since forever, if a PR gets added to an existing forward-ported batch it is forward-ported immediately, leading to a forwardport whose source has no merge date. Since the addition of the skipmerge feature, this is even more likely than before. As a result, the `outstanding` listing should not assume the merge_date of a source is always set.
This can yield odd effects when playing around, e.g. opening freeze wizards then cancelling them. There is a check that there are no wizards left, but if a freeze is performed, then another wizard is created and cancelled, forward ports are immediately re-enabled, unexpectedly.
"Run manually" is a bit meh, as it runs the cron synchronously (so you have to wait for it, and hope it doesn't run for longer than the request timeout which may be smaller than the cron timeout) and it can run in a subtly different environment than normal, which can lead to different behaviour. Instead add a button to enqueue a cron trigger, now that they exist that's much closer to what we actually want, and it does run the cron in a normal / expected environment.
`groupby` returns an iterator of `(key, list[value])`, not a recordset, so `BaseModel.__or__` is invalid. I missed that because I didn't bother writing a test where a PR is more than 6 months old and triggers generating / sending emails, but it turns out we do have those in the production database, and as a result the mergebot hasn't sent an outstanding fw notification since I deployed 892fce4 in early February (oops). Thanks to @clbr-odoo for the notification / report of the issue.
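A sketch of the corrected pattern, assuming `odoo.tools.groupby` and the `runbot_merge.pull_requests` model (the exact call site may differ):

```python
from odoo.tools import groupby

def outstanding_per_source(env, prs):
    # groupby yields (key, list_of_records) pairs, not recordsets, so the
    # grouped values cannot be combined with `|` (BaseModel.__or__);
    # concat() rebuilds a proper recordset from the list instead
    for source, group in groupby(prs, key=lambda p: p.source_id):
        yield source, env["runbot_merge.pull_requests"].concat(*group)
```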
Mostly for tests: it can be really difficult to correlate issues as there are 3 different processes involved (the test runner, the odoo being tested, and dummy-central (as github)), and the intermixing of logs between the test driver and the odoo being tested is not *amazing*.

- `pytest-opentelemetry`'s `--export-traces` is the signal for running tests with tracing enabled; that way just having `pytest-opentelemetry` installed does not do anything untoward.
- Until chrisguidry/pytest-opentelemetry#34 is merged, the linked branch should probably be used, as the default / base mode of having a single trace for the entire test suite is not great when there are 460 tests, especially as local clients (opentelemetry-desktop-viewer, venator) mostly work on traces and aren't very good at zooming in on *spans*, at least currently.
- Additionally, the conftest plugin adds hooks for propagating through the xmlrpc client (communications with odoo) and enables the requests instrumentor from the opentelemetry python contribs (see the sketch after this list).
- The dummy `saas_worker` was moved into the filesystem, which makes it easier to review and update.
- A second such module was added for the otel instrumentation *of the odoo under test*; it instruments psycopg2, requests, wsgi, and the server side of xmlrpc.
- The git ops were instrumented directly in runbot_merge, as I've tried to keep `saas_tracing` relatively generic, in case it could be moved to community or used internally eventually.

Some typing was added to the conftest hooks and fixtures, and they were migrated from tmpdir (py.path) to tmp_path (pathlib) for consistency, even though technically the `mkdir` of pathlib is an annoying downgrade.

Fixes #835
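For reference, enabling the contrib instrumentors mentioned above is typically a couple of calls (the exact wiring in the conftest / `saas_tracing` module may differ):

```python
from opentelemetry.instrumentation.psycopg2 import Psycopg2Instrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor

# instrument outgoing HTTP requests and database queries; the wsgi and
# xmlrpc sides need their own wrapping, not shown here
RequestsInstrumentor().instrument()
Psycopg2Instrumentor().instrument()
```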
If a status is defined as `optional`, then the PR is considered valid if the status is never sent, but *if* the status is sent then it becomes required. Note that as ever this is a per-commit requirement, so it's mostly useful for conditional statuses. Fixes #1062
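A minimal sketch of the per-commit check this describes (names are illustrative):

```python
def statuses_valid(required: set[str], optional: set[str],
                   statuses: dict[str, str]) -> bool:
    # a required context must have reported success; an optional context
    # is ignored unless something was sent for it, in which case it is
    # treated exactly like a required one
    for ctx in required:
        if statuses.get(ctx) != "success":
            return False
    for ctx in optional:
        if ctx in statuses and statuses[ctx] != "success":
            return False
    return True
```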
Verify that the tree is different before and after applying the patch; otherwise, if a mistake is made (e.g. a script does not check that its patches have content and requests a patch applying an already-existing commit, which is how odoo/enterprise#612a9cf3cadba64e4b18d535ca0ac7e3f4429a08 occurred), we end up with a completely empty commit and a duplicated commit message.

Fixes #1063

Note: using `rev-parse` to retrieve the commit's tree would be 50% faster, but we're talking 3.2 versus 2.4ms, and it requires string formatting instead of nice clean arguments, so it's a bit meh.
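One way to perform that check, assuming the trees are read back via `git show` rather than `rev-parse` (per the note about argument formatting); the helper names are illustrative:

```python
import subprocess

def tree_of(repo_dir: str, ref: str) -> str:
    # %T prints the commit's tree hash; --no-patch suppresses the diff
    out = subprocess.run(
        ["git", "-C", repo_dir, "show", "--no-patch", "--format=%T", ref],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()

def commit_patch_checked(repo_dir: str, apply_and_commit) -> None:
    before = tree_of(repo_dir, "HEAD")
    apply_and_commit()
    if tree_of(repo_dir, "HEAD") == before:
        raise ValueError("patch did not change the tree, refusing empty commit")
```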
It's shorter, it's simpler (kinda), and it's 70% faster (although that's unlikely to be any sort of bottleneck given applying patches involves multiple network roundtrips).
Commits can take some time to propagate through the network (I guess), or human error can lead to the wrong commit being set. Either way, because the entire thing was done using a single fetch with `check=True`, the cron job would fail entirely if any of the patch commits was not yet available.

Update the updater to:

- fall back on individual fetches
- remove the patch from the set of applicable patches if we (still) can't find its commit

I'd have hoped `fetch` could retrieve whatever it found, but apparently the server just crashes out when it doesn't find the commit we ask for, and `fetch` doesn't update anything.

No linked issue because I apparently forgot to jot it down (and only remembered about this issue with the #1063 patching issue), but this was reported by mat last week (2025-02-21) when they were wondering why one of their patches was taking a while:

- At 0832 the patch was created by an automated script.
- At 0947, an attempt to apply it was made; the commit was not found.
- At 1126, a second attempt was made, but another patch had been created whose commit was not found, failing both.
- At 1255, there was a concurrency error ("cannot lock ref" on the target branch).
- Finally at 1427 the patch was applied.

All in all it took 6 hours to apply the patch, which is 3-4 staging cycles.
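The fallback could be structured roughly as follows (a sketch; function and variable names are made up):

```python
import subprocess

def fetch_available(repo_dir: str, remote: str, commits: list[str]) -> list[str]:
    def fetch(*refs: str) -> bool:
        return subprocess.run(
            ["git", "-C", repo_dir, "fetch", remote, *refs],
        ).returncode == 0

    # try the whole batch first; if the server rejects it because one
    # commit doesn't exist (yet), fall back to fetching one by one and
    # drop the patches whose commit still can't be found
    if fetch(*commits):
        return commits
    return [c for c in commits if fetch(c)]
```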
This used to be done in RPC but the goal is to be able to call this from another server and not rely on a user account.