Introduce a new envvar to set a longer timeout for Debian buildds #737

manphiz · 2025-01-18T23:30:30Z

Some of the Debian-supported architectures are very slow and require a longer timeout for some of the tests to finish.

greghendershott · 2025-01-19T03:53:04Z

Thanks for the pull request!

At first I didn't understand this. But digging around, I'm guessing this is for running the Emacs Lisp tests on a buildd instance, for the Debian package called racket-mode?

Can you help me understand the purpose of that Debian package?

Why I ask: Racket Mode is a combination of both (a) Emacs Lisp and (b) Racket code. It is designed for both the Emacs Lisp and Racket code to be updated in sync and delivered together as an Emacs package (or people can install both directly from source, here).

It is definitely not designed/maintained to support people getting some of the code via one method, and some via another method. [In other words, I deliberately don't supply both an Emacs Lisp package and a separate Racket package, and ask users to update both. In fact, I frequently change the API between the two, in ways that don't matter when they're delivered together -- but could break badly if delivered separately and not updated perfectly in lock step.]

At a quick glance at the Debian package's test logs, I see only the Emacs tests being run (e.g. `make test-elisp`). But I don't see the Racket tests being run (`make test-racket`). The right thing to do is run both (`make test`).

So I'm a little concerned that the Debian package might not be doing the right thing?

Can you help me understand more? Thanks!

manphiz · 2025-01-19T07:47:32Z

Hi Greg,

Thanks for your reply, and sorry for not being very clear about the purpose of the PR. Please see my replies below.

Greg Hendershott writes:

> Thanks for the pull request! At first I didn't understand this. But digging around, I'm guessing this is for running the Emacs Lisp tests on a [`buildd`](https://www.debian.org/devel/buildd/) instance, for the Debian package called [`racket-mode`](https://ci.debian.net/packages/r/racket-mode/)?

This is correct.

> Can you help me understand the purpose of that Debian package?

The Debian racket-mode package[1] basically takes the exact content of this repo and ships it as a deb package. The result is like installing through NonGNU ELPA, but it is done using Debian's apt and follows Debian conventions.

> Why I ask: Racket Mode is a combination of both (a) Emacs Lisp and (b) Racket code. It is designed for both the Emacs Lisp and Racket code to be updated in sync and delivered together as an Emacs package (or people can install both directly from source, here).
>
> It is definitely not designed/maintained to support people getting some of the code via one method, and some via another method. [In other words, I deliberately don't supply both an Emacs Lisp package and a separate Racket package, and ask users to update both. In fact, I frequently change the API between the two, in ways that don't matter when they're delivered together -- but could break badly if delivered separately and not updated perfectly in lock step.]

Ack. IIUC racket-mode contains both Elisp code and Racket code to provide editing and debugging support for Emacs. Ideally I would expect the racket-mode package to contain only Elisp code that interacts with a stable Racket interface for operations, but I guess this may be fine until the interface stabilizes. Is this understanding correct?

> At a quick glance at the Debian package's test logs, I see only the Emacs tests being run (e.g. `make test-elisp`). But I don't see the Racket tests being run (`make test-racket`). The right thing to do is run both (`make test`).

Indeed. Actually, the Emacsen team uses a tool called "dh-elpa", which is based on Debian's debhelper, to detect and run ERT and Buttercup tests without relying on any Makefiles or other build systems. Hence only the Elisp-related tests are run.

> So I'm a little concerned that the Debian package might not be doing the right thing?

Or doing half of the work: yes, the Racket tests are currently not run, though personally I think having the Elisp coverage is better than none. I can try to experiment with enabling the other tests.

> Can you help me understand more? Thanks!

I think your understanding is more or less on point. Now for the PR, the purpose is to make the ELisp tests pass reliably on Debian buildds of slower architectures. Currently the only architecture that can finish with a timeout of 60 seconds on any operation is amd64. And when running manually, it looks like starting a REPL session takes a long time, and requires a timeout of 5 minutes on i386, and as long as 15 minutes timeout on arm64. Maybe racket is not well optimized on other architectures yet. Is the approach done in the PR acceptable? And let me know if this can be handled in a better way.

[1] https://tracker.debian.org/pkg/racket-mode

-- Regards, Xiyue Deng

manphiz · 2025-01-19T09:03:29Z

Xiyue Deng writes:

> Now for the PR, the purpose is to make the ELisp tests pass reliably on Debian buildds of slower architectures. Currently the only architecture that can finish with a timeout of 60 seconds on any operation is amd64. And when running manually, it looks like starting a REPL session takes a long time, and requires a timeout of 5 minutes on i386, and as long as 15 minutes timeout on arm64. Maybe racket is not well optimized on other architectures yet. Is the approach done in the PR acceptable? And let me know if this can be handled in a better way.

Actually, it probably helps to disable the timeout altogether so that on slower architectures we just wait for the tests to finish. Is this possible?

-- Regards, Xiyue Deng

greghendershott · 2025-01-19T14:25:12Z

Thank you very much for all the context/information in your previous reply!

> Xiyue Deng writes:
>
> Now for the PR, the purpose is to make the ELisp tests pass reliably on Debian buildds of slower architectures. Currently the only architecture that can finish with a timeout of 60 seconds on any operation is amd64. And when running manually, it looks like starting a REPL session takes a long time, and requires a timeout of 5 minutes on i386, and as long as 15 minutes timeout on arm64. Maybe racket is not well optimized on other architectures yet. Is the approach done in the PR acceptable? And let me know if this can be handled in a better way.
>
> Actually, it probably helps to disable the timeout altogether so that on slower architectures we just wait for the tests to finish. Is this possible?
>
> -- Regards, Xiyue Deng

The timeout is c. 10 years old. I don't recall exact examples. In general:

There are some strict "unit" tests that, if they will ever pass/fail, will do so in seconds or minutes -- but some "integration" tests might never complete at all if there's a bug/problem.

[This includes not only an infinite loop in synchronous code. There are also commands from the Emacs front end to the Racket back end -- these are non-blocking; the back end sends a completion message asynchronously, later. So there exist some tests that must loop, repeatedly checking for a command response or for its desired effect, until the timeout. :( ]
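The polling pattern described here can be sketched roughly as follows. This is only an illustration in Python, not the project's Emacs Lisp test helpers; `wait_until`, `predicate`, and `interval` are hypothetical names chosen for the example.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float,
               interval: float = 0.1) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was observed, False if the deadline
    passed first. A test would typically fail when this returns False.
    """
    # Use a monotonic clock so wall-clock adjustments cannot shorten
    # or lengthen the wait.
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep briefly so the loop does not spin while waiting for the
        # asynchronous response to arrive.
        time.sleep(interval)
```

Without the deadline check, a response that never arrives would leave this loop running forever, which is the situation the timeout guards against.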

So a timeout has been useful when running the tests locally (e.g. on my or a contributor's personal machine).

It also has helped on remote CI (originally Travis, lately GitHub Actions), because sometimes a human is waiting on those results (e.g. before reviewing or merging something). Eventually (~ an hour) GHA will terminate the runner, but that's a long time for a busy human to wait for a test that should have passed/failed in seconds or minutes.

For buildd, if no human is waiting, staring at the screen, and if the buildd instances eventually get killed and cleaned up (do they?), then I agree the tests themselves don't need a timeout.

Because the timeout concept is woven through the tests, it's probably simplest for you to use your original PR, and just change the 900 literal to a Very Large Number. Probably the Emacs Lisp constant most-positive-fixnum?
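For readers following along, the idea of choosing the timeout from the environment might look something like the sketch below. It is written in Python purely for illustration; the actual PR changes Emacs Lisp test code, and the variable name `SLOW_BUILD_ENV` is a hypothetical stand-in, not the name the PR uses. The 60-second default and the 900-second value are the figures mentioned in this thread.

```python
import os
from typing import Mapping

DEFAULT_TIMEOUT = 60          # seconds; the default mentioned in this thread
SLOW_BUILD_TIMEOUT = 15 * 60  # seconds; the 900 literal discussed above


def resolve_timeout(environ: Mapping[str, str] = os.environ) -> int:
    """Return the per-operation test timeout in seconds.

    SLOW_BUILD_ENV is a hypothetical environment variable name used only
    for this example. Any non-empty value selects the longer timeout.
    """
    if environ.get("SLOW_BUILD_ENV"):
        return SLOW_BUILD_TIMEOUT
    return DEFAULT_TIMEOUT
```

Swapping `SLOW_BUILD_TIMEOUT` for a very large number would effectively disable the timeout on slow builders while leaving the default untouched everywhere else.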

manphiz · 2025-01-20T01:26:03Z

Hi Greg,

Greg Hendershott writes:

> For `buildd`, if no human is waiting, staring at the screen, and if the `buildd` instances _eventually_ get killed and cleaned up (do they?), then I agree the tests themselves don't need a timeout.
>
> Because the timeout concept is woven through the tests, it's probably simplest for you to use your original PR, and just change the `900` literal to a Very Large Number. Probably the Emacs Lisp constant `most-positive-fixnum`?

On further testing, it looks like racket-tests/repl is broken and gets stuck forever. I've opened an issue[1] to track it separately. Since some tests may be broken and hang, it seems reasonable to keep a timeout in effect, and I think 15 minutes per operation is probably a good compromise. Wdyt?

[1] #740

-- Regards, Xiyue Deng

Also incorporate the substance of the changes from PR #737. Remove `racket-tests/eventually`, which was intended as a helper function and is a footgun when used directly in tests. Change how `racket-tests/racket-repl` detects the REPL buffer. Update the looking-back tests because the REPL now has a prompt like "repl.rkt>" instead of just ">". Revive the commented-out multiple expression tests, because I revived that functionality quite a while ago.

Introduce a new envvar to set a longer timeout for Debian buildds

9b2abd1

* Some of the Debian supported architectures are very slow and requires a longer timeout for some of the tests to finish.

Skip tests skipped in CI mode when in Debian Buildd mode

1849d94

greghendershott closed this in aaae3ef Jan 20, 2025

manphiz deleted the longer-timeout-for-debian-buildd branch January 21, 2025 06:32
