feature request: baml-cli test #1345

Closed
sxlijin opened this issue Jan 17, 2025 · 3 comments · Fixed by #1458

Comments

@sxlijin
Collaborator

sxlijin commented Jan 17, 2025

Add a baml-cli test command that runs all tests defined in .baml files.

@prrao87
Contributor

prrao87 commented Feb 12, 2025

I'd very much love to see this happen, thanks!

@sxlijin changed the title from "baml-cli test" to "feature request: baml-cli test" on Feb 12, 2025
@sxlijin
Collaborator Author

sxlijin commented Feb 15, 2025

Implementation is in progress! See #1458 for the first pass (plan is to land it next week) and here are our design notes:

baml-cli test run --include "MyFunction::" --include "AnyMatch" --exclude "::LongTest"

# filter syntax: <part>::<part>, where each part may contain only characters from "*\w\d_"

baml-cli test # this will default to "list" because it costs $$$ to run tests
baml-cli test list
baml-cli test run
baml-cli test --output-formats [pretty | github | junit=path/to/file]
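
One way to read the filter syntax above, as a minimal Python sketch rather than the implementation in #1458. Everything about the matching semantics here is an assumption: that an empty part matches anything, that `*` is a wildcard, that a pattern without `::` may match either the function name or the test name, that matches are whole-name, and that `--exclude` wins over `--include`. The helper names `matches` and `selected` are hypothetical.

```python
import re

# Characters allowed in a filter part per the design notes: "*\w\d_".
# In Python, \w already covers digits and underscore.
_ALLOWED = re.compile(r"[\w*]*")


def _part_to_regex(part):
    """Turn one filter part into a regex; '*' is treated as a wildcard (assumption)."""
    if not _ALLOWED.fullmatch(part):
        raise ValueError(f"invalid filter part: {part!r}")
    if part == "":
        return re.compile(r".*")  # assumption: an empty part matches anything
    return re.compile(".*".join(re.escape(p) for p in part.split("*")))


def matches(pattern, function, test):
    """Check one pattern against a (function name, test name) pair."""
    if "::" in pattern:
        func_part, test_part = pattern.split("::", 1)
        return bool(
            _part_to_regex(func_part).fullmatch(function)
            and _part_to_regex(test_part).fullmatch(test)
        )
    # Assumption: a bare pattern may match either the function or the test name.
    rx = _part_to_regex(pattern)
    return bool(rx.fullmatch(function) or rx.fullmatch(test))


def selected(function, test, include, exclude):
    """Apply include patterns first, then let any exclude pattern veto the test."""
    if include and not any(matches(p, function, test) for p in include):
        return False
    return not any(matches(p, function, test) for p in exclude)
```

Under these assumptions, `selected("MyFunction", "LongTest", ["MyFunction::"], ["::LongTest"])` is rejected by the exclude pattern even though the include pattern matches.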

Product requirements:

  • allow users to enforce that baml tests pass in CI
  • debug loop: what are users going to do when tests fail?
    • test failures need to show logs
    • re-run a specific test
      • can go into the vscode playground for this
      • still need to be able to run the same command as in CI for reproducibility's sake though
  • happy-ish path
    • show # of tests that passed, timing info, # of tests skipped - take cargo test as inspiration (maybe pytest/mocha too)
    • test execution will be concurrent
      • do we need concurrency limits?
        • need to be able to set concurrency down to 1 (claude basic tier), default concurrency should be... ???
      • definitely need timeouts too
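
For the concurrency and timeout questions above, a rough Python sketch of the general technique (a semaphore for the concurrency limit plus a per-test timeout). This is not how baml-cli does it; `run_all`, its parameters, and the status strings are hypothetical, and the defaults are placeholders since the notes leave the default concurrency open.

```python
import asyncio


async def run_all(tests, concurrency=1, timeout_s=30.0):
    """Run (name, coroutine function) pairs with a concurrency cap and per-test timeout."""
    sem = asyncio.Semaphore(concurrency)

    async def run_one(name, test_fn):
        # The semaphore caps how many tests are in flight at once;
        # concurrency=1 runs them strictly one at a time.
        async with sem:
            try:
                # The timeout only covers the test itself, not time spent queued.
                await asyncio.wait_for(test_fn(), timeout=timeout_s)
                return name, "passed"
            except asyncio.TimeoutError:
                return name, "timed out"
            except Exception as exc:
                return name, f"failed: {exc}"

    # gather preserves the input order of the results.
    return await asyncio.gather(*(run_one(n, f) for n, f in tests))
```

Calling `asyncio.run(run_all(tests, concurrency=1))` would serialize the tests, which is the case the notes call out for rate-limited providers.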

@denizkenan

I would also be interested in this.

github-merge-queue bot pushed a commit that referenced this issue Mar 18, 2025
NOT READY FOR MERGE YET: projected land date Feb 21

Design notes:


```bash
# past work
baml-cli test run --include "MyFunction::" --include "AnyMatch" --exclude "::LongTest"

# syntax: <part>::<part>
# syntax: "*\w\d_" for function/test filter clauses

baml-cli test list  # this is the default action because it costs $$$

baml-cli test run  # to run tests
baml-cli test --output-formats [pretty | github | junit=path/to/file.xml]
```

Product requirements:

- allow users to enforce that baml tests pass in CI
    - users may only want to run a subset of tests in CI
    - filter based on file & name (in the future, maybe some kind of “test tag” attribute)
        - `--include`
        - `--exclude`
        - tests for a given function
        - regex
    - `--verbose` - TBD
    - report timing stats - TBD
    - tests may be flakey and need re-running - TBD
    - for github actions: we can use [output groups](https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#grouping-log-lines)
    - for non-github aka generic integration (TBD) junit xml is a must
        - cf [gitlab unit test reports](https://docs.gitlab.com/ee/ci/testing/unit_test_reports.html)
- debug loop: what are users going to do when tests fail?
    - test failures need to show logs
    - re-run a specific test
        - can go into the wasm playground for this
- happy-ish path
    - show # of tests that passed, timing info, # of tests skipped - take `cargo test` as inspiration (maybe pytest/mocha too)
    - test execution will be concurrent
        - do we need concurrency limits?
            - need to be able to set concurrency down to 1 (claude basic tier), default concurrency should be... ???
        - definitely need timeouts too
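
To make the two CI output targets above concrete, here is a small Python sketch of what such renderers could emit. It is not the output of `output_github.rs` or `output_junit.rs`; the function names and the result dictionary shape (`function`, `test`, `seconds`, `failure`, `log`) are assumptions for illustration. The `::group::` / `::endgroup::` lines are the GitHub Actions workflow commands from the link above, and the XML uses the common `testsuite` / `testcase` / `failure` JUnit layout that GitLab-style report parsers read.

```python
import xml.etree.ElementTree as ET


def render_github_groups(results):
    """Wrap each test's log in a collapsible GitHub Actions log group."""
    lines = []
    for r in results:
        status = "passed" if r["failure"] is None else "failed"
        lines.append(f"::group::{r['function']} / {r['test']} ({status})")
        lines.extend(r["log"].splitlines())
        lines.append("::endgroup::")
    return "\n".join(lines)


def render_junit(results, suite_name="baml"):
    """Render results as a single JUnit-style <testsuite> element."""
    failures = sum(1 for r in results if r["failure"] is not None)
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(failures),
    )
    for r in results:
        case = ET.SubElement(
            suite,
            "testcase",
            classname=r["function"],  # assumption: the BAML function plays the "class" role
            name=r["test"],
            time=f"{r['seconds']:.3f}",
        )
        if r["failure"] is not None:
            failure = ET.SubElement(case, "failure", message=r["failure"])
            failure.text = r["log"]  # logs travel with the failure for the debug loop
    return ET.tostring(suite, encoding="unicode")
```

Attaching the log to the `failure` element is one way to meet the "test failures need to show logs" requirement in CI systems that only surface the JUnit report.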

Fixes #1345 
<!-- ELLIPSIS_HIDDEN -->


----

> [!IMPORTANT]
> Adds `baml-cli test` command for running BAML tests with filtering, output formats, and concurrency options, along with `.env` file support and test execution integration with GitHub Actions and JUnit XML reporting.
>
>   - **CLI**:
>     - Adds `baml-cli test` command in `commands.rs` to run BAML tests with options for filtering (`--include`, `--exclude`), output formats (`--output-format`), and concurrency (`--parallel`).
>     - Integrates test execution with GitHub Actions and JUnit XML reporting in `output_github.rs` and `output_junit.rs`.
>   - **Environment**:
>     - Implements `.env` file loading in `dotenv/mod.rs` with support for multiline strings, variable interpolation, and escape sequences.
>     - Adds tests for `.env` loading in `dotenv/tests.rs`.
>   - **Test Execution**:
>     - Introduces `TestExecutor` trait in `test_executor/mod.rs` for running tests with status tracking and rendering.
>     - Implements `PrettyTestExecutionStatusRenderer`, `GithubTestExecutionStatusRenderer`, and `JUnitXMLRenderer` for different output formats.
>     - Adds `TestFilter` in `test_execution_args.rs` for filtering tests based on patterns.
> 
> <sup>This description was created by </sup>[<img alt="Ellipsis"
src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=BoundaryML%2Fbaml&utm_source=github&utm_medium=referral)<sup>
for 01e78d6. It will automatically
update as commits are pushed.</sup>


<!-- ELLIPSIS_HIDDEN -->

---------

Co-authored-by: Vaibhav Gupta <[email protected]>