Skip to content

Fallible systems #16589

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
Dec 5, 2024
Merged

Fallible systems #16589

merged 25 commits into from
Dec 5, 2024

Conversation

NthTensor
Copy link
Contributor

@NthTensor NthTensor commented Dec 1, 2024

Objective

Error handling in bevy is hard. See for reference #11562, #10874 and #12660. The goal of this PR is to make it better, by allowing users to optionally return Result from systems as outlined by Cart in #14275 (comment).

Solution

This PR introduces a new ScheuleSystem type to represent systems that can be added to schedules. Instances of this type contain either an infallible BoxedSystem<(), ()> or a fallible BoxedSystem<(), Result>. ScheuleSystem implements System<In = (), Out = Result> and replaces all uses of BoxedSystem in schedules. The async executor now receives a result after executing a system, which for infallible systems is always Ok(()). Currently it ignores this result, but more useful error handling could also be implemented.

Aliases for Error and Result have been added to the bevy_ecs prelude, as well as const OK which new users may find more friendly than Ok(()).

Testing

  • Currently there are not actual semantics changes that really require new tests, but I added a basic one just to make sure we don't break stuff in the future.
  • The behavior of existing systems is totally unchanged, including logging.
  • All of the existing systems tests pass, and I have not noticed anything strange while playing with the examples

Showcase

The following minimal example prints "hello world" once, then completes.

use bevy::prelude::*;

fn main() {
    App::new().add_systems(Update, hello_world_system).run();
}

fn hello_world_system() -> Result {
    println!("hello world");
    Err("string")?;
    println!("goodbye world");
    OK
}

Migration Guide

This change should be pretty much non-breaking, except for users who have implemented their own custom executors. Those users should use ScheduleSystem in place of BoxedSystem<(), ()> and import the System trait where needed. They can choose to do whatever they wish with the result.

Current Work

  • Fix tests & doc comments
  • Write more tests
  • Add examples
  • Draft release notes

Draft Release Notes

As of this release, systems can now return results.

First a bit of background: Bevy has hisotrically expected systems to return the empty type (). While this makes sense in the context of the ecs, it's at odds with how error handling is typically done in rust: returning Result::Error to indicate failure, and using the short-circuiting ? operator to propagate that error up the call stack to where it can be properly handled. Users of functional languages will tell you this is called "monadic error handling".

Not being able to return Results from systems left bevy users with a quandry. They could add custom error handling logic to every system, or manually pipe every system into an error handler, or perhaps sidestep the issue with some combination of fallible assignents, logging, macros, and early returns. Often, users would just litter their systems with unwraps and possible panics.

While any one of these approaches might be fine for a particular user, each of them has their own drawbacks, and none makes good use of the language. Serious issues could also arrise when two different crates used by the same project made different choices about error handling.

Now, by returning results, systems can defer error handling to the application itself. It looks like this:

// Previous, handling internally
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
   let Ok(window) = query.get_single() else {
       return;
   };
   // ... do something to the window here
}

// Previous, handling externally
app.add_systems(my_system.pipe(my_error_handler))
fn my_system(window: Query<&Window>) -> Result<(), impl Error> {
   let window = query.get_single()?;
   // ... do something to the window here
   Ok(())
}

// Previous, panicking
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
   let window = query.single();
   // ... do something to the window here
}

// Now 
app.add_systems(my_system)
fn my_system(window: Query<&Window>) -> Result {
    let window = query.get_single()?;
    // ... do something to the window here
    Ok(())
}

There are currently some limitations. Systems must either return () or Result<(), Box<dyn Error + Send + Sync + 'static>>, with no in-between. Results are also ignored by default, and though implementing a custom handler is possible, it involves writing your own custom ecs executor (which is not recomended).

Systems should return errors when they cannot perform their normal behavior. In turn, errors returned to the executor while running the schedule will (eventually) be treated as unexpected. Users and library authors should prefer to return errors for anything that disrupts the normal expected behavior of a system, and should only handle expected cases internally.

We have big plans for improving error handling further:

  • Allowing users to change the error handling logic of the default executors.
  • Adding source tracking and optional backtraces to errors.
  • Possibly adding tracing-levels (Error/Warn/Info/Debug/Trace) to errors.
  • Generally making the default error logging more helpful and inteligent.
  • Adding monadic system combininators for fallible systems.
  • Possibly removing all panicking variants from our api.

@NthTensor NthTensor added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Dec 1, 2024
@bushrat011899
Copy link
Contributor

Even without the actual error handling benefits this provides, just having a more blessed way to use ? in systems will be really nice. I know we can just pipe the Result with current systems, but this will hide that bit of extra boilerplate. This also pairs nicely with making more APIs return Result instead of Option, and also makes panicking variants less important (possibly even removable TBH).

Copy link
Contributor

@bushrat011899 bushrat011899 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this probably needs an example, but I like the approach. Opens up the possibility of having error handlers in the future, which would resolve the to-panic or not to-panic debate entirely. This also lays the groundwork for how fallibility in Commands could work. Really nice work!

@bushrat011899
Copy link
Contributor

It was mentioned in Discord, but I'll include it here for posterity: with fallible systems getting first-class treatment, there may be room to consider removing the panicking variants of certain functions (e.g., Query::get_entity and Query::entity), since the choice of behaviour could be controlled by a system error handler. This would be a large DX win, since the "proper" methods would get the shorter names, and it'd reduce the API surface area.

@NthTensor
Copy link
Contributor Author

NthTensor commented Dec 2, 2024

there may be room to consider removing the panicking variants of certain functions

That's in line with the third point Cart proposed in #14275 (comment). He indicated then that it was important to land all the related changes in a single release cycle, and I agree. This PR provides his (1), what (2) and (3) look like is up to @alice-i-cecile and the other designated ecs experts.

Copy link
Contributor

@tychedelia tychedelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing to see how straightforward this is, all things considered. Very excited!

@teohhanhui
Copy link

teohhanhui commented Dec 2, 2024

as well as const OK which new users may find more friendly than Ok(()).

Why? This just makes the code more jarring compared to the rest of the Rust ecosystem, and more cognitive load to switch between returning nothing vs. returning some value.

It'd make sense if it's something useful like https://docs.rs/anyhow/latest/anyhow/fn.Ok.html

@NthTensor
Copy link
Contributor Author

This just makes the code more jarring compared to the rest of the Rust ecosystem

In this I am trying to defer to my understanding of Cart's preferences. He uses a const in the linked issue, and I believe has expressed that Ok(()) is sort of confusing and cumbersome. No strong preference here from me really.

Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nth this is good to go once it's merge-conflict free. I do prefer the "everything is fallible" approach by WrongShoe, but that's easily left to a follow-up refactor. Let's get the ball rolling here.

@alice-i-cecile alice-i-cecile added this pull request to the merge queue Dec 5, 2024
Merged via the queue into bevyengine:main with commit 0070514 Dec 5, 2024
31 of 32 checks passed
@mockersf mockersf mentioned this pull request Dec 26, 2024
github-merge-queue bot pushed a commit that referenced this pull request Dec 27, 2024
# Objective

- First step for #16718 
- #16589 introduced an api that can only ignore errors, which is risky

## Solution

- Panic instead of just ignoring the errors

## Testing

- Changed the `fallible_systems` example to return an error
```
Encountered an error in system `fallible_systems::setup`: TooManyVertices { subdivisions: 300, number_of_resulting_points: 906012 }
Encountered a panic in system `fallible_systems::setup`!
Encountered a panic in system `bevy_app::main_schedule::Main::run_main`!
```
@PixelDust22
Copy link
Contributor

Might be a bit late on this discussion, but this particular implementation feels a bit intrusive to me. Can we instead implement IntoSystemConfigs on systems that returns Result so that those systems can be added to the schedule like a normal system?

github-merge-queue bot pushed a commit that referenced this pull request Dec 31, 2024
# Objective

- #16589 added an enum to switch between fallible and infallible system.
This branching should be unnecessary if we wrap infallible systems in a
function to return `Ok(())`.

## Solution

- Create a wrapper system for `System<(), ()>`s that returns `Ok` on the
call to `run` and `run_unsafe`. The wrapper should compile out, but I
haven't checked.
- I removed the `impl IntoSystemConfigs for BoxedSystem<(), ()>` as I
couldn't figure out a way to keep the impl without double boxing.

## Testing

- ran `many_foxes` example to check if it still runs.

## Migration Guide

- `IntoSystemConfigs` has been removed for `BoxedSystem<(), ()>`. Either
use `InfallibleSystemWrapper` before boxing or make your system return
`bevy::ecs::prelude::Result`.
ecoskey pushed a commit to ecoskey/bevy that referenced this pull request Jan 6, 2025
# Objective

Error handling in bevy is hard. See for reference
bevyengine#11562,
bevyengine#10874 and
bevyengine#12660. The goal of this PR is
to make it better, by allowing users to optionally return `Result` from
systems as outlined by Cart in
<bevyengine#14275 (comment)>.

## Solution

This PR introduces a new `ScheuleSystem` type to represent systems that
can be added to schedules. Instances of this type contain either an
infallible `BoxedSystem<(), ()>` or a fallible `BoxedSystem<(),
Result>`. `ScheuleSystem` implements `System<In = (), Out = Result>` and
replaces all uses of `BoxedSystem` in schedules. The async executor now
receives a result after executing a system, which for infallible systems
is always `Ok(())`. Currently it ignores this result, but more useful
error handling could also be implemented.

Aliases for `Error` and `Result` have been added to the `bevy_ecs`
prelude, as well as const `OK` which new users may find more friendly
than `Ok(())`.

## Testing

- Currently there are not actual semantics changes that really require
new tests, but I added a basic one just to make sure we don't break
stuff in the future.
- The behavior of existing systems is totally unchanged, including
logging.
- All of the existing systems tests pass, and I have not noticed
anything strange while playing with the examples

## Showcase

The following minimal example prints "hello world" once, then completes.

```rust
use bevy::prelude::*;

fn main() {
    App::new().add_systems(Update, hello_world_system).run();
}

fn hello_world_system() -> Result {
    println!("hello world");
    Err("string")?;
    println!("goodbye world");
    OK
}
```

## Migration Guide

This change should be pretty much non-breaking, except for users who
have implemented their own custom executors. Those users should use
`ScheduleSystem` in place of `BoxedSystem<(), ()>` and import the
`System` trait where needed. They can choose to do whatever they wish
with the result.

## Current Work

+ [x] Fix tests & doc comments
+ [x] Write more tests
+ [x] Add examples
+ [X] Draft release notes

## Draft Release Notes

As of this release, systems can now return results.

First a bit of background: Bevy has hisotrically expected systems to
return the empty type `()`. While this makes sense in the context of the
ecs, it's at odds with how error handling is typically done in rust:
returning `Result::Error` to indicate failure, and using the
short-circuiting `?` operator to propagate that error up the call stack
to where it can be properly handled. Users of functional languages will
tell you this is called "monadic error handling".

Not being able to return `Results` from systems left bevy users with a
quandry. They could add custom error handling logic to every system, or
manually pipe every system into an error handler, or perhaps sidestep
the issue with some combination of fallible assignents, logging, macros,
and early returns. Often, users would just litter their systems with
unwraps and possible panics.

While any one of these approaches might be fine for a particular user,
each of them has their own drawbacks, and none makes good use of the
language. Serious issues could also arrise when two different crates
used by the same project made different choices about error handling.

Now, by returning results, systems can defer error handling to the
application itself. It looks like this:

```rust
// Previous, handling internally
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
   let Ok(window) = query.get_single() else {
       return;
   };
   // ... do something to the window here
}

// Previous, handling externally
app.add_systems(my_system.pipe(my_error_handler))
fn my_system(window: Query<&Window>) -> Result<(), impl Error> {
   let window = query.get_single()?;
   // ... do something to the window here
   Ok(())
}

// Previous, panicking
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
   let window = query.single();
   // ... do something to the window here
}

// Now 
app.add_systems(my_system)
fn my_system(window: Query<&Window>) -> Result {
    let window = query.get_single()?;
    // ... do something to the window here
    Ok(())
}
```

There are currently some limitations. Systems must either return `()` or
`Result<(), Box<dyn Error + Send + Sync + 'static>>`, with no
in-between. Results are also ignored by default, and though implementing
a custom handler is possible, it involves writing your own custom ecs
executor (which is *not* recomended).

Systems should return errors when they cannot perform their normal
behavior. In turn, errors returned to the executor while running the
schedule will (eventually) be treated as unexpected. Users and library
authors should prefer to return errors for anything that disrupts the
normal expected behavior of a system, and should only handle expected
cases internally.

We have big plans for improving error handling further:
+ Allowing users to change the error handling logic of the default
executors.
+ Adding source tracking and optional backtraces to errors.
+ Possibly adding tracing-levels (Error/Warn/Info/Debug/Trace) to
errors.
+ Generally making the default error logging more helpful and
inteligent.
+ Adding monadic system combininators for fallible systems.
+ Possibly removing all panicking variants from our api.

---------

Co-authored-by: Zachary Harrold <[email protected]>
ecoskey pushed a commit to ecoskey/bevy that referenced this pull request Jan 6, 2025
# Objective

- First step for bevyengine#16718 
- bevyengine#16589 introduced an api that can only ignore errors, which is risky

## Solution

- Panic instead of just ignoring the errors

## Testing

- Changed the `fallible_systems` example to return an error
```
Encountered an error in system `fallible_systems::setup`: TooManyVertices { subdivisions: 300, number_of_resulting_points: 906012 }
Encountered a panic in system `fallible_systems::setup`!
Encountered a panic in system `bevy_app::main_schedule::Main::run_main`!
```
ecoskey pushed a commit to ecoskey/bevy that referenced this pull request Jan 6, 2025
# Objective

- bevyengine#16589 added an enum to switch between fallible and infallible system.
This branching should be unnecessary if we wrap infallible systems in a
function to return `Ok(())`.

## Solution

- Create a wrapper system for `System<(), ()>`s that returns `Ok` on the
call to `run` and `run_unsafe`. The wrapper should compile out, but I
haven't checked.
- I removed the `impl IntoSystemConfigs for BoxedSystem<(), ()>` as I
couldn't figure out a way to keep the impl without double boxing.

## Testing

- ran `many_foxes` example to check if it still runs.

## Migration Guide

- `IntoSystemConfigs` has been removed for `BoxedSystem<(), ()>`. Either
use `InfallibleSystemWrapper` before boxing or make your system return
`bevy::ecs::prelude::Result`.
mrchantey pushed a commit to mrchantey/bevy that referenced this pull request Feb 4, 2025
# Objective

- First step for bevyengine#16718 
- bevyengine#16589 introduced an api that can only ignore errors, which is risky

## Solution

- Panic instead of just ignoring the errors

## Testing

- Changed the `fallible_systems` example to return an error
```
Encountered an error in system `fallible_systems::setup`: TooManyVertices { subdivisions: 300, number_of_resulting_points: 906012 }
Encountered a panic in system `fallible_systems::setup`!
Encountered a panic in system `bevy_app::main_schedule::Main::run_main`!
```
mrchantey pushed a commit to mrchantey/bevy that referenced this pull request Feb 4, 2025
# Objective

- bevyengine#16589 added an enum to switch between fallible and infallible system.
This branching should be unnecessary if we wrap infallible systems in a
function to return `Ok(())`.

## Solution

- Create a wrapper system for `System<(), ()>`s that returns `Ok` on the
call to `run` and `run_unsafe`. The wrapper should compile out, but I
haven't checked.
- I removed the `impl IntoSystemConfigs for BoxedSystem<(), ()>` as I
couldn't figure out a way to keep the impl without double boxing.

## Testing

- ran `many_foxes` example to check if it still runs.

## Migration Guide

- `IntoSystemConfigs` has been removed for `BoxedSystem<(), ()>`. Either
use `InfallibleSystemWrapper` before boxing or make your system return
`bevy::ecs::prelude::Result`.
JeanMertz added a commit to dcdpr/bevy that referenced this pull request Feb 7, 2025
This commit builds on top of the work done in bevyengine#16589 and bevyengine#17051, by
adding support for fallible observer systems.

As with the previous work, the actual results of the observer system
are suppressed by default, but the intention is to provide a way to
handle errors in a global way.

Until then, you can use a `PipeSystem` to manually handle results.

Signed-off-by: Jean Mertz <[email protected]>
JeanMertz added a commit to dcdpr/bevy that referenced this pull request Feb 7, 2025
This commit builds on top of the work done in bevyengine#16589 and bevyengine#17051, by
adding support for fallible observer systems.

As with the previous work, the actual results of the observer system
are suppressed by default, but the intention is to provide a way to
handle errors in a global way.

Until then, you can use a `PipeSystem` to manually handle results.

Signed-off-by: Jean Mertz <[email protected]>
JeanMertz added a commit to dcdpr/bevy that referenced this pull request Feb 9, 2025
You can now configure error handlers for fallible systems. These can be
configured on several levels:

- Globally via `App::set_systems_error_handler`
- Per-schedule via `Schedule::set_error_handler`
- Per-system via a piped system (this is existing functionality)

The "fallible_systems" example demonstrates the new functionality.

This builds on top of bevyengine#17731, bevyengine#16589, bevyengine#17051.

Signed-off-by: Jean Mertz <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Feb 11, 2025
You can now configure error handlers for fallible systems. These can be
configured on several levels:

- Globally via `App::set_systems_error_handler`
- Per-schedule via `Schedule::set_error_handler`
- Per-system via a piped system (this is existing functionality)

The default handler of panicking on error keeps the same behavior as
before this commit.

The "fallible_systems" example demonstrates the new functionality.

This builds on top of #17731, #16589, #17051.

---------

Signed-off-by: Jean Mertz <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Feb 11, 2025
This commit builds on top of the work done in #16589 and #17051, by
adding support for fallible observer systems.

As with the previous work, the actual results of the observer system are
suppressed for now, but the intention is to provide a way to handle
errors in a global way.

Until then, you can use a `PipeSystem` to manually handle results.

---------

Signed-off-by: Jean Mertz <[email protected]>
@alice-i-cecile
Copy link
Member

Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to bevyengine/bevy-website#1967 if you'd like to help out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible C-Usability A targeted quality-of-life change that makes Bevy easier to use M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide M-Needs-Release-Note Work that should be called out in the blog due to impact S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it X-Contentious There are nontrivial implications that should be thought through
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants