Fallible systems #16589

NthTensor · 2024-12-01T22:19:08Z

Objective

Error handling in bevy is hard. See for reference #11562, #10874 and #12660. The goal of this PR is to make it better, by allowing users to optionally return Result from systems as outlined by Cart in #14275 (comment).

Solution

This PR introduces a new ScheduleSystem type to represent systems that can be added to schedules. Instances of this type contain either an infallible BoxedSystem<(), ()> or a fallible BoxedSystem<(), Result>. ScheduleSystem implements System<In = (), Out = Result> and replaces all uses of BoxedSystem in schedules. The async executor now receives a result after executing a system, which for infallible systems is always Ok(()). Currently it ignores this result, but more useful error handling could also be implemented.
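
To make the shape of that type concrete, here is a minimal sketch in plain Rust with no Bevy dependency. `ToyScheduleSystem`, `BoxedError`, `SystemResult`, and the helper functions are hypothetical names chosen for illustration; the real type wraps `BoxedSystem` values rather than closures, so treat this as a model of the idea and not as the PR's code.

```rust
use std::error::Error;

// Assumed alias for the boxed error type; the name is illustrative only.
type BoxedError = Box<dyn Error + Send + Sync + 'static>;
type SystemResult = Result<(), BoxedError>;

// Toy stand-in for a schedule entry that holds either kind of system.
enum ToyScheduleSystem {
    Infallible(Box<dyn FnMut() + Send>),
    Fallible(Box<dyn FnMut() -> SystemResult + Send>),
}

impl ToyScheduleSystem {
    // Running always yields a result; infallible systems report Ok(()).
    fn run(&mut self) -> SystemResult {
        match self {
            ToyScheduleSystem::Infallible(system) => {
                system();
                Ok(())
            }
            ToyScheduleSystem::Fallible(system) => system(),
        }
    }
}

fn greet() {
    println!("hello world");
}

fn fail() -> SystemResult {
    Err("something went wrong".into())
}

// Runs one system of each kind and counts how many returned an error.
fn demo_error_count() -> usize {
    let mut systems = vec![
        ToyScheduleSystem::Infallible(Box::new(greet)),
        ToyScheduleSystem::Fallible(Box::new(fail)),
    ];
    systems
        .iter_mut()
        .map(|system| system.run())
        .filter(|result| result.is_err())
        .count()
}
```

The caller only ever sees a `Result`, which is what lets an executor treat both kinds of system uniformly.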

Aliases for Error and Result have been added to the bevy_ecs prelude, as well as const OK which new users may find more friendly than Ok(()).
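
As a rough illustration of what such aliases could look like, the sketch below defines them inside a hypothetical `sketch` module. The definitions are an assumption based on the boxed error type named in the draft release notes; the exact declarations in `bevy_ecs` may differ.

```rust
mod sketch {
    // Assumed definitions, modeled on the error type named in the release notes.
    pub type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
    pub type Result<T = (), E = Error> = core::result::Result<T, E>;
    pub const OK: Result = Ok(());

    // A system-like function that uses the aliases.
    pub fn check_positive(value: i32) -> Result {
        if value <= 0 {
            // A &str converts into the boxed error type.
            return Err("value must be positive".into());
        }
        OK
    }
}
```

With the default type parameters, a bare `Result` in a signature reads the same as `Result<(), Error>`, and `OK` is just a named `Ok(())`.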

Testing

Currently there are no actual semantic changes that really require new tests, but I added a basic one to make sure we don't break anything in the future.
The behavior of existing systems is totally unchanged, including logging.
All of the existing system tests pass, and I have not noticed anything strange while playing with the examples.

Showcase

The following minimal example prints "hello world" once, then completes.

use bevy::prelude::*;

fn main() {
    App::new().add_systems(Update, hello_world_system).run();
}

fn hello_world_system() -> Result {
    println!("hello world");
    Err("string")?;
    println!("goodbye world");
    OK
}

Migration Guide

This change should be pretty much non-breaking, except for users who have implemented their own custom executors. Those users should use ScheduleSystem in place of BoxedSystem<(), ()> and import the System trait where needed. They can choose to do whatever they wish with the result.
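
For authors of custom executors, the sketch below shows one thing an executor might choose to do with the result: keep running and record a message per failed system. It is plain Rust with hypothetical names (`NamedSystem`, `run_and_report`) and does not use the real executor traits.

```rust
use std::error::Error;

type BoxedError = Box<dyn Error + Send + Sync + 'static>;

// Hypothetical schedule entry: a name plus a function that reports a result.
struct NamedSystem {
    name: &'static str,
    run: fn() -> Result<(), BoxedError>,
}

fn load() -> Result<(), BoxedError> {
    Ok(())
}

fn save() -> Result<(), BoxedError> {
    Err("disk full".into())
}

// A custom executor decides what a failure means. This one continues with
// the remaining systems and collects one line per failure.
fn run_and_report(systems: &[NamedSystem]) -> Vec<String> {
    let mut report = Vec::new();
    for system in systems {
        if let Err(error) = (system.run)() {
            report.push(format!("system `{}` failed: {}", system.name, error));
        }
    }
    report
}

fn demo_report() -> Vec<String> {
    let systems = [
        NamedSystem { name: "load", run: load },
        NamedSystem { name: "save", run: save },
    ];
    run_and_report(&systems)
}
```

Other policies (panicking, aborting the schedule, logging at a chosen level) fit in the same place, inside the `if let Err` branch.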

Current Work

Fix tests & doc comments
Write more tests
Add examples
Draft release notes

Draft Release Notes

As of this release, systems can now return results.

First a bit of background: Bevy has historically expected systems to return the empty type (). While this makes sense in the context of the ecs, it's at odds with how error handling is typically done in Rust: returning Result::Err to indicate failure, and using the short-circuiting ? operator to propagate that error up the call stack to where it can be properly handled. Users of functional languages will tell you this is called "monadic error handling".

Not being able to return Results from systems left bevy users with a quandary. They could add custom error handling logic to every system, or manually pipe every system into an error handler, or perhaps sidestep the issue with some combination of fallible assignments, logging, macros, and early returns. Often, users would just litter their systems with unwraps and possible panics.

While any one of these approaches might be fine for a particular user, each of them has its own drawbacks, and none makes good use of the language. Serious issues could also arise when two different crates used by the same project made different choices about error handling.

Now, by returning results, systems can defer error handling to the application itself. It looks like this:

// Previous, handling internally
app.add_systems(Update, my_system);
fn my_system(query: Query<&Window>) {
    let Ok(window) = query.get_single() else {
        return;
    };
    // ... do something to the window here
}

// Previous, handling externally
app.add_systems(Update, my_system.pipe(my_error_handler));
fn my_system(query: Query<&Window>) -> Result<(), impl Error> {
    let window = query.get_single()?;
    // ... do something to the window here
    Ok(())
}

// Previous, panicking
app.add_systems(Update, my_system);
fn my_system(query: Query<&Window>) {
    let window = query.single();
    // ... do something to the window here
}

// Now
app.add_systems(Update, my_system);
fn my_system(query: Query<&Window>) -> Result {
    let window = query.get_single()?;
    // ... do something to the window here
    Ok(())
}

There are currently some limitations. Systems must either return () or Result<(), Box<dyn Error + Send + Sync + 'static>>, with no in-between. Results are also ignored by default, and though implementing a custom handler is possible, it involves writing your own custom ecs executor (which is not recommended).
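
One practical consequence of the boxed error type is that `?` accepts many different error types in the same function, because the standard library provides conversions into `Box<dyn Error + Send + Sync>`. The sketch below is plain Rust; `check_even_number` and `BoxedError` are hypothetical names used only for illustration.

```rust
use std::error::Error;

type BoxedError = Box<dyn Error + Send + Sync + 'static>;

// A system-shaped function: it returns () on success and a boxed error otherwise.
fn check_even_number(input: &str) -> Result<(), BoxedError> {
    // `?` converts the ParseIntError into the boxed error type.
    let value: i32 = input.trim().parse()?;
    if value % 2 != 0 {
        // A String converts into the boxed error type as well.
        return Err(format!("{value} is odd").into());
    }
    Ok(())
}
```

Both failure paths end up as the same return type, so callers do not need to know which concrete error occurred unless they choose to downcast.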

Systems should return errors when they cannot perform their normal behavior. In turn, errors returned to the executor while running the schedule will (eventually) be treated as unexpected. Users and library authors should prefer to return errors for anything that disrupts the normal expected behavior of a system, and should only handle expected cases internally.

We have big plans for improving error handling further:

Allowing users to change the error handling logic of the default executors.
Adding source tracking and optional backtraces to errors.
Possibly adding tracing-levels (Error/Warn/Info/Debug/Trace) to errors.
Generally making the default error logging more helpful and intelligent.
Adding monadic system combinators for fallible systems.
Possibly removing all panicking variants from our api.

bushrat011899 · 2024-12-01T22:51:46Z

Even without the actual error handling benefits this provides, just having a more blessed way to use ? in systems will be really nice. I know we can just pipe the Result with current systems, but this will hide that bit of extra boilerplate. This also pairs nicely with making more APIs return Result instead of Option, and also makes panicking variants less important (possibly even removable TBH).

bushrat011899

I agree this probably needs an example, but I like the approach. Opens up the possibility of having error handlers in the future, which would resolve the to-panic or not to-panic debate entirely. This also lays the groundwork for how fallibility in Commands could work. Really nice work!

crates/bevy_ecs/src/lib.rs

crates/bevy_ecs/src/schedule/executor/mod.rs

crates/bevy_ecs/src/schedule/executor/multi_threaded.rs

bushrat011899 · 2024-12-02T03:12:27Z

It was mentioned in Discord, but I'll include it here for posterity: with fallible systems getting first-class treatment, there may be room to consider removing the panicking variants of certain functions (e.g., Query::get_entity and Query::entity), since the choice of behaviour could be controlled by a system error handler. This would be a large DX win, since the "proper" methods would get the shorter names, and it'd reduce the API surface area.

NthTensor · 2024-12-02T04:31:38Z

there may be room to consider removing the panicking variants of certain functions

That's in line with the third point Cart proposed in #14275 (comment). He indicated then that it was important to land all the related changes in a single release cycle, and I agree. This PR provides his (1), what (2) and (3) look like is up to @alice-i-cecile and the other designated ecs experts.

tychedelia

Amazing to see how straightforward this is, all things considered. Very excited!

crates/bevy_ecs/src/result.rs

teohhanhui · 2024-12-02T06:08:15Z

as well as const OK which new users may find more friendly than Ok(()).

Why? This just makes the code more jarring compared to the rest of the Rust ecosystem, and more cognitive load to switch between returning nothing vs. returning some value.

It'd make sense if it's something useful like https://docs.rs/anyhow/latest/anyhow/fn.Ok.html

NthTensor · 2024-12-02T06:11:45Z

This just makes the code more jarring compared to the rest of the Rust ecosystem

In this I am trying to defer to my understanding of Cart's preferences. He uses a const in the linked issue, and I believe has expressed that Ok(()) is sort of confusing and cumbersome. No strong preference here from me really.

alice-i-cecile

@nth this is good to go once it's merge-conflict free. I do prefer the "everything is fallible" approach by WrongShoe, but that's easily left to a follow-up refactor. Let's get the ball rolling here.

# Objective

- First step for #16718
- #16589 introduced an api that can only ignore errors, which is risky

## Solution

- Panic instead of just ignoring the errors

## Testing

- Changed the `fallible_systems` example to return an error

```
Encountered an error in system `fallible_systems::setup`: TooManyVertices { subdivisions: 300, number_of_resulting_points: 906012 }
Encountered a panic in system `fallible_systems::setup`!
Encountered a panic in system `bevy_app::main_schedule::Main::run_main`!
```

PixelDust22 · 2024-12-27T20:54:12Z

Might be a bit late on this discussion, but this particular implementation feels a bit intrusive to me. Can we instead implement IntoSystemConfigs on systems that return Result so that those systems can be added to the schedule like a normal system?

# Objective

- #16589 added an enum to switch between fallible and infallible system. This branching should be unnecessary if we wrap infallible systems in a function to return `Ok(())`.

## Solution

- Create a wrapper system for `System<(), ()>`s that returns `Ok` on the call to `run` and `run_unsafe`. The wrapper should compile out, but I haven't checked.
- I removed the `impl IntoSystemConfigs for BoxedSystem<(), ()>` as I couldn't figure out a way to keep the impl without double boxing.

## Testing

- ran `many_foxes` example to check if it still runs.

## Migration Guide

- `IntoSystemConfigs` has been removed for `BoxedSystem<(), ()>`. Either use `InfallibleSystemWrapper` before boxing or make your system return `bevy::ecs::prelude::Result`.
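
The wrapper idea can be sketched in plain Rust as follows. `FallibleRun`, `ToyInfallibleWrapper`, and `ToyFallible` are hypothetical names; the actual `InfallibleSystemWrapper` operates on Bevy's `System` trait, which this toy model does not reproduce.

```rust
use std::error::Error;

type BoxedError = Box<dyn Error + Send + Sync + 'static>;

// Hypothetical common interface: every scheduled system returns a result.
trait FallibleRun {
    fn run(&mut self) -> Result<(), BoxedError>;
}

// Adapts an infallible function by reporting Ok(()) after each call.
struct ToyInfallibleWrapper<F: FnMut()>(F);

impl<F: FnMut()> FallibleRun for ToyInfallibleWrapper<F> {
    fn run(&mut self) -> Result<(), BoxedError> {
        (self.0)();
        Ok(())
    }
}

// Fallible functions pass their result through unchanged.
struct ToyFallible<F: FnMut() -> Result<(), BoxedError>>(F);

impl<F: FnMut() -> Result<(), BoxedError>> FallibleRun for ToyFallible<F> {
    fn run(&mut self) -> Result<(), BoxedError> {
        (self.0)()
    }
}

fn tick() {}

fn fail() -> Result<(), BoxedError> {
    Err("failed".into())
}

// Both kinds of system live behind one trait object type, with no enum branch.
fn demo_wrapped_results() -> Vec<bool> {
    let mut systems: Vec<Box<dyn FallibleRun>> = vec![
        Box::new(ToyInfallibleWrapper(tick)),
        Box::new(ToyFallible(fail)),
    ];
    systems.iter_mut().map(|s| s.run().is_ok()).collect()
}
```

Because the wrapper is generic over the wrapped function, the extra `Ok(())` is a candidate for inlining; whether it fully compiles out in practice is left unverified here, as in the description above.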

This commit builds on top of the work done in bevyengine#16589 and bevyengine#17051, by adding support for fallible observer systems. As with the previous work, the actual results of the observer system are suppressed by default, but the intention is to provide a way to handle errors in a global way. Until then, you can use a `PipeSystem` to manually handle results. Signed-off-by: Jean Mertz <[email protected]>

You can now configure error handlers for fallible systems. These can be configured on several levels:

- Globally via `App::set_systems_error_handler`
- Per-schedule via `Schedule::set_error_handler`
- Per-system via a piped system (this is existing functionality)

The default handler of panicking on error keeps the same behavior as before this commit. The "fallible_systems" example demonstrates the new functionality. This builds on top of #17731, #16589, #17051.

---------

Signed-off-by: Jean Mertz <[email protected]>
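
The layering of handlers can be modeled with a small plain-Rust sketch in which a per-schedule handler, when present, overrides the global one. Everything here (`ToySchedule`, `Handler`, the handler functions, and the choice to return a log line) is hypothetical and only mirrors the idea; it does not call `App::set_systems_error_handler` or `Schedule::set_error_handler`, whose signatures are not shown in this thread.

```rust
use std::error::Error;

type BoxedError = Box<dyn Error + Send + Sync + 'static>;
// Hypothetical handler shape: receives the error and the failing system's name,
// and returns the line it would log (returned so the sketch is easy to inspect).
type Handler = fn(BoxedError, &str) -> String;

// Mirrors the default described above: an error becomes a panic.
fn panic_handler(error: BoxedError, system: &str) -> String {
    panic!("Encountered an error in system `{system}`: {error}");
}

fn warn_handler(error: BoxedError, system: &str) -> String {
    format!("WARN {system}: {error}")
}

struct ToySchedule {
    systems: Vec<(&'static str, fn() -> Result<(), BoxedError>)>,
    // Per-schedule override; falls back to the global handler when None.
    error_handler: Option<Handler>,
}

impl ToySchedule {
    fn run(&self, global_handler: Handler) -> Vec<String> {
        let handler = self.error_handler.unwrap_or(global_handler);
        let mut log = Vec::new();
        for (name, system) in &self.systems {
            if let Err(error) = system() {
                log.push(handler(error, name));
            }
        }
        log
    }
}

fn save() -> Result<(), BoxedError> {
    Err("disk full".into())
}

// The schedule-level warn handler takes precedence over the global panic handler.
fn demo_schedule_override() -> Vec<String> {
    let schedule = ToySchedule {
        systems: vec![("save", save)],
        error_handler: Some(warn_handler),
    };
    schedule.run(panic_handler)
}
```

Setting `error_handler` to `None` in this model would route the error to `panic_handler`, which corresponds to the default behavior the commit message describes.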

alice-i-cecile · 2025-03-25T20:06:56Z

Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to bevyengine/bevy-website#1967 if you'd like to help out.

Allow systems to return results

b80265b

NthTensor added the C-Feature, A-ECS, C-Usability, and S-Needs-Review labels Dec 1, 2024

Fix typo

1d5f7b9

NthTensor requested review from bushrat011899, tychedelia and alice-i-cecile December 1, 2024 22:21

Fix formatting

f7bbc21

NthTensor and others added 8 commits December 1, 2024 18:35

Cleanup pass

abd9e91

Merge branch 'main' into fallible_systems

63a01b0

Replace ok function with const

add1247

Suppress warning about never

00df090

Fix tests

abe7c3a

Fix formatting

311a30f

Move lint allow to module

fa753a7

Fix doclink

1136dd1

bushrat011899 approved these changes Dec 2, 2024

NthTensor and others added 3 commits December 1, 2024 22:25

Add note to suppressed warning

fde9764

Co-authored-by: Zachary Harrold <[email protected]>

Update crates/bevy_ecs/src/schedule/executor/mod.rs

2ad1b7b

Co-authored-by: Zachary Harrold <[email protected]>

Update crates/bevy_ecs/src/schedule/executor/mod.rs

ecce62b

Co-authored-by: Zachary Harrold <[email protected]>

NthTensor mentioned this pull request Dec 2, 2024

Our API suggests that panicking should be the default #14275

Open

tychedelia approved these changes Dec 2, 2024

NthTensor mentioned this pull request Dec 2, 2024

System piping for error/warn/info logging does not show error source #8638

Closed

alice-i-cecile approved these changes Dec 5, 2024

NthTensor and others added 3 commits December 5, 2024 16:29

Hide fallible/infallible marker types

95b64c0

Merge remote-tracking branch 'upstream/main' into fallible_systems

151b7ba

Merge branch 'main' into fallible_systems

5afedeb

alice-i-cecile added this pull request to the merge queue Dec 5, 2024

Merged via the queue into bevyengine:main with commit 0070514 Dec 5, 2024
31 of 32 checks passed

alice-i-cecile mentioned this pull request Dec 8, 2024

Fallible systems need to report failures #16718

Closed

ChristopherBiscardi mentioned this pull request Dec 23, 2024

Add TileStorage::drain and return removed entities in remove StarArawn/bevy_ecs_tilemap#586

Merged

mockersf mentioned this pull request Dec 26, 2024

panic on system error #16979

Merged

hymm mentioned this pull request Dec 30, 2024

Convert to fallible system in IntoSystemConfigs #17051

Merged

NthTensor mentioned this pull request Jan 7, 2025

Improved Command Errors #17215

Merged

JeanMertz mentioned this pull request Feb 7, 2025

feat(ecs): implement fallible observer systems #17731

Merged

JeanMertz mentioned this pull request Feb 9, 2025

feat(ecs): configurable error handling for fallible systems #17753

Merged

Miou-zora mentioned this pull request Mar 14, 2025

Error management of systems EngineSquared/EngineSquared#107

Open

alice-i-cecile mentioned this pull request Mar 25, 2025

Write release notes for PR #16589: Fallible systems bevyengine/bevy-website#1967

Closed
