Skip to content

Conversation

Thalley
Copy link
Contributor

@Thalley Thalley commented Jul 17, 2025

Adds asserts for both k_work and k_work_delayable to prevent users from initializing work items that are currently pending.

If a workqueue has e.g. 3 items pending, A, B and C, and work item B is provided to k_work_init, then the workqueue becomes corrupt and items B and C will be removed from the queue.

Adds asserts for both k_work and k_work_delayable to
prevent users from initializing work items that are
currently pending.

If a workqueue has e.g. 3 items pending, A, B and C, and work
item B is provided to k_work_init, then the workqueue becomes
corrupt and items B and C will be removed from the queue.

Signed-off-by: Emil Gydesen <[email protected]>
@Thalley Thalley force-pushed the k_work_pending_assert branch from e5b0e4d to 5039260 Compare July 17, 2025 20:18
@Thalley Thalley requested a review from Copilot July 17, 2025 20:19
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds runtime checks to prevent initializing work items that are already pending and updates tests to zero-initialize work structures.

  • Introduce __ASSERT_NO_MSG(!k_work_is_pending(work)) in k_work_init and its delayable variant
  • Include the assert header in kernel/work.c
  • Update tests to zero-initialize struct k_work and struct k_work[] to satisfy the new assertions

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
tests/kernel/workq/work_queue/src/start_stop.c Zero-initialize work and works to ensure they're not flagged pending
kernel/work.c Add __ASSERT_NO_MSG checks before initializing pending work items and include <zephyr/sys/__assert.h>
Comments suppressed due to low confidence (2)

kernel/work.c:159

  • [nitpick] The assertion uses __ASSERT_NO_MSG without a descriptive message. Consider adding a message to clarify the failure reason, e.g., "k_work_init: work is already pending".
	__ASSERT_NO_MSG(!k_work_is_pending(work));

kernel/work.c:991

  • [nitpick] The assertion uses __ASSERT_NO_MSG without a descriptive message. Consider adding a message to clarify the failure reason, e.g., "k_work_init_delayable: work is already pending".
	__ASSERT_NO_MSG(!k_work_delayable_is_pending(dwork));

Copy link

@Thalley
Copy link
Contributor Author

Thalley commented Jul 17, 2025

On a side note, it's pretty wild that invalidating a single work item added to the system workqueue can corrupt the entire system workqueue's queue.

For example, if the system workqueue has items A, B and C, and the owner of item B either intentionally or accidentally k_work_inits or just memsets item B, then item C (and any other work item on the queue) are implicitly removed as well. That was the case in #93278 where an item was accidentally k_work_inited while still on a queue, causing the whole system to crash and it was pretty hard to figure out exactly why things that were added to the system workqueue were never triggered after that.

It may also affect any other workqueue (or any slist or dlist for that matter), but the system workqueue is unique in that it's being used by a lot of subsystems and applications as well, so it's way more exposed.

@pdgendt
Copy link
Contributor

pdgendt commented Jul 18, 2025

You don't have any guarantees about the value of flags before the init call? The struct is initialized with:

	*work = (struct k_work)Z_WORK_INITIALIZER(handler);

@Thalley
Copy link
Contributor Author

Thalley commented Jul 18, 2025

You don't have any guarantees about the value of flags before the init call? The struct is initialized with:

	*work = (struct k_work)Z_WORK_INITIALIZER(handler);

Yeah, I noticed this being a problem in some of the unit tests, and setting them to ={0} seemed to do the trick (to avoid checking uninitialized values), but as you say, there's no guarantee for other calls, so maybe this is an impossible check to add.

In any case, it would also help prevent against re-initializing a k_work item using the k_work_init API, and not against any calls to memset.

@pdgendt Can you think of any other approaches to help prevent the issue I found?

@pdgendt
Copy link
Contributor

pdgendt commented Jul 18, 2025

@pdgendt Can you think of any other approaches to help prevent the issue I found?

Sorry, not really, I think this is the contract users need to adhere to.

On a side note, it's pretty wild that invalidating a single work item added to the system workqueue can corrupt the entire system workqueue's queue.

As the work items are in a single linked list, and the work struct has the pointer to the next, this does makes sense.

@Thalley
Copy link
Contributor Author

Thalley commented Jul 18, 2025

@pdgendt Can you think of any other approaches to help prevent the issue I found?

Sorry, not really, I think this is the contract users need to adhere to.

On a side note, it's pretty wild that invalidating a single work item added to the system workqueue can corrupt the entire system workqueue's queue.

As the work items are in a single linked list, and the work struct has the pointer to the next, this does makes sense.

Yeah, I suppose you are right. The only alternative I see is a completely new API using abstract objects and allocation/deallocation functions (which still doesn't prevent memsets, but at least makes it much clearer not to use those), but that's not something I am actually suggesting that we do.

I think I'll close the PR due to not being able to perform any checks on struct k_work items in the k_work_init as they may be uninitialized, and leave any such checks up to the owner/caller instead. I do wonder if there are other cases in Zephyr like #93278 that exists but haven't been caught yet :|

@Thalley Thalley closed this Jul 18, 2025
@Thalley Thalley deleted the k_work_pending_assert branch July 18, 2025 09:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants