Skip to content

Conversation

cretz
Copy link
Member

@cretz cretz commented Aug 13, 2025

What was changed

  • Disallow Logger, sleep, Timeout, ConditionVariable, Mutex, Queue, SizedQueue, and Monitor in workflows
  • Temporalio::Workflow module now offers workflow-safe forms Mutex, Queue, and SizedQueue
    • This is in addition to the already-present sleep and timeout utilities in Temporalio::Workflow
    • These are actually just delegates to the underlying stdlib classes for now that bypass illegal call detection

Checklist

  1. Closes [Feature Request] Research and maybe implement disallowing all Ruby stdlib sync constructs in workflows #310

@cretz cretz requested a review from a team as a code owner August 13, 2025 19:33
@cretz
Copy link
Member Author

cretz commented Aug 13, 2025

(ignore CI failures, will be fixed with next Core update which includes temporalio/sdk-core#975)

Copy link
Member

@Sushisource Sushisource left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing blocking, but I'm wondering if we want to re-export at all

@@ -18,7 +18,7 @@ class Cancellation
def initialize(*parents)
@canceled = false
@canceled_reason = nil
@canceled_mutex = Mutex.new
@canceled_mutex = Workflow::Unsafe.illegal_call_tracing_disabled { Mutex.new }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use the workflow mutex here? Is it also used from non-workflow places?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is a general purpose "cancellation token" that is also used for activities (which is why it's not in the Temporalio::Workflow module, but rather here in the top level one)

Comment on lines +679 to +695
Mutex = Unsafe._wrap_ruby_class_as_legal(::Mutex)

# Queue is a workflow-safe wrapper around {::Queue}.
#
# As of this writing, all methods on Queue are safe for workflow use and are implicitly made deterministic by the
# Fiber scheduler. The primary reason this is wrapped as safe is to be able to catch unintentional uses of Queue by
# non-workflow-safe code. However, users may prefer to use the more powerful {wait_condition} approach as a queue
# (e.g. wait until an array is non-empty before continuing).
Queue = Unsafe._wrap_ruby_class_as_legal(::Queue)

# SizedQueue is a workflow-safe wrapper around {::SizedQueue}.
#
# As of this writing, all methods on SizedQueue are safe for workflow use and are implicitly made deterministic by
# the Fiber scheduler. The primary reason this is wrapped as safe is to be able to catch unintentional uses of
# SizedQueue by non-workflow-safe code. However, users may prefer to use the more powerful {wait_condition} approach
# as a queue (e.g. wait until an array is non-empty before continuing).
SizedQueue = Unsafe._wrap_ruby_class_as_legal(::SizedQueue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried that just wrapping and re-exporting leaves us vulnerable to a change in the underlying implementation causing NDE. Maybe it would be better to just not export these at all and push people harder towards using wait condition, or otherwise explicitly opt-in to unsafe usage at the workflow code level

Copy link
Member Author

@cretz cretz Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think recently as part of temporalio/features#500 we had decided to make at least mutex/lock-or-sempahore a workflow primitive in every SDK even if they do just wrap wait condition. The main reason for disallowing stdlib constructs was less about fear of Ruby changing its stdlib, and more about accidental use of these now-durable constructs in places that aren't meant to be durable.

I think if we remove these, we 1) would need to add Mutex back anyways but hand-written, and 2) accept that there may be envy towards Java's WorkflowQueue or Python's asyncio.Queue.

Having said that, I can remove these and make a homemade Mutex if we want to. No super strong opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we had firsthand experience of that exact thing happening in Python, though? If we're not worried about it changing here, in Ruby (and we can substantiate why that's unlikely) I'm fine with it. Otherwise I'd suggest the handwritten Mutex.

Copy link
Member Author

@cretz cretz Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We luckily haven't had that happen in Python yet exactly. We did have temporalio/sdk-python#429 (and temporalio/sdk-python#518) because of Python async primitives having non-deterministic behavior. That would be akin to Ruby Fiber helpers changing how it called scheduler functions, which would be trouble for us no matter what with these utilities.

Queue is quite heavily used in Ruby and would like not to have to match the Ruby interface with our own (it supports timeouts which we make durable implicitly as well). I think there is not a high risk of the basic Fiber scheduler calls these utilities make changing. I think it's unlikely because these very popular primitives are quite stable and they only make like one obvious Fiber scheduler call each (e.g. block or unblock) and the Fiber::Scheduler interface is quite stable as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sounds good to me. We can proceed with the wrappers and re-evaluate later if necessary.

@cretz cretz merged commit 248dc46 into temporalio:main Aug 18, 2025
8 checks passed
@cretz cretz deleted the no-ruby-stdlib branch August 18, 2025 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Research and maybe implement disallowing all Ruby stdlib sync constructs in workflows
2 participants