-
Notifications
You must be signed in to change notification settings - Fork 19
💥 BREAKING CHANGE - Disallow Ruby sync constructs from stdlib and provide safe alternatives #314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(ignore CI failures, will be fixed with next Core update which includes temporalio/sdk-core#975) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing blocking, but I'm wondering if we want to re-export at all
@@ -18,7 +18,7 @@ class Cancellation | |||
def initialize(*parents) | |||
@canceled = false | |||
@canceled_reason = nil | |||
@canceled_mutex = Mutex.new | |||
@canceled_mutex = Workflow::Unsafe.illegal_call_tracing_disabled { Mutex.new } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use the workflow mutex here? Is it also used from non-workflow places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is a general purpose "cancellation token" that is also used for activities (which is why it's not in the Temporalio::Workflow
module, but rather here in the top level one)
Mutex = Unsafe._wrap_ruby_class_as_legal(::Mutex) | ||
|
||
# Queue is a workflow-safe wrapper around {::Queue}. | ||
# | ||
# As of this writing, all methods on Queue are safe for workflow use and are implicitly made deterministic by the | ||
# Fiber scheduler. The primary reason this is wrapped as safe is to be able to catch unintentional uses of Queue by | ||
# non-workflow-safe code. However, users may prefer to use the more powerful {wait_condition} approach as a queue | ||
# (e.g. wait until an array is non-empty before continuing). | ||
Queue = Unsafe._wrap_ruby_class_as_legal(::Queue) | ||
|
||
# SizedQueue is a workflow-safe wrapper around {::SizedQueue}. | ||
# | ||
# As of this writing, all methods on SizedQueue are safe for workflow use and are implicitly made deterministic by | ||
# the Fiber scheduler. The primary reason this is wrapped as safe is to be able to catch unintentional uses of | ||
# SizedQueue by non-workflow-safe code. However, users may prefer to use the more powerful {wait_condition} approach | ||
# as a queue (e.g. wait until an array is non-empty before continuing). | ||
SizedQueue = Unsafe._wrap_ruby_class_as_legal(::SizedQueue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried that just wrapping and re-exporting leaves us vulnerable to a change in the underlying implementation causing NDE. Maybe it would be better to just not export these at all and push people harder towards using wait condition, or otherwise explicitly opt-in to unsafe usage at the workflow code level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think recently as part of temporalio/features#500 we had decided to make at least mutex/lock-or-sempahore a workflow primitive in every SDK even if they do just wrap wait condition. The main reason for disallowing stdlib constructs was less about fear of Ruby changing its stdlib, and more about accidental use of these now-durable constructs in places that aren't meant to be durable.
I think if we remove these, we 1) would need to add Mutex
back anyways but hand-written, and 2) accept that there may be envy towards Java's WorkflowQueue
or Python's asyncio.Queue
.
Having said that, I can remove these and make a homemade Mutex
if we want to. No super strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we had firsthand experience of that exact thing happening in Python, though? If we're not worried about it changing here, in Ruby (and we can substantiate why that's unlikely) I'm fine with it. Otherwise I'd suggest the handwritten Mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We luckily haven't had that happen in Python yet exactly. We did have temporalio/sdk-python#429 (and temporalio/sdk-python#518) because of Python async primitives having non-deterministic behavior. That would be akin to Ruby Fiber
helpers changing how it called scheduler functions, which would be trouble for us no matter what with these utilities.
Queue is quite heavily used in Ruby and would like not to have to match the Ruby interface with our own (it supports timeouts which we make durable implicitly as well). I think there is not a high risk of the basic Fiber scheduler calls these utilities make changing. I think it's unlikely because these very popular primitives are quite stable and they only make like one obvious Fiber
scheduler call each (e.g. block
or unblock
) and the Fiber::Scheduler
interface is quite stable as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds good to me. We can proceed with the wrappers and re-evaluate later if necessary.
What was changed
Logger
,sleep
,Timeout
,ConditionVariable
,Mutex
,Queue
,SizedQueue
, andMonitor
in workflowsTemporalio::Workflow::Unsafe.illegal_call_tracing_disabled
if they must use these constructs with the fiber scheduler, or they can useTemporalio::Workflow::Unsafe.durable_scheduler_disabled
(which impliesTemporalio::Workflow::Unsafe.illegal_call_tracing_disabled
) if they must use these constructs in a non-durable way (e.g. needing to make a logger call or send telemetry event)Temporalio::Workflow
module now offers workflow-safe formsMutex
,Queue
, andSizedQueue
sleep
andtimeout
utilities inTemporalio::Workflow
Checklist