Add a strategy for new Numpy PRNGs #3131
Comments
Isn't the whole point of these new interfaces that users explicitly pass the generator object around? If so, we only need to register the global PRNG that the generators are seeded off, and everything will work from there.
Yep, that is correct!
My understanding is that the generator objects are not seeded off of a global generator, and that they can only be seeded independently; being able to use a global PRNG would defeat the purpose of NumPy's redesign. The reason the new system expects folks to pass generator objects around is that those generators can be used and seeded without concern that, in some other portion of the code, the generator is silently being re-seeded.
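A minimal illustration of the point above (pure NumPy, no Hypothesis involved): each `Generator` carries its own state, so seeding is explicit and local rather than global.

```python
import numpy as np

def simulate(rng: np.random.Generator) -> float:
    # Library code receives the generator explicitly instead of reaching
    # for the legacy np.random global state.
    return rng.uniform()

rng_a = np.random.default_rng(12345)  # explicitly seeded by the user
rng_b = np.random.default_rng(12345)  # an independent generator, same seed

assert simulate(rng_a) == simulate(rng_b)  # same seed -> same stream
```

Nothing any other code does to `np.random`'s global state can re-seed `rng_a` behind the caller's back.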
So what do we need to do then? I was thinking of monkeypatching. If the user passes an explicitly-seeded PRNG, it should be pretty obvious what's happening when or if we raise.
When making this post, my thoughts were that we would identify the appropriate substitutes for `seed`, `get_state`, and `set_state`. I am hoping to eventually find some time to loop back and hit some of the To-Dos that I laid out in my original post; it is just a matter of me scrounging up the time to do so.
Oh! We could also make a strategy in
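As a rough sketch of what such a strategy could look like (the strategy's name and eventual home are assumptions here, not anything Hypothesis ships): draw an integer seed and map it through `np.random.default_rng`, so each example receives an independently-seeded, reproducible `Generator`.

```python
import numpy as np
from hypothesis import strategies as st

# Hypothetical sketch: each drawn example is a seeded Generator that the
# test then passes into the code under test, matching NumPy's
# pass-the-generator best practice.
seeded_rngs = st.integers(min_value=0, max_value=2**64 - 1).map(np.random.default_rng)
```

A test would then take the generator as an argument via `@given(rng=seeded_rngs)` and hand it to the code under test; shrinking the seed gives reproducible failures for free.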
Based on a quick conversation, we plan to:
Are there any plans to address this issue?
Hypothesis is an all-volunteer project, and so far people have been volunteering on other issues instead. If you're interested in helping out, I'm very happy to support that through advice, code review, and so on 😊
I would love to contribute, but I don't know the internal workings of Hypothesis. I was looking at #3510; is that a good starting point?
Yep, that's a great place to start! I think this should be a pretty self-contained change: it'd be perfectly feasible to implement this strategy downstream, but we want to provide it in
This is going to be a somewhat sprawling issue. All of the topics here involve Hypothesis' approaches to making random code deterministic. I will happily close this and turn it into a collection of modular issues/PRs, but first I want to lay everything out and get @Zac-HD's input.
Weakrefs
(Addressed in #3135 )
We should only make weak references to the generators that we manage (and likewise for the other "register" functions that Hypothesis provides).
NumPy
NumPy has moved away from its old global random state (e.g. `np.random.seed`, `np.random.uniform`, etc.) in favor of a new RNG system that uses a combination of bit generators and generators. This API is very different from those of global-state RNG systems. Presently, it is not clear how a user should have Hypothesis make their `numpy.random` code deterministic.

To me, the bare minimum would involve identifying the appropriate substitutes for `seed`, `get_state`, and `set_state` in terms of the new bit-generator/generator system, and providing a shim to make it trivial for users to register this new source of RNG.

A much more ambitious goal is to still, magically, handle all of this for the user. The only approach that comes to mind is to have NumPy register the creation of new generators, and then tap into that registry to manage those generators. I would not be surprised if NumPy (understandably) does not want to do this.
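A rough sketch of the shim idea, assuming `register_random` keeps expecting the `random.Random`-style `seed`/`getstate`/`setstate` trio (the wrapper class name and design here are hypothetical, not part of Hypothesis):

```python
import numpy as np

class NumpyRandomWrapper:
    """Hypothetical shim exposing the seed/getstate/setstate interface
    that hypothesis.register_random expects, backed by a numpy Generator."""

    def __init__(self, rng: np.random.Generator) -> None:
        self.rng = rng

    def seed(self, value) -> None:
        # Rebuild the underlying bit generator's state from a fresh seed.
        self.rng.bit_generator.state = type(self.rng.bit_generator)(value).state

    def getstate(self):
        # BitGenerator.state is a plain-dict snapshot of the stream state.
        return self.rng.bit_generator.state

    def setstate(self, state) -> None:
        self.rng.bit_generator.state = state
```

A user would then write something like `register_random(NumpyRandomWrapper(my_rng))`, and Hypothesis could snapshot and restore the generator around each example just as it does for `random.Random` instances.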
Some near-term To-Dos:

- Identify the appropriate substitutes for `seed`, `get_state`, and `set_state` for users to leverage

Useful reference material
PyTorch
See if PyTorch is willing to add a plugin so that Hypothesis will manage their global generator like this (but with `register_random` instead of `register_type_strategy`).

Additionally, torch also supplies a `Generator`. I recall reading that PyTorch was planning to redesign things like DataLoaders to accept generators, which is similar to the new best practices for NumPy's RNG. Thus, any solution we cook up for the NumPy case should be designed to be future-compatible here as well.
Edit: I just realized that PyTorch actually uses Hypothesis for some of its tests. As far as I can tell, they do not use `register_random` in their test suite.