
Duplicate UUIDs are generated after forking #27

Open
oschwald opened this issue Jul 31, 2018 · 1 comment

oschwald commented Jul 31, 2018

As demonstrated in this blog post, it appears duplicate UUIDs can be generated after forking:

$ perl -MData::UUID -E'$u = Data::UUID->new(); my $parent = $$; $parent == $$ && fork for 1..shift; say $u->create_str' 1000 | sort | uniq -c | grep -v '^\s*1\s'
      2 02B8AAEE-9513-11E8-ABC7-22474D6EA9B9
      2 02B8FA58-9513-11E8-93A0-22474D6EA9B9
      2 02BDCE98-9513-11E8-BF6D-22474D6EA9B9
      2 02BDDCB2-9513-11E8-BFBF-22474D6EA9B9
      2 02BF0C54-9513-11E8-98FC-22474D6EA9B9
      2 02C08688-9513-11E8-84BD-22474D6EA9B9
      2 02C0E2EA-9513-11E8-8793-22474D6EA9B9
      2 02C29D24-9513-11E8-A059-22474D6EA9B9
      2 02C5DE26-9513-11E8-A9C4-22474D6EA9B9

I take no credit for discovering this. I am just opening the issue as I didn't see one already.
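
For illustration, the mechanism can be sketched in a few lines of Python. This is a toy model, not Data::UUID's actual algorithm: `CounterIdGenerator` and `ids_from_parent_and_child` are hypothetical names, the counter stands in for the module's internal state, and the node value is just copied from the output above. The point is only that a forked child inherits an exact copy of the generator state, so parent and child can emit the same value.

```python
import os


class CounterIdGenerator:
    """Toy stand-in for a stateful ID generator (assumption: not the real algorithm)."""

    def __init__(self, node="22474D6EA9B9"):
        self.node = node
        self.counter = 0  # internal state, copied verbatim into a forked child

    def create_str(self):
        self.counter += 1
        return f"{self.counter:08X}-{self.node}"


def ids_from_parent_and_child(gen):
    """Fork once and return (parent_id, child_id) produced from the same generator."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: generate one ID from the inherited state and send it to the parent.
        try:
            os.close(read_fd)
            os.write(write_fd, gen.create_str().encode())
            os.close(write_fd)
        finally:
            os._exit(0)
    # Parent: read the child's ID, then generate its own.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_id = reader.read()
    os.waitpid(pid, 0)
    return gen.create_str(), child_id
```

Calling `ids_from_parent_and_child(CounterIdGenerator())` on a POSIX system returns two identical strings, because neither process knows the other exists.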


eserte commented Sep 8, 2020

Just some thoughts on this:

  • The same problem seems to have been discovered already for the case where state is loaded from a file: the constructor contains code that modifies the internal state using the current pid (see https://github.com/eserte/Data-UUID/blob/49bada021afbff62f2aef6b6981a1ff222f77913/UUID.xs#L370 ). Without this modification, independent processes that use the same /tmp/.UUID_STATE could generate the same UUIDs.
  • Unfortunately, Perl has no concept of fork callbacks, which could otherwise be used to fix this (e.g. by changing the internal Data::UUID state in the forked child).
  • The problem can probably be fixed by always using the current pid when calculating the internal state. This means an additional getpid() call on every UUID creation (on my systems, an overhead of 5µs to 25µs per call).
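
A rough sketch of the last idea, written in Python rather than the module's XS code and under the same toy assumptions as any simplified model: `PidAwareIdGenerator` is a hypothetical name, the counter stands in for the real internal state, and the `getpid` parameter exists only so the behaviour can be checked without actually forking. The pid is looked up on every call and mixed into the result, so two processes holding identical inherited state still produce different values.

```python
import os


class PidAwareIdGenerator:
    """Toy generator that mixes the current pid into every ID (assumption: not the real algorithm)."""

    def __init__(self, node="22474D6EA9B9", getpid=os.getpid):
        self.node = node
        self.counter = 0
        self._getpid = getpid  # injectable so a "forked child" can be simulated

    def create_str(self):
        pid = self._getpid()  # the extra call paid on every ID creation
        self.counter += 1
        return f"{self.counter:08X}-{pid:08X}-{self.node}"
```

For example, two generators with the same counter state but `getpid=lambda: 100` and `getpid=lambda: 101` return different strings from their first `create_str()` call. Checking the pid per call, instead of once in the constructor, is what covers the fork case, since the constructor only runs in the parent.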
