Return value policy gray zone #888

@wjakob

cc @hawkinsp @vfdev-5 @oremanj

Nanobind exposes various return value policies that generally do an OK job. But there is a gray zone where their current behavior is IMO confusing. In particular, rv_policy::move, rv_policy::take_ownership, and rv_policy::reference_internal all first check whether an existing Python object is already associated with the pointer/reference. If so, they return that object directly and the requested return value policy is ignored.

This can lead to leaks. For example, suppose that code returns an object twice -- once using rv_policy::reference, and later using rv_policy::take_ownership. The second ownership transfer will never occur.
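The leak scenario can be illustrated with a small toy model. This is only a sketch in plain C++: `Registry`, `Wrapper`, `Policy`, and `to_python` are hypothetical stand-ins invented for illustration and are not nanobind API. The only behavior taken from the description above is that a lookup hit returns the existing wrapper and ignores the policy.

```cpp
#include <cassert>
#include <unordered_map>

// Toy stand-ins for illustration only; none of these names are nanobind API.
enum class Policy { reference, take_ownership };

struct Wrapper {
    void *ptr;
    bool owns;  // would this wrapper delete `ptr` when it is destroyed?
};

struct Registry {
    std::unordered_map<void *, Wrapper> instances;

    // Lookup-first behavior: if `ptr` is already registered, the existing
    // wrapper is returned and `policy` is ignored.
    Wrapper &to_python(void *ptr, Policy policy) {
        auto it = instances.find(ptr);
        if (it != instances.end())
            return it->second;
        auto inserted = instances.emplace(
            ptr, Wrapper{ptr, policy == Policy::take_ownership});
        return inserted.first->second;
    }
};

// Returns whether the wrapper owns the object after it was returned first
// with `reference` and then with `take_ownership`.
inline bool ownership_transferred_on_second_return() {
    int *value = new int(42);
    Registry registry;
    registry.to_python(value, Policy::reference);
    bool owns = registry.to_python(value, Policy::take_ownership).owns;
    delete value;  // manual cleanup: no wrapper ever claimed the object
    return owns;
}
```

In this model `ownership_transferred_on_second_return()` yields `false`: the second call hits the lookup, so nobody is responsible for deleting the object unless the caller does it by hand.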

It can also lead to weird/unexpected behavior. For example,

.def("move_field", [](Struct &s) -> Field& { return s.field; }, rv_policy::move)

will not actually move the field when somebody else has previously created a reference to s.field.
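The same effect can be sketched for the move case. Again this is a toy model under stated assumptions: `Registry`, `Wrapper`, `by_reference`, and `by_move` are hypothetical names, and `Field` carries a `moved_from` flag purely so the outcome can be observed.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

struct Field {
    int payload;
    bool moved_from = false;  // set when another Field is move-constructed from this one
    explicit Field(int p) : payload(p) {}
    Field(Field &&other) noexcept : payload(other.payload) { other.moved_from = true; }
};

struct Struct {
    Field field{7};
};

// Hypothetical wrapper: `storage` is only set when the wrapper holds a moved copy.
struct Wrapper {
    Field *ptr;
    std::unique_ptr<Field> storage;
};

struct Registry {
    std::unordered_map<Field *, Wrapper> instances;

    // Models rv_policy::reference: wrap the object in place.
    Wrapper &by_reference(Field &f) {
        return instances.emplace(&f, Wrapper{&f, nullptr}).first->second;
    }

    // Models the lookup-first variant of rv_policy::move.
    Wrapper &by_move(Field &f) {
        auto it = instances.find(&f);
        if (it != instances.end())
            return it->second;  // existing wrapper wins, no move happens
        std::unique_ptr<Field> storage(new Field(std::move(f)));
        Field *key = storage.get();
        return instances.emplace(key, Wrapper{key, std::move(storage)}).first->second;
    }
};

// Reports whether s.field was actually moved from.
inline bool field_was_moved(bool referenced_first) {
    Struct s;
    Registry registry;
    if (referenced_first)
        registry.by_reference(s.field);
    registry.by_move(s.field);
    return s.field.moved_from;
}
```

Here `field_was_moved(false)` is `true`, while `field_was_moved(true)` is `false`: whether the move happens depends on whether a reference was created earlier, which is the surprising part.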

Confusion aside, these lookups to check for the existence of an instance also have a non-negligible cost (hash table traversal) that would be nice to avoid.

But it is not so simple to change this behavior.

For example, one low-hanging fruit that I was looking at just now was to disable the search for existing instances when the user passes rv_policy::move (which is used for pass-by-value, so this would be a great optimization that hits many use cases).

However, this breaks overloaded assignment operators (specifically nb::self() += nb::self() used in test14_operators). This is related to a PR by @oremanj (#803). The operator is bound with rv_policy::move, and strictly enforcing that policy breaks preservation of the same Python object for an in-place update.

Taking a step back, the ideal behavior for in-place operators is to return the same Python object if the operator returns *this, and to copy or move otherwise (perhaps move in the case of pass-by-value, and copy in the remaining cases?). We always move for pass-by-value, so it seems that we need a return_existing_or_copy return value policy...
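A rough sketch of what a return_existing_or_copy policy would mean for an in-place operator, using a toy registry in plain C++. `Registry`, `track`, and the integer wrapper ids are assumptions made up for this illustration, with an id standing in for the identity of a Python object.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

struct Vec {
    int x;
    explicit Vec(int x_) : x(x_) {}
    Vec &operator+=(const Vec &o) { x += o.x; return *this; }
};

// Hypothetical registry: maps a C++ address to the id of its wrapper.
struct Registry {
    std::unordered_map<const Vec *, int> ids;
    std::vector<std::unique_ptr<Vec>> copies;  // storage for copied results
    int next_id = 0;

    int track(const Vec &v) {
        ids[&v] = next_id;
        return next_id++;
    }

    // Reuse the wrapper if `result` is already known, otherwise copy it
    // and register the copy under a fresh id.
    int return_existing_or_copy(const Vec &result) {
        auto it = ids.find(&result);
        if (it != ids.end())
            return it->second;
        copies.push_back(std::unique_ptr<Vec>(new Vec(result)));
        return track(*copies.back());
    }
};

// `a += b` returns *this, so the wrapper identity should be preserved.
inline bool inplace_add_preserves_identity() {
    Vec a(1), b(2);
    Registry registry;
    int id_a = registry.track(a);
    Vec &result = (a += b);
    return registry.return_existing_or_copy(result) == id_a;
}

// A result that is a different object gets a new wrapper.
inline bool distinct_result_gets_new_wrapper() {
    Vec a(1), b(2);
    Registry registry;
    int id_a = registry.track(a);
    Vec sum(a);
    sum += b;
    return registry.return_existing_or_copy(sum) != id_a;
}
```

Both helper functions return `true` in this model, which matches the desired behavior: identity is kept when the operator returns *this, and a fresh object is created otherwise.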

Next, I looked at disabling the instance search for rv_policy::take_ownership. This causes many unique pointer-related tests to fail in test_holders.py, and the test suite eventually segfaults in tests/test_thread.py.

The segfault is instructive: it happens in a function binding that essentially does [](Value *value) -> Value * { return value; } (pointer pass-through). The default return value policy, automatic, turns into take_ownership here, and that's the wrong one to use. So making any change here will probably break quite a lot of user code.
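Why the pass-through case is dangerous without the instance search can be sketched by counting owners in a toy model. The names `Registry`, `Wrapper`, and the two `take_ownership_*` helpers are hypothetical, and nothing is actually deleted here. The model only shows that two wrappers would each believe they own the same pointer, which presumably is what leads to the crash.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Value {
    int data = 0;
};

// Hypothetical wrapper record: `owns` means "would delete ptr on destruction".
struct Wrapper {
    Value *ptr;
    bool owns;
};

struct Registry {
    std::vector<Wrapper> wrappers;

    // Lookup-first take_ownership: an existing wrapper is reused.
    void take_ownership_with_lookup(Value *p) {
        for (const Wrapper &w : wrappers)
            if (w.ptr == p)
                return;
        wrappers.push_back({p, true});
    }

    // Hypothetical variant without the search: always creates an owning wrapper.
    void take_ownership_without_lookup(Value *p) {
        wrappers.push_back({p, true});
    }

    std::size_t owners_of(const Value *p) const {
        std::size_t n = 0;
        for (const Wrapper &w : wrappers)
            if (w.ptr == p && w.owns)
                ++n;
        return n;
    }
};

// The bound function from the test: returns its argument unchanged.
inline Value *pass_through(Value *value) { return value; }

inline std::size_t owners_after_pass_through(bool search_existing) {
    Value value;  // stack object; this model never deletes anything
    Registry registry;
    registry.wrappers.push_back({&value, true});  // wrapper Python already holds
    Value *returned = pass_through(&value);
    if (search_existing)
        registry.take_ownership_with_lookup(returned);
    else
        registry.take_ownership_without_lookup(returned);
    return registry.owners_of(&value);
}
```

With the search enabled there is one owner; without it there are two, and in a real binding each would eventually try to free the same object.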

I thought that it would be useful to have a discussion about whether others think this is a problem, and how to improve it.

One potential solution that I was thinking about is to separate the return value policy into two independent aspects:

  1. what happens when an existing instance is found, and
  2. what to do if not?

So we could, e.g., have

  • always_copy
  • always_move
  • always_take_ownership
  • return_existing_or_copy
  • return_existing_or_move
  • return_existing_or_take_ownership
  • ...

(with the current policy names mapping to the return_existing_* variants).

By numbering them suitably, the use of these (many) policies could be dealt with using bit arithmetic in the implementation. To take the negative position, this change makes something that users already find confusing even more overwhelming.
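One way the bit-arithmetic idea could look is sketched below. The encoding is an assumption made for illustration (low two bits select the fallback action, bit 2 enables the search for an existing instance). The enumerator values, `action`, `searches_existing`, and `fallback` are hypothetical and not part of nanobind.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding: bits 0-1 = fallback action, bit 2 = reuse existing instance.
enum class action : std::uint8_t { copy = 0, move = 1, take_ownership = 2 };

constexpr std::uint8_t action_mask = 0x3;
constexpr std::uint8_t reuse_existing_bit = 0x4;

enum class policy : std::uint8_t {
    always_copy                       = 0,
    always_move                       = 1,
    always_take_ownership             = 2,
    return_existing_or_copy           = reuse_existing_bit | 0,
    return_existing_or_move           = reuse_existing_bit | 1,
    return_existing_or_take_ownership = reuse_existing_bit | 2
};

// Aspect 1: should the implementation search for an existing instance?
constexpr bool searches_existing(policy p) {
    return (static_cast<std::uint8_t>(p) & reuse_existing_bit) != 0;
}

// Aspect 2: what to do when no instance is found (or none is searched for).
constexpr action fallback(policy p) {
    return static_cast<action>(static_cast<std::uint8_t>(p) & action_mask);
}

static_assert(!searches_existing(policy::always_move), "always_* skips the lookup");
static_assert(fallback(policy::return_existing_or_copy) == action::copy,
              "fallback is stored in the low bits");
```

With such a layout, the implementation would need one mask test to decide whether to pay for the hash table lookup and one more to pick the fallback, regardless of how many named policies exist.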

Your feedback would be greatly appreciated!
