Return value policy gray zone #888

cc @hawkinsp @vfdev-5 @oremanj 

Nanobind exposes various return value policies that generally do an OK job. But there is a gray zone where their current behavior is IMO confusing. In particular, `rv_policy::move`, `rv_policy::take_ownership`, and `rv_policy::reference_internal` all check whether an existing Python object is already associated with the pointer/reference. If so, they return that object directly and the requested return value policy is ignored.

This can lead to leaks. For example, suppose that code returns an object twice -- once using `rv_policy::reference`, and later using `rv_policy::take_ownership`. The second ownership transfer will never occur.
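To make the leak scenario concrete, here is a minimal toy model in plain C++. It is only a sketch: `Registry`, `Wrapper`, and `Policy` are hypothetical stand-ins and not nanobind's actual data structures. The point is just that a "return the existing instance first" lookup silently drops the second, ownership-transferring request.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-ins for illustration only (not nanobind's internals).
enum class Policy { reference, take_ownership };

struct Wrapper {
    void *ptr;
    bool owns; // would this wrapper delete `ptr` when it is destroyed?
};

struct Registry {
    std::unordered_map<void *, Wrapper> instances;

    // Mimics the behavior described above: if a wrapper already exists for
    // the pointer, it is returned as-is and the requested policy is ignored.
    Wrapper &wrap(void *p, Policy policy) {
        auto it = instances.find(p);
        if (it != instances.end())
            return it->second;
        auto result = instances.emplace(
            p, Wrapper{p, policy == Policy::take_ownership});
        return result.first->second;
    }
};

// Returns the object once with `reference`, then again with `take_ownership`,
// and reports whether the wrapper ended up owning it.
inline bool owns_after_reference_then_take_ownership() {
    Registry registry;
    int *value = new int(42);

    registry.wrap(value, Policy::reference);
    Wrapper &second = registry.wrap(value, Policy::take_ownership);
    assert(second.ptr == value);

    bool owns = second.owns;
    // No wrapper owns `value`, so in a real binding it would leak.
    // The demo frees it by hand.
    delete value;
    return owns;
}
```

In this model, `owns_after_reference_then_take_ownership()` returns `false`: the first wrapper wins and the ownership transfer never happens.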

It can also lead to weird/unexpected behavior. For example, 
```cpp
.def("move_field", [](Struct &s) -> Field& { return s.field; }, rv_policy::move)
```
will not actually move the field when somebody else has previously created a reference to `s.field`.

Confusion aside, these lookups to check for the existence of an instance also have a non-negligible cost (hash table traversal) that would be nice to avoid.

But it is not so simple to change this behavior.

For example, one low-hanging fruit that I was looking at just now was to disable the search for existing instances when the user passes `rv_policy::move` (which is used for pass-by-value, so this would be a great optimization that hits many use cases).

However, this breaks overloaded assignment operators (specifically `nb::self() += nb::self()` used in `test14_operators`). This is related to a PR by @oremanj (#803). The operator is overloaded with `rv_policy::move` and enforcing that now breaks preservation of the same object for an in-place update.

Taking a step back, the ideal behavior for in-place operators is to return the same Python object if the operator returns `*this`, and copy or move otherwise (perhaps move in the case of pass-by-value, and copy in the remaining cases?). We always move for pass-by-value, so it seems that we need a `return_existing_or_copy` return value policy...
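The identity problem for in-place operators can be sketched with a small toy model. Everything here is an assumption for illustration: `Registry` models the instance table with vector indices standing in for Python objects, and the two policy names are borrowed from the proposal further down rather than from nanobind's existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

struct Vec {
    int x;
    Vec &operator+=(const Vec &other) { x += other.x; return *this; }
};

// Hypothetical policy names, taken from the proposal in this issue.
enum class Policy { always_move, return_existing_or_move };

// Toy instance table: a "Python object" is an index into `objects`.
struct Registry {
    std::vector<std::unique_ptr<Vec>> objects;
    std::unordered_map<const Vec *, std::size_t> lookup;

    std::size_t create(Vec v) {
        objects.push_back(std::unique_ptr<Vec>(new Vec(std::move(v))));
        lookup[objects.back().get()] = objects.size() - 1;
        return objects.size() - 1;
    }

    // Converts a C++ return value into a "Python object" index.
    std::size_t cast(Vec &result, Policy policy) {
        if (policy == Policy::return_existing_or_move) {
            auto it = lookup.find(&result);
            if (it != lookup.end())
                return it->second; // same object: identity is preserved
        }
        // For this trivial type a move is just a copy.
        return create(std::move(result));
    }
};

// Emulates the Python statement `a += b` and reports whether `a` still
// refers to the same object afterwards.
inline bool iadd_preserves_identity(Policy policy) {
    Registry registry;
    std::size_t a = registry.create(Vec{1});
    std::size_t b = registry.create(Vec{2});

    Vec &result = (*registry.objects[a] += *registry.objects[b]);
    std::size_t new_a = registry.cast(result, policy);

    assert(registry.objects[new_a]->x == 3);
    return new_a == a;
}
```

With `return_existing_or_move` the result of `a += b` is the original object. With `always_move` a fresh object is created, which is the behavior that breaks the in-place operator test mentioned above.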

Next, I looked at disabling the instance search for `rv_policy::take_ownership`. This causes many unique pointer-related tests to fail in `test_holders.py`, and the test suite eventually segfaults in `tests/test_thread.py`.

The segfault is instructive: it happens in a function binding that essentially does `[](Value *value) -> Value * { return value; }` (pointer pass-through). The default value policy `automatic` turns into `take_ownership`, and that's the wrong one to use here. So making any change here will probably break quite a lot of user code.
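One plausible reading of this failure is that, without the instance search, a pointer pass-through creates a second owning wrapper for an object that already has one. The sketch below counts owners instead of deleting anything, so it is safe to run. `Registry` and `Wrapper` are hypothetical and only model the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Value { int payload; };

struct Wrapper {
    Value *ptr;
    bool owns;
};

// Hypothetical toy instance table (not nanobind's implementation).
struct Registry {
    std::vector<Wrapper> wrappers;
    std::unordered_map<Value *, std::size_t> lookup;

    // `search_existing` toggles the instance lookup discussed above.
    std::size_t take_ownership(Value *v, bool search_existing) {
        assert(v != nullptr);
        if (search_existing) {
            auto it = lookup.find(v);
            if (it != lookup.end())
                return it->second;
        }
        wrappers.push_back(Wrapper{v, true});
        lookup[v] = wrappers.size() - 1;
        return wrappers.size() - 1;
    }

    // Number of wrappers that believe they own `v`.
    int owners_of(const Value *v) const {
        int count = 0;
        for (const Wrapper &w : wrappers)
            if (w.ptr == v && w.owns)
                ++count;
        return count;
    }
};

inline int owners_after_pass_through(bool search_existing) {
    Registry registry;
    Value value{7}; // the model only counts owners and never deletes

    // Wrapper created when the object first reached Python.
    registry.take_ownership(&value, search_existing);
    // Result of a binding like [](Value *v) -> Value * { return v; }.
    registry.take_ownership(&value, search_existing);

    return registry.owners_of(&value);
}
```

With the lookup enabled there is a single owner. With it disabled there are two, and in a real binding both would eventually try to free the same object.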

I thought that it would be useful to have a discussion about whether others think this is a problem, and how to improve it.

One potential solution that I was thinking about is to separate the return value policy into two independent aspects:
1. what happens when an existing instance is found, and
2. what to do if not?

So we could, e.g., have 

- `always_copy`
- `always_move`
- `always_take_ownership`
- `return_existing_or_copy`
- `return_existing_or_move`
- `return_existing_or_take_ownership`
- ...

(with the current policy names mapping to the `return_existing_*` variants).

By numbering them suitably, the use of these (many) policies could be dealt with using bit arithmetic in the implementation. To take the negative position, this change makes something that users already find confusing even more overwhelming.
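As a rough sketch of what "numbering them suitably" could look like: the low bits select the fallback action and one extra bit requests the instance lookup. The enum name `rv_policy_ex`, the numeric values, and the helper functions are all hypothetical and not part of nanobind.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding; the enumerator names mirror the list above.
enum class rv_policy_ex : std::uint8_t {
    // Low two bits: what to do when no existing instance is returned.
    always_copy           = 0x0,
    always_move           = 0x1,
    always_take_ownership = 0x2,
    // Bit 2 set: search for an existing instance first.
    return_existing_or_copy           = 0x4 | 0x0,
    return_existing_or_move           = 0x4 | 0x1,
    return_existing_or_take_ownership = 0x4 | 0x2
};

constexpr std::uint8_t fallback_mask = 0x3;
constexpr std::uint8_t lookup_bit    = 0x4;

constexpr std::uint8_t bits(rv_policy_ex p) {
    return static_cast<std::uint8_t>(p);
}

// Aspect 1: should an existing instance be searched for and returned?
constexpr bool wants_lookup(rv_policy_ex p) {
    return (bits(p) & lookup_bit) != 0;
}

// Aspect 2: what to do when no existing instance is returned.
constexpr rv_policy_ex fallback(rv_policy_ex p) {
    return static_cast<rv_policy_ex>(bits(p) & fallback_mask);
}

// Adds the lookup step to any policy without changing its fallback.
inline rv_policy_ex with_lookup(rv_policy_ex p) {
    rv_policy_ex result = static_cast<rv_policy_ex>(bits(p) | lookup_bit);
    assert(fallback(result) == fallback(p));
    return result;
}

static_assert(wants_lookup(rv_policy_ex::return_existing_or_move),
              "lookup bit is set");
static_assert(!wants_lookup(rv_policy_ex::always_move),
              "lookup bit is clear");
static_assert(fallback(rv_policy_ex::return_existing_or_copy) ==
                  rv_policy_ex::always_copy,
              "fallback is encoded in the low bits");
```

The implementation would then test `wants_lookup(policy)` once and dispatch on `fallback(policy)`, so the number of user-visible names grows without adding branches per name.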

Your feedback would be greatly appreciated!
