[Mono.Android] fix "replaceable" objects in ManagedValueManager #10004

Closed
wants to merge 1 commit into from

jonathanpeppers
Member

The following test is failing on NativeAOT as well as any case we'd use ManagedValueManager:

[Test]
public void JnienvCreateInstance_RegistersMultipleInstances ()
{
    using (var adapter = new CreateInstance_OverrideAbsListView_Adapter (Application.Context)) {

        var intermediate  = CreateInstance_OverrideAbsListView_Adapter.Intermediate;
        var registered    = Java.Lang.Object.GetObject<CreateInstance_OverrideAbsListView_Adapter>(adapter.Handle, JniHandleOwnership.DoNotTransfer);

        Assert.AreNotSame (adapter, intermediate); // Passes
        Assert.AreSame (adapter, registered);      // Fails!
    }
}

With the assertion:

Expected: same as <com.xamarin.android.runtimetests.CreateInstance_OverrideAbsListView_Adapter{cbd0e5a V.ED.VC.. ......I. 0,0-0,0}>
But was:  <com.xamarin.android.runtimetests.CreateInstance_OverrideAbsListView_Adapter{cbd0e5a V.ED.VC.. ......I. 0,0-0,0}>

The second assertion fails because registered is the same instance as intermediate. In this example, this is a code path where intermediate should be "replaced" with adapter.

After lots of debugging, I found that the problem is these lines in the ManagedValueManager.AddPeer() method:

var o = PeekPeer (value.PeerReference);
if (o != null)
    return;

If we call PeekPeer() in the middle of AddPeer() and the existing instance is "replaceable", PeekPeer() finds that instance and AddPeer() returns without replacing it! I did not find equivalent code in AndroidValueManager.AddPeer(), which is what is used in Mono & production today.

With these lines removed, the test passes. I will look into whether we should also update these lines in dotnet/java-interop in a future PR.

jonathanpeppers added a commit to dotnet/java-interop that referenced this pull request Apr 3, 2025
Context: dotnet/android#10004

This breaks the "replaceable" logic otherwise.
@jonathanpeppers jonathanpeppers requested a review from Copilot April 3, 2025 16:50

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a bug in ManagedValueManager where "replaceable" objects were not properly replaced, leading to incorrect instance registrations in tests.

  • Removed an early return check that prevented the replacement of instances.
  • The changes align ManagedValueManager's behavior with AndroidValueManager's implementation.
Comments suppressed due to low confidence (1)

src/Mono.Android/Microsoft.Android.Runtime/ManagedValueManager.cs:70

  • Removing the PeekPeer check ensures that replaceable objects are correctly updated, but please add a code comment to clarify why this behavior is necessary to prevent future regressions.
var o = PeekPeer (value.PeerReference);

@jonathanpeppers
Member Author

jonpryor added a commit to dotnet/java-interop that referenced this pull request Apr 9, 2025
Context: dotnet/android#10004

It looks like dotnet/android#10004 is closely tied to dotnet/android#9862.

Might as well update tests to hit this behavior!
@jonpryor
Contributor

jonpryor commented Apr 9, 2025

What's "funny" is the interaction between #9862, dotnet/java-interop#1323, and this PR #10004.

The "problem" is that the current AndroidValueManager.AddPeer() semantics are "wrong" wrt #9862, in that we would like to preserve/retain the "replaceable" value if the new value is also "replaceable".

I think that ManagedValueManager, because of the PeekPeer() call within AddPeer(), has this semantic!

…but at the possible cost of the scenario #10004 is trying to address.

Additionally this doesn't "lower" things so that the issue is visible in dotnet/java-interop. I had thought that an existing test case there hit this scenario, but I was wrong; there is partial overlap, but not complete overlap. I've updated dotnet/java-interop#1323 so that JnienvCreateInstance_RegistersMultipleInstances() and CreateManagedInstanceFirst() more closely resemble each other…

@jonpryor
Contributor

After more thinking… I think the actual bug is in CreateInstance_OverrideAbsListView_Adapter(Context):

https://github.com/dotnet/android/blob/main/tests/Mono.Android-Tests/Mono.Android-Tests/Java.Lang/ObjectTest.cs#L151-L160

As per private chat, this doesn't resemble current binding constructors at all. I just didn't think further on it at the time…

Aside: dotnet/java-interop@3043d89

Additional aside: We no longer support NewInvokeRequired: dotnet/java-interop@dec35f5, which means bindings never use JNIEnv::NewObject().

What the constructor should be is:

/*    01 */ public CreateInstance_OverrideAbsListView_Adapter (Context context)
/*    02 */ 	: base (IntPtr.Zero, JniHandleOwnership.DoNotTransfer)
/*    03 */ {
/*    04 */ 	const string __id = "(Landroid/content/Context;)V";
/*    05 */ 
/*    06 */ 	if (((global::Java.Lang.Object) this).Handle != IntPtr.Zero)
/*    07 */ 		return;
/*    08 */ 
/*    09 */ 	try {
/*    10 */ 		JniArgumentValue* __args = stackalloc JniArgumentValue [1];
/*    11 */ 		__args [0] = new JniArgumentValue ((context == null) ? IntPtr.Zero : ((global::Java.Lang.Object) context).Handle);
/*    12 */ 		var __r = _members.InstanceMethods.StartCreateInstance (__id, ((object) this).GetType (), __args);
/*    13 */ 		SetHandle (__r.Handle, JniHandleOwnership.TransferLocalRef);
/*    14 */ 		_members.InstanceMethods.FinishCreateInstance (__id, this, __args);
/*    15 */ 	} finally {
/*    16 */ 		global::System.GC.KeepAlive (context);
/*    17 */ 	}
/*    18 */ 
/*    19 */ 	AdapterValue = new ArrayAdapter (context, 0);
/*    20 */ }

Line 2: null handle, we're creating a new object below.

Lines 6-7: If Handle is already set, we don't need to create a new instance; bail.

Lines 10-11: JNI argument marshaling

Line 12: _members.InstanceMethods.StartCreateInstance() is a JNIEnv::AllocObject(), which allocates the Java peer but does not run the constructor. It returns the JNI handle to the allocated Java peer.

Line 13: SetHandle() eventually calls AddPeer(), which creates the mapping between the Java peer in __r and this.

Line 14: _members.InstanceMethods.FinishCreateInstance() invokes the Java side constructor.

This "allocate, create mapping, invoke constructor" pattern allows us to avoid the whole conundrum of "the Java constructor invokes an overridden virtual method! What instance do we invoke it on!" by ensuring that we have the mapping before the Java constructor is even invoked!
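To illustrate why the ordering matters, here is a minimal plain-Java model of the two orderings. It is only a sketch, not the binding or runtime code: `ManagedPeer`, `Registry`, `ConstructionOrder`, and `runJavaConstructor` are hypothetical names, and the handle is just an integer standing in for a JNI reference.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a managed (C#) peer; not a real Java.Interop type.
final class ManagedPeer {
    final String label;
    boolean sawVirtualCall;
    ManagedPeer(String label) { this.label = label; }
}

// Hypothetical handle -> peer map, standing in for the value manager.
final class Registry {
    final Map<Integer, ManagedPeer> peers = new HashMap<>();
    int activations; // how many intermediate peers had to be created

    // Models the marshal method: find a peer for the handle, or activate one.
    ManagedPeer getPeer(int handle) {
        return peers.computeIfAbsent(handle, h -> {
            activations++;
            return new ManagedPeer("intermediate");
        });
    }
}

final class ConstructionOrder {
    // Models the Java constructor invoking an overridden virtual method.
    static void runJavaConstructor(int handle, Registry registry) {
        registry.getPeer(handle).sawVirtualCall = true;
    }

    // AllocObject-style: allocate, create mapping, then invoke the constructor.
    static Registry allocateThenConstruct() {
        Registry registry = new Registry();
        int handle = 1;                           // models the AllocObject result
        ManagedPeer self = new ManagedPeer("self");
        registry.peers.put(handle, self);         // models SetHandle -> AddPeer
        runJavaConstructor(handle, registry);     // models FinishCreateInstance
        return registry;
    }

    // NewObject-style: the constructor runs before any mapping exists.
    static Registry constructThenRegister() {
        Registry registry = new Registry();
        int handle = 1;
        ManagedPeer self = new ManagedPeer("self");
        runJavaConstructor(handle, registry);     // virtual call finds no peer
        registry.peers.put(handle, self);         // late mapping replaces the intermediate
        return registry;
    }

    public static void main(String[] args) {
        System.out.println(allocateThenConstruct().activations);
        System.out.println(constructThenRegister().activations);
    }
}
```

In this model the allocate-first ordering never needs an intermediate peer, and the virtual call lands on `self`. The construct-first ordering creates one intermediate peer, and `self` never sees the call made during construction.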

Which brings us to the "faulty" existing implementation:

		public CreateInstance_OverrideAbsListView_Adapter (Context context)
			: base (
					JNIEnv.CreateInstance (
						JcwType,
						"(Landroid/content/Context;)V",
						new JValue (context)),
					JniHandleOwnership.TransferLocalRef)

The use of JNIEnv.CreateInstance() is equivalent to using JNIEnv::NewObject(), which invokes the Java constructor and doesn't return until the Java constructor has finished executing. This is what causes CreateInstance_OverrideAbsListView_Adapter.Intermediate to be set to a non-null value: the AbsListView constructor invokes the overridden getAdapter() method / Adapter property, so (in this scenario) an "intermediate" instance must be created via the Activation Constructor to handle that call. That intermediate can't be the registered value, because we're still inside the constructor that is constructing the value we need to register!

JNIEnv::NewObject() / JNIEnv.CreateInstance() has always been problematic, and has not been used in our bindings in over a decade! See also:


Thus, the question: what is the "fix"? What are we trying to fix?

Option 1: Update CreateInstance_OverrideAbsListView_Adapter(Context) to resemble a proper binding.

  • Pro: Consistency!
  • "Pro": JnienvCreateInstance_RegistersMultipleInstances() doesn't break, because intermediate will be null, which isn't the same as adapter!
  • Con: We won't have any tests which attempt to test this behavior around JNIEnv::NewObject().
  • @jonpryor isn't sure we want to continue testing this behavior. Only pre-API-10 targets require JNIEnv::NewObject(), so unless you're avoiding bindings and manually calling JNIEnv::NewObject()/JniEnvironment.Object.NewObject(), you should never encounter this behavior.

Option 2: Assert.Ignore() this test for CoreCLR & NativeAOT.

  • Pro: we don't need to worry about it!
  • Con? Uncertain.

Option 3: Add a JNIEnv::NewObject() variant test to dotnet/java-interop, and make that work.

  • Pro: Assert that this scenario works.
  • Con: @jonpryor isn't sure this is useful?

@jonpryor
Contributor

…and because I'm a glutton for punishment, I've started working on (3) within dotnet/java-interop#1323

@jonpryor
Contributor

I am asserting that dotnet/java-interop#1323 and #9921 will fix JnienvCreateInstance_RegistersMultipleInstances(). Keeping this open until those PRs are merged and the assertion can be verified.

jonpryor added a commit to dotnet/java-interop that referenced this pull request Apr 15, 2025
Context: 3043d89
Context: dec35f5
Context: dotnet/android#9862
Context: dotnet/android#9862 (comment)
Context: dotnet/android#10004
Context: https://github.com/xamarin/monodroid/commit/326509e56d4e582c53bbe5dfe6d5c741a27f1af5
Context: https://github.com/xamarin/monodroid/commit/940136ebf1318a7c57a855e2728ce2703c0240af

Ever get the feeling that everything is inextricably related?

JNI has two patterns for creating an instance of a Java type:

 1. [`JNIEnv::NewObject(jclass clazz, jmethodID methodID, const jvalue* args)`][0]

 2. [`JNIEnv::AllocObject(jclass clazz)`][1] + 
    [`JNIEnv::CallNonvirtualVoidMethod(jobject obj, jclass clazz, jmethodID methodID, const jvalue* args)`][2]

In both patterns:

  * `clazz` is the `java.lang.Class` of the type to create.
  * `methodID` is the constructor to execute
  * `args` are the constructor arguments.

In .NET terms:

  * `JNIEnv::NewObject()` is equivalent to using
    `System.Reflection.ConstructorInfo.Invoke(object?[]?)`, while

  * `JNIEnv::AllocObject()` + `JNIEnv::CallNonvirtualVoidMethod()` is
    equivalent to using
    `System.Runtime.CompilerServices.RuntimeHelpers.GetUninitializedObject(Type)` +
    `System.Reflection.MethodBase.Invoke(object?, object?[]?)`.

Why prefer one over the other?

When hand-writing your JNI code, `JNIEnv::NewObject()` is easier.
This is less of a concern when a code generator is used.

The *real* reason to *avoid* `JNIEnv::NewObject()` whenever possible
is the [Java Activation][3] scenario, summarized as the "are you sure
you want to do this?" [^1] scenario of invoking a virtual method from the
constructor:

	class Base {
	  public Base() {
	    VirtualMethod();
	  }
	  public virtual void VirtualMethod() {}
	}

	class Derived : Base {
	  public override void VirtualMethod() {}
	}


Java and C# are identical here: when a constructor invokes a virtual
method, the most derived method implementation is used, which will
occur *before* the constructor of the derived type has *started*
execution.  (With lots of quibbling about field initializers…)
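
The ordering can be observed directly in plain Java. The following is a small sketch with illustrative class names (not taken from either repository); it records the order in which the constructor bodies and the override run:

```java
import java.util.ArrayList;
import java.util.List;

class Base {
    // Shared log so the order of events can be inspected afterwards.
    static final List<String> log = new ArrayList<>();

    Base() {
        log.add("Base ctor start");
        virtualMethod();              // dispatches to the most derived override
        log.add("Base ctor end");
    }

    void virtualMethod() {
        log.add("Base.virtualMethod");
    }
}

class Derived extends Base {
    private final String name;

    Derived() {
        super();
        name = "derived";             // assigned only after Base() has returned
        log.add("Derived ctor body");
    }

    @Override
    void virtualMethod() {
        // Runs before the Derived constructor body, so `name` is still null here.
        log.add("Derived.virtualMethod name=" + name);
    }
}

final class VirtualFromCtorDemo {
    public static void main(String[] args) {
        Base.log.clear();
        new Derived();
        Base.log.forEach(System.out::println);
    }
}
```

The override is logged between "Base ctor start" and "Base ctor end", and it sees `name` as `null` because the `Derived` constructor body has not run yet.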

Thus, assume you have a Java `CallVirtualFromConstructorBase` type
whose constructor Does The Wrong Thing™ and invokes a virtual method,
and that method is overridden in C#:

	// Java
	public class CallVirtualFromConstructorBase {
	  public CallVirtualFromConstructorBase(int value) {
	    calledFromConstructor(value);
	  }
	  public void calledFromConstructor(int value) {
	  }
	}

	// C#
	public class CallVirtualFromConstructorDerived : CallVirtualFromConstructorBase {
	  public CallVirtualFromConstructorDerived(int value)
	    : base (value)
	  {
	  }

	  public override void CalledFromConstructor(int value)
	  {
	  }
	}

What happens with:

	var p = new CallVirtualFromConstructorDerived(42);

The answer depends on whether or not `JNIEnv::NewObject()` is used.

If `JNIEnv::NewObject()` is *not* used (the default!):

 1. `CallVirtualFromConstructorDerived(int)` constructor begins
    execution, immediately calls `base(value)`.

 2. `CallVirtualFromConstructorBase(int)` constructor runs, uses
    `JNIEnv::AllocObject()` to *create* (but not construct!) Java
    `CallVirtualFromConstructorDerived` instance.

 3. `JavaObject.Construct(ref JniObjectReference, JniObjectReferenceOptions)`
    invoked, creating a mapping between the C# instance created in
    (1) and the Java instance created in (2).

 4. `CallVirtualFromConstructorBase(int)` C# constructor calls
    `JniPeerMembers.InstanceMethods.FinishGenericCreateInstance()`,
    which eventually invokes `JNIEnv::CallNonvirtualVoidMethod()`
    with the Java `CallVirtualFromConstructorDerived(int)` ctor.

 5. Java `CallVirtualFromConstructorDerived(int)` constructor invokes
    Java `CallVirtualFromConstructorBase(int)` constructor, which
    invokes `CallVirtualFromConstructorDerived.calledFromConstructor()`.

 6. Marshal method (356485e) for
    `CallVirtualFromConstructorBase.CalledFromConstructor()` invoked,
    *immediately* calls `JniRuntime.JniValueManager.GetPeer()`
    (e288589) to obtain an instance upon which to invoke
    `.CalledFromConstructor()`, finds the instance mapping from (3),
    invokes 
    `CallVirtualFromConstructorDerived.CalledFromConstructor()`
    override.

 7. Marshal Method for `CalledFromConstructor()` returns, Java
    `CallVirtualFromConstructorBase(int)` constructor finishes,
    Java `CallVirtualFromConstructorDerived(int)` constructor
    finishes, `JNIEnv::CallNonvirtualVoidMethod()` finishes.

 8. `CallVirtualFromConstructorDerived` instance finishes construction.

If `JNIEnv::NewObject()` is used:

 1. `CallVirtualFromConstructorDerived(int)` constructor begins
    execution, immediately calls `base(value)`.

    Note that this is the first created `CallVirtualFromConstructorDerived`
    instance, but it hasn't been registered yet.

 2. `CallVirtualFromConstructorBase(int)` constructor runs, uses
    `JNIEnv::NewObject()` to construct Java
    `CallVirtualFromConstructorDerived` instance.

 3. `JNIEnv::NewObject()` invokes Java
    `CallVirtualFromConstructorDerived(int)` constructor, which invokes
    `CallVirtualFromConstructorBase(int)` constructor, which invokes
    `CallVirtualFromConstructorDerived.calledFromConstructor()`.

 4. Marshal method (356485e) for
    `CallVirtualFromConstructorBase.CalledFromConstructor()` invoked,
    *immediately* calls `JniRuntime.JniValueManager.GetPeer()`
    (e288589) to obtain an instance upon which to invoke
    `.CalledFromConstructor()`.

    Here is where things go "off the rails" compared to the
    `JNIEnv::AllocObject()` code path:

    There is no such instance -- we're still in the middle of
    constructing it! -- so we look for an "activation constructor".

 5. `CallVirtualFromConstructorDerived(ref JniObjectReference, JniObjectReferenceOptions)`
    activation constructor executed.

    This is the *second* `CallVirtualFromConstructorDerived` instance
    created, and registers a mapping from the Java instance that
    we started constructing in (3) to what we'll call the
    "activation intermediary".

    The activation intermediary instance is marked as "Replaceable".

 6. `CallVirtualFromConstructorDerived.CalledFromConstructor()` method
    override invoked on the activation intermediary.

 7. Marshal Method for `CalledFromConstructor()` returns, Java
    `CallVirtualFromConstructorBase(int)` constructor finishes,
    Java `CallVirtualFromConstructorDerived(int)` constructor
    finishes, `JNIEnv::NewObject()` returns instance.

 8. C# `CallVirtualFromConstructorBase(int)` constructor calls
    `JavaObject.Construct(ref JniObjectReference, JniObjectReferenceOptions)`,
    to create a mapping between (3) and (1).

    In .NET for Android, this causes the C# instance created in (1)
    to *replace* the C# instance created in (5); permitting this
    replacement is the purpose of the "Replaceable" state.

    In dotnet/java-interop, this replacement *didn't* happen, which
    meant that `ValueManager.PeekPeer(p.PeerReference)` would return
    the activation intermediary, *not* `p`, which confuses everyone.

 9. `CallVirtualFromConstructorDerived` instance finishes construction.

For a while, dotnet/java-interop did not fully support this scenario
around `JNIEnv::NewObject()`.  Additionally, support for using
`JNIEnv::NewObject()` as part of
`JniPeerMembers.JniInstanceMethods.StartCreateInstance()` was
*removed* in dec35f5.

Which brings us to dotnet/android#9862: where there is an observed
"race condition" around `Android.App.Application` subclass creation.
*Two* instances of `AndroidApp` were created, one from the "normal"
app startup:

	at crc647fae2f69c19dcd0d.AndroidApp.n_onCreate(Native Method)
	at crc647fae2f69c19dcd0d.AndroidApp.onCreate(AndroidApp.java:25)
	at android.app.Instrumentation.callApplicationOnCreate(Instrumentation.java:1316)

and another from an `androidx.work.WorkerFactory`:

	at mono.android.TypeManager.n_activate(Native Method)
	at mono.android.TypeManager.Activate(TypeManager.java:7)
	at crc647fae2f69c19dcd0d.SyncWorker.<init>(SyncWorker.java:23)
	at java.lang.reflect.Constructor.newInstance0(Native Method)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:343)
	at androidx.work.WorkerFactory.createWorkerWithDefaultFallback(WorkerFactory.java:95)

However, what was odd about this "race condition" was that the
*second* instance created would reliably win!

Further investigation suggested that this was less of a
"race condition" and more a bug in `AndroidValueManager`, wherein when
"Replaceable" instances were created, an existing instance would
*always* be replaced, even if the new instance was also Replaceable!
This feels bananas; yes, Replaceable should be replaceable, but the
expectation was that it would be replaced by *non*-Replaceable
instances, not just any instance that came along later.
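
The intended rule can be modeled in a few lines of plain Java. This is only a sketch under assumptions: `PeerModel` and `ValueManagerModel` are hypothetical names, the integer handle stands in for a JNI reference, and keeping the existing peer when it is *not* replaceable is an assumption, since the text above does not cover that case.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical peer model; `replaceable` mirrors the idea of the Replaceable state.
final class PeerModel {
    final String label;
    final boolean replaceable;

    PeerModel(String label, boolean replaceable) {
        this.label = label;
        this.replaceable = replaceable;
    }
}

// Hypothetical value manager holding one peer per handle.
final class ValueManagerModel {
    private final Map<Integer, PeerModel> peers = new HashMap<>();

    void addPeer(int handle, PeerModel value) {
        PeerModel existing = peers.get(handle);
        if (existing == null) {
            peers.put(handle, value);     // first registration always wins
            return;
        }
        // Desired rule: a replaceable peer yields only to a non-replaceable one.
        if (existing.replaceable && !value.replaceable) {
            peers.put(handle, value);
        }
        // Otherwise keep the existing peer (assumption for the non-replaceable case).
    }

    PeerModel peekPeer(int handle) {
        return peers.get(handle);
    }

    public static void main(String[] args) {
        ValueManagerModel manager = new ValueManagerModel();
        manager.addPeer(1, new PeerModel("first-replaceable", true));
        manager.addPeer(1, new PeerModel("second-replaceable", true));
        System.out.println(manager.peekPeer(1).label);
        manager.addPeer(1, new PeerModel("real-instance", false));
        System.out.println(manager.peekPeer(1).label);
    }
}
```

Under this rule the second Replaceable peer does not displace the first, while a later non-Replaceable peer does.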

Update `JniRuntimeJniValueManagerContract` to add a new
`CreatePeer_ReplaceableDoesNotReplace()` test to codify the desired
semantic that Replaceable instances do not replace Replaceable
instances.

Surprisingly, this new test did not fail on java-interop, as
`ManagedValueManager.AddPeer()` bails early when `PeekPeer()` finds
a value, while `AndroidValueManager.AddPeer()` does not bail early.

An obvious fix for `CreatePeer_ReplaceableDoesNotReplace()` within
dotnet/android would be to adopt the "`AddPeer()` calls `PeekPeer()`"
logic from java-interop.  The problem is that doing so breaks
[`ObjectTest.JnienvCreateInstance_RegistersMultipleInstances()`][5],
as seen in dotnet/android#10004!

`JnienvCreateInstance_RegistersMultipleInstances()` in turn fails
when `PeekPeer()` is used because it follows the `JNIEnv::NewObject()`
[construction codepath][6]!

	public CreateInstance_OverrideAbsListView_Adapter (Context context)
	  : base (
	      JNIEnv.CreateInstance (
	        JcwType,
	        "(Landroid/content/Context;)V",
	        new JValue (context)),
	      JniHandleOwnership.TransferLocalRef)
	{
	  AdapterValue = new ArrayAdapter (context, 0);
	}

as `JNIEnv.CreateInstance()` uses `JNIEnv.NewObject()`.

We thus have a conundrum: how do we fix *both*
`CreatePeer_ReplaceableDoesNotReplace()` *and*
`JnienvCreateInstance_RegistersMultipleInstances()`?

The answer is to add proper support for the `JNIEnv::NewObject()`
construction scenario to dotnet/java-interop, which in turn requires
"lowering" the setting of `.Replaceable`.  Previously, we would set
`.Replaceable` *after* the activation constructor was invoked:

	// dotnet/android TypeManager.CreateInstance(), paraphrasing
	var result = CreateProxy (type, handle, transfer);
	result.SetJniManagedPeerState (JniManagedPeerStates.Replaceable | JniManagedPeerStates.Activatable);
	return result;

This is *too late*, as during execution of the activation constructor,
the instance thinks it *isn't* replaceable, and thus creation of a new
instance via the activation constructor will replace an already
existing replaceable instance; it's not until *after* the constructor
finished executing that we'd set `.Replaceable`.

To fix this, update `JniRuntime.JniValueManager.TryCreatePeerInstance()`
to first create an *uninitialized* instance, set `.Replaceable`, and
*then* invoke the activation constructor.  This allows
`JniRuntime.JniValueManager.AddPeer()` to check to see if the new
value is also replaceable, and ignore the replacement if appropriate.
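
The effect of that ordering change can be sketched in plain Java. This is a model under assumptions, not the java-interop implementation: `ActivatedPeer`, `PeerTable`, and `ReplaceableOrdering` are hypothetical names, and since Java cannot run a constructor on an already-allocated object, the activation constructor is modeled as an ordinary method that registers the peer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical peer whose Replaceable flag can be set before or after "construction".
final class ActivatedPeer {
    final String label;
    boolean replaceable;

    ActivatedPeer(String label) { this.label = label; }

    // Models the activation constructor body, which registers the peer.
    void runActivationConstructor(int handle, PeerTable table) {
        table.addPeer(handle, this);
    }
}

// Hypothetical table applying "Replaceable yields only to non-Replaceable".
final class PeerTable {
    private final Map<Integer, ActivatedPeer> peers = new HashMap<>();

    void addPeer(int handle, ActivatedPeer value) {
        ActivatedPeer existing = peers.get(handle);
        if (existing == null || (existing.replaceable && !value.replaceable)) {
            peers.put(handle, value);
        }
    }

    ActivatedPeer peekPeer(int handle) { return peers.get(handle); }
}

final class ReplaceableOrdering {
    // Activates one peer, setting the flag either before or after registration.
    static void activate(PeerTable table, int handle, String label, boolean flagFirst) {
        ActivatedPeer peer = new ActivatedPeer(label);
        if (flagFirst) {
            peer.replaceable = true;                    // new order: flag, then construct
            peer.runActivationConstructor(handle, table);
        } else {
            peer.runActivationConstructor(handle, table);
            peer.replaceable = true;                    // old order: flag set too late
        }
    }

    // Activates two peers for the same handle and reports which one stays registered.
    static String winner(boolean flagFirst) {
        PeerTable table = new PeerTable();
        activate(table, 1, "first", flagFirst);
        activate(table, 1, "second", flagFirst);
        return table.peekPeer(1).label;
    }

    public static void main(String[] args) {
        System.out.println("flag after ctor:  " + winner(false));
        System.out.println("flag before ctor: " + winner(true));
    }
}
```

With the flag set after registration, the second activation looks non-Replaceable while it registers and so displaces the first, which matches the "second instance would reliably win" observation. Setting the flag first lets `addPeer` see that both peers are Replaceable and keep the first.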

This in turn requires replacing:

	partial class /* JniRuntime. */ JniValueManager {
	  protected virtual IJavaPeerable? TryCreatePeer (
	      ref JniObjectReference reference,
	      JniObjectReferenceOptions options,
	      Type type);
	}

with:

	partial class /* JniRuntime. */ JniValueManager {
	  protected virtual bool TryConstructPeer (
	      IJavaPeerable self,
	      ref JniObjectReference reference,
	      JniObjectReferenceOptions options,
	      Type type);
	}

This is fine because we haven't shipped `TryCreatePeer()` in a stable
release yet.

[^1]:   See also [Framework Design Guidelines > Constructor Design][4]:

    > ❌ AVOID calling virtual members on an object inside its constructor.
    > Calling a virtual member will cause the most derived override to be
    > called, even if the constructor of the most derived type has not
    > been fully run yet.

[0]: https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#NewObject
[1]: https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#AllocObject
[2]: https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#CallNonvirtual_type_Method_routines
[3]: https://learn.microsoft.com/en-us/previous-versions/xamarin/android/internals/architecture#java-activation
[4]: https://learn.microsoft.com/dotnet/standard/design-guidelines/constructor
[5]: https://github.com/dotnet/android/blob/9ad492a42b384519a8b1f1987adae82335536d9c/tests/Mono.Android-Tests/Mono.Android-Tests/Java.Lang/ObjectTest.cs#L68-L79
[6]: https://github.com/dotnet/android/blob/9ad492a42b384519a8b1f1987adae82335536d9c/tests/Mono.Android-Tests/Mono.Android-Tests/Java.Lang/ObjectTest.cs#L151-L160
@jonathanpeppers
Member Author

Closing in favor of dotnet/java-interop#1323

@github-actions github-actions bot locked and limited conversation to collaborators May 16, 2025
2 participants