
optee 3.21.0: core: mobj_reg_shm_dec_map #6120

Closed
omasse-linaro opened this issue Jun 15, 2023 · 15 comments
@omasse-linaro
Contributor

Hi

I'm getting an assert in reg_shm_unmap_helper when performing a simple stress test: two threads invoke, in an infinite loop, a ping command to a TA with an input shared memory buffer as parameter.

E/TC:1 1 assertion 'r->mm' failed at core/arch/arm/mm/mobj_dyn_shm.c:117 <reg_s>
E/TC:1 1 Panic at core/kernel/assert.c:28 <_assert_break>
E/TC:1 1 TEE load address @ 0xe1b32000

It seems that reg_shm_unmap_helper is called by mobj_reg_shm_dec_map, when the mapcount reaches 0, on an already unmapped object.

Looking at refcount.h (core/include/kernel/refcount.h), it specifies that when refcount_dec(r) returns true, the object o may be uninitialized if refcount_val(r) is still 0 AND o is not NULL.

However, mobj_reg_shm_dec_map in core/arch/arm/mm/mobj_dyn_shm.c does not check that r->mm is non-NULL before calling reg_shm_unmap_helper.

With that check added, my stress test runs fine without the assert.
Could we have a race condition when two threads call mobj_reg_shm_dec_map at the same time?

After doing refcount_dec on mapcount, thread 1 takes the lock before thread 2, calls reg_shm_unmap_helper and releases the lock.
Thread 2 then takes the lock and calls reg_shm_unmap_helper as well, which fails with assert(r->mm).
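To make the discussion concrete, here is a minimal, hypothetical stand-in for refcount_dec() matching the semantics described above (the real OP-TEE implementation lives in core/include/kernel/refcount.h and may differ in detail). The key property is that at most one caller can observe the transition to zero:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for OP-TEE's refcount_dec(): atomically
 * decrements and returns true only for the caller whose decrement
 * takes the count to 0. */
static bool refcount_dec(atomic_uint *r)
{
	unsigned int old = atomic_load(r);
	unsigned int nval;

	do {
		assert(old);	/* underflow would mean unbalanced calls */
		nval = old - 1;
	} while (!atomic_compare_exchange_weak(r, &old, nval));

	return nval == 0;
}
```

With balanced inc/dec pairs, only one thread can get a true return for a given drop to zero, so two threads reaching reg_shm_unmap_helper() for the same mapping would imply the calls were unbalanced somewhere.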

@jenswi-linaro
Contributor

Each call to mobj_dec_map() must be matched by a previous call to mobj_inc_map(). When that rule is followed, the race you're describing can't occur.

I believe the real problem is that the mobj_inc_map() and mobj_dec_map() calls are unbalanced for some reason.

@omasse-linaro
Contributor Author

Indeed, I see that you have fixed a race issue in mobj_reg_shm_inc_map, but why not just take the reg_shm_map_lock spinlock before incrementing or decrementing the map refcount?

@jenswi-linaro
Contributor

Does that fix your problem?

The point with refcount_inc() and refcount_dec() is that they can be called without holding a lock. If we're going to hold a lock then we can access the counter directly instead.
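A sketch of why refcount_inc() can be safe without the lock (again a hypothetical stand-in, not the actual OP-TEE code): it only increments a non-zero count, so it can never revive an object that may be mid-teardown; a false return pushes the caller onto a slow path that must take the lock:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for OP-TEE's refcount_inc(): increments only
 * if the count is currently non-zero. A false return means the object
 * may be being torn down, so the caller must fall back to a slow path
 * that holds the lock. */
static bool refcount_inc(atomic_uint *r)
{
	unsigned int old = atomic_load(r);

	while (old) {
		if (atomic_compare_exchange_weak(r, &old, old + 1))
			return true;
		/* CAS failure reloads 'old'; the loop re-checks for zero */
	}
	return false;
}
```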

@omasse-linaro
Contributor Author

omasse-linaro commented Jun 21, 2023

Yes, I have just reproduced the issue with QEMU as well.
(screenshot attached: Screenshot from 2023-06-21 11-27-52)

Fixed with the following patch:

diff --git a/core/arch/arm/mm/mobj_dyn_shm.c b/core/arch/arm/mm/mobj_dyn_shm.c
index 3e2ac5022..158e25211 100644
--- a/core/arch/arm/mm/mobj_dyn_shm.c
+++ b/core/arch/arm/mm/mobj_dyn_shm.c
@@ -185,11 +185,11 @@ static TEE_Result mobj_reg_shm_inc_map(struct mobj *mobj)
        size_t sz = 0;
 
        while (true) {
-               if (refcount_inc(&r->mapcount))
-                       return TEE_SUCCESS;
-
                exceptions = cpu_spin_lock_xsave(&reg_shm_map_lock);
 
+               if (refcount_inc(&r->mapcount))
+                       goto out;
+
                if (!refcount_val(&r->mapcount))
                        break; /* continue to reinitialize */
                /*
@@ -233,14 +233,15 @@ static TEE_Result mobj_reg_shm_dec_map(struct mobj *mobj)
        struct mobj_reg_shm *r = to_mobj_reg_shm(mobj);
        uint32_t exceptions = 0;
 
-       if (!refcount_dec(&r->mapcount))
-               return TEE_SUCCESS;
-
        exceptions = cpu_spin_lock_xsave(&reg_shm_map_lock);
 
+       if (!refcount_dec(&r->mapcount))
+               goto out;
+
        if (!refcount_val(&r->mapcount))
                reg_shm_unmap_helper(r);
 
+out:
        cpu_spin_unlock_xrestore(&reg_shm_map_lock, exceptions);
 
        return TEE_SUCCESS;
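The control flow of the patched mobj_reg_shm_dec_map() can be illustrated with a self-contained, single-address-space sketch (the spinlock, counter, and unmap helper below are stand-ins for the OP-TEE ones, not the real implementations): both the decrement and the zero test happen with the lock held, so no other thread can slip in between them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for the pieces named in the patch (hypothetical). */
static atomic_flag reg_shm_map_lock = ATOMIC_FLAG_INIT;
static atomic_uint mapcount;
static int unmap_calls;		/* counts reg_shm_unmap_helper() runs */

static void reg_shm_unmap_helper(void)
{
	unmap_calls++;		/* the real helper tears down the mapping */
}

/* Same semantics as in the thread: true when the count reaches 0. */
static bool refcount_dec(atomic_uint *r)
{
	unsigned int old = atomic_load(r);
	unsigned int nval;

	do {
		nval = old - 1;
	} while (!atomic_compare_exchange_weak(r, &old, nval));

	return nval == 0;
}

/* Patched flow: decrement and zero test both run under the lock. */
static void mobj_reg_shm_dec_map(void)
{
	while (atomic_flag_test_and_set(&reg_shm_map_lock))
		;		/* spin: cpu_spin_lock_xsave() stand-in */

	if (refcount_dec(&mapcount) && !atomic_load(&mapcount))
		reg_shm_unmap_helper();

	atomic_flag_clear(&reg_shm_map_lock);
}
```

The tradeoff, as noted above, is that every dec_map now takes the spinlock, giving up the lock-free fast path.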

@omasse-linaro
Contributor Author

Draft submitted here: #6133

@jenswi-linaro
Contributor

What's regression case 1040?

@omasse-linaro
Contributor Author

It's a dedicated, newly added test that pings a TA with an input shared memory buffer.
Please find attached a patch for optee_test.
xtest_1040.zip

@jenswi-linaro
Contributor

> It's a dedicated, newly added test that pings a TA with an input shared memory buffer. Please find attached a patch for optee_test. xtest_1040.zip

Thanks

@jenswi-linaro
Contributor

jenswi-linaro commented Jun 26, 2023

What are the steps needed to reproduce the problem on QEMU?
I understand that I need OP-TEE/optee_test#663, but what else is needed? Which flags do you compile OP-TEE with? Which patches do you have in OP-TEE OS?

Edit: sorry, I see now that it has nothing to do with OP-TEE/optee_test#663 (which has a 1040 case, by the way). I'll apply your patch and test.

@omasse-linaro
Contributor Author

The test was done on the optee_os master branch at revision 504f159.
The test fails without any additional patches.

OP-TEE is built with CFG_TEE_CORE_LOG_LEVEL set to 1:
make run CFG_TEE_CORE_LOG_LEVEL=1

@omasse-linaro
Contributor Author

omasse-linaro commented Jun 26, 2023

The race condition can be reproduced by running two xtest clients as follows:
xtest -t regression 1040 &
xtest -t regression 1040 &

NB: the issue can be reproduced more quickly with more concurrent xtest clients.

@jenswi-linaro
Contributor

Can you reproduce the problem with #6137?

@omasse-linaro
Contributor Author

No, the issue is fixed with that patch.

@jenswi-linaro
Contributor

Thanks, let's use #6137 instead then.

@omasse-linaro
Contributor Author

Thanks, let's close this issue.
