
Commit da3ef93

rxrpc: Fix incoming call setup race
jira LE-1907
Rebuild_History Non-Buildable kernel-rt-5.14.0-284.30.1.rt14.315.el9_2
commit-author David Howells <[email protected]>
commit 42f229c
Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-rt-5.14.0-284.30.1.rt14.315.el9_2/42f229c3.failed

An incoming call can race with rxrpc socket destruction, leading to a leaked call. This may result in an oops when the call timer eventually expires:

    BUG: kernel NULL pointer dereference, address: 0000000000000874
    RIP: 0010:_raw_spin_lock_irqsave+0x2a/0x50
    Call Trace:
     <IRQ>
     try_to_wake_up+0x59/0x550
     ? __local_bh_enable_ip+0x37/0x80
     ? rxrpc_poke_call+0x52/0x110 [rxrpc]
     ? rxrpc_poke_call+0x110/0x110 [rxrpc]
     ? rxrpc_poke_call+0x110/0x110 [rxrpc]
     call_timer_fn+0x24/0x120

with a warning in the kernel log looking something like:

    rxrpc: Call 00000000ba5e571a still in use (1,SvAwtACK,1061d,0)!

incurred during rmmod of rxrpc. The 1061d is the call flags:

    RECVMSG_READ_ALL, RX_HEARD, BEGAN_RX_TIMER, RX_LAST, EXPOSED, IS_SERVICE, RELEASED

but no DISCONNECTED flag (0x800), so it's an incoming (service) call and it's still connected.

The race appears to be that:

 (1) rxrpc_new_incoming_call() consults the service struct, checks sk_state and allocates a call - then pauses, possibly for an interrupt.

 (2) rxrpc_release_sock() sets RXRPC_CLOSE, nulls the service pointer, discards the prealloc and releases all calls attached to the socket.

 (3) rxrpc_new_incoming_call() resumes, launching the new call, including its timer and attaching it to the socket.

Fix this by read-locking local->services_lock to access the AF_RXRPC socket providing the service rather than RCU in rxrpc_new_incoming_call(). There's no real need to use RCU here as local->services_lock is only write-locked by the socket side in two places: when binding and when shutting down.

Fixes: 5e6ef4f ("rxrpc: Make the I/O thread take over the call and local processor work")
Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: [email protected]
(cherry picked from commit 42f229c)
Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/rxrpc/ar-internal.h
#	net/rxrpc/call_accept.c
#	net/rxrpc/security.c
1 parent 9c7c0bb commit da3ef93
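To make the three-step window above concrete, here is a minimal userspace sketch of the pre-fix interleaving. This is an illustration, not rxrpc code: all names are hypothetical, an atomic pointer load stands in for the old RCU-protected lookup, and a sleep models the pause between the state check and the attach.

/* Hypothetical userspace model of the race; not kernel code. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

struct sock { _Atomic int closed; };

static struct sock server;
static _Atomic(struct sock *) service = &server;  /* models local->service */

static void *incoming_call(void *arg)
{
	struct sock *rx = atomic_load(&service);  /* step (1): lookup + check */

	(void)arg;
	if (rx && !atomic_load(&rx->closed)) {
		usleep(100000);                   /* the pause in step (1) */
		/* step (3): attach to a socket that died in the meantime */
		printf("attached to socket with closed=%d -> leaked call\n",
		       atomic_load(&rx->closed));
	}
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, incoming_call, NULL);
	usleep(50000);                    /* let the lookup and check pass */
	atomic_store(&server.closed, 1);  /* step (2): rxrpc_release_sock() */
	atomic_store(&service, NULL);
	pthread_join(t, NULL);
	return 0;
}

Built with cc -pthread, the reader reliably (given the sleeps) reports attaching to a socket with closed=1: keeping the memory alive on the read side is not enough, because nothing stops the attach from happening after release - which is exactly the leaked-call state whose timer later oopses.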

File tree

1 file changed: +282 −0 lines changed
ciq/ciq_backports/kernel-rt-5.14.0-284.30.1.rt14.315.el9_2/42f229c3.failed
Lines changed: 282 additions & 0 deletions
@@ -0,0 +1,282 @@
rxrpc: Fix incoming call setup race

jira LE-1907
Rebuild_History Non-Buildable kernel-rt-5.14.0-284.30.1.rt14.315.el9_2
commit-author David Howells <[email protected]>
commit 42f229c350f57a8e825f7591e17cbc5c87e50235
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-rt-5.14.0-284.30.1.rt14.315.el9_2/42f229c3.failed

An incoming call can race with rxrpc socket destruction, leading to a
leaked call. This may result in an oops when the call timer eventually
expires:

BUG: kernel NULL pointer dereference, address: 0000000000000874
RIP: 0010:_raw_spin_lock_irqsave+0x2a/0x50
Call Trace:
<IRQ>
try_to_wake_up+0x59/0x550
? __local_bh_enable_ip+0x37/0x80
? rxrpc_poke_call+0x52/0x110 [rxrpc]
? rxrpc_poke_call+0x110/0x110 [rxrpc]
? rxrpc_poke_call+0x110/0x110 [rxrpc]
call_timer_fn+0x24/0x120

with a warning in the kernel log looking something like:

rxrpc: Call 00000000ba5e571a still in use (1,SvAwtACK,1061d,0)!

incurred during rmmod of rxrpc. The 1061d is the call flags:

RECVMSG_READ_ALL, RX_HEARD, BEGAN_RX_TIMER, RX_LAST, EXPOSED,
IS_SERVICE, RELEASED

but no DISCONNECTED flag (0x800), so it's an incoming (service) call and
it's still connected.

The race appears to be that:

(1) rxrpc_new_incoming_call() consults the service struct, checks sk_state
and allocates a call - then pauses, possibly for an interrupt.

(2) rxrpc_release_sock() sets RXRPC_CLOSE, nulls the service pointer,
discards the prealloc and releases all calls attached to the socket.

(3) rxrpc_new_incoming_call() resumes, launching the new call, including
its timer and attaching it to the socket.

Fix this by read-locking local->services_lock to access the AF_RXRPC socket
providing the service rather than RCU in rxrpc_new_incoming_call().
There's no real need to use RCU here as local->services_lock is only
write-locked by the socket side in two places: when binding and when
shutting down.

Fixes: 5e6ef4f1017c ("rxrpc: Make the I/O thread take over the call and local processor work")
Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>

(cherry picked from commit 42f229c350f57a8e825f7591e17cbc5c87e50235)
Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
# net/rxrpc/ar-internal.h
# net/rxrpc/call_accept.c
# net/rxrpc/security.c
diff --cc net/rxrpc/ar-internal.h
index 46ce41afb431,433060cade03..000000000000
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@@ -276,18 -277,26 +276,24 @@@ struct rxrpc_local
struct rcu_head rcu;
atomic_t active_users; /* Number of users of the local endpoint */
refcount_t ref; /* Number of references to the structure */
- struct net *net; /* The network namespace */
- struct rxrpc_net *rxnet; /* Our bits in the network namespace */
+ struct rxrpc_net *rxnet; /* The network ns in which this resides */
struct hlist_node link;
struct socket *socket; /* my UDP socket */
++<<<<<<< HEAD
+ struct work_struct processor;
+ struct list_head ack_tx_queue; /* List of ACKs that need sending */
+ spinlock_t ack_tx_lock; /* ACK list lock */
+ struct rxrpc_sock __rcu *service; /* Service(s) listening on this endpoint */
++=======
+ struct task_struct *io_thread;
+ struct completion io_thread_ready; /* Indication that the I/O thread started */
+ struct rxrpc_sock *service; /* Service(s) listening on this endpoint */
++>>>>>>> 42f229c350f5 (rxrpc: Fix incoming call setup race)
struct rw_semaphore defrag_sem; /* control re-enablement of IP DF bit */
- struct sk_buff_head rx_queue; /* Received packets */
- struct list_head conn_attend_q; /* Conns requiring immediate attention */
- struct list_head call_attend_q; /* Calls requiring immediate attention */
-
+ struct sk_buff_head reject_queue; /* packets awaiting rejection */
+ struct sk_buff_head event_queue; /* endpoint event packets awaiting processing */
struct rb_root client_bundles; /* Client connection bundles by socket params */
spinlock_t client_bundles_lock; /* Lock for client_bundles */
- bool kill_all_client_conns;
- struct list_head idle_client_conns;
- struct timer_list client_conn_reap_timer;
- unsigned long client_conn_flags;
-#define RXRPC_CLIENT_CONN_REAP_TIMER 0 /* The client conn reap timer expired */
-
spinlock_t lock; /* access lock */
rwlock_t services_lock; /* lock for services list */
int debug_id; /* debug ID for printks */
diff --cc net/rxrpc/call_accept.c
index afe1f587aaf0,3e8689fdc437..000000000000
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@@ -354,6 -338,33 +354,35 @@@ struct rxrpc_call *rxrpc_new_incoming_c

_enter("");

++<<<<<<< HEAD
++=======
+ /* Don't set up a call for anything other than a DATA packet. */
+ if (sp->hdr.type != RXRPC_PACKET_TYPE_DATA)
+ return rxrpc_protocol_error(skb, rxrpc_eproto_no_service_call);
+
+ read_lock(&local->services_lock);
+
+ /* Weed out packets to services we're not offering. Packets that would
+ * begin a call are explicitly rejected and the rest are just
+ * discarded.
+ */
+ rx = local->service;
+ if (!rx || (sp->hdr.serviceId != rx->srx.srx_service &&
+ sp->hdr.serviceId != rx->second_service)
+ ) {
+ if (sp->hdr.type == RXRPC_PACKET_TYPE_DATA &&
+ sp->hdr.seq == 1)
+ goto unsupported_service;
+ goto discard;
+ }
+
+ if (!conn) {
+ sec = rxrpc_get_incoming_security(rx, skb);
+ if (!sec)
+ goto unsupported_security;
+ }
+
++>>>>>>> 42f229c350f5 (rxrpc: Fix incoming call setup race)
spin_lock(&rx->incoming_lock);
if (rx->sk.sk_state == RXRPC_SERVER_LISTEN_DISABLED ||
rx->sk.sk_state == RXRPC_CLOSE) {
@@@ -394,50 -391,43 +423,73 @@@
rx->notify_new_call(&rx->sk, call, call->user_call_ID);

spin_lock(&conn->state_lock);
- if (conn->state == RXRPC_CONN_SERVICE_UNSECURED) {
+ switch (conn->state) {
+ case RXRPC_CONN_SERVICE_UNSECURED:
conn->state = RXRPC_CONN_SERVICE_CHALLENGING;
set_bit(RXRPC_CONN_EV_CHALLENGE, &call->conn->events);
- rxrpc_queue_conn(call->conn, rxrpc_conn_queue_challenge);
+ rxrpc_queue_conn(call->conn);
+ break;
+
+ case RXRPC_CONN_SERVICE:
+ write_lock(&call->state_lock);
+ if (call->state < RXRPC_CALL_COMPLETE)
+ call->state = RXRPC_CALL_SERVER_RECV_REQUEST;
+ write_unlock(&call->state_lock);
+ break;
+
+ case RXRPC_CONN_REMOTELY_ABORTED:
+ rxrpc_set_call_completion(call, RXRPC_CALL_REMOTELY_ABORTED,
+ conn->abort_code, conn->error);
+ break;
+ case RXRPC_CONN_LOCALLY_ABORTED:
+ rxrpc_abort_call("CON", call, sp->hdr.seq,
+ conn->abort_code, conn->error);
+ break;
+ default:
+ BUG();
}
spin_unlock(&conn->state_lock);
-
spin_unlock(&rx->incoming_lock);
++<<<<<<< HEAD
++=======
+ read_unlock(&local->services_lock);
++>>>>>>> 42f229c350f5 (rxrpc: Fix incoming call setup race)

- if (hlist_unhashed(&call->error_link)) {
- spin_lock(&call->peer->lock);
- hlist_add_head(&call->error_link, &call->peer->error_targets);
- spin_unlock(&call->peer->lock);
- }
+ rxrpc_send_ping(call, skb);
+
+ /* We have to discard the prealloc queue's ref here and rely on a
+ * combination of the RCU read lock and refs held either by the socket
+ * (recvmsg queue, to-be-accepted queue or user ID tree) or the kernel
+ * service to prevent the call from being deallocated too early.
+ */
+ rxrpc_put_call(call, rxrpc_call_put_discard_prealloc);

_leave(" = %p{%d}", call, call->debug_id);
- rxrpc_input_call_event(call, skb);
- rxrpc_put_call(call, rxrpc_call_put_input);
- return true;
+ return call;

++<<<<<<< HEAD
+no_call:
+ spin_unlock(&rx->incoming_lock);
+ _leave(" = NULL [%u]", skb->mark);
+ return NULL;
++=======
+ unsupported_service:
+ read_unlock(&local->services_lock);
+ return rxrpc_direct_abort(skb, rxrpc_abort_service_not_offered,
+ RX_INVALID_OPERATION, -EOPNOTSUPP);
+ unsupported_security:
+ read_unlock(&local->services_lock);
+ return rxrpc_direct_abort(skb, rxrpc_abort_service_not_offered,
+ RX_INVALID_OPERATION, -EKEYREJECTED);
+ no_call:
+ spin_unlock(&rx->incoming_lock);
+ read_unlock(&local->services_lock);
+ _leave(" = f [%u]", skb->mark);
+ return false;
+ discard:
+ read_unlock(&local->services_lock);
+ return true;
++>>>>>>> 42f229c350f5 (rxrpc: Fix incoming call setup race)
}

/*
diff --cc net/rxrpc/security.c
index 50cb5f1ee0c0,cb8dd1d3b1d4..000000000000
--- a/net/rxrpc/security.c
+++ b/net/rxrpc/security.c
@@@ -161,9 -178,9 +161,13 @@@ struct key *rxrpc_look_up_server_securi
sprintf(kdesc, "%u:%u",
sp->hdr.serviceId, sp->hdr.securityIndex);

- rcu_read_lock();
+ read_lock(&conn->local->services_lock);

++<<<<<<< HEAD
+ rx = rcu_dereference(conn->params.local->service);
++=======
+ rx = conn->local->service;
++>>>>>>> 42f229c350f5 (rxrpc: Fix incoming call setup race)
if (!rx)
goto out;

diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 0f4d34f420f0..1df9cb1f9275 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -155,10 +155,10 @@ static int rxrpc_bind(struct socket *sock, struct sockaddr *saddr, int len)

if (service_id) {
write_lock(&local->services_lock);
- if (rcu_access_pointer(local->service))
+ if (local->service)
goto service_in_use;
rx->local = local;
- rcu_assign_pointer(local->service, rx);
+ local->service = rx;
write_unlock(&local->services_lock);

rx->sk.sk_state = RXRPC_SERVER_BOUND;
@@ -872,9 +872,9 @@ static int rxrpc_release_sock(struct sock *sk)

sk->sk_state = RXRPC_CLOSE;

- if (rx->local && rcu_access_pointer(rx->local->service) == rx) {
+ if (rx->local && rx->local->service == rx) {
write_lock(&rx->local->services_lock);
- rcu_assign_pointer(rx->local->service, NULL);
+ rx->local->service = NULL;
write_unlock(&rx->local->services_lock);
}

* Unmerged path net/rxrpc/ar-internal.h
* Unmerged path net/rxrpc/call_accept.c
* Unmerged path net/rxrpc/security.c
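Taken together, the hunks above implement one scheme: every reader of local->service takes services_lock for read, the two writers (bind and release) take it for write, and the RCU pointer primitives decay to plain loads and stores. Below is an illustrative userspace reduction of that scheme, under the assumption that pthread_rwlock_t can stand in for the kernel rwlock_t and with the structures cut down to the fields the race touches; it is a sketch, not the actual kernel code.

/* Hypothetical userspace reduction of the fixed locking scheme. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

enum sk_state { RXRPC_SERVER_LISTENING, RXRPC_CLOSE };

struct rxrpc_sock {
	enum sk_state sk_state;
};

struct rxrpc_local {
	pthread_rwlock_t services_lock;
	struct rxrpc_sock *service;	/* plain pointer, no longer __rcu */
};

/* Incoming-call path: the lookup, the sk_state check and the attach all
 * sit inside one read-locked section, so release can't run between them. */
static bool new_incoming_call(struct rxrpc_local *local)
{
	bool attached = false;

	pthread_rwlock_rdlock(&local->services_lock);
	if (local->service && local->service->sk_state != RXRPC_CLOSE) {
		/* allocate the call and attach it to the socket here */
		attached = true;
	}
	pthread_rwlock_unlock(&local->services_lock);
	return attached;
}

/* Release path: the write lock waits out any in-flight reader, so once
 * the service pointer is cleared no new call can still be attaching.
 * (The kernel sets sk_state before taking the lock; it is done under
 * the lock here to keep the sketch free of data races.) */
static void release_sock(struct rxrpc_local *local, struct rxrpc_sock *rx)
{
	pthread_rwlock_wrlock(&local->services_lock);
	rx->sk_state = RXRPC_CLOSE;
	if (local->service == rx)
		local->service = NULL;
	pthread_rwlock_unlock(&local->services_lock);
}

int main(void)
{
	static struct rxrpc_sock rx = { .sk_state = RXRPC_SERVER_LISTENING };
	static struct rxrpc_local local = {
		.services_lock = PTHREAD_RWLOCK_INITIALIZER,
		.service = &rx,
	};

	printf("before release: accepted=%d\n", new_incoming_call(&local));
	release_sock(&local, &rx);
	printf("after release:  accepted=%d\n", new_incoming_call(&local));
	return 0;
}

The trade-off noted in the commit message shows up here too: the write side is only taken on bind and on release, so readers pay almost nothing for the rwlock, and unlike the RCU version the state check and the attach are atomic with respect to release.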
