Skip to content

Commit 983512f

Browse files
Sebastian Andrzej SiewiorPaolo Abeni
authored andcommitted
net: Drop the lock in skb_may_tx_timestamp()
skb_may_tx_timestamp() may acquire sock::sk_callback_lock. The lock must not be taken in IRQ context, only softirq is okay. A few drivers receive the timestamp via a dedicated interrupt and complete the TX timestamp from that handler. This will lead to a deadlock if the lock is already write-locked on the same CPU. Taking the lock can be avoided. The socket (pointed by the skb) will remain valid until the skb is released. The ->sk_socket and ->file member will be set to NULL once the user closes the socket which may happen before the timestamp arrives. If we happen to observe the pointer while the socket is closing but before the pointer is set to NULL then we may use it because both pointer (and the file's cred member) are RCU freed. Drop the lock. Use READ_ONCE() to obtain the individual pointer. Add a matching WRITE_ONCE() where the pointer are cleared. Link: https://lore.kernel.org/all/20260205145104.iWinkXHv@linutronix.de Fixes: b245be1 ("net-timestamp: no-payload only sysctl") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260220183858.N4ERjFW6@linutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
1 parent 82aec77 commit 983512f

3 files changed

Lines changed: 20 additions & 7 deletions

File tree

include/net/sock.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2098,7 +2098,7 @@ static inline int sk_rx_queue_get(const struct sock *sk)
20982098

20992099
static inline void sk_set_socket(struct sock *sk, struct socket *sock)
21002100
{
2101-
sk->sk_socket = sock;
2101+
WRITE_ONCE(sk->sk_socket, sock);
21022102
if (sock) {
21032103
WRITE_ONCE(sk->sk_uid, SOCK_INODE(sock)->i_uid);
21042104
WRITE_ONCE(sk->sk_ino, SOCK_INODE(sock)->i_ino);

net/core/skbuff.c

Lines changed: 18 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -5590,15 +5590,28 @@ static void __skb_complete_tx_timestamp(struct sk_buff *skb,
55905590

55915591
static bool skb_may_tx_timestamp(struct sock *sk, bool tsonly)
55925592
{
5593-
bool ret;
5593+
struct socket *sock;
5594+
struct file *file;
5595+
bool ret = false;
55945596

55955597
if (likely(tsonly || READ_ONCE(sock_net(sk)->core.sysctl_tstamp_allow_data)))
55965598
return true;
55975599

5598-
read_lock_bh(&sk->sk_callback_lock);
5599-
ret = sk->sk_socket && sk->sk_socket->file &&
5600-
file_ns_capable(sk->sk_socket->file, &init_user_ns, CAP_NET_RAW);
5601-
read_unlock_bh(&sk->sk_callback_lock);
5600+
/* The sk pointer remains valid as long as the skb is. The sk_socket and
5601+
* file pointer may become NULL if the socket is closed. Both structures
5602+
* (including file->cred) are RCU freed which means they can be accessed
5603+
* within a RCU read section.
5604+
*/
5605+
rcu_read_lock();
5606+
sock = READ_ONCE(sk->sk_socket);
5607+
if (!sock)
5608+
goto out;
5609+
file = READ_ONCE(sock->file);
5610+
if (!file)
5611+
goto out;
5612+
ret = file_ns_capable(file, &init_user_ns, CAP_NET_RAW);
5613+
out:
5614+
rcu_read_unlock();
56025615
return ret;
56035616
}
56045617

net/socket.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -674,7 +674,7 @@ static void __sock_release(struct socket *sock, struct inode *inode)
674674
iput(SOCK_INODE(sock));
675675
return;
676676
}
677-
sock->file = NULL;
677+
WRITE_ONCE(sock->file, NULL);
678678
}
679679

680680
/**

0 commit comments

Comments
 (0)