Commit 4219196
ibmvnic: fix race between xmit and reset
There is a race between reset and the transmit paths that can lead to
ibmvnic_xmit() accessing an scrq after it has been freed in the reset
path. It can result in a crash like:
Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0080000016189f8
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c0080000016189f8] ibmvnic_xmit+0x60/0xb60 [ibmvnic]
LR [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
Call Trace:
[c008000001618f08] ibmvnic_xmit+0x570/0xb60 [ibmvnic] (unreliable)
[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
[c000000000c9cfcc] sch_direct_xmit+0xec/0x330
[c000000000bfe640] __dev_xmit_skb+0x3a0/0x9d0
[c000000000c00ad4] __dev_queue_xmit+0x394/0x730
[c008000002db813c] __bond_start_xmit+0x254/0x450 [bonding]
[c008000002db8378] bond_start_xmit+0x40/0xc0 [bonding]
[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
[c000000000c00ca4] __dev_queue_xmit+0x564/0x730
[c000000000cf97e0] neigh_hh_output+0xd0/0x180
[c000000000cfa69c] ip_finish_output2+0x31c/0x5c0
[c000000000cfd244] __ip_queue_xmit+0x194/0x4f0
[c000000000d2a3c4] __tcp_transmit_skb+0x434/0x9b0
[c000000000d2d1e0] __tcp_retransmit_skb+0x1d0/0x6a0
[c000000000d2d984] tcp_retransmit_skb+0x34/0x130
[c000000000d310e8] tcp_retransmit_timer+0x388/0x6d0
[c000000000d315ec] tcp_write_timer_handler+0x1bc/0x330
[c000000000d317bc] tcp_write_timer+0x5c/0x200
[c000000000243270] call_timer_fn+0x50/0x1c0
[c000000000243704] __run_timers.part.0+0x324/0x460
[c000000000243894] run_timer_softirq+0x54/0xa0
[c000000000ea713c] __do_softirq+0x15c/0x3e0
[c000000000166258] __irq_exit_rcu+0x158/0x190
[c000000000166420] irq_exit+0x20/0x40
[c00000000002853c] timer_interrupt+0x14c/0x2b0
[c000000000009a00] decrementer_common_virt+0x210/0x220
--- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
The immediate cause of the crash is the access of tx_scrq in the following
snippet during a reset, where the tx_scrq can be either NULL or an address
that will soon be invalid:
ibmvnic_xmit()
{
...
tx_scrq = adapter->tx_scrq[queue_num];
txq = netdev_get_tx_queue(netdev, queue_num);
ind_bufp = &tx_scrq->ind_buf;
if (test_bit(0, &adapter->resetting)) {
...
}
But beyond that, the call to ibmvnic_xmit() itself is not safe during a
reset and the reset path attempts to avoid this by stopping the queue in
ibmvnic_cleanup(). However just after the queue was stopped, an in-flight
ibmvnic_complete_tx() could have restarted the queue even as the reset is
progressing.
Since the queue was restarted we could get a call to ibmvnic_xmit() which
can then access the bad tx_scrq (or other fields).
We cannot however simply have ibmvnic_complete_tx() check the ->resetting
bit and skip starting the queue. This can race at the "back-end" of a good
reset which just restarted the queue but has not cleared the ->resetting
bit yet. If we skip restarting the queue due to ->resetting being true,
the queue would remain stopped indefinitely potentially leading to transmit
timeouts.
IOW ->resetting is too broad for this purpose. Instead use a new flag
that indicates whether or not the queues are active. Only the open/
reset paths control when the queues are active. ibmvnic_complete_tx()
and others wake up the queue only if the queue is marked active.
So we will have:
A. reset/open thread in ibmvnic_cleanup() and __ibmvnic_open()
->resetting = true
->tx_queues_active = false
disable tx queues
...
->tx_queues_active = true
start tx queues
B. Tx interrupt in ibmvnic_complete_tx():
if (->tx_queues_active)
netif_wake_subqueue();
To ensure that ->tx_queues_active and state of the queues are consistent,
we need a lock which:
- must also be taken in the interrupt path (ibmvnic_complete_tx())
- shared across the multiple queues in the adapter (so they don't
become serialized)
Use rcu_read_lock() and have the reset thread synchronize_rcu() after
updating the ->tx_queues_active state.
While here, consolidate a few boolean fields in ibmvnic_adapter for
better alignment.
Based on discussions with Brian King and Dany Madden.
Fixes: 7ed5b31 ("net/ibmvnic: prevent more than one thread from running in reset")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>1 parent 4fa331b commit 4219196
2 files changed
Lines changed: 55 additions & 15 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1429 | 1429 | | |
1430 | 1430 | | |
1431 | 1431 | | |
| 1432 | + | |
| 1433 | + | |
| 1434 | + | |
| 1435 | + | |
| 1436 | + | |
| 1437 | + | |
| 1438 | + | |
| 1439 | + | |
| 1440 | + | |
1432 | 1441 | | |
1433 | 1442 | | |
1434 | 1443 | | |
| |||
1603 | 1612 | | |
1604 | 1613 | | |
1605 | 1614 | | |
| 1615 | + | |
| 1616 | + | |
| 1617 | + | |
| 1618 | + | |
| 1619 | + | |
| 1620 | + | |
| 1621 | + | |
| 1622 | + | |
1606 | 1623 | | |
1607 | 1624 | | |
1608 | 1625 | | |
| |||
1842 | 1859 | | |
1843 | 1860 | | |
1844 | 1861 | | |
| 1862 | + | |
1845 | 1863 | | |
| 1864 | + | |
1846 | 1865 | | |
1847 | 1866 | | |
1848 | | - | |
1849 | | - | |
1850 | | - | |
1851 | | - | |
1852 | | - | |
| 1867 | + | |
| 1868 | + | |
| 1869 | + | |
| 1870 | + | |
| 1871 | + | |
| 1872 | + | |
| 1873 | + | |
| 1874 | + | |
| 1875 | + | |
| 1876 | + | |
1853 | 1877 | | |
1854 | 1878 | | |
1855 | 1879 | | |
| |||
1904 | 1928 | | |
1905 | 1929 | | |
1906 | 1930 | | |
1907 | | - | |
1908 | | - | |
1909 | | - | |
1910 | | - | |
1911 | | - | |
| 1931 | + | |
| 1932 | + | |
| 1933 | + | |
| 1934 | + | |
| 1935 | + | |
| 1936 | + | |
1912 | 1937 | | |
1913 | 1938 | | |
1914 | 1939 | | |
| |||
1917 | 1942 | | |
1918 | 1943 | | |
1919 | 1944 | | |
| 1945 | + | |
| 1946 | + | |
| 1947 | + | |
| 1948 | + | |
1920 | 1949 | | |
1921 | 1950 | | |
1922 | 1951 | | |
1923 | 1952 | | |
1924 | 1953 | | |
1925 | 1954 | | |
1926 | 1955 | | |
| 1956 | + | |
1927 | 1957 | | |
1928 | 1958 | | |
1929 | 1959 | | |
| |||
2078 | 2108 | | |
2079 | 2109 | | |
2080 | 2110 | | |
| 2111 | + | |
2081 | 2112 | | |
2082 | 2113 | | |
2083 | 2114 | | |
| |||
3732 | 3763 | | |
3733 | 3764 | | |
3734 | 3765 | | |
3735 | | - | |
3736 | | - | |
3737 | | - | |
| 3766 | + | |
| 3767 | + | |
| 3768 | + | |
| 3769 | + | |
| 3770 | + | |
| 3771 | + | |
| 3772 | + | |
| 3773 | + | |
| 3774 | + | |
3738 | 3775 | | |
3739 | 3776 | | |
3740 | 3777 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1006 | 1006 | | |
1007 | 1007 | | |
1008 | 1008 | | |
1009 | | - | |
1010 | | - | |
1011 | 1009 | | |
1012 | 1010 | | |
1013 | 1011 | | |
| 1012 | + | |
| 1013 | + | |
| 1014 | + | |
| 1015 | + | |
| 1016 | + | |
1014 | 1017 | | |
1015 | 1018 | | |
1016 | 1019 | | |
| |||
0 commit comments