Author: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Date: Thu Apr 30 12:39:01 2026 -0700
accel/qaic: Add overflow check to remap_pfn_range during mmap
[ Upstream commit aa16b2bc0f02709919e2435f531406531e5bcc69 ]
The call to remap_pfn_range in qaic_gem_object_mmap is susceptible to
(re)mapping beyond the VMA if the BO is too large. This can cause use
after free issues when munmap() unmaps only the VMA region and not the
additional mappings. To prevent this, check the remaining size of the
VMA before remapping and truncate the remapped length if sg->length is
too large.
Reported-by: Lukas Maar <lukas.maar@tugraz.at>
Fixes: ff13be830333 ("accel/qaic: Add datapath")
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
[jhugo: fix braces from checkpatch --strict]
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20260430193858.1178641-1-zachary.mckevitt@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jann Horn <jannh@google.com>
Date: Mon May 18 18:51:30 2026 +0200
af_unix: Fix UAF read of tail->len in unix_stream_data_wait()
commit be309f8eae8b474a4a617eaae01324da996fc719 upstream.
unix_stream_data_wait() does skb_peek_tail(&sk->sk_receive_queue) without
holding any lock that prevents SKBs on that queue from being dequeued and
freed.
This has been the case since commit 79f632c71bea ("unix/stream: fix
peeking with an offset larger than data in queue").
The first consequence of this is that the pointer comparison
`tail != last` can be false even if `last` semantically refers to an
already-freed SKB while `tail` is a new SKB allocated at the same address;
which can cause unix_stream_data_wait() to wrongly keep blocking after new
data has arrived, but only in a weird scenario where a peeking recv() and
a normal recv() on the same socket are racing, which is probably not a
real problem.
But since commit 2b514574f7e8 ("net: af_unix: implement splice for stream
af_unix sockets"), `tail` is actually dereferenced, which can cause UAF in
the following race scenario (where test_setup() runs single-threaded,
and afterwards, test_thread1() and test_thread2() run concurrently in
two threads:
```
static int socks[2];
void test_setup(void) {
socketpair(AF_UNIX, SOCK_STREAM, 0, socks);
send(socks[1], "A", 1, 0);
int peekoff = 1;
setsockopt(socks[0], SOL_SOCKET, SO_PEEK_OFF, &peekoff, sizeof(peekoff));
}
void test_thread1(void) {
char dummy;
recv(socks[0], &dummy, 1, MSG_PEEK);
}
void test_thread2(void) {
char dummy;
recv(socks[0], &dummy, 1, 0);
shutdown(socks[1], SHUT_WR);
}
```
when racing like this:
```
thread1 thread2
unix_stream_read_generic
mutex_lock(&u->iolock)
skb_peek(&sk->sk_receive_queue)
skb_peek_next(skb, &sk->sk_receive_queue)
mutex_unlock(&u->iolock)
unix_stream_read_generic
unix_state_lock(sk)
skb_peek(&sk->sk_receive_queue)
unix_state_unlock(sk)
unix_stream_data_wait
unix_state_lock(sk)
tail = skb_peek_tail(&sk->sk_receive_queue)
spin_lock(&sk->sk_receive_queue.lock)
__skb_unlink(skb, &sk->sk_receive_queue)
spin_unlock(&sk->sk_receive_queue.lock)
consume_skb(skb) [frees the SKB]
`tail != last`: false
`tail`: true
`tail->len != last_len` ***UAF***
```
Fix the UAF by removing the read of tail->len; checking tail->len would
only make sense if SKBs in the receive queue of a UNIX socket could grow,
which can no longer happen.
Kuniyuki explained:
> When commit 869e7c62486e ("net: af_unix: implement stream sendpage
> support") added sendpage() support, data could be appended to the last
> skb in the receiver's queue.
>
> That's why we needed to check if the length of the last skb was changed
> while waiting for new data in unix_stream_data_wait().
>
> However, commit a0dbf5f818f9 ("af_unix: Support MSG_SPLICE_PAGES") and
> commit 57d44a354a43 ("unix: Convert unix_stream_sendpage() to use
> MSG_SPLICE_PAGES") refactored sendmsg(), and now data is always added
> to a new skb.
That means this fix is not suitable for kernels before 6.5.
Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets")
Cc: stable@vger.kernel.org # 6.5.x
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260518-b4-unix-recv-wait-hotfix-v2-1-83e29ce8ad31@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kuniyuki Iwashima <kuniyu@google.com>
Date: Tue May 26 13:45:49 2026 +0800
af_unix: Give up GC if MSG_PEEK intervened.
[ Upstream commit e5b31d988a41549037b8d8721a3c3cae893d8670 ]
Igor Ushakov reported that GC purged the receive queue of
an alive socket due to a race with MSG_PEEK with a nice repro.
This is the exact same issue previously fixed by commit
cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK").
After GC was replaced with the current algorithm, the cited
commit removed the locking dance in unix_peek_fds() and
reintroduced the same issue.
The problem is that MSG_PEEK bumps a file refcount without
interacting with GC.
Consider an SCC containing sk-A and sk-B, where sk-A is
close()d but can be recv()ed via sk-B.
The bad thing happens if sk-A is recv()ed with MSG_PEEK from
sk-B and sk-B is close()d while GC is checking unix_vertex_dead()
for sk-A and sk-B.
GC thread User thread
--------- -----------
unix_vertex_dead(sk-A)
-> true <------.
\
`------ recv(sk-B, MSG_PEEK)
invalidate !! -> sk-A's file refcount : 1 -> 2
close(sk-B)
-> sk-B's file refcount : 2 -> 1
unix_vertex_dead(sk-B)
-> true
Initially, sk-A's file refcount is 1 by the inflight fd in sk-B
recvq. GC thinks sk-A is dead because the file refcount is the
same as the number of its inflight fds.
However, sk-A's file refcount is bumped silently by MSG_PEEK,
which invalidates the previous evaluation.
At this moment, sk-B's file refcount is 2; one by the open fd,
and one by the inflight fd in sk-A. The subsequent close()
releases one refcount by the former.
Finally, GC incorrectly concludes that both sk-A and sk-B are dead.
One option is to restore the locking dance in unix_peek_fds(),
but we can resolve this more elegantly thanks to the new algorithm.
The point is that the issue does not occur without the subsequent
close() and we actually do not need to synchronise MSG_PEEK with
the dead SCC detection.
When the issue occurs, close() and GC touch the same file refcount.
If GC sees the refcount being decremented by close(), it can just
give up garbage-collecting the SCC.
Therefore, we only need to signal the race during MSG_PEEK with
a proper memory barrier to make it visible to the GC.
Let's use seqcount_t to notify GC when MSG_PEEK occurs and let
it defer the SCC to the next run.
This way no locking is needed on the MSG_PEEK side, and we can
avoid imposing a penalty on every MSG_PEEK unnecessarily.
Note that we can retry within unix_scc_dead() if MSG_PEEK is
detected, but we do not do so to avoid hung task splat from
abusive MSG_PEEK calls.
Fixes: 118f457da9ed ("af_unix: Remove lock dance in unix_peek_fds().")
Reported-by: Igor Ushakov <sysroot314@gmail.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260311054043.1231316-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ Using include/net/af_unix.h instead of net/unix/af_unix.h on 6.12.y ]
Signed-off-by: Leon Chen <leonchen.oss@139.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Fri May 15 10:55:58 2026 +0200
ALSA: asihpi: Fix potential OOB array access at reading cache
commit 7b7d6572145c1dab2dd9bfb550b188e5f0ff3c3f upstream.
find_control() to retrieve a cached info accesses the array with the
given index blindly, which may lead to an OOB array access.
Add a sanity check for avoiding it.
Link: https://sashiko.dev/#/patchset/20260511230121.28606-1-rosenp%40gmail.com
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20260515085606.242284-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Shuhao Fu <sfual@cse.ust.hk>
Date: Tue Apr 28 16:12:38 2026 +0800
ALSA: hda: cs35l41: Put ACPI device on missing physical node
[ Upstream commit fca7401fe37f7abc6e54147ea560f37279231137 ]
acpi_dev_get_first_match_dev() returns a refcounted ACPI device and
callers must balance it with acpi_dev_put().
cs35l41_hda_read_acpi() stores the returned ACPI device in
cs35l41->dacpi. That reference is normally released by the later
probe cleanup or the remove path, but the NULL-check on
physdev exits before either of those paths can run.
Drop the lookup reference before returning -ENODEV.
Fixes: c34b04cc6178 ("ALSA: hda: cs35l41: Fix NULL pointer dereference in cs35l41_hda_read_acpi()")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Tested-by: Simon Trimmer <simont@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260428081238.GA1659932@chcpu16
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shuhao Fu <sfual@cse.ust.hk>
Date: Tue Apr 28 16:01:39 2026 +0800
ALSA: hda: cs35l56: Put ACPI device after setting companion
[ Upstream commit aa2fbece1b07954ef26488c800d126a36a8ab93e ]
acpi_dev_get_first_match_dev() returns a refcounted ACPI device and
callers are expected to balance it with acpi_dev_put().
When no companion is already attached, cs35l56_hda_read_acpi() looks
up an ACPI device and sets it with ACPI_COMPANION_SET(), but leaves
the lookup reference held.
ACPI_COMPANION_SET() does not take ownership of that reference, so
drop it with acpi_dev_put() after attaching the companion.
Fixes: 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Tested-by: Simon Trimmer <simont@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260428080139.GA1649104@chcpu16
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Sun May 17 18:51:20 2026 +0200
ALSA: pcm: Don't setup bogus iov_iter for silencing
commit e4d3386b74fba8e01280484b67ee481ece00201e upstream.
At transition to the iov_iter for PCM data transfer, we blindly
applied the iov_iter setup also for silencing (i.e. data = NULL), and
it leads to a calculation of bogus iov_iter. Fortunately this didn't
cause troubles on most of architectures but it goes wrong on RISC-V
now, causing a NULL dereference.
Handle the NULL data case to treat the silencing in interleaved_copy()
for addressing the bug above. noninterleaved_copy() has already the
NULL data handling, so it doesn't need changes.
Reported-by: Jiakai Xu <xujiakai24@mails.ucas.ac.cn>
Closes: https://lore.kernel.org/20260515051516.3103036-1-xujiakai24@mails.ucas.ac.cn
Fixes: cf393babb37a ("ALSA: pcm: Add copy ops with iov_iter")
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20260517165121.31399-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Robertus Diawan Chris <robertusdchris@gmail.com>
Date: Fri May 8 10:39:14 2026 +0700
ALSA: scarlett2: Add missing error check when initialise Autogain Status
[ Upstream commit c0e4fffc0f474b7ed10adee4ab2bc1a66d36fc72 ]
When initialise new control with scarlett2_add_new_ctl() function for
Autogain Status, scarlett2_add_new_ctl() might throw an error. So, add
error check after initialise new control for Autogain Status.
This is reported by Coverity Scan with CID 1598781 as UNUSED_VALUE.
Fixes: 0a995e38dc44 ("ALSA: scarlett2: Add support for software-controllable input gain")
Signed-off-by: Robertus Diawan Chris <robertusdchris@gmail.com>
Link: https://patch.msgid.link/20260508033914.111596-1-robertusdchris@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhang Cen <rollkingzzc@gmail.com>
Date: Wed May 20 18:32:49 2026 +0800
ALSA: seq: Serialize UMP output teardown with event_input
[ Upstream commit 60a1969fae6209644698fca91c185d153674f631 ]
seq_ump_process_event() borrows client->out_rfile.output without
synchronizing with the first-open and last-close transition in
seq_ump_client_open() and seq_ump_client_close().
The last output unuse can therefore drop opened[STR_OUT] to zero and
release the rawmidi file while an in-flight event_input callback is still
inside snd_rawmidi_kernel_write(). That leaves the rawmidi substream
runtime exposed to teardown before the write path has taken its own
buffer reference.
Add a per-client rwlock for the event_input-visible output file. Publish
a newly opened output file under the write side, and hold the read side
from the output lookup through snd_rawmidi_kernel_write(). The last
output close copies and clears the visible output file under the write
side, then drops the lock and releases the saved rawmidi file. Use
IRQ-safe rwlock guards because event_input can also be reached from
atomic sequencer delivery.
The buggy scenario involves two paths, with each column showing the
order within that path:
path A label: event_input path path B label: last unuse path
1. seq_ump_process_event() reads 1. seq_ump_client_close()
client->out_rfile.output. drops opened[STR_OUT] to zero.
2. snd_rawmidi_kernel_write1() 2. snd_rawmidi_kernel_release()
has not yet pinned runtime. closes the output file.
3. The writer continues using 3. close_substream() frees
the borrowed substream. substream->runtime.
This keeps the output substream and runtime alive for the full
event_input write while keeping rawmidi release outside the rwlock.
KASAN reproduced this as a slab-use-after-free in
snd_rawmidi_kernel_write1(), with allocation through
seq_ump_use()/snd_seq_port_connect() and free through
seq_ump_unuse()/snd_seq_port_disconnect().
Suggested-by: Takashi Iwai <tiwai@suse.de>
Validation reproduced this kernel report:
KASAN slab-use-after-free in snd_rawmidi_kernel_write1+0x9d/0x400
RIP: 0033:0x7f5528af837f
Read of size 8
Call trace:
dump_stack_lvl+0x73/0xb0 (?:?)
print_report+0xd1/0x650 (?:?)
srso_alias_return_thunk+0x5/0xfbef5 (?:?)
__virt_addr_valid+0x1a7/0x340 (?:?)
kasan_complete_mode_report_info+0x64/0x200 (?:?)
kasan_report+0xf7/0x130 (?:?)
snd_rawmidi_kernel_write1+0x9d/0x400 (?:?)
__asan_load8+0x82/0xb0 (?:?)
update_stack_state+0x1ef/0x2d0 (?:?)
snd_rawmidi_kernel_write+0x1a/0x20 (?:?)
seq_ump_process_event+0xd4/0x120 (sound/core/seq/seq_ump_client.c:82)
__snd_seq_deliver_single_event+0x8a/0xe0 (?:?)
snd_seq_deliver_from_ump+0x2b2/0xd60 (?:?)
lock_acquire+0x14e/0x2e0 (?:?)
find_held_lock+0x31/0x90 (?:?)
snd_seq_port_use_ptr+0xa6/0xe0 (?:?)
__kasan_check_write+0x18/0x20 (?:?)
do_raw_read_unlock+0x32/0xa0 (?:?)
_raw_read_unlock+0x26/0x50 (?:?)
snd_seq_deliver_single_event+0x45c/0x4b0 (?:?)
snd_seq_deliver_event+0x10d/0x1b0 (?:?)
snd_seq_client_enqueue_event+0x192/0x240 (?:?)
snd_seq_write+0x2cd/0x450 (?:?)
apparmor_file_permission+0x20/0x30 (?:?)
security_file_permission+0x51/0x60 (?:?)
vfs_write+0x1ce/0x850 (?:?)
__fget_files+0x12b/0x220 (?:?)
lock_release+0xc8/0x2a0 (?:?)
__rcu_read_unlock+0x74/0x2d0 (?:?)
__fget_files+0x135/0x220 (?:?)
ksys_write+0x15a/0x180 (?:?)
rcu_is_watching+0x24/0x60 (?:?)
__x64_sys_write+0x46/0x60 (?:?)
x64_sys_call+0x7d/0x20d0 (?:?)
do_syscall_64+0xc1/0x360 (arch/x86/entry/syscall_64.c:87)
entry_SYSCALL_64_after_hwframe+0x77/0x7f (?:?)
Fixes: 81fd444aa371 ("ALSA: seq: Bind UMP device")
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Link: https://patch.msgid.link/20260520103249.3048345-1-rollkingzzc@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Date: Tue May 19 00:32:15 2026 -0300
ALSA: ua101: Reject too-short USB descriptors
commit b59d5c51bb328a60749b4dd5fe7e649bfb4089b4 upstream.
find_format_descriptor() walks the class-specific interface extras by
advancing with bLength. It rejects descriptors that extend past the
remaining buffer, but it does not reject descriptor lengths smaller than
a USB descriptor header.
Reject too-short descriptors before using bLength to advance the local
scan. This keeps the UA-101 parser robust against malformed descriptor
data and matches the usual USB descriptor walking rules.
Fixes: 63978ab3e3e9 ("sound: add Edirol UA-101 support")
Cc: stable@vger.kernel.org
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260519-alsa-ua101-desc-len-v1-1-4307d1a5e054@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Date: Tue May 26 19:24:40 2026 +0000
arm64: Kconfig: Remove selecting replaced HAVE_FUNCTION_GRAPH_RETVAL
commit f458b2165d7ac0f2401fff48f19c8f864e7e1e38 upstream.
Commit a3ed4157b7d8 ("fgraph: Replace fgraph_ret_regs with ftrace_regs")
replaces the config HAVE_FUNCTION_GRAPH_RETVAL with the config
HAVE_FUNCTION_GRAPH_FREGS, and it replaces all the select commands in the
various architecture Kconfig files. In the arm64 architecture, the commit
adds the 'select HAVE_FUNCTION_GRAPH_FREGS', but misses to remove the
'select HAVE_FUNCTION_GRAPH_RETVAL', i.e., the select on the replaced
config.
Remove selecting the replaced config. No functional change, just cleanup.
Fixes: a3ed4157b7d8 ("fgraph: Replace fgraph_ret_regs with ftrace_regs")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Link: https://lore.kernel.org/r/20250117125522.99071-1-lukas.bulwahn@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Murzin <vladimir.murzin@arm.com>
Date: Fri May 15 14:37:29 2026 +0100
arm64: probes: Handle probes on hinted conditional branch instructions
commit 2ccd8ff980b50e842481bae71102fa3883fc4377 upstream.
BC.cond instructions introduced by FEAT_HBC cannot be executed
out-of-line, like other branch instructions. However, they can be
simulated in the same way as B.cond instructions.
Extend the B.cond decoder mask to match BC.cond instructions as well,
and handle them using the existing B.cond simulation path.
Fixes: 7f86d128e437 ("arm64: add HWCAP for FEAT_HBC (hinted conditional branches)")
Cc: <stable@vger.kernel.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Marek Vasut <marek.vasut+renesas@mailbox.org>
Date: Sat Mar 28 00:42:10 2026 +0100
ARM: dts: renesas: genmai: Drop superfluous cells
[ Upstream commit 714e1d6bba0e0abe5c87c8e189a35fa690540df4 ]
Drop superfluous address-cells and size-cells to fix DTC W=1 warning:
arch/arm/boot/dts/renesas/r7s72100-genmai.dts:28.17-55.4: Warning (avoid_unnecessary_addr_size): /flash@18000000: unnecessary #address-cells/#size-cells without "ranges", "dma-ranges" or child "reg" or "ranges" property
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Fixes: 30e0a8cf886cb459 ("ARM: dts: renesas: genmai: Add FLASH nodes")
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260327234244.91707-6-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marek Vasut <marek.vasut+renesas@mailbox.org>
Date: Sat Mar 28 00:42:11 2026 +0100
ARM: dts: renesas: rskrza1: Drop superfluous cells
[ Upstream commit ab83176d3cf1cf1c1f6e604432905bda4515d17f ]
Drop superfluous address-cells and size-cells to fix DTC W=1 warning:
arch/arm/boot/dts/renesas/r7s72100-rskrza1.dts:32.17-72.4: Warning (avoid_unnecessary_addr_size): /flash@18000000: unnecessary #address-cells/#size-cells without "ranges", "dma-ranges" or child "reg" or "ranges" property
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Fixes: 98537eb77d3ef185 ("ARM: dts: renesas: rskrza1: Add FLASH nodes")
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260327234244.91707-7-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guenter Roeck <linux@roeck-us.net>
Date: Tue May 5 21:15:37 2026 +0200
ARM: integrator: Fix early initialization
[ Upstream commit 90d77b30a666049ad24df463f52e5d529c44e8cd ]
Starting with commit bdb249fce9ad4 ("ARM: integrator: read counter using
syscon/regmap"), intcp_init_early calls syscon_regmap_lookup_by_compatible
which in turn calls of_syscon_register. This function allocates memory.
Since the memory management code has not been initialized at that time,
the call always fails. It either returns -ENOMEM or crashes as follows.
Unable to handle kernel NULL pointer dereference at virtual address 0000000c when read
[0000000c] *pgd=00000000
Internal error: Oops: 5 [#1] ARM
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc5-00026-g5fcc9bf84ee5 #1 PREEMPT
Hardware name: ARM Integrator/CP (Device Tree)
PC is at __kmalloc_cache_noprof+0xec/0x39c
LR is at __kmalloc_cache_noprof+0x34/0x39c
...
Call trace:
__kmalloc_cache_noprof from of_syscon_register+0x7c/0x310
of_syscon_register from device_node_get_regmap+0xa4/0xb0
device_node_get_regmap from intcp_init_early+0xc/0x40
intcp_init_early from start_kernel+0x60/0x688
start_kernel from 0x0
The crash is seen due to a dereferenced pointer which is not supposed to be
NULL but is NULL if the memory management subsystem has not been
initialized. The crash is not seen with all versions of gcc. Some versions
such as gcc 9.x apparently do not dereference the pointer, presumably if
tracing is disabled. The problem has been reproduced with gcc 10.x, 11.x,
and 13.x. Either case, if the crash is not seen, the call to
syscon_regmap_lookup_by_compatible returns -ENOMEM, and
sched_clock_register is never called.
Fix the problem by moving the early initialization code into the standard
machine initialization code.
Fixes: bdb249fce9ad4 ("ARM: integrator: read counter using syscon/regmap")
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/20250518164118.3859567-1-linux@roeck-us.net
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20260505-integrator-fixes-v1-1-56ab9aac59db@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date: Thu May 21 13:30:57 2026 +0100
ASoC: cs35l56: Fix flushing of IRQ work in cs35l56_sdw_remove()
[ Upstream commit 18e7bd9f2446664053f8c34b72abd4606d22d858 ]
Use flush_work() instead of cancel_work_sync() to terminate pending IRQ
work in cs35l56_sdw_remove(). And flush_work() again after masking the
interrupts to flush any queueing that was racing with the masking. This is
the same sequence as cs35l56_sdw_system_suspend().
cs35l56_sdw_interrupt() takes the pm_runtime to prevent the bus powering-
down before the interrupt status can be read and handled. The work releases
this pm_runtime. So cancelling it, instead of flushing, could leave an
unbalanced pm_runtime.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: e49611252900 ("ASoC: cs35l56: Add driver for Cirrus Logic CS35L56")
Link: https://patch.msgid.link/20260521123057.988732-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Niklas Cassel <cassel@kernel.org>
Date: Thu May 14 09:39:02 2026 +0200
ata: libata-scsi: do not needlessly defer commands when using PMP with FBS
commit 759e8756da00aa115d504a18155b1d1ee1cc12e8 upstream.
The ACS specification does not allow a non-NCQ command to be issued while
an NCQ command is outstanding.
Commit 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
introduced a feature where a deferred non-NCQ command gets issued from a
workqueue. The design stores a single non-NCQ command per port.
However, when using Port Multipliers (PMPs), specifically PMPs that
support FIS-Based Switching (FBS), non-NCQ and NCQ commands can be mixed
on the same port, just not for the same link, see e.g. ata_std_qc_defer()
which is, and always has operated on a per-link basis.
Therefore, move the deferred_qc from struct ata_port to struct ata_link.
This way, when using a PMP with FBS, we will not needlessly defer commands
to all other links, just because one link issued a non-NCQ command while
having an NCQ command outstanding. Only commands for that specific link
will be deferred. This is in line with how PMPs with FBS worked before
commit 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation").
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Tested-by: Tommy Kelly <linux@tkel.ly>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Niklas Cassel <cassel@kernel.org>
Date: Thu May 14 09:39:00 2026 +0200
ata: libata-scsi: do not use the deferred QC feature for ATA_DEFER_PORT
commit ce4548807d2e4ae48fd0dbe38865467369877913 upstream.
The deferred QC feature was meant to handle mixed NCQ and non-NCQ commands,
i.e. for return value ATA_DEFER_LINK.
ATA_DEFER_PORT is returned by PATA drivers, but also certain SATA drivers
like sata_mv and sata_sil24 that uses ap->excl_link to workaround hardware
bugs in these HBAs. Regardless of the reason, using the deferred QC feature
for ATA_DEFER_PORT is always wrong, and will break the ap->excl_link usage
of the SATA drivers that rely on that feature.
Modify ata_scsi_qc_issue() to only use the deferred QC feature when mixing
NCQ and non-NCQ commands, i.e. ATA_DEFER_LINK.
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Tested-by: Tommy Kelly <linux@tkel.ly>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Niklas Cassel <cassel@kernel.org>
Date: Thu May 14 09:39:01 2026 +0200
ata: libata-scsi: do not use the deferred QC feature on PMPs with CBS
commit f233124fb36cd57ef09f96d517a38ab4b902e15e upstream.
When using Port Multipliers (PMPs) with Command-Based Switching (CBS), you
can only issue commands to one link at a time. For PMPs with CBS, there is
already code to handle commands being sent to different links in
sata_pmp_qc_defer_cmd_switch() using ap->excl_link. sata_sil24 also makes
use of ap->excl_link.
A user on the list reported that commit 0ea84089dbf6 ("ata: libata-scsi:
avoid Non-NCQ command starvation") broke PMPs with CBS. The commit
introduced code that stores a deferred qc in ap->deferred_qc, to later be
issued via a workqueue. It turns out that this change is incompatible with
the existing ap->excl_link handling used by PMPs with CBS.
Thus, modify sata_pmp_qc_defer_cmd_switch() and sil24_qc_defer() to return
ATA_DEFER_LINK_EXCL, and make sure that the deferred QC handling via
workqueue is not used for this return value.
This way, PMPs with CBS will work once again. Note that the starvation
referenced in commit 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ
command starvation") can only happen on libsas ports, and libsas does not
support Port Multipliers, thus there is no harm of reverting back to the
previous way of deferring commands for PMPs with CBS.
Non-libsas ports connected to anything but a PMP with CBS (e.g. a normal
drive or a PMP with FBS) will continue using the deferred workqueue, since
it does result in lower completion latencies for non-NCQ commands, even
though the workqueue is not strictly needed to avoid starvation for
non-libsas ports.
If we want to modify the scope of the workqueue issuing to also handle
PMPs with CBS, then we should ensure that we can save both NCQ and non-NCQ
commands in ap->deferred_qc, while also removing the existing PMP CBS
handling using ap->excl_link, such that we don't duplicate features.
While at it, also add a comment explaining how the ap->excl_link mechanism
works.
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Tested-by: Tommy Kelly <linux@tkel.ly>
Reported-by: Tommy Kelly <linux@tkel.ly>
Closes: https://lore.kernel.org/linux-ide/ce09cc21-a8e9-4845-b205-35411e22fba9@tkel.ly/
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Niklas Cassel <cassel@kernel.org>
Date: Thu May 14 09:38:59 2026 +0200
ata: libata-scsi: improve readability of ata_scsi_qc_issue()
commit 360190bd965f93794d5f5685a6de22ce6da2b672 upstream.
Improve readability of ata_scsi_qc_issue().
No functional changes.
Tested-by: Tommy Kelly <linux@tkel.ly>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Sun May 10 11:43:20 2026 +0200
batman-adv: bla: fix report_work leak on backbone_gw purge
commit 0459430add32ea41f3e2ef9351610e6d33627a6b upstream.
batadv_bla_purge_backbone_gw() removes stale backbone gateway entries,
but fails to properly handle their associated report_work:
- If report_work is running, the purge must wait for it to finish before
freeing the backbone_gw, otherwise the worker may access freed memory
(e.g. bat_priv).
- If report_work is pending, the purge must cancel it and release the
reference held for that pending work item.
The previous implementation called hlist_for_each_entry_safe() inside a
spin_lock_bh() section, but cancel_work_sync() may sleep and therefore
cannot be called from within a spinlock-protected region.
Restructure the loop to handle one entry per spinlock critical section:
acquire the lock, find the next entry to purge, remove it from the hash
list, then release the lock before calling cancel_work_sync() and
dropping the hash_entry reference. Repeat until no more entries require
purging.
Cc: stable@kernel.org
Fixes: 23721387c409 ("batman-adv: add basic bridge loop avoidance code")
Reviewed-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ruijie Li <ruijieli51@gmail.com>
Date: Thu May 14 16:13:25 2026 +0800
batman-adv: clear current gateway during teardown
commit a340a51ed801eab7bb454150c226323b865263cc upstream.
batadv_gw_node_free() removes the gateway list entries during mesh teardown,
but it does not clear the currently selected gateway. This leaves stale
gateway state behind across cleanup and can break a later mesh recreation.
Clear bat_priv->gw.curr_gw before walking the gateway list so the selected
gateway reference is dropped as part of teardown.
Fixes: 2265c1410864 ("batman-adv: gateway election code refactoring")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Ruijie Li <ruijieli51@gmail.com>
Signed-off-by: Zhanpeng Li <lzhanpeng2025@lzu.edu.cn>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Wed May 13 09:01:34 2026 +0200
batman-adv: dat: handle forward allocation error
commit 2d8826a2d3657cea66fb0370f9e521575a673871 upstream.
batadv_dat_forward_data() calls pskb_copy_for_clone() to duplicate an skb
for each DHT candidate, but does not check the return value before passing
it to batadv_send_skb_prepare_unicast_4addr(). That function dereferences
the skb unconditionally, so a failed allocation triggers a NULL pointer
dereference.
Skip forwarding to the current DHT candidate on allocation failure.
Cc: stable@kernel.org
Fixes: 785ea1144182 ("batman-adv: Distributed ARP Table - create DHT helper functions")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Reviewed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ruide Cao <caoruide123@gmail.com>
Date: Wed May 13 11:58:15 2026 +0800
batman-adv: fix fragment reassembly length accounting
commit 9cd3f16c320bfdadd4509358122368deb56a5741 upstream.
batman-adv keeps a running payload length for queued fragments and uses it
to validate a fragment chain before reassembly.
That accounting currently allows the accumulated fragment length to be
truncated during updates. As a result, malformed fragment chains can
bypass the intended validation and drive reassembly with inconsistent
length state, leading to a local denial of service.
Fix the accounting by storing the accumulated length in a length-typed
field and rejecting update overflows before the existing validation logic
runs.
The fix was verified against the original reproducer and against valid
fragment reassembly paths.
Fixes: 610bfc6bc99b ("batman-adv: Receive fragmented packets and merge")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Ruide Cao <caoruide123@gmail.com>
Tested-by: Ren Wei <enjou1224z@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Luxiao Xu <rakukuip@gmail.com>
Date: Mon May 11 18:52:09 2026 +0200
batman-adv: fix tp_meter counter underflow during shutdown
commit 94f3b133168d1c49895e7cc6afbcf1cc0b354602 upstream.
batadv_tp_sender_shutdown() unconditionally decrements the "sending"
atomic counter. If multiple paths (e.g. timeout, user cancel, and
normal finish) call this function, the counter can underflow to -1.
Since the sender logic treats any non-zero value as "still sending",
a negative value causes the sender kthread to loop indefinitely.
This leads to a use-after-free when the interface is removed while
the zombie thread is still active.
Fix this by using atomic_xchg() to ensure the counter only transitions
from 1 to 0 once.
Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Luxiao Xu <rakukuip@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
[sven: added missing change in batadv_tp_send]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Wed May 13 09:01:36 2026 +0200
batman-adv: frag: disallow unicast fragment in fragment
commit bc62216dc8e221e3781afa14430f45208bfa9af9 upstream.
batadv_frag_skb_buffer() is called by batadv_batman_skb_recv() when a
BATADV_UNICAST_FRAG packet is received. Once all fragments are collected
and the packet is reassembled, batadv_recv_frag_packet() calls
batadv_batman_skb_recv() again to process the defragmented payload.
A malicious sender can craft a BATADV_UNICAST_FRAG packet whose reassembled
payload is itself a BATADV_UNICAST_FRAG packet (matryoshka-style nesting).
Each nesting level recurses through batadv_batman_skb_recv() without bound,
growing the kernel stack until it is exhausted.
Since refragmentation or fragments in fragments are not actually allowed,
discard all packets which are still BATADV_UNICAST_FRAG packets after the
defragmentation process.
Cc: stable@kernel.org
Fixes: 610bfc6bc99b ("batman-adv: Receive fragmented packets and merge")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Reviewed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Thu May 14 19:22:02 2026 +0200
batman-adv: mcast: fix use-after-free in orig_node RCU release
commit 20c2d6a20ca936f5aaa6dd40f73f262ac45c87cc upstream.
batadv_mcast_purge_orig() removes entries from RCU-protected hlists but
does not wait for an RCU grace period before returning. Concurrent RCU
readers may still accesses references to those entries at the point of
removal. RCU-protected readers trying to operate on entries like
orig->mcast_want_all_ipv6_node will then access already freed memory.
Fix this by moving batadv_mcast_purge_orig() to batadv_orig_node_release(),
just before the call_rcu() invocation. This ensures RCU readers that were
active at purge time have drained before the orig_node memory is reclaimed.
Cc: stable@kernel.org
Fixes: ab49886e3da7 ("batman-adv: Add IPv4 link-local/IPv6-ll-all-nodes multicast support")
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Wed May 13 09:01:35 2026 +0200
batman-adv: tp_meter: avoid use of uninit sender vars
commit 6c65cf23d4c6170fcf5714c32aa64689718cb142 upstream.
batadv_tp_recv_ack() and batadv_tp_stop() are only valid for tp_vars in the
BATADV_TP_SENDER role. When called with a BATADV_TP_RECEIVER role, it
proceeds to read sender-only members that were never initialized, leading
to undefined behavior.
This can be triggered when a node that is currently acting as a receiver in
an ongoing tp_meter session receives a malicious ACK packet.
Guard against this by checking tp_vars->role immediately after the
lookup and bailing out if it is not BATADV_TP_SENDER, before any of
those members are accessed.
Cc: stable@kernel.org
Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Reviewed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Wed May 13 23:38:54 2026 +0200
batman-adv: tp_meter: fix race condition in send error reporting
commit 71dce47f0758537fff78fddb5fb0d4632d29b29f upstream.
batadv_tp_sender_shutdown() previously used two separate variables to track
session state: sending (an atomic flag indicating whether the session was
active) and reason (a plain enum storing the stop reason). This introduced
a race window between the two writes: after sending was cleared to 0,
batadv_tp_send() could observe the stopped state and call
batadv_tp_sender_end() before reason was written, causing the wrong stop
reason to be reported to the caller.
Fix this by consolidating both variables into a single atomic send_result,
which holds 0 while the session is running and the stop reason once it
ends.
Cc: stable@kernel.org
Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Sun May 10 11:31:03 2026 +0200
batman-adv: tp_meter: fix tp_vars reference leak in receiver shutdown
commit 77098e4bea37af51d3962efa88a5af2ea5e1ac57 upstream.
The receiver shutdown timer handler, batadv_tp_receiver_shutdown(), is
responsible for releasing the tp_vars reference it holds. However, the
existing logic for coordinating this release with batadv_tp_stop_all() was
flawed.
timer_shutdown_sync() guarantees the timer will not fire again after it
returns, but it returns non-zero only when the timer was pending at the
time of the call. If the timer had already expired (and
batadv_tp_stop_all() would unsucessfully try to rearm itself),
batadv_tp_stop_all() skips its batadv_tp_vars_put(), and
batadv_tp_receiver_shutdown() fails to put its own reference as well.
Fix this by introducing a new atomic variable receiving that is set to 1
when the receiver is initialized and cleared atomically with atomic_xchg()
by whichever side claims it first. Only the side that observes the
transition from 1 to 0 is responsible for releasing the tp_vars timer
reference, eliminating the uncertainty.
Cc: stable@kernel.org
Fixes: 3d3cf6a7314a ("batman-adv: stop tp_meter sessions during mesh teardown")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Sat May 2 19:53:21 2026 +0200
batman-adv: tt: fix negative last_changeset_len
commit fc92cdfcb295cefa4344d71a527d61b638b7bfc4 upstream.
batadv_piv_tt::last_changeset_len len was declared as s16, but the field is
never intended to hold a negative value. When a value greater than 32767 is
assigned, it wraps to a negative signed integer.
In batadv_send_my_tt_response(), last_changeset_len is temporarily widened
to s32. The incorrectly negative s16 value propagates into the s32, causing
batadv_tt_prepare_tvlv_local_data() to allocate a full sized buffer but
populates only a small portion of it with the collected changeset. All
remaining bits are kept uninitialized.
Using an u16 avoids this type confusion and ensures that no (negative) sign
extension is performed in batadv_send_my_tt_response().
Cc: stable@kernel.org
Fixes: a73105b8d4c7 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sven Eckelmann <sven@narfation.org>
Date: Sat May 2 19:53:21 2026 +0200
batman-adv: tt: fix negative tt_buff_len
commit b64963a2ceeb7529310b6cf253a1e540784422f4 upstream.
batadv_orig_node::tt_buff_len was declared as s16, but the field is never
intended to hold a negative value. When a value greater than 32767 is
assigned, it wraps to a negative signed integer.
In batadv_send_other_tt_response(), tt_buff_len is temporarily widened to
s32. The incorrectly negative s16 value propagates into the s32, causing
batadv_tt_prepare_tvlv_global_data() to allocate a full sized buffer but
populates only a small portion of it with the collected changeset. All
remaining bits are kept uninitialized.
Using an u16 avoids this type confusion and ensures that no (negative) sign
extension is performed in batadv_send_other_tt_response().
Cc: stable@kernel.org
Fixes: a73105b8d4c7 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Keith Busch <kbusch@kernel.org>
Date: Wed Sep 3 12:33:16 2025 -0700
blk-integrity: enable p2p source and destination
[ Upstream commit 05ceea5d3ec9a1b1d6858ffd4739fdb0ed1b8eaf ]
Set the extraction flags to allow p2p pages for the metadata buffer if
the block device allows it. Similar to data payloads, ensure the bio
does not use merging if we see a p2p page.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: 8582792cf23b ("block: bio-integrity: Fix null-ptr-deref in bio_integrity_map_user()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Keith Busch <kbusch@kernel.org>
Date: Wed Oct 16 13:13:09 2024 -0700
blk-integrity: remove seed for user mapped buffers
[ Upstream commit 133008e84b99e4f5f8cf3d8b768c995732df9406 ]
The seed is only used for kernel generation and verification. That
doesn't happen for user buffers, so passing the seed around doesn't
accomplish anything.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Link: https://lore.kernel.org/r/20241016201309.1090320-1-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: 637ad3a56a3b ("block: don't overwrite bip_vcnt in bio_integrity_copy_user()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Keith Busch <kbusch@kernel.org>
Date: Wed Aug 27 07:12:57 2025 -0700
blk-integrity: use simpler alignment check
[ Upstream commit 69d7ed5b9ef661230264bfa0db4c96fa25b8efa4 ]
We're checking length and addresses against the same alignment value, so
use the more simple iterator check.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: 8582792cf23b ("block: bio-integrity: Fix null-ptr-deref in bio_integrity_map_user()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sungwoo Kim <iam@sung-woo.kim>
Date: Tue May 12 01:09:29 2026 -0400
block: bio-integrity: Fix null-ptr-deref in bio_integrity_map_user()
[ Upstream commit 8582792cf23b3d94674d4d838f7cde9a28d0fcaf ]
pin_user_pages_fast() can partially succeed and return the number of
pages that were actually pinned. However, the bio_integrity_map_user()
does not handle this partial pinning. This leads to a general protection
fault since bvec_from_pages() dereferences an unpinned page address,
which is 0.
To fix this, add a check to verify that all requested memory is pinned.
If partial pinning occurs, unpin the memory and return -EFAULT.
Kernel Oops:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 UID: 0 PID: 1061 Comm: nvme-passthroug Not tainted 7.0.0-11783-g90957f9314e8-dirty #16 PREEMPT(lazy)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:bio_integrity_map_user.cold+0x1b0/0x9d6
Fixes: 492c5d455969 ("block: bio-integrity: directly map user buffers")
Acked-by: Chao Shi <cshi008@fiu.edu>
Acked-by: Weidong Zhu <weizhu@fiu.edu>
Acked-by: Dave Tian <daveti@purdue.edu>
Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://github.com/linux-blktests/blktests/pull/244
Link: https://patch.msgid.link/20260512050929.541397-2-iam@sung-woo.kim
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Carlier <devnexen@gmail.com>
Date: Mon May 11 22:51:51 2026 +0100
block: don't overwrite bip_vcnt in bio_integrity_copy_user()
[ Upstream commit 637ad3a56a3b889527d1dacea6fea2a8bd648140 ]
bio_integrity_add_page() already sets bip_vcnt to 1 for the bounce
segment. Overwriting it with nr_vecs breaks bip_vcnt <= bip_max_vcnt
on WRITE (bip_max_vcnt is 1), so the gap-merge checks in block/blk.h
read past the bip_vec[] flex array. On READ the read is in bounds
but lands on a saved user bvec instead of the bounce.
The line was added for split propagation, but bio_integrity_clone()
doesn't copy bip_vcnt and BIP_CLONE_FLAGS excludes BIP_COPY_USER.
Fixes: 3991657ae707 ("block: set bip_vcnt correctly")
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260511215151.346228-1-devnexen@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Caleb Sander Mateos <csander@purestorage.com>
Date: Tue Jun 3 12:31:32 2025 -0600
block: drop direction param from bio_integrity_copy_user()
[ Upstream commit c09a8b00f850d3ca0af998bff1fac4a3f6d11768 ]
direction is determined from bio, which is already passed in. Compute
op_is_write(bio_op(bio)) directly instead of converting it to an iter
direction and back to a bool.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Link: https://lore.kernel.org/r/20250603183133.1178062-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: 8582792cf23b ("block: bio-integrity: Fix null-ptr-deref in bio_integrity_map_user()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jens Axboe <axboe@kernel.dk>
Date: Fri Nov 29 15:53:58 2024 -0700
block: make bio_integrity_map_user() static inline
[ Upstream commit 546d191427cf5cf3215529744c2ea8558f0279db ]
If CONFIG_BLK_DEV_INTEGRITY isn't set, then the dummy helper must be
static inline to avoid complaints about the function being unused.
Fixes: fe8f4ca7107e ("block: modify bio_integrity_map_user to accept iov_iter as argument")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202411300229.y7h60mDg-lkp@intel.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Anuj Gupta <anuj20.g@samsung.com>
Date: Thu Nov 28 16:52:33 2024 +0530
block: modify bio_integrity_map_user to accept iov_iter as argument
[ Upstream commit fe8f4ca7107e968b0eb7328155c8811f2a19424a ]
This patch refactors bio_integrity_map_user to accept iov_iter as
argument. This is a prep patch.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20241128112240.8867-4-anuj20.g@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: 8582792cf23b ("block: bio-integrity: Fix null-ptr-deref in bio_integrity_map_user()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Casey Chen <cachen@purestorage.com>
Date: Mon May 11 15:22:30 2026 -0600
block: recompute nr_integrity_segments in blk_insert_cloned_request
[ Upstream commit 2c6e6a18a37b905cb584eb0dda3ae482162a81ca ]
blk_insert_cloned_request() already recomputes nr_phys_segments
against the bottom queue, because "the queue settings related to
segment counting may differ from the original queue." The exact same
reasoning applies to integrity segments: a stacked driver's underlying
queue can have tighter virt_boundary_mask, seg_boundary_mask, or
max_segment_size than the top queue, in which case
blk_rq_count_integrity_sg() against the bottom queue produces a
different count than the cached rq->nr_integrity_segments inherited
from the source request by blk_rq_prep_clone().
When the cached count is lower than the bottom queue's actual count,
blk_rq_map_integrity_sg() trips
BUG_ON(segments > rq->nr_integrity_segments);
on dispatch. The same families of stacked setups that motivated the
existing nr_phys_segments recompute -- dm-multipath fanning out to
nvme-rdma in particular -- can produce this.
Mirror the nr_phys_segments handling: when the request carries
integrity, recompute nr_integrity_segments against the bottom queue
and reject the request if it exceeds the bottom queue's
max_integrity_segments. blk_rq_count_integrity_sg() and
queue_max_integrity_segments() are both already available via
<linux/blk-integrity.h>, which blk-mq.c includes.
This closes a latent gap in the stacking contract and brings the
integrity-segment accounting in line with the existing
phys-segment accounting.
Fixes: 76c313f658d2 ("blk-integrity: improved sg segment mapping")
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260511212230.27511-1-cachen@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jann Horn <jannh@google.com>
Date: Tue May 12 22:15:39 2026 +0200
Bluetooth: bnep: Fix UAF read of dev->name
commit 59e932ded949fa6f0340bf7c6d7818f962fa4fd2 upstream.
bnep_add_connection() needs to keep holding the bnep_session_sem while
reading dev->name (just like bnep_get_connlist() does); otherwise the
bnep_session() thread can concurrently free the net_device, which can for
example be triggered by a concurrent bnep_del_connection().
(This UAF is fairly uninteresting from a security perspective;
calling bnep_add_connection() requires passing a capable(CAP_NET_ADMIN)
check. It also requires completely tearing down a netdev during a fairly
tight race window.)
Cc: stable@vger.kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jiajia Liu <liujiajia@kylinos.cn>
Date: Mon May 18 10:24:02 2026 +0800
Bluetooth: btmtk: fix urb->setup_packet leak in error paths
[ Upstream commit dd1dda6b8d6e1f4376a5b3055a04f0ecbdb4d6bd ]
The setup_packet of control urb is not freed if usb_submit_urb fails or
the submitted urb is killed. Add free in these two paths.
Fixes: a1c49c434e150 ("Bluetooth: btusb: Add protocol support for MediaTek MT7668U USB devices")
Signed-off-by: Jiajia Liu <liujiajia@kylinos.cn>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Safa Karakuş <safa.karakus@secunnix.com>
Date: Sat May 16 21:15:04 2026 +0300
Bluetooth: fix UAF in l2cap_sock_cleanup_listen() vs l2cap_conn_del()
commit ab1513597c6cf17cd1ad2a21e3b045421b48e022 upstream.
bt_accept_dequeue() unlinks a not-yet-accepted child from the parent
accept queue and release_sock()s it before returning, so the returned
sk has no caller reference and is unlocked.
l2cap_sock_cleanup_listen() walks these children on listening-socket
close. A concurrent HCI disconnect drives hci_rx_work ->
l2cap_conn_del() which runs l2cap_chan_del() + l2cap_sock_kill() and
frees the child sk and its l2cap_chan; cleanup_listen() then uses both:
BUG: KASAN: slab-use-after-free in l2cap_sock_kill
l2cap_sock_kill / l2cap_sock_cleanup_listen / __x64_sys_close
Freed by: l2cap_conn_del -> l2cap_sock_close_cb -> l2cap_sock_kill
This is distinct from the two fixes already in this area: commit
e83f5e24da741 ("Bluetooth: serialize accept_q access") serialises the
accept_q list/poll and takes temporary refs inside bt_accept_dequeue(),
and CVE-2025-39860 serialises the userspace close()/accept() race by
calling cleanup_listen() under lock_sock() in l2cap_sock_release().
Neither covers l2cap_conn_del() running from hci_rx_work, so this UAF
still reproduces on current bluetooth/master.
Take the reference at the source: bt_accept_dequeue() does sock_hold()
while sk is still locked, before release_sock(); callers sock_put().
cleanup_listen() pins the chan with l2cap_chan_hold_unless_zero() under
a brief child sk lock (serialising vs l2cap_sock_teardown_cb()), drops
it before l2cap_chan_lock(), and skips a duplicate l2cap_sock_kill() on
SOCK_DEAD. conn->lock is not taken here: cleanup_listen() runs under
the parent sk lock and that would invert
conn->lock -> chan->lock -> sk_lock (lockdep).
KASAN/SMP: an unprivileged listen/close vs HCI-disconnect race produced
12 use-after-free reports per run before this change; 0, and no lockdep
report, over 1600+ raced iterations after it on bluetooth/master.
Fixes: 15f02b910562 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
Cc: stable@vger.kernel.org
Reported-by: Siwei Zhang <oss@fourdim.xyz>
Reviewed-by: Siwei Zhang <oss@fourdim.xyz>
Signed-off-by: Safa Karakuş <safa.karakus@secunnix.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Date: Mon May 18 10:49:49 2026 +0800
Bluetooth: hci_uart: fix UAFs and race conditions in close and init paths
commit c1bb9336ae6b54a5f6a353c4bd4ed9a4307e429b upstream.
Vulnerabilities leading to Use-After-Free (UAF) and Null Pointer
Dereference (NPD) conditions were observed in the lifecycle management
of hci_uart.
The primary issue arises because the workqueues (init_ready and
write_work) are only flushed/cancelled if the HCI_UART_PROTO_READY
flag is set during TTY close. If a hangup occurs before setup completes,
hci_uart_tty_close() skips the teardown of these workqueues and
proceeds to free the `hu` struct. When the scheduled work executes
later, it blindly dereferences the freed `hu` struct.
Furthermore, several data races and UAFs were identified in the teardown
sequence:
1. Calling hci_uart_flush() from hci_uart_close() without effectively
disabling write_work causes a race condition where both can concurrently
double-free hu->tx_skb. This happens because protocol timers can
concurrently invoke hci_uart_tx_wakeup() and requeue write_work.
2. Calling hci_free_dev(hdev) before hu->proto->close(hu) causes a UAF
when vendor specific protocol close callbacks dereference hu->hdev.
3. In the initialization error paths, failing to take the proto_lock
write lock before clearing PROTO_READY leads to races with active
readers. Additionally, hci_uart_tty_receive() accesses hu->hdev
outside the read lock, leading to UAFs if the initialization error
path frees hdev concurrently.
Fix these synchronization and lifecycle issues by:
1. Re-ordering hci_uart_tty_close() to clear HCI_UART_PROTO_READY first,
followed immediately by a cancel_work_sync(&hu->write_work). Clearing
the flag locks out concurrent protocol timers from successfully invoking
hci_uart_tx_wakeup(), effectively rendering the cancellation permanent
and preventing the tx_skb double-free.
2. Note: Clearing PROTO_READY early causes hci_uart_close() to skip
hu->proto->flush(). This is perfectly safe in the tty_close path
because hu->proto->close() executes shortly after, which intrinsically
purges all protocol SKB queues and tears down the state.
3. Relocating hu->proto->close(hu) strictly prior to hci_free_dev(hdev)
across all close and error paths to prevent vendor-level UAFs.
4. Moving the hdev->stat.byte_rx increment in hci_uart_tty_receive()
inside the proto_lock read-side critical section to safely synchronize
with device unregistration.
5. Adding cancel_work_sync(&hu->write_work) to hci_uart_close() to safely
flush the workqueue before hci_uart_flush() is invoked via the HCI core.
6. Utilizing cancel_work_sync() instead of disable_work_sync() across
all paths to prevent permanently breaking user-space retry capabilities.
Fixes: 3b799254cf6f ("Bluetooth: hci_uart: Cancel init work before unregistering")
Cc: stable@vger.kernel.org
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Carlier <devnexen@gmail.com>
Date: Fri May 15 07:25:25 2026 +0100
Bluetooth: ISO: drop ISO_END frames received without prior ISO_START
commit 84c24fb151fc1179355296d7ff29129ac7c42129 upstream.
ISO data PDUs carry a packet-boundary flag indicating START, CONT, END
or SINGLE. The ISO_CONT branch of iso_recv() guards against a missing
ISO_START by checking conn->rx_len before touching conn->rx_skb, but
ISO_END does not.
If a peer sends an ISO_END as the first packet on a fresh ISO
connection, conn->rx_skb is still NULL and conn->rx_len is zero, so
skb_put(conn->rx_skb, ...) dereferences NULL and oopses. For BIS,
where receivers sync to a broadcaster without pairing, any broadcaster
on the air can trigger this.
Mirror the ISO_CONT check at the top of ISO_END so a stray end fragment
is logged and dropped instead of crashing the host.
Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Mon May 11 08:26:41 2026 -0400
Bluetooth: L2CAP: ecred_reconfigure: send packed pdu, not stack pointer
commit 3374ef8cf99368a40f7efd51a2a375a4c5dc6f0d upstream.
Commit 1c08108f3014 ("Bluetooth: L2CAP: Avoid -Wflex-array-member-not-at-end
warnings") converted the on-stack request PDU in l2cap_ecred_reconfigure()
from an explicit packed struct to DEFINE_RAW_FLEX(), but did not adjust the
size and source-pointer arguments to l2cap_send_cmd():
- struct {
- struct l2cap_ecred_reconf_req req;
- __le16 scid;
- } pdu;
+ DEFINE_RAW_FLEX(struct l2cap_ecred_reconf_req, pdu, scid, 1);
...
l2cap_send_cmd(conn, chan->ident, L2CAP_ECRED_RECONF_REQ,
sizeof(pdu), &pdu);
After the conversion, DEFINE_RAW_FLEX() expands to declare an anonymous
union pdu_u plus a local pointer "pdu" pointing at it. Therefore:
- sizeof(pdu) is now sizeof(struct l2cap_ecred_reconf_req *) = 8 on
64-bit (4 on 32-bit), not the 6 bytes of (mtu, mps, scid[1]).
- &pdu is the address of the local pointer's stack storage, not the
address of the request payload.
l2cap_send_cmd() forwards (data, count) to l2cap_build_cmd(), which calls
skb_put_data(skb, data, count). The L2CAP_ECRED_RECONFIGURE_REQ packet
body therefore contains 8 bytes copied from the kernel stack starting at
&pdu -- the 8 bytes overlap the pdu pointer's value, leaking a kernel
stack address to the paired Bluetooth peer. The intended (mtu, mps, scid)
fields are not transmitted at all, so the peer rejects the request as
malformed and the L2CAP_ECRED_RECONFIGURE feature itself has been broken
for the local-side initiator since the introducing commit landed.
The sibling site l2cap_ecred_conn_req() in the same commit was converted
correctly (sizeof(*pdu) + len, pdu); only this site was missed.
Restore the original semantics: pass the full flex-struct size via
struct_size(pdu, scid, 1) and the pdu pointer (the struct address) as
the source.
Validated on a stock 7.0-based host kernel via the real call path:
setsockopt(SOL_BLUETOOTH, BT_RCVMTU, ...) on a BT_CONNECTED
L2CAP_MODE_EXT_FLOWCTL socket emits an L2CAP_ECRED_RECONFIGURE_REQ
whose body is 8 bytes (the on-stack pdu local's value) rather than
the expected 6. Three captures from fresh socket / fresh hciemu peer
on the same host -- low bytes vary per call, high 0xffff confirms a
kernel virtual address (KASLR-randomised stack slot, not a fixed
string):
RECONF_REQ body (ident=0x02 len=8): 42 fb 54 af 0e ca ff ff
RECONF_REQ body (ident=0x02 len=8): 52 3d 2e af 0e ca ff ff
RECONF_REQ body (ident=0x02 len=8): b2 fc 5b af 0e ca ff ff
After this patch the body is 6 bytes carrying the expected
little-endian (mtu, mps, scid).
Cc: stable@vger.kernel.org
Fixes: 1c08108f3014 ("Bluetooth: L2CAP: Avoid -Wflex-array-member-not-at-end warnings")
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Fri May 15 10:38:19 2026 -0400
Bluetooth: MGMT: validate Add Extended Advertising Data length
commit d3f7d17960ed50df3a6709c5158caff989c8c905 upstream.
MGMT_OP_ADD_EXT_ADV_DATA is registered as a variable-length command,
with MGMT_ADD_EXT_ADV_DATA_SIZE as the fixed header size. The handler
then uses cp->adv_data_len and cp->scan_rsp_len to validate and copy
cp->data, but it never checks that those bytes are part of the mgmt
command payload.
A short command can therefore make add_ext_adv_data() pass an
out-of-bounds pointer into tlv_data_is_valid(). If the bytes beyond
the command buffer are addressable, they can also be copied into the
advertising instance as scan response data, where the caller can read
them back via MGMT_OP_GET_ADV_INSTANCE. The trigger requires
CAP_NET_ADMIN in the initial user namespace; KASAN reports an 8-byte
slab-out-of-bounds read.
Reject commands whose length does not match the fixed header plus both
advertising data lengths before parsing cp->data.
Fixes: 12410572833a ("Bluetooth: Break add adv into two mgmt commands")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jiexun Wang <wangjiexun2025@gmail.com>
Date: Wed May 6 19:43:30 2026 +0800
Bluetooth: serialize accept_q access
commit e83f5e24da741fa9405aeeff00b08c5ee7c37b88 upstream.
bt_sock_poll() walks the accept queue without synchronization, while
child teardown can unlink the same socket and drop its last reference.
The unsynchronized accept queue walk has existed since the initial
Bluetooth import.
Protect accept_q with a dedicated lock for queue updates and polling.
Also rework bt_accept_dequeue() to take temporary child references under
the queue lock before dropping it and locking the child socket.
Fixes: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Reported-by: Jann Horn <jannh@google.com>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xingwang Xiang <v3rdant.xiang@gmail.com>
Date: Sun May 17 23:56:26 2026 +0900
bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
[ Upstream commit ddf8029623a1af20e984c040e89ff918158397ab ]
sk_psock_strp_data_ready() already checks tls_sw_has_ctx_rx() and
defers to psock->saved_data_ready when a TLS RX context is present,
avoiding a conflict with the TLS strparser's ownership of the receive
queue (commit e91de6afa81c, "bpf: Fix running sk_skb program types
with ktls").
sk_psock_verdict_data_ready() has no equivalent guard. When a socket
is inserted into a sockmap (BPF_SK_SKB_VERDICT) before TLS RX is
configured, tls_sw_strparser_arm() saves sk_psock_verdict_data_ready
as rx_ctx->saved_data_ready. On data arrival:
tls_data_ready -> tls_strp_data_ready -> tls_rx_msg_ready
-> saved_data_ready() = sk_psock_verdict_data_ready()
-> tcp_read_skb() drains sk_receive_queue via __skb_unlink()
without calling tcp_eat_skb(), so copied_seq is not advanced.
tls_strp_msg_load() then finds tcp_inq() >= full_len (stale), calls
tcp_recv_skb() on the now-empty queue, hits WARN_ON_ONCE(!first), and
returns with rx_ctx->strp.anchor.frag_list pointing at a psock-owned
(potentially freed) skb. tls_decrypt_sg() subsequently walks that
frag_list: use-after-free.
Apply the same fix as sk_psock_strp_data_ready(): if a TLS RX context
is present, call psock->saved_data_ready (sock_def_readable) to wake
recv() waiters and return immediately, leaving the receive queue
untouched. TLS retains sole ownership of the queue and decrypts the
record normally through tls_sw_recvmsg().
Fixes: ef5659280eb1 ("bpf, sockmap: Allow skipping sk_skb parser program")
Signed-off-by: Xingwang Xiang <v3rdant.xiang@gmail.com>
Link: https://patch.msgid.link/20260517145630.20521-2-v3rdant.xiang@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ido Schimmel <idosch@nvidia.com>
Date: Sun May 17 15:11:21 2026 +0300
bridge: mcast: Fix a possible use-after-free when removing a bridge port
[ Upstream commit 4df78ff02629c7729168f0696a7a2123c389818d ]
When per-VLAN multicast snooping is enabled, the bridge iterates over
all the bridge ports, disables the per-port multicast context on each
port and enables the per-{port, VLAN} multicast contexts instead. The
reverse happens when per-VLAN multicast snooping is disabled.
When global multicast snooping is enabled, the bridge iterates over all
the bridge ports and enables the per-port multicast context on each
port. The reverse happens when multicast snooping is disabled.
The above scheme can result in a situation where both types of contexts
(per-port and per-{port, VLAN}) are enabled on a single bridge port:
# ip link add name br1 up type bridge mcast_snooping 1 mcast_querier 1 vlan_filtering 1
# ip link add name dummy1 up master br1 type dummy
# ip link set dev br1 type bridge mcast_vlan_snooping 1
# ip link set dev br1 type bridge mcast_snooping 0
# ip link set dev br1 type bridge mcast_snooping 1
This is not intended and it is a problem since the commit cited below.
Prior to this commit, when removing a bridge port,
br_multicast_disable_port() would disable the per-port multicast context
and the per-{port, VLAN} multicast contexts would get disabled when
flushing VLANs.
After this commit, br_multicast_disable_port() only disables the
per-port multicast context if per-VLAN multicast snooping is disabled.
If both types of contexts were enabled on the port when it was removed,
the per-port multicast context would remain enabled when freeing the
bridge port, leading to a use-after-free [1].
Fix by preventing the bridge from enabling / disabling the per-port
multicast contexts when toggling global multicast snooping if per-VLAN
multicast snooping is enabled.
[1]
ODEBUG: free active (active state 0) object: ffff88810f8bda78 object type: timer_list hint: br_ip6_multicast_port_query_expired (net/bridge/br_multicast.c:1927)
WARNING: lib/debugobjects.c:629 at debug_print_object+0x1b1/0x3e0, CPU#5: swapper/5/0
[...]
Call Trace:
<IRQ>
__debug_check_no_obj_freed (lib/debugobjects.c:1116)
kfree (mm/slub.c:2620 mm/slub.c:6250 mm/slub.c:6565)
kobject_cleanup (lib/kobject.c:689)
rcu_do_batch (kernel/rcu/tree.c:2617)
rcu_core (kernel/rcu/tree.c:2869)
handle_softirqs (kernel/softirq.c:622)
__irq_exit_rcu (kernel/softirq.c:656 kernel/softirq.c:496 kernel/softirq.c:735)
irq_exit_rcu (kernel/softirq.c:752)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1061 (discriminator 47) arch/x86/kernel/apic/apic.c:1061 (discriminator 47))
</IRQ>
Fixes: 4b30ae9adb04 ("net: bridge: mcast: re-implement br_multicast_{enable, disable}_port functions")
Reported-by: syzbot+ae231e0552fa77b26ea1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/87qznowlfs.ffs@tglx/
Reported-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260517121122.188333-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xiang Mei <xmei5@asu.edu>
Date: Fri Mar 27 23:30:00 2026 -0700
bridge: mrp: reject zero test interval to avoid OOM panic
[ Upstream commit fa6e24963342de4370e3a3c9af41e38277b74cf3 ]
br_mrp_start_test() and br_mrp_start_in_test() accept the user-supplied
interval value from netlink without validation. When interval is 0,
usecs_to_jiffies(0) yields 0, causing the delayed work
(br_mrp_test_work_expired / br_mrp_in_test_work_expired) to reschedule
itself with zero delay. This creates a tight loop on system_percpu_wq
that allocates and transmits MRP test frames at maximum rate, exhausting
all system memory and causing a kernel panic via OOM deadlock.
The same zero-interval issue applies to br_mrp_start_in_test_parse()
for interconnect test frames.
Use NLA_POLICY_MIN(NLA_U32, 1) in the nla_policy tables for both
IFLA_BRIDGE_MRP_START_TEST_INTERVAL and
IFLA_BRIDGE_MRP_START_IN_TEST_INTERVAL, so zero is rejected at the
netlink attribute parsing layer before the value ever reaches the
workqueue scheduling code. This is consistent with how other bridge
subsystems (br_fdb, br_mst) enforce range constraints on netlink
attributes.
Fixes: 20f6a05ef635 ("bridge: mrp: Rework the MRP netlink interface")
Fixes: 7ab1748e4ce6 ("bridge: mrp: Extend MRP netlink interface for configuring MRP interconnect")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260328063000.1845376-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Boris Burkov <boris@bur.io>
Date: Mon May 11 19:53:46 2026 -0700
btrfs: fix squota accounting during enable generation
[ Upstream commit d7c600554816b8ef70adffe078a0e360c055d82b ]
The first transaction that enables squotas is special and a bit tricky.
We have to set BTRFS_FS_QUOTA_ENABLED after the transaction to avoid a
deadlock, so any delayed refs that run before we set the bit are not
squota accounted. For data this is fine, we don't get an owner_ref, so
there is no real harm, it's as if the extent predated squotas. However
for metadata, the tree block will have gen == enable_gen so when we free
it later, we will decrement the squota accounting, which can result in
an underflow. Before it is freed, btrfs check shows errors, as we have
mismatched usage between the node generations/owners and the squota
values.
There are two angles to this fix:
1. For extents that come in delayed_refs that run during the
enable_gen transaction, we must actually set enable_gen to the *next*
transaction. That is the first transaction that we can really
properly account in any way.
2. For extents that come in between the end of our transaction handle
and the time we set the BTRFS_FS_QUOTA_ENABLED bit, we need an
additional bit, BTRFS_FS_SQUOTA_ENABLING which only affects recording
squota deltas, so we do pick up those extents. Otherwise, we would
miss them, even for enable_gen + 1.
Fixes: bd7c1ea3a302 ("btrfs: qgroup: check generation when recording simple quota delta")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Filipe Manana <fdmanana@suse.com>
Date: Tue Apr 28 16:58:56 2026 +0100
btrfs: tracepoints: fix sleep while in atomic context in btrfs_sync_file()
[ Upstream commit c73370c677646e86fc4b1780fb07027bdf847375 ]
The trace event btrfs_sync_file() is called in an atomic context (all trace
events are) and its call to dput(), which is needed due to the call to
dget_parent(), can sleep, triggering a kernel splat.
This can be reproduced by enabling the trace event and running btrfs/056
from fstests for example. The splat shown in dmesg is the following:
[53.919] BUG: sleeping function called from invalid context at fs/dcache.c:970
[53.947] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 32773, name: xfs_io
[53.988] preempt_count: 2, expected: 0
[53.967] RCU nest depth: 0, expected: 0
[53.943] Preemption disabled at:
[53.944] [<0000000000000000>] 0x0
[54.078] CPU: 0 UID: 0 PID: 32773 Comm: xfs_io Tainted: G W 7.1.0-rc1-btrfs-next-232+ #1 PREEMPT(full)
[54.070] Tainted: [W]=WARN
[54.071] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[54.072] Call Trace:
[54.074] <TASK>
[54.076] dump_stack_lvl+0x56/0x80
[54.079] __might_resched.cold+0xd6/0x10f
[54.072] dput.part.0+0x24/0x110
[54.078] trace_event_raw_event_btrfs_sync_file+0x75/0x140 [btrfs]
[54.089] btrfs_sync_file+0x1ed/0x530 [btrfs]
[54.087] ? __handle_mm_fault+0x8ae/0xed0
[54.089] btrfs_do_write_iter+0x172/0x210 [btrfs]
[54.091] vfs_write+0x21f/0x450
[54.094] __x64_sys_pwrite64+0x8d/0xc0
[54.096] ? do_user_addr_fault+0x20c/0x670
[54.099] do_syscall_64+0x60/0xf20
[54.092] ? clear_bhb_loop+0x60/0xb0
[54.094] entry_SYSCALL_64_after_hwframe+0x76/0x7e
So stop using dget_parent() and dput() and access the parent dentry
directly as dentry->d_parent. This is also what ext4 is doing in
its equivalent trace event ext4_sync_file_enter().
Fixes: a85b46db143f ("btrfs: tracepoints: get correct superblock from dentry in event btrfs_sync_file()")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guopeng Zhang <zhangguopeng@kylinos.cn>
Date: Fri May 22 09:03:03 2026 -0400
cgroup/cpuset: Reset DL migration state on can_attach() failure
[ Upstream commit 4a39eda5fdd867fc39f3c039714dd432cee00268 ]
cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
state in the destination cpuset while walking the taskset.
If a later task_can_attach() or security_task_setscheduler() check
fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
and does not call cpuset_cancel_attach() for it. The partially
accumulated state is then left behind and can be consumed by a later
attach, corrupting cpuset DL task accounting and pending DL bandwidth
accounting.
Reset the pending DL migration state from the common error exit when
ret is non-zero. Successful can_attach() keeps the state for
cpuset_attach() or cpuset_cancel_attach().
Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Cc: stable@vger.kernel.org # v6.10+
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
[ omitted upstream context line `cs->dl_bw_cpu = cpu;` ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zhihao Cheng <chengzhihao1@huawei.com>
Date: Tue May 19 17:18:05 2026 +0800
cifs: Fix busy dentry used after unmounting
commit c68337442f03953237a94577beb468ab2662a851 upstream.
Since commit 340cea84f691c ("cifs: open files should not hold ref on
superblock"), cifs file only holds the dentry ref_cnt, the cifs file
close work(cfile->deferred) could be executed after unmounting, which
will trigger a warning in generic_shutdown_super:
BUG: Dentry 00000000a14a6845{i=c,n=file} still in use (1) [unmount of
cifs cifs]
The detailed processs is:
process A process B kworker
fd = open(PATH)
vfs_open
file->__f_path = *path // dentry->d_lockref.count = 1
cifs_open
cifs_new_fileinfo
cfile->dentry = dget(dentry) // dentry->d_lockref.count = 2
close(fd)
__fput
cifs_close
queue_delayed_work(deferredclose_wq, cfile->deferred)
dput(dentry) // dentry->d_lockref.count = 1
smb2_deferred_work_close
_cifsFileInfo_put
list_del(&cifs_file->flist)
umount
cleanup_mnt
deactivate_super
cifs_kill_sb
cifs_close_all_deferred_files_sb
cifs_close_all_deferred_files
// cannot find cfile, skip _cifsFileInfo_put
kill_anon_super
generic_shutdown_super
shrink_dcache_for_umount
umount_check
WARN ! // dentry->d_lockref.count = 1
cifsFileInfo_put_final
dput(cifs_file->dentry)
// dentry->d_lockref.count = 0
Fix it by flushing 'deferredclose_wq' before calling kill_anon_super.
Fetch a reproducer in https://bugzilla.kernel.org/show_bug.cgi?id=221548.
Fixes: 340cea84f691c ("cifs: open files should not hold ref on superblock")
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Date: Wed May 6 13:57:00 2026 +0200
device property: set fwnode->secondary to NULL in fwnode_init()
commit 215c90ee656114f5e8c32408228d97082f8e0eef upstream.
If a firmware node is allocated on the stack (for instance: temporary
software node whose life-time we control) or on the heap - but using a
non-zeroing allocation function - and initialized using fwnode_init(),
its secondary pointer will contain uninitalized memory which likely will
be neither NULL nor IS_ERR() and so may end up being dereferenced (for
example: in dev_to_swnode()). Set fwnode->secondary to NULL on
initialization.
Cc: stable <stable@kernel.org>
Fixes: 01bb86b380a3 ("driver core: Add fwnode_init()")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://patch.msgid.link/20260506115701.23035-1-bartosz.golaszewski@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Muchun Song <muchun.song@linux.dev>
Date: Tue Apr 28 16:52:18 2026 +0800
drivers/base/memory: fix memory block reference leak in poison accounting
commit 03a2cc1756a0570f887d624cd6c535ea0cbd4951 upstream.
memblk_nr_poison_inc() and memblk_nr_poison_sub() look up a memory block
via find_memory_block_by_id(), which acquires a reference to the memory
block device.
Both helpers use the returned memory block without dropping that
reference, leaking the device reference on each successful lookup. Drop
the reference after updating nr_hwpoison.
Link: https://lore.kernel.org/20260428085219.1316047-3-songmuchun@bytedance.com
Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Huang, Ying" <huang.ying.caritas@gmail.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Harry Wentland <harry.wentland@amd.com>
Date: Mon May 4 11:14:45 2026 -0400
drm/amd/display: Fix integer overflow in bios_get_image()
commit cd86529ec61474a38c3837fb7823790a7c3f8cce upstream.
[Why&How]
The bounds check in bios_get_image() computes 'offset + size' using
unsigned 32-bit arithmetic before comparing against bios_size. If a
VBIOS image contains a near-UINT32_MAX offset the addition wraps to a
small value, the comparison passes, and the function returns a wild
pointer past the VBIOS mapping.
Additionally, the comparison uses '<' (strict), which incorrectly
rejects the valid exact-fit case where offset + size == bios_size.
Fix both issues by restructuring the check to avoid the addition
entirely: first reject if offset alone exceeds bios_size, then check
size against the remaining space (bios_size - offset). This eliminates
the overflow and correctly permits exact-fit accesses.
Assisted-by: GitHub Copilot:claude-opus-4.6
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d40fb392af659c4a02b560319f226842f6ec1a95)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Harry Wentland <harry.wentland@amd.com>
Date: Mon May 4 16:14:11 2026 -0400
drm/amd/display: Validate GPIO pin LUT table size before iterating
commit 86d2b20644b11d21fe52c596e6e922b4590a3e3f upstream.
[Why&How]
The GPIO pin table parsers in get_gpio_i2c_info() and
bios_parser_get_gpio_pin_info() derive an element count from the VBIOS
table_header.structuresize field, then iterate over gpio_pin[] entries.
However, GET_IMAGE() only validates that the table header itself fits
within the BIOS image. If the VBIOS reports a structuresize larger than
the actual mapped data, the loop reads past the end of the BIOS image,
causing an out-of-bounds read.
Fix this by calling bios_get_image() to validate that the full claimed
structuresize is accessible within the BIOS image before entering the
loop in both functions.
Assisted-by: GitHub Copilot:claude-opus-4-6
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba5e95b43b773ae1bf1f66ee6b31eb774e65afe3)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Harry Wentland <harry.wentland@amd.com>
Date: Thu May 7 16:26:31 2026 -0400
drm/amd/display: Validate payload length and link_index in dc_process_dmub_aux_transfer_async
commit 6c92f6d9600efa3ef0d9e560a2b52776d9803c29 upstream.
[Why&How]
dc_process_dmub_aux_transfer_async() copies payload->length bytes into a
16-byte stack buffer (dpaux.data[16]) guarded only by an ASSERT(), which
is a no-op in release builds. If a caller ever passes length > 16 this
results in a stack buffer overflow via memcpy.
Additionally, link_index is used to dereference dc->links[] without
bounds checking against dc->link_count, risking an out-of-bounds access.
Replace the ASSERT with a hard runtime check that returns false when
payload->length exceeds the destination buffer size, and add a bounds
check for link_index before it is used.
Assisted-by: GitHub Copilot:Claude claude-4-opus
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba4caa9fecdf7a38f98c878ad05a8a64148b6881)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alan Liu <haoping.liu@amd.com>
Date: Fri May 1 12:35:48 2026 +0800
drm/amdgpu/vpe: Force collaborate sync after TRAP
commit b6074630a461b1322a814988779005cbc43612ea upstream.
VPE1 could possibly hang and fail to power off at the end of commands in
collaboration mode. This workaround adds a COLLAB_SYNC after TRAP to
force instances synchronized to avoid VPE1 fail to power off.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alan liu <haoping.liu@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5171
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a8b749c5c5afb7e5daa2bfb95d958fb3c6b8f055)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Osama Abdelkader <osama.abdelkader@gmail.com>
Date: Thu Apr 30 21:49:42 2026 +0200
drm/bridge: chipone-icn6211: use devm_drm_bridge_add in i2c probe
commit 73d01051e8040c0b1de7fd26b3b8d0c2ffa6895c upstream.
Use devm_drm_bridge_add() so the bridge is released if probe
fails after registration, and drop drm_bridge_remove() in chipone_i2c_probe.
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Fixes: 8dde6f7452a1 ("drm: bridge: icn6211: Add I2C configuration support")
Cc: stable@vger.kernel.org
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Link: https://patch.msgid.link/20260430194944.78119-1-osama.abdelkader@gmail.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Julien Chauveau <chauveau.julien@gmail.com>
Date: Tue Mar 24 20:30:11 2026 +0100
drm/bridge: it66121: acquire reset GPIO in probe
commit e02b5262fd288cc235f14e12233ea54e78c04611 upstream.
The it66121_ctx structure has a gpio_reset field, and it66121_hw_reset()
calls gpiod_set_value() on it. However, the GPIO descriptor is never
acquired via devm_gpiod_get(), leaving gpio_reset as NULL throughout
the driver lifetime.
gpiod_set_value() silently returns when passed a NULL descriptor, so
the hardware reset sequence in it66121_hw_reset() is a no-op. This
leaves the chip in an undefined state at probe time, which can prevent
it from responding on the I2C bus.
The DT binding marks reset-gpios as a required property, so all
compliant device trees provide this GPIO. Add the missing
devm_gpiod_get() call after enabling power supplies and before the
hardware reset, so the chip is properly reset with power applied.
Fixes: 988156dc2fc9 ("drm: bridge: add it66121 driver")
Cc: stable@vger.kernel.org
Signed-off-by: Julien Chauveau <chauveau.julien@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Tested-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patch.msgid.link/20260324193011.16583-1-chauveau.julien@gmail.com
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Osama Abdelkader <osama.abdelkader@gmail.com>
Date: Thu Apr 30 21:56:59 2026 +0200
drm/bridge: megachips: remove bridge when irq request fails
commit d45d5c819f2cd0b6b5d76a194a537a5f4aeefecb upstream.
If devm_request_threaded_irq() fails after drm_bridge_add(), remove the
bridge before returning.
Keep drm_bridge_add() rather than devm_drm_bridge_add(): registration is
tied to the STDP4028 device while ge_b850v3_register() may complete from
either I2C probe; devm would not unwind the bridge if the other client's
probe fails.
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Fixes: fcfa0ddc18ed ("drm/bridge: Drivers for megachips-stdpxxxx-ge-b850v3-fw (LVDS-DP++)")
Cc: stable@vger.kernel.org
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Tested-by: Ian Ray <ian.ray@gehealthcare.com>
Link: https://patch.msgid.link/20260430195700.80317-1-osama.abdelkader@gmail.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Date: Mon May 11 18:02:15 2026 +0530
drm/i915/dp: Fix readback for target_rr in Adaptive Sync SDP
[ Upstream commit f87abd0c6604fb6cc31cc86fc7ccc6a576924352 ]
Correct the bit-shift logic to properly readback the 10 bit target_rr from
DB3 and DB4.
v2: Align the style with readback for vtotal. (Ville)
Fixes: 12ea89291603 ("drm/i915/dp: Add Read/Write support for Adaptive Sync SDP")
Cc: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com>
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260511123218.1589830-2-ankit.k.nautiyal@intel.com
(cherry picked from commit f7abc4af2b19240a145a221461dfe756cc01d74a)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alessio Belle <alessio.belle@imgtec.com>
Date: Tue May 26 09:13:07 2026 +0100
drm/imagination: Synchronize interrupts before suspending the GPU
commit 2d7f05cddf4c268cc36256a2476946041dbdd36d upstream.
The runtime PM suspend callback doesn't know whether the IRQ handler is
in progress on a different CPU core and doesn't wait for it to finish.
Depending on timing, the IRQ handler could be running while the GPU is
suspended, leading to it being killed when trying to access GPU
registers. See example signature below.
In a power off sequence initiated by the runtime PM suspend callback,
wait for any IRQ handlers in progress on other CPU cores to finish, by
calling synchronize_irq().
This version of the patch contains only the part of the upstream commit
that applies to 6.12; the rest was a revert of code added in 6.16.
The second paragraph above is different because on 6.12 this kind of bug
doesn't seem to crash the entire kernel, only the IRQ handler, leaving
the driver unusable in practice.
The crash signature below is also different, both because of the above,
and because there was no support for TI AM68 SK in 6.12.
Example signature on a TI AM62 SK platform:
[ 7827.189088] Internal error: synchronous external abort: 0000000096000010 [#1] PREEMPT SMP
[ 7827.197311] Modules linked in:
[ 7827.222015] CPU: 0 UID: 0 PID: 461 Comm: irq/405-gpu Tainted: G M 6.12.90 #5
[ 7827.230461] Tainted: [M]=MACHINE_CHECK
[ 7827.234203] Hardware name: Texas Instruments AM625 SK (DT)
[ 7827.239682] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 7827.246637] pc : pvr_device_irq_thread_handler+0x64/0x180 [powervr]
[ 7827.252941] lr : irq_thread_fn+0x2c/0xa8
[ 7827.256872] sp : ffff800082d8bd50
[ 7827.260179] x29: ffff800082d8bd70 x28: ffff8000800ce374 x27: ffff800081829cc0
[ 7827.267328] x26: ffff000004701e80 x25: ffff000005b884ac x24: ffff000005bd5780
[ 7827.274472] x23: ffff00000da40bc0 x22: ffff00000da40ba0 x21: ffff800082d8bd58
[ 7827.281614] x20: ffff00000da40000 x19: ffff000004701e80 x18: 08000000c6af9003
[ 7827.288750] x17: 0000000000000010 x16: 0000000000000068 x15: 0df234008df66400
[ 7827.295886] x14: 0000000000000000 x13: 000005c68f6e7191 x12: 000000000000025e
[ 7827.303020] x11: 00000000000000c0 x10: 0000000000000ac0 x9 : ffff800082d8bd00
[ 7827.310157] x8 : ffff000005bd62a0 x7 : ffff000077261380 x6 : 00000000000005c6
[ 7827.317292] x5 : 000000000000425e x4 : 0000000000000000 x3 : 0000000000000000
[ 7827.324428] x2 : 00000000000008a8 x1 : ffff800082d608a8 x0 : ffff000005bd5780
[ 7827.331568] Call trace:
[ 7827.334011] pvr_device_irq_thread_handler+0x64/0x180 [powervr]
[ 7827.339954] irq_thread_fn+0x2c/0xa8
[ 7827.343530] irq_thread+0x16c/0x2f4
[ 7827.347019] kthread+0x110/0x114
[ 7827.350248] ret_from_fork+0x10/0x20
[ 7827.353834] Code: f9446682 f943c281 b9404442 8b020021 (b9400021)
[ 7827.359921] ---[ end trace 0000000000000000 ]---
[ 7827.364820] genirq: exiting task "irq/405-gpu" (461) is an active IRQ thread (irq 405)
[ 8011.230278] powervr fd00000.gpu: Job timeout
[ 8011.230350] powervr fd00000.gpu: Job timeout
[ 8011.230426] powervr fd00000.gpu: Job timeout
Fixes: cc1aeedb98ad ("drm/imagination: Implement firmware infrastructure and META FW support")
Fixes: 96822d38ff57 ("drm/imagination: Handle Rogue safety event IRQs")
Cc: stable@vger.kernel.org
Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Link: https://patch.msgid.link/20260310-drain-irqs-before-suspend-v1-1-bf4f9ed68e75@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Tue Apr 28 20:21:38 2026 +0300
drm/msm/dsi: don't dump registers past the mapped region
[ Upstream commit 5b49a46baa853b26dbefa65c6c75dd9ff69f63d4 ]
On DSI 6G platforms the IO address space is internally adjusted by
io_offset. Later this adjusted address might be used for memory dumping.
However the size that is used for memory dumping isn't adjusted to
account for the io_offset, leading to the potential access to the
unmapped region. Lower ctrl_size by the io_offset value to prevent
access past the mapped area.
msm_disp_snapshot_add_block+0x1d4/0x3c8 [msm] (P)
msm_dsi_host_snapshot+0x4c/0x78 [msm]
msm_dsi_snapshot+0x28/0x50 [msm]
msm_disp_snapshot_capture_state+0x74/0x140 [msm]
msm_disp_snapshot_state_sync+0x60/0x90 [msm]
_msm_disp_snapshot_work+0x30/0x90 [msm]
kthread_worker_fn+0xdc/0x460
kthread+0x120/0x140
Fixes: bac2c6a62ed9 ("drm/msm: get rid of msm_iomap_size")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/721747/
Link: https://lore.kernel.org/r/20260428-msm-fix-dsi-dump-v1-1-5d4cb5ccfac7@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Sat May 16 14:53:45 2026 +0300
drm/msm/snapshot: fix dumping of the unaligned regions
[ Upstream commit 76824d2467feb1828b745d6add2541918d7be3da ]
The snapshotting code internally aligns data segment to 16 bytes. This
works fine for DPU code (where most of the regions are aligned), but
fails for snapshotting of the DSI data (because DSI data region is
shifted by 4 bytes). Fix the code by removing length alignment and by
accurately printing last registers in the region. While reworking the
code also fix the 16x memory overallocation in
msm_disp_state_dump_regs().
Fixes: 98659487b845 ("drm/msm: add support to take dpu snapshot")
Reported-by: Salendarsingh Gaud <sgaud@qti.qualcomm.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/725449/
Message-ID: <20260516-msm-fix-dsi-dump-2-v2-1-9e49fb2d240e@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mikko Perttunen <mperttunen@nvidia.com>
Date: Tue Apr 21 13:02:38 2026 +0900
drm/msm: Fix iommu_map_sgtable() return value check and avoid WARN
[ Upstream commit 55e0f0d1c1a4ee1e46da7da4d443eb3044fb3851 ]
Commit "iommu: return full error code from iommu_map_sg[_atomic]()"
changed iommu_map_sgtable() to return an ssize_t and negative values
in error cases, rather than a size_t and a zero.
Store the return value in the appropriate type and in case of error,
return it rather than WARNing.
Fixes: ad8f36e4b6b1 ("iommu: return full error code from iommu_map_sg[_atomic]()")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Patchwork: https://patchwork.freedesktop.org/patch/719685/
Message-ID: <20260421-iommu_map_sgtable-return-v1-3-fb484c07d2a1@nvidia.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Deepanshu Kartikey <kartikey406@gmail.com>
Date: Tue May 19 13:52:47 2026 +0530
drm/virtio: use uninterruptible resv lock for plane updates
commit 9af1b6e175c82daf4b423da339a722d8e67a735a upstream.
virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock
the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and
ignore its return value. The function can fail with -EINTR from
dma_resv_lock_interruptible() (signal during lock wait) or with
-ENOMEM from dma_resv_reserve_fences() (fence slot allocation),
leaving the resv lock not held. The queue path then walks the object
array and calls dma_resv_add_fence(), which requires the lock held;
with lockdep enabled this trips dma_resv_assert_held():
WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840
Call Trace:
virtio_gpu_array_add_fence
virtio_gpu_queue_ctrl_sgs
virtio_gpu_queue_fenced_ctrl_buffer
virtio_gpu_cursor_plane_update
drm_atomic_helper_commit_planes
drm_atomic_helper_commit_tail
commit_tail
drm_atomic_helper_commit
drm_atomic_commit
drm_atomic_helper_update_plane
__setplane_atomic
drm_mode_cursor_universal
drm_mode_cursor_common
drm_mode_cursor_ioctl
drm_ioctl
__x64_sys_ioctl
Beyond the WARN, mutating the dma_resv fence list without the lock
races with concurrent readers/writers and can corrupt the list.
Both call sites run inside the .atomic_update plane callback, which
DRM atomic helpers do not allow to fail (by the time it runs, the
commit has been signed off to userspace and there is no clean
rollback path). Moving the lock acquisition to .prepare_fb was
rejected because the broader lock scope deadlocks against other BO
locking paths in the same atomic commit.
Introduce virtio_gpu_lock_one_resv_uninterruptible() that uses
dma_resv_lock() instead of dma_resv_lock_interruptible(). This
eliminates the -EINTR failure mode -- the realistic syzbot trigger
-- without extending the lock hold across the commit. The helper
locks a single BO and rejects nents > 1 with -EINVAL; both fix
sites lock exactly one BO.
Use it from virtio_gpu_cursor_plane_update() and
virtio_gpu_resource_flush(); check the return value to handle the
remaining -ENOMEM case from dma_resv_reserve_fences() by freeing
the objs and skipping the plane update for that frame. The
framebuffer BOs touched here are not shared with other contexts
and lock contention is expected to be brief, so the loss of
signal-interruptibility is acceptable.
Other callers of virtio_gpu_array_lock_resv() (the ioctl paths)
continue to use the interruptible variant.
The bug was reported by syzbot, triggered via fault injection
(fail_nth) on the DRM_IOCTL_MODE_CURSOR path, which forces the
-ENOMEM branch in dma_resv_reserve_fences().
Reported-by: syzbot+72bd3dd3a5d5f39a0271@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=72bd3dd3a5d5f39a0271
Fixes: 5cfd31c5b3a3 ("drm/virtio: fix virtio_gpu_cursor_plane_update().")
Cc: stable@vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patch.msgid.link/20260519082247.34470-1-kartikey406@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Shuicheng Lin <shuicheng.lin@intel.com>
Date: Mon May 11 15:41:34 2026 +0000
drm/xe/gsc: Fix double-free of managed BO in error path
[ Upstream commit d3ded53fab90996e7d94a39049e11962dd066725 ]
The error path in xe_gsc_init_post_hwconfig() explicitly frees a BO
allocated with xe_managed_bo_create_pin_map() via
xe_bo_unpin_map_no_vm(). Since the managed BO already has a devm
cleanup action registered, this causes a double-free when devm
unwinds during probe failure.
Remove the explicit free and let devm handle it, consistent with
all other xe_managed_bo_create_pin_map() callers.
Fixes: 2e5d47fe7839 ("drm/xe/uc: Use managed bo for HuC and GSC objects")
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Assisted-by: Claude:claude-opus-4.6
Link: https://patch.msgid.link/20260511154134.223696-1-shuicheng.lin@intel.com
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
(cherry picked from commit 71d61e3e299a17139e47f980a4d6f425b2c59bf7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gustavo Sousa <gustavo.sousa@intel.com>
Date: Fri May 22 16:40:58 2026 -0300
drm/xe/hdcp: Add NULL check for media_gt in intel_hdcp_gsc_check_status()
commit 60a1e131a811b68703da58fd805ab359b704ab03 upstream.
When media GT is disabled via configfs, there is no allocation for
media_gt, which is kept as NULL. In such scenario,
intel_hdcp_gsc_check_status() results in a kernel pagefault error due to
>->uc.gsc being evaluated as an invalid memory address.
Fix that by introducing a NULL check on media_gt and bailing out early
if so.
While at it, also drop the NULL check for gsc, since it can't be NULL if
media_gt is not NULL.
v2:
- Get address for gsc only after checking that gt is not NULL.
(Shuicheng)
- Drop the NULL check for gsc. (Shuicheng)
v3:
- Add "Fixes" and "Cc: <stable...>" tags. (Matt)
Fixes: 4af50beb4e0f ("drm/xe: Use gsc_proxy_init_done to check proxy status")
Cc: <stable@vger.kernel.org> # v6.10+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260416-check-for-null-media_gt-in-intel_hdcp_gsc_check_status-v2-1-9adb9fd3b621@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
(cherry picked from commit bfaf87e84ca3ca3f6e275f9ae56da47a8b55ffd1)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shuicheng Lin <shuicheng.lin@intel.com>
Date: Thu May 14 20:32:10 2026 +0000
drm/xe/oa: Fix exec_queue leak on width check in stream open
[ Upstream commit 4d25342543c01310fc4e0cba7cb17c775e2421e2 ]
In xe_oa_stream_open_ioctl(), when param.exec_q->width > 1 the
function returns -EOPNOTSUPP directly, skipping the existing
err_exec_q cleanup path. The exec_queue reference obtained by
xe_exec_queue_lookup() is leaked.
The exec queue holds a reference on the xe_file, which is only
dropped during queue teardown. The leaked lookup ref is not on
the file's exec_queue xarray, so file close cannot release it.
This keeps both the exec queue and the file private state pinned
indefinitely.
Jump to err_exec_q instead of returning directly so the reference
is released.
Fixes: f0ed39830e60 ("xe/oa: Fix query mode of operation for OAR/OAC")
Assisted-by: Claude:claude-opus-4.6
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20260514203210.593488-1-shuicheng.lin@intel.com
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
(cherry picked from commit 339fa0be9e4a5d69fa47e91f4a36574224fb478f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mohanram Meenakshisundaram <mohanram.meenakshisundaram@intel.com>
Date: Thu May 14 23:19:18 2026 +0530
drm/xe/pf: Fix CFI failure in debugfs access
[ Upstream commit 96bf49b526e2d03a2b7f6e861925a08f46ed0d28 ]
Reading debugfs file (/sys/kernel/debug/dri/0/gt*/pf/adverse_events)
with CFI (Control Flow Integrity) enabled, the kernel panics at
xe_gt_debugfs_simple_show+0x82/0xc0.
xe_gt_debugfs_simple_show() declare a function pointer expecting int
return type, but xe_gt_sriov_pf_monitor_print_events() is void return
type, leading to CFI failure and kernel panic.
[507620.973657] CFI failure at xe_gt_debugfs_simple_show+0x82/0xc0 [xe]
(target: xe_gt_sriov_pf_monitor_print_events+0x0/0x130 [xe]; expected
type: 0xd72c7139)
Fix xe_gt_sriov_pf_monitor_print_events() function by updating to return
an int type.
Fixes: 1c99d3d3edab ("drm/xe/pf: Expose PF monitor details via debugfs")
Signed-off-by: Mohanram Meenakshisundaram <mohanram.meenakshisundaram@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260514174918.1556357-2-mohanram.meenakshisundaram@intel.com
(cherry picked from commit ff1d386a8359746d9699ac30336e3b0684c68958)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Wajdeczko <michal.wajdeczko@intel.com>
Date: Thu May 14 17:57:26 2026 +0200
drm/xe/vf: Fix signature of print functions
[ Upstream commit 9bb2f1d7e6e58b8e434ddc2048c661bf87ccdf2a ]
We have plugged-in existing VF print functions into our GT debugfs
show helper as-is, but we missed that the helper expects functions
to return int, while they were defined as void. This can lead to
errors being reported when CFI is enabled.
Fixes: 63d8cb8fe3dd ("drm/xe/vf: Expose SR-IOV VF attributes to GT debugfs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mohanram Meenakshisundaram <mohanram.meenakshisundaram@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260514155726.7165-1-michal.wajdeczko@intel.com
(cherry picked from commit 314e31c9a8a1c421ee4f7f755b9348aefbbca090)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ard Biesheuvel <ardb@kernel.org>
Date: Tue May 19 10:03:00 2026 +0200
efi: Allocate runtime workqueue before ACPI init
commit 13c6da02e767152c9ac4330962247a5e47011035 upstream.
Since commit
5894cf571e14 ("acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers")
ACPI PRM calls are delegated to a workqueue which runs in a kernel
thread, making it easier to detect and mitigate faulting memory accesses
performed by the firmware.
Rafael reports that such PRM accesses may occur before efisubsys_init()
executes, which is where the workqueue is allocated, leading to NULL
pointer dereferences. Since acpi_init() [which triggers the early PRM
accesses] executes as a subsys_initcall() as well, and has its own
dependencies that may be sensitive to initcall ordering, deferring
acpi_init() is not an option.
So instead, split off the workqueue allocation into its own postcore
initcall, as this is the only missing piece to allow EFI runtime calls
to be made. This ensures that EFI runtime call (including PRM calls) are
accessible to all code running at subsys_initcall() level.
Cc: <stable@vger.kernel.org>
Fixes: 5894cf571e14 ("acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers")
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chenguang Zhao <zhaochenguang@kylinos.cn>
Date: Mon May 11 09:43:43 2026 +0800
ethtool: fix ethnl_bitmap32_not_zero() bit interval semantics
[ Upstream commit 3d042592ebd4c7e44974d556de0b727cb7db4dab ]
ethnl_bitmap32_not_zero() should return true if some bit in [start, end)
is set:
- Fix inverted memchr_inv() sense: return true when the scan finds a
non-zero byte, not when the middle words are all zero.
- Return false for an empty interval (end <= start).
- When end is 32-bit aligned, indices in [start, end) do not include any
bits from map[end_word]; return false after earlier checks found no
non-zero data.
Fixes: 10b518d4e6dd ("ethtool: netlink bitset handling")
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Tue Apr 28 19:33:33 2026 +0100
firmware: arm_ffa: Align RxTx buffer size before mapping
[ Upstream commit 0399e3f872ca3d78044bb715a73ea645806d2c7b ]
Commit 83210251fd70 ("firmware: arm_ffa: Use the correct buffer size during
RXTX_MAP") advertises PAGE_ALIGN(rxtx_bufsz) to firmware when mapping the
buffers but the driver continues to stores the minimum FF-A buffer size
in drv_info->rxtx_bufsz which is used elsewhere in the driver.
Align the size before storing it so that the allocation, validation and
FFA_RXTX_MAP all use the same buffer size.
Fixes: 83210251fd70 ("firmware: arm_ffa: Use the correct buffer size during RXTX_MAP")
Cc: Sebastian Ene <sebastianene@google.com>
Link: https://sashiko.dev/#/patchset/20260402113939.930221-1-sebastianene@google.com
Reviewed-by: Sebastian Ene <sebastianene@google.com>
Link: https://patch.msgid.link/20260428-ffa_fixes-v2-9-8595ae450034@kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Mon Feb 17 15:38:59 2025 +0000
firmware: arm_ffa: Allow multiple UUIDs per partition to register SRI callback
[ Upstream commit be61da938576671c664382a059f961d7b4b2fc41 ]
A partition can implement multiple UUIDs and currently we successfully
register each UUID service as a FF-A device. However when adding the
same partition info to the XArray which tracks the SRI callbacks more
than once, it fails.
In order to allow multiple UUIDs per partition to register SRI callbacks
the partition information stored in the XArray needs to be extended to
a listed list.
A function to remove the list of partition information in the XArray
is not added as there are no users at the time. All the partitions are
added at probe/initialisation and removed at cleanup stage.
Tested-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <20250217-ffa_updates-v3-18-bd1d9de615e7@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Stable-dep-of: 6d3daa9b8d31 ("firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Tue Apr 28 19:33:25 2026 +0100
firmware: arm_ffa: Check for NULL FF-A ID table while driver registration
[ Upstream commit 0a5e695095c557d2380131b613dea4e8d90371be ]
The bus match callback assumes that every FF-A driver provides an
id_table and dereferences it unconditionally. Enforce that contract at
registration time so a buggy client driver cannot crash the bus during
match.
Fixes: 92743071464f ("firmware: arm_ffa: Ensure drivers provide a probe function")
Link: https://patch.msgid.link/20260428-ffa_fixes-v2-1-8595ae450034@kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Tue Apr 28 19:33:28 2026 +0100
firmware: arm_ffa: Fix per-vcpu self notifications handling in workqueue
[ Upstream commit 9985d5357ed93af0d1933969c247e966957730e1 ]
Per-vcpu notification handling already runs from a per-cpu work item on
the target cpu. Routing that path back through smp_call_function_single()
re-enters the call-function IPI path and executes the notification
handler with interrupts disabled. That makes the framework path unsafe,
since it takes a mutex, allocates memory with GFP_KERNEL, and invokes
client callbacks.
Handle per-vcpu self notifications directly from the existing per-cpu
work item instead. This keeps the per-vcpu path in task context and
avoids the extra IPI hop entirely.
Fixes: 3a3e2b83e805 ("firmware: arm_ffa: Avoid queuing work when running on the worker queue")
Link: https://patch.msgid.link/20260428-ffa_fixes-v2-4-8595ae450034@kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Tue Apr 28 19:33:35 2026 +0100
firmware: arm_ffa: Fix sched-recv callback partition lookup
[ Upstream commit a6848a50404eefb6f0b131c21881a2d8d21b31a9 ]
ffa_sched_recv_cb_update() used list_for_each_entry_safe() to search for
a matching partition and then tested the iterator against NULL. That is
not a valid end-of-list check for circular lists and can fall through
with an invalid pointer. Use a normal iterator and detect the not-found
case correctly before touching the partition state.
Fixes: be61da938576 ("firmware: arm_ffa: Allow multiple UUIDs per partition to register SRI callback")
Link: https://patch.msgid.link/20260428-ffa_fixes-v2-11-8595ae450034@kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Viresh Kumar <viresh.kumar@linaro.org>
Date: Mon Feb 17 15:38:47 2025 +0000
firmware: arm_ffa: Refactor addition of partition information into XArray
[ Upstream commit 3c3d6767466ea316869c9f2bdd976aec8ce44545 ]
Move the common code handling addition of the FF-A partition information
into the XArray as a new routine. No functional change.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <20250217-ffa_updates-v3-6-bd1d9de615e7@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Stable-dep-of: 6d3daa9b8d31 ("firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Mon Feb 17 15:38:54 2025 +0000
firmware: arm_ffa: Remove unnecessary declaration of ffa_partitions_cleanup()
[ Upstream commit 9982cabf403fbd06a120a2d5b21830effd32b370 ]
In order to keep the uniformity, just move the ffa_partitions_cleanup()
before it's first usage and drop the unnecessary forward declaration.
No functional change.
Tested-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <20250217-ffa_updates-v3-13-bd1d9de615e7@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Stable-dep-of: 6d3daa9b8d31 ("firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Tue Apr 28 19:33:26 2026 +0100
firmware: arm_ffa: Skip free_pages on RX buffer alloc failure
[ Upstream commit 09527e2c534911619d7e098729711100290bc3e1 ]
If the RX buffer allocation fails in ffa_init(), the error path jumps to
free_pages even though no buffer has been allocated yet. Route that case
directly to free_drv_info so the cleanup path is only used after at
least one RX/TX buffer allocation has succeeded.
Fixes: 3bbfe9871005 ("firmware: arm_ffa: Add initial Arm FFA driver support")
Link: https://patch.msgid.link/20260428-ffa_fixes-v2-2-8595ae450034@kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Tue Apr 28 19:33:29 2026 +0100
firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0
[ Upstream commit 6d3daa9b8d313f42d52e75590310f26a29b61b44 ]
For FF-A v1.0 the driver registers a bus notifier to backfill UUID
matching, but the notifier was never unregistered on cleanup paths.
Track the registration state and unregister it during teardown and early
partition-setup failure.
Fixes: 9dd15934f60d ("firmware: arm_ffa: Move the FF-A v1.0 NULL UUID workaround to bus notifier")
Link: https://patch.msgid.link/20260428-ffa_fixes-v2-5-8595ae450034@kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sudeep Holla <sudeep.holla@kernel.org>
Date: Mon Feb 17 15:38:49 2025 +0000
firmware: arm_ffa: Unregister the FF-A devices when cleaning up the partitions
[ Upstream commit 46dcd68aaccac0812c12ec3f4e59c8963e2760ad ]
Both the FF-A core and the bus were in a single module before the
commit 18c250bd7ed0 ("firmware: arm_ffa: Split bus and driver into distinct modules").
The arm_ffa_bus_exit() takes care of unregistering all the FF-A devices.
Now that there are 2 distinct modules, if the core driver is unloaded and
reloaded, it will end up adding duplicate FF-A devices as the previously
registered devices weren't unregistered when we cleaned up the modules.
Fix the same by unregistering all the FF-A devices on the FF-A bus during
the cleaning up of the partitions and hence the cleanup of the module.
Fixes: 18c250bd7ed0 ("firmware: arm_ffa: Split bus and driver into distinct modules")
Tested-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <20250217-ffa_updates-v3-8-bd1d9de615e7@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Stable-dep-of: 6d3daa9b8d31 ("firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Date: Thu May 28 11:23:27 2026 +0800
fs/ntfs3: handle attr_set_size() errors when truncating files
[ Upstream commit 576248a34b927e93b2fd3fff7df735ba73ad7d01 ]
If attr_set_size() fails while truncating down, the error is silently
ignored and the inode may be left in an inconsistent state.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
[ Minor context conflict resolved. ]
Signed-off-by: Bin Lan <lanbincn@139.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Date: Thu May 21 10:42:16 2026 +0200
gpio: cdev: check if uAPI v2 config attributes are correctly zeroed
[ Upstream commit 3e6ccd790ed69bedd3d9626d01dd35cf9821c121 ]
We check the padding of other uAPI v2 structures but not that of line
config attributes. For used attributes: check if their padding is
zeroed, for unused: check if the entire structure is zeroed.
Fixes: 3c0d9c635ae2 ("gpiolib: cdev: support GPIO_V2_GET_LINE_IOCTL and GPIO_V2_LINE_GET_VALUES_IOCTL")
Reviewed-by: Kent Gibson <warthog618@gmail.com>
Link: https://patch.msgid.link/20260521-gpio-cdev-attr-padding-check-v3-1-ec3bcbe2e358@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andy.shevchenko@gmail.com>
Date: Sun Nov 10 22:16:15 2024 +0200
gpiolib: cdev: use !mem_is_zero() instead of memchr_inv(s, 0, n)
[ Upstream commit e106b1dd38e723ec2bb2bf57ea9b2aff464b9423 ]
Use the mem_is_zero() helper where possible.
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20241110201706.16614-1-andy.shevchenko@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Stable-dep-of: 3e6ccd790ed6 ("gpio: cdev: check if uAPI v2 config attributes are correctly zeroed")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Date: Thu Feb 5 09:11:31 2026 +0100
HID: quirks: really enable the intended work around for appledisplay
[ Upstream commit 5f90dcfa8dc32a488581b78e575cdd7808ba5c78 ]
Commit c7fabe4ad921 ("HID: quirks: work around VID/PID conflict for
appledisplay") intends to add a quirk for kernels built with Apple Cinema
Display support, but it refers to the non-existing config option
CONFIG_APPLEDISPLAY, whereas the config option for Apple Cinema Display
support is named CONFIG_USB_APPLEDISPLAY.
Refer to the intended config option CONFIG_USB_APPLEDISPLAY in the ifdef
directive.
Fixes: c7fabe4ad921 ("HID: quirks: work around VID/PID conflict for appledisplay")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Tue Apr 28 10:33:16 2026 +0200
HID: uclogic: Fix regression of input name assignment
[ Upstream commit 487359284509a6745e14b8c0518768bc277809b0 ]
The previous fix for adding the devm_kasprintf() return check in the
commit bd07f751208b ("HID: uclogic: Add NULL check in
uclogic_input_configured()") changed the condition of hi->input->name
assignment, and it resulted in missing the proper input device name
when no custom suffix is defined.
Restore the conditional to the original content to address the
regression.
Fixes: bd07f751208b ("HID: uclogic: Add NULL check in uclogic_input_configured()")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Fri May 15 15:11:51 2026 -0700
hwmon: (pmbus/adm1266) bounce blackbox records through a protocol-sized buffer
commit 43cae21424ff8e33894a0f86c6b80b840c049fd7 upstream.
adm1266_pmbus_block_xfer() copies the device-supplied block payload
into the caller-provided buffer using the device-supplied length:
memcpy(data_r, &msgs[1].buf[1], msgs[1].buf[0]);
The helper does not know how large data_r is and trusts the device to
return at most one record's worth of bytes. adm1266_nvmem_read_blackbox()
violates that contract: it advances read_buff inside data->dev_mem in
ADM1266_BLACKBOX_SIZE (64-byte) strides while the helper is willing to
write up to ADM1266_PMBUS_BLOCK_MAX (255) bytes. A device that returns
more than 64 bytes on the trailing record (read_buff offset 1984 in
the 2048-byte dev_mem allocation) overflows dev_mem by up to 191 bytes
before the post-call
if (ret != ADM1266_BLACKBOX_SIZE)
return -EIO;
can reject the response.
Contain the fix in the caller without changing the helper signature:
read each record into a 255-byte local bounce buffer that matches the
helper's maximum output, validate the returned length, and only then
copy exactly ADM1266_BLACKBOX_SIZE bytes into the dev_mem slot.
Fixes: 407dc802a9c0 ("hwmon: (pmbus/adm1266) Add Block process call")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Link: https://lore.kernel.org/r/20260515-adm1266-fixes-v1-5-1c1ea1349cfe@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Mon May 18 17:52:25 2026 -0700
hwmon: (pmbus/adm1266) cap PDIO scan in get_multiple at ADM1266_PDIO_NR
commit d7834d92251baade796812876e95555e2066fa9f upstream.
adm1266_gpio_get_multiple() iterates the PDIO portion of the
caller-supplied mask using
for_each_set_bit_from(gpio_nr, mask,
ADM1266_GPIO_NR + ADM1266_PDIO_STATUS) {
...
}
where ADM1266_PDIO_STATUS is the PMBus command code (0xE9, i.e. 233),
not the number of PDIO pins. The intended upper bound is
ADM1266_GPIO_NR + ADM1266_PDIO_NR = 25.
gpiolib hands in a mask sized for gc.ngpio (= 25 bits on this chip),
so the iteration walks find_next_bit() up to 242, reading up to 217
extra bits (a handful of unsigned-long words: four on 64-bit, seven
on 32-bit) of whatever lives past the end of the mask in the
caller's stack. Any incidental set bit in that range then drives a
set_bit(gpio_nr, bits) call that writes past the end of the
caller-supplied bits array too -- both out-of-bounds.
Substitute ADM1266_PDIO_NR for the constant so the scan stops at the
last real PDIO bit.
Fixes: d98dfad35c38 ("hwmon: (pmbus/adm1266) Add support for GPIOs")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://lore.kernel.org/r/20260518-adm1266-gpio-fixes-v3-1-e425e4f88139@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Mon May 18 17:52:26 2026 -0700
hwmon: (pmbus/adm1266) don't clobber GPIO bits before PDIO read in get_multiple
commit 3327a12aee9e10ffa903e28b8445dfd1af5307c0 upstream.
adm1266_gpio_get_multiple() zeroes *bits before the GPIO_STATUS loop
and then a second time before the PDIO_STATUS loop:
*bits = 0;
for_each_set_bit(gpio_nr, mask, ADM1266_GPIO_NR) {
...
set_bit(gpio_nr, bits);
}
ret = i2c_smbus_read_block_data(data->client, ADM1266_PDIO_STATUS, ...);
...
*bits = 0;
for_each_set_bit_from(gpio_nr, mask, ADM1266_GPIO_NR + ADM1266_PDIO_NR) {
...
set_bit(gpio_nr, bits);
}
The second *bits = 0 throws away every GPIO bit the first loop just
populated, so callers asking for any combination of GPIO and PDIO
pins always see the GPIO portion of the returned bits as zero.
Drop the redundant second assignment so both halves of the result
survive.
Fixes: d98dfad35c38 ("hwmon: (pmbus/adm1266) Add support for GPIOs")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://lore.kernel.org/r/20260518-adm1266-gpio-fixes-v3-2-e425e4f88139@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Fri May 15 15:11:50 2026 -0700
hwmon: (pmbus/adm1266) include PEC byte in pmbus_block_xfer read buffer
commit 487566cb1ccdf3756fdd7bf8d875e612ff3169bb upstream.
adm1266_pmbus_block_xfer() sets up the read transaction with
.buf = data->read_buf,
.len = ADM1266_PMBUS_BLOCK_MAX + 2,
but read_buf in struct adm1266_data is declared as
u8 read_buf[ADM1266_PMBUS_BLOCK_MAX + 1];
For a max-length block response (length byte = 255 + up to 1 PEC
byte), the i2c controller is told to write 257 bytes into a 256-byte
buffer, putting one byte past the end of read_buf. The same response
also makes the subsequent PEC compare
if (crc != msgs[1].buf[msgs[1].buf[0] + 1])
read a byte beyond the array.
Bump the read_buf declaration to ADM1266_PMBUS_BLOCK_MAX + 2 so the
buffer can hold the length byte, up to 255 payload bytes, and the PEC
byte the i2c_msg length already accounts for.
Fixes: 407dc802a9c0 ("hwmon: (pmbus/adm1266) Add Block process call")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Link: https://lore.kernel.org/r/20260515-adm1266-fixes-v1-4-1c1ea1349cfe@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Mon May 18 17:52:28 2026 -0700
hwmon: (pmbus/adm1266) register the gpio_chip after pmbus_do_probe()
commit 491403b9b76cf66abd81301c5901aa4a4549f1e8 upstream.
adm1266_probe() calls adm1266_config_gpio() -- which goes on to
devm_gpiochip_add_data() and exposes the gpio_chip callbacks to
gpiolib -- before pmbus_do_probe() has initialised the per-client
PMBus state (notably the pmbus_lock mutex the core hands out via
pmbus_get_data()).
That ordering is already a latent hazard: any GPIO access that lands
between adm1266_config_gpio() and the end of pmbus_do_probe() (for
example a sysfs read from a user space agent that opens the gpiochip
the instant gpiolib advertises it) races pmbus_do_probe()'s own
device accesses with no serialisation.
Move adm1266_config_gpio() down past pmbus_do_probe() so the chip
isn't reachable from userspace until the PMBus state it depends on
is fully initialised.
Fixes: d98dfad35c38 ("hwmon: (pmbus/adm1266) Add support for GPIOs")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518-adm1266-gpio-fixes-v3-4-e425e4f88139@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Mon May 18 17:52:29 2026 -0700
hwmon: (pmbus/adm1266) register the nvmem device after pmbus_do_probe()
commit 6af713af91d5c34ec049eb3cc2c5b3f5eba953b8 upstream.
adm1266_probe() calls adm1266_config_nvmem() -- which goes on to
devm_nvmem_register() and exposes adm1266_nvmem_read() to userspace --
before pmbus_do_probe() has initialised the per-client PMBus state.
Same latent hazard as the gpio_chip one fixed in the previous patch:
once the nvmem device is registered, gpiolib's nvmem char-dev / sysfs
interface is reachable, and any concurrent read triggers
adm1266_nvmem_read() -> adm1266_nvmem_read_blackbox(), which issues
PMBus traffic that races pmbus_do_probe()'s own device accesses with
no serialisation.
Move adm1266_config_nvmem() down past pmbus_do_probe() so the nvmem
device isn't reachable from userspace until the PMBus state the
nvmem accessors depend on is fully initialised.
Fixes: 15609d189302 ("hwmon: (pmbus/adm1266) read blackbox")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Link: https://lore.kernel.org/r/20260518-adm1266-gpio-fixes-v3-5-e425e4f88139@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Fri May 15 15:11:49 2026 -0700
hwmon: (pmbus/adm1266) reject implausible blackbox record_count
commit 4afca954622d672ea65ed961bed01cf91caa034e upstream.
adm1266_nvmem_read_blackbox() loops over a record_count that comes
straight from byte 3 of the BLACKBOX_INFO response. The destination
buffer is data->dev_mem, sized for the nvmem cell's declared 2048
bytes (ADM1266_BLACKBOX_MAX_RECORDS * ADM1266_BLACKBOX_SIZE = 32 * 64).
A device that reports a record_count greater than 32 -- whether due
to firmware bugs, bus corruption, or a non-responsive slave returning
0xff -- would walk read_buff past the end of the dev_mem allocation
on the trailing iterations.
Cap record_count at ADM1266_BLACKBOX_MAX_RECORDS (introduced here)
before entering the loop and return -EIO on any larger value, so a
malformed BLACKBOX_INFO response cannot drive the loop out of bounds.
Fixes: 15609d189302 ("hwmon: (pmbus/adm1266) read blackbox")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Link: https://lore.kernel.org/r/20260515-adm1266-fixes-v1-3-1c1ea1349cfe@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Mon May 18 17:52:27 2026 -0700
hwmon: (pmbus/adm1266) reject short block-read responses in the GPIO accessors
commit a7232f68c43ca62f545049b7f5fbfc75137b843b upstream.
adm1266_gpio_get() and adm1266_gpio_get_multiple() both compose the
pin-status word as
pins_status = read_buf[0] + (read_buf[1] << 8);
right after i2c_smbus_read_block_data(), guarding only against an
error return. A well-behaved device returns 2 bytes for
GPIO_STATUS/PDIO_STATUS, but the helper happily reports a 0- or
1-byte response too. If the device returns 0 bytes, both read_buf
slots are uninitialized stack memory; if it returns 1 byte, read_buf[1]
is.
The composed value then flows through set_bit() into the caller's
*bits in adm1266_gpio_get_multiple(), or into the return value of
adm1266_gpio_get(), and ends up in userspace via gpiolib (sysfs and
the char-dev ioctls). That leaks a few bits of kernel stack per
request on any device whose firmware glitch, bus error, or hostile
slave produces a short block-read response.
Add the missing length check to both call sites and surface a short
response as -EIO.
Fixes: d98dfad35c38 ("hwmon: (pmbus/adm1266) Add support for GPIOs")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518-adm1266-gpio-fixes-v3-3-e425e4f88139@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Fri May 15 15:11:47 2026 -0700
hwmon: (pmbus/adm1266) seed timestamp from the real-time clock
commit b86095e3d7dcf2bf80c747349a35912a87a85098 upstream.
adm1266_set_rtc() seeds the chip's SET_RTC register from
ktime_get_seconds(), which returns CLOCK_MONOTONIC -- i.e. seconds
since the host last booted, not seconds since the Unix epoch.
The chip stamps that value into every blackbox record it captures.
Userspace reading those timestamps back expects wall-clock seconds:
that's what the SET_RTC frame layout documents (datasheet Rev. D,
Table 84) and what every other consumer of "seconds since epoch"
assumes. Seeding from CLOCK_MONOTONIC gives blackbox records a
timestamp that is only meaningful within a single boot of the host
and silently resets to small values on every reboot.
Switch to ktime_get_real_seconds() so the seed matches what the
register is documented to hold.
Fixes: 15609d189302 ("hwmon: (pmbus/adm1266) read blackbox")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Link: https://lore.kernel.org/r/20260515-adm1266-fixes-v1-1-1c1ea1349cfe@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Date: Fri May 15 15:11:48 2026 -0700
hwmon: (pmbus/adm1266) widen blackbox-info buffer to I2C_SMBUS_BLOCK_MAX
commit eee213daa1e1b402eb631bcd1b8c5aa340a6b081 upstream.
adm1266_nvmem_read_blackbox() declares a 5-byte stack buffer and
passes it to i2c_smbus_read_block_data() to retrieve the 4-byte
BLACKBOX_INFO response. i2c_smbus_read_block_data() does not honour
caller buffer sizes -- it memcpy()s data.block[0] bytes from the
SMBus transaction (where data.block[0] is the length byte returned by
the slave device, up to I2C_SMBUS_BLOCK_MAX = 32):
memcpy(values, &data.block[1], data.block[0]);
If the device returns any block length above 5, the call overflows
the caller's 5-byte stack buffer before the post-call
if (ret != 4)
return -EIO;
check has a chance to reject the response.
Widen the local buffer to I2C_SMBUS_BLOCK_MAX so the helper has room
for any well-formed SMBus block response, matching the convention used
by the other i2c_smbus_read_block_data() callers in this driver.
Fixes: 15609d189302 ("hwmon: (pmbus/adm1266) read blackbox")
Cc: stable@vger.kernel.org
Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai>
Link: https://lore.kernel.org/r/20260515-adm1266-fixes-v1-2-1c1ea1349cfe@nexthop.ai
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Guenter Roeck <linux@roeck-us.net>
Date: Wed May 27 10:22:30 2026 +0800
hwmon: (pmbus/core) Protect regulator operations with mutex
[ Upstream commit 754bd2b4a084b90b5e7b630e1f423061a9b9b761 ]
The regulator operations pmbus_regulator_get_voltage(),
pmbus_regulator_set_voltage(), and pmbus_regulator_list_voltage()
access PMBus registers and shared data but were not protected by
the update_lock mutex. This could lead to race conditions.
However, adding mutex protection directly to these functions causes
a deadlock because pmbus_regulator_notify() (which calls
regulator_notifier_call_chain()) is often called with the mutex
already held (e.g., from pmbus_fault_handler()). If a regulator
callback then calls one of the now-protected voltage functions,
it will attempt to acquire the same mutex.
Rework pmbus_regulator_notify() to utilize a worker function to
send notifications outside of the mutex protection. Events are
stored as atomics in a per-page bitmask and processed by the worker.
Initialize the worker and its associated data during regulator
registration, and ensure it is cancelled on device removal using
devm_add_action_or_reset().
While at it, remove the unnecessary include of linux/of.h.
Cc: Sanman Pradhan <psanman@juniper.net>
Fixes: ddbb4db4ced1b ("hwmon: (pmbus) Add regulator support")
Reviewed-by: Sanman Pradhan <psanman@juniper.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Fang Wang <32840572@qq.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bart Van Assche <bvanassche@acm.org>
Date: Wed May 6 14:48:15 2026 -0700
ice: fix locking in ice_dcb_rebuild()
[ Upstream commit 0ded1f36ba4021cba50513e80be6b6e173710168 ]
Move the mutex_lock() call up to prevent that DCB settings change after
the first ice_query_port_ets() call. The second ice_query_port_ets()
call in ice_dcb_rebuild() is already protected by pf->tc_mutex.
This also fixes a bug in an error path, as before taking the first
"goto dcb_error" in the function jumped over mutex_lock() to
mutex_unlock().
This bug has been detected by the clang thread-safety analyzer.
Cc: intel-wired-lan@lists.osuosl.org
Fixes: 242b5e068b25 ("ice: Fix DCB rebuild after reset")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-6-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marcin Szycik <marcin.szycik@intel.com>
Date: Fri May 15 11:24:10 2026 -0700
ice: fix setting promisc mode while adding VID filter
commit ebc8de716c9ec2be384abdc2dd866da26c6580d1 upstream.
There are at least two paths through which VSI promiscuous mode can be
independently configured via ice_fltr_set_vsi_promisc():
- ice_vlan_rx_add_vid() (netdev op)
- ice_service_task() -> ... -> ice_set_promisc()
Both paths may try to program promiscuous mode concurrently. One such
scenario is:
1. Add ice netdev to bond
2. Add the bond netdev to bridge
3. ice netdev enters allmulticast mode (IFF_ALLMULTI)
4. Service task programs promisc mode filter
5. Bridge -> bond calls ice_vlan_rx_add_vid()
Crucially, ice_vlan_rx_add_vid() fails if ice_fltr_set_vsi_promisc()
returns any error, including -EEXIST. This causes VLAN filtering setup
to fail on the bond interface. ice_set_promisc() already handles -EEXIST
correctly.
Fix by adding the same -EEXIST check to ice_vlan_rx_add_vid(): if the
promisc filter is already programmed, continue without returning error.
Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling")
Cc: stable@vger.kernel.org
Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20260515182419.1597859-4-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Marcin Szycik <marcin.szycik@linux.intel.com>
Date: Wed May 6 14:48:14 2026 -0700
ice: fix setting RSS VSI hash for E830
[ Upstream commit b3cda96feb60d91fe88d52b974ff110dcfa91239 ]
ice_set_rss_hfunc() performs a VSI update, in which it sets hashing
function, leaving other VSI options unchanged. However, ::q_opt_flags is
mistakenly set to the value of another field, instead of its original
value, probably due to a typo. What happens next is hardware-dependent:
On E810, only the first bit is meaningful (see
ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different
state than before VSI update.
On E830, some of the remaining bits are not reserved. Setting them
to some unrelated values can cause the firmware to reject the update
because of invalid settings, or worse - succeed.
Reproducer:
sudo ethtool -X $PF1 equal 8
Output in dmesg:
Failed to configure RSS hash for VSI 6, error -5
Fixes: 352e9bf23813 ("ice: enable symmetric-xor RSS for Toeplitz hash function")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-5-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Grzegorz Nitka <grzegorz.nitka@intel.com>
Date: Fri May 15 11:24:11 2026 -0700
ice: ptp: serialize E825 PHY timer start with PTP lock
[ Upstream commit 781ff8f2d575a794a2a4f11605288ae06757f5eb ]
ice_start_phy_timer_eth56g() programs TIMETUS registers and issues
INIT_INCVAL without holding the global PTP semaphore.
This allows concurrent PTP command paths to interleave with PHY timer
start, which can make the sequence fail and leave timer initialization
inconsistent.
Take the PTP lock around TIMETUS registers programming and INIT_INCVAL
command execution, and make sure the lock is released on all error paths.
Keep the subsequent sync step outside of this critical section, since
ice_sync_phy_timer_eth56g() takes the same semaphore internally.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Reviewed-by: Arkadiusz Kubalewski <Arkadiusz.kubalewski@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Alexander Nowlin <alexander.nowlin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20260515182419.1597859-5-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Grzegorz Nitka <grzegorz.nitka@intel.com>
Date: Fri May 15 11:24:13 2026 -0700
ice: restore PTP Rx timestamp config after ethtool set-channels
commit 975b564d195b13ca6ee1ef5e6a9561734898eb17 upstream.
When ethtool -L changes queue counts, ice_vsi_recfg_qs() closes and
rebuilds the VSI, reallocating Rx rings. The newly allocated rings have
ptp_rx cleared, so RX hardware timestamps are no longer attached to skb
until hwtstamp configuration is applied again.
Restore timestamp mode after ice_vsi_open() in the queue reconfiguration
path, matching reset/rebuild behavior and ensuring newly rebuilt Rx rings
have PTP RX timestamping re-enabled.
Testing hints:
- run ptp4l application in client synchronization mode:
ptp4l -i ethX -m -s
- run PTP traffic
- change queue number on ethX netdev interface:
ethtool -L ethX combined new_queue_size
- observe ptp4l output
- expected result: no "received DELAY_REQ without timestamp" messages
Fixes: 77a781155a65 ("ice: enable receive hardware timestamping")
Cc: stable@vger.kernel.org
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Alexander Nowlin <alexander.nowlin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20260515182419.1597859-7-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jens Axboe <axboe@kernel.dk>
Date: Fri May 15 10:19:09 2026 -0600
io_uring/net: punt IORING_OP_BIND async if it needs file create
[ Upstream commit ccd25890f73c082fe2657ed227b497d6ac5fdc40 ]
For two reasons:
1) An opcode cannot block inside io_uring_enter() doing submissions, as
it'll stall the submission side pipeline.
2) Ending up in sb_start_write() -> __sb_start_write() ->
percpu_down_read_freezable() introduces a new lockdep edge, which it
correctly complains about.
Check if the socket type is AF_UNIX and has a non-empty pathname. If it
does, mark it REQ_F_FORCE_ASYNC to punt the submission to io-wq rather
than attempt to do it inline.
Fixes: 7481fd93fa0a ("io_uring: Introduce IORING_OP_BIND")
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Heechan Kang <gganji11@naver.com>
Date: Sun May 17 03:47:09 2026 +0900
io_uring/waitid: clear waitid info before copying it to userspace
commit 93d93f5f8da791e98159795c6ef683f45bd95d13 upstream.
IORING_OP_WAITID stores its result fields in struct io_waitid::info and
later copies them to userspace siginfo. The prep path initializes the
request arguments, but it does not initialize info itself.
If the wait operation completes without reporting a child event, the common
wait code can return without writing wo_info. In that case io_waitid_finish()
still copies iw->info to userspace, exposing stale bytes from the reused
io_kiocb command storage.
Clear the result storage during prep so the io_uring path matches the
regular waitid syscall, which uses a zero-initialized struct waitid_info.
Fixes: f31ecf671ddc ("io_uring: add IORING_OP_WAITID support")
Cc: stable@vger.kernel.org # 6.7+
Signed-off-by: Heechan Kang <gganji11@naver.com>
Link: https://patch.msgid.link/20260516184709.852814-1-gganji11@naver.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Lu Baolu <baolu.lu@linux.intel.com>
Date: Tue May 26 19:24:01 2026 +0000
iommu/vt-d: Draining PRQ in sva unbind path when FPD bit set
commit cf08ca81d08a04b3b304e8fb4e052f323a09783d upstream.
When a device uses a PASID for SVA (Shared Virtual Address), it's possible
that the PASID entry is marked as non-present and FPD bit set before the
device flushes all ongoing DMA requests and removes the SVA domain. This
can occur when an exception happens and the process terminates before the
device driver stops DMA and calls the iommu driver to unbind the PASID.
There's no need to drain the PRQ in the mm release path. Instead, the PRQ
will be drained in the SVA unbind path. But in such case,
intel_pasid_tear_down_entry() only checks the presence of the pasid entry
and returns directly.
Add the code to clear the FPD bit and drain the PRQ.
Fixes: c43e1ccdebf2 ("iommu/vt-d: Drain PRQs when domain removed from RID")
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20241217024240.139615-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Tue May 12 16:51:14 2026 -0400
ipv4: raw: reject IP_HDRINCL packets with ihl < 5
commit 915fab69823a14c170dbaa3b41978768e0fe62fc upstream.
raw_send_hdrinc() validates that the caller-supplied IPv4 header
fits within the message length:
iphlen = iph->ihl * 4;
err = -EINVAL;
if (iphlen > length)
goto error_free;
if (iphlen >= sizeof(*iph)) {
/* fix up saddr, tot_len, id, csum, transport_header */
}
It does not, however, reject ihl < 5. For such a packet the
"if (iphlen >= sizeof(*iph))" branch is skipped, leaving the
crafted iphdr untouched, but the packet is still handed to
__ip_local_out() and onward. Downstream consumers that read
iph->ihl assume a sane value: net/ipv4/ah4.c:ah_output() in
particular subtracts sizeof(struct iphdr) from top_iph->ihl * 4
and passes the (signed-int-negative, then cast to size_t)
result to memcpy(), producing an OOB access of length close to
SIZE_MAX and a host kernel panic.
An IPv4 header with ihl < 5 is malformed by definition (RFC 791:
"Internet Header Length is the length of the internet header in
32 bit words ... Note that the minimum value for a correct header
is 5."). The kernel should not be willing to inject such a
packet into its own output path.
Reject "iphlen < sizeof(*iph)" alongside the existing
"iphlen > length" check. This matches the principle that locally
constructed packets that re-enter the IP stack must pass the same
basic sanity tests that a foreign packet would be subjected to.
Once this lands, the "if (iphlen >= sizeof(*iph))" wrapper around
the fixup branch becomes redundant; left in place to keep the
patch minimal and backport-friendly. A follow-up can unwrap it.
Note that commit 86f4c90a1c5c ("ipv4, ipv6: ensure raw socket
message is big enough to hold an IP header") ensures the message
buffer is large enough to hold an iphdr, but does not constrain
the self-reported iph->ihl.
Reachability: the malformed packet source is any caller with
CAP_NET_RAW, including an unprivileged process in a user+net
namespace on a kernel with CONFIG_USER_NS=y. The reproduced AH
crash also requires a matching xfrm AH policy on the outgoing
route; a container granted CAP_NET_ADMIN can install that state
and policy in its netns. Loopback bypasses xfrm_output, so the
trigger uses a real netdev.
Reproduced on UML + KASAN: kernel-mode fault at addr 0x0 with
memcpy_orig at the crash site. Same shape reproduces inside a
rootless Docker container with --cap-add NET_ADMIN on a stock
distro kernel.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/77ec2b5e8111961c2c39883c92e8aa2709039c17.1778614451.git.michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Justin Iurman <justin.iurman@gmail.com>
Date: Sun May 17 20:30:59 2026 +0200
ipv6: ioam: add NULL check for idev in ipv6_hop_ioam()
commit d4ea0dfd75011b78cebf3808f98ac4c4f51a6fb9 upstream.
Reported by Sashiko:
The function ipv6_hop_ioam() accesses
__in6_dev_get(skb->dev)->cnf.ioam6_enabled without validating the returned
idev pointer. Because addrconf_ifdown() can concurrently clear dev->ip6_ptr
via RCU, __in6_dev_get() can return NULL during interface teardown, which
could cause a NULL pointer dereference when processing an IOAM Hop-by-Hop
option.
Let's add a check and use SKB_DROP_REASON_IPV6DISABLED accordingly.
Fixes: 9ee11f0fff20 ("ipv6: ioam: Data plane support for Pre-allocated Trace")
Cc: stable@vger.kernel.org
Signed-off-by: Justin Iurman <justin.iurman@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260517183059.29140-1-justin.iurman@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Justin Iurman <justin.iurman@gmail.com>
Date: Wed May 20 14:42:42 2026 +0200
ipv6: ioam: refresh hdr pointer before ioam6_event()
commit e46e6bc97fb1f339730ff1ba74267fbf48e7a422 upstream.
Reported by Sashiko:
In ipv6_hop_ioam(), the hdr pointer is initialized to point into the
skb's linear data buffer. Later, the code calls skb_ensure_writable(),
which might reallocate the buffer:
if (skb_ensure_writable(skb, optoff + 2 + hdr->opt_len))
goto drop;
/* Trace pointer may have changed */
trace = (struct ioam6_trace_hdr *)(skb_network_header(skb)
+ optoff + sizeof(*hdr));
ioam6_fill_trace_data(skb, ns, trace, true);
ioam6_event(IOAM6_EVENT_TRACE, dev_net(skb->dev),
GFP_ATOMIC, (void *)trace, hdr->opt_len - 2);
If the skb is cloned or lacks sufficient linear headroom,
skb_ensure_writable() will invoke pskb_expand_head(), which reallocates
the skb's data buffer and frees the old one, invalidating pointers to
it. While the code recalculates the trace pointer immediately after the
call to skb_ensure_writable(), it fails to recalculate the hdr pointer.
This patch fixes the above by recalculating the hdr pointer before
passing hdr->opt_len to ioam6_event(), so that we avoid any UaF.
Fixes: f655c78d6225 ("net: exthdrs: ioam6: send trace event")
Cc: stable@vger.kernel.org
Signed-off-by: Justin Iurman <justin.iurman@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260520124242.32320-1-justin.iurman@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jiayuan Chen <jiayuan.chen@linux.dev>
Date: Mon Mar 30 15:32:29 2026 +0800
irq_work: Fix use-after-free in irq_work_single() on PREEMPT_RT
[ Upstream commit 91840be8f710370607f949a627e070896faeddb8 ]
On PREEMPT_RT, non-HARD irq_work runs in per-CPU kthreads via
run_irq_workd(), so irq_work_sync() uses rcuwait() to wait for BUSY==0.
After irq_work_single() clears BUSY via atomic_cmpxchg(), it still
dereferences @work for irq_work_is_hard() and rcuwait_wake_up().
An irq_work_sync() caller on another CPU that enters after BUSY is cleared
can observe BUSY==0 immediately, return, and free the work before those
accesses complete — causing a use-after-free.
Fix this by wrapping run_irq_workd() in guard(rcu)() so that the entire
irq_work_single() execution is within an RCU read-side critical
section. Then add synchronize_rcu() in irq_work_sync() after
rcuwait_wait_event() to ensure the caller waits for the RCU grace period
before returning, preventing premature frees.
Fixes: 810979682ccc ("irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.")
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260330073234.303732-1-jiayuan.chen@linux.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rosen Penev <rosenp@gmail.com>
Date: Wed May 6 01:55:22 2026 -0700
irqchip/ath79-cpu: Remove unused function
[ Upstream commit 0fa10fb77069fb67aa51384868ef3702b7791465 ]
ath79_cpu_irq_init() was part of the legacy pre-OF code that got removed a
while back.
Remove it to get rid of a missing prototype warning, reported by the kernel test
robot.
[ tglx: Fix the subject prefix. Sigh ... ]
Fixes: 51fa4f8912c0 ("MIPS: ath79: drop legacy IRQ code")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260506085522.1210143-1-rosenp@gmail.com
Closes: https://lore.kernel.org/oe-kbuild-all/202412011509.kGQkDr1y-lkp@intel.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Fri May 15 11:24:14 2026 -0700
ixgbevf: fix use-after-free in VEPA multicast source pruning
commit 5d49b568c188dc77199d8d2b959c91da8cc27cf1 upstream.
ixgbevf_clean_rx_irq() prunes frames whose source MAC matches the VF's
own address (VEPA multicast workaround) by freeing the skb and
continuing to the next descriptor:
dev_kfree_skb_irq(skb);
continue;
The skb pointer is declared outside the while loop and persists across
iterations. Because the continue skips the "skb = NULL" reset at the
bottom of the loop, the next iteration enters the "else if (skb)" path
and calls ixgbevf_add_rx_frag() on the freed skb, dereferencing
skb_shinfo(skb)->nr_frags - a use-after-free in NAPI softirq context.
The sibling driver iavf already handles this correctly by nulling the
pointer before continuing. Apply the same pattern here.
I do not have ixgbevf hardware; the bug was found by static analysis
(scan_drop_continue_loops.py + semgrep drop_continue_in_loop, multi-tool
corroboration with the highest score in the scan). The UAF was confirmed
under KASAN by loading a test module that reproduces the exact code
pattern (alloc skb, kfree_skb, then read skb_shinfo(skb)->nr_frags):
BUG: KASAN: slab-use-after-free in ixgbevf_uaf_test_init+0x100/0x1000
Read of size 8 at addr 000000006163ae78 by task insmod/30
freed 208-byte region [000000006163adc0, 000000006163ae90)
QEMU emulates igb (82576) but not ixgbe (82599), and the igbvf VF
driver does not include the VEPA source pruning path, so a full
end-to-end reproduction with emulated hardware was not possible.
Fixes: bad17234ba70 ("ixgbevf: Change receive model to use double buffered page based receives")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20260515182419.1597859-8-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Viktor Jägersküpper <viktor_jaegerskuepper@freenet.de>
Date: Fri May 15 23:58:45 2026 +0200
kbuild: pacman-pkg: make "rc" releases adhere to pacman versioning scheme
[ Upstream commit 202550713128da20d9381d6d2dc0f6b73839f434 ]
The package versioning scheme does not enable smooth upgrades from "rc"
releases to the corresponding stable releases (e.g. 7.0.0-rc7 -> 7.0.0)
because pacman considers that a downgrade due to the underscore in
pkgver (e.g. 7.0.0_rc7), see e.g. vercmp(8) for an explanation of the
package version comparison used by pacman. Package versions which are
derived from said releases (e.g. built from git revisions) are
similarly affected. Fix this by modifying pkgver in order to remove the
hyphen from kernel versions containing "-rcN", where N is a
non-negative integer.
Acked-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Viktor Jägersküpper <viktor_jaegerskuepper@freenet.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260515215913.92481-1-viktor_jaegerskuepper@freenet.de
Fixes: c8578539deba ("kbuild: add script and target to generate pacman package")
Signed-off-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jianpeng Chang <jianpeng.chang.cn@windriver.com>
Date: Fri May 8 09:56:36 2026 +0900
kprobes: skip non-symbol addresses in kprobe_add_ksym_blacklist()
[ Upstream commit 307abfac04a254c09c5705d816b33354acee97a0 ]
When kprobe_add_area_blacklist() iterates through a section like
.kprobes.text, the start address may not correspond to a named symbol.
On ARM64 with CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y (introduced by
commit baaf553d3bc3 ("arm64: Implement
HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")), the compiler flag
-fpatchable-function-entry=4,2 inserts 2 NOPs before each function entry
point for ftrace call_ops. These pre-function NOPs sit at the section base
address, before the first named function symbol. The compiler emits a $x
mapping symbol at offset 0x00 to mark the start of code, but
find_kallsyms_symbol() ignores mapping symbols.
Without CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS (e.g. defconfig), no
pre-function NOPs are inserted, the first function starts at offset
0x00, and the bug does not trigger.
This only affects modules that have a .kprobes.text section (i.e. those
using the __kprobes annotation). Modules using NOKPROBE_SYMBOL() instead
(like kretprobe_example.ko) blacklist exact function addresses via the
_kprobe_blacklist section and are not affected.
For kprobe_example.ko on ARM64 with -fpatchable-function-entry=4,2,
the .kprobes.text section layout is:
offset 0x00: $x + 2 NOPs (mapping symbol + ftrace preamble)
offset 0x08: handler_post (64 bytes)
offset 0x50: handler_pre (68 bytes)
kprobe_add_area_blacklist() starts iterating from the section base
address (offset 0x00), which only has the $x mapping symbol.
kprobe_add_ksym_blacklist() then calls kallsyms_lookup_size_offset()
for this address, which goes through:
kallsyms_lookup_size_offset()
-> module_address_lookup()
-> find_kallsyms_symbol()
find_kallsyms_symbol() scans all module symbols to find the closest
preceding symbol.
Since no named text symbol exists at offset 0x00,
find_kallsyms_symbol() picks __UNIQUE_ID_vermagic (a .modinfo symbol
whose address is in the temporary image) as the "best" match. The
computed "size" = next_text_symbol - modinfo_symbol spans across
these two unrelated memory regions, creating a blacklist entry with
a bogus range of tens of terabytes.
Whether this causes a visible failure depends on address randomization,
here is what happens on Raspberry Pi 4/5:
- On RPi5, the bogus size was ~35 TB. start + size stayed within
64-bit range, so the blacklist entry covered the entire kernel
text. register_kprobe() in the module's own init function failed
with -EINVAL.
- On RPi4, the bogus size was ~75 TB. start + size overflowed
64 bits and wrapped to a small address near zero. The range
check (addr >= start && addr < end) then failed because end
wrapped around, so the bogus entry was accidentally harmless
and kprobes worked by luck.
The same bug exists on both machines, but randomization determines whether
the integer overflow masks it or not.
Fix this by adding notrace to the __kprobes macro. Functions in
.kprobes.text are kprobe infrastructure handlers that should never be
traced by ftrace. With notrace, the compiler stops inserting them and the
non-symbol gap at the section start disappears entirely.
Link: https://lore.kernel.org/all/20260506012706.2785785-1-jianpeng.chang.cn@windriver.com/
Fixes: baaf553d3bc3 ("arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")
Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: DaeMyung Kang <charsyam@gmail.com>
Date: Tue Apr 28 23:08:56 2026 +0900
ksmbd: close durable scavenger races against m_fp_list lookups
[ Upstream commit bf736184d063da1a552ffeff0481813599a182cc ]
ksmbd_durable_scavenger() has two related races against any walker
that iterates f_ci->m_fp_list, including ksmbd_lookup_fd_inode()
(used by ksmbd_vfs_rename) and the share-mode checks in
fs/smb/server/smb_common.c.
(1) fp->node list-head reuse. Durable-preserved handles can remain
linked on f_ci->m_fp_list after session teardown so share-mode checks
still see them while the handle is reconnectable. The scavenger
collected expired handles by adding fp->node to a local
scavenger_list after removing them from the global durable idr.
Because fp->node is the same list_head used by m_fp_list,
list_add(&fp->node, &scavenger_list) overwrites the m_fp_list links
and corrupts both lists. CONFIG_DEBUG_LIST can report this on the
share-mode walk path.
(2) Refcount race against m_fp_list walkers. The scavenger qualifies
an expired durable handle with atomic_read(&fp->refcount) > 1 and
fp->conn under global_ft.lock, removes fp from global_ft, then drops
global_ft.lock before unlinking fp from m_fp_list and freeing it.
During that gap fp is still linked on m_fp_list with f_state ==
FP_INITED. ksmbd_lookup_fd_inode() under m_lock read calls
ksmbd_fp_get() (atomic_inc_not_zero on refcount that is still 1) and
takes a live reference; the scavenger then unlinks and frees fp
while the holder owns a reference, leading to UAF on the holder's
subsequent ksmbd_fd_put() and on any field reads performed by a
concurrent share-mode walker that iterates m_fp_list without taking
ksmbd_fp_get() (smb_check_perm_dleases-like paths).
Fix both:
* Stop reusing fp->node as a scavenger-private list node. Remove
one expired handle from global_ft under global_ft.lock, take an
explicit transient reference, drop the lock, unlink fp->node
from m_fp_list under f_ci->m_lock, then drop both the durable
lifetime and transient references with atomic_sub_and_test(2,
&fp->refcount). If the scavenger is the last putter the close
runs there; otherwise an in-flight holder that already raced
through the m_fp_list lookup owns the final close via its
ksmbd_fd_put() path. The one-at-a-time disposal can rescan the
durable idr when multiple handles expire in the same pass, but
durable scavenging is a background expiration path and the final
full scan recomputes min_timeout before the next wait.
* Clear fp->persistent_id inside __ksmbd_remove_durable_fd() right
after idr_remove(), so a delayed final close from a holder that
snatched fp does not re-issue idr_remove() on a persistent id
that idr_alloc_cyclic() in ksmbd_open_durable_fd() may have
already handed out to a brand-new durable handle.
* Bypass the per-conn open_files_count decrement in
__put_fd_final() when fp is detached from any session table
(fp->conn cleared by session_fd_check() at durable preserve --
paired with the volatile_id clear at unpublish, so checking
fp->conn alone is sufficient). The walker that owns the final
close runs from an unrelated work->conn whose
stats.open_files_count never tracked this durable fp; without
this guard the holder would underflow that unrelated counter.
The two races are folded into one patch because patch (1) alone
cleans up the corrupted list but leaves a deterministic UAF window
for m_fp_list walkers that the transient-reference and
persistent_id discipline in (2) close; bisecting onto an
intermediate state would land on a UAF that pre-patch chaos merely
made less reproducible.
Validation:
* CONFIG_DEBUG_LIST coverage for the list_head reuse path.
* KASAN-enabled direct SMB2 durable-handle coverage that exercised
ksmbd_durable_scavenger() and non-NULL ksmbd_lookup_fd_inode()
returns while durable handles expired under concurrent rename
lookups, with no KASAN, UAF, list-corruption, ODEBUG, or WARNING
reports.
* checkpatch --strict
* make -j$(nproc) M=fs/smb/server
Fixes: d484d621d40f ("ksmbd: add durable scavenger timer")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jeremy Laratro <research@aradex.io>
Date: Wed May 13 08:26:16 2026 +0900
ksmbd: fix null pointer dereference in compare_guid_key()
commit 4b83cbc4c15f09b000cc06f033f64b0824b6dc87 upstream.
session_fd_check() walks the per-inode m_op_list during durable-handle
session teardown and sets op->conn = NULL for every opinfo whose conn
matched the closing session's connection. The matching opinfo, however,
stays linked in its per-ClientGuid lease_table_list entry's lb->lease_list
because destroy_lease_table() only runs on full TCP-connection teardown,
not on SESSION_LOGOFF.
If the same TCP connection then negotiates a fresh session with the
same ClientGuid (ClientGuid is bound to NEGOTIATE, not the session, and
is unchanged across LOGOFF + SETUP) and issues a SMB2 CREATE with a
lease context on a different inode, find_same_lease_key() walks
lb->lease_list, reaches the stale opinfo, and calls compare_guid_key(),
which unconditionally dereferences opinfo->conn->ClientGUID. The conn
pointer is NULL and the kernel panics.
Reproducer requires only a successful SMB2 SESSION_SETUP and a share
configured with 'durable handles = yes'. KASAN report on mainline
70390501d194:
general protection fault, probably for non-canonical address
0xdffffc0000000069: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000348-0x000000000000034f]
Workqueue: ksmbd-io handle_ksmbd_work
RIP: 0010:bcmp+0x5b/0x230
Call Trace:
compare_guid_key+0x4b/0xd0
find_same_lease_key+0x324/0x690
smb2_open+0x6aea/0x8e60
handle_ksmbd_work+0x796/0xee0
...
Faulting address 0x348 is the offset of ClientGUID within struct
ksmbd_conn, confirming opinfo->conn was NULL.
Read opinfo->conn once and bail out if it has been cleared by a
concurrent session_fd_check(). A half-detached opinfo cannot be the
owner of an active lease, so returning 0 is the correct match result.
Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Laratro <research@aradex.io>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ferry Meng <mengferry@linux.alibaba.com>
Date: Mon May 11 21:18:16 2026 +0800
ksmbd: fix SID memory leak in set_posix_acl_entries_dacl() on overflow
commit af92ee994cc7f7e83a41c2025f32257a2f82a7ef upstream.
Commit 299f962c0b02 ("ksmbd: use check_add_overflow() to prevent u16
DACL size overflow") added check_add_overflow() guards that break out
of the ACE-building loops in set_posix_acl_entries_dacl() when the
accumulated DACL size would wrap past 65535.
However, each iteration allocates a struct smb_sid via kmalloc_obj()
at the top of the loop and relies on the kfree(sid) call at the end
of the loop body (the 'pass_same_sid' label in the first loop, and
the explicit kfree at the tail of the second loop) to release it.
The newly introduced 'break' statements bypass those kfree() calls,
leaking the sid buffer every time an overflow is detected.
A malicious or malformed file with enough POSIX ACL entries to trip
the overflow check will leak one or more struct smb_sid allocations
on every request that touches the file's DACL, providing a trivial
kernel memory exhaustion vector.
Free sid before breaking out of the loops to plug the leak.
Fixes: 299f962c0b02 ("ksmbd: use check_add_overflow() to prevent u16 DACL size overflow")
Cc: stable@vger.kernel.org
Signed-off-by: Ferry Meng <mengferry@linux.alibaba.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Fri May 22 15:25:12 2026 +0800
ksmbd: validate owner of durable handle on reconnect
[ Upstream commit 49110a8ce654bbe56bef7c5e44cce31f4b102b8a ]
Currently, ksmbd does not verify if the user attempting to reconnect
to a durable handle is the same user who originally opened the file.
This allows any authenticated user to hijack an orphaned durable handle
by predicting or brute-forcing the persistent ID.
According to MS-SMB2, the server MUST verify that the SecurityContext
of the reconnect request matches the SecurityContext associated with
the existing open.
Add a durable_owner structure to ksmbd_file to store the original opener's
UID, GID, and account name. and catpure the owner information when a file
handle becomes orphaned. and implementing ksmbd_vfs_compare_durable_owner()
to validate the identity of the requester during SMB2_CREATE (DHnC).
Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Reported-by: Davide Ornaghi <d.ornaghi97@gmail.com>
Reported-by: Navaneeth K <knavaneeth786@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
[ Minor context conflict resolved. ]
Signed-off-by: Alva Lan <alvalan9@foxmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Junyi Liu <moss80199@gmail.com>
Date: Tue May 19 16:12:04 2026 +0900
ksmbd: validate SID in parent security descriptor during ACL inheritance
commit 69f030cf95488ae1186c72ac8c66fd279664ea7f upstream.
Introduce smb_validate_ntsd_sid() helper to safely validate Owner SID
and Group SID inside the NT Security Descriptor (smb_ntsd) retrieved
from the parent directory.
Cc: stable@vger.kernel.org
Signed-off-by: Junyi Liu <moss80199@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Gow <david@davidgow.net>
Date: Sat Apr 25 11:41:53 2026 +0800
kunit: config: Enable KUNIT_DEBUGFS by default
[ Upstream commit 17e4c68ff35090d8cb743e3c82c09f92fda1ebda ]
The KUNIT_DEBUGFS option is currently enabled based on the value of
KUNIT_ALL_TESTS, but it really doesn't have anything to do with the set of
enabled tests, so just enable it by default anyway. In particular, this
shouldn't be only visible if KUNIT_ALL_TESTS is set, which is quite
confusing.
Link: https://lore.kernel.org/r/20260425034155.53913-1-david@davidgow.net
Fixes: beaed42c427d ("kunit: default KUNIT_* fragments to KUNIT_ALL_TESTS")
Signed-off-by: David Gow <david@davidgow.net>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Gow <david@davidgow.net>
Date: Sat Apr 25 11:41:54 2026 +0800
kunit: config: KUNIT_DEBUGFS should depend on DEBUG_FS
[ Upstream commit 8f80b5b227ef9ea422080487715c841856339aed ]
CONFIG_KUNIT_DEBUGFS is totally useless without debugfs, so it should
depend on CONFIG_DEBUG_FS.
Link: https://lore.kernel.org/r/20260425034155.53913-2-david@davidgow.net
Fixes: e2219db280e3 ("kunit: add debugfs /sys/kernel/debug/kunit/<suite>/results display")
Signed-off-by: David Gow <david@davidgow.net>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Tue May 19 09:25:19 2026 -0400
KVM: arm64: vgic-its: Reject restored DTE with out-of-range num_eventid_bits
commit 9ce754ed8e7ab4e3999767ce1505f85c449ccb07 upstream.
Userspace can restore an ITS Device Table Entry whose Size field encodes
more EventID bits than the virtual ITS supports. The live MAPD path
rejects that state, but vgic_its_restore_dte() accepts it and stores the
out-of-range value in dev->num_eventid_bits.
Reject restored DTEs with num_eventid_bits > VITS_TYPER_IDBITS before
allocating the device. This mirrors the MAPD check and prevents the
restored state from reaching vgic_its_restore_itt(), where the unchecked
value can be converted into an oversized scan_its_table() range.
Fixes: 57a9a117154c ("KVM: arm64: vgic-its: Device table save/restore")
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://lore.kernel.org/r/20260519132519.2142458-1-michael.bommarito@gmail.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Tue May 19 09:50:42 2026 -0400
KVM: arm64: vgic: Free private_irqs when init fails after allocation
commit f19c354dbd457759dfcf1195ab4bdba2bb568323 upstream.
Companion to commit 250f25367b58 ("KVM: arm64: Tear down vGIC on
failed vCPU creation"), which added the missing kvm_vgic_vcpu_destroy()
call to the kvm_share_hyp() failure path in kvm_arch_vcpu_create(). The
kvm_vgic_vcpu_init() failure path immediately above it has the same
shape and still needs the same cleanup.
Call kvm_vgic_vcpu_destroy() when kvm_vgic_vcpu_init() fails so private
IRQs allocated before a redistributor iodev registration failure are
released before the failed vCPU is freed.
Fixes: 03b3d00a70b5 ("KVM: arm64: vgic: Allocate private interrupts on demand")
Cc: stable@vger.kernel.org
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://lore.kernel.org/r/20260519135042.2219239-1-michael.bommarito@gmail.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Mon May 18 14:34:47 2026 -0400
l2tp: use list_del_rcu in l2tp_session_unhash
commit 979c017803c40829b03acd9e5236e354b7622360 upstream.
An unprivileged local user can pin a host CPU indefinitely in
l2tp_session_get_by_ifname() by issuing L2TP_CMD_SESSION_GET on
L2TP_ATTR_IFNAME concurrently with L2TP_CMD_SESSION_CREATE and
L2TP_CMD_SESSION_DELETE on the same tunnel. All three commands take
GENL_UNS_ADMIN_PERM, so CAP_NET_ADMIN in the netns user namespace
suffices; on any host that has l2tp_core loaded the trigger is
reachable from a standard `unshare -Urn` sandbox.
l2tp_session_unhash() removes a session from tunnel->session_list
with list_del_init(), but that list is walked by
l2tp_session_get_by_ifname() with list_for_each_entry_rcu() under
rcu_read_lock_bh(). list_del_init() leaves the deleted entry's
next/prev self-pointing; a reader that has loaded the entry and
then advances pos->list.next reads &session->list, container_of()s
back to the same session, and list_for_each_entry_rcu() never
reaches the list head. The CPU stays in strcmp() inside the
walker, with BH and preemption disabled, so RCU grace periods on
the host stall behind it and the wedged thread cannot be killed
(SIGKILL is delivered on syscall return).
Use list_del_rcu() to match the existing list_add_rcu() in
l2tp_session_register(); the deleted session remains visible to
in-flight walkers with consistent next/prev pointers until
kfree_rcu() in l2tp_session_free() releases it. tunnel->session_list
has exactly one list_del_init() call site; the list_del_init
(&session->clist) at l2tp_core.c:533 operates on the per-collision
list, which is not walked under RCU. list_empty(&session->list) is
not used anywhere in net/l2tp/ after the unhash point, so dropping
the post-delete self-init is safe; the fix has no userspace-visible
behavior change.
Fixes: 89b768ec2dfef ("l2tp: use rcu list add/del when updating lists")
Cc: stable@vger.kernel.org # 6.11+
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260518183447.64078-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Buffet <matthieu@buffet.re>
Date: Thu May 28 12:14:26 2026 +0000
landlock: Fix TCP handling of short AF_UNSPEC addresses
[ Upstream commit e4d82cbce2258f454634307fdabf33aa46b61ab0 ]
current_check_access_socket() treats AF_UNSPEC addresses as
AF_INET ones, and only later adds special case handling to
allow connect(AF_UNSPEC), and on IPv4 sockets
bind(AF_UNSPEC+INADDR_ANY).
This would be fine except AF_UNSPEC addresses can be as
short as a bare AF_UNSPEC sa_family_t field, and nothing
more. The AF_INET code path incorrectly enforces a length of
sizeof(struct sockaddr_in) instead.
Move AF_UNSPEC edge case handling up inside the switch-case,
before the address is (potentially incorrectly) treated as
AF_INET.
Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect")
Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
Link: https://lore.kernel.org/r/20251027190726.626244-4-matthieu@buffet.re
Signed-off-by: Mickaël Salaün <mic@digikod.net>
[ There was a conflict due to missing commit 9f74411a40ce ("landlock:
Log TCP bind and connect denials") ]
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Mon Jun 1 17:46:33 2026 +0200
Linux 6.12.92
Link: https://lore.kernel.org/r/20260528194629.379955525@linuxfoundation.org
Tested-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Miguel Ojeda <ojeda@kernel.org>
Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Tested-by: Brett A C Sheffield <bacs@librecast.net>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tiezhu Yang <yangtiezhu@loongson.cn>
Date: Fri May 22 15:05:07 2026 +0800
LoongArch: kprobes: Fix handling of fatal unrecoverable recursions
[ Upstream commit 1c856e158fd34ef2c4475a81c1dc386329989938 ]
KPROBE_HIT_SS and KPROBE_REENTER are two types of fatal recursions that
can not be safely recovered in kprobes.
KPROBE_HIT_SS means that a kprobe is hit during single-stepping. At
this point, the architecture-specific single-step context is already
active. Nested single-stepping would corrupt the state, as the kprobe
control block (kcb) and hardware registers cannot safely store multiple
levels of stepping state.
KPROBE_REENTER means that a third-level recursion occurs when a probe
is hit while the system is already handling a nested probe (second-
level). The kcb only provides a single slot (prev_kprobe) to backup the
state. When a third probe is hit, there is no more space to save the
state without corrupting the first-level backup.
Kprobes work by replacing instructions with breakpoints. In order to
execute the original instruction and continue, it must be moved to a
temporary "single-step" slot. Since there is no backup space left to
set up this slot safely, the CPU would be forced to return to the same
original breakpoint address, triggering an endless loop.
Currently, the code only prints a warning and returns. This leads to
an infinite re-entry loop as the CPU repeatedly hits the same trap and
a "stuck" CPU core because preemption was disabled at the start of the
handler and never re-enabled in this early return path.
Fix the logic by:
1. Merging KPROBE_HIT_SS and KPROBE_REENTER cases, as both represent
fatal recursions that cannot be safely recovered.
2. Replacing WARN_ON_ONCE() with BUG() to terminate the system. This
aligns LoongArch with other architectures (x86, arm64, riscv) and
prevents stack overflow while providing diagnostic information.
Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Huacai Chen <chenhuacai@kernel.org>
Date: Thu May 21 20:58:40 2026 +0800
LoongArch: Remove unused code to avoid build warning
commit 0ccc9d47cf020994097ff51827cebd04aa2b0bf4 upstream.
After commit feee6b2989165631b1 ("mm/memory_hotplug: shrink zones when
offlining memory"), __remove_pages() doesn't need the "zone" parameter
so the "page" variable is also unused. Remove the unused code to avoid
such build warning:
arch/loongarch/mm/init.c: In function 'arch_remove_memory':
arch/loongarch/mm/init.c:134:22: warning: variable 'page' set but not used [-Wunused-but-set-variable=]
134 | struct page *page = pfn_to_page(start_pfn);
Cc: <stable@vger.kernel.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Stephen Smalley <stephen.smalley.work@gmail.com>
Date: Wed May 13 14:05:06 2026 -0400
lsm: hold cred_guard_mutex for lsm_set_self_attr()
commit 4a9b16541ad3faf8bccb398532bf3f8b6bbf1188 upstream.
Just as proc_pid_attr_write() already does before calling the LSM
hook. This only matters for SELinux and AppArmor which check
whether the process is being ptraced and if so, whether to
allow the transition.
Cc: stable@vger.kernel.org
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: SeongJae Park <sj@kernel.org>
Date: Sun Apr 26 10:36:12 2026 -0700
mm/damon/sysfs-schemes: call missing mem_cgroup_iter_break()
commit d4e7b5c4cc353f154d5ab8bb2e1ce7714d77a6e9 upstream.
damon_sysfs_memcg_path_to_id() breaks mem_cgroup_iter() loop without
calling mem_cgroup_iter_break(). This leaks the cgroup reference. Fix
the issue by calling mem_cgroup_iter_break() before the break.
The issue was discovered [1] by Sashiko.
Link: https://lore.kernel.org/20260426173625.86521-1-sj@kernel.org
Link: https://lore.kernel.org/20260423004148.74722-1-sj@kernel.org [1]
Fixes: 29cbb9a13f05 ("mm/damon/sysfs-schemes: implement scheme filters")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> # 6.3.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Muchun Song <muchun.song@linux.dev>
Date: Tue Apr 28 16:52:17 2026 +0800
mm/memory_hotplug: fix memory block reference leak on remove
commit 93866f55f7e292fe3d47d36c9efe5ee10213a06b upstream.
Patch series "mm: Fix memory block leaks and locking", v2.
This series fixes two memory block device reference leaks and one locking
issue around the per-memory_block hwpoison counter.
This patch (of 2):
remove_memory_blocks_and_altmaps() looks up each memory block with
find_memory_block(), which acquires a reference to the memory block
device.
That reference is never dropped on this path, resulting in a leaked device
reference when removing memory blocks and their altmaps. Drop the
reference after retrieving mem->altmap and clearing mem->altmap, before
removing the memory block device.
Link: https://lore.kernel.org/20260428085219.1316047-1-songmuchun@bytedance.com
Link: https://lore.kernel.org/20260428085219.1316047-2-songmuchun@bytedance.com
Fixes: 6b8f0798b85a ("mm/memory_hotplug: split memmap_on_memory requests across memblocks")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Huang, Ying" <huang.ying.caritas@gmail.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Thu May 21 05:08:48 2026 +0200
mptcp: pm: ADD_ADDR rtx: allow ID 0
commit 03f324f3f1f7619a47b9c91282cb12775ab0a2f1 upstream.
ADD_ADDR can be sent for the ID 0, which corresponds to the local
address and port linked to the initial subflow.
Indeed, this address could be removed, and re-added later on, e.g. what
is done in the "delete re-add signal" MPTCP Join selftests. So no reason
to ignore it.
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-2-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ applied to net/mptcp/pm_netlink.c instead of upstream's pm_kernel.c ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Thu May 21 05:08:49 2026 +0200
mptcp: pm: ADD_ADDR rtx: always decrease sk refcount
commit 9634cb35af17019baec21ca648516ce376fa10e6 upstream.
When an ADD_ADDR is retransmitted, the sk is held in sk_reset_timer().
It should then be released in all cases at the end.
Some (unlikely) checks were returning directly instead of calling
sock_put() to decrease the refcount. Jump to a new 'exit' label to call
__sock_put() (which will become sock_put() in the next commit) to fix
this potential leak.
While at it, drop the '!msk' check which cannot happen because it is
never reset, and explicitly mark the remaining one as "unlikely".
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-4-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ applied to net/mptcp/pm_netlink.c instead of upstream's pm_kernel.c ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Thu May 21 05:08:50 2026 +0200
mptcp: pm: ADD_ADDR rtx: free sk if last
commit b7b9a461569734d33d3259d58d2507adfac107ed upstream.
When an ADD_ADDR is retransmitted, the sk is held in sk_reset_timer(),
and released at the end.
If at that moment, it was the last reference being held, the sk would
not be freed. sock_put() should then be called instead of __sock_put().
But that's not enough: if it is the last reference, sock_put() will call
sk_free(), which will end up calling sk_stop_timer_sync() on the same
timer, and waiting indefinitely to finish. So it is needed to mark that
the timer is done at the end of the timer handler when it has not been
rescheduled, not to call sk_stop_timer_sync() on "itself".
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-5-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ Applied to net/mptcp/pm_netlink.c instead of upstream's pm_kernel.c.
Also, there were conflicts, because commit 30549eebc4d8 ("mptcp: make
ADD_ADDR retransmission timeout adaptive") is not in this version and
changed the context. Also, other conflicts were due to newer patches
being backported with resolved conflicts before this one. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gang Yan <yangang@kylinos.cn>
Date: Thu May 21 05:08:47 2026 +0200
mptcp: sync the msk->sndbuf at accept() time
commit fcf04b14334641f4b0b8647824480935e9416d52 upstream.
On passive MPTCP connections, the msk sndbuf is not updated correctly.
The root cause is an order issue in the accept path:
- tcp_check_req() -> subflow_syn_recv_sock() -> mptcp_sk_clone_init()
calls __mptcp_propagate_sndbuf() to copy the ssk sndbuf into msk
- Later, tcp_child_process() -> tcp_init_transfer() ->
tcp_sndbuf_expand() grows the ssk sndbuf.
So __mptcp_propagate_sndbuf() runs before the ssk sndbuf has been
expanded and the msk ends up with a much smaller sndbuf than the
subflow:
MPTCP: msk->sndbuf:20480, msk->first->sndbuf:2626560
Fix this by moving the __mptcp_propagate_sndbuf() call from
mptcp_sk_clone_init() -- the ssk sndbuf is not yet finalized there -- to
__mptcp_propagate_sndbuf() at accept() time, when the ssk sndbuf has
been fully expanded by tcp_sndbuf_expand().
Fixes: 8005184fd1ca ("mptcp: refactor sndbuf auto-tuning")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/602
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260420-net-mptcp-sync-sndbuf-accept-v1-1-e3523e3aeb44@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
[ No conflicts, but move __mptcp_propagate_sndbuf() above the for-loop
(mptcp_for_each_subflow()) present in this version, which will modify
'subflow' used by __mptcp_propagate_sndbuf() in this new patch. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jeroen Massar <jmassar@nvidia.com>
Date: Wed May 13 09:33:02 2026 +0300
net/mlx5: Do not restore destination-less TC rules
[ Upstream commit 8d0a5af8b1ba598e7340761729801624e7a9330e ]
After IPsec policy/state TX rules are added, any TC flow rule, which
forwards packets to uplink, is modified to forward to IPsec TX tables.
As these tables are destroyed dynamically, whenever there is no
reference to them, the destinations of this kind of rules must be
restored to uplink, unless there is no destination for that rule.
The flow rules FLOW_ACTION_ACCEPT, DROP, TRAP, GOTO and SAMPLE do not
have a destination port, and thus out_count = 0.
At cleanup time of the rules in mlx5_esw_ipsec_modify_flow_dests
we call mlx5_eswitch_restore_ipsec_rule but as the above types
do not have a destination we get an underflow of out_count, as
the port is passed, which is esw_attr->out_count - 1.
This change avoids calling mlx5_eswitch_restore_ipsec_rule when
there are no output destinations and thus avoids the underflow.
Fixes: d1569537a837 ("net/mlx5e: Modify and restore TC rules for IPSec TX rules")
Signed-off-by: Jeroen Massar <jmassar@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260513063302.333761-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jianbo Liu <jianbol@nvidia.com>
Date: Tue May 26 19:21:20 2026 +0000
net/mlx5e: Trigger neighbor resolution for unresolved destinations
commit 9ab89bde13e5251e1d0507e1cc426edcdfe19142 upstream.
When initializing the MAC addresses for an outbound IPsec packet offload
rule in mlx5e_ipsec_init_macs, the call to dst_neigh_lookup is used to
find the next-hop neighbor (typically the gateway in tunnel mode).
This call might create a new neighbor entry if one doesn't already
exist. This newly created entry starts in the INCOMPLETE state, as the
kernel hasn't yet sent an ARP or NDISC probe to resolve the MAC
address. In this case, neigh_ha_snapshot will correctly return an
all-zero MAC address.
IPsec packet offload requires the actual next-hop MAC address to
program the rule correctly. If the neighbor state is INCOMPLETE when
the rule is created, the hardware rule is programmed with an all-zero
destination MAC address. Packets sent using this rule will be
subsequently dropped by the receiving network infrastructure or host.
This patch adds a check specifically for the outbound offload path. If
neigh_ha_snapshot returns an all-zero MAC address, it proactively
calls neigh_event_send(n, NULL). This ensures the kernel immediately
sends the initial ARP or NDISC probe if one isn't already pending,
accelerating the resolution process. This helps prevent the hardware
rule from being programmed with an invalid MAC address and avoids
packet drops due to unresolved neighbors.
Fixes: 71670f766b8f ("net/mlx5e: Support routed networks during IPsec MACs initialization")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-8-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jianbo Liu <jianbol@nvidia.com>
Date: Tue May 26 19:22:14 2026 +0000
net/mlx5e: Use ip6_dst_lookup instead of ipv6_dst_lookup_flow for MAC init
commit e35d7da8dd9e55b37c3e8ab548f6793af0c2ab49 upstream.
Replace ipv6_stub->ipv6_dst_lookup_flow() with ip6_dst_lookup() in
mlx5e_ipsec_init_macs() since IPsec transformations are not needed
during Security Association setup - only basic routing information is
required for nexthop MAC address resolution.
This resolves an issue where XfrmOutNoStates error counter would be
incremented when xfrm policy is configured before xfrm state, as the
IPsec-aware routing function would attempt policy checks during SA
initialization.
Fixes: 71670f766b8f ("net/mlx5e: Support routed networks during IPsec MACs initialization")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-7-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xiang Mei <xmei5@asu.edu>
Date: Sun May 10 15:26:40 2026 -0700
net/smc: avoid NULL deref of conn->lnk in smc_msg_event tracepoint
[ Upstream commit 7bf563badd37cb796df5477d2b78bb64148a1268 ]
The smc_msg_event tracepoint class, shared by smc_tx_sendmsg and
smc_rx_recvmsg, unconditionally dereferences smc->conn.lnk:
__string(name, smc->conn.lnk->ibname)
conn->lnk is only set for SMC-R; for SMC-D it is NULL. Other code on
these paths already handles this (e.g. !conn->lnk in
SMC_STAT_RMB_TX_SIZE_SMALL()). With the tracepoint enabled, the first
sendmsg()/recvmsg() on an SMC-D socket crashes:
Oops: general protection fault, probably for non-canonical address
KASAN: null-ptr-deref in range [...]
RIP: 0010:strlen+0x1e/0xa0
Call Trace:
trace_event_raw_event_smc_msg_event (net/smc/smc_tracepoint.h:44)
smc_rx_recvmsg (net/smc/smc_rx.c:515)
smc_recvmsg (net/smc/af_smc.c:2859)
__sys_recvfrom (net/socket.c:2315)
__x64_sys_recvfrom (net/socket.c:2326)
do_syscall_64
The faulting address 0x3e0 is offsetof(struct smc_link, ibname),
confirming the NULL ->lnk deref. Enabling the tracepoint requires
root, but the trigger itself is unprivileged: socket(AF_SMC, ...) has
no capability check, and SMC-D negotiation needs no admin step on
s390 or on x86 with the loopback ISM device loaded.
Log an empty device name for SMC-D instead of dereferencing NULL.
Fixes: aff3083f10bf ("net/smc: Introduce tracepoints for tx and rx msg")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Sidraya Jayagond <sidraya@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xiang Mei <xmei5@asu.edu>
Date: Sun May 10 23:21:38 2026 -0700
net/smc: reject CHID-0 ACCEPT that matches an empty ism_dev slot
[ Upstream commit 277740023def559a4a2ddc3e8e784ee37a0f16a9 ]
On the SMC-D client, slot 0 of ini->ism_dev[]/ini->ism_chid[] is
reserved for an SMC-Dv1 device. smc_find_ism_v2_device_clnt()
populates V2 entries starting at index 1, so when no V1 device is
selected slot 0 is left in its kzalloc()'ed state with ism_dev[0] ==
NULL and ism_chid[0] == 0.
smc_v2_determine_accepted_chid() then matches the peer's CHID against
the array starting from index 0 using the CHID alone. A malicious
peer replying to a SMC-Dv2-only proposal with d1.chid == 0 matches
the empty slot, ini->ism_selected becomes 0, and the subsequent
ism_dev[0]->lgr_lock dereference in smc_conn_create() faults at
offsetof(struct smcd_dev, lgr_lock) == 0x68:
BUG: KASAN: null-ptr-deref in _raw_spin_lock_bh+0x79/0xe0
Write of size 4 at addr 0000000000000068 by task exploit/144
Call Trace:
_raw_spin_lock_bh
smc_conn_create (net/smc/smc_core.c:1997)
__smc_connect (net/smc/af_smc.c:1447)
smc_connect (net/smc/af_smc.c:1720)
__sys_connect
__x64_sys_connect
do_syscall_64
Require ism_dev[i] to be non-NULL before accepting a CHID match.
Fixes: a7c9c5f4af7f ("net/smc: CLC accept / confirm V2")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Link: https://patch.msgid.link/20260511062138.2839584-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rosen Penev <rosenp@gmail.com>
Date: Sat May 16 14:26:16 2026 -0700
net: ag71xx: check error for platform_get_irq
[ Upstream commit e7c70bf97e90d974cd575e4c90f8f9b07d056da3 ]
Complete error handling for a failed platform_get_irq() call
Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20260516212616.11758-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nicolai Buchwitz <nb@tipi-net.de>
Date: Wed May 20 20:43:20 2026 +0200
net: bcmgenet: keep RBUF EEE/PM disabled
commit 9a1730245e416d11ad5c0f2c100061d61cc43f60 upstream.
Setting RBUF_EEE_EN | RBUF_PM_EN in RBUF_ENERGY_CTRL breaks the RX
path on GENET hardware once MAC EEE becomes active. RX traffic stops
flowing while the link stays up and the usual descriptor/RX error
counters remain quiet. In that state the MAC still accepts frames
(rbuf_ovflow_cnt keeps climbing) but RBUF no longer forwards them to
DMA, so rx_packets is no longer incremented at the netdev level. On
some boards the corruption ends up as a paging fault in
skb_release_data via bcmgenet_rx_poll on an LPI exit.
Reproduced on Pi 4B (BCM2711 + BCM54213PE) and confirmed by Florian
Fainelli on an internal Broadcom 4908-family board with the same crash
signature. RBUF_PM_EN is not publicly documented.
This shows up more often now that phy_support_eee() enables EEE by
default, but it also affects older kernels as soon as TX LPI is
turned on via ethtool, so it is not specific to recent changes.
Always clear RBUF_EEE_EN | RBUF_PM_EN in bcmgenet_eee_enable_set so
the bits stay off across resets. UMAC and TBUF setup is left alone so
TX-side EEE keeps working.
Link: https://github.com/raspberrypi/linux/issues/7304
Fixes: 6ef398ea60d9 ("net: bcmgenet: add EEE support")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260520184320.652053-1-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Petr Machata <petrm@nvidia.com>
Date: Thu Oct 23 16:45:37 2025 +0200
net: bridge: Flush multicast groups when snooping is disabled
[ Upstream commit 68800bbf583f26f71491141e4b3c8582f9cfcbde ]
When forwarding multicast packets, the bridge takes MDB into account when
IGMP / MLD snooping is enabled. Currently, when snooping is disabled, the
MDB is retained, even though it is not used anymore.
At the same time, during the time that snooping is disabled, the IGMP / MLD
control packets are obviously ignored, and after the snooping is reenabled,
the administrator has to assume it is out of sync. In particular, missed
join and leave messages would lead to traffic being forwarded to wrong
interfaces.
Keeping the MDB entries around thus serves no purpose, and just takes
memory. Note also that disabling per-VLAN snooping does actually flush the
relevant MDB entries.
This patch flushes non-permanent MDB entries as global snooping is
disabled.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/5e992df1bb93b88e19c0ea5819e23b669e3dde5d.1761228273.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 4df78ff02629 ("bridge: mcast: Fix a possible use-after-free when removing a bridge port")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Golle <daniel@makrotopia.org>
Date: Thu May 14 15:04:21 2026 +0100
net: dsa: mt7530: fix FDB entries not aging out with short timeout
[ Upstream commit e824e40d0e841fab66ab7897d6c7b14dc81c66a7 ]
The DSA forwarding selftests bridge_vlan_aware.sh and
bridge_vlan_unaware.sh configure the bridge with ageing_time set to
LOW_AGEING_TIME (1000 centiseconds, i.e. 10 seconds) and then run
learning_test() in lib.sh, which expects a learned FDB entry to be
removed after ageing_time + 10 seconds. On MT7530/MT7531 the entry
persisted past the deadline and the "Found FDB record when should
not" assertion failed.
With msecs=10000, the algorithm in mt7530_set_ageing_time() finds
AGE_CNT=0 and AGE_UNIT=9 as the first exact match (starting the
search from tmp_age_count=0). The per-entry aging counter is
initialized to AGE_CNT when a MAC address is learned, so with
AGE_CNT=0 new entries start with a counter value of 0, which the
hardware treats as "already aged" and never removes, effectively
disabling aging.
Fix this by starting the search from tmp_age_count=1 to ensure
entries always have a non-zero initial aging counter. For a
10-second ageing time this yields AGE_CNT=1 and AGE_UNIT=4 instead:
the timer ticks every 5 seconds and entries are removed after 2
ticks.
Starting the search at AGE_CNT=1 raises the minimum representable
ageing time from 1 to 2 seconds. Without bounds, a stale ageing_time
of 1 second would now make the loop fall through without setting
age_count and age_unit, leaving them uninitialized when written to
the MT7530_AAC hardware register. Set ds->ageing_time_min and
ds->ageing_time_max so the DSA core validates the range before the
callback is invoked, and drop the now-redundant range check from
mt7530_set_ageing_time().
Fixes: ea6d5c924e39 ("net: dsa: mt7530: support setting ageing time")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/7788ded12dc07b1bce329ec35fa70f4b45f3f9b7.1778766629.git.daniel@makrotopia.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Golle <daniel@makrotopia.org>
Date: Thu May 14 15:04:35 2026 +0100
net: dsa: mt7530: preserve VLAN tags on trapped link-local frames
[ Upstream commit 3ac85bcfd404b588298c95c6fba8aad4ad334f57 ]
The BPC, RGAC1 and RGAC2 registers control the handling of link-local
frames with reserved MAC DAs (01:80:C2:00:00:0x). These frames are
correctly trapped to the CPU port, but the egress VLAN tag attribute was
set to MT7530_VLAN_EG_UNTAGGED which causes the switch to strip any
VLAN tags from trapped frames before they reach the CPU.
This causes VLAN-tagged link-local frames (STP BPDUs, LLDP, PTP Peer
Delay Requests) to arrive at the CPU without their VLAN tag, so they
are delivered to the base network interface instead of the VLAN
sub-interface. The DSA local_termination selftest confirms this: all
link-local protocol tests on VLAN upper interfaces fail.
Set the EG_TAG attribute to MT7530_VLAN_EG_DISABLED (system default)
so that the switch does not modify VLAN tags in trapped frames. This
way VLAN-tagged frames retain their original tag and are delivered to
the correct VLAN sub-interface, matching the behavior of non-trapped
frames which pass through without VLAN tag modification.
Fixes: 69ddba9d170b ("net: dsa: mt7530: fix handling of all link-local frames")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Chester A. Unal <chester.a.unal@arinc9.com>
Link: https://patch.msgid.link/891e0cd34db2a5fe20ceb73283a81fb5f71427ca.1778766629.git.daniel@makrotopia.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Linus Walleij <linusw@kernel.org>
Date: Sat May 9 00:13:38 2026 +0200
net: ethernet: cortina: Carry over frag counter
[ Upstream commit ebd8ec2b309e3a447851b456ccaf8fb39f3661e7 ]
The gmac_rx() NAPI poll function assembles packets in an
SKB from a ring buffer.
If the ring buffer gets completely emptied during a poll cycle,
we exit gmac_rx(), but the packet is not yet completely
assembled in the SKB, yet the fragment counter frag_nr is
reset to zero on the next invocation.
Solve this by making the RX fragment counter a part of the
port struct, and carry it over between invocations.
Reset the fragment counter only right after calling
napi_gro_frags(), on error (after calling napi_free_frags())
or if stopping the port.
Reset it in some place where not strictly necessary just to
emphasize what is going on.
This was found by Sashiko during normal patch review.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Link: https://sashiko.dev/#/patchset/20260505-gemini-ethernet-fix-v2-1-997c31d06079%40kernel.org
Signed-off-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260509-gemini-ethernet-fixes-v1-3-6c5d20ddc35b@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andreas Haarmann-Thiemann <eitschman@nebelreich.de>
Date: Tue May 5 23:52:17 2026 +0200
net: ethernet: cortina: Drop half-assembled SKB
[ Upstream commit b266bacba796ff5c4dcd2ae2fc08aacf7ab39153 ]
In gmac_rx() (drivers/net/ethernet/cortina/gemini.c), when
gmac_get_queue_page() returns NULL for the second page of a multi-page
fragment, the driver logs an error and continues — but does not free the
partially assembled skb that was being assembled via napi_build_skb() /
napi_get_frags().
Free the in-progress partially assembled skb via napi_free_frags()
and increase the number of dropped frames appropriately
and assign the skb pointer NULL to make sure it is not lingering
around, matching the pattern already used elsewhere in the driver.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Signed-off-by: Andreas Haarmann-Thiemann <eitschman@nebelreich.de>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260505-gemini-ethernet-fix-v2-1-997c31d06079@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Linus Walleij <linusw@kernel.org>
Date: Sat May 9 00:13:37 2026 +0200
net: ethernet: cortina: Make RX SKB per-port
[ Upstream commit 06937db21ee311ed07eba47954447245041a982d ]
The SKB used to assemble packets from fragments in gmac_rx()
is static local, but the Gemini has two ethernet ports, meaning
there can be races between the ports on a bad day if a device
is using both.
Make the RX SKB a per-port variable and carry it over between
invocations in the port struct instead.
Zero the pointer once we call napi_gro_frags(), on error (after
calling napi_free_frags()) or if the port is stopped.
Zero it in some place where not strictly necessary just to
emphasize what is going on.
This was found by Sashiko during normal patch review.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Link: https://sashiko.dev/#/patchset/20260505-gemini-ethernet-fix-v2-1-997c31d06079%40kernel.org
Signed-off-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260509-gemini-ethernet-fixes-v1-2-6c5d20ddc35b@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Date: Fri May 8 19:37:28 2026 -0700
net: ethernet: cs89x0: remove stale CONFIG_MACH_MX31ADS reference
[ Upstream commit 36a8d04a8293afcb9304cf0cd3741f67698f2a1a ]
The legacy ARM board file for MACH_MX31ADS was removed in commit
c93197b0041d ("ARM: imx: Remove i.MX31 board files"), but a reference
to it remained in the cs89x0 driver. Drop this unused code.
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Fixes: c93197b0041d ("ARM: imx: Remove i.MX31 board files")
Link: https://patch.msgid.link/20260509023732.42256-1-enelsonmoore@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sabrina Dubroca <sd@queasysnail.net>
Date: Wed May 20 22:44:42 2026 +0200
net: gro: don't merge zcopy skbs
[ Upstream commit 4db79a322db8c97f7b73b8a347395ef4d685eb40 ]
skb_gro_receive() can currently copy frags between the source and GRO
skb, without checking the zerocopy status, and in particular the
SKBFL_MANAGED_FRAG_REFS flag.
When SKBFL_MANAGED_FRAG_REFS is set, the skb doesn't hold a reference
on the pages in shinfo->frags. Appending those frags to another skb's
frags without fixing up the page refcount can lead to UAF.
When either the last skb in the GRO chain (the one we would append
frags to) or the source skb is zerocopy, don't merge the skbs.
Fixes: 753f1ca4e1e5 ("net: introduce managed frags infrastructure")
Reported-by: Huzaifa Sidhpurwala <huzaifas@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/c3b7f906bbfcbdfd7b4fa9d6c18a438870df85be.1779307748.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Wed May 13 21:37:39 2026 -0400
net: ifb: report ethtool stats over num_tx_queues
commit 5db89c99566fc4728cc92e941d8e1975711e24b5 upstream.
ifb_dev_init() allocates dp->tx_private to dev->num_tx_queues
entries via kzalloc_objs(*txp, dev->num_tx_queues). Both IFB
per-queue RX and TX stats live in those entries: ifb_xmit() updates
txp->rx_stats using the skb queue mapping, ifb_ri_tasklet() updates
txp->tx_stats, and ifb_stats64() aggregates both over
dev->num_tx_queues.
The ethtool stats callbacks instead size and walk the per-queue
stats with dev->real_num_rx_queues and dev->real_num_tx_queues. With
an asymmetric device where the RX queue count exceeds the TX queue
count, for example:
ip link add name ifb10 numtxqueues 1 numrxqueues 8 type ifb
ethtool -S ifb10
ifb_get_ethtool_stats() indexes past the tx_private allocation and
copies adjacent slab data through ETHTOOL_GSTATS.
Use dev->num_tx_queues consistently for the stats strings, the
stats count, and the stats data walks. This reports one RX stats
group and one TX stats group for each backing ifb_q_private entry,
which is the queue set IFB can actually populate.
Reproduced under UML+KASAN at v7.1-rc2:
BUG: KASAN: slab-out-of-bounds in ifb_fill_stats_data+0x3c/0xae
Read of size 8 at addr 0000000062dbd228 by task ethtool/36
ifb_fill_stats_data+0x3c/0xae
ifb_get_ethtool_stats+0xc0/0x129
__dev_ethtool+0x1ca5/0x363c
dev_ethtool+0x123/0x1b3
dev_ioctl+0x56c/0x744
sock_do_ioctl+0x15f/0x1b2
sock_ioctl+0x4d5/0x50a
sys_ioctl+0xd8b/0xde9
With the patch applied, the same UML+KASAN repro is silent and
ethtool -S ifb10 reports only the stats backed by the single
allocated tx_private entry.
Fixes: a21ee5b2fcb8 ("net: ifb: support ethtools stats")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260514013739.3549624-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Myeonghun Pak <mhun512@gmail.com>
Date: Wed May 6 21:43:11 2026 +0900
net: lan966x: avoid unregistering netdev on register failure
[ Upstream commit c4f3d6eb1fcf6cd9ce4644f604d5aad1ce594dfc ]
lan966x_probe_port() stores the newly allocated net_device in the
port before calling register_netdev(). If register_netdev() fails,
the probe error path calls lan966x_cleanup_ports(), which sees
port->dev and calls unregister_netdev() for a device that was never
registered.
Destroy the phylink instance created for this port and clear port->dev
before returning the registration error. The common cleanup path now skips
ports without port->dev before reaching the registered netdev cleanup, so
it only handles ports that reached the registered-netdev lifetime.
This also avoids treating an uninitialized FDMA netdev and the failed port
as a NULL == NULL match in the common cleanup path.
Fixes: d28d6d2e37d1 ("net: lan966x: add port module support")
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Link: https://patch.msgid.link/20260506124331.31945-1-mhun512@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Date: Thu May 14 12:41:51 2026 -0700
net: mana: Fix TOCTOU double-fetch of hwc_msg_id from DMA buffer
[ Upstream commit 35f0f0a2536a4d604b4dbad92c85c4a8fdebb870 ]
In mana_hwc_rx_event_handler(), resp->response.hwc_msg_id is read from
DMA-coherent memory and bounds-checked, then mana_hwc_handle_resp()
re-reads the same field from the same DMA buffer for test_bit() and
pointer arithmetic.
DMA-coherent memory is mapped uncacheable on x86 and is shared,
unencrypted, in Confidential VMs (SEV-SNP/TDX), so each load goes
directly to host-visible memory. A H/W can modify the value
between the check and the use, bypassing the bounds validation.
Fix this by reading hwc_msg_id exactly once using READ_ONCE() into a
stack-local variable in mana_hwc_rx_event_handler(), and passing the
validated value as a parameter to mana_hwc_handle_resp().
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Link: https://patch.msgid.link/20260514194156.466823-1-ernis@linux.microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Aditya Garg <gargaditya@linux.microsoft.com>
Date: Tue May 19 22:15:53 2026 -0700
net: mana: validate rx_req_idx to prevent out-of-bounds array access
[ Upstream commit b809d0409991b75a6cff846a5ac27c3062953f84 ]
In mana_hwc_rx_event_handler(), rx_req_idx is derived from
sge->address in DMA-coherent memory. In Confidential VMs
(SEV-SNP/TDX), this memory is shared unencrypted and HW can modify
WQE contents at any time. No bounds check exists on rx_req_idx,
which can lead to an out-of-bounds access into reqs[].
Add bounds check on rx_req_idx in mana_hwc_rx_event_handler() before
using it to index the reqs[] array.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260520051553.857120-1-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sven Schuchmann <schuchmann@schleissheimer.de>
Date: Tue May 12 09:19:47 2026 +0200
net: phy: DP83TC811: add reading of abilities
[ Upstream commit c78bdba7b9666020c0832150a4fc4c0aebc7c6ac ]
At this time the driver is not listing any speeds
it supports. This should be ETHTOOL_LINK_MODE_100baseT1_Full_BIT
for DP83TC811. Add the missing call for phylib to read the abilities.
Fixes: b753a9faaf9a ("net: phy: DP83TC811: Introduce support for the DP83TC811 phy")
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Sven Schuchmann <schuchmann@schleissheimer.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260512071949.6218-1-schuchmann@schleissheimer.de
[pabeni@redhat.com: dropped revision history]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonas Jelonek <jelonek.jonas@gmail.com>
Date: Fri May 15 14:31:03 2026 +0000
net: pse-pd: fix sign on -ENOENT check in of_load_pse_pis()
commit 33d35975cbead3fa6b738ee57e5e45e14fbe0886 upstream.
of_count_phandle_with_args() returns the count on success and a negative
errno on failure, including -ENOENT when the "pairsets" property is
absent. The existing comparison in of_load_pse_pis() checks against
ENOENT (positive 2) instead of -ENOENT, so the branch is taken for any
error return: legitimate DTs that omit "pairsets" trigger a spurious
"wrong number of pairsets" error and probe fails with -EINVAL.
Compare against -ENOENT so a missing "pairsets" property is correctly
treated as "this PI has no pairsets, continue".
Fixes: 9be9567a7c59 ("net: pse-pd: Add support for PSE PIs")
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20260515143103.1721888-1-jelonek.jonas@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Mon May 11 10:49:17 2026 -0700
net: tls: fix off-by-one in sg_chain entry count for wrapped sk_msg ring
[ Upstream commit 285943c6e7ca309bbea84b253745154241d9788a ]
When an sk_msg scatterlist ring wraps (sg.end < sg.start),
tls_push_record() chains the tail portion of the ring to the head
using sg_chain(). An extra entry in the sg array is reserved for
this:
struct sk_msg_sg {
[...]
/* The extra two elements:
* 1) used for chaining the front and sections when the list becomes
* partitioned (e.g. end < start). The crypto APIs require the
* chaining;
* 2) to chain tailer SG entries after the message.
*/
struct scatterlist data[MAX_MSG_FRAGS + 2];
The current code uses MAX_SKB_FRAGS + 1 as the ring size:
sg_chain(&msg_pl->sg.data[msg_pl->sg.start],
MAX_SKB_FRAGS - msg_pl->sg.start + 1,
msg_pl->sg.data);
This places the chain pointer at
sg_chain(data[start], (MAX_SKB_FRAGS - msg_start + 1) .. =
&data[start] + (MAX_SKB_FRAGS - msg_start + 1) - 1 =
data[start + (MAX_SKB_FRAGS - start + 1) - 1] =
data[MAX_SKB_FRAGS]
instead of the true last entry. This is likely due to a "race" of
the commit under Fixes landing close to
commit 031097d9e079 ("bpf: sk_msg, zap ingress queue on psock down")
Convert to ARRAY_SIZE and drop the data[start] / - start (as suggested
by Sabrina).
Reported-by: 钱一铭 <yimingqian591@gmail.com>
Fixes: 9aaaa56845a0 ("bpf: Sockmap/tls, skmsg can have wrapped skmsg that needs extra chaining")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20260511174920.433155-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Mon May 11 10:49:18 2026 -0700
net: tls: prevent chain-after-chain in plain text SG
[ Upstream commit ff26a0e8377dec07e4a7230db7675bed1b9a6d03 ]
Sashiko points out that if end = 0 (start != 0) the current
code will create a chain link to content type right after
the wrap link:
This would create a chain where the wrap link points directly
to another chain link. The scatterlist API sg_next iterator
does not recursively resolve consecutive chain links.
meaning this is illegal input to crypto.
The wrapping link is unnecessary if end = 0. end is the entry after
the last one used so end = 0 means there's nothing pushed after
the wrap:
end start i
v v v
[ ]...[ ][ d ][ d ][ d ][ d ][rsv for wrap]
Skip the wrapping in this case.
TLS 1.3 can use the "wrapping slot" for it's chaining if end = 0.
This avoids the chain-after-chain.
Move the wrap chaining before marking END and chaining off content
type, that feels like more logical ordering to me, but should not
matter from functional perspective.
Reported-by: Sashiko <sashiko-bot@kernel.org>
Fixes: 9aaaa56845a0 ("bpf: Sockmap/tls, skmsg can have wrapped skmsg that needs extra chaining")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20260511174920.433155-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Date: Tue May 19 11:57:39 2026 +0530
net: wwan: iosm: fix potential memory leaks in ipc_imem_init()
commit c5d93b2c40355e999715262a824965aac025a427 upstream.
The memory allocated in ipc_protocol_init() is not freed on the error
paths that follow in ipc_imem_init(). Fix that by calling the
corresponding release function ipc_protocol_deinit() in the error path.
Fixes: 3670970dd8c6 ("net: iosm: shared memory IPC interface")
Cc: stable@vger.kernel.org
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Link: https://patch.msgid.link/20260519062815.55545-1-nihaal@cse.iitm.ac.in
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zhengchuan Liang <zcliangcn@gmail.com>
Date: Wed May 13 15:57:17 2026 +0800
netfilter: ip6t_hbh: reject oversized option lists
commit 4322dcde6b4173c2d8e8e6118ed290794263bcc8 upstream.
struct ip6t_opts stores at most IP6T_OPTS_OPTSNR option descriptors,
but hbh_mt6_check() does not reject larger optsnr values supplied from
userspace.
Validate optsnr in the rule setup path so only match data that fits the
fixed-size opts array can be installed. This follows the existing xtables
pattern of rejecting invalid user-provided counts in checkentry() and
keeps the packet matching path unchanged.
`struct ip6t_opts` has a fixed `opts[IP6T_OPTS_OPTSNR]` array,
where `IP6T_OPTS_OPTSNR` is 16, then off-by-one array access is possible:
[ 137.924693][ T8692] UBSAN: array-index-out-of-bounds in ../net/ipv6/netfilter/ip6t_hbh.c:110:29
[ 137.926167][ T8692] index 16 is out of range for type '__u16 [16]'
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nan Li <tonanli66@gmail.com>
Date: Tue May 12 16:50:01 2026 +0800
netfilter: ipset: stop hash:* range iteration at end
commit 0d3a282ab5f165fc207ff49ea5b6ad8f54616bd6 upstream.
The following hash set variants:
hash:ip,mark
hash:ip,port
hash:ip,port,ip
hash:ip,port,net
iterate IPv4 ranges with a 32-bit iterator.
The iterator must stop once the last address in the requested range has
been processed. Advancing it once more can move the traversal state past
the end of the request, so a later retry may continue from an unintended
position.
Handle the iterator increment explicitly at the end of the loop and stop
once the upper bound has been processed. This keeps the existing retry
behaviour intact for valid ranges while preventing traversal from
continuing past the original boundary.
Fixes: 48596a8ddc46 ("netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Nan Li <tonanli66@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Haoze Xie <royenheart@gmail.com>
Date: Fri May 15 11:19:02 2026 +0800
netfilter: nf_queue: hold bridge skb->dev while queued
commit e196115ec330a18de415bdb9f5071aa9f08e53ce upstream.
br_pass_frame_up() rewrites skb->dev from the ingress port to the bridge
master before queueing bridge LOCAL_IN packets. NFQUEUE only holds
references on state.in/out and bridge physdevs, so a queued bridge
packet can retain a freed bridge master in skb->dev until reinjection.
When the verdict is reinjected later, br_netif_receive_skb() re-enters
the receive path with skb->dev still pointing at the freed bridge master,
triggering a use-after-free.
Store skb->dev in the queue entry, hold a reference on it for the queue
lifetime, and use the saved device when dropping queued packets during
NETDEV_DOWN handling.
Fixes: ac2863445686 ("netfilter: bridge: add nf_afinfo to enable queuing to userspace")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Date: Tue May 12 01:30:41 2026 +0800
netfilter: nft_inner: Fix IPv6 inner_thoff desync
commit b6a91f68ebfed9c38e0e9150f58a9b85da07181c upstream.
In nft_inner_parse_l2l3(), when processing inner IPv6 packets,
ipv6_find_hdr() correctly computes the transport header offset
traversing all extension headers, but the result is immediately
overwritten with nhoff + sizeof(_ip6h) (40 bytes), which only
accounts for the IPv6 base header. This creates a desync between
inner_thoff (wrong — points to extension header start) and l4proto
(correct — e.g., IPPROTO_TCP), enabling transport header forgery
and potential firewall bypass. This issue affects stable versions
from Linux 6.2.
For comparison, the normal (non-inner) IPv6 path correctly
preserves ipv6_find_hdr()'s result. Removing the incorrect overwrite
ensures that ipv6_find_hdr()'s calculated transport header offset is
preserved, thereby fixing the desynchronization.
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Cc: stable@vger.kernel.org
Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn>
Reported-by: Xuewei Feng <fengxw06@126.com>
Reported-by: Qi Li <qli01@tsinghua.edu.cn>
Reported-by: Ke Xu <xuke@tsinghua.edu.cn>
Assisted-by: GLM:5.1 Z.ai
Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Florian Westphal <fw@strlen.de>
Date: Wed May 6 12:07:16 2026 +0200
netfilter: x_tables: unregister the templates first
[ Upstream commit d338693d778579b676a61346849bebd892427158 ]
When the module is going away we need to zap the template
first. Else there is a small race window where userspace
could instantiate a new table after the pernet exit function
has removed the current table.
Fixes: fdacd57c79b7 ("netfilter: x_tables: never register tables by default")
Reported-by: Tristan Madani <tristan@talencesecurity.com>
Reviewed-by: Tristan Madani <tristan@talencesecurity.com>
Closes: https://lore.kernel.org/netfilter-devel/20260429175613.1459342-1-tristmd@gmail.com/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:49 2026 +0100
netfs: Defer the emission of trace_netfs_folio()
[ Upstream commit daeb443b92817021c1234e8eded219e164b7c35d ]
Change netfs_perform_write() to keep the netfs_folio trace value in a
variable and emit it later to make it easier to choose the value displayed.
This is a prerequisite for a subsequent patch.
Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-13-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 7b4dcf1b9455 ("netfs: Fix streaming write being overwritten")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date: Sat Oct 5 19:23:04 2024 +0100
netfs: Fix a few minor bugs in netfs_page_mkwrite()
[ Upstream commit c6a90fe7f080d71271b723490454cfda1f81e4b0 ]
We can't return with VM_FAULT_SIGBUS | VM_FAULT_LOCKED; the core
code will not unlock the folio in this instance. Introduce a new
"unlock" error exit to handle this case. Use it to handle
the "folio is truncated" check, and change the "writeback interrupted
by a fatal signal" to do a NOPAGE exit instead of letting the core
code install the folio currently under writeback before killing the
process.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20241005182307.3190401-3-willy@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: ccde2ac757c7 ("netfs: Fix folio->private handling in netfs_perform_write()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:54 2026 +0100
netfs: Fix early put of sink folio in netfs_read_gaps()
[ Upstream commit 3e5dd91b87a8b1450217b56a336bee315f40da7d ]
Fix netfs_read_gaps() to release the sink page it uses after waiting for
the request to complete. The way the sink page is used is that an
ITER_BVEC-class iterator is created that has the gaps from the target folio
at either end, but has the sink page tiled over the middle so that a single
read op can fill in both gaps.
The bug was found by KASAN detecting a UAF on the generic/075 xfstest in
the cifsd kernel thread that handles reception of data from the TCP socket:
BUG: KASAN: use-after-free in _copy_to_iter+0x48a/0xa20
Write of size 885 at addr ffff888107f92000 by task cifsd/1285
CPU: 2 UID: 0 PID: 1285 Comm: cifsd Not tainted 7.0.0 #6 PREEMPT(lazy)
Call Trace:
dump_stack_lvl+0x5d/0x80
print_report+0x17f/0x4f1
kasan_report+0x100/0x1e0
kasan_check_range+0x10f/0x1e0
__asan_memcpy+0x3c/0x60
_copy_to_iter+0x48a/0xa20
__skb_datagram_iter+0x2c9/0x430
skb_copy_datagram_iter+0x6e/0x160
tcp_recvmsg_locked+0xce0/0x1130
tcp_recvmsg+0xeb/0x300
inet_recvmsg+0xcf/0x3a0
sock_recvmsg+0xea/0x100
cifs_readv_from_socket+0x3a6/0x4d0 [cifs]
cifs_read_iter_from_socket+0xdd/0x130 [cifs]
cifs_readv_receive+0xaad/0xb10 [cifs]
cifs_demultiplex_thread+0x1148/0x1740 [cifs]
kthread+0x1cf/0x210
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: Steve French <sfrench@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-18-dhowells@redhat.com
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:58 2026 +0100
netfs: Fix folio->private handling in netfs_perform_write()
[ Upstream commit ccde2ac757c713535b224233a296de40efe5212d ]
Under some circumstances, netfs_perform_write() doesn't correctly
manipulate folio->private between NULL, NETFS_FOLIO_COPY_TO_CACHE, pointing
to a group and pointing to a netfs_folio struct, leading to potential
multiple attachments of private data with associated folio ref leaks and
also leaks of netfs_folio structs or netfs_group refs.
Fix this by consolidating the place at which a folio is marked uptodate in
one place and having that look at what's attached to folio->private and
decide how to clean it up and then set the new group. Also, the content
shouldn't be flushed if group is NULL, even if a group is specified in the
netfs_group parameter, as that would be the case for a new folio. A
filesystem should always specify netfs_group or never specify netfs_group.
The Sashiko auto-review tool noted that it was theoretically possible that
the fpos >= ctx->zero_point section might leak if it modified a streaming
write folio. This is unlikely, but with a network filesystem, third party
changes can happen. It also pointed out that __netfs_set_group() would
leak if called multiple times on the same folio from the "whole folio
modify section".
Fixes: 8f52de0077ba ("netfs: Reduce number of conditional branches in netfs_perform_write()")
Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-22-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:48 2026 +0100
netfs: Fix netfs_invalidate_folio() to clear dirty bit if all changes gone
[ Upstream commit 156ac2ec2ee77c44c4eb7439d6d165247ba12247 ]
If a streaming write is made, this will leave the relevant modified folio
in a not-uptodate, but dirty state with a netfs_folio struct hung off of
folio->private indicating the dirty range. Subsequently truncating the
file such that the dirty data in the folio is removed, but the first part
of the folio theoretically remains will cause the netfs_folio struct to be
discarded... but will leave the dirty flag set.
If the folio is then read via mmap(), netfs_read_folio() will see that the
page is dirty and jump to netfs_read_gaps() to fill in the missing bits.
netfs_read_gaps(), however, expects there to be a netfs_folio struct
present and can oops because truncate removed it.
Fix this by calling folio_cancel_dirty() in netfs_invalidate_folio() in the
event that all the dirty data in the folio is erased (as nfs does).
Also add some tracepoints to log modifications to a dirty page.
This can be reproduced with something like:
dd if=/dev/zero of=/xfstest.test/foo bs=1M count=1
umount /xfstest.test
mount /xfstest.test
xfs_io -c "w 0xbbbf 0xf96c" \
-c "truncate 0xbbbf" \
-c "mmap -r 0xb000 0x11000" \
-c "mr 0xb000 0x11000" \
/xfstest.test/foo
with fscaching disabled (otherwise streaming writes are suppressed) and a
change to netfs_perform_write() to disallow streaming writes if the fd is
open O_RDWR:
if (//(file->f_mode & FMODE_READ) || <--- comment this out
netfs_is_cache_enabled(ctx)) {
It should be reproducible even without this change, but if prevents the
above trivial xfs_io command from reproducing it.
Note that the initial dd is important: the file must start out sufficiently
large that the zero-point logic doesn't just clear the gaps because it
knows there's nothing in the file to read yet. Unmounting and mounting is
needed to clear the pagecache (there are other ways to do that that may
also work).
This was initially reproduced with the generic/522 xfstest on some patches
that remove the FMODE_READ restriction.
Fixes: 9ebff83e6481 ("netfs: Prep to use folio->private for write grouping and streaming write")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-12-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:47 2026 +0100
netfs: Fix overrun check in netfs_extract_user_iter()
[ Upstream commit 0ef37eef83fad3542ee06db2940433ae1a92b39d ]
Fix netfs_extract_user_iter() so that if iov_iter_extract_pages() overfills
pages[], then those pages don't get included in the iterator constructed at
the end of the function. If there was an overfill, memory corruption has
already happened.
Fixes: 85dd2c8ff368 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator")
Closes: https://sashiko.dev/#/patchset/20260427154639.180684-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-11-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:57 2026 +0100
netfs: Fix partial invalidation of streaming-write folio
[ Upstream commit 6d91acc7fb85d33ea58fca9b964a32a453937f4b ]
In netfs_invalidate_folio(), if the region of a partial invalidation
overlaps the front (but not all) of a dirty write cached in a streaming
write page (dirty, but not uptodate, with the dirty region tracked by a
netfs_folio struct), the function modifies the dirty region - but
incorrectly as it moves the region forward by setting the start to the
start, not the end, of the invalidation region.
Fix this by setting finfo->dirty_offset to the end of the invalidation
region (iend).
Fixes: cce6bfa6ca0e ("netfs: Fix trimming of streaming-write folios in netfs_inval_folio()")
Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-21-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:51 2026 +0100
netfs: Fix potential deadlock in write-through mode
[ Upstream commit b6a4ae1634b3ad2aaa05222e53d36da532852faf ]
Fix netfs_advance_writethrough() to always unlock the supplied folio and to
mark it dirty if it isn't yet written to the end. Unfortunately, it can't
be marked for writeback until the folio is done with as that may cause a
deadlock against mmapped reads and writes.
Even though it has been marked dirty, premature writeback can't occur as
the caller is holding both inode->i_rwsem (which will prevent concurrent
truncation, fallocation, DIO and other writes) and ictx->wb_lock (which
will cause flushing to wait and writeback to skip or wait).
Note that this may be easier to deal with once the queuing of folios is
split from the generation of subrequests.
Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Closes: https://sashiko.dev/#/patchset/20260427154639.180684-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-15-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:50 2026 +0100
netfs: Fix streaming write being overwritten
[ Upstream commit 7b4dcf1b9455a6e52ac7478b4057dbe10359576d ]
In order to avoid reading whilst writing, netfslib will allow "streaming
writes" in which dirty data is stored directly into folios without reading
them first. Such folios are marked dirty but may not be marked uptodate.
If a folio is entirely written by a streaming write, uptodate will be set,
otherwise it will have a netfs_folio struct attached to ->private recording
the dirty region.
In the event that a partially written streaming write page is to be
overwritten entirely by a single write(), netfs_perform_write() will try to
copy over it, but doesn't discard the netfs_folio if it succeeds; further,
it doesn't correctly handle a partial copy that overwrites some of the
dirty data.
Fix this by the following:
(1) If the folio is successfully overwritten, free the netfs_folio struct
before marking the page uptodate.
(2) If the copy to the folio partially fails, but short of the dirty data,
just ignore the copy.
(3) If the copy partially fails and overwrites some of the dirty data,
accept the copy, update the netfs_folio struct to record the new data.
If the folio is now filled, free the netfs_folio and set uptodate,
otherwise return a partial write.
Found with:
fsx -q -N 1000000 -p 10000 -o 128000 -l 600000 \
/xfstest.test/junk --replay-ops=junk.fsxops
using the following as junk.fsxops:
truncate 0x0 0 0x927c0
write 0x63fb8 0x53c8 0
copy_range 0xb704 0x19b9 0x24429 0x79380
write 0x2402b 0x144a2 0x90660 *
write 0x204d5 0x140a0 0x927c0 *
copy_range 0x1f72c 0x137d0 0x7a906 0x927c0 *
read 0x00000 0x20000 0x9157c
read 0x20000 0x20000 0x9157c
read 0x40000 0x20000 0x9157c
read 0x60000 0x20000 0x9157c
read 0x7e1a0 0xcfb9 0x9157c
on cifs with the default cache option.
It shows folio 0x24 misbehaving if the FMODE_READ check is commented out in
netfs_perform_write():
if (//(file->f_mode & FMODE_READ) ||
netfs_is_cache_enabled(ctx)) {
and no fscache. This was initially found with the generic/522 xfstest.
Fixes: 8f52de0077ba ("netfs: Reduce number of conditional branches in netfs_perform_write()")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-14-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Date: Tue May 12 13:33:44 2026 +0100
netfs: fix VM_BUG_ON_FOLIO() issue in netfs_write_begin() call
[ Upstream commit dc7832d05deb4d632e8035e3299e31a3528fa0d0 ]
The multiple runs of generic/013 test-case is capable
to reproduce a kernel BUG at mm/filemap.c:1504 with
probability of 30%.
while true; do
sudo ./check generic/013
done
[ 9849.452376] page: refcount:3 mapcount:0 mapping:00000000e58ff252 index:0x10781 pfn:0x1c322
[ 9849.452412] memcg:ffff8881a1915800
[ 9849.452417] aops:ceph_aops ino:1000058db9e dentry name(?):"f9XXXXXX"
[ 9849.452432] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 9849.452441] raw: 0017ffffc0000000 0000000000000000 dead000000000122 ffff88816110d248
[ 9849.452445] raw: 0000000000010781 0000000000000000 00000003ffffffff ffff8881a1915800
[ 9849.452447] page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
[ 9849.452474] ------------[ cut here ]------------
[ 9849.452476] kernel BUG at mm/filemap.c:1504!
[ 9849.478635] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
[ 9849.481772] CPU: 2 UID: 0 PID: 84223 Comm: fsstress Not tainted 7.0.0-rc1+ #18 PREEMPT(full)
[ 9849.482881] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-9.fc43 06/1
0/2025
[ 9849.484539] RIP: 0010:folio_unlock+0x85/0xa0
[ 9849.485076] Code: 89 df 31 f6 e8 1c f3 ff ff 48 8b 5d f8 c9 31 c0 31 d2 31 f6 31 ff c3 cc
cc cc cc 48 c7 c6 80 6c d9 a7 48 89 df e8 4b b3 10 00 <0f> 0b 48 89 df e8 21 e6 2c 00 eb 9d 0f 1f 40 00 66 66 2e 0f 1f 84
[ 9849.493818] RSP: 0018:ffff8881bb8076b0 EFLAGS: 00010246
[ 9849.495740] RAX: 0000000000000000 RBX: ffffea00070c8980 RCX: 0000000000000000
[ 9849.498678] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 9849.500559] RBP: ffff8881bb8076b8 R08: 0000000000000000 R09: 0000000000000000
[ 9849.501097] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000010782000
[ 9849.502108] R13: ffff8881935de738 R14: ffff88816110d010 R15: 0000000000001000
[ 9849.502516] FS: 00007e36cbe94740(0000) GS:ffff88824a899000(0000) knlGS:0000000000000000
[ 9849.502996] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9849.503810] CR2: 000000c0002b0000 CR3: 000000011bbf6004 CR4: 0000000000772ef0
[ 9849.504459] PKRU: 55555554
[ 9849.504626] Call Trace:
[ 9849.505242] <TASK>
[ 9849.505379] netfs_write_begin+0x7c8/0x10a0
[ 9849.505877] ? __kasan_check_read+0x11/0x20
[ 9849.506384] ? __pfx_netfs_write_begin+0x10/0x10
[ 9849.507178] ceph_write_begin+0x8c/0x1c0
[ 9849.507934] generic_perform_write+0x391/0x8f0
[ 9849.508503] ? __pfx_generic_perform_write+0x10/0x10
[ 9849.509062] ? file_update_time_flags+0x19a/0x4b0
[ 9849.509581] ? ceph_get_caps+0x63/0xf0
[ 9849.510259] ? ceph_get_caps+0x63/0xf0
[ 9849.510530] ceph_write_iter+0xe79/0x1ae0
[ 9849.511282] ? __pfx_ceph_write_iter+0x10/0x10
[ 9849.511839] ? lock_acquire+0x1ad/0x310
[ 9849.512334] ? ksys_write+0xf9/0x230
[ 9849.512582] ? lock_is_held_type+0xaa/0x140
[ 9849.513128] vfs_write+0x512/0x1110
[ 9849.513634] ? __fget_files+0x33/0x350
[ 9849.513893] ? __pfx_vfs_write+0x10/0x10
[ 9849.514143] ? mutex_lock_nested+0x1b/0x30
[ 9849.514394] ksys_write+0xf9/0x230
[ 9849.514621] ? __pfx_ksys_write+0x10/0x10
[ 9849.514887] ? do_syscall_64+0x25e/0x1520
[ 9849.515122] ? __kasan_check_read+0x11/0x20
[ 9849.515366] ? trace_hardirqs_on_prepare+0x178/0x1c0
[ 9849.515655] __x64_sys_write+0x72/0xd0
[ 9849.515885] ? trace_hardirqs_on+0x24/0x1c0
[ 9849.516130] x64_sys_call+0x22f/0x2390
[ 9849.516341] do_syscall_64+0x12b/0x1520
[ 9849.516545] ? do_syscall_64+0x27c/0x1520
[ 9849.516783] ? do_syscall_64+0x27c/0x1520
[ 9849.517003] ? lock_release+0x318/0x480
[ 9849.517220] ? __x64_sys_io_getevents+0x143/0x2d0
[ 9849.517479] ? percpu_ref_put_many.constprop.0+0x8f/0x210
[ 9849.517779] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 9849.518073] ? do_syscall_64+0x25e/0x1520
[ 9849.518291] ? __kasan_check_read+0x11/0x20
[ 9849.518519] ? trace_hardirqs_on_prepare+0x178/0x1c0
[ 9849.518799] ? do_syscall_64+0x27c/0x1520
[ 9849.519024] ? local_clock_noinstr+0xf/0x120
[ 9849.519262] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 9849.519544] ? do_syscall_64+0x25e/0x1520
[ 9849.519781] ? __kasan_check_read+0x11/0x20
[ 9849.520008] ? trace_hardirqs_on_prepare+0x178/0x1c0
[ 9849.520273] ? do_syscall_64+0x27c/0x1520
[ 9849.520491] ? trace_hardirqs_on_prepare+0x178/0x1c0
[ 9849.520767] ? irqentry_exit+0x10c/0x6c0
[ 9849.520984] ? trace_hardirqs_off+0x86/0x1b0
[ 9849.521224] ? exc_page_fault+0xab/0x130
[ 9849.521472] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 9849.521766] RIP: 0033:0x7e36cbd14907
[ 9849.521989] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 9849.523057] RSP: 002b:00007ffff2d2a968 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 9849.523484] RAX: ffffffffffffffda RBX: 000000000000e549 RCX: 00007e36cbd14907
[ 9849.523885] RDX: 000000000000e549 RSI: 00005bd797ec6370 RDI: 0000000000000004
[ 9849.524277] RBP: 0000000000000004 R08: 0000000000000047 R09: 00005bd797ec6370
[ 9849.524652] R10: 0000000000000078 R11: 0000000000000246 R12: 0000000000000049
[ 9849.525062] R13: 0000000010781a37 R14: 00005bd797ec6370 R15: 0000000000000000
[ 9849.525447] </TASK>
[ 9849.525574] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_pmc_ssram_telemetry intel_vsec kvm_intel joydev kvm irqbypass ghash_clmulni_intel aesni_intel input_leds rapl mac_hid psmouse vga16fb serio_raw vgastate floppy i2c_piix4 bochs qemu_fw_cfg i2c_smbus pata_acpi sch_fq_codel rbd msr parport_pc ppdev lp parport efi_pstore
[ 9849.529150] ---[ end trace 0000000000000000 ]---
[ 9849.529502] RIP: 0010:folio_unlock+0x85/0xa0
[ 9849.530813] Code: 89 df 31 f6 e8 1c f3 ff ff 48 8b 5d f8 c9 31 c0 31 d2 31 f6 31 ff c3 cc cc cc cc 48 c7 c6 80 6c d9 a7 48 89 df e8 4b b3 10 00 <0f> 0b 48 89 df e8 21 e6 2c 00 eb 9d 0f 1f 40 00 66 66 2e 0f 1f 84
[ 9849.534986] RSP: 0018:ffff8881bb8076b0 EFLAGS: 00010246
[ 9849.536198] RAX: 0000000000000000 RBX: ffffea00070c8980 RCX: 0000000000000000
[ 9849.537718] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 9849.539321] RBP: ffff8881bb8076b8 R08: 0000000000000000 R09: 0000000000000000
[ 9849.540862] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000010782000
[ 9849.542438] R13: ffff8881935de738 R14: ffff88816110d010 R15: 0000000000001000
[ 9849.543996] FS: 00007e36cbe94740(0000) GS:ffff88824b899000(0000) knlGS:0000000000000000
[ 9849.545854] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9849.547092] CR2: 00007e36cb3ff000 CR3: 000000011bbf6006 CR4: 0000000000772ef0
[ 9849.548679] PKRU: 55555554
The race sequence:
1. Read completes -> netfs_read_collection() runs
2. netfs_wake_rreq_flag(rreq, NETFS_RREQ_IN_PROGRESS, ...)
3. netfs_wait_for_read() returns -EFAULT to netfs_write_begin()
4. The netfs_unlock_abandoned_read_pages() unlocks the folio
5. netfs_write_begin() calls folio_unlock(folio) -> VM_BUG_ON_FOLIO()
The key reason of the issue that netfs_unlock_abandoned_read_pages()
doesn't check the flag NETFS_RREQ_NO_UNLOCK_FOLIO and executes
folio_unlock() unconditionally. This patch implements in
netfs_unlock_abandoned_read_pages() logic similar to
netfs_unlock_read_folio().
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-8-dhowells@redhat.com
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: Ceph Development <ceph-devel@vger.kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:53 2026 +0100
netfs: Fix write streaming disablement if fd open O_RDWR
[ Upstream commit 70a7b9193bbbfceaab5974de66834c64ccc875dd ]
In netfs_perform_write(), "write streaming" (the caching of dirty data in
dirty but !uptodate folios) is performed to avoid the need to read data
that is just going to get immediately overwritten. However, this is/will
be disabled in three circumstances: if the fd is open O_RDWR, if fscache is
in use (as we need to round out the blocks for DIO) or if content
encryption is enabled (again for rounding out purposes).
The idea behind disabling it if the fd is open O_RDWR is that we'd need to
flush the write-streaming page before we could read the data, particularly
through mmap. But netfs now fills in the gaps if ->read_folio() is called
on the page, so that is unnecessary. Further, this doesn't actually work
if a separate fd is open for reading.
Fix this by removing the check for O_RDWR, thereby allowing streaming
writes even when we might read.
This caused a number of problems with the generic/522 xfstest, but those
are now fixed.
Fixes: c38f4e96e605 ("netfs: Provide func to copy data to pagecache for buffered write")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-17-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date: Sat Oct 5 19:23:05 2024 +0100
netfs: Remove unnecessary references to pages
[ Upstream commit e995e8b600260cff3cfaf2607a62be8bdc4aa9c7 ]
These places should all use folios instead of pages.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20241005182307.3190401-4-willy@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: ccde2ac757c7 ("netfs: Fix folio->private handling in netfs_perform_write()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Sun Apr 19 14:52:59 2026 -0400
NFSD: Fix infinite loop in layout state revocation
[ Upstream commit 4f8ef58c10bfe5f86a643c7c8331b37e69e3dae1 ]
find_one_sb_stid() skips stids whose sc_status is non-zero, but the
SC_TYPE_LAYOUT case in nfsd4_revoke_states() never sets sc_status
before calling nfsd4_close_layout(). The retry loop therefore finds
the same layout stid on every iteration, hanging the revoker
indefinitely.
Fixes: 1e33e1414bec ("nfsd: allow layout state to be admin-revoked.")
Reported-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhihao Cheng <chengzhihao1@huawei.com>
Date: Thu May 7 19:23:01 2026 +0800
nsfs: fix wrong error code returned for pidns ioctls
[ Upstream commit 725ecd80688bf3c57ca9205431f2c06174ff0756 ]
When executing NS_GET_PID_FROM_PIDNS (or similar pidns ioctls), if the
target task cannot be found in the corresponding pid_ns, the error code
should be ESRCH instead of ENOTTY.
This bug was introduced when the extensible ioctl handling was added.
Without proper return, ret would be overwritten by the default case in
the extensible ioctl switch statement.
Fixes: a1d220d9dafa8 ("nsfs: iterate through mount namespaces")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Link: https://patch.msgid.link/20260507112301.1042757-1-chengzhihao1@huawei.com
Reviewed-by: Yang Erkun <yangerkun@huawei.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sam Daly <sam@samdaly.ie>
Date: Wed May 13 18:42:53 2026 +0200
octeontx2-af: CGX: add bounds check to cgx_speed_mbps index
commit c0bf0a4f3f1f5f57aa83e1400ba4f56f0abfd542 upstream.
cgx_speed_mbps has 13 elements but RESP_LINKSTAT_SPEED can yield values
0-15. If it returns a value >= 13, this causes an out-of-bounds array
access. Add a bounds check and default to speed 0 if the index is out of
range.
Fixes: 61071a871ea6 ("octeontx2-af: Forward CGX link notifications to PFs")
Cc: Sunil Goutham <sgoutham@marvell.com>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Geetha sowjanya <gakula@marvell.com>
Cc: hariprasad <hkelam@marvell.com>
Cc: Subbaraya Sundeep <sbhatta@marvell.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: stable <stable@kernel.org>
Signed-off-by: Sam Daly <sam@samdaly.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2026051352-refined-demise-e88d@gregkh
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ratheesh Kannoth <rkannoth@marvell.com>
Date: Wed May 20 10:00:36 2026 +0530
octeontx2-af: npc: Fix allmulticast skip logic for LBK and SDP VFs
[ Upstream commit 9eddc819f00b5b74bb4ac91396f80bd35f5f3561 ]
When installing the allmulticast NPC rule, rvu_npc_install_allmulti_entry()
should skip LBK and SDP VFs (only CGX PF/VF may add the entry). The
code combined is_lbk_vf() and is_sdp_vf() with logical AND, which is
never true for a single pcifunc, so the intended early return never ran.
Use logical OR instead.
Cc: Geetha sowjanya <gakula@marvell.com>
Fixes: ae703539f49d2 ("octeontx2-af: Cleanup loopback device checks")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://patch.msgid.link/20260520043036.1523798-1-rkannoth@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nikhil P. Rao <nikhil.rao@amd.com>
Date: Wed May 20 20:58:42 2026 +0000
pds_core: ensure null-termination for firmware version strings
[ Upstream commit 3d4432d34c1992701289cbe12df9fd024f315998 ]
The driver passes fw_version directly to devlink_info_version_stored_put()
without ensuring null-termination. While current firmware null-terminates
these strings, the driver should not rely on this behavior. Add explicit
null-termination to prevent potential issues if firmware behavior changes.
Fixes: 45d76f492938 ("pds_core: set up device and adminq")
Signed-off-by: Nikhil P. Rao <nikhil.rao@amd.com>
Link: https://patch.msgid.link/20260520205842.1486718-1-nikhil.rao@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nikhil P. Rao <nikhil.rao@amd.com>
Date: Fri May 15 21:29:07 2026 +0000
pds_core: fix debugfs_lookup dentry leak and error handling
[ Upstream commit dc416e32baaeb620b9809e9e25fc7b30889686e9 ]
debugfs_lookup() returns a dentry with an elevated reference count that
must be released with dput(). The current code discards the returned
dentry without calling dput(), causing a reference leak on every
firmware reset recovery.
Additionally, when CONFIG_DEBUG_FS is disabled, debugfs_lookup()
returns ERR_PTR(-ENODEV), not NULL. The current check passes for error
pointers and would call dput() on an invalid pointer, causing a crash.
Fixes: bc90fbe0c318 ("pds_core: Rework teardown/setup flow to be more common")
Signed-off-by: Nikhil P. Rao <nikhil.rao@amd.com>
Link: https://patch.msgid.link/20260515212907.998028-3-nikhil.rao@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nikhil P. Rao <nikhil.rao@amd.com>
Date: Fri May 15 21:29:05 2026 +0000
pds_core: fix error handling in pdsc_devcmd_wait
[ Upstream commit 0e46b6635b03d29807f810c3b415c4755a3f958d ]
Fix two cases where pdsc_devcmd_wait() returns stale success from
the completion register instead of an error:
1. FW crash: If firmware stops running, the wait loop breaks early with
running=false. The condition "if ((!done || timeout) && running)" is
false, so error handling is bypassed and stale status is returned.
Check !running first and return -ENXIO.
2. Timeout: If a command times out, err is set to -ETIMEDOUT but then
overwritten by pdsc_err_to_errno(status) which reads stale status.
Return -ETIMEDOUT immediately after cleaning up.
Both errors now propagate to pdsc_devcmd_locked() which queues
health_work for recovery.
Fixes: 45d76f492938 ("pds_core: set up device and adminq")
Signed-off-by: Nikhil P. Rao <nikhil.rao@amd.com>
Link: https://patch.msgid.link/20260515212907.998028-1-nikhil.rao@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Tue Oct 1 20:20:06 2024 -0700
perf parse-events: Expose/rename config_term_name
[ Upstream commit d2f3ecb0ca2099d13bf8bf69219214c1425dc453 ]
Expose config_term_name as parse_events__term_type_str so that PMUs not
in pmu.c may access it.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241002032016.333748-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zijing Yin <yzjaurora@gmail.com>
Date: Tue May 19 10:26:33 2026 -0700
phonet/pep: disable BH around forwarded sk_receive_skb()
commit dbc81608e3a653dea6cf403f20cae35468b8ab9c upstream.
The networking receive path is usually run from softirq context, but
protocols that take the socket lock may have packets stored in the
backlog and processed later from process context. In that case
release_sock() -> __release_sock() drops the slock with spin_unlock_bh()
and then calls sk->sk_backlog_rcv() with bottom halves enabled.
Typical sk_backlog_rcv handlers process the socket whose backlog is
being drained, so the BH state at entry is irrelevant for the slocks
they touch. pep_do_rcv() is different: when the inbound skb targets an
existing PEP pipe, it forwards the skb to a different *child* socket
via sk_receive_skb(). That helper takes the child slock with
bh_lock_sock_nested(), which is just spin_lock_nested() and assumes BH
is already off. The same child slock therefore ends up acquired with
BH on (process path) and with BH off (softirq path):
process context softirq context
--------------- ---------------
release_sock(listener) __netif_receive_skb()
__release_sock() phonet_rcv()
spin_unlock_bh() __sk_receive_skb(listener)
[BH now ENABLED] [BH already disabled]
sk_backlog_rcv: sk_backlog_rcv:
pep_do_rcv() pep_do_rcv()
sk_receive_skb(child) sk_receive_skb(child)
bh_lock_sock_nested(child) bh_lock_sock_nested(child)
=> SOFTIRQ-ON-W => IN-SOFTIRQ-W
Lockdep flags this as inconsistent lock state, and it can become a real
self-deadlock if a softirq on the same CPU tries to receive to the same
child socket while its slock is held in the BH-enabled path:
WARNING: inconsistent lock state
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
(slock-AF_PHONET/1){+.?.}-{3:3}, at: __sk_receive_skb+0x1cf/0x900
__sk_receive_skb net/core/sock.c:563
sk_receive_skb include/net/sock.h:2022 [inline]
pep_do_rcv net/phonet/pep.c:675
sk_backlog_rcv include/net/sock.h:1190
__release_sock net/core/sock.c:3216
release_sock net/core/sock.c:3815
pep_sock_accept net/phonet/pep.c:879
Wrap the forwarded sk_receive_skb() in local_bh_disable() /
local_bh_enable() so the child slock is always acquired with BH off.
local_bh_disable() nests safely on the softirq path.
Discovered via in-house syzkaller fuzzing; the same root cause also
on the linux-6.1.y syzbot dashboard as extid 44f0626dd6284f02663c.
Reproduced under KASAN + LOCKDEP + PROVE_LOCKING, reproducer:
https://pastebin.com/A3t8xzCR
Fixes: 9641458d3ec4 ("Phonet: Pipe End Point for Phonet Pipes protocol")
Link: https://syzkaller.appspot.com/bug?extid=44f0626dd6284f02663c
Cc: stable@vger.kernel.org
Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
Acked-by: Rémi Denis-Courmont <remi@remlab.net>
Reported-by: syzbot+9f4a135646b66c509935@syzkaller.appspotmail.com
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260519172635.86304-1-yzjaurora@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Gabor Juhos <j4g8y7@gmail.com>
Date: Sat Mar 21 15:42:32 2026 +0100
phy: marvell: mvebu-a3700-utmi: fix incorrect USB2_PHY_CTRL register access
[ Upstream commit 91ddf6f722084383fb05be731c0107814b055c0c ]
The mvebu_a3700_utmi_phy_power_off() function tries to modify the
USB2_PHY_CTRL register by using the IO address of the PHY IP block along
with the readl/writel IO accessors. However, the register exist in the
USB miscellaneous register space, and as such it must be accessed via
regmap like it is done in the mvebu_a3700_utmi_phy_power_on() function.
Change the code to use regmap_update_bits() for modífying the register
to fix this.
Fixes: cc8b7a0ae866 ("phy: add A3700 UTMI PHY driver")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://patch.msgid.link/20260321-a3700-utmi-fix-usb2_phy_ctrl-access-v1-1-6005ff4b5058@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wayne Chang <waynec@nvidia.com>
Date: Mon May 4 11:33:05 2026 +0800
phy: tegra: xusb: Fix per-pad high-speed termination calibration
commit da110228b54f2e2143d97ea7151e0dc22e539d67 upstream.
The existing code reads a single hs_term_range_adj value from bit field
[10:7] of FUSE_SKU_CALIB_0 and applies it to all USB2 pads uniformly.
However, on SoCs that support per-pad termination, each pad has its own
hs_term_range_adj field: pad 0 in FUSE_SKU_CALIB_0[10:7], and pads 1-3
in FUSE_USB_CALIB_EXT_0 at bit offsets [8:5], [12:9], and [16:13]
respectively.
Fix the calibration by reading per-pad values from the appropriate fuse
registers. For SoCs that do not support per-pad termination, replicate
pad 0's value to all pads to maintain existing behavior.
Add a has_per_pad_term flag to the SoC data to indicate whether per-pad
termination values are available in FUSE_USB_CALIB_EXT_0.
Fixes: 1ef535c6ba8e ("phy: tegra: xusb: Add Tegra194 support")
Cc: stable@vger.kernel.org
Signed-off-by: Wayne Chang <waynec@nvidia.com>
Signed-off-by: Wei-Cheng Chen <weichengc@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260504033305.2283145-1-weichengc@nvidia.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Maulik Shah <maulik.shah@oss.qualcomm.com>
Date: Tue Apr 28 17:44:58 2026 +0530
pinctrl: qcom: Fix wakeirq map by removing disconnected irqs for sm8150
[ Upstream commit 52ac35b8a151446481496404af3a8e5e889b3c5a ]
PDC interrupts 122-125 were meant for ibi_i3c wakeup but sm8150 do not
support i3c. GPIOs 39,51,88 and 144 are also connected to different PDC
pin and already reflected in the wake irq map.
Remove the unsupported wakeup interrupts from the map.
Fixes: 90337380c809 ("pinctrl: qcom: sm8150: Specify PDC map")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Maulik Shah <maulik.shah@oss.qualcomm.com>
Signed-off-by: Navya Malempati <navya.malempati@oss.qualcomm.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Biju Das <biju.das.jz@bp.renesas.com>
Date: Sat Mar 28 09:05:45 2026 +0000
pinctrl: renesas: rzg2l: Fix incorrect PUPD register offset for high pins during suspend/resume
[ Upstream commit 6dba9b7268cc50166bce47608670192fd874e363 ]
When saving/restoring pull-up/down register state during suspend/resume,
the second PUPD register access was incorrectly using the same base offset
as the first, effectively reading/writing the same register twice instead
of the adjacent one.
Add the correct + 4 byte offset to the second RZG2L_PCTRL_REG_ACCESS32
call so that pupd[1][port] is properly saved and restored from the next
32-bit register in the PUPD register pair, covering pins 4–7 of ports
with 4 or more pins.
Fixes: b2bd65fbb617 ("pinctrl: renesas: rzg2l: Add suspend/resume support for pull up/down")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260328090548.84124-1-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oliver White <oliverjwhite07@gmail.com>
Date: Thu Apr 9 15:43:47 2026 +1200
platform/surface: aggregator_registry: omit battery & AC nodes on Surface Laptop 7
[ Upstream commit 0488073a6c84571dd3cffe581a4a73a5fceb099d ]
Surface Laptop 7 exposes battery and AC status via Qualcomm PMIC GLINK
qcom_battmgr. Registering the standard SSAM battery and AC client
devices on this platform causes duplicate power-supply devices to
appear.
Drop the SSAM battery and AC nodes from the Surface Laptop 7 registry
group so that only the qcom_battmgr power supplies are instantiated.
Fixes: b27622f13172 ("platform/surface: Add OF support")
Signed-off-by: Oliver White <oliverjwhite07@gmail.com>
Link: https://patch.msgid.link/20260409034347.17381-1-oliverjwhite07@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Tue May 12 17:11:49 2026 +0200
platform/x86: adv_swbutton: Check ACPI_HANDLE() against NULL
[ Upstream commit e7a9a6ea40e352cd7977f6a8c80bdeadf65ad838 ]
Every platform driver can be forced to match a device that doesn't match
its list of device IDs because of device_match_driver_override(), so
platform drivers that rely on the existence of a device's ACPI companion
object need to verify its presence.
Accordingly, add a requisite ACPI_HANDLE() check against NULL to the
platform/x86 adv_swbutton driver.
Fixes: 3d904005f686 ("platform/x86: add support for Advantech software defined button")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/5115425.31r3eYUQgx@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Tue May 12 17:12:40 2026 +0200
platform/x86: hp_accel: Check ACPI_COMPANION() against NULL
[ Upstream commit abfbe5ee8ae89f1f5449790423d5dd3e423545bd ]
Every platform driver can be forced to match a device that doesn't match
its list of device IDs because of device_match_driver_override(), so
platform drivers that rely on the existence of a device's ACPI companion
object need to verify its presence.
Accordingly, add a requisite ACPI_COMPANION() check against NULL to the
platform/x86 hp_accel driver.
Fixes: 8ebcb6c94c71 ("platform/x86: hp_accel: Convert to be a platform driver")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/2425918.ElGaqSPkdT@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Tue May 12 17:13:28 2026 +0200
platform/x86: intel-hid: Check ACPI_HANDLE() against NULL
[ Upstream commit 5c69e090ae5dd93d910f70db0796357080707d26 ]
Every platform driver can be forced to match a device that doesn't match
its list of device IDs because of device_match_driver_override(), so
platform drivers that rely on the existence of a device's ACPI companion
object need to verify its presence.
Accordingly, add a requisite ACPI_HANDLE() check against NULL to the
platform/x86 intel-hid driver.
Fixes: ecc83e52b28c ("intel-hid: new hid event driver for hotkeys")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/1971512.tdWV9SEqCh@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Tue May 12 17:16:22 2026 +0200
platform/x86: intel-vbtn: Check ACPI_HANDLE() against NULL
[ Upstream commit a9f305c5a355efeb240d406d378491d9eec02d07 ]
Every platform driver can be forced to match a device that doesn't match
its list of device IDs because of device_match_driver_override(), so
platform drivers that rely on the existence of a device's ACPI companion
object need to verify its presence.
Accordingly, add a requisite ACPI_HANDLE() check against NULL to the
platform/x86 intel-vbtn driver.
Fixes: 26173179fae1 ("platform/x86: intel-vbtn: Eval VBDL after registering our notifier")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/3426431.aeNJFYEL58@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sayali Patil <sayalip@linux.ibm.com>
Date: Wed May 13 13:44:13 2026 +0530
powerpc/time: Remove redundant preempt_disable|enable() calls from arch_irq_work_raise()
[ Upstream commit 31467b23823ffec1f6fff407f8e3ca9af8b7491a ]
A kernel panic is observed when handling machine check exceptions from
real mode.
BUG: Unable to handle kernel data access on read at 0xc00000006be21300
Oops: Kernel access of bad area, sig: 11 [#1]
MSR: 8000000000001003 <SF,ME,RI,LE> CR: 88222248 XER: 00000005
CFAR: c00000000003ffc4 DAR: c00000006be21300 DSISR: 40000000 IRQMASK: 0
NIP [c000000000029e40] arch_irq_work_raise+0x10/0x70
LR [c00000000003ffc8] machine_check_queue_event+0xa8/0x150
Call Trace:
[c0000000179d3c70] [c00000000003ff64] machine_check_queue_event+0x44/0x150
[c0000000179d3d30] [c0000000000084e0] machine_check_early_common+0x1f0/0x2c0
The crash occurs because arch_irq_work_raise() calls preempt_disable()
from machine check exception (MCE) handlers running in real mode. In
this context, accessing the preempt_count can fault, leading to the panic.
The preempt_disable()/preempt_enable() pair in arch_irq_work_raise()
was originally added by commit 0fe1ac48bef0 ("powerpc/perf_event: Fix
oops due to perf_event_do_pending call") to avoid races while raising
irq work from exception context.
Later, commit 471ba0e686cb ("irq_work: Do not raise an IPI when
queueing work on the local CPU") added preemption protection in
irq_work_queue() path, while commit 20b876918c06 ("irq_work: Use per
cpu atomics instead of regular atomics") added equivalent
protection in irq_work_queue_on() before reaching arch_irq_work_raise():
irq_work_queue() / irq_work_queue_on()
-> preempt_disable()
-> __irq_work_queue_local()
-> irq_work_raise()
-> arch_irq_work_raise()
As a result, callers other than mce_irq_work_raise() already execute
with preemption disabled, making the additional
preempt_disable()/preempt_enable() pair in arch_irq_work_raise()
redundant.
The arch_irq_work_raise() function executes in NMI context when called
from MCE handler. Hence we will not be preempted or scheduled out since
we are in NMI context with MSR[EE]=0. Therefore, it is safe to remove
the preempt_disable()/preempt_enable() calls from here.
Remove it to avoid accessing preempt_count from real mode context.
Fixes: cc15ff327569 ("powerpc/mce: Avoid using irq_work_queue() in realmode")
Suggested-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Acked-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
[Maddy: Fixed the commit title]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260513081413.222490-1-sayalip@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Julian Braha <julianbraha@gmail.com>
Date: Sun Apr 5 17:15:45 2026 +0100
powerpc: fix dead default for GUEST_STATE_BUFFER_TEST
[ Upstream commit aef656a0e6c01796190bb5bd2bdba1c644ed7811 ]
The GUEST_STATE_BUFFER_TEST config option should default
to KUNIT_ALL_TESTS so that if all tests are enabled then
it is included, but currently the 'default KUNIT_ALL_TESTS'
statement is shadowed by 'def_tristate n',
meaning that this second default statement is currently dead code.
It looks to me like the commit
6ccbbc33f06a ("KVM: PPC: Add helper library for Guest State Buffers")
intended to set the default to KUNIT_ALL_TESTS, but mistakenly
missed the def_tristate.
This dead code was found by kconfirm, a static analysis tool for Kconfig.
Fixes: 6ccbbc33f06a ("KVM: PPC: Add helper library for Guest State Buffers")
Signed-off-by: Julian Braha <julianbraha@gmail.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Reviewed-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260405161545.161006-1-julianbraha@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dawei Feng <dawei.feng@seu.edu.cn>
Date: Wed May 20 15:03:23 2026 +0800
qed: fix double free in qed_cxt_tables_alloc()
commit 2bccfb8476ca5f3548afbd623dc7a6980d4e77de upstream.
If one of the later PF or VF CID bitmap allocations fails,
qed_cid_map_alloc() jumps to cid_map_fail and frees the previously
allocated CID bitmaps before returning an error. qed_cxt_tables_alloc()
then calls qed_cxt_mngr_free(), which invokes qed_cid_map_free()
again.
Fix this by setting each CID bitmap pointer to NULL after bitmap_free()
to avoid double free.
The bug was first flagged by an experimental analysis tool we are
developing for kernel memory-management bugs while analyzing
v6.13-rc1. The tool is still under development and is not yet publicly
available. Manual inspection confirms that the bug is still
present in v7.1-rc3.
Runtime reproduction was not attempted because exercising the failing
allocation path requires device-specific setup.
Fixes: fe56b9e6a8d9 ("qed: Add module with basic common support")
Cc: stable@vger.kernel.org
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Dawei Feng <dawei.feng@seu.edu.cn>
Link: https://patch.msgid.link/20260520070323.2762379-1-dawei.feng@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ilya Dryomov <idryomov@gmail.com>
Date: Tue May 19 23:07:26 2026 +0200
rbd: eliminate a race in lock_dwork draining on unmap
commit 9fc75b71fdd38465c76c6f6a884cdd4ae3c72d90 upstream.
Given how rbd_lock_add_request() and rbd_img_exclusive_lock() are
written, lock_dwork may be (re)queued more than it's actually needed:
for example in case a new I/O request comes in while we are in the
middle of rbd_acquire_lock() on behalf of another I/O request. This is
expected and with rbd_release_lock() preemptively canceling lock_dwork
is benign under normal operation.
A more problematic example is maybe_kick_acquire():
if (have_requests || delayed_work_pending(&rbd_dev->lock_dwork)) {
dout("%s rbd_dev %p kicking lock_dwork\n", __func__, rbd_dev);
mod_delayed_work(rbd_dev->task_wq, &rbd_dev->lock_dwork, 0);
}
It's not unrealistic for lock_dwork to get canceled right after
delayed_work_pending() returns true and for mod_delayed_work() to
requeue it right there anyway. This is a classic TOCTOU race.
When it comes to unmapping the image, there is an implicit assumption
of no self-initiated exclusive lock activity past the point of return
from rbd_dev_image_unlock() which unlocks the lock if it happens to be
held. This unlock is assumed to be final and lock_dwork (as well as
all other exclusive lock tasks, really) isn't expected to get queued
again. However, lock_dwork is canceled only in cancel_tasks_sync()
(i.e. later in the unmap sequence) and on top of that the cancellation
can get in effect nullified by maybe_kick_acquire(). This may result
in rbd_acquire_lock() executing after rbd_dev_device_release() and
rbd_dev_image_release() run and free and/or reset a bunch of things.
One of the possible failure modes then is a violated
rbd_assert(rbd_image_format_valid(rbd_dev->image_format));
in rbd_dev_header_info() which is called via rbd_dev_refresh() from
rbd_post_acquire_action().
Redo exclusive lock task draining to provide saner semantics and try
to meet the assumptions around rbd_dev_image_unlock().
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Guangshuo Li <lgs201920130244@gmail.com>
Date: Thu May 14 19:38:34 2026 +0800
RDMA/rtrs: Fix use-after-free in path file creation cleanup
[ Upstream commit 5b74373390113fba798a76b483837029ab010fef ]
In the error path of rtrs_srv_create_path_files(), the sysfs root folders
may already have been created and srv_path->kobj may already have been
initialized. If a later step fails, the cleanup currently calls
kobject_put(&srv_path->kobj) before
rtrs_srv_destroy_once_sysfs_root_folders(srv_path).
kobject_put() may drop the last reference to srv_path->kobj and invoke the
release callback, rtrs_srv_release(), which frees srv_path. The following
call to rtrs_srv_destroy_once_sysfs_root_folders(srv_path) then
dereferences srv_path internally to access srv_path->srv, resulting in a
use-after-free.
This failure path is reached before rtrs_srv_create_path_files() returns
success, so the successful-path lifetime handling is not involved.
Fix this by destroying the sysfs root folders before calling
kobject_put(&srv_path->kobj), so srv_path is still valid while the helper
accesses it.
This issue was found by a static analysis tool I am developing.
Fixes: ae4c81644e91 ("RDMA/rtrs-srv: Rename rtrs_srv_sess to rtrs_srv_path")
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Link: https://patch.msgid.link/20260514113834.865530-1-lgs201920130244@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Wed May 13 13:53:24 2026 -0400
RDMA/siw: Reject MPA FPDU length underflow before signed receive math
commit 0ce1bc9e46ecabe84772bb561e373c0d9876d6f2 upstream.
A malicious connected siw peer can send an iWARP FPDU whose MPA length
field (c_hdr->mpa_len, 16 bit big-endian, peer-controlled) is smaller
than the fixed DDP/RDMAP header for the announced opcode. Soft-iWARP
parses the full header in siw_get_hdr() based on iwarp_pktinfo[opcode]
.hdr_len, but never compares mpa_len against that header length.
siw_tcp_rx_data() then derives
srx->fpdu_part_rem = be16_to_cpu(mpa_len) - fpdu_part_rcvd
+ MPA_HDR_SIZE;
where fpdu_part_rcvd equals iwarp_pktinfo[opcode].hdr_len at this
point. For a tagged WRITE (hdr_len 16, MPA_HDR_SIZE 2) the smallest
on-wire mpa_len of 0 yields fpdu_part_rem = -14, and any mpa_len below
hdr_len - MPA_HDR_SIZE underflows to a negative int.
The signed value then flows into siw_proc_write()/siw_proc_rresp() as
bytes = min(srx->fpdu_part_rem, srx->skb_new);
is handed to siw_check_mem() as an int len (whose interval check
addr + len > mem->va + mem->len is satisfied for a valid base when
len is negative), and reaches siw_rx_data() -> siw_rx_kva() /
siw_rx_umem() -> skb_copy_bits() as a signed copy length. The header
copy branch in skb_copy_bits() promotes that to size_t, producing a
multi-gigabyte read.
KASAN under a KUnit harness that drives the real kernel TCP receive
path -- a loopback AF_INET socketpair, the malformed FPDU written via
kernel_sendmsg, sk_data_ready firing in softirq, tcp_read_sock
dispatching to siw_tcp_rx_data -- reports:
BUG: KASAN: use-after-free in skb_copy_bits+0x284/0x480
Read of size 4294967295 at addr ffff888...
Call Trace:
skb_copy_bits
siw_rx_kva
siw_rx_data
siw_check_mem
siw_proc_write
siw_tcp_rx_data
__tcp_read_sock
siw_qp_llp_data_ready
tcp_data_ready
tcp_data_queue
Add the missing invariant at the earliest point where the peer header
is fully assembled. iwarp_pktinfo[*].hdr_len - MPA_HDR_SIZE is exactly
the value the siw transmitter uses as the minimum mpa_len for each
opcode (drivers/infiniband/sw/siw/siw_qp.c:33), so this matches the
protocol contract. Out-of-range FPDUs terminate the connection with
TERM_ERROR_LAYER_LLP / LLP_ETYPE_MPA / LLP_ECODE_FPDU_START -- which
is RFC 5044 Section 8 error code 3 ("Marker and ULPDU Length fields
do not agree on the start of an FPDU"), the correct framing-error
class for this inconsistency.
Fixes: 8b6a361b8c48 ("rdma/siw: receive path")
Link: https://patch.msgid.link/r/20260513175325.2042630-2-michael.bommarito@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Acked-by: Bernard Metzler <bernard.metzler@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sasha Levin <sashal@kernel.org>
Date: Wed May 27 12:48:53 2026 -0400
Revert "ice: fix double-free of tx_buf skb"
This reverts commit fd95ef8d0f6dbe2daa95d6488c9e0f8a95a7e048.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sasha Levin <sashal@kernel.org>
Date: Wed May 27 12:48:54 2026 -0400
Revert "ice: Remove jumbo_remove step from TX path"
This reverts commit 7332d208c9d2067546eb7af5339773c966ac5625.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sasha Levin <sashal@kernel.org>
Date: Sun May 24 10:29:50 2026 -0400
Revert "perf cgroup: Update metric leader in evlist__expand_cgroup"
This reverts commit d26e31446c0fa96feca0b7701243b42447225d33.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sasha Levin <sashal@kernel.org>
Date: Sun May 24 10:36:48 2026 -0400
Revert "perf python: Add parse_events function"
This reverts commit 9cd264079fab9867dbc9fbc8a1e521996e3d7212.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sasha Levin <sashal@kernel.org>
Date: Sun May 24 10:36:48 2026 -0400
Revert "perf tool_pmu: Factor tool events into their own PMU"
This reverts commit 7cfcd01f33fc3400c60f923d2896a8cdc60cecc4.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sasha Levin <sashal@kernel.org>
Date: Sun May 24 10:36:48 2026 -0400
Revert "perf tool_pmu: Fix aggregation on duration_time"
This reverts commit 310be445ab1028315627b326516f193511cb1c97.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sasha Levin <sashal@kernel.org>
Date: Mon May 25 20:45:41 2026 -0400
Revert "x86/vdso: Fix output operand size of RDPID"
This reverts commit d607e6b349b014df1d2d0399f6667322626450e0.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Steven Rostedt <rostedt@goodmis.org>
Date: Wed May 20 22:08:01 2026 -0400
ring-buffer: Fix reporting of missed events in iterator
commit a254b6d13b0edd6272926674d2afc46d46e496b7 upstream.
When tracing is active while reading the trace file, if the iterator
reading the buffer detects that the writer has passed the iterator head,
it will reset and set a "missed events" flag. This flag is passed to the
output processing to show the user that events were missed:
CPU:4 [LOST EVENTS]
The problem is that the flag is reset after it is checked in
ring_buffer_iter_dropped(). But the "trace" file iterates over all the CPU
ring buffers and it will check if they are dropped when figuring out which
buffer to print next. This prematurely clears the missed_events flag if
the CPU buffer with the missed events is not the one that is printed next.
On the iteration where the CPU buffer with the missed events is printed,
the check if it had missed events would return false and the output does
not show that events were missed.
Do not reset the missed_events flag when checking if there were missed
events, but instead clear it when moving the iterator head to the next
event.
Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20260520220801.4fd09d13@fedora
Fixes: c9b7a4a72ff64 ("ring-buffer/tracing: Have iterator acknowledge dropped events")
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pu Lehui <pulehui@huawei.com>
Date: Tue May 26 19:25:17 2026 +0000
riscv: fgraph: Fix stack layout to match __arch_ftrace_regs argument of ftrace_return_to_handler
commit 67a5ba8f742f247bc83e46dd2313c142b1383276 upstream.
Naresh Kamboju reported a "Bad frame pointer" kernel warning while
running LTP trace ftrace_stress_test.sh in riscv. We can reproduce the
same issue with the following command:
```
$ cd /sys/kernel/debug/tracing
$ echo 'f:myprobe do_nanosleep%return args1=$retval' > dynamic_events
$ echo 1 > events/fprobes/enable
$ echo 1 > tracing_on
$ sleep 1
```
And we can get the following kernel warning:
[ 127.692888] ------------[ cut here ]------------
[ 127.693755] Bad frame pointer: expected ff2000000065be50, received ba34c141e9594000
[ 127.693755] from func do_nanosleep return to ffffffff800ccb16
[ 127.698699] WARNING: CPU: 1 PID: 129 at kernel/trace/fgraph.c:755 ftrace_return_to_handler+0x1b2/0x1be
[ 127.699894] Modules linked in:
[ 127.700908] CPU: 1 UID: 0 PID: 129 Comm: sleep Not tainted 6.14.0-rc3-g0ab191c74642 #32
[ 127.701453] Hardware name: riscv-virtio,qemu (DT)
[ 127.701859] epc : ftrace_return_to_handler+0x1b2/0x1be
[ 127.702032] ra : ftrace_return_to_handler+0x1b2/0x1be
[ 127.702151] epc : ffffffff8013b5e0 ra : ffffffff8013b5e0 sp : ff2000000065bd10
[ 127.702221] gp : ffffffff819c12f8 tp : ff60000080853100 t0 : 6e00000000000000
[ 127.702284] t1 : 0000000000000020 t2 : 6e7566206d6f7266 s0 : ff2000000065bd80
[ 127.702346] s1 : ff60000081262000 a0 : 000000000000007b a1 : ffffffff81894f20
[ 127.702408] a2 : 0000000000000010 a3 : fffffffffffffffe a4 : 0000000000000000
[ 127.702470] a5 : 0000000000000000 a6 : 0000000000000008 a7 : 0000000000000038
[ 127.702530] s2 : ba34c141e9594000 s3 : 0000000000000000 s4 : ff2000000065bdd0
[ 127.702591] s5 : 00007fff8adcf400 s6 : 000055556dc1d8c0 s7 : 0000000000000068
[ 127.702651] s8 : 00007fff8adf5d10 s9 : 000000000000006d s10: 0000000000000001
[ 127.702710] s11: 00005555737377c8 t3 : ffffffff819d899e t4 : ffffffff819d899e
[ 127.702769] t5 : ffffffff819d89a0 t6 : ff2000000065bb18
[ 127.702826] status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
[ 127.703292] [<ffffffff8013b5e0>] ftrace_return_to_handler+0x1b2/0x1be
[ 127.703760] [<ffffffff80017bce>] return_to_handler+0x16/0x26
[ 127.704009] [<ffffffff80017bb8>] return_to_handler+0x0/0x26
[ 127.704057] [<ffffffff800d3352>] common_nsleep+0x42/0x54
[ 127.704117] [<ffffffff800d44a2>] __riscv_sys_clock_nanosleep+0xba/0x10a
[ 127.704176] [<ffffffff80901c56>] do_trap_ecall_u+0x188/0x218
[ 127.704295] [<ffffffff8090cc3e>] handle_exception+0x14a/0x156
[ 127.705436] ---[ end trace 0000000000000000 ]---
The reason is that the stack layout for constructing argument for the
ftrace_return_to_handler in the return_to_handler does not match the
__arch_ftrace_regs structure of riscv, leading to unexpected results.
Fixes: a3ed4157b7d8 ("fgraph: Replace fgraph_ret_regs with ftrace_regs")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/all/CA+G9fYvp_oAxeDFj88Tk2rfEZ7jtYKAKSwfYS66=57Db9TBdyA@mail.gmail.com
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20250317031214.4138436-2-pulehui@huaweicloud.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pu Lehui <pulehui@huawei.com>
Date: Tue May 26 19:26:35 2026 +0000
riscv: fgraph: Select HAVE_FUNCTION_GRAPH_TRACER depends on HAVE_DYNAMIC_FTRACE_WITH_ARGS
commit e8eb8e1bdae94b9e003f5909519fd311d0936890 upstream.
Currently, fgraph on riscv relies on the infrastructure of
DYNAMIC_FTRACE_WITH_ARGS. However, DYNAMIC_FTRACE_WITH_ARGS may be
turned off on riscv, which will cause the enabled fgraph to be abnormal.
Therefore, let's select HAVE_FUNCTION_GRAPH_TRACER depends on
HAVE_DYNAMIC_FTRACE_WITH_ARGS.
Fixes: a3ed4157b7d8 ("fgraph: Replace fgraph_ret_regs with ftrace_regs")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503160820.dvqMpH0g-lkp@intel.com/
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20250317031214.4138436-1-pulehui@huaweicloud.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Osama Abdelkader <osama.abdelkader@gmail.com>
Date: Thu May 14 19:36:40 2026 +0200
riscv: kvm: return SBI_ERR_FAILURE for pmu_snapshot_set_shmem() when OOM
commit 0835ee26938e15eccd70f7d33da386b6490f9449 upstream.
kvm_riscv_vcpu_pmu_snapshot_set_shmem() returned -ENOMEM from the
SBI extension handler, which caused kvm_riscv_vcpu_sbi_ecall() to
abort KVM_RUN and surface the error to userspace instead of
ompleting the ECALL with a negative SBI error in a0.
Use SBI_ERR_FAILURE and the normal retdata path, matching other PMU
handlers and kvm_sbi_ext_pmu_handler comment.
Fixes: c2f41ddbcdd7 ("RISC-V: KVM: Implement SBI PMU Snapshot feature")
Cc: stable@vger.kernel.org
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260514173642.41448-1-osama.abdelkader@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Date: Sun Jan 25 00:52:12 2026 -0500
riscv: mm: Fixup no5lvl failure when vaddr is invalid
[ Upstream commit db909bd7986c10da074917af3dae83a60fa65093 ]
Unlike no4lvl, no5lvl still continues to detect satp, which
requires va=pa mapping. When pa=0x800000000000, no5lvl
would fail in Sv48 mode due to an illegal VA value of
0x800000000000.
So, prevent detecting the satp flow for no5lvl, when
vaddr is invalid. Add the is_vaddr_valid() function for
checking.
Fixes: 26e7aacb83df ("riscv: Allow to downgrade paging mode from the command line")
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Tested-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
Link: https://patch.msgid.link/20260125055212.433163-1-guoren@kernel.org
[pjw@kernel.org: cleaned up commit message]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pengpeng Hou <pengpeng@iscas.ac.cn>
Date: Thu May 21 10:28:29 2026 +0800
s390/debug: Reject zero-length input before trimming a newline
[ Upstream commit c366a7b5ed7564e41345c380285bd3f6cb98971b ]
debug_get_user_string() copies the userspace buffer into a newly
allocated NUL-terminated buffer and then unconditionally looks at
buffer[user_len - 1] to strip a trailing newline.
A zero-length write reaches this helper unchanged, so the newline trim
reads before the start of the allocated buffer.
Reject empty writes before accessing the last input byte.
Fixes: 66a464dbc8e0 ("[PATCH] s390: debug feature changes")
Cc: stable@vger.kernel.org
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20260417073530.96002-1-pengpeng@iscas.ac.cn
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Mon May 25 23:11:17 2026 +0200
sched/deadline: Fix dl_server behaviour
commit a3a70caf7906708bf9bbc80018752a6b36543808 upstream.
John reported undesirable behaviour with the dl_server since commit:
cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling").
When starving fair tasks on purpose (starting spinning FIFO tasks),
his fair workload, which often goes (briefly) idle, would delay fair
invocations for a second, running one invocation per second was both
unexpected and terribly slow.
The reason this happens is that when dl_se->server_pick_task() returns
NULL, indicating no runnable tasks, it would yield, pushing any later
jobs out a whole period (1 second).
Instead simply stop the server. This should restore behaviour in that
a later wakeup (which restarts the server) will be able to continue
running (subject to the CBS wakeup rules).
Notably, this does not re-introduce the behaviour cccb45d7c4295 set
out to solve, any start/stop cycle is naturally throttled by the timer
period (no active cancel).
Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling")
Reported-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: John Stultz <jstultz@google.com>
Closes: https://lore.kernel.org/regressions/04657838-46d1-432d-95e1-eb73b930b032@mailbox.org
Signed-off-by: Lukas Beckmann <lbckmnn@mailbox.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Mon May 25 23:11:16 2026 +0200
sched/deadline: Fix dl_server getting stuck
commit 4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c upstream.
John found it was easy to hit lockup warnings when running locktorture
on a 2 CPU VM, which he bisected down to: commit cccb45d7c429
("sched/deadline: Less agressive dl_server handling").
While debugging it seems there is a chance where we end up with the
dl_server dequeued, with dl_se->dl_server_active. This causes
dl_server_start() to return without enqueueing the dl_server, thus it
fails to run when RT tasks starve the cpu.
When this happens, dl_server_timer() catches the
'!dl_se->server_has_tasks(dl_se)' case, which then calls
replenish_dl_entity() and dl_server_stopped() and finally return
HRTIMER_NO_RESTART.
This ends in no new timer and also no enqueue, leaving the dl_server
'dead', allowing starvation.
What should have happened is for the bandwidth timer to start the
zero-laxity timer, which in turn would enqueue the dl_server and cause
dl_se->server_pick_task() to be called -- which will stop the
dl_server if no fair tasks are observed for a whole period.
IOW, it is totally irrelevant if there are fair tasks at the moment of
bandwidth refresh.
This removes all dl_se->server_has_tasks() users, so remove the whole
thing.
Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling")
Reported-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: John Stultz <jstultz@google.com>
[ adjust renamed variable in fair_server_has_tasks (which this patch
removes) ]
Signed-off-by: Lukas Beckmann <lbckmnn@mailbox.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Huacai Chen <chenhuacai@kernel.org>
Date: Mon May 25 23:11:14 2026 +0200
sched/deadline: Fix dl_server_stopped()
commit 4717432dfd99bbd015b6782adca216c6f9340038 upstream.
Commit cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
introduces dl_server_stopped(). But it is obvious that dl_server_stopped()
should return true if dl_se->dl_server_active is 0.
Fixes: cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250809130419.1980742-1-chenhuacai@loongson.cn
Signed-off-by: Lukas Beckmann <lbckmnn@mailbox.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Mon May 25 23:11:13 2026 +0200
sched/deadline: Less agressive dl_server handling
commit cccb45d7c4295bbfeba616582d0249f2d21e6df5 upstream.
Chris reported that commit 5f6bd380c7bd ("sched/rt: Remove default
bandwidth control") caused a significant dip in his favourite
benchmark of the day. Simply disabling dl_server cured things.
His workload hammers the 0->1, 1->0 transitions, and the
dl_server_{start,stop}() overhead kills it -- fairly obviously a bad
idea in hind sight and all that.
Change things around to only disable the dl_server when there has not
been a fair task around for a whole period. Since the default period
is 1 second, this ensures the benchmark never trips this, overhead
gone.
Fixes: 557a6bfc662c ("sched/fair: Add trivial fair server")
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20250702121158.465086194@infradead.org
[ adjust context for renamed/removed variable names ]
Signed-off-by: Lukas Beckmann <lbckmnn@mailbox.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra (Intel) <peterz@infradead.org>
Date: Fri Oct 10 00:17:27 2025 +0530
sched/deadline: Stop dl_server before CPU goes offline
[ Upstream commit ee6e44dfe6e50b4a5df853d933a96bdff5309e6e ]
IBM CI tool reported kernel warning[1] when running a CPU removal
operation through drmgr[2]. i.e "drmgr -c cpu -r -q 1"
WARNING: CPU: 0 PID: 0 at kernel/sched/cpudeadline.c:219 cpudl_set+0x58/0x170
NIP [c0000000002b6ed8] cpudl_set+0x58/0x170
LR [c0000000002b7cb8] dl_server_timer+0x168/0x2a0
Call Trace:
[c000000002c2f8c0] init_stack+0x78c0/0x8000 (unreliable)
[c0000000002b7cb8] dl_server_timer+0x168/0x2a0
[c00000000034df84] __hrtimer_run_queues+0x1a4/0x390
[c00000000034f624] hrtimer_interrupt+0x124/0x300
[c00000000002a230] timer_interrupt+0x140/0x320
Git bisects to: commit 4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
This happens since:
- dl_server hrtimer gets enqueued close to cpu offline, when
kthread_park enqueues a fair task.
- CPU goes offline and drmgr removes it from cpu_present_mask.
- hrtimer fires and warning is hit.
Fix it by stopping the dl_server before CPU is marked dead.
[1]: https://lore.kernel.org/all/8218e149-7718-4432-9312-f97297c352b9@linux.ibm.com/
[2]: https://github.com/ibm-power-utilities/powerpc-utils/tree/next/src/drmgr
[sshegde: wrote the changelog and tested it]
Fixes: 4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
Closes: https://lore.kernel.org/all/8218e149-7718-4432-9312-f97297c352b9@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tejun Heo <tj@kernel.org>
Date: Thu May 21 10:52:11 2026 -0400
sched_ext: Avoid UAF in scx_root_enable_workfn() init failure path
[ Upstream commit 9a415cc53711f2238e0f0ca8a6bcc796c003b127 ]
In scx_root_enable_workfn(), put_task_struct(p) is called before scx_error()
dereferences p->comm and p->pid. If the iterator's reference is the last
drop, the task is freed synchronously and the deref becomes a UAF.
Move put_task_struct() past scx_error().
Reported-by: Sashiko <sashiko-bot@kernel.org>
Closes: https://lore.kernel.org/all/20260511214031.AF5E9C2BCB0@smtp.kernel.org/
Fixes: f0e1a0643a59 ("sched_ext: Implement BPF extensible scheduler class")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Tejun Heo <tj@kernel.org>
[ adapted fix to pre-refactor scx_ops_enable_workfn() with scx_task_iter_relock() instead of upstream scx_root_enable_workfn() ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Samuele Mariotti <smariotti@disroot.org>
Date: Thu May 21 10:52:10 2026 -0400
sched_ext: Fix missing warning in scx_set_task_state() default case
[ Upstream commit b905ee77d5f557a83a485b4146210f54f13365fc ]
In scx_set_task_state(), the default case was setting the
warn flag, but then returning immediately. This is problematic
because the only purpose of the warn flag is to trigger
WARN_ONCE, but the early return prevented it from ever firing,
leaving invalid task states undetected and untraced.
To fix this, a WARN_ONCE call is now added directly in the
default case.
The fix addresses two aspects:
- Guarantees the invalid task states are properly logged
and traced.
- Provides a distinct warning message
("sched_ext: Invalid task state") specifically for
states outside the defined scx_task_state enum values,
making it easier to distinguish from other transition
warnings.
This ensures proper detection and reporting of invalid states.
Signed-off-by: Samuele Mariotti <smariotti@disroot.org>
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Stable-dep-of: 9a415cc53711 ("sched_ext: Avoid UAF in scx_root_enable_workfn() init failure path")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Sun Apr 19 17:04:20 2026 -0400
scsi: isci: Fix use-after-free in device removal path
commit b52a8d52c3125ec9a93106ed816582368de34426 upstream.
The ISCI completion tasklet is initialized in isci_host_alloc()
(drivers/scsi/isci/init.c:496) and scheduled from both MSI-X and legacy
interrupt handlers (drivers/scsi/isci/host.c:223,613).
isci_host_deinit() stops the controller and waits for stop completion,
but it never kills completion_tasklet before teardown continues. A
top-of-function tasklet_kill() is not sufficient here: interrupts are
only disabled when isci_host_stop_complete() runs, so until
wait_for_stop() returns the IRQ handlers can still requeue the
tasklet. The tasklet callback also re-enables interrupts after draining
completions, so killing the tasklet before the source is quiesced leaves
the same race open.
Once wait_for_stop() returns, no further IRQ-driven scheduling can
occur. Kill completion_tasklet there so teardown cannot race a queued
tasklet running on a dead ihost. On remove or unload, the stale callback
can otherwise dereference ihost and touch ihost->smu_registers after the
host lifetime ends.
A UML + KASAN analogue reproduced the failure class both with no
tasklet_kill() and with tasklet_kill() placed before source quiesce, and
stayed clean once the kill happened after quiescing the scheduling
source.
This mirrors commit f6ab594672d4 ("scsi: aic94xx: fix use-after-free in
device removal path"), but ISCI needs the kill after wait_for_stop().
Fixes: 6f231dda6808 ("isci: Intel(R) C600 Series Chipset Storage Control Unit Driver")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260419210420.2134639-1-michael.bommarito@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mike Christie <michael.christie@oracle.com>
Date: Mon May 11 12:53:17 2026 -0500
scsi: sd: Fix return code handling in sd_spinup_disk()
[ Upstream commit 6ea68a8dc7d2711504d944811981a5304af7d7a9 ]
As found by smatch-ci, scsi_execute_cmd() can return negative or positve
values so we should use a int instead of unsigned int.
Fixes: b4d0c33a32c3 ("scsi: sd: Fix sshdr use in sd_spinup_disk")
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/linux-scsi/agFbI7E6JQwd3wGW@stanley.mountain/T/#u
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260511175317.114007-1-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu May 28 11:45:41 2026 -0700
security/keys: fix missed RCU read section on lookup
commit 43a1e3744548e6fd85873e6fb43e293eb4010694 upstream.
Nicholas Carlini reports that the keyring code calls assoc_array_find()
in find_key_to_update() without holding the RCU read lock, while the
assoc_array_gc() code really is designed around removing the node from
the tree and then freeing it after an RCU grace-period.
The regular key handling doesn't see this because holding the keyring
semaphore hides any lifetime issues, but the persistent key handling
uses a different model.
Instead of extending the keyring locking, just do the simple RCU locking
that the assoc_array was designed for.
Reported-by: Nicholas Carlini <npc@anthropic.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Luiz Capitulino <luizcap@redhat.com>
Date: Mon Apr 27 12:03:51 2026 -0400
selftests/mm: run_vmtests.sh: fix destructive tests invocation
commit 3432cbb291aabf85f8af4b9d1ec37179168ff999 upstream.
Destructive tests should be invoked with -d command-line option, but this
won't work today since 'd' is missing in getopts command-line. This
commit fixes it.
Link: https://lore.kernel.org/214fd9e4-5398-4c26-859e-c982c2e277c3@redhat.com
Fixes: f16ff3b692ad ("selftests/mm: run_vmtests.sh: add missing tests")
Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: ChenXiaoSong <chenxiaosong@kylinos.cn>
Date: Mon May 18 15:23:22 2026 +0000
smb/server: promote S_DEL_ON_CLS to S_DEL_PENDING when close
commit 4ec9c8e023c79f613fe4d5ad8cc737112efb2e44 upstream.
Reproducer:
1. server: systemctl start ksmbd
2. client: mount -t cifs //${server_ip}/export /mnt
3. client: C program: openat(AT_FDCWD, "/mnt", O_RDWR | O_TMPFILE, 0600)
Do not treat `FILE_DELETE_ON_CLOSE_LE` as delete pending while files
remain open.
This patch fixes xfstests generic/004.
Cc: stable@vger.kernel.org
Link: https://chenxiaosong.com/en/smb-xfstests-generic-004.html
Co-developed-by: Huiwen He <hehuiwen@kylinos.cn>
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Tested-by: Steve French <stfrench@microsoft.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Henrique Carvalho <henrique.carvalho@suse.com>
Date: Thu May 14 20:18:25 2026 -0300
smb: client: protect tc_count increment in smb2_find_smb_sess_tcon_unlocked()
commit 4d8690dace005a38e6dbde9ecce2da3ad85c7c41 upstream.
Commit 96c4af418586 ("cifs: Fix locking usage for tcon fields")
refactored cifs code to change cifs_tcp_ses_lock for tc_lock around
tc_count changes.
There was missing lock around tc_count increment inside
smb2_find_smb_sess_tcon_unlocked().
Cc: stable@vger.kernel.org
Fixes: 96c4af418586 ("cifs: Fix locking usage for tcon fields")
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Asim Viladi Oglu Manizada <manizada@pm.me>
Date: Sat May 16 21:15:39 2026 +0000
smb: client: reject userspace cifs.spnego descriptions
commit 3da1fdf4efbc490041eb4f836bf596201203f8f2 upstream.
cifs.spnego key descriptions contain authority-bearing fields such as
pid, uid, creduid, and upcall_target that cifs.upcall treats as
kernel-originating inputs. However, userspace can also create keys of
this type through request_key(2) or add_key(2), allowing those fields to
be supplied without CIFS origin.
Only accept cifs.spnego descriptions while CIFS is using its private
spnego_cred to request the key.
Fixes: f1d662a7d5e5 ("[CIFS] Add upcall files for cifs to use spnego/kerberos")
Assisted-by: avom-custom-harness:gpt-5.5-qwen3.6-mod-mix
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Asim Viladi Oglu Manizada <manizada@pm.me>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Sun May 17 20:11:50 2026 -0400
smb: client: require net admin for CIFS SWN netlink
commit d1ebfce2c1d161186a82e77590bf7da2ea1bce91 upstream.
CIFS_GENL_CMD_SWN_NOTIFY is the userspace witness-notify command. The
intended sender is the cifs.witness helper, but the generic-netlink
operation currently has no capability flag, so any local process can send
RESOURCE_CHANGE or CLIENT_MOVE notifications to the in-kernel witness
handler.
The same family exposes CIFS_GENL_MCGRP_SWN without multicast-group
capability flags. Register messages sent to that group include the witness
registration id and, for NTLM-authenticated mounts, the username, domain,
and password attributes copied from the CIFS session. An unprivileged
local process should not be able to join that group and receive those
messages.
Require CAP_NET_ADMIN for incoming SWN_NOTIFY commands with
GENL_ADMIN_PERM, and require CAP_NET_ADMIN over the network namespace for
joining the SWN multicast group with GENL_MCAST_CAP_NET_ADMIN. The
cifs.witness service runs with the privileges needed for both operations.
Fixes: fed979a7e082 ("cifs: Set witness notification handler for messages from userspace daemon")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jeremy Erazo <mendozayt13@gmail.com>
Date: Fri May 15 19:31:41 2026 +0000
smb: client: use data_len for SMB2 READ encrypted folioq copy
commit d4d76c9ee1997cc8c977a63f6c43551c253c1066 upstream.
In handle_read_data() the encrypted/folioq branch
(buf_len <= data_offset, reached via receive_encrypted_read for
transform PDUs > CIFSMaxBufSize + MAX_HEADER_SIZE) copies the READ
payload using buffer_len rather than data_len:
rdata->result = cifs_copy_folioq_to_iter(buffer, buffer_len,
cur_off,
&rdata->subreq.io_iter);
...
rdata->got_bytes = buffer_len;
buffer_len comes from the SMB3 transform header OriginalMessageSize
field (OriginalMessageSize - read_rsp_size); it represents the size
of the decrypted message after the SMB2 header. data_len comes from
the SMB2 READ response DataLength field; it represents the actual
READ payload size and may be smaller than buffer_len when the
decrypted message contains padding or other trailing bytes after the
READ payload. The existing check `data_len > buffer_len - pad_len`
only enforces an upper bound, so a server that emits
OriginalMessageSize larger than read_rsp_size + pad_len + data_len
passes the check and the kernel copies buffer_len bytes per response,
ignoring the server-asserted DataLength.
Two observable failures with a crafted server (DataLength=4,
buffer_len=20000):
- the kernel returns 20000 bytes per sub-request to userspace and
sets got_bytes = buffer_len, even though the response claimed
only 4 bytes of payload;
- on a partial netfs sub-request whose iterator is sized to
data_len, the over-large copy_folio_to_iter() short-reads,
cifs_copy_folioq_to_iter() returns -EIO via the n != len path,
and the entire netfs read collapses to -EIO even though the
leading sub-requests succeeded.
Use data_len for the copy length and for got_bytes so the kernel
honours the server-asserted READ payload size. For well-formed
servers (where buffer_len == pad_len + data_len) the change is
behaviour-equivalent.
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Johan Hovold <johan@kernel.org>
Date: Tue May 12 09:48:49 2026 +0200
spi: ep93xx: fix error pointer deref after DMA setup failure
commit 5e121a81667a83e9a01d62b429e340f5a4a84abc upstream.
The driver falls back to PIO mode if DMA setup fails during probe.
Make sure to the clear the DMA channel pointers on setup failure to
avoid dereferencing an error pointer on later probe errors or driver
unbind.
This issue was flagged by Sashiko when reviewing a devres allocation
conversion patch.
Fixes: e79e7c2df627 ("spi: ep93xx: add DT support for Cirrus EP93xx")
Link: https://sashiko.dev/#/patchset/20260429091333.165363-1-johan%40kernel.org?part=10
Cc: stable@vger.kernel.org # 6.12
Cc: Nikita Shubin <nikita.shubin@maquefel.me>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Nikita Shubin <nikita.shubin@maquefel.me>
Link: https://patch.msgid.link/20260512074849.915143-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Sun May 10 01:55:37 2026 +0800
spi: mtk-snfi: Fix resource leak in mtk_snand_read_page_cache()
[ Upstream commit 496ba79b9496b8b3747cbc764ebd33ee7325e806 ]
When DMA read times out in mtk_snand_read_page_cache(), the original code
erroneously jumped to cleanup label which skips DMA unmapping and ECC
disable, causing a resource leak.
Fixes: 764f1b748164 ("spi: add driver for MTK SPI NAND Flash Interface")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260510-snfi-v1-1-bc375cf1af8e@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Johan Hovold <johan@kernel.org>
Date: Tue May 12 09:43:34 2026 +0200
spi: qup: fix error pointer deref after DMA setup failure
commit a7e8f3efd50a165ba0189f6dc57f7e51a7d149db upstream.
The driver falls back to PIO mode if DMA setup fails during probe.
Make sure to the clear the DMA channel pointers on setup failure to
avoid dereferencing an error pointer (or attempting to release a channel
a second time) on later probe errors or driver unbind.
This issue was flagged by Sashiko when reviewing a devres allocation
conversion patch.
Fixes: 612762e82ae6 ("spi: qup: Add DMA capabilities")
Link: https://sashiko.dev/#/patchset/20260505072909.618363-1-johan%40kernel.org?part=4
Cc: stable@vger.kernel.org # 4.1
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260512074334.914735-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Vladimir Yakovlev <vovchkir@gmail.com>
Date: Tue Mar 3 01:20:17 2026 +0300
spi: spi-dw-dma: fix print error log when wait finish transaction
[ Upstream commit 3b46d61890632c8f8b117147b6923bff4b42ccb7 ]
If an error occurs, the device may not have a current message. In this
case, the system will crash.
In this case, it's better to use dev from the struct ctlr (struct spi_controller*).
Signed-off-by: Vladimir Yakovlev <vovchkir@gmail.com>
Link: https://patch.msgid.link/20260302222017.992228-2-vovchkir@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Johan Hovold <johan@kernel.org>
Date: Tue May 12 09:47:33 2026 +0200
spi: sprd: fix error pointer deref after DMA setup failure
commit 3d67fffb74267772d461c02c67f1eff893ad547d upstream.
The driver falls back to PIO mode if DMA setup fails during probe.
Make sure to check the dma.enabled flag before trying to release the DMA
channels also on late probe errors to avoid dereferencing an error
pointer (or attempting to release a channel a second time).
This issue was flagged by Sashiko when reviewing a devres allocation
conversion patch.
Fixes: 386119bc7be9 ("spi: sprd: spi: sprd: Add DMA mode support")
Link: https://sashiko.dev/#/patchset/20260505072909.618363-1-johan%40kernel.org?part=10
Cc: stable@vger.kernel.org # 5.1
Cc: Lanqing Liu <lanqing.liu@unisoc.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260512074733.915029-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Johan Hovold <johan@kernel.org>
Date: Tue May 12 09:48:09 2026 +0200
spi: ti-qspi: fix use-after-free after DMA setup failure
commit ea6ec3343e05f7937a53eb6d7617b3abdb4abc19 upstream.
The driver falls back to PIO mode if DMA setup fails during probe.
Make sure to clear the DMA channel pointer also if buffer allocation
fails to avoid passing a pointer to the released channel to the DMA
engine (or trying to free the channel a second time on late probe errors
or driver unbind).
This issue was flagged by Sashiko when reviewing a devres allocation
conversion patch.
Fixes: c687c46e9e45 ("spi: spi-ti-qspi: Use bounce buffer if read buffer is not DMA'ble")
Link: https://sashiko.dev/#/patchset/20260505072909.618363-1-johan%40kernel.org?part=17
Cc: stable@vger.kernel.org # 4.12
Cc: Vignesh R <vigneshr@ti.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260512074809.915084-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Wed May 20 15:05:04 2026 +0200
sysfs: don't remove existing directory on update failure
commit 237557b8a81ab948e8332f7c0058e758f081c0a3 upstream.
When sysfs_update_group() is called for a named group and create_files()
fails (e.g. -ENOMEM), internal_create_group() calls kernfs_remove(kn) on
the group directory. In the update path, kn was obtained via
kernfs_find_and_get() and refers to a directory that already existed
before this call. Removing it silently destroys a sysfs group that the
caller did not create.
Only remove the directory if we created it ourselves. On update failure
the directory remains as it is left empty by remove_files() inside
create_files(), but can be repopulated by a retry.
Cc: Rajat Jain <rajatja@google.com>
Fixes: c855cf2759d2 ("sysfs: Fix internal_create_group() for named group updates")
Cc: stable <stable@kernel.org>
Assisted-by: gkh_clanker_t1000
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/2026052003-uniquely-hastily-c093@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kuniyuki Iwashima <kuniyu@google.com>
Date: Wed May 6 03:59:19 2026 +0000
tcp: Fix imbalanced icsk_accept_queue count.
[ Upstream commit 7eca3292cac7c26dad4c236f51ba225c39a0523f ]
When TCP socket migration happens in reqsk_timer_handler(),
@sk_listener will be updated with the new listener.
When we call __inet_csk_reqsk_queue_drop(), the listener must
be the one stored in req->rsk_listener.
The cited commit accidentally replaced oreq->rsk_listener with
sk_listener, leading to imbalanced icsk_accept_queue count.
Let's pass the correct listener to __inet_csk_reqsk_queue_drop().
Fixes: e8c526f2bdf1 ("tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink().")
Reported-by: Damiano Melotti <melotti@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260506035954.1563147-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuniyuki Iwashima <kuniyu@google.com>
Date: Fri May 8 12:08:46 2026 +0000
tcp: Fix out-of-bounds access for twsk in tcp_ao_established_key().
[ Upstream commit 03cb001ef87b3f8d859cf7f96329acf3d6235d29 ]
lockdep_sock_is_held() was added in tcp_ao_established_key()
by the cited commit.
It can be called from tcp_v[46]_timewait_ack() with twsk.
Since it does not have sk->sk_lock, the lockdep annotation
results in out-of-bound access.
$ pahole -C tcp_timewait_sock vmlinux | grep size
/* size: 288, cachelines: 5, members: 8 */
$ pahole -C sock vmlinux | grep sk_lock
socket_lock_t sk_lock; /* 440 192 */
Let's not use lockdep_sock_is_held() for TCP_TIME_WAIT.
Fixes: 6b2d11e2d8fc ("net/tcp: Add missing lockdep annotations for TCP-AO hlist traversals")
Reported-by: Damiano Melotti <melotti@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260508120853.4098365-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Martin Kaiser <martin@kaiser.cx>
Date: Fri May 8 09:56:36 2026 +0900
test_kprobes: clear kprobes between test runs
[ Upstream commit ef5581bb30efb939cc2bf093475c6cc85258e5cd ]
Running the kprobes sanity tests twice makes all tests fail and
eventually crashes the kernel.
[root@martin-riscv-1 ~]# echo 1 > /sys/kernel/debug/kunit/kprobes_test/run
...
# Totals: pass:5 fail:0 skip:0 total:5
ok 1 kprobes_test
[root@martin-riscv-1 ~]# echo 1 > /sys/kernel/debug/kunit/kprobes_test/run
...
# test_kprobe: EXPECTATION FAILED at lib/tests/test_kprobes.c:64
Expected 0 == register_kprobe(&kp), but
register_kprobe(&kp) == -22 (0xffffffffffffffea)
...
Unable to handle kernel paging request ...
The testsuite defines several kprobes and kretprobes as static variables
that are preserved across test runs.
After register_kprobe and unregister_kprobe, a kprobe contains some
leftover data that must be cleared before the kprobe can be registered
again. The tests are setting symbol_name to define the probe location.
Address and flags must be cleared.
The existing code clears some of the probes between subsequent tests, but
not between two test runs. The leftover data from a previous test run
makes the registrations fail in the next run.
Move the cleanups for all kprobes into kprobes_test_init, this function
is called before each single test (including the first test of a test
run).
Link: https://lore.kernel.org/all/20260507134615.1010905-1-martin@kaiser.cx/
Fixes: e44e81c5b90f ("kprobes: convert tests to kunit")
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Wed May 13 08:58:25 2026 -0400
tls: Preserve sk_err across recvmsg() when data has been copied
[ Upstream commit f508262ae9f21fe0e6c0749948b9dc7dd5a62a70 ]
The sk_err check in tls_rx_rec_wait() consumes the error via
sock_error(), which clears sk_err atomically. When the caller
(tls_sw_recvmsg, tls_sw_splice_read, or tls_sw_read_sock) already
has bytes copied to userspace, it returns those bytes and discards
the error from this call. sk_err is now zero on the socket, so the
next read syscall observes only RCV_SHUTDOWN and reports a clean
EOF instead of the actual error (typically -ECONNRESET).
The race is reachable when tls_read_flush_backlog()'s periodic
sk_flush_backlog() triggers tcp_reset() in the middle of a
multi-record read.
Pass a has_copied flag to tls_rx_rec_wait(). When has_copied is
false, consume sk_err via sock_error() as before. When has_copied
is true, report the error from READ_ONCE() but leave sk_err set:
the caller returns the byte count and discards the err from this
call, and the next read syscall surfaces the preserved sk_err. This
mirrors the tcp_recvmsg() preserve-and-surface pattern.
The decrypt-abort path is unaffected: tls_err_abort() raises
sk_err to EBADMSG after tls_rx_rec_wait() returns, and nothing
on the caller's return path consumes it, so the EBADMSG surfaces
on the next read.
tls_sw_splice_read() passes has_copied=false: it processes
one record per call, so no bytes have been copied within the
function when tls_rx_rec_wait() runs. A reset that arrives
between iterations of splice_direct_to_actor() (the sendfile()
path) is still consumed by sock_error() in the later call, and the
outer loop returns the prior iterations' byte count and drops the
error. tcp_splice_read() exhibits the same pattern at the iteration
boundary; addressing it belongs at the splice_direct_to_actor()
layer and is out of scope here.
Fixes: c46b01839f7a ("tls: rx: periodically flush socket backlog")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260513125825.205189-1-cel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Carlier <devnexen@gmail.com>
Date: Fri May 8 20:57:47 2026 +0100
tracing: Avoid NULL return from hist_field_name() on truncation
[ Upstream commit 576ec047d20b368b43c4d5db98c4f2e0f3c101ec ]
hist_field_name() returns "" everywhere except the fully-qualified
VAR_REF/EXPR case, where snprintf() truncation returns NULL early
and bypasses the bottom NULL->"" guard. Callers don't expect NULL:
strcat(expr, hist_field_name(field, 0)) at trace_events_hist.c:1758
and the strcmp() in the sort-key match loop at :4804 both deref it.
system and event_name are bounded by MAX_EVENT_NAME_LEN, but the
field name on a VAR_REF is kstrdup'd from a histogram variable
name parsed out of the trigger string and has no length cap, so
a long enough var name in a fully qualified reference can reach
the truncation path.
Keep the length check but leave field_name as "" on overflow.
Link: https://patch.msgid.link/20260508195747.25492-1-devnexen@gmail.com
Fixes: 5ec1d1e97de1 ("tracing: Rebuild full_name on each hist_field_name() call")
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Date: Thu May 21 13:49:14 2026 +0900
tracing: Do not call map->ops->elt_free() if elt_alloc() fails
commit 8f0f5c4fb9df0e19a341e0c6ed8dc4fda9124f03 upstream.
In paths where tracing_map_elt_alloc() failed to allocate objects,
the map->ops->elt_alloc() call was never successful. In this case,
map->ops->elt_free() should not be called.
Link: https://sashiko.dev/#/patchset/20260520223101.34710-1-rosenp%40gmail.com
Cc: stable@vger.kernel.org
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Rosen Penev <rosenp@gmail.com>
Reported-by: Sashiko <sashiko-bot@kernel.org>
Fixes: 2734b629525a ("tracing: Add per-element variable support to tracing_map")
Link: https://patch.msgid.link/177933895460.108746.5396070821443932634.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Feng Yang <yangfeng@kylinos.cn>
Date: Tue May 26 19:20:12 2026 +0000
tracing: Fix the bug where bpf_get_stackid returns -EFAULT on the ARM64
commit fd2f74f8f3d3c1a524637caf5bead9757fae4332 upstream.
When using bpf_program__attach_kprobe_multi_opts on ARM64 to hook a BPF program
that contains the bpf_get_stackid function, the BPF program fails
to obtain the stack trace and returns -EFAULT.
This is because ftrace_partial_regs omits the configuration of the pstate register,
leaving pstate at the default value of 0. When get_perf_callchain executes,
it uses user_mode(regs) to determine whether it is in kernel mode.
This leads to a misjudgment that the code is in user mode,
so perf_callchain_kernel is not executed and the function returns directly.
As a result, trace->nr becomes 0, and finally -EFAULT is returned.
Therefore, the assignment of the pstate register is added here.
Fixes: b9b55c8912ce ("tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs")
Closes: https://lore.kernel.org/bpf/20250919071902.554223-1-yangfeng59949@163.com/
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming Lei <tom.leiming@gmail.com>
Date: Sun May 10 22:48:43 2026 +0800
ublk: reject max_sectors smaller than PAGE_SECTORS in parameter validation
[ Upstream commit 1860c2f85922917d8a46f16a6f4bd2298ffa0fb5 ]
blk_validate_limits() requires max_hw_sectors >= PAGE_SECTORS and fires
a WARN_ON_ONCE if this invariant is violated. ublk_validate_params()
only checked the upper bound of max_sectors against max_io_buf_bytes,
allowing userspace to pass small values (including zero) that trigger
the warning when blk_mq_alloc_disk() is called from
ublk_ctrl_start_dev().
Before 494ea040bcb5, ublk used blk_queue_max_hw_sectors() which silently
clamped small values up to PAGE_SECTORS. The conversion to passing
queue_limits directly to blk_mq_alloc_disk() lost that clamping and now
hits blk_validate_limits()'s WARN_ON_ONCE instead.
Validate that max_sectors is at least PAGE_SECTORS in
ublk_validate_params() so invalid values are rejected early with
-EINVAL instead of reaching the block layer.
Fixes: 494ea040bcb5 ("ublk: pass queue_limits to blk_mq_alloc_disk")
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260510144843.769031-1-tom.leiming@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefano Garzarella <sgarzare@redhat.com>
Date: Mon May 18 11:06:55 2026 +0200
vsock/virtio: reset connection on receiving queue overflow
commit a4f0b001782b21663d10df983b4b208195bec66c upstream.
When there is no more space to queue an incoming packet, the packet is
silently dropped. This causes data loss without any notification to
either peer, since there is no retransmission.
Under normal circumstances, this should never happen. However, it could
happen if the other peer doesn't respect the credit, or if the skb
overhead, which we recently began to take into account with commit
059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue"),
is too high.
Fix this by resetting the connection and setting the local socket error
to ENOBUFS when virtio_transport_recv_enqueue() can no longer queue a
packet, so both peers are explicitly notified of the failure rather than
silently losing data.
Fixes: ae6fcfbf5f03 ("vsock/virtio: discard packets if credit is not respected")
Cc: stable@vger.kernel.org
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260518090656.134588-2-sgarzare@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Minh Nguyen <minhnguyen.080505@gmail.com>
Date: Tue May 19 17:23:10 2026 +0700
vsock/vmci: fix UAF when peer resets connection during handshake
commit 99e22ddf4edb63dc8382bc028af928056d3450cf upstream.
vmci_transport_recv_connecting_server() returned err = 0 for a peer
RST in its default switch arm:
err = pkt->type == VMCI_TRANSPORT_PACKET_TYPE_RST ? 0 : -EINVAL;
That made vmci_transport_recv_listen() skip vsock_remove_pending(),
leaving the pending socket on the listener's pending_links with
sk_state = TCP_CLOSE while destroy: still dropped the explicit
reference taken before schedule_delayed_work().
One second later vsock_pending_work() observed is_pending=true and
performed full cleanup: vsock_remove_pending() then the two trailing
sock_put(sk) calls -- the first reached refcount 0 and __sk_freed
the socket, and the second wrote into the freed object:
BUG: KASAN: slab-use-after-free in refcount_warn_saturate
Write of size 4 at addr ffff88800b1cac80 by task kworker
Workqueue: events vsock_pending_work
Treat peer RST like any other unexpected packet type (err = -EINVAL).
All destroy: arms now return err < 0, so vmci_transport_recv_listen()
removes pending from pending_links synchronously and
vsock_pending_work() takes the is_pending=false / !rejected branch,
dropping only its own work reference. This also closes the
multi-packet race Sashiko reported on v2: pending is removed from
the list before any subsequent packet can find it.
The pre-existing sk_acceptq_removed() gap on the err < 0 path of
vmci_transport_recv_listen() that Sashiko also noted is not
introduced or changed by this patch.
Tested on lts-6.12.79 with KASAN: 52/100 unpatched -> 0/100 patched.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Cc: stable@vger.kernel.org
Signed-off-by: Minh Nguyen <minhnguyen.080505@gmail.com>
Acked-by: Bryan Tan <bryan-bt.tan@broadcom.com>
Link: https://patch.msgid.link/20260519102310.237181-1-minhnguyen.080505@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kang Yang <kang.yang@oss.qualcomm.com>
Date: Tue Apr 28 14:17:37 2026 +0800
wifi: ath10k: skip WMI and beacon transmission when device is wedged
[ Upstream commit 54a5b38e4396530e5b2f12b54d3844e860ab6784 ]
In ath10k_wmi_cmd_send(), the current code detects ATH10K_STATE_WEDGED
and sets ret to -ESHUTDOWN, but still proceeds to transmit pending
beacons and calls ath10k_wmi_cmd_send_nowait().
This can lead to incorrect behavior, as WMI commands and beacons are
still sent after the device has been marked as wedged, and the original
-ESHUTDOWN return value may be overwritten by the result of the send
path.
The wedged state indicates the hardware is already unreliable, and no
further interaction with firmware is expected or meaningful in this
state.
Fix this by skipping beacon transmission and the WMI send path entirely
once ATH10K_STATE_WEDGED is detected, ensuring consistent return values
and avoiding unnecessary firmware interaction.
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1
Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00189
Fixes: c256a94d1b1b ("wifi: ath10k: shutdown driver when hardware is unreliable")
Signed-off-by: Kang Yang <kang.yang@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20260428061737.37-1-kang.yang@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kyle Farnung <kfarnung@gmail.com>
Date: Wed May 13 21:52:12 2026 -0700
wifi: ath11k: clear shared SRNG pointer state on restart
commit f51e4b3b5574ad8cb5b16b11f8a1452147ece87a upstream.
LMAC rings reuse the shared rdp/wrp pointer buffers without going
through the normal SRNG hw-init path that zeros non-LMAC ring
pointers. After restart, ath11k_hal_srng_clear() can therefore hand
stale hp/tp state from the previous firmware instance back to the new
one.
Clear the shared pointer buffers while keeping the allocations in
place so restart still avoids reallocating SRNG DMA memory, but starts
with fresh ring-pointer state.
Fixes: 32be3ca4cf78b ("wifi: ath11k: HAL SRNG: don't deinitialize and re-initialize again")
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/all/CAOPSVF04q6uvVdq8GTRLHBrVMdpt9=o9wVcFMc6f-yhmSBcZqQ@mail.gmail.com/
Signed-off-by: Kyle Farnung <kfarnung@gmail.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20260513-kfarnung-ath11k-srng-clear-pointer-state-v1-1-bc700dd8b333@gmail.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nicolas Escande <nico.escande@gmail.com>
Date: Wed May 6 15:42:40 2026 +0200
wifi: ath11k: fix error path leak in ath11k_tm_cmd_wmi_ftm()
[ Upstream commit 7320d6eb861e9913193a7801834c661381756a79 ]
This is similar to what was fixed by previous patches. We have a call
to ath11k_wmi_cmd_send() which does check the return value, but forgot
to free the related skb on error.
Fixes: b43310e44edc ("wifi: ath11k: factory test mode support")
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Link: https://patch.msgid.link/20260506134240.2284016-4-nico.escande@gmail.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nicolas Escande <nico.escande@gmail.com>
Date: Wed May 6 15:42:38 2026 +0200
wifi: ath11k: fix error path leaks in some WMI WOW calls
[ Upstream commit 55dda532bbc261aef495e403c8900c5e2ab5fa34 ]
Fix two instances where we used to directly return the result of
ath11k_wmi_cmd_send(...). Because we did not check the return value, we
also did not free the skb in the error path.
Fixes: 79802b13a492 ("ath11k: implement WoW enable and wakeup commands")
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Link: https://patch.msgid.link/20260506134240.2284016-2-nico.escande@gmail.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthew Leach <matthew.leach@collabora.com>
Date: Fri Apr 24 10:50:35 2026 +0100
wifi: ath11k: fix peer resolution on rx path when peer_id=0
[ Upstream commit 2a2451a34afdf563b3102d36a4b6cf335cf813e2 ]
It has been observed that on certain chipsets a peer can be assigned
peer_id=0. For reception of non-aggregated MPDUs this is fine as
ath11k_dp_rx_h_find_peer() has a fallback case where it locates the peer
based upon the source MAC address. On an aggregated link, the mpdu_start
header is only populated by hardware on the first sub-MSDU. This causes
the peer resolution to be skipped for the subsequent MSDUs and the
encryption type of these frames to be set to an incorrect value,
resulting in these MSDUs being dropped by ieee80211.
ath11k_pci 0000:03:00.0: data rx skb 000000002f4b704d len 1534 peer xx:xx:xx:xx:xx:xx 0 ucast sn 3063 he160 rate_idx 9 vht_nss 2 freq 5240 band 1 flag 0x40d1a fcs-err 0 mic-err 0 amsdu-more 0 peer_id 0 first_msdu 1 last_msdu 0
ath11k_pci 0000:03:00.0: data rx skb 0000000038acd580 len 1534 peer (null) 0 ucast sn 3063 he160 rate_idx 9 vht_nss 2 freq 5240 band 1 flag 0x40d00 fcs-err 0 mic-err 0 amsdu-more 0 peer_id 0 first_msdu 0 last_msdu 1
Remove the null peer_id checks in ath11k_dp_rx_h_find_peer() and
ath11k_hal_rx_parse_mon_status_tlv(), allowing peers with an assigned ID
of 0 to be resolved.
Tested-on: QCA2066 hw2.1 PCI WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.9
Fixes: 2167fa606c0f ("ath11k: Add support for RX decapsulation offload")
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Signed-off-by: Matthew Leach <matthew.leach@collabora.com>
Reviewed-by: P Praneesh <praneesh.p@oss.qualcomm.com>
Link: https://patch.msgid.link/20260424-ath11k-null-peerid-workaround-v4-1-252b224d3cf6@collabora.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Walker <johnwalker0@gmail.com>
Date: Thu May 7 17:07:20 2026 -0600
wifi: cfg80211: advance loop vars in cfg80211_merge_profile()
commit 7666dbb1bacc4ba522b96740cba7283d243d16e1 upstream.
cfg80211_merge_profile() reassembles a Multi-BSSID non-transmitted BSS
profile that has been split across multiple consecutive MBSSID elements.
Its while-loop calls
cfg80211_get_profile_continuation(ie, ielen, mbssid_elem, sub_elem)
but never advances mbssid_elem or sub_elem inside the body. Each
iteration therefore searches for a continuation that follows the same
fixed pair; the helper returns the same next_mbssid; and the same
next_sub bytes are memcpy()'d into merged_ie at a growing offset until
the buffer fills.
Advance both mbssid_elem and sub_elem to the just-consumed continuation
so the next call to cfg80211_get_profile_continuation() searches for a
further continuation beyond it (or returns NULL when none exists).
A specially-crafted malicious beacon can take advantage of this bug
to cause the kernel to spend an excessive amount of time in
cfg80211_merge_profile (up to as much as 2ms per beacon received),
which could theoretically be abused in some way.
Cc: stable@vger.kernel.org
Fixes: fe806e4992c9 ("cfg80211: support profile split between elements")
Signed-off-by: John Walker <johnwalker0@gmail.com>
Link: https://patch.msgid.link/20260507230720.64783-1-johnwalker0@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Fri May 15 11:17:18 2026 -0400
wifi: mac80211: consume only present negotiated TTLM maps
commit a6e6ccd5bd07155c2add6c74ce1a5e68ad3b95ea upstream.
ieee80211_tid_to_link_map_size_ok() validates negotiated TTLM elements
against the number of link-map entries indicated by link_map_presence.
ieee80211_parse_neg_ttlm() must consume the same layout.
The parser advanced its cursor for every TID, including TIDs whose
presence bit is clear and therefore have no map bytes in the element.
A sparse map can then make a later present TID read past the validated
element.
The bad bytes land in neg_ttlm->{up,down}link[tid] but are gated by
valid_links before being applied to driver state, so a peer cannot
turn the read into a policy change. Under KUnit + KASAN with an
exact-sized element allocation the OOB read is reported as a
slab-out-of-bounds; whether the same trigger fires under the
production RX path depends on surrounding allocator state.
Advance the cursor only when the current TID has a map present.
Fixes: 8f500fbc6c65 ("wifi: mac80211: process and save negotiated TID to Link mapping request")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260515151719.1317659-2-michael.bommarito@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Johannes Berg <johannes.berg@intel.com>
Date: Fri May 8 09:10:31 2026 +0200
wifi: mac80211: fix MLE defragmentation
[ Upstream commit a74e893f30db64cdce0fc7a96d3baa417bcd55f5 ]
If either reconf or EPCS multi-link element (MLE) is contained in
a non-transmitted profile, the defragmentation routine is called
with a pointer to the defragmented copy, but the original elements.
This is incorrect for two reasons:
- if the original defragmentation was needed, it will not find the
correct data
- if the original frame is at a higher address, the parsing will
potentially overrun the heap data (though given the layout of
the buffers, only into the new defragmentation buffer, and then
it has to stop and fail once that's filled with copied data.
Fix it by tracking the container along with the pointer and in
doing so also unify the two almost identical defragmentation
routines.
Fixes: 4d70e9c5488d ("wifi: mac80211: defragment reconfiguration MLE when parsing")
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Link: https://patch.msgid.link/20260508091031.8a6c34613178.I4de16ebbce2d27f2f8f98fc49949c7a376c2fe8d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiri Olsa <jolsa@kernel.org>
Date: Tue May 26 19:23:24 2026 +0000
x86/fgraph: Fix return_to_handler regs.rsp value
commit 8bc11700e0d23d4fdb7d8d5a73b2e95de427cabc upstream.
The previous change (Fixes commit) messed up the rsp register value,
which is wrong because it's already adjusted with FRAME_SIZE, we need
the original rsp value.
This change does not affect fprobe current kernel unwind, the !perf_hw_regs
path perf_callchain_kernel:
if (perf_hw_regs(regs)) {
if (perf_callchain_store(entry, regs->ip))
return;
unwind_start(&state, current, regs, NULL);
} else {
unwind_start(&state, current, NULL, (void *)regs->sp);
}
which uses pt_regs.sp as first_frame boundary (FRAME_SIZE shift makes
no difference, unwind stil stops at the right frame).
This change fixes the other path when we want to unwind directly from
pt_regs sp/fp/ip state, which is coming in following change.
Fixes: 20a0bc10272f ("x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/bpf/20260126211837.472802-2-jolsa@kernel.org
Signed-off-by: Gyokhan Kochmarla <gyokhan@amazon.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Juergen Gross <jgross@suse.com>
Date: Tue May 5 12:24:17 2026 +0200
x86/xen: Fix xen_e820_swap_entry_with_ram()
[ Upstream commit 28e03f78e69cf6628b81f24777799778528a84c1 ]
When swapping a not page-aligned E820 map entry with RAM, the start
address of the modified entry is calculated wrong (the offset into the
page is subtracted instead of being added to the page address).
Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
Reported-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260505102417.208138-1-jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Date: Wed Apr 29 22:58:15 2026 +0200
zonefs: handle integer overflow in zonefs_fname_to_fno
[ Upstream commit 3a8389d42bdf4213730f4067f8bfa78bae6564ef ]
In zonefs the file name in one of the two directories corresponds to the
zone number.
Here Alexey reported a possible integer overflow in zonefs_fname_to_fno(),
where the parsing of the zone number from the file name can overflow the
'long' data type.
Add a check for integer overflows and if the fno 'long' did overflow
return -ENOENT.
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Fixes: d207794ababe ("zonefs: Dynamically create file inodes when needed")
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>