Author: Weiming Shi <bestswngs@gmail.com>
Date: Wed Mar 25 00:54:59 2026 +0800
ACPI: EC: clean up handlers on probe failure in acpi_ec_setup()
[ Upstream commit f6484cadbcaf26b5844b51bd7307a663dda48ef6 ]
When ec_install_handlers() returns -EPROBE_DEFER on reduced-hardware
platforms, it has already started the EC and installed the address
space handler with the struct acpi_ec pointer as handler context.
However, acpi_ec_setup() propagates the error without any cleanup.
The caller acpi_ec_add() then frees the struct acpi_ec for non-boot
instances, leaving a dangling handler context in ACPICA.
Any subsequent AML evaluation that accesses an EC OpRegion field
dispatches into acpi_ec_space_handler() with the freed pointer,
causing a use-after-free:
BUG: KASAN: slab-use-after-free in mutex_lock (kernel/locking/mutex.c:289)
Write of size 8 at addr ffff88800721de38 by task init/1
Call Trace:
<TASK>
mutex_lock (kernel/locking/mutex.c:289)
acpi_ec_space_handler (drivers/acpi/ec.c:1362)
acpi_ev_address_space_dispatch (drivers/acpi/acpica/evregion.c:293)
acpi_ex_access_region (drivers/acpi/acpica/exfldio.c:246)
acpi_ex_field_datum_io (drivers/acpi/acpica/exfldio.c:509)
acpi_ex_extract_from_field (drivers/acpi/acpica/exfldio.c:700)
acpi_ex_read_data_from_field (drivers/acpi/acpica/exfield.c:327)
acpi_ex_resolve_node_to_value (drivers/acpi/acpica/exresolv.c:392)
</TASK>
Allocated by task 1:
acpi_ec_alloc (drivers/acpi/ec.c:1424)
acpi_ec_add (drivers/acpi/ec.c:1692)
Freed by task 1:
kfree (mm/slub.c:6876)
acpi_ec_add (drivers/acpi/ec.c:1751)
The bug triggers on reduced-hardware EC platforms (ec->gpe < 0)
when the GPIO IRQ provider defers probing. Once the stale handler
exists, any unprivileged sysfs read that causes AML to touch an
EC OpRegion (battery, thermal, backlight) exercises the dangling
pointer.
Fix this by calling ec_remove_handlers() in the error path of
acpi_ec_setup() before clearing first_ec. ec_remove_handlers()
checks each EC_FLAGS_* bit before acting, so it is safe to call
regardless of how far ec_install_handlers() progressed:
-ENODEV (handler not installed): only calls acpi_ec_stop()
-EPROBE_DEFER (handler installed): removes handler, stops EC
Fixes: 03e9a0e05739 ("ACPI: EC: Consolidate event handler installation code")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Link: https://patch.msgid.link/20260324165458.1337233-2-bestswngs@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Sat Mar 14 17:02:10 2026 +0000
af_key: validate families in pfkey_send_migrate()
[ Upstream commit eb2d16a7d599dc9d4df391b5e660df9949963786 ]
syzbot was able to trigger a crash in skb_put() [1]
Issue is that pfkey_send_migrate() does not check old/new families,
and that set_ipsecrequest() @family argument was truncated,
thus possibly overfilling the skb.
Validate families early, do not wait set_ipsecrequest().
[1]
skbuff: skb_over_panic: text:ffffffff8a752120 len:392 put:16 head:ffff88802a4ad040 data:ffff88802a4ad040 tail:0x188 end:0x180 dev:<NULL>
kernel BUG at net/core/skbuff.c:214 !
Call Trace:
<TASK>
skb_over_panic net/core/skbuff.c:219 [inline]
skb_put+0x159/0x210 net/core/skbuff.c:2655
skb_put_zero include/linux/skbuff.h:2788 [inline]
set_ipsecrequest net/key/af_key.c:3532 [inline]
pfkey_send_migrate+0x1270/0x2e50 net/key/af_key.c:3636
km_migrate+0x155/0x260 net/xfrm/xfrm_state.c:2848
xfrm_migrate+0x2140/0x2450 net/xfrm/xfrm_policy.c:4705
xfrm_do_migrate+0x8ff/0xaa0 net/xfrm/xfrm_user.c:3150
Fixes: 08de61beab8a ("[PFKEYV2]: Extension for dynamic update of endpoint address(es)")
Reported-by: syzbot+b518dfc8e021988fbd55@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69b5933c.050a0220.248e02.00f2.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhan Xusheng <zhanxusheng1024@gmail.com>
Date: Mon Mar 23 14:11:30 2026 +0800
alarmtimer: Fix argument order in alarm_timer_forward()
commit 5d16467ae56343b9205caedf85e3a131e0914ad8 upstream.
alarm_timer_forward() passes arguments to alarm_forward() in the wrong
order:
alarm_forward(alarm, timr->it_interval, now);
However, alarm_forward() is defined as:
u64 alarm_forward(struct alarm *alarm, ktime_t now, ktime_t interval);
and uses the second argument as the current time:
delta = ktime_sub(now, alarm->node.expires);
Passing the interval as "now" results in incorrect delta computation,
which can lead to missed expirations or incorrect overrun accounting.
This issue has been present since the introduction of
alarm_timer_forward().
Fix this by swapping the arguments.
Fixes: e7561f1633ac ("alarmtimer: Implement forward callback")
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260323061130.29991-1-zhanxusheng@xiaomi.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alexey Nepomnyashih <sdl@nppct.ru>
Date: Mon Mar 16 19:18:22 2026 +0000
ALSA: firewire-lib: fix uninitialized local variable
commit bb120ad57def62e3f23e3d999c5fbed11f610993 upstream.
Similar to commit d8dc8720468a ("ALSA: firewire-lib: fix uninitialized
local variable"), the local variable `curr_cycle_time` in
process_rx_packets() is declared without initialization.
When the tracepoint event is not probed, the variable may appear to be
used without being initialized. In practice the value is only relevant
when the tracepoint is enabled, however initializing it avoids potential
use of an uninitialized value and improves code safety.
Initialize `curr_cycle_time` to zero.
Fixes: fef4e61b0b76 ("ALSA: firewire-lib: extend tracepoints event including CYCLE_TIME of 1394 OHCI")
Cc: stable@vger.kernel.org
Signed-off-by: Alexey Nepomnyashih <sdl@nppct.ru>
Link: https://patch.msgid.link/20260316191824.83249-1-sdl@nppct.ru
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sheetal <sheetal@nvidia.com>
Date: Mon Mar 2 14:12:17 2026 +0530
ALSA: hda/hdmi: Add Tegra238 HDA codec device ID
[ Upstream commit 5f4338e5633dc034a81000b2516a78cfb51c601d ]
Add Tegra238 HDA codec device in hda_device_id list.
Signed-off-by: Sheetal <sheetal@nvidia.com>
Link: https://patch.msgid.link/20260302084217.3135982-1-sheetal@nvidia.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uzair Mughal <contact@uzair.is-a.dev>
Date: Sat Mar 7 06:29:06 2026 +0500
ALSA: hda/realtek: Add headset jack quirk for Thinkpad X390
[ Upstream commit 542127f6528ca7cc3cf61e1651d6ccb58495f953 ]
The Lenovo ThinkPad X390 (ALC257 codec, subsystem ID 0x17aa2288)
does not report headset button press events. Headphone insertion is
detected (SW_HEADPHONE_INSERT), but pressing the inline microphone
button on a headset produces no input events.
Add a SND_PCI_QUIRK entry that maps this subsystem ID to
ALC285_FIXUP_THINKPAD_NO_BASS_SPK_HEADSET_JACK, which enables
headset jack button detection through alc_fixup_headset_jack()
and ThinkPad ACPI integration. This is the same fixup used by
similar ThinkPad models (P1 Gen 3, X1 Extreme Gen 3).
Signed-off-by: Uzair Mughal <contact@uzair.is-a.dev>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260307012906.20093-1-contact@uzair.is-a.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Liucheng Lu <luliucheng100@outlook.com>
Date: Sat Mar 7 11:27:27 2026 +0800
ALSA: hda/realtek: add HP Laptop 14s-dr5xxx mute LED quirk
[ Upstream commit 178dd118c0f07fd63a9ed74cfbd8c31ae50e33af ]
HP Laptop 14s-dr5xxx with ALC236 codec does not handle the toggling of
the mute LED.
This patch adds a quirk entry for subsystem ID 0x8a1f using
ALC236_FIXUP_HP_MUTE_LED_COEFBIT2 fixup, enabling correct mute LED
behavior.
Signed-off-by: Liucheng Lu <luliucheng100@outlook.com>
Link: https://patch.msgid.link/PAVPR03MB9774F3FCE9CCD181C585281AE37BA@PAVPR03MB9774.eurprd03.prod.outlook.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhang Heng <zhangheng@kylinos.cn>
Date: Mon Mar 16 10:28:43 2026 +0800
ALSA: hda/realtek: add quirk for ASUS Strix G16 G615JMR
commit 0bdf27abaf8940592207be939142451436afe39f upstream.
The machine is equipped with ALC294 and requires the
ALC287_FIXUP_TXNW2781_I2C_ASUS quirk for the amplifier to work properly.
Since the machine's PCI SSID is also 1043:1204, HDA_CODEC_QUIRK is
used to retain the previous quirk.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221173
Cc: <stable@vger.kernel.org>
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260316022843.2809968-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zhang Heng <zhangheng@kylinos.cn>
Date: Fri Mar 6 20:33:17 2026 +0800
ALSA: hda/realtek: add quirk for ASUS UM6702RC
[ Upstream commit 0d3429f12133c2ca47aa82ddab2342bc360c47d3 ]
The sound card of this machine cannot adjust the volume, it can only
be 0 or 100%. The reason is that the DAC with pin 0x17 is connected
to 0x06. Testing found that connecting 0x02 can fix this problem.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220356
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260306123317.575346-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhang Heng <zhangheng@kylinos.cn>
Date: Thu Mar 5 10:35:59 2026 +0800
ALSA: hda/realtek: Add quirk for Gigabyte Technology to fix headphone
[ Upstream commit 56fbbe096a89ff4b52af78a21a4afd9d94bdcc80 ]
The BIOS of this machine has set 0x19 to mic, which needs to be set
to headphone pin in order to work properly.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220814
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/b55f6ebe-7449-49f7-ae85-00d2ba1e7af0@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sean Rhodes <sean@starlabs.systems>
Date: Sun Mar 15 20:11:27 2026 +0000
ALSA: hda/realtek: Sequence GPIO2 on Star Labs StarFighter
[ Upstream commit a6919f2a01f8fbf807b015e5b26aecae7db8117b ]
The initial StarFighter quirk fixed the runtime suspend pop by muting
speakers in the shutup callback before power-down. Further hardware
validation showed that the speaker path is controlled directly by LINE2
EAPD on NID 0x1b together with GPIO2 for the external amplifier.
Replace the shutup-delay workaround with explicit sequencing of those
controls at playback start and stop:
- assert LINE2 EAPD and drive GPIO2 high on PREPARE
- deassert LINE2 EAPD and drive GPIO2 low on CLEANUP
This avoids the runtime suspend pop without a sleep, and also fixes pops
around G3 entry and display-manager start that the original workaround
did not cover.
Fixes: 1cb3c20688fc ("ALSA: hda/realtek: Fix speaker pop on Star Labs StarFighter")
Tested-by: Sean Rhodes <sean@starlabs.systems>
Signed-off-by: Sean Rhodes <sean@starlabs.systems>
Link: https://patch.msgid.link/20260315201127.33744-1-sean@starlabs.systems
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: wangdicheng <wangdicheng@kylinos.cn>
Date: Tue Mar 3 16:15:16 2026 +0800
ALSA: hda/senary: Ensure EAPD is enabled during init
[ Upstream commit 7ae0d8f1abbbba6f98cac735145e1206927c67d9 ]
The driver sets spec->gen.own_eapd_ctl to take manual control of the
EAPD (External Amplifier). However, senary_init does not turn on the
EAPD, while senary_shutdown turns it off.
Since the generic driver skips EAPD handling when own_eapd_ctl is set,
the EAPD remains off after initialization (e.g., after resume), leaving
the codec in a non-functional state.
Explicitly call senary_auto_turn_eapd in senary_init to ensure the EAPD
is enabled and the codec is functional.
Signed-off-by: wangdicheng <wangdicheng@kylinos.cn>
Link: https://patch.msgid.link/20260303081516.583438-1-wangdich9700@163.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geoffrey D. Bennett <g@b4.vu>
Date: Wed Mar 18 06:02:30 2026 +1030
ALSA: usb-audio: Exclude Scarlett 2i2 1st Gen from SKIP_IFACE_SETUP
[ Upstream commit 8780f561f6717dec52351251881bff79e960eb46 ]
The Focusrite Scarlett 2i2 1st Gen (1235:8006) produces
distorted/silent audio when QUIRK_FLAG_SKIP_IFACE_SETUP is active, as
that flag causes the feedback format to be detected as 17.15 instead
of 16.16.
Add a DEVICE_FLG entry for this device before the Focusrite VENDOR_FLG
entry so that it gets no quirk flags, overriding the vendor-wide
SKIP_IFACE_SETUP. This device doesn't have the internal mixer, Air, or
Safe modes that the quirk was designed to protect.
Fixes: 38c322068a26 ("ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP")
Reported-by: pairomaniac [https://github.com/geoffreybennett/linux-fcp/issues/54]
Tested-by: pairomaniac [https://github.com/geoffreybennett/linux-fcp/issues/54]
Signed-off-by: Geoffrey D. Bennett <g@b4.vu>
Link: https://patch.msgid.link/abmsTjKmQMKbhYtK@m.b4.vu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geoffrey D. Bennett <g@b4.vu>
Date: Mon Mar 23 21:59:21 2026 +1030
ALSA: usb-audio: Exclude Scarlett 2i4 1st Gen from SKIP_IFACE_SETUP
[ Upstream commit 990a8b0732cf899d4a0f847b0a67efeb9a384c82 ]
Same issue that the Scarlett 2i2 1st Gen had:
QUIRK_FLAG_SKIP_IFACE_SETUP causes distorted/flanging audio on the
Scarlett 2i4 1st Gen (1235:800a).
Fixes: 38c322068a26 ("ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP")
Reported-by: dcferreira [https://github.com/geoffreybennett/linux-fcp/issues/54]
Signed-off-by: Geoffrey D. Bennett <g@b4.vu>
Link: https://patch.msgid.link/acEkEbftzyNe8W7C@m.b4.vu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Markus Niebel <Markus.Niebel@ew.tq-group.com>
Date: Tue Dec 16 14:39:25 2025 +0100
arm64: dts: imx8mn-tqma8mqnl: fix LDO5 power off
commit 8adc841d43ebceabec996c9dcff6e82d3e585268 upstream.
Fix SD card removal caused by automatic LDO5 power off after boot
To prevent this, add vqmmc regulator for USDHC, using a GPIO-controlled
regulator that is supplied by LDO5. Since this is implemented on SoM but
used on baseboards with SD-card interface, implement the functionality
on SoM part and optionally enable it on baseboards if needed.
Signed-off-by: Markus Niebel <Markus.Niebel@ew.tq-group.com>
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Date: Wed Mar 25 22:07:04 2026 +0100
ASoC: adau1372: Fix clock leak on PLL lock failure
[ Upstream commit bfe6a264effcb6fe99ad7ceaf9e8c7439fc9555b ]
adau1372_enable_pll() was a void function that logged a dev_err() on
PLL lock timeout but did not propagate the error. As a result,
adau1372_set_power() would continue with adau1372->enabled set to true
despite the PLL being unlocked, and the mclk left enabled with no
corresponding disable on the error path.
Convert adau1372_enable_pll() to return int, using -ETIMEDOUT on lock
timeout and propagating regmap errors directly. In adau1372_set_power(),
check the return value and unwind in reverse order: restore regcache to
cache-only mode, reassert GPIO power-down, and disable the clock before
returning the error.
Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Fixes: 6cd4c6459e47 ("ASoC: Add ADAU1372 audio CODEC support")
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260325210704.76847-3-jihed.chaibi.dev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Date: Wed Mar 25 22:07:03 2026 +0100
ASoC: adau1372: Fix unchecked clk_prepare_enable() return value
[ Upstream commit 326fe8104a4020d30080d37ac8b6b43893cdebca ]
adau1372_set_power() calls clk_prepare_enable() but discards the return
value. If the clock enable fails, the driver proceeds to access registers
on unpowered hardware, potentially causing silent corruption.
Make adau1372_set_power() return int and propagate the error from
clk_prepare_enable(). Update adau1372_set_bias_level() to return the
error directly for the STANDBY and OFF cases.
Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Fixes: 6cd4c6459e47 ("ASoC: Add ADAU1372 audio CODEC support")
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260325210704.76847-2-jihed.chaibi.dev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Simon Trimmer <simont@opensource.cirrus.com>
Date: Tue Feb 24 13:03:07 2026 +0000
ASoC: amd: acp: Add ACP6.3 match entries for Cirrus Logic parts
[ Upstream commit fd13fc700e3e239826a46448bf7f01847dd26f5a ]
This adds some match entries for a few system configurations:
cs42l43 link 0 UID 0
cs35l56 link 1 UID 0
cs35l56 link 1 UID 1
cs35l56 link 1 UID 2
cs35l56 link 1 UID 3
cs42l45 link 1 UID 0
cs35l63 link 0 UID 0
cs35l63 link 0 UID 2
cs35l63 link 0 UID 4
cs35l63 link 0 UID 6
cs42l45 link 0 UID 0
cs35l63 link 1 UID 0
cs35l63 link 1 UID 1
cs42l45 link 0 UID 0
cs35l63 link 1 UID 1
cs35l63 link 1 UID 3
cs42l45 link 1 UID 0
cs35l63 link 0 UID 0
cs35l63 link 0 UID 1
cs42l43 link 1 UID 0
cs35l56 link 1 UID 0
cs35l56 link 1 UID 1
cs35l56 link 1 UID 2
cs35l56 link 1 UID 3
cs35l56 link 1 UID 0
cs35l56 link 1 UID 1
cs35l56 link 1 UID 2
cs35l56 link 1 UID 3
cs35l63 link 0 UID 0
cs35l63 link 0 UID 2
cs35l63 link 0 UID 4
cs35l63 link 0 UID 6
cs42l43 link 0 UID 1
cs42l43b link 0 UID 1
cs42l45 link 0 UID 0
cs42l45 link 1 UID 0
Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com>
Link: https://patch.msgid.link/20260224130307.526626-1-simont@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Date: Mon Mar 23 23:17:48 2026 +0000
ASoC: codecs: wcd934x: fix typo in dt parsing
commit cfb385a8dc88d86a805a5682eaa68f59fa5c0ec3 upstream.
Looks like we ended up with a typo during device tree data parsing
as part of 4f16b6351bbff ("ASoC: codecs: wcd: add common helper for wcd
codecs") patch.
This will result in not parsing the device tree data and results in
zero mic bias values.
Fix this by calling wcd_dt_parse_micbias_info instead of
wcd_dt_parse_mbhc_data.
Fixes: 4f16b6351bbff ("ASoC: codecs: wcd: add common helper for wcd codecs")
Cc: Stable@vger.kernel.org
Reported-by: Joel Selvaraj <foss@joelselvaraj.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://patch.msgid.link/20260323231748.2217967-1-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date: Thu Feb 26 11:01:37 2026 +0000
ASoC: cs35l56: Only patch ASP registers if the DAI is part of a DAIlink
[ Upstream commit 9351cf3fd92dc1349bb75f2f7f7324607dcf596f ]
Move the ASP register patches to a separate struct and apply this from the
ASP DAI probe() function so that the registers are only patched if the DAI
is part of a DAI link.
Some systems use the ASP as a special-purpose interconnect and on these
systems the ASP registers are configured by a third party (the firmware,
the BIOS, or another device using the amp's secondary host control
interface).
If the machine driver does not hook up the ASP DAI then the ASP registers
must be omitted from the patch to prevent overwriting the third party
configuration.
If the machine driver includes the ASP DAI in a DAI link, this implies that
the machine driver and higher components (such as alsa-ucm) are taking
ownership of the ASP. In this case the ASP registers are patched to known
defaults and the machine driver should configure the ASP.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20260226110137.1664562-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Date: Sat Mar 21 02:20:11 2026 +0100
ASoC: dt-bindings: stm32: Fix incorrect compatible string in stm32h7-sai match
[ Upstream commit 91049ec2e18376ec2192e73ef7be4c7110436350 ]
The conditional block that defines clock constraints for the stm32h7-sai
variant references "st,stm32mph7-sai", which does not match any compatible
string in the enum. As a result, clock validation for the h7 variant is
silently skipped. Correct the compatible string to "st,stm32h7-sai".
Fixes: 8509bb1f11a1f ("ASoC: dt-bindings: add stm32mp25 support for sai")
Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Reviewed-by: Olivier Moysan <olivier.moysan@foss.st.com>
Link: https://patch.msgid.link/20260321012011.125791-1-jihed.chaibi.dev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Mar 18 18:28:50 2026 +0800
ASoC: fsl: imx-card: initialize playback_only and capture_only
[ Upstream commit ca67bd564e94aaa898a2cbb90922ca3cccd0612b ]
Fix uninitialized variable playback_only and capture_only because
graph_util_parse_link_direction() may not write them.
Fixes: 1877c3e7937f ("ASoC: imx-card: Add playback_only or capture_only support")
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260318102850.2794029-3-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Brown <broonie@kernel.org>
Date: Thu Feb 5 00:25:37 2026 +0000
ASoC: fsl_easrc: Fix event generation in fsl_easrc_iec958_put_bits()
[ Upstream commit 54a86cf48eaa6d1ab5130d756b718775e81e1748 ]
ALSA controls should return 1 if the value in the control changed but the
control put operation fsl_easrc_iec958_put_bits() unconditionally returns
0, causing ALSA to not generate any change events. This is detected by
mixer-test with large numbers of messages in the form:
No event generated for Context 3 IEC958 CS5
Context 3 IEC958 CS5.0 orig 5224 read 5225, is_volatile 0
Add a suitable check.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20260205-asoc-fsl-easrc-fix-events-v1-1-39d4c766918b@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Brown <broonie@kernel.org>
Date: Thu Feb 5 00:25:38 2026 +0000
ASoC: fsl_easrc: Fix event generation in fsl_easrc_iec958_set_reg()
[ Upstream commit 31ddc62c1cd92e51b9db61d7954b85ae2ec224da ]
ALSA controls should return 1 if the value in the control changed but the
control put operation fsl_easrc_set_reg() only returns 0 or a negative
error code, causing ALSA to not generate any change events. Add a suitable
check by using regmap_update_bits_check() with the underlying regmap, this
is more clearly and simply correct than trying to verify that one of the
generic ops is exactly equivalent to this one.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20260205-asoc-fsl-easrc-fix-events-v1-2-39d4c766918b@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cezary Rojewski <cezary.rojewski@intel.com>
Date: Fri Mar 20 11:12:17 2026 +0100
ASoC: Intel: catpt: Fix the device initialization
[ Upstream commit 5a184f1cb43a8e035251c635f5c47da5dc3e3049 ]
The DMA mask shall be coerced before any buffer allocations for the
device are done. At the same time explain why DMA mask of 31 bits is
used in the first place.
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Fixes: 7a10b66a5df9 ("ASoC: Intel: catpt: Device driver lifecycle")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260320101217.1243688-1-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oliver Freyermuth <o.freyermuth@googlemail.com>
Date: Tue Feb 24 20:02:24 2026 +0100
ASoC: Intel: sof_sdw: Add quirk for Alienware Area 51 (2025) 0CCD SKU
[ Upstream commit 70eddf6a0a3fc6d3ab6f77251676da97cc7f12ae ]
This adds the necessary quirk for the Alienware 18 Area 51 (2025).
Complements commit 1b03391d073d ("ASoC: Intel: sof_sdw: Add quirk
for Alienware Area 51 (2025) 0CCC SKU").
Signed-off-by: Oliver Freyermuth <o.freyermuth@googlemail.com>
Tested-by: Oliver Freyermuth <o.freyermuth@googlemail.com>
Link: https://patch.msgid.link/20260224190224.30630-1-o.freyermuth@googlemail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shuming Fan <shumingf@realtek.com>
Date: Wed Feb 25 17:12:10 2026 +0800
ASoC: rt1321: fix DMIC ch2/3 mask issue
[ Upstream commit 986841dcad257615a6e3f89231bb38e1f3506b77 ]
This patch fixed the DMIC ch2/3 mask missing problem.
Signed-off-by: Shuming Fan <shumingf@realtek.com>
Link: https://patch.msgid.link/20260225091210.3648905-1-shumingf@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guangshuo Li <lgs201920130244@gmail.com>
Date: Fri Mar 13 12:06:11 2026 +0800
ASoC: sma1307: fix double free of devm_kzalloc() memory
commit fe757092d2329c397ecb32f2bf68a5b1c4bd9193 upstream.
A previous change added NULL checks and cleanup for allocation
failures in sma1307_setting_loaded().
However, the cleanup for mode_set entries is wrong. Those entries are
allocated with devm_kzalloc(), so they are device-managed resources and
must not be freed with kfree(). Manually freeing them in the error path
can lead to a double free when devres later releases the same memory.
Drop the manual kfree() loop and let devres handle the cleanup.
Fixes: 0ec6bd16705fe ("ASoC: sma1307: Add NULL check in sma1307_setting_loaded()")
Cc: stable@vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Link: https://patch.msgid.link/20260313040611.391479-1-lgs201920130244@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Date: Thu Mar 26 09:56:18 2026 +0200
ASoC: SOF: ipc4-topology: Allow bytes controls without initial payload
commit d40a198e2b7821197c5c77b89d0130cc90f400f5 upstream.
It is unexpected, but allowed to have no initial payload for a bytes
control and the code is prepared to handle this case, but the size check
missed this corner case.
Update the check for minimal size to allow the initial size to be 0.
Cc: stable@vger.kernel.org
Fixes: a653820700b8 ("ASoC: SOF: ipc4-topology: Correct the allocation size for bytes controls")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://patch.msgid.link/20260326075618.1603-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nilay Shroff <nilay@linux.ibm.com>
Date: Sun Mar 1 18:29:43 2026 +0530
block: break pcpu_alloc_mutex dependency on freeze_lock
[ Upstream commit 539d1b47e935e8384977dd7e5cec370c08b7a644 ]
While nr_hw_update allocates tagset tags it acquires ->pcpu_alloc_mutex
after ->freeze_lock is acquired or queue is frozen. This potentially
creates a circular dependency involving ->fs_reclaim if reclaim is
triggered simultaneously in a code path which first acquires ->pcpu_
alloc_mutex. As the queue is already frozen while nr_hw_queue update
allocates tagsets, the reclaim can't forward progress and thus it could
cause a potential deadlock as reported in lockdep splat[1].
Fix this by pre-allocating tagset tags before we freeze queue during
nr_hw_queue update. Later the allocated tagset tags could be safely
installed and used after queue is frozen.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Closes: https://lore.kernel.org/all/CAHj4cs8F=OV9s3La2kEQ34YndgfZP-B5PHS4Z8_b9euKG6J4mw@mail.gmail.com/ [1]
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Yu Kuai <yukuai@fnnas.com>
[axboe: fix brace style issue]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cen Zhang <zzzccc427@gmail.com>
Date: Wed Mar 18 20:54:03 2026 +0800
Bluetooth: btintel: serialize btintel_hw_error() with hci_req_sync_lock
[ Upstream commit 94d8e6fe5d0818e9300e514e095a200bd5ff93ae ]
btintel_hw_error() issues two __hci_cmd_sync() calls (HCI_OP_RESET
and Intel exception-info retrieval) without holding
hci_req_sync_lock(). This lets it race against
hci_dev_do_close() -> btintel_shutdown_combined(), which also runs
__hci_cmd_sync() under the same lock. When both paths manipulate
hdev->req_status/req_rsp concurrently, the close path may free the
response skb first, and the still-running hw_error path hits a
slab-use-after-free in kfree_skb().
Wrap the whole recovery sequence in hci_req_sync_lock/unlock so it
is serialized with every other synchronous HCI command issuer.
Below is the data race report and the kasan report:
BUG: data-race in __hci_cmd_sync_sk / btintel_shutdown_combined
read of hdev->req_rsp at net/bluetooth/hci_sync.c:199
by task kworker/u17:1/83:
__hci_cmd_sync_sk+0x12f2/0x1c30 net/bluetooth/hci_sync.c:200
__hci_cmd_sync+0x55/0x80 net/bluetooth/hci_sync.c:223
btintel_hw_error+0x114/0x670 drivers/bluetooth/btintel.c:254
hci_error_reset+0x348/0xa30 net/bluetooth/hci_core.c:1030
write/free by task ioctl/22580:
btintel_shutdown_combined+0xd0/0x360
drivers/bluetooth/btintel.c:3648
hci_dev_close_sync+0x9ae/0x2c10 net/bluetooth/hci_sync.c:5246
hci_dev_do_close+0x232/0x460 net/bluetooth/hci_core.c:526
BUG: KASAN: slab-use-after-free in
sk_skb_reason_drop+0x43/0x380 net/core/skbuff.c:1202
Read of size 4 at addr ffff888144a738dc
by task kworker/u17:1/83:
__hci_cmd_sync_sk+0x12f2/0x1c30 net/bluetooth/hci_sync.c:200
__hci_cmd_sync+0x55/0x80 net/bluetooth/hci_sync.c:223
btintel_hw_error+0x186/0x670 drivers/bluetooth/btintel.c:260
Fixes: 973bb97e5aee ("Bluetooth: btintel: Add generic function for handling hardware errors")
Signed-off-by: Cen Zhang <zzzccc427@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pengpeng Hou <pengpeng@iscas.ac.cn>
Date: Wed Mar 25 08:42:45 2026 +0800
Bluetooth: btusb: clamp SCO altsetting table indices
[ Upstream commit 129fa608b6ad08b8ab7178eeb2ec272c993aaccc ]
btusb_work() maps the number of active SCO links to USB alternate
settings through a three-entry lookup table when CVSD traffic uses
transparent voice settings. The lookup currently indexes alts[] with
data->sco_num - 1 without first constraining sco_num to the number of
available table entries.
While the table only defines alternate settings for up to three SCO
links, data->sco_num comes from hci_conn_num() and is used directly.
Cap the lookup to the last table entry before indexing it so the
driver keeps selecting the highest supported alternate setting without
reading past alts[].
Fixes: baac6276c0a9 ("Bluetooth: btusb: handle mSBC audio over USB Endpoints")
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Anas Iqbal <mohd.abd.6602@gmail.com>
Date: Sun Mar 15 10:51:37 2026 +0000
Bluetooth: hci_ll: Fix firmware leak on error path
[ Upstream commit 31148a7be723aa9f2e8fbd62424825ab8d577973 ]
Smatch reports:
drivers/bluetooth/hci_ll.c:587 download_firmware() warn:
'fw' from request_firmware() not released on lines: 544.
In download_firmware(), if request_firmware() succeeds but the returned
firmware content is invalid (no data or zero size), the function returns
without releasing the firmware, resulting in a resource leak.
Fix this by calling release_firmware() before returning when
request_firmware() succeeded but the firmware content is invalid.
Fixes: 371805522f87 ("bluetooth: hci_uart: add LL protocol serdev driver support")
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Anas Iqbal <mohd.abd.6602@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hyunwoo Kim <imv4bel@gmail.com>
Date: Fri Mar 20 20:01:26 2026 +0900
Bluetooth: L2CAP: Fix deadlock in l2cap_conn_del()
[ Upstream commit 00fdebbbc557a2fc21321ff2eaa22fd70c078608 ]
l2cap_conn_del() calls cancel_delayed_work_sync() for both info_timer
and id_addr_timer while holding conn->lock. However, the work functions
l2cap_info_timeout() and l2cap_conn_update_id_addr() both acquire
conn->lock, creating a potential AB-BA deadlock if the work is already
executing when l2cap_conn_del() takes the lock.
Move the work cancellations before acquiring conn->lock and use
disable_delayed_work_sync() to additionally prevent the works from
being rearmed after cancellation, consistent with the pattern used in
hci_conn_del().
Fixes: ab4eedb790ca ("Bluetooth: L2CAP: Fix corrupted list in hci_chan_del")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hyunwoo Kim <imv4bel@gmail.com>
Date: Fri Mar 20 20:23:10 2026 +0900
Bluetooth: L2CAP: Fix ERTM re-init and zero pdu_len infinite loop
[ Upstream commit 25f420a0d4cfd61d3d23ec4b9c56d9f443d91377 ]
l2cap_config_req() processes CONFIG_REQ for channels in BT_CONNECTED
state to support L2CAP reconfiguration (e.g. MTU changes). However,
since both CONF_INPUT_DONE and CONF_OUTPUT_DONE are already set from
the initial configuration, the reconfiguration path falls through to
l2cap_ertm_init(), which re-initializes tx_q, srej_q, srej_list, and
retrans_list without freeing the previous allocations and sets
chan->sdu to NULL without freeing the existing skb. This leaks all
previously allocated ERTM resources.
Additionally, l2cap_parse_conf_req() does not validate the minimum
value of remote_mps derived from the RFC max_pdu_size option. A zero
value propagates to l2cap_segment_sdu() where pdu_len becomes zero,
causing the while loop to never terminate since len is never
decremented, exhausting all available memory.
Fix the double-init by skipping l2cap_ertm_init() and
l2cap_chan_ready() when the channel is already in BT_CONNECTED state,
while still allowing the reconfiguration parameters to be updated
through l2cap_parse_conf_req(). Also add a pdu_len zero check in
l2cap_segment_sdu() as a safeguard.
Fixes: 96298f640104 ("Bluetooth: L2CAP: handle l2cap config request during open state")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date: Wed Jan 21 16:39:44 2026 -0500
Bluetooth: L2CAP: Fix not tracking outstanding TX ident
[ Upstream commit 6c3ea155e5ee3e56606233acde8309afda66d483 ]
This attempts to proper track outstanding request by using struct ida
and allocating from it in l2cap_get_ident using ida_alloc_range which
would reuse ids as they are free, then upon completion release
the id using ida_free.
This fixes the qualification test case L2CAP/COS/CED/BI-29-C which
attempts to check if the host stack is able to work after 256 attempts
to connect which requires Ident field to use the full range of possible
values in order to pass the test.
Link: https://github.com/bluez/bluez/issues/1829
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Stable-dep-of: 00fdebbbc557 ("Bluetooth: L2CAP: Fix deadlock in l2cap_conn_del()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Helen Koike <koike@igalia.com>
Date: Thu Mar 19 08:58:01 2026 -0300
Bluetooth: L2CAP: Fix null-ptr-deref on l2cap_sock_ready_cb
[ Upstream commit b6552e0503973daf6f23bd6ed9273ef131ee364f ]
Before using sk pointer, check if it is null.
Fix the following:
KASAN: null-ptr-deref in range [0x0000000000000260-0x0000000000000267]
CPU: 0 UID: 0 PID: 5985 Comm: kworker/0:5 Not tainted 7.0.0-rc4-00029-ga989fde763f4 #1 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-9.fc43 06/10/2025
Workqueue: events l2cap_info_timeout
RIP: 0010:kasan_byte_accessible+0x12/0x30
Code: 79 ff ff ff 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 40 d6 48 c1 ef 03 48 b8 00 00 00 00 00 fc ff df <0f> b6 04 07 3c 08 0f 92 c0 c3 cc cce
veth0_macvtap: entered promiscuous mode
RSP: 0018:ffffc90006e0f808 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffffffff89746018 RCX: 0000000080000001
RDX: 0000000000000000 RSI: ffffffff89746018 RDI: 000000000000004c
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: dffffc0000000000 R11: ffffffff8aae3e70 R12: 0000000000000000
R13: 0000000000000260 R14: 0000000000000260 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8880983c2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005582615a5008 CR3: 000000007007e000 CR4: 0000000000752ef0
PKRU: 55555554
Call Trace:
<TASK>
__kasan_check_byte+0x12/0x40
lock_acquire+0x79/0x2e0
lock_sock_nested+0x48/0x100
? l2cap_sock_ready_cb+0x46/0x160
l2cap_sock_ready_cb+0x46/0x160
l2cap_conn_start+0x779/0xff0
? __pfx_l2cap_conn_start+0x10/0x10
? l2cap_info_timeout+0x60/0xa0
? __pfx___mutex_lock+0x10/0x10
l2cap_info_timeout+0x68/0xa0
? process_scheduled_works+0xa8d/0x18c0
process_scheduled_works+0xb6e/0x18c0
? __pfx_process_scheduled_works+0x10/0x10
? assign_work+0x3d5/0x5e0
worker_thread+0xa53/0xfc0
kthread+0x388/0x470
? __pfx_worker_thread+0x10/0x10
? __pfx_kthread+0x10/0x10
ret_from_fork+0x51e/0xb90
? __pfx_ret_from_fork+0x10/0x10
veth1_macvtap: entered promiscuous mode
? __switch_to+0xc7d/0x1450
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
batman_adv: batadv0: Interface activated: batadv_slave_0
batman_adv: batadv0: Interface activated: batadv_slave_1
netdevsim netdevsim7 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim7 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim7 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim7 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
RIP: 0010:kasan_byte_accessible+0x12/0x30
Code: 79 ff ff ff 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 40 d6 48 c1 ef 03 48 b8 00 00 00 00 00 fc ff df <0f> b6 04 07 3c 08 0f 92 c0 c3 cc cce
ieee80211 phy39: Selected rate control algorithm 'minstrel_ht'
RSP: 0018:ffffc90006e0f808 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffffffff89746018 RCX: 0000000080000001
RDX: 0000000000000000 RSI: ffffffff89746018 RDI: 000000000000004c
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: dffffc0000000000 R11: ffffffff8aae3e70 R12: 0000000000000000
R13: 0000000000000260 R14: 0000000000000260 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8880983c2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7e16139e9c CR3: 000000000e74e000 CR4: 0000000000752ef0
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Fixes: 54a59aa2b562 ("Bluetooth: Add l2cap_chan->ops->ready()")
Signed-off-by: Helen Koike <koike@igalia.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date: Tue Mar 17 11:54:01 2026 -0400
Bluetooth: L2CAP: Fix regressions caused by reusing ident
commit 761fb8ec8778f0caf2bba5a41e3cff1ea86974f3 upstream.
This attempt to fix regressions caused by reusing ident which apparently
is not handled well on certain stacks causing the stack to not respond to
requests, so instead of simple returning the first unallocated id this
stores the last used tx_ident and then attempt to use the next until all
available ids are exausted and then cycle starting over to 1.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221120
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221177
Fixes: 6c3ea155e5ee ("Bluetooth: L2CAP: Fix not tracking outstanding TX ident")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zhang Chen <zhangchen01@kylinos.cn>
Date: Thu Mar 19 17:32:11 2026 +0800
Bluetooth: L2CAP: Fix send LE flow credits in ACL link
[ Upstream commit f39f905e55f529b036321220af1ba4f4085564a5 ]
When the L2CAP channel mode is L2CAP_MODE_ERTM/L2CAP_MODE_STREAMING,
l2cap_publish_rx_avail will be called and le flow credits will be sent in
l2cap_chan_rx_avail, even though the link type is ACL.
The logs in question as follows:
> ACL Data RX: Handle 129 flags 0x02 dlen 12
L2CAP: Unknown (0x16) ident 4 len 4
40 00 ed 05
< ACL Data TX: Handle 129 flags 0x00 dlen 10
L2CAP: Command Reject (0x01) ident 4 len 2
Reason: Command not understood (0x0000)
Bluetooth: Unknown BR/EDR signaling command 0x16
Bluetooth: Wrong link type (-22)
Fixes: ce60b9231b66 ("Bluetooth: compute LE flow credits based on recvbuf space")
Signed-off-by: Zhang Chen <zhangchen01@kylinos.cn>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Minseo Park <jacob.park.9436@gmail.com>
Date: Sun Mar 15 22:14:37 2026 +0900
Bluetooth: L2CAP: Fix stack-out-of-bounds read in l2cap_ecred_conn_req
[ Upstream commit 9d87cb22195b2c67405f5485d525190747ad5493 ]
Syzbot reported a KASAN stack-out-of-bounds read in l2cap_build_cmd()
that is triggered by a malformed Enhanced Credit Based Connection Request.
The vulnerability stems from l2cap_ecred_conn_req(). The function allocates
a local stack buffer (`pdu`) designed to hold a maximum of 5 Source Channel
IDs (SCIDs), totaling 18 bytes. When an attacker sends a request with more
than 5 SCIDs, the function calculates `rsp_len` based on this unvalidated
`cmd_len` before checking if the number of SCIDs exceeds
L2CAP_ECRED_MAX_CID.
If the SCID count is too high, the function correctly jumps to the
`response` label to reject the packet, but `rsp_len` retains the
attacker's oversized value. Consequently, l2cap_send_cmd() is instructed
to read past the end of the 18-byte `pdu` buffer, triggering a
KASAN panic.
Fix this by moving the assignment of `rsp_len` to after the `num_scid`
boundary check. If the packet is rejected, `rsp_len` will safely
remain 0, and the error response will only read the 8-byte base header
from the stack.
Fixes: c28d2bff7044 ("Bluetooth: L2CAP: Fix result of L2CAP_ECRED_CONN_RSP when MTU is too short")
Reported-by: syzbot+b7f3e7d9a596bf6a63e3@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b7f3e7d9a596bf6a63e3
Tested-by: syzbot+b7f3e7d9a596bf6a63e3@syzkaller.appspotmail.com
Signed-off-by: Minseo Park <jacob.park.9436@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hyunwoo Kim <imv4bel@gmail.com>
Date: Fri Mar 13 05:22:39 2026 +0900
Bluetooth: L2CAP: Validate PDU length before reading SDU length in l2cap_ecred_data_rcv()
[ Upstream commit c65bd945d1c08c3db756821b6bf9f1c4a77b29c6 ]
l2cap_ecred_data_rcv() reads the SDU length field from skb->data using
get_unaligned_le16() without first verifying that skb contains at least
L2CAP_SDULEN_SIZE (2) bytes. When skb->len is less than 2, this reads
past the valid data in the skb.
The ERTM reassembly path correctly calls pskb_may_pull() before reading
the SDU length (l2cap_reassemble_sdu, L2CAP_SAR_START case). Apply the
same validation to the Enhanced Credit Based Flow Control data path.
Fixes: aac23bf63659 ("Bluetooth: Implement LE L2CAP reassembly")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date: Mon Mar 16 15:03:27 2026 -0400
Bluetooth: MGMT: Fix dangling pointer on mgmt_add_adv_patterns_monitor_complete
[ Upstream commit 5f5fa4cd35f707344f65ce9e225b6528691dbbaa ]
This fixes the condition checking so mgmt_pending_valid is executed
whenever status != -ECANCELED otherwise calling mgmt_pending_free(cmd)
would kfree(cmd) without unlinking it from the list first, leaving a
dangling pointer. Any subsequent list traversal (e.g.,
mgmt_pending_foreach during __mgmt_power_off, or another
mgmt_pending_valid call) would dereference freed memory.
Link: https://lore.kernel.org/linux-bluetooth/20260315132013.75ab40c5@kernel.org/T/#m1418f9c82eeff8510c1beaa21cf53af20db96c06
Fixes: 302a1f674c00 ("Bluetooth: MGMT: Fix possible UAFs")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hyunwoo Kim <imv4bel@gmail.com>
Date: Fri Mar 13 05:26:16 2026 +0900
Bluetooth: SCO: Fix use-after-free in sco_recv_frame() due to missing sock_hold
[ Upstream commit 598dbba9919c5e36c54fe1709b557d64120cb94b ]
sco_recv_frame() reads conn->sk under sco_conn_lock() but immediately
releases the lock without holding a reference to the socket. A concurrent
close() can free the socket between the lock release and the subsequent
sk->sk_state access, resulting in a use-after-free.
Other functions in the same file (sco_sock_timeout(), sco_conn_del())
correctly use sco_sock_hold() to safely hold a reference under the lock.
Fix by using sco_sock_hold() to take a reference before releasing the
lock, and adding sock_put() on all exit paths.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sachin Kumar <xcyfun@protonmail.com>
Date: Mon Mar 9 18:25:42 2026 +0000
bpf: Fix constant blinding for PROBE_MEM32 stores
[ Upstream commit 2321a9596d2260310267622e0ad8fbfa6f95378f ]
BPF_ST | BPF_PROBE_MEM32 immediate stores are not handled by
bpf_jit_blind_insn(), allowing user-controlled 32-bit immediates to
survive unblinded into JIT-compiled native code when bpf_jit_harden >= 1.
The root cause is that convert_ctx_accesses() rewrites BPF_ST|BPF_MEM
to BPF_ST|BPF_PROBE_MEM32 for arena pointer stores during verification,
before bpf_jit_blind_constants() runs during JIT compilation. The
blinding switch only matches BPF_ST|BPF_MEM (mode 0x60), not
BPF_ST|BPF_PROBE_MEM32 (mode 0xa0). The instruction falls through
unblinded.
Add BPF_ST|BPF_PROBE_MEM32 cases to bpf_jit_blind_insn() alongside the
existing BPF_ST|BPF_MEM cases. The blinding transformation is identical:
load the blinded immediate into BPF_REG_AX via mov+xor, then convert
the immediate store to a register store (BPF_STX).
The rewritten STX instruction must preserve the BPF_PROBE_MEM32 mode so
the architecture JIT emits the correct arena addressing (R12-based on
x86-64). Cannot use the BPF_STX_MEM() macro here because it hardcodes
BPF_MEM mode; construct the instruction directly instead.
Fixes: 6082b6c328b5 ("bpf: Recognize addr_space_cast instruction in the verifier.")
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Sachin Kumar <xcyfun@protonmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/Y6IT5VvNRchPBLI5D7JZHBzZrU9rb0ycRJPJzJSXGj7kJlX8RJwZFSM2YZjcDxoQKABkxt1T8Os2gi23PYyFuQe6KkZGWVyfz8K5afdy9ak=@protonmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ihor Solodrai <ihor.solodrai@linux.dev>
Date: Thu Mar 19 17:08:08 2026 -0700
bpf: Fix exception exit lock checking for subprogs
[ Upstream commit 6c2128505f61b504c79a20b89596feba61388112 ]
process_bpf_exit_full() passes check_lock = !curframe to
check_resource_leak(), which is false in cases when bpf_throw() is
called from a static subprog. This makes check_resource_leak() to skip
validation of active_rcu_locks, active_preempt_locks, and
active_irq_id on exception exits from subprogs.
At runtime bpf_throw() unwinds the stack via ORC without releasing any
user-acquired locks, which may cause various issues as the result.
Fix by setting check_lock = true for exception exits regardless of
curframe, since exceptions bypass all intermediate frame
cleanup. Update the error message prefix to "bpf_throw" for exception
exits to distinguish them from normal BPF_EXIT.
Fix reject_subprog_with_rcu_read_lock test which was previously
passing for the wrong reason. Test program returned directly from the
subprog call without closing the RCU section, so the error was
triggered by the unclosed RCU lock on normal exit, not by
bpf_throw. Update __msg annotations for affected tests to match the
new "bpf_throw" error prefix.
The spin_lock case is not affected because they are already checked [1]
at the call site in do_check_insn() before bpf_throw can run.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v7.0-rc4#n21098
Assisted-by: Claude:claude-opus-4-6
Fixes: f18b03fabaa9 ("bpf: Implement BPF exceptions")
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260320000809.643798-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eduard Zingerman <eddyz87@gmail.com>
Date: Fri Mar 6 16:54:24 2026 -0800
bpf: Fix u32/s32 bounds when ranges cross min/max boundary
[ Upstream commit fbc7aef517d8765e4c425d2792409bb9bf2e1f13 ]
Same as in __reg64_deduce_bounds(), refine s32/u32 ranges
in __reg32_deduce_bounds() in the following situations:
- s32 range crosses U32_MAX/0 boundary, positive part of the s32 range
overlaps with u32 range:
0 U32_MAX
| [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx] |
|----------------------------|----------------------------|
|xxxxx s32 range xxxxxxxxx] [xxxxxxx|
0 S32_MAX S32_MIN -1
- s32 range crosses U32_MAX/0 boundary, negative part of the s32 range
overlaps with u32 range:
0 U32_MAX
| [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx] |
|----------------------------|----------------------------|
|xxxxxxxxx] [xxxxxxxxxxxx s32 range |
0 S32_MAX S32_MIN -1
- No refinement if ranges overlap in two intervals.
This helps for e.g. consider the following program:
call %[bpf_get_prandom_u32];
w0 &= 0xffffffff;
if w0 < 0x3 goto 1f; // on fall-through u32 range [3..U32_MAX]
if w0 s> 0x1 goto 1f; // on fall-through s32 range [S32_MIN..1]
if w0 s< 0x0 goto 1f; // range can be narrowed to [S32_MIN..-1]
r10 = 0;
1: ...;
The reg_bounds.c selftest is updated to incorporate identical logic,
refinement based on non-overflowing range halves:
((x ∩ [0, smax]) ∩ (y ∩ [0, smax])) ∪
((x ∩ [smin,-1]) ∩ (y ∩ [smin,-1]))
Reported-by: Andrea Righi <arighi@nvidia.com>
Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Closes: https://lore.kernel.org/bpf/aakqucg4vcujVwif@gpd4/T/
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-1-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jenny Guanni Qu <qguanni@gmail.com>
Date: Wed Mar 11 01:11:15 2026 +0000
bpf: Fix undefined behavior in interpreter sdiv/smod for INT_MIN
[ Upstream commit c77b30bd1dcb61f66c640ff7d2757816210c7cb0 ]
The BPF interpreter's signed 32-bit division and modulo handlers use
the kernel abs() macro on s32 operands. The abs() macro documentation
(include/linux/math.h) explicitly states the result is undefined when
the input is the type minimum. When DST contains S32_MIN (0x80000000),
abs((s32)DST) triggers undefined behavior and returns S32_MIN unchanged
on arm64/x86. This value is then sign-extended to u64 as
0xFFFFFFFF80000000, causing do_div() to compute the wrong result.
The verifier's abstract interpretation (scalar32_min_max_sdiv) computes
the mathematically correct result for range tracking, creating a
verifier/interpreter mismatch that can be exploited for out-of-bounds
map value access.
Introduce abs_s32() which handles S32_MIN correctly by casting to u32
before negating, avoiding signed overflow entirely. Replace all 8
abs((s32)...) call sites in the interpreter's sdiv32/smod32 handlers.
s32 is the only affected case -- the s64 division/modulo handlers do
not use abs().
Fixes: ec0e2da95f72 ("bpf: Support new signed div/mod instructions.")
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Jenny Guanni Qu <qguanni@gmail.com>
Link: https://lore.kernel.org/r/20260311011116.2108005-2-qguanni@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Wade <danjwade95@gmail.com>
Date: Sat Mar 14 13:15:20 2026 +1100
bpf: Fix unsound scalar forking in maybe_fork_scalars() for BPF_OR
[ Upstream commit c845894ebd6fb43226b3118d6b017942550910c5 ]
maybe_fork_scalars() is called for both BPF_AND and BPF_OR when the
source operand is a constant. When dst has signed range [-1, 0], it
forks the verifier state: the pushed path gets dst = 0, the current
path gets dst = -1.
For BPF_AND this is correct: 0 & K == 0.
For BPF_OR this is wrong: 0 | K == K, not 0.
The pushed path therefore tracks dst as 0 when the runtime value is K,
producing an exploitable verifier/runtime divergence that allows
out-of-bounds map access.
Fix this by passing env->insn_idx (instead of env->insn_idx + 1) to
push_stack(), so the pushed path re-executes the ALU instruction with
dst = 0 and naturally computes the correct result for any opcode.
Fixes: bffacdb80b93 ("bpf: Recognize special arithmetic shift in the verifier")
Signed-off-by: Daniel Wade <danjwade95@gmail.com>
Reviewed-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260314021521.128361-2-danjwade95@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date: Thu Mar 12 13:53:07 2026 -0700
bpf: Release module BTF IDR before module unload
[ Upstream commit 146bd2a87a65aa407bb17fac70d8d583d19aba06 ]
Gregory reported in [0] that the global_map_resize test when run in
repeatedly ends up failing during program load. This stems from the fact
that BTF reference has not dropped to zero after the previous run's
module is unloaded, and the older module's BTF is still discoverable and
visible. Later, in libbpf, load_module_btfs() will find the ID for this
stale BTF, open its fd, and then it will be used during program load
where later steps taking module reference using btf_try_get_module()
fail since the underlying module for the BTF is gone.
Logically, once a module is unloaded, it's associated BTF artifacts
should become hidden. The BTF object inside the kernel may still remain
alive as long its reference counts are alive, but it should no longer be
discoverable.
To fix this, let us call btf_free_id() from the MODULE_STATE_GOING case
for the module unload to free the BTF associated IDR entry, and disable
its discovery once module unload returns to user space. If a race
happens during unload, the outcome is non-deterministic anyway. However,
user space should be able to rely on the guarantee that once it has
synchronously established a successful module unload, no more stale
artifacts associated with this module can be obtained subsequently.
Note that we must be careful to not invoke btf_free_id() in btf_put()
when btf_is_module() is true now. There could be a window where the
module unload drops a non-terminal reference, frees the IDR, but the
same ID gets reused and the second unconditional btf_free_id() ends up
releasing an unrelated entry.
To avoid a special case for btf_is_module() case, set btf->id to zero to
make btf_free_id() idempotent, such that we can unconditionally invoke it
from btf_put(), and also from the MODULE_STATE_GOING case. Since zero is
an invalid IDR, the idr_remove() should be a noop.
Note that we can be sure that by the time we reach final btf_put() for
btf_is_module() case, the btf_free_id() is already done, since the
module itself holds the BTF reference, and it will call this function
for the BTF before dropping its own reference.
[0]: https://lore.kernel.org/bpf/cover.1773170190.git.grbell@redhat.com
Fixes: 36e68442d1af ("bpf: Load and verify kernel module BTFs")
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Reported-by: Gregory Bell <grbell@redhat.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260312205307.1346991-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yazhou Tang <tangyazhou518@outlook.com>
Date: Wed Mar 4 16:32:27 2026 +0800
bpf: Reset register ID for BPF_END value tracking
[ Upstream commit a3125bc01884431d30d731461634c8295b6f0529 ]
When a register undergoes a BPF_END (byte swap) operation, its scalar
value is mutated in-place. If this register previously shared a scalar ID
with another register (e.g., after an `r1 = r0` assignment), this tie must
be broken.
Currently, the verifier misses resetting `dst_reg->id` to 0 for BPF_END.
Consequently, if a conditional jump checks the swapped register, the
verifier incorrectly propagates the learned bounds to the linked register,
leading to false confidence in the linked register's value and potentially
allowing out-of-bounds memory accesses.
Fix this by explicitly resetting `dst_reg->id` to 0 in the BPF_END case
to break the scalar tie, similar to how BPF_NEG handles it via
`__mark_reg_known`.
Fixes: 9d2119984224 ("bpf: Add bitwise tracking for BPF_END")
Closes: https://lore.kernel.org/bpf/AMBPR06MB108683CFEB1CB8D9E02FC95ECF17EA@AMBPR06MB10868.eurprd06.prod.outlook.com/
Link: https://lore.kernel.org/bpf/4be25f7442a52244d0dd1abb47bc6750e57984c9.camel@gmail.com/
Reported-by: Guillaume Laporte <glapt.pro@outlook.com>
Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260304083228.142016-2-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Date: Sun Mar 1 21:17:04 2026 +0900
btrfs: fix leak of kobject name for sub-group space_info
[ Upstream commit a4376d9a5d4c9610e69def3fc0b32c86a7ab7a41 ]
When create_space_info_sub_group() allocates elements of
space_info->sub_group[], kobject_init_and_add() is called for each
element via btrfs_sysfs_add_space_info_type(). However, when
check_removing_space_info() frees these elements, it does not call
btrfs_sysfs_remove_space_info() on them. As a result, kobject_put() is
not called and the associated kobj->name objects are leaked.
This memory leak is reproduced by running the blktests test case
zbd/009 on kernels built with CONFIG_DEBUG_KMEMLEAK. The kmemleak
feature reports the following error:
unreferenced object 0xffff888112877d40 (size 16):
comm "mount", pid 1244, jiffies 4294996972
hex dump (first 16 bytes):
64 61 74 61 2d 72 65 6c 6f 63 00 c4 c6 a7 cb 7f data-reloc......
backtrace (crc 53ffde4d):
__kmalloc_node_track_caller_noprof+0x619/0x870
kstrdup+0x42/0xc0
kobject_set_name_vargs+0x44/0x110
kobject_init_and_add+0xcf/0x150
btrfs_sysfs_add_space_info_type+0xfc/0x210 [btrfs]
create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
create_space_info+0x211/0x320 [btrfs]
btrfs_init_space_info+0x15a/0x1b0 [btrfs]
open_ctree+0x33c7/0x4a50 [btrfs]
btrfs_get_tree.cold+0x9f/0x1ee [btrfs]
vfs_get_tree+0x87/0x2f0
vfs_cmd_create+0xbd/0x280
__do_sys_fsconfig+0x3df/0x990
do_syscall_64+0x136/0x1540
entry_SYSCALL_64_after_hwframe+0x76/0x7e
To avoid the leak, call btrfs_sysfs_remove_space_info() instead of
kfree() for the elements.
Fixes: f92ee31e031c ("btrfs: introduce btrfs_space_info sub-group")
Link: https://lore.kernel.org/linux-block/b9488881-f18d-4f47-91a5-3c9bf63955a5@wdc.com/
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Filipe Manana <fdmanana@suse.com>
Date: Wed Mar 18 16:17:59 2026 +0000
btrfs: fix lost error when running device stats on multiple devices fs
[ Upstream commit 1c37d896b12dfd0d4c96e310b0033c6676933917 ]
Whenever we get an error updating the device stats item for a device in
btrfs_run_dev_stats() we allow the loop to go to the next device, and if
updating the stats item for the next device succeeds, we end up losing
the error we had from the previous device.
Fix this by breaking out of the loop once we get an error and make sure
it's returned to the caller. Since we are in the transaction commit path
(and in the critical section actually), returning the error will result
in a transaction abort.
Fixes: 733f4fbbc108 ("Btrfs: read device stats on mount, write modified ones during commit")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Harmstone <mark@harmstone.com>
Date: Tue Feb 17 17:35:42 2026 +0000
btrfs: fix super block offset in error message in btrfs_validate_super()
[ Upstream commit b52fe51f724385b3ed81e37e510a4a33107e8161 ]
Fix the superblock offset mismatch error message in
btrfs_validate_super(): we changed it so that it considers all the
superblocks, but the message still assumes we're only looking at the
first one.
The change from %u to %llu is because we're changing from a constant to
a u64.
Fixes: 069ec957c35e ("btrfs: Refactor btrfs_check_super_valid")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Mark Harmstone <mark@harmstone.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Boris Burkov <boris@bur.io>
Date: Tue Feb 24 14:25:35 2026 -0800
btrfs: set BTRFS_ROOT_ORPHAN_CLEANUP during subvol create
[ Upstream commit 5131fa077f9bb386a1b901bf5b247041f0ec8f80 ]
We have recently observed a number of subvolumes with broken dentries.
ls-ing the parent dir looks like:
drwxrwxrwt 1 root root 16 Jan 23 16:49 .
drwxr-xr-x 1 root root 24 Jan 23 16:48 ..
d????????? ? ? ? ? ? broken_subvol
and similarly stat-ing the file fails.
In this state, deleting the subvol fails with ENOENT, but attempting to
create a new file or subvol over it errors out with EEXIST and even
aborts the fs. Which leaves us a bit stuck.
dmesg contains a single notable error message reading:
"could not do orphan cleanup -2"
2 is ENOENT and the error comes from the failure handling path of
btrfs_orphan_cleanup(), with the stack leading back up to
btrfs_lookup().
btrfs_lookup
btrfs_lookup_dentry
btrfs_orphan_cleanup // prints that message and returns -ENOENT
After some detailed inspection of the internal state, it became clear
that:
- there are no orphan items for the subvol
- the subvol is otherwise healthy looking, it is not half-deleted or
anything, there is no drop progress, etc.
- the subvol was created a while ago and does the meaningful first
btrfs_orphan_cleanup() call that sets BTRFS_ROOT_ORPHAN_CLEANUP much
later.
- after btrfs_orphan_cleanup() fails, btrfs_lookup_dentry() returns -ENOENT,
which results in a negative dentry for the subvolume via
d_splice_alias(NULL, dentry), leading to the observed behavior. The
bug can be mitigated by dropping the dentry cache, at which point we
can successfully delete the subvolume if we want.
i.e.,
btrfs_lookup()
btrfs_lookup_dentry()
if (!sb_rdonly(inode->vfs_inode)->vfs_inode)
btrfs_orphan_cleanup(sub_root)
test_and_set_bit(BTRFS_ROOT_ORPHAN_CLEANUP)
btrfs_search_slot() // finds orphan item for inode N
...
prints "could not do orphan cleanup -2"
if (inode == ERR_PTR(-ENOENT))
inode = NULL;
return d_splice_alias(NULL, dentry) // NEGATIVE DENTRY for valid subvolume
btrfs_orphan_cleanup() does test_and_set_bit(BTRFS_ROOT_ORPHAN_CLEANUP)
on the root when it runs, so it cannot run more than once on a given
root, so something else must run concurrently. However, the obvious
routes to deleting an orphan when nlinks goes to 0 should not be able to
run without first doing a lookup into the subvolume, which should run
btrfs_orphan_cleanup() and set the bit.
The final important observation is that create_subvol() calls
d_instantiate_new() but does not set BTRFS_ROOT_ORPHAN_CLEANUP, so if
the dentry cache gets dropped, the next lookup into the subvolume will
make a real call into btrfs_orphan_cleanup() for the first time. This
opens up the possibility of concurrently deleting the inode/orphan items
but most typical evict() paths will be holding a reference on the parent
dentry (child dentry holds parent->d_lockref.count via dget in
d_alloc(), released in __dentry_kill()) and prevent the parent from
being removed from the dentry cache.
The one exception is delayed iputs. Ordered extent creation calls
igrab() on the inode. If the file is unlinked and closed while those
refs are held, iput() in __dentry_kill() decrements i_count but does
not trigger eviction (i_count > 0). The child dentry is freed and the
subvol dentry's d_lockref.count drops to 0, making it evictable while
the inode is still alive.
Since there are two races (the race between writeback and unlink and
the race between lookup and delayed iputs), and there are too many moving
parts, the following three diagrams show the complete picture.
(Only the second and third are races)
Phase 1:
Create Subvol in dentry cache without BTRFS_ROOT_ORPHAN_CLEANUP set
btrfs_mksubvol()
lookup_one_len()
__lookup_slow()
d_alloc_parallel()
__d_alloc() // d_lockref.count = 1
create_subvol(dentry)
// doesn't touch the bit..
d_instantiate_new(dentry, inode) // dentry in cache with d_lockref.count == 1
Phase 2:
Create a delayed iput for a file in the subvol but leave the subvol in
state where its dentry can be evicted (d_lockref.count == 0)
T1 (task) T2 (writeback) T3 (OE workqueue)
write() // dirty pages
btrfs_writepages()
btrfs_run_delalloc_range()
cow_file_range()
btrfs_alloc_ordered_extent()
igrab() // i_count: 1 -> 2
btrfs_unlink_inode()
btrfs_orphan_add()
close()
__fput()
dput()
finish_dput()
__dentry_kill()
dentry_unlink_inode()
iput() // 2 -> 1
--parent->d_lockref.count // 1 -> 0; evictable
finish_ordered_fn()
btrfs_finish_ordered_io()
btrfs_put_ordered_extent()
btrfs_add_delayed_iput()
Phase 3:
Once the delayed iput is pending and the subvol dentry is evictable,
the shrinker can free it, causing the next lookup to go through
btrfs_lookup() and call btrfs_orphan_cleanup() for the first time.
If the cleaner kthread processes the delayed iput concurrently, the
two race:
T1 (shrinker) T2 (cleaner kthread) T3 (lookup)
super_cache_scan()
prune_dcache_sb()
__dentry_kill()
// subvol dentry freed
btrfs_run_delayed_iputs()
iput() // i_count -> 0
evict() // sets I_FREEING
btrfs_evict_inode()
// truncation loop
btrfs_lookup()
btrfs_lookup_dentry()
btrfs_orphan_cleanup()
// first call (bit never set)
btrfs_iget()
// blocks on I_FREEING
btrfs_orphan_del()
// inode freed
// returns -ENOENT
btrfs_del_orphan_item()
// -ENOENT
// "could not do orphan cleanup -2"
d_splice_alias(NULL, dentry)
// negative dentry for valid subvol
The most straightforward fix is to ensure the invariant that a dentry
for a subvolume can exist if and only if that subvolume has
BTRFS_ROOT_ORPHAN_CLEANUP set on its root (and is known to have no
orphans or ran btrfs_orphan_cleanup()).
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ali Norouzi <ali.norouzi@keysight.com>
Date: Thu Mar 19 16:47:44 2026 +0100
can: gw: fix OOB heap access in cgw_csum_crc8_rel()
commit b9c310d72783cc2f30d103eed83920a5a29c671a upstream.
cgw_csum_crc8_rel() correctly computes bounds-safe indices via calc_idx():
int from = calc_idx(crc8->from_idx, cf->len);
int to = calc_idx(crc8->to_idx, cf->len);
int res = calc_idx(crc8->result_idx, cf->len);
if (from < 0 || to < 0 || res < 0)
return;
However, the loop and the result write then use the raw s8 fields directly
instead of the computed variables:
for (i = crc8->from_idx; ...) /* BUG: raw negative index */
cf->data[crc8->result_idx] = ...; /* BUG: raw negative index */
With from_idx = to_idx = result_idx = -64 on a 64-byte CAN FD frame,
calc_idx(-64, 64) = 0 so the guard passes, but the loop iterates with
i = -64, reading cf->data[-64], and the write goes to cf->data[-64].
This write might end up to 56 (7.0-rc) or 40 (<= 6.19) bytes before the
start of the canfd_frame on the heap.
The companion function cgw_csum_xor_rel() uses `from`/`to`/`res`
correctly throughout; fix cgw_csum_crc8_rel() to match.
Confirmed with KASAN on linux-7.0-rc2:
BUG: KASAN: slab-out-of-bounds in cgw_csum_crc8_rel+0x515/0x5b0
Read of size 1 at addr ffff8880076619c8 by task poc_cgw_oob/62
To configure the can-gw crc8 checksums CAP_NET_ADMIN is needed.
Fixes: 456a8a646b25 ("can: gw: add support for CAN FD frames")
Cc: stable@vger.kernel.org
Reported-by: Ali Norouzi <ali.norouzi@keysight.com>
Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Ali Norouzi <ali.norouzi@keysight.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20260319-fix-can-gw-and-can-isotp-v2-1-c45d52c6d2d8@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Thu Mar 19 16:47:45 2026 +0100
can: isotp: fix tx.buf use-after-free in isotp_sendmsg()
commit 424e95d62110cdbc8fd12b40918f37e408e35a92 upstream.
isotp_sendmsg() uses only cmpxchg() on so->tx.state to serialize access
to so->tx.buf. isotp_release() waits for ISOTP_IDLE via
wait_event_interruptible() and then calls kfree(so->tx.buf).
If a signal interrupts the wait_event_interruptible() inside close()
while tx.state is ISOTP_SENDING, the loop exits early and release
proceeds to force ISOTP_SHUTDOWN and continues to kfree(so->tx.buf)
while sendmsg may still be reading so->tx.buf for the final CAN frame
in isotp_fill_dataframe().
The so->tx.buf can be allocated once when the standard tx.buf length needs
to be extended. Move the kfree() of this potentially extended tx.buf to
sk_destruct time when either isotp_sendmsg() and isotp_release() are done.
Fixes: 96d1c81e6a04 ("can: isotp: add module parameter for maximum pdu size")
Cc: stable@vger.kernel.org
Reported-by: Ali Norouzi <ali.norouzi@keysight.com>
Co-developed-by: Ali Norouzi <ali.norouzi@keysight.com>
Signed-off-by: Ali Norouzi <ali.norouzi@keysight.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20260319-fix-can-gw-and-can-isotp-v2-2-c45d52c6d2d8@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Tue Mar 10 13:48:03 2026 +0100
can: netlink: can_changelink(): add missing error handling to call can_ctrlmode_changelink()
commit cadf6019231b614ebbd9ec2a16e5997ecbd8d016 upstream.
In commit e1a5cd9d6665 ("can: netlink: add can_ctrlmode_changelink()") the
CAN Control Mode (IFLA_CAN_CTRLMODE) handling was factored out into the
can_ctrlmode_changelink() function. But the call to
can_ctrlmode_changelink() is missing the error handling.
Add the missing error handling and propagation to the call
can_ctrlmode_changelink().
Cc: stable@vger.kernel.org
Fixes: e1a5cd9d6665 ("can: netlink: add can_ctrlmode_changelink()")
Link: https://patch.msgid.link/20260310-can_ctrlmode_changelink-add-error-handling-v1-1-0daf63d85922@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Wed Mar 18 18:34:13 2026 +0100
can: statistics: add missing atomic access in hot path
[ Upstream commit 46eee1661aa9b49966e6c43d07126fe408edda57 ]
Commit 80b5f90158d1 ("can: statistics: use atomic access in hot path")
fixed a KCSAN issue in can_receive() but missed to convert the 'matches'
variable used in can_rcv_filter().
Fixes: 80b5f90158d1 ("can: statistics: use atomic access in hot path")
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20260318173413.28235-1-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Viresh Kumar <viresh.kumar@linaro.org>
Date: Fri Mar 20 15:08:14 2026 +0530
cpufreq: conservative: Reset requested_freq on limits change
commit 6a28fb8cb28b9eb39a392e531d938a889eacafc5 upstream.
A recently reported issue highlighted that the cached requested_freq
is not guaranteed to stay in sync with policy->cur. If the platform
changes the actual CPU frequency after the governor sets one (e.g.
due to platform-specific frequency scaling) and a re-sync occurs
later, policy->cur may diverge from requested_freq.
This can lead to incorrect behavior in the conservative governor.
For example, the governor may assume the CPU is already running at
the maximum frequency and skip further increases even though there
is still headroom.
Avoid this by resetting the cached requested_freq to policy->cur on
detecting a change in policy limits.
Reported-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Tested-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://lore.kernel.org/all/20260210115458.3493646-1-zhenglifeng1@huawei.com/
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/d846a141a98ac0482f20560fcd7525c0f0ec2f30.1773999467.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Date: Mon Mar 16 20:19:49 2026 +0000
cxl/hdm: Avoid incorrect DVSEC fallback when HDM decoders are enabled
[ Upstream commit 75cea0776de502f2a1be5ca02d37c586dc81887e ]
Check the global CXL_HDM_DECODER_ENABLE bit instead of looping over
per-decoder COMMITTED bits to determine whether to fall back to DVSEC
range emulation. When the HDM decoder capability is globally enabled,
ignore DVSEC range registers regardless of individual decoder commit
state.
should_emulate_decoders() currently loops over per-decoder COMMITTED
bits, which leads to an incorrect DVSEC fallback when those bits are
zero. One way to trigger this is to destroy a region and bounce the
memdev:
cxl disable-region region0
cxl destroy-region region0
cxl disable-memdev mem0
cxl enable-memdev mem0
Region teardown zeroes the HDM decoder registers including the committed
bits. The subsequent memdev re-probe finds uncommitted decoders and falls
back to DVSEC emulation, even though HDM remains globally enabled.
Observed failures:
should_emulate_decoders: cxl_port endpoint6: decoder6.0: committed: 0 base: 0x0_00000000 size: 0x0_00000000
devm_cxl_setup_hdm: cxl_port endpoint6: Fallback map 1 range register
..
devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region0
__construct_region: cxl_pci 0000:e1:00.0: mem1:decoder6.0:
__construct_region region0 res: [mem 0x850000000-0x284fffffff flags 0x200] iw: 1 ig: 4096
cxl region0: pci0000:e0:port1 cxl_port_setup_targets expected iw: 1 ig: 4096 ..
cxl region0: pci0000:e0:port1 cxl_port_setup_targets got iw: 1 ig: 256 state: disabled ..
cxl_port endpoint6: failed to attach decoder6.0 to region0: -6
..
devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region4
alloc_hpa: cxl region4: HPA allocation error (-34) ..
Fixes: 52cc48ad2a76 ("cxl/hdm: Limit emulation to the number of range registers")
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20260316201950.224567-1-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alison Schofield <alison.schofield@intel.com>
Date: Thu Feb 26 10:44:36 2026 -0800
cxl/port: Fix use after free of parent_port in cxl_detach_ep()
[ Upstream commit 19d2f0b97a131198efc2c4ca3eb7f980bba8c2b4 ]
cxl_detach_ep() is called during bottom-up removal when all CXL memory
devices beneath a switch port have been removed. For each port in the
hierarchy it locks both the port and its parent, removes the endpoint,
and if the port is now empty, marks it dead and unregisters the port
by calling delete_switch_port(). There are two places during this work
where the parent_port may be used after freeing:
First, a concurrent detach may have already processed a port by the
time a second worker finds it via bus_find_device(). Without pinning
parent_port, it may already be freed when we discover port->dead and
attempt to unlock the parent_port. In a production kernel that's a
silent memory corruption, with lock debug, it looks like this:
[]DEBUG_LOCKS_WARN_ON(__owner_task(owner) != get_current())
[]WARNING: kernel/locking/mutex.c:949 at __mutex_unlock_slowpath+0x1ee/0x310
[]Call Trace:
[]mutex_unlock+0xd/0x20
[]cxl_detach_ep+0x180/0x400 [cxl_core]
[]devm_action_release+0x10/0x20
[]devres_release_all+0xa8/0xe0
[]device_unbind_cleanup+0xd/0xa0
[]really_probe+0x1a6/0x3e0
Second, delete_switch_port() releases three devm actions registered
against parent_port. The last of those is unregister_port() and it
calls device_unregister() on the child port, which can cascade. If
parent_port is now also empty the device core may unregister and free
it too. So by the time delete_switch_port() returns, parent_port may
be free, and the subsequent device_unlock(&parent_port->dev) operates
on freed memory. The kernel log looks same as above, with a different
offset in cxl_detach_ep().
Both of these issues stem from the absence of a lifetime guarantee
between a child port and its parent port.
Establish a lifetime rule for ports: child ports hold a reference to
their parent device until release. Take the reference when the port
is allocated and drop it when released. This ensures the parent is
valid for the full lifetime of the child and eliminates the use after
free window in cxl_detach_ep().
This is easily reproduced with a reload of cxl_acpi in QEMU with CXL
devices present.
Fixes: 2345df54249c ("cxl/memdev: Fix endpoint port removal")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260226184439.1732841-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cui Chao <cuichao1753@phytium.com.cn>
Date: Thu Mar 19 15:45:35 2026 +0800
cxl: Adjust the startup priority of cxl_pmem to be higher than that of cxl_acpi
[ Upstream commit be5c5280cf2b20e363dc8e2a424dd200a29b1c77 ]
During the cxl_acpi probe process, it checks whether the cxl_nvb device
and driver have been attached. Currently, the startup priority of the
cxl_pmem driver is lower than that of the cxl_acpi driver. At this point,
the cxl_nvb driver has not yet been registered on the cxl_bus, causing
the attachment check to fail. This results in a failure to add the root
nvdimm bridge, leading to a cxl_acpi probe failure and ultimately
affecting the subsequent loading of cxl drivers. As a consequence, only
one mem device object exists on the cxl_bus, while the cxl_port device
objects and decoder device objects are missing.
The solution is to raise the startup priority of cxl_pmem to be higher
than that of cxl_acpi, ensuring that the cxl_pmem driver is registered
before the aforementioned attachment check occurs.
Co-developed-by: Wang Yinfeng <wangyinfeng@phytium.com.cn>
Signed-off-by: Wang Yinfeng <wangyinfeng@phytium.com.cn>
Signed-off-by: Cui Chao <cuichao1753@phytium.com.cn>
Fixes: e7e222ad73d9 ("cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20260319074535.1709250-1-cuichao1753@phytium.com.cn
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Isaac J. Manjarres <isaacmanjarres@google.com>
Date: Mon Mar 2 16:23:09 2026 -0800
dma-buf: Include ioctl.h in UAPI header
[ Upstream commit a116bac87118903925108e57781bbfc7a7eea27b ]
include/uapi/linux/dma-buf.h uses several macros from ioctl.h to define
its ioctl commands. However, it does not include ioctl.h itself. So,
if userspace source code tries to include the dma-buf.h file without
including ioctl.h, it can result in build failures.
Therefore, include ioctl.h in the dma-buf UAPI header.
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20260303002309.1401849-1-isaacmanjarres@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Miguel Ojeda <ojeda@kernel.org>
Date: Wed Mar 25 02:55:48 2026 +0100
dma-mapping: add missing `inline` for `dma_free_attrs`
[ Upstream commit 2cdaff22ed26f1e619aa2b43f27bb84f2c6ef8f8 ]
Under an UML build for an upcoming series [1], I got `-Wstatic-in-inline`
for `dma_free_attrs`:
BINDGEN rust/bindings/bindings_generated.rs - due to target missing
In file included from rust/helpers/helpers.c:59:
rust/helpers/dma.c:17:2: warning: static function 'dma_free_attrs' is used in an inline function with external linkage [-Wstatic-in-inline]
17 | dma_free_attrs(dev, size, cpu_addr, dma_handle, attrs);
| ^
rust/helpers/dma.c:12:1: note: use 'static' to give inline function 'rust_helper_dma_free_attrs' internal linkage
12 | __rust_helper void rust_helper_dma_free_attrs(struct device *dev, size_t size,
| ^
| static
The issue is that `dma_free_attrs` was not marked `inline` when it was
introduced alongside the rest of the stubs.
Thus mark it.
Fixes: ed6ccf10f24b ("dma-mapping: properly stub out the DMA API for !CONFIG_HAS_DMA")
Closes: https://lore.kernel.org/rust-for-linux/20260322194616.89847-1-ojeda@kernel.org/ [1]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260325015548.70912-1-ojeda@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shigeru Yoshida <syoshida@redhat.com>
Date: Sun Mar 15 17:27:49 2026 +0900
dma: swiotlb: add KMSAN annotations to swiotlb_bounce()
[ Upstream commit 6f770b73d0311a5b099277653199bb6421c4fed2 ]
When a device performs DMA to a bounce buffer, KMSAN is unaware of
the write and does not mark the data as initialized. When
swiotlb_bounce() later copies the bounce buffer back to the original
buffer, memcpy propagates the uninitialized shadow to the original
buffer, causing false positive uninit-value reports.
Fix this by calling kmsan_unpoison_memory() on the bounce buffer
before copying it back in the DMA_FROM_DEVICE path, so that memcpy
naturally propagates initialized shadow to the destination.
Suggested-by: Alexander Potapenko <glider@google.com>
Link: https://lore.kernel.org/CAG_fn=WUGta-paG1BgsGRoAR+fmuCgh3xo=R3XdzOt_-DqSdHw@mail.gmail.com/
Fixes: 7ade4f10779c ("dma: kmsan: unpoison DMA mappings")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260315082750.2375581-1-syoshida@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: LUO Haowen <luo-hw@foxmail.com>
Date: Wed Mar 4 14:45:09 2026 +0800
dmaengine: dw-edma: Fix multiple times setting of the CYCLE_STATE and CYCLE_BIT bits for HDMA.
[ Upstream commit 3f63297ff61a994b99d710dcb6dbde41c4003233 ]
Others have submitted this issue (https://lore.kernel.org/dmaengine/
20240722030405.3385-1-zhengdongxiong@gxmicro.cn/),
but it has not been fixed yet. Therefore, more supplementary information
is provided here.
As mentioned in the "PCS-CCS-CB-TCB" Producer-Consumer Synchronization of
"DesignWare Cores PCI Express Controller Databook, version 6.00a":
1. The Consumer CYCLE_STATE (CCS) bit in the register only needs to be
initialized once; the value will update automatically to be
~CYCLE_BIT (CB) in the next chunk.
2. The Consumer CYCLE_BIT bit in the register is loaded from the LL
element and tested against CCS. When CB = CCS, the data transfer is
executed. Otherwise not.
The current logic sets customer (HDMA) CS and CB bits to 1 in each chunk
while setting the producer (software) CB of odd chunks to 0 and even
chunks to 1 in the linked list. This is leading to a mismatch between
the producer CB and consumer CS bits.
This issue can be reproduced by setting the transmission data size to
exceed one chunk. By the way, in the EDMA using the same "PCS-CCS-CB-TCB"
mechanism, the CS bit is only initialized once and this issue was not
found. Refer to
drivers/dma/dw-edma/dw-edma-v0-core.c:dw_edma_v0_core_start.
So fix this issue by initializing the CYCLE_STATE and CYCLE_BIT bits
only once.
Fixes: e74c39573d35 ("dmaengine: dw-edma: Add support for native HDMA")
Signed-off-by: LUO Haowen <luo-hw@foxmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/tencent_CB11AA9F3920C1911AF7477A9BD8EFE0AD05@qq.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joy Zou <joy.zou@nxp.com>
Date: Wed Sep 17 17:53:42 2025 +0800
dmaengine: fsl-edma: fix channel parameter config for fixed channel requests
commit 2e7b5cf72e51c9cf9c8b75190189c757df31ddd9 upstream.
Configure only the requested channel when a fixed channel is specified
to avoid modifying other channels unintentionally.
Fix parameter configuration when a fixed DMA channel is requested on
i.MX9 AON domain and i.MX8QM/QXP/DXL platforms. When a client requests
a fixed channel (e.g., channel 6), the driver traverses channels 0-5
and may unintentionally modify their configuration if they are unused.
This leads to issues such as setting the `is_multi_fifo` flag unexpectedly,
causing memcpy tests to fail when using the dmatest tool.
Only affect edma memcpy test when the channel is fixed.
Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support")
Signed-off-by: Joy Zou <joy.zou@nxp.com>
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250917-b4-edma-chanconf-v1-1-886486e02e91@nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed Jan 21 10:34:28 2026 -0800
dmaengine: idxd: Fix crash when the event log is disabled
[ Upstream commit 52d2edea0d63c935e82631e4b9e4a94eccf97b5b ]
If reporting errors to the event log is not supported by the hardware,
and an error that causes Function Level Reset (FLR) is received, the
driver will try to restore the event log even if it was not allocated.
Also, only try to free the event log if it was properly allocated.
Fixes: 6078a315aec1 ("dmaengine: idxd: Add idxd_device_config_save() and idxd_device_config_restore() helpers")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-2-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed Jan 21 10:34:35 2026 -0800
dmaengine: idxd: Fix freeing the allocated ida too late
[ Upstream commit c311f5e9248471a950f0a524c2fd736414d98900 ]
It can happen that when the cdev .release() is called, the driver
already called ida_destroy(). Move ida_free() to the _del() path.
We see with DEBUG_KOBJECT_RELEASE enabled and forcing an early PCI
unbind.
Fixes: 04922b7445a1 ("dmaengine: idxd: fix cdev setup and free device lifetime issues")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-9-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed Jan 21 10:34:36 2026 -0800
dmaengine: idxd: Fix leaking event log memory
[ Upstream commit ee66bc29578391c9b48523dc9119af67bd5c7c0f ]
During the device remove process, the device is reset, causing the
configuration registers to go back to their default state, which is
zero. As the driver is checking if the event log support was enabled
before deallocating, it will fail if a reset happened before.
Do not check if the support was enabled, the check for 'idxd->evl'
being valid (only allocated if the HW capability is available) is
enough.
Fixes: 244da66cda35 ("dmaengine: idxd: setup event log configuration")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-10-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed Jan 21 10:34:34 2026 -0800
dmaengine: idxd: Fix memory leak when a wq is reset
[ Upstream commit d9cfb5193a047a92a4d3c0e91ea4cc87c8f7c478 ]
idxd_wq_disable_cleanup() which is called from the reset path for a
workqueue, sets the wq type to NONE, which for other parts of the
driver mean that the wq is empty (all its resources were released).
Only set the wq type to NONE after its resources are released.
Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-8-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed Jan 21 10:34:33 2026 -0800
dmaengine: idxd: Fix not releasing workqueue on .release()
[ Upstream commit 3d33de353b1ff9023d5ec73b9becf80ea87af695 ]
The workqueue associated with an DSA/IAA device is not released when
the object is freed.
Fixes: 47c16ac27d4c ("dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-7-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed Jan 21 10:34:29 2026 -0800
dmaengine: idxd: Fix possible invalid memory access after FLR
[ Upstream commit d6077df7b75d26e4edf98983836c05d00ebabd8d ]
In the case that the first Function Level Reset (FLR) concludes
correctly, but in the second FLR the scratch area for the saved
configuration cannot be allocated, it's possible for a invalid memory
access to happen.
Always set the deallocated scratch area to NULL after FLR completes.
Fixes: 98d187a98903 ("dmaengine: idxd: Enable Function Level Reset (FLR) for halt")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://patch.msgid.link/20260121-idxd-fix-flr-on-kernel-queues-v3-v3-3-7ed70658a9d1@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tuo Li <islituo@gmail.com>
Date: Tue Jan 6 11:24:28 2026 +0800
dmaengine: idxd: fix possible wrong descriptor completion in llist_abort_desc()
[ Upstream commit e1c9866173c5f8521f2d0768547a01508cb9ff27 ]
At the end of this function, d is the traversal cursor of flist, but the
code completes found instead. This can lead to issues such as NULL pointer
dereferences, double completion, or descriptor leaks.
Fix this by completing d instead of found in the final
list_for_each_entry_safe() loop.
Fixes: aa8d18becc0c ("dmaengine: idxd: add callback support for iaa crypto")
Signed-off-by: Tuo Li <islituo@gmail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260106032428.162445-1-islituo@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Date: Mon Mar 16 15:32:46 2026 +0200
dmaengine: sh: rz-dmac: Move CHCTRL updates under spinlock
commit 89a8567d84bde88cb7cdbbac2ab2299c4f991490 upstream.
Both rz_dmac_disable_hw() and rz_dmac_irq_handle_channel() update the
CHCTRL register. To avoid concurrency issues when configuring
functionalities exposed by this registers, take the virtual channel lock.
All other CHCTRL updates were already protected by the same lock.
Previously, rz_dmac_disable_hw() disabled and re-enabled local IRQs, before
accessing CHCTRL registers but this does not ensure race-free access.
Remove the local IRQ disable/enable code as well.
Fixes: 5000d37042a6 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC")
Cc: stable@vger.kernel.org
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20260316133252.240348-3-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Date: Mon Mar 16 15:32:45 2026 +0200
dmaengine: sh: rz-dmac: Protect the driver specific lists
commit abb863e6213dc41a58ef8bb3289b7e77460dabf3 upstream.
The driver lists (ld_free, ld_queue) are used in
rz_dmac_free_chan_resources(), rz_dmac_terminate_all(),
rz_dmac_issue_pending(), and rz_dmac_irq_handler_thread(), all under
the virtual channel lock. Take the same lock in rz_dmac_prep_slave_sg()
and rz_dmac_prep_dma_memcpy() as well to avoid concurrency issues, since
these functions also check whether the lists are empty and update or
remove list entries.
Fixes: 5000d37042a6 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC")
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20260316133252.240348-2-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alexander Stein <alexander.stein@ew.tq-group.com>
Date: Tue Oct 14 08:13:08 2025 +0200
dmaengine: xilinx: xdma: Fix regmap init error handling
[ Upstream commit e0adbf74e2a0455a6bc9628726ba87bcd0b42bf8 ]
devm_regmap_init_mmio returns an ERR_PTR() upon error, not NULL.
Fix the error check and also fix the error message. Use the error code
from ERR_PTR() instead of the wrong value in ret.
Fixes: 17ce252266c7 ("dmaengine: xilinx: xdma: Add xilinx xdma driver")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20251014061309.283468-1-alexander.stein@ew.tq-group.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marek Vasut <marex@nabladev.com>
Date: Mon Mar 16 23:16:54 2026 +0100
dmaengine: xilinx: xilinx_dma: Fix dma_device directions
[ Upstream commit e9cc95397bb7da13fe8a5b53a2f23cfaf9018ade ]
Unlike chan->direction , struct dma_device .directions field is a
bitfield. Turn chan->direction into a bitfield to make it compatible
with struct dma_device .directions .
Fixes: 7e01511443c3 ("dmaengine: xilinx_dma: Set dma_device directions")
Signed-off-by: Marek Vasut <marex@nabladev.com>
Link: https://patch.msgid.link/20260316221728.160139-1-marex@nabladev.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marek Vasut <marex@nabladev.com>
Date: Mon Mar 16 23:18:57 2026 +0100
dmaengine: xilinx: xilinx_dma: Fix residue calculation for cyclic DMA
[ Upstream commit f61d145999d61948a23cd436ebbfa4c3b9ab8987 ]
The cyclic DMA calculation is currently entirely broken and reports
residue only for the first segment. The problem is twofold.
First, when the first descriptor finishes, it is moved from active_list
to done_list, but it is never returned back into the active_list. The
xilinx_dma_tx_status() expects the descriptor to be in the active_list
to report any meaningful residue information, which never happens after
the first descriptor finishes. Fix this up in xilinx_dma_start_transfer()
and if the descriptor is cyclic, lift it from done_list and place it back
into active_list list.
Second, the segment .status fields of the descriptor remain dirty. Once
the DMA did one pass on the descriptor, the .status fields are populated
with data by the DMA, but the .status fields are not cleared before reuse
during the next cyclic DMA round. The xilinx_dma_get_residue() recognizes
that as if the descriptor was complete and had 0 residue, which is bogus.
Reinitialize the status field before placing the descriptor back into the
active_list.
Fixes: c0bba3a99f07 ("dmaengine: vdma: Add Support for Xilinx AXI Direct Memory Access Engine")
Signed-off-by: Marek Vasut <marex@nabladev.com>
Link: https://patch.msgid.link/20260316221943.160375-1-marex@nabladev.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marek Vasut <marex@nabladev.com>
Date: Mon Mar 16 23:25:24 2026 +0100
dmaengine: xilinx: xilinx_dma: Fix unmasked residue subtraction
[ Upstream commit c7d812e33f3e8ca0fa9eeabf71d1c7bc3acedc09 ]
The segment .control and .status fields both contain top bits which are
not part of the buffer size, the buffer size is located only in the bottom
max_buffer_len bits. To avoid interference from those top bits, mask out
the size using max_buffer_len first, and only then subtract the values.
Fixes: a575d0b4e663 ("dmaengine: xilinx_dma: Introduce xilinx_dma_get_residue")
Signed-off-by: Marek Vasut <marex@nabladev.com>
Link: https://patch.msgid.link/20260316222530.163815-1-marex@nabladev.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Date: Wed Mar 11 07:34:46 2026 +0200
dmaengine: xilinx_dma: Fix reset related timeout with two-channel AXIDMA
[ Upstream commit a17ce4bc6f4f9acf77ba416c36791a15602e53aa ]
A single AXIDMA controller can have one or two channels. When it has two
channels, the reset for both are tied together: resetting one channel
resets the other as well. This creates a problem where resetting one
channel will reset the registers for both channels, including clearing
interrupt enable bits for the other channel, which can then lead to
timeouts as the driver is waiting for an interrupt which never comes.
The driver currently has a probe-time work around for this: when a
channel is created, the driver also resets and enables the
interrupts. With two channels the reset for the second channel will
clear the interrupt enables for the first one. The work around in the
driver is just to manually enable the interrupts again in
xilinx_dma_alloc_chan_resources().
This workaround only addresses the probe-time issue. When channels are
reset at runtime (e.g., in xilinx_dma_terminate_all() or during error
recovery), there's no corresponding mechanism to restore the other
channel's interrupt enables. This leads to one channel having its
interrupts disabled while the driver expects them to work, causing
timeouts and DMA failures.
A proper fix is a complicated matter, as we should not reset the other
channel when it's operating normally. So, perhaps, there should be some
kind of synchronization for a common reset, which is not trivial to
implement. To add to the complexity, the driver also supports other DMA
types, like VDMA, CDMA and MCDMA, which don't have a shared reset.
However, when the two-channel AXIDMA is used in the (assumably) normal
use case, providing DMA for a single memory-to-memory device, the common
reset is a bit smaller issue: when something bad happens on one channel,
or when one channel is terminated, the assumption is that we also want
to terminate the other channel. And thus resetting both at the same time
is "ok".
With that line of thinking we can implement a bit better work around
than just the current probe time work around: let's enable the
AXIDMA interrupts at xilinx_dma_start_transfer() instead.
This ensures interrupts are enabled whenever a transfer starts,
regardless of any prior resets that may have cleared them.
This approach is also more logical: enable interrupts only when needed
for a transfer, rather than at resource allocation time, and, I think,
all the other DMA types should also use this model, but I'm reluctant to
do such changes as I cannot test them.
The reset function still enables interrupts even though it's not needed
for AXIDMA anymore, but it's common code for all DMA types (VDMA, CDMA,
MCDMA), so leave it unchanged to avoid affecting other variants.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Fixes: c0bba3a99f07 ("dmaengine: vdma: Add Support for Xilinx AXI Direct Memory Access Engine")
Link: https://patch.msgid.link/20260311-xilinx-dma-fix-v2-1-a725abb66e3c@ideasonboard.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 3 12:53:18 2026 +0100
driver core: generalize driver_override in struct device
[ Upstream commit cb3d1049f4ea77d5ad93f17d8ac1f2ed4da70501 ]
Currently, there are 12 busses (including platform and PCI) that
duplicate the driver_override logic for their individual devices.
All of them seem to be prone to the bug described in [1].
While this could be solved for every bus individually using a separate
lock, solving this in the driver-core generically results in less (and
cleaner) changes overall.
Thus, move driver_override to struct device, provide corresponding
accessors for busses and handle locking with a separate lock internally.
In particular, add device_set_driver_override(),
device_has_driver_override(), device_match_driver_override() and
generalize the sysfs store() and show() callbacks via a driver_override
feature flag in struct bus_type.
Until all busses have migrated, keep driver_set_override() in place.
Note that we can't use the device lock for the reasons described in [2].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220789 [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [2]
Tested-by: Gui-Dong Han <hanguidong02@gmail.com>
Co-developed-by: Gui-Dong Han <hanguidong02@gmail.com>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260303115720.48783-2-dakr@kernel.org
[ Use dev->bus instead of sp->bus for consistency; fix commit message to
refer to the struct bus_type's driver_override feature flag. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Stable-dep-of: 2b38efc05bf7 ("driver core: platform: use generic driver_override infrastructure")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 3 12:53:21 2026 +0100
driver core: platform: use generic driver_override infrastructure
[ Upstream commit 2b38efc05bf7a8568ec74bfffea0f5cfa62bc01d ]
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 3d713e0e382e ("driver core: platform: add device binding path 'driver_override'")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260303115720.48783-5-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yussuf Khalil <dev@pp3345.net>
Date: Fri Mar 6 12:06:35 2026 +0000
drm/amd/display: Do not skip unrelated mode changes in DSC validation
[ Upstream commit aed3d041ab061ec8a64f50a3edda0f4db7280025 ]
Starting with commit 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in
atomic check"), amdgpu resets the CRTC state mode_changed flag to false when
recomputing the DSC configuration results in no timing change for a particular
stream.
However, this is incorrect in scenarios where a change in MST/DSC configuration
happens in the same KMS commit as another (unrelated) mode change. For example,
the integrated panel of a laptop may be configured differently (e.g., HDR
enabled/disabled) depending on whether external screens are attached. In this
case, plugging in external DP-MST screens may result in the mode_changed flag
being dropped incorrectly for the integrated panel if its DSC configuration
did not change during precomputation in pre_validate_dsc().
At this point, however, dm_update_crtc_state() has already created new streams
for CRTCs with DSC-independent mode changes. In turn,
amdgpu_dm_commit_streams() will never release the old stream, resulting in a
memory leak. amdgpu_dm_atomic_commit_tail() will never acquire a reference to
the new stream either, which manifests as a use-after-free when the stream gets
disabled later on:
BUG: KASAN: use-after-free in dc_stream_release+0x25/0x90 [amdgpu]
Write of size 4 at addr ffff88813d836524 by task kworker/9:9/29977
Workqueue: events drm_mode_rmfb_work_fn
Call Trace:
<TASK>
dump_stack_lvl+0x6e/0xa0
print_address_description.constprop.0+0x88/0x320
? dc_stream_release+0x25/0x90 [amdgpu]
print_report+0xfc/0x1ff
? srso_alias_return_thunk+0x5/0xfbef5
? __virt_addr_valid+0x225/0x4e0
? dc_stream_release+0x25/0x90 [amdgpu]
kasan_report+0xe1/0x180
? dc_stream_release+0x25/0x90 [amdgpu]
kasan_check_range+0x125/0x200
dc_stream_release+0x25/0x90 [amdgpu]
dc_state_destruct+0x14d/0x5c0 [amdgpu]
dc_state_release.part.0+0x4e/0x130 [amdgpu]
dm_atomic_destroy_state+0x3f/0x70 [amdgpu]
drm_atomic_state_default_clear+0x8ee/0xf30
? drm_mode_object_put.part.0+0xb1/0x130
__drm_atomic_state_free+0x15c/0x2d0
atomic_remove_fb+0x67e/0x980
Since there is no reliable way of figuring out whether a CRTC has unrelated
mode changes pending at the time of DSC validation, remember the value of the
mode_changed flag from before the point where a CRTC was marked as potentially
affected by a change in DSC configuration. Reset the mode_changed flag to this
earlier value instead in pre_validate_dsc().
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5004
Fixes: 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in atomic check")
Signed-off-by: Yussuf Khalil <dev@pp3345.net>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cc7c7121ae082b7b82891baa7280f1ff2608f22b)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Hung <alex.hung@amd.com>
Date: Mon Mar 9 11:16:08 2026 -0600
drm/amd/display: Fix drm_edid leak in amdgpu_dm
commit 37c2caa167b0b8aca4f74c32404c5288b876a2a3 upstream.
[WHAT]
When a sink is connected, aconnector->drm_edid was overwritten without
freeing the previous allocation, causing a memory leak on resume.
[HOW]
Free the previous drm_edid before updating it.
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 52024a94e7111366141cfc5d888b2ef011f879e5)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yang Wang <kevinyang.wang@amd.com>
Date: Wed Nov 19 10:46:23 2025 +0800
drm/amd/pm: fix amdgpu_irq enabled counter unbalanced on smu v11.0
commit e12603bf2c3d571476a21debfeab80bb70d8c0cc upstream.
v1:
- fix amdgpu_irq enabled counter unbalanced issue on smu_v11_0_disable_thermal_alert.
v2:
- re-enable smu thermal alert to make amdgpu irq counter balance for smu v11.0 if in runpm state
[75582.361561] ------------[ cut here ]------------
[75582.361565] WARNING: CPU: 42 PID: 533 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:639 amdgpu_irq_put+0xd8/0xf0 [amdgpu]
...
[75582.362211] Tainted: [E]=UNSIGNED_MODULE
[75582.362214] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F14a 08/14/2020
[75582.362218] Workqueue: pm pm_runtime_work
[75582.362225] RIP: 0010:amdgpu_irq_put+0xd8/0xf0 [amdgpu]
[75582.362556] Code: 31 f6 31 ff e9 c9 bf cf c2 44 89 f2 4c 89 e6 4c 89 ef e8 db fc ff ff 5b 41 5c 41 5d 41 5e 5d 31 d2 31 f6 31 ff e9 a8 bf cf c2 <0f> 0b eb c3 b8 fe ff ff ff eb 97 e9 84 e8 8b 00 0f 1f 84 00 00 00
[75582.362560] RSP: 0018:ffffd50d51297b80 EFLAGS: 00010246
[75582.362564] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[75582.362568] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[75582.362570] RBP: ffffd50d51297ba0 R08: 0000000000000000 R09: 0000000000000000
[75582.362573] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e72091d2008
[75582.362576] R13: ffff8e720af80000 R14: 0000000000000000 R15: ffff8e720af80000
[75582.362579] FS: 0000000000000000(0000) GS:ffff8e9158262000(0000) knlGS:0000000000000000
[75582.362582] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[75582.362585] CR2: 000074869d040c14 CR3: 0000001e37a3e000 CR4: 00000000003506f0
[75582.362588] Call Trace:
[75582.362591] <TASK>
[75582.362597] smu_v11_0_disable_thermal_alert+0x17/0x30 [amdgpu]
[75582.362983] smu_smc_hw_cleanup+0x79/0x4f0 [amdgpu]
[75582.363375] smu_suspend+0x92/0x110 [amdgpu]
[75582.363762] ? gfx_v10_0_hw_fini+0xd5/0x150 [amdgpu]
[75582.364098] amdgpu_ip_block_suspend+0x27/0x80 [amdgpu]
[75582.364377] ? timer_delete_sync+0x10/0x20
[75582.364384] amdgpu_device_ip_suspend_phase2+0x190/0x450 [amdgpu]
[75582.364665] amdgpu_device_suspend+0x1ae/0x2f0 [amdgpu]
[75582.364948] amdgpu_pmops_runtime_suspend+0xf3/0x1f0 [amdgpu]
[75582.365230] pci_pm_runtime_suspend+0x6d/0x1f0
[75582.365237] ? __pfx_pci_pm_runtime_suspend+0x10/0x10
[75582.365242] __rpm_callback+0x4c/0x190
[75582.365246] ? srso_return_thunk+0x5/0x5f
[75582.365252] ? srso_return_thunk+0x5/0x5f
[75582.365256] ? ktime_get_mono_fast_ns+0x43/0xe0
[75582.365263] rpm_callback+0x6e/0x80
[75582.365267] rpm_suspend+0x124/0x5f0
[75582.365271] ? srso_return_thunk+0x5/0x5f
[75582.365275] ? __schedule+0x439/0x15e0
[75582.365281] ? srso_return_thunk+0x5/0x5f
[75582.365285] ? __queue_delayed_work+0xb8/0x180
[75582.365293] pm_runtime_work+0xc6/0xe0
[75582.365297] process_one_work+0x1a1/0x3f0
[75582.365303] worker_thread+0x2ba/0x3d0
[75582.365309] kthread+0x107/0x220
[75582.365313] ? __pfx_worker_thread+0x10/0x10
[75582.365318] ? __pfx_kthread+0x10/0x10
[75582.365323] ret_from_fork+0xa2/0x120
[75582.365328] ? __pfx_kthread+0x10/0x10
[75582.365332] ret_from_fork_asm+0x1a/0x30
[75582.365343] </TASK>
[75582.365345] ---[ end trace 0000000000000000 ]---
[75582.365350] amdgpu 0000:05:00.0: amdgpu: Fail to disable thermal alert!
[75582.365379] amdgpu 0000:05:00.0: amdgpu: suspend of IP block <smu> failed -22
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Asad Kamal <asad.kamal@amd.com>
Date: Wed Mar 18 13:52:57 2026 +0800
drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6
commit 2f0e491faee43181b6a86e90f34016b256042fe1 upstream.
When SET_UCLK_MAX capability is absent, return -EOPNOTSUPP from
smu_v13_0_6_emit_clk_levels() for OD_MCLK instead of 0. This makes
unsupported OD_MCLK reporting consistent with other clock types
and allows callers to skip the entry cleanly.
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d82e0a72d9189e8acd353988e1a57f85ce479e37)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date: Mon Mar 23 13:41:18 2026 +0530
drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib
[ Upstream commit 7150850146ebfa4ca998f653f264b8df6f7f85be ]
amdgpu_amdkfd_submit_ib() submits a GPU job and gets a fence
from amdgpu_ib_schedule(). This fence is used to wait for job
completion.
Currently, the code drops the fence reference using dma_fence_put()
before calling dma_fence_wait().
If dma_fence_put() releases the last reference, the fence may be
freed before dma_fence_wait() is called. This can lead to a
use-after-free.
Fix this by waiting on the fence first and releasing the reference
only after dma_fence_wait() completes.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:697 amdgpu_amdkfd_submit_ib() warn: passing freed memory 'f' (line 696)
Fixes: 9ae55f030dc5 ("drm/amdgpu: Follow up change to previous drm scheduler change.")
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8b9e5259adc385b61a6590a13b82ae0ac2bd3482)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yang Wang <kevinyang.wang@amd.com>
Date: Wed Mar 4 18:45:45 2026 -0500
drm/amdgpu: fix gpu idle power consumption issue for gfx v12
[ Upstream commit a6571045cf06c4aa749b4801382ae96650e2f0e1 ]
Older versions of the MES firmware may cause abnormal GPU power consumption.
When performing inference tasks on the GPU (e.g., with Ollama using ROCm),
the GPU may show abnormal power consumption in idle state and incorrect GPU load information.
This issue has been fixed in firmware version 0x8b and newer.
Closes: https://github.com/ROCm/ROCm/issues/5706
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4e22a5fe6ea6e0b057e7f246df4ac3ff8bfbc46a)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Huang <jinhuieric.huang@amd.com>
Date: Mon Mar 16 11:01:30 2026 -0400
drm/amdgpu: prevent immediate PASID reuse case
commit 14b81abe7bdc25f8097906fc2f91276ffedb2d26 upstream.
PASID resue could cause interrupt issue when process
immediately runs into hw state left by previous
process exited with the same PASID, it's possible that
page faults are still pending in the IH ring buffer when
the process exits and frees up its PASID. To prevent the
case, it uses idr cyclic allocator same as kernel pid's.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8f1de51f49be692de137c8525106e0fce2d1912d)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Imre Deak <imre.deak@intel.com>
Date: Fri Mar 20 11:29:00 2026 +0200
drm/i915/dp_tunnel: Fix error handling when clearing stream BW in atomic state
commit 77fcf58df15edcf3f5b5421f24814fb72796def9 upstream.
Clearing the DP tunnel stream BW in the atomic state involves getting
the tunnel group state, which can fail. Handle the error accordingly.
This fixes at least one issue where drm_dp_tunnel_atomic_set_stream_bw()
failed to get the tunnel group state returning -EDEADLK, which wasn't
handled. This lead to the ctx->contended warn later in modeset_lock()
while taking a WW mutex for another object in the same atomic state, and
thus within the same already contended WW context.
Moving intel_crtc_state_alloc() later would avoid freeing saved_state on
the error path; this stable patch leaves that simplification for a
follow-up.
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.9+
Fixes: a4efae87ecb2 ("drm/i915/dp: Compute DP tunnel BW during encoder state computation")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7617
Reviewed-by: Michał Grzelak <michal.grzelak@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20260320092900.13210-1-imre.deak@intel.com
(cherry picked from commit fb69d0076e687421188bc8103ab0e8e5825b1df1)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Date: Mon Mar 16 16:19:19 2026 -0700
drm/i915/gmbus: fix spurious timeout on 512-byte burst reads
[ Upstream commit 08441f10f4dc09fdeb64529953ac308abc79dd38 ]
When reading exactly 512 bytes with burst read enabled, the
extra_byte_added path breaks out of the inner do-while without
decrementing len. The outer while(len) then re-enters and gmbus_wait()
times out since all data has been delivered. Decrement len before the
break so the outer loop terminates correctly.
Fixes: d5dc0f43f268 ("drm/i915/gmbus: Enable burst read")
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/20260316231920.135438-2-samasth.norway.ananda@oracle.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 4ab0f09ee73fc853d00466682635f67c531f909c)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Fri Mar 13 13:07:40 2026 +0200
drm/i915: Order OP vs. timeout correctly in __wait_for()
commit 6ad2a661ff0d3d94884947d2a593311ba46d34c2 upstream.
Put the barrier() before the OP so that anything we read out in
OP and check in COND will actually be read out after the timeout
has been evaluated.
Currently the only place where we use OP is __intel_wait_for_register(),
but the use there is precisely susceptible to this reordering, assuming
the ktime_*() stuff itself doesn't act as a sufficient barrier:
__intel_wait_for_register(...)
{
...
ret = __wait_for(reg_value = intel_uncore_read_notrace(...),
(reg_value & mask) == value, ...);
...
}
Cc: stable@vger.kernel.org
Fixes: 1c3c1dc66a96 ("drm/i915: Add compiler barrier to wait_for")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260313110740.24620-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit a464bace0482aa9a83e9aa7beefbaf44cd58e6cf)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Mon Mar 16 18:39:51 2026 +0200
drm/i915: Unlink NV12 planes earlier
commit bfa71b7a9dc6b5b8af157686e03308291141d00c upstream.
unlink_nv12_plane() will clobber parts of the plane state
potentially already set up by plane_atomic_check(), so we
must make sure not to call the two in the wrong order.
The problem happens when a plane previously selected as
a Y plane is now configured as a normal plane by user space.
plane_atomic_check() will first compute the proper plane
state based on the userspace request, and unlink_nv12_plane()
later clears some of the state.
This used to work on account of unlink_nv12_plane() skipping
the state clearing based on the plane visibility. But I removed
that check, thinking it was an impossible situation. Now when
that situation happens unlink_nv12_plane() will just WARN
and proceed to clobber the state.
Rather than reverting to the old way of doing things, I think
it's more clear if we unlink the NV12 planes before we even
compute the new plane state.
Cc: stable@vger.kernel.org
Reported-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Closes: https://lore.kernel.org/intel-gfx/20260212004852.1920270-1-khaled.almahallawy@intel.com/
Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Fixes: 6a01df2f1b2a ("drm/i915: Remove pointless visible check in unlink_nv12_plane()")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260316163953.12905-2-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
(cherry picked from commit 017ecd04985573eeeb0745fa2c23896fb22ee0cc)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Luca Leonardo Scorcia <l.scorcia@gmail.com>
Date: Wed Feb 25 09:38:41 2026 +0000
drm/mediatek: dsi: Store driver data before invoking mipi_dsi_host_register
[ Upstream commit 4cfdfeb6ac06079f92fccd977fa742d6c5b8dd3a ]
The call to mipi_dsi_host_register triggers a callback to mtk_dsi_bind,
which uses dev_get_drvdata to retrieve the mtk_dsi struct, so this
structure needs to be stored inside the driver data before invoking it.
As drvdata is currently uninitialized it leads to a crash when
registering the DSI DRM encoder right after acquiring
the mode_config.idr_mutex, blocking all subsequent DRM operations.
Fixes the following crash during mediatek-drm probe (tested on Xiaomi
Smart Clock x04g):
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000040
[...]
Modules linked in: mediatek_drm(+) drm_display_helper cec drm_client_lib
drm_dma_helper drm_kms_helper panel_simple
[...]
Call trace:
drm_mode_object_add+0x58/0x98 (P)
__drm_encoder_init+0x48/0x140
drm_encoder_init+0x6c/0xa0
drm_simple_encoder_init+0x20/0x34 [drm_kms_helper]
mtk_dsi_bind+0x34/0x13c [mediatek_drm]
component_bind_all+0x120/0x280
mtk_drm_bind+0x284/0x67c [mediatek_drm]
try_to_bring_up_aggregate_device+0x23c/0x320
__component_add+0xa4/0x198
component_add+0x14/0x20
mtk_dsi_host_attach+0x78/0x100 [mediatek_drm]
mipi_dsi_attach+0x2c/0x50
panel_simple_dsi_probe+0x4c/0x9c [panel_simple]
mipi_dsi_drv_probe+0x1c/0x28
really_probe+0xc0/0x3dc
__driver_probe_device+0x80/0x160
driver_probe_device+0x40/0x120
__device_attach_driver+0xbc/0x17c
bus_for_each_drv+0x88/0xf0
__device_attach+0x9c/0x1cc
device_initial_probe+0x54/0x60
bus_probe_device+0x34/0xa0
device_add+0x5b0/0x800
mipi_dsi_device_register_full+0xdc/0x16c
mipi_dsi_host_register+0xc4/0x17c
mtk_dsi_probe+0x10c/0x260 [mediatek_drm]
platform_probe+0x5c/0xa4
really_probe+0xc0/0x3dc
__driver_probe_device+0x80/0x160
driver_probe_device+0x40/0x120
__driver_attach+0xc8/0x1f8
bus_for_each_dev+0x7c/0xe0
driver_attach+0x24/0x30
bus_add_driver+0x11c/0x240
driver_register+0x68/0x130
__platform_register_drivers+0x64/0x160
mtk_drm_init+0x24/0x1000 [mediatek_drm]
do_one_initcall+0x60/0x1d0
do_init_module+0x54/0x240
load_module+0x1838/0x1dc0
init_module_from_file+0xd8/0xf0
__arm64_sys_finit_module+0x1b4/0x428
invoke_syscall.constprop.0+0x48/0xc8
do_el0_svc+0x3c/0xb8
el0_svc+0x34/0xe8
el0t_64_sync_handler+0xa0/0xe4
el0t_64_sync+0x198/0x19c
Code: 52800022 941004ab 2a0003f3 37f80040 (29005a80)
Fixes: e4732b590a77 ("drm/mediatek: dsi: Register DSI host after acquiring clocks and PHY")
Signed-off-by: Luca Leonardo Scorcia <l.scorcia@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20260225094047.76780-1-l.scorcia@gmail.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maarten Lankhorst <dev@lankhorst.se>
Date: Wed Mar 4 09:56:16 2026 +0100
drm/ttm/tests: Fix build failure on PREEMPT_RT
[ Upstream commit a58d487fb1a52579d3c37544ea371da78ed70c45 ]
Fix a compile error in the kunit tests when CONFIG_PREEMPT_RT is
enabled, and the normal mutex is converted into a rtmutex.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202602261547.3bM6yVAS-lkp@intel.com/
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Link: https://patch.msgid.link/20260304085616.1216961-1-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthew Auld <matthew.auld@intel.com>
Date: Wed Mar 18 10:02:09 2026 +0000
drm/xe: always keep track of remap prev/next
commit bfe9e314d7574d1c5c851972e7aee342733819d2 upstream.
During 3D workload, user is reporting hitting:
[ 413.361679] WARNING: drivers/gpu/drm/xe/xe_vm.c:1217 at vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe], CPU#7: vkd3d_queue/9925
[ 413.361944] CPU: 7 UID: 1000 PID: 9925 Comm: vkd3d_queue Kdump: loaded Not tainted 7.0.0-070000rc3-generic #202603090038 PREEMPT(lazy)
[ 413.361949] RIP: 0010:vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe]
[ 413.362074] RSP: 0018:ffffd4c25c3df930 EFLAGS: 00010282
[ 413.362077] RAX: 0000000000000000 RBX: ffff8f3ee817ed10 RCX: 0000000000000000
[ 413.362078] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 413.362079] RBP: ffffd4c25c3df980 R08: 0000000000000000 R09: 0000000000000000
[ 413.362081] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f41fbf99380
[ 413.362082] R13: ffff8f3ee817e968 R14: 00000000ffffffef R15: ffff8f43d00bd380
[ 413.362083] FS: 00000001040ff6c0(0000) GS:ffff8f4696d89000(0000) knlGS:00000000330b0000
[ 413.362085] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 413.362086] CR2: 00007ddfc4747000 CR3: 00000002e6262005 CR4: 0000000000f72ef0
[ 413.362088] PKRU: 55555554
[ 413.362089] Call Trace:
[ 413.362092] <TASK>
[ 413.362096] xe_vm_bind_ioctl+0xa9a/0xc60 [xe]
Which seems to hint that the vma we are re-inserting for the ops unwind
is either invalid or overlapping with something already inserted in the
vm. It shouldn't be invalid since this is a re-insertion, so must have
worked before. Leaving the likely culprit as something already placed
where we want to insert the vma.
Following from that, for the case where we do something like a rebind in
the middle of a vma, and one or both mapped ends are already compatible,
we skip doing the rebind of those vma and set next/prev to NULL. As well
as then adjust the original unmap va range, to avoid unmapping the ends.
However, if we trigger the unwind path, we end up with three va, with
the two ends never being removed and the original va range in the middle
still being the shrunken size.
If this occurs, one failure mode is when another unwind op needs to
interact with that range, which can happen with a vector of binds. For
example, if we need to re-insert something in place of the original va.
In this case the va is still the shrunken version, so when removing it
and then doing a re-insert it can overlap with the ends, which were
never removed, triggering a warning like above, plus leaving the vm in a
bad state.
With that, we need two things here:
1) Stop nuking the prev/next tracking for the skip cases. Instead
relying on checking for skip prev/next, where needed. That way on the
unwind path, we now correctly remove both ends.
2) Undo the unmap va shrinkage, on the unwind path. With the two ends
now removed the unmap va should expand back to the original size again,
before re-insertion.
v2:
- Update the explanation in the commit message, based on an actual IGT of
triggering this issue, rather than conjecture.
- Also undo the unmap shrinkage, for the skip case. With the two ends
now removed, the original unmap va range should expand back to the
original range.
v3:
- Track the old start/range separately. vma_size/start() uses the va
info directly.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7602
Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260318100208.78097-2-matthew.auld@intel.com
(cherry picked from commit aec6969f75afbf4e01fd5fb5850ed3e9c27043ac)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matt Roper <matthew.d.roper@intel.com>
Date: Thu Mar 19 15:30:34 2026 -0700
drm/xe: Implement recent spec updates to Wa_16025250150
[ Upstream commit 56781a4597706cd25185b1dedc38841ec6c31496 ]
The hardware teams noticed that the originally documented workaround
steps for Wa_16025250150 may not be sufficient to fully avoid a hardware
issue. The workaround documentation has been augmented to suggest
programming one additional register; make the corresponding change in
the driver.
Fixes: 7654d51f1fd8 ("drm/xe/xe2hpg: Add Wa_16025250150")
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260319-wa_16025250150_part2-v1-1-46b1de1a31b2@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit a31566762d4075646a8a2214586158b681e94305)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiucheng Xu <jiucheng.xu@amlogic.com>
Date: Wed Mar 11 17:11:31 2026 +0800
erofs: add GFP_NOIO in the bio completion if needed
commit c23df30915f83e7257c8625b690a1cece94142a0 upstream.
The bio completion path in the process context (e.g. dm-verity)
will directly call into decompression rather than trigger another
workqueue context for minimal scheduling latencies, which can
then call vm_map_ram() with GFP_KERNEL.
Due to insufficient memory, vm_map_ram() may generate memory
swapping I/O, which can cause submit_bio_wait to deadlock
in some scenarios.
Trimmed down the call stack, as follows:
f2fs_submit_read_io
submit_bio //bio_list is initialized.
mmc_blk_mq_recovery
z_erofs_endio
vm_map_ram
__pte_alloc_kernel
__alloc_pages_direct_reclaim
shrink_folio_list
__swap_writepage
submit_bio_wait //bio_list is non-NULL, hang!!!
Use memalloc_noio_{save,restore}() to wrap up this path.
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Jiucheng Xu <jiucheng.xu@amlogic.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sheng Yong <shengyong1@xiaomi.com>
Date: Fri Feb 27 10:30:08 2026 +0800
erofs: set fileio bio failed in short read case
[ Upstream commit eade54040384f54b7fb330e4b0975c5734850b3c ]
For file-backed mount, IO requests are handled by vfs_iocb_iter_read().
However, it can be interrupted by SIGKILL, returning the number of
bytes actually copied. Unused folios in bio are unexpectedly marked
as uptodate.
vfs_read
filemap_read
filemap_get_pages
filemap_readahead
erofs_fileio_readahead
erofs_fileio_rq_submit
vfs_iocb_iter_read
filemap_read
filemap_get_pages <= detect signal
erofs_fileio_ki_complete <= set all folios uptodate
This patch addresses this by setting short read bio with an error
directly.
Fixes: bc804a8d7e86 ("erofs: handle end of filesystem properly for file-backed mounts")
Reported-by: chenguanyou <chenguanyou@xiaomi.com>
Signed-off-by: Yunlei He <heyunlei@xiaomi.com>
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sabrina Dubroca <sd@queasysnail.net>
Date: Tue Feb 24 00:05:14 2026 +0100
esp: fix skb leak with espintcp and async crypto
[ Upstream commit 0c0eef8ccd2413b0a10eb6bbd3442333b1e64dd2 ]
When the TX queue for espintcp is full, esp_output_tail_tcp will
return an error and not free the skb, because with synchronous crypto,
the common xfrm output code will drop the packet for us.
With async crypto (esp_output_done), we need to drop the skb when
esp_output_tail_tcp returns an error.
Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Theodore Ts'o <tytso@mit.edu>
Date: Fri Mar 27 02:13:15 2026 -0400
ext4: always drain queued discard work in ext4_mb_release()
commit 9ee29d20aab228adfb02ca93f87fb53c56c2f3af upstream.
While reviewing recent ext4 patch[1], Sashiko raised the following
concern[2]:
> If the filesystem is initially mounted with the discard option,
> deleting files will populate sbi->s_discard_list and queue
> s_discard_work. If it is then remounted with nodiscard, the
> EXT4_MOUNT_DISCARD flag is cleared, but the pending s_discard_work is
> neither cancelled nor flushed.
[1] https://lore.kernel.org/r/20260319094545.19291-1-qiang.zhang@linux.dev/
[2] https://sashiko.dev/#/patchset/20260319094545.19291-1-qiang.zhang%40linux.dev
The concern was valid, but it had nothing to do with the patch[1].
One of the problems with Sashiko in its current (early) form is that
it will detect pre-existing issues and report it as a problem with the
patch that it is reviewing.
In practice, it would be hard to hit deliberately (unless you are a
malicious syzkaller fuzzer), since it would involve mounting the file
system with -o discard, and then deleting a large number of files,
remounting the file system with -o nodiscard, and then immediately
unmounting the file system before the queued discard work has a change
to drain on its own.
Fix it because it's a real bug, and to avoid Sashiko from raising this
concern when analyzing future patches to mballoc.c.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fixes: 55cdd0af2bc5 ("ext4: get discard out of jbd2 commit kthread contex")
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ye Bin <yebin10@huawei.com>
Date: Mon Mar 2 21:46:19 2026 +0800
ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
commit 46066e3a06647c5b186cc6334409722622d05c44 upstream.
There's issue as follows:
...
EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117
EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost
EXT4-fs (mmcblk0p1): error count since last fsck: 1
EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760
EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760
...
According to the log analysis, blocks are always requested from the
corrupted block group. This may happen as follows:
ext4_mb_find_by_goal
ext4_mb_load_buddy
ext4_mb_load_buddy_gfp
ext4_mb_init_cache
ext4_read_block_bitmap_nowait
ext4_wait_block_bitmap
ext4_validate_block_bitmap
if (!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
return -EFSCORRUPTED; // There's no logs.
if (err)
return err; // Will return error
ext4_lock_group(ac->ac_sb, group);
if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable
goto out;
After commit 9008a58e5dce ("ext4: make the bitmap read routines return
real error codes") merged, Commit 163a203ddb36 ("ext4: mark block group
as corrupt on block bitmap error") is no real solution for allocating
blocks from corrupted block groups. This is because if
'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then
'ext4_mb_load_buddy()' may return an error. This means that the block
allocation will fail.
Therefore, check block group if corrupted when ext4_mb_load_buddy()
returns error.
Fixes: 163a203ddb36 ("ext4: mark block group as corrupt on block bitmap error")
Fixes: 9008a58e5dce ("ext4: make the bitmap read routines return real error codes")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260302134619.3145520-1-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Edward Adam Davis <eadavis@qq.com>
Date: Fri Mar 6 09:31:58 2026 +0800
ext4: avoid infinite loops caused by residual data
commit 5422fe71d26d42af6c454ca9527faaad4e677d6c upstream.
On the mkdir/mknod path, when mapping logical blocks to physical blocks,
if inserting a new extent into the extent tree fails (in this example,
because the file system disabled the huge file feature when marking the
inode as dirty), ext4_ext_map_blocks() only calls ext4_free_blocks() to
reclaim the physical block without deleting the corresponding data in
the extent tree. This causes subsequent mkdir operations to reference
the previously reclaimed physical block number again, even though this
physical block is already being used by the xattr block. Therefore, a
situation arises where both the directory and xattr are using the same
buffer head block in memory simultaneously.
The above causes ext4_xattr_block_set() to enter an infinite loop about
"inserted" and cannot release the inode lock, ultimately leading to the
143s blocking problem mentioned in [1].
If the metadata is corrupted, then trying to remove some extent space
can do even more harm. Also in case EXT4_GET_BLOCKS_DELALLOC_RESERVE
was passed, remove space wrongly update quota information.
Jan Kara suggests distinguishing between two cases:
1) The error is ENOSPC or EDQUOT - in this case the filesystem is fully
consistent and we must maintain its consistency including all the
accounting. However these errors can happen only early before we've
inserted the extent into the extent tree. So current code works correctly
for this case.
2) Some other error - this means metadata is corrupted. We should strive to
do as few modifications as possible to limit damage. So I'd just skip
freeing of allocated blocks.
[1]
INFO: task syz.0.17:5995 blocked for more than 143 seconds.
Call Trace:
inode_lock_nested include/linux/fs.h:1073 [inline]
__start_dirop fs/namei.c:2923 [inline]
start_dirop fs/namei.c:2934 [inline]
Reported-by: syzbot+512459401510e2a9a39f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1659aaaaa8d9d11265d7
Tested-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
Reported-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=512459401510e2a9a39f
Tested-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: syzbot+512459401510e2a9a39f@syzkaller.appspotmail.com
Link: https://patch.msgid.link/tencent_43696283A68450B761D76866C6F360E36705@qq.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Deepanshu Kartikey <kartikey406@gmail.com>
Date: Sat Feb 7 10:06:07 2026 +0530
ext4: convert inline data to extents when truncate exceeds inline size
commit ed9356a30e59c7cc3198e7fc46cfedf3767b9b17 upstream.
Add a check in ext4_setattr() to convert files from inline data storage
to extent-based storage when truncate() grows the file size beyond the
inline capacity. This prevents the filesystem from entering an
inconsistent state where the inline data flag is set but the file size
exceeds what can be stored inline.
Without this fix, the following sequence causes a kernel BUG_ON():
1. Mount filesystem with inode that has inline flag set and small size
2. truncate(file, 50MB) - grows size but inline flag remains set
3. sendfile() attempts to write data
4. ext4_write_inline_data() hits BUG_ON(write_size > inline_capacity)
The crash occurs because ext4_write_inline_data() expects inline storage
to accommodate the write, but the actual inline capacity (~60 bytes for
i_block + ~96 bytes for xattrs) is far smaller than the file size and
write request.
The fix checks if the new size from setattr exceeds the inode's actual
inline capacity (EXT4_I(inode)->i_inline_size) and converts the file to
extent-based storage before proceeding with the size change.
This addresses the root cause by ensuring the inline data flag and file
size remain consistent during truncate operations.
Reported-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7de5fe447862fc37576f
Tested-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Link: https://patch.msgid.link/20260207043607.1175976-1-kartikey406@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zhang Yi <yi.zhang@huawei.com>
Date: Sat Jan 31 17:11:56 2026 +0800
ext4: do not check fast symlink during orphan recovery
commit 84e21e3fb8fd99ea460eb7274584750d11cf3e9f upstream.
Commit '5f920d5d6083 ("ext4: verify fast symlink length")' causes the
generic/475 test to fail during orphan cleanup of zero-length symlinks.
generic/475 84s ... _check_generic_filesystem: filesystem on /dev/vde is inconsistent
The fsck reports are provided below:
Deleted inode 9686 has zero dtime.
Deleted inode 158230 has zero dtime.
...
Inode bitmap differences: -9686 -158230
Orphan file (inode 12) block 13 is not clean.
Failed to initialize orphan file.
In ext4_symlink(), a newly created symlink can be added to the orphan
list due to ENOSPC. Its data has not been initialized, and its size is
zero. Therefore, we need to disregard the length check of the symbolic
link when cleaning up orphan inodes. Instead, we should ensure that the
nlink count is zero.
Fixes: 5f920d5d6083 ("ext4: verify fast symlink length")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260131091156.1733648-1-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jan Kara <jack@suse.cz>
Date: Mon Feb 16 17:48:44 2026 +0100
ext4: fix fsync(2) for nojournal mode
commit 1308255bbf8452762f89f44f7447ce137ecdbcff upstream.
When inode metadata is changed, we sometimes just call
ext4_mark_inode_dirty() to track modified metadata. This copies inode
metadata into block buffer which is enough when we are journalling
metadata. However when we are running in nojournal mode we currently
fail to write the dirtied inode buffer during fsync(2) because the inode
is not marked as dirty. Use explicit ext4_write_inode() call to make
sure the inode table buffer is written to the disk. This is a band aid
solution but proper solution requires a much larger rewrite including
changes in metadata bh tracking infrastructure.
Reported-by: Free Ekanayaka <free.ekanayaka@gmail.com>
Link: https://lore.kernel.org/all/87il8nhxdm.fsf@x1.mail-host-address-is-not-set/
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20260216164848.3074-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Baokun Li <libaokun@linux.alibaba.com>
Date: Mon Mar 23 14:08:36 2026 +0800
ext4: fix iloc.bh leak in ext4_fc_replay_inode() error paths
commit ec0a7500d8eace5b4f305fa0c594dd148f0e8d29 upstream.
During code review, Joseph found that ext4_fc_replay_inode() calls
ext4_get_fc_inode_loc() to get the inode location, which holds a
reference to iloc.bh that must be released via brelse().
However, several error paths jump to the 'out' label without
releasing iloc.bh:
- ext4_handle_dirty_metadata() failure
- sync_dirty_buffer() failure
- ext4_mark_inode_used() failure
- ext4_iget() failure
Fix this by introducing an 'out_brelse' label placed just before
the existing 'out' label to ensure iloc.bh is always released.
Additionally, make ext4_fc_replay_inode() propagate errors
properly instead of always returning 0.
Reported-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260323060836.3452660-1-libaokun@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Simon Weber <simon.weber.39@gmail.com>
Date: Sat Feb 7 10:53:03 2026 +0100
ext4: fix journal credit check when setting fscrypt context
commit b1d682f1990c19fb1d5b97d13266210457092bcd upstream.
Fix an issue arising when ext4 features has_journal, ea_inode, and encrypt
are activated simultaneously, leading to ENOSPC when creating an encrypted
file.
Fix by passing XATTR_CREATE flag to xattr_set_handle function if a handle
is specified, i.e., when the function is called in the control flow of
creating a new inode. This aligns the number of jbd2 credits set_handle
checks for with the number allocated for creating a new inode.
ext4_set_context must not be called with a non-null handle (fs_data) if
fscrypt context xattr is not guaranteed to not exist yet. The only other
usage of this function currently is when handling the ioctl
FS_IOC_SET_ENCRYPTION_POLICY, which calls it with fs_data=NULL.
Fixes: c1a5d5f6ab21eb7e ("ext4: improve journal credit handling in set xattr paths")
Co-developed-by: Anthony Durrer <anthonydev@fastmail.com>
Signed-off-by: Anthony Durrer <anthonydev@fastmail.com>
Signed-off-by: Simon Weber <simon.weber.39@gmail.com>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260207100148.724275-4-simon.weber.39@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jan Kara <jack@suse.cz>
Date: Thu Feb 5 10:22:24 2026 +0100
ext4: fix stale xarray tags after writeback
commit f4a2b42e78914ff15630e71289adc589c3a8eb45 upstream.
There are cases where ext4_bio_write_page() gets called for a page which
has no buffers to submit. This happens e.g. when the part of the file is
actually a hole, when we cannot allocate blocks due to being called from
jbd2, or in data=journal mode when checkpointing writes the buffers
earlier. In these cases we just return from ext4_bio_write_page()
however if the page didn't need redirtying, we will leave stale DIRTY
and/or TOWRITE tags in xarray because those get cleared only in
__folio_start_writeback(). As a result we can leave these tags set in
mappings even after a final sync on filesystem that's getting remounted
read-only or that's being frozen. Various assertions can then get upset
when writeback is started on such filesystems (Gerald reported assertion
in ext4_journal_check_start() firing).
Fix the problem by cycling the page through writeback state even if we
decide nothing needs to be written for it so that xarray tags get
properly updated. This is slightly silly (we could update the xarray
tags directly) but I don't think a special helper messing with xarray
tags is really worth it in this relatively rare corner case.
Reported-by: Gerald Yang <gerald.yang@canonical.com>
Link: https://lore.kernel.org/all/20260128074515.2028982-1-gerald.yang@canonical.com
Fixes: dff4ac75eeee ("ext4: move keep_towrite handling to ext4_bio_write_page()")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260205092223.21287-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zqiang <qiang.zhang@linux.dev>
Date: Thu Mar 19 17:45:45 2026 +0800
ext4: fix the might_sleep() warnings in kvfree()
commit 496bb99b7e66f48b178126626f47e9ba79e2d0fa upstream.
Use the kvfree() in the RCU read critical section can trigger
the following warnings:
EXT4-fs (vdb): unmounting filesystem cd983e5b-3c83-4f5a-a136-17b00eb9d018.
WARNING: suspicious RCU usage
./include/linux/rcupdate.h:409 Illegal context switch in RCU read-side critical section!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
Call Trace:
<TASK>
dump_stack_lvl+0xbb/0xd0
dump_stack+0x14/0x20
lockdep_rcu_suspicious+0x15a/0x1b0
__might_resched+0x375/0x4d0
? put_object.part.0+0x2c/0x50
__might_sleep+0x108/0x160
vfree+0x58/0x910
? ext4_group_desc_free+0x27/0x270
kvfree+0x23/0x40
ext4_group_desc_free+0x111/0x270
ext4_put_super+0x3c8/0xd40
generic_shutdown_super+0x14c/0x4a0
? __pfx_shrinker_free+0x10/0x10
kill_block_super+0x40/0x90
ext4_kill_sb+0x6d/0xb0
deactivate_locked_super+0xb4/0x180
deactivate_super+0x7e/0xa0
cleanup_mnt+0x296/0x3e0
__cleanup_mnt+0x16/0x20
task_work_run+0x157/0x250
? __pfx_task_work_run+0x10/0x10
? exit_to_user_mode_loop+0x6a/0x550
exit_to_user_mode_loop+0x102/0x550
do_syscall_64+0x44a/0x500
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
BUG: sleeping function called from invalid context at mm/vmalloc.c:3441
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 556, name: umount
preempt_count: 1, expected: 0
CPU: 3 UID: 0 PID: 556 Comm: umount
Call Trace:
<TASK>
dump_stack_lvl+0xbb/0xd0
dump_stack+0x14/0x20
__might_resched+0x275/0x4d0
? put_object.part.0+0x2c/0x50
__might_sleep+0x108/0x160
vfree+0x58/0x910
? ext4_group_desc_free+0x27/0x270
kvfree+0x23/0x40
ext4_group_desc_free+0x111/0x270
ext4_put_super+0x3c8/0xd40
generic_shutdown_super+0x14c/0x4a0
? __pfx_shrinker_free+0x10/0x10
kill_block_super+0x40/0x90
ext4_kill_sb+0x6d/0xb0
deactivate_locked_super+0xb4/0x180
deactivate_super+0x7e/0xa0
cleanup_mnt+0x296/0x3e0
__cleanup_mnt+0x16/0x20
task_work_run+0x157/0x250
? __pfx_task_work_run+0x10/0x10
? exit_to_user_mode_loop+0x6a/0x550
exit_to_user_mode_loop+0x102/0x550
do_syscall_64+0x44a/0x500
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The above scenarios occur in initialization failures and teardown
paths, there are no parallel operations on the resources released
by kvfree(), this commit therefore remove rcu_read_lock/unlock() and
use rcu_access_pointer() instead of rcu_dereference() operations.
Fixes: 7c990728b99e ("ext4: fix potential race between s_flex_groups online resizing and access")
Fixes: df3da4ea5a0f ("ext4: fix potential race between s_group_info online resizing and access")
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>
Link: https://patch.msgid.link/20260319094545.19291-1-qiang.zhang@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jiayuan Chen <jiayuan.chen@shopee.com>
Date: Thu Mar 19 20:03:35 2026 +0800
ext4: fix use-after-free in update_super_work when racing with umount
commit d15e4b0a418537aafa56b2cb80d44add83e83697 upstream.
Commit b98535d09179 ("ext4: fix bug_on in start_this_handle during umount
filesystem") moved ext4_unregister_sysfs() before flushing s_sb_upd_work
to prevent new error work from being queued via /proc/fs/ext4/xx/mb_groups
reads during unmount. However, this introduced a use-after-free because
update_super_work calls ext4_notify_error_sysfs() -> sysfs_notify() which
accesses the kobject's kernfs_node after it has been freed by kobject_del()
in ext4_unregister_sysfs():
update_super_work ext4_put_super
----------------- --------------
ext4_unregister_sysfs(sb)
kobject_del(&sbi->s_kobj)
__kobject_del()
sysfs_remove_dir()
kobj->sd = NULL
sysfs_put(sd)
kernfs_put() // RCU free
ext4_notify_error_sysfs(sbi)
sysfs_notify(&sbi->s_kobj)
kn = kobj->sd // stale pointer
kernfs_get(kn) // UAF on freed kernfs_node
ext4_journal_destroy()
flush_work(&sbi->s_sb_upd_work)
Instead of reordering the teardown sequence, fix this by making
ext4_notify_error_sysfs() detect that sysfs has already been torn down
by checking s_kobj.state_in_sysfs, and skipping the sysfs_notify() call
in that case. A dedicated mutex (s_error_notify_mutex) serializes
ext4_notify_error_sysfs() against kobject_del() in ext4_unregister_sysfs()
to prevent TOCTOU races where the kobject could be deleted between the
state_in_sysfs check and the sysfs_notify() call.
Fixes: b98535d09179 ("ext4: fix bug_on in start_this_handle during umount filesystem")
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260319120336.157873-1-jiayuan.chen@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Theodore Ts'o <tytso@mit.edu>
Date: Thu Mar 26 00:58:34 2026 -0400
ext4: handle wraparound when searching for blocks for indirect mapped blocks
commit bb81702370fad22c06ca12b6e1648754dbc37e0f upstream.
Commit 4865c768b563 ("ext4: always allocate blocks only from groups
inode can use") restricts what blocks will be allocated for indirect
block based files to block numbers that fit within 32-bit block
numbers.
However, when using a review bot running on the latest Gemini LLM to
check this commit when backporting into an LTS based kernel, it raised
this concern:
If ac->ac_g_ex.fe_group is >= ngroups (for instance, if the goal
group was populated via stream allocation from s_mb_last_groups),
then start will be >= ngroups.
Does this allow allocating blocks beyond the 32-bit limit for
indirect block mapped files? The commit message mentions that
ext4_mb_scan_groups_linear() takes care to not select unsupported
groups. However, its loop uses group = *start, and the very first
iteration will call ext4_mb_scan_group() with this unsupported
group because next_linear_group() is only called at the end of the
iteration.
After reviewing the code paths involved and considering the LLM
review, I determined that this can happen when there is a file system
where some files/directories are extent-mapped and others are
indirect-block mapped. To address this, add a safety clamp in
ext4_mb_scan_groups().
Fixes: 4865c768b563 ("ext4: always allocate blocks only from groups inode can use")
Cc: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://patch.msgid.link/20260326045834.1175822-1-tytso@mit.edu
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jan Kara <jack@suse.cz>
Date: Mon Feb 16 17:48:43 2026 +0100
ext4: make recently_deleted() properly work with lazy itable initialization
commit bd060afa7cc3e0ad30afa9ecc544a78638498555 upstream.
recently_deleted() checks whether inode has been used in the near past.
However this can give false positive result when inode table is not
initialized yet and we are in fact comparing to random garbage (or stale
itable block of a filesystem before mkfs). Ultimately this results in
uninitialized inodes being skipped during inode allocation and possibly
they are never initialized and thus e2fsck complains. Verify if the
inode has been initialized before checking for dtime.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20260216164848.3074-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Li Chen <me@linux.beauty>
Date: Wed Feb 25 16:26:16 2026 +0800
ext4: publish jinode after initialization
commit 1aec30021edd410b986c156f195f3d23959a9d11 upstream.
ext4_inode_attach_jinode() publishes ei->jinode to concurrent users.
It used to set ei->jinode before jbd2_journal_init_jbd_inode(),
allowing a reader to observe a non-NULL jinode with i_vfs_inode
still unset.
The fast commit flush path can then pass this jinode to
jbd2_wait_inode_data(), which dereferences i_vfs_inode->i_mapping and
may crash.
Below is the crash I observe:
```
BUG: unable to handle page fault for address: 000000010beb47f4
PGD 110e51067 P4D 110e51067 PUD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 1 UID: 0 PID: 4850 Comm: fc_fsync_bench_ Not tainted 6.18.0-00764-g795a690c06a5 #1 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-2-2 04/01/2014
RIP: 0010:xas_find_marked+0x3d/0x2e0
Code: e0 03 48 83 f8 02 0f 84 f0 01 00 00 48 8b 47 08 48 89 c3 48 39 c6 0f 82 fd 01 00 00 48 85 c9 74 3d 48 83 f9 03 77 63 4c 8b 0f <49> 8b 71 08 48 c7 47 18 00 00 00 00 48 89 f1 83 e1 03 48 83 f9 02
RSP: 0018:ffffbbee806e7bf0 EFLAGS: 00010246
RAX: 000000000010beb4 RBX: 000000000010beb4 RCX: 0000000000000003
RDX: 0000000000000001 RSI: 0000002000300000 RDI: ffffbbee806e7c10
RBP: 0000000000000001 R08: 0000002000300000 R09: 000000010beb47ec
R10: ffff9ea494590090 R11: 0000000000000000 R12: 0000002000300000
R13: ffffbbee806e7c90 R14: ffff9ea494513788 R15: ffffbbee806e7c88
FS: 00007fc2f9e3e6c0(0000) GS:ffff9ea6b1444000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000010beb47f4 CR3: 0000000119ac5000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
filemap_get_folios_tag+0x87/0x2a0
__filemap_fdatawait_range+0x5f/0xd0
? srso_alias_return_thunk+0x5/0xfbef5
? __schedule+0x3e7/0x10c0
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? cap_safe_nice+0x37/0x70
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
filemap_fdatawait_range_keep_errors+0x12/0x40
ext4_fc_commit+0x697/0x8b0
? ext4_file_write_iter+0x64b/0x950
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? vfs_write+0x356/0x480
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
ext4_sync_file+0xf7/0x370
do_fsync+0x3b/0x80
? syscall_trace_enter+0x108/0x1d0
__x64_sys_fdatasync+0x16/0x20
do_syscall_64+0x62/0x2c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
...
```
Fix this by initializing the jbd2_inode first.
Use smp_wmb() and WRITE_ONCE() to publish ei->jinode after
initialization. Readers use READ_ONCE() to fetch the pointer.
Fixes: a361293f5fede ("jbd2: Fix oops in jbd2_journal_file_inode()")
Cc: stable@vger.kernel.org
Signed-off-by: Li Chen <me@linux.beauty>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260225082617.147957-1-me@linux.beauty
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Helen Koike <koike@igalia.com>
Date: Tue Mar 17 11:23:10 2026 -0300
ext4: reject mount if bigalloc with s_first_data_block != 0
commit 3822743dc20386d9897e999dbb990befa3a5b3f8 upstream.
bigalloc with s_first_data_block != 0 is not supported, reject mounting
it.
Signed-off-by: Helen Koike <koike@igalia.com>
Suggested-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: syzbot+b73703b873a33d8eb8f6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b73703b873a33d8eb8f6
Link: https://patch.msgid.link/20260317142325.135074-1-koike@igalia.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yuto Ohnuki <ytohnuki@amazon.com>
Date: Mon Feb 23 12:33:46 2026 +0000
ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio
commit 356227096eb66e41b23caf7045e6304877322edf upstream.
Replace BUG_ON() with proper error handling when inline data size
exceeds PAGE_SIZE. This prevents kernel panic and allows the system to
continue running while properly reporting the filesystem corruption.
The error is logged via ext4_error_inode(), the buffer head is released
to prevent memory leak, and -EFSCORRUPTED is returned to indicate
filesystem corruption.
Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Link: https://patch.msgid.link/20260223123345.14838-2-ytohnuki@amazon.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ye Bin <yebin10@huawei.com>
Date: Tue Mar 3 09:22:42 2026 +0800
ext4: test if inode's all dirty pages are submitted to disk
commit 73bf12adbea10b13647864cd1c62410d19e21086 upstream.
The commit aa373cf55099 ("writeback: stop background/kupdate works from
livelocking other works") introduced an issue where unmounting a filesystem
in a multi-logical-partition scenario could lead to batch file data loss.
This problem was not fixed until the commit d92109891f21 ("fs/writeback:
bail out if there is no more inodes for IO and queued once"). It took
considerable time to identify the root cause. Additionally, in actual
production environments, we frequently encountered file data loss after
normal system reboots. Therefore, we are adding a check in the inode
release flow to verify whether all dirty pages have been flushed to disk,
in order to determine whether the data loss is caused by a logic issue in
the filesystem code.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260303012242.3206465-1-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tejas Bharambe <tejas.bharambe@outlook.com>
Date: Tue Mar 3 23:14:34 2026 -0800
ext4: validate p_idx bounds in ext4_ext_correct_indexes
commit 2acb5c12ebd860f30e4faf67e6cc8c44ddfe5fe8 upstream.
ext4_ext_correct_indexes() walks up the extent tree correcting
index entries when the first extent in a leaf is modified. Before
accessing path[k].p_idx->ei_block, there is no validation that
p_idx falls within the valid range of index entries for that
level.
If the on-disk extent header contains a corrupted or crafted
eh_entries value, p_idx can point past the end of the allocated
buffer, causing a slab-out-of-bounds read.
Fix this by validating path[k].p_idx against EXT_LAST_INDEX() at
both access sites: before the while loop and inside it. Return
-EFSCORRUPTED if the index pointer is out of range, consistent
with how other bounds violations are handled in the ext4 extent
tree code.
Reported-by: syzbot+04c4e65cab786a2e5b7e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=04c4e65cab786a2e5b7e
Signed-off-by: Tejas Bharambe <tejas.bharambe@outlook.com>
Link: https://patch.msgid.link/JH0PR06MB66326016F9B6AD24097D232B897CA@JH0PR06MB6632.apcprd06.prod.outlook.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Davidlohr Bueso <dave@stgolabs.net>
Date: Wed Mar 25 17:17:59 2026 -0700
futex: Clear stale exiting pointer in futex_lock_pi() retry path
commit 210d36d892de5195e6766c45519dfb1e65f3eb83 upstream.
Fuzzying/stressing futexes triggered:
WARNING: kernel/futex/core.c:825 at wait_for_owner_exiting+0x7a/0x80, CPU#11: futex_lock_pi_s/524
When futex_lock_pi_atomic() sees the owner is exiting, it returns -EBUSY
and stores a refcounted task pointer in 'exiting'.
After wait_for_owner_exiting() consumes that reference, the local pointer
is never reset to nil. Upon a retry, if futex_lock_pi_atomic() returns a
different error, the bogus pointer is passed to wait_for_owner_exiting().
CPU0 CPU1 CPU2
futex_lock_pi(uaddr)
// acquires the PI futex
exit()
futex_cleanup_begin()
futex_state = EXITING;
futex_lock_pi(uaddr)
futex_lock_pi_atomic()
attach_to_pi_owner()
// observes EXITING
*exiting = owner; // takes ref
return -EBUSY
wait_for_owner_exiting(-EBUSY, owner)
put_task_struct(); // drops ref
// exiting still points to owner
goto retry;
futex_lock_pi_atomic()
lock_pi_update_atomic()
cmpxchg(uaddr)
*uaddr ^= WAITERS // whatever
// value changed
return -EAGAIN;
wait_for_owner_exiting(-EAGAIN, exiting) // stale
WARN_ON_ONCE(exiting)
Fix this by resetting upon retry, essentially aligning it with requeue_pi.
Fixes: 3ef240eaff36 ("futex: Prevent exit livelock")
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260326001759.4129680-1-dave@stgolabs.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hao-Yu Yang <naup96721@gmail.com>
Date: Fri Mar 13 20:47:56 2026 +0800
futex: Fix UaF between futex_key_to_node_opt() and vma_replace_policy()
[ Upstream commit 190a8c48ff623c3d67cb295b4536a660db2012aa ]
During futex_key_to_node_opt() execution, vma->vm_policy is read under
speculative mmap lock and RCU. Concurrently, mbind() may call
vma_replace_policy() which frees the old mempolicy immediately via
kmem_cache_free().
This creates a race where __futex_key_to_node() dereferences a freed
mempolicy pointer, causing a use-after-free read of mpol->mode.
[ 151.412631] BUG: KASAN: slab-use-after-free in __futex_key_to_node (kernel/futex/core.c:349)
[ 151.414046] Read of size 2 at addr ffff888001c49634 by task e/87
[ 151.415969] Call Trace:
[ 151.416732] __asan_load2 (mm/kasan/generic.c:271)
[ 151.416777] __futex_key_to_node (kernel/futex/core.c:349)
[ 151.416822] get_futex_key (kernel/futex/core.c:374 kernel/futex/core.c:386 kernel/futex/core.c:593)
Fix by adding rcu to __mpol_put().
Fixes: c042c505210d ("futex: Implement FUTEX2_MPOL")
Reported-by: Hao-Yu Yang <naup96721@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hao-Yu Yang <naup96721@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Link: https://patch.msgid.link/20260324174418.GB1850007@noisy.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Thu Mar 26 13:35:53 2026 +0100
futex: Require sys_futex_requeue() to have identical flags
[ Upstream commit 19f94b39058681dec64a10ebeb6f23fe7fc3f77a ]
Nicholas reported that his LLM found it was possible to create a UaF
when sys_futex_requeue() is used with different flags. The initial
motivation for allowing different flags was the variable sized futex,
but since that hasn't been merged (yet), simply mandate the flags are
identical, as is the case for the old style sys_futex() requeue
operations.
Fixes: 0f4b5f972216 ("futex: Add sys_futex_requeue()")
Reported-by: Nicholas Carlini <npc@anthropic.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Tue Feb 24 10:00:02 2026 +0100
HID: apple: Add EPOMAKER TH87 to the non-apple keyboards list
[ Upstream commit 7c698de0dc5daa1e1a5fd1f0c6aa1b6bb2f5d867 ]
EPOMAKER TH87 has the very same ID as Apple Aluminum keyboard
(05ac:024f) although it doesn't work as expected in compatible way.
Put three entries to the non-apple keyboards list to exclude this
device: one for BT ("TH87"), one for USB ("HFD Epomaker TH87") and one
for dongle ("2.4G Wireless Receiver").
Link: https://bugzilla.suse.com/show_bug.cgi?id=1258455
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Günther Noack <gnoack@google.com>
Date: Thu Feb 19 16:43:36 2026 +0100
HID: apple: avoid memory leak in apple_report_fixup()
[ Upstream commit 239c15116d80f67d32f00acc34575f1a6b699613 ]
The apple_report_fixup() function was returning a
newly kmemdup()-allocated buffer, but never freeing it.
The caller of report_fixup() does not take ownership of the returned
pointer, but it *is* permitted to return a sub-portion of the input
rdesc, whose lifetime is managed by the caller.
Assisted-by: Gemini-CLI:Google Gemini 3
Signed-off-by: Günther Noack <gnoack@google.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Denis Benato <denis.benato@linux.dev>
Date: Mon Feb 16 18:55:38 2026 +0100
HID: asus: add xg mobile 2023 external hardware support
[ Upstream commit 377f8e788945d45b012ed9cfc35ca56c02e86cd8 ]
XG mobile stations have the 0x5a endpoint and has to be initialized:
add them to hid-asus.
Signed-off-by: Denis Benato <denis.benato@linux.dev>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Günther Noack <gnoack@google.com>
Date: Thu Feb 19 16:43:38 2026 +0100
HID: asus: avoid memory leak in asus_report_fixup()
[ Upstream commit 2bad24c17742fc88973d6aea526ce1353f5334a3 ]
The asus_report_fixup() function was returning a newly allocated
kmemdup()-allocated buffer, but never freeing it. Switch to
devm_kzalloc() to ensure the memory is managed and freed automatically
when the device is removed.
The caller of report_fixup() does not take ownership of the returned
pointer, but it is permitted to return a pointer whose lifetime is at
least that of the input buffer.
Also fix a harmless out-of-bounds read by copying only the original
descriptor size.
Assisted-by: Gemini-CLI:Google Gemini 3
Signed-off-by: Günther Noack <gnoack@google.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhang Lixu <lixu.zhang@intel.com>
Date: Tue Feb 3 08:55:07 2026 +0800
HID: intel-ish-hid: ipc: Add Nova Lake-H/S PCI device IDs
[ Upstream commit 22f8bcec5aeb05104b3eaa950cb5a345e95f0aa8 ]
Add device IDs of Nova Lake-H and Nova Lake-S into ishtp support list.
Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Günther Noack <gnoack@google.com>
Date: Thu Feb 19 16:43:37 2026 +0100
HID: magicmouse: avoid memory leak in magicmouse_report_fixup()
[ Upstream commit 91e8c6e601bdc1ccdf886479b6513c01c7e51c2c ]
The magicmouse_report_fixup() function was returning a
newly kmemdup()-allocated buffer, but never freeing it.
The caller of report_fixup() does not take ownership of the returned
pointer, but it *is* permitted to return a sub-portion of the input
rdesc, whose lifetime is managed by the caller.
Assisted-by: Gemini-CLI:Google Gemini 3
Signed-off-by: Günther Noack <gnoack@google.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Julius Lehmann <lehmanju@devpi.de>
Date: Sat Feb 14 20:34:21 2026 +0100
HID: magicmouse: fix battery reporting for Apple Magic Trackpad 2
[ Upstream commit 5f3518d77419255f8b12bb23c8ec22acbeb6bc5b ]
Battery reporting does not work for the Apple Magic Trackpad 2 if it is
connected via USB. The current hid descriptor fixup code checks for a
hid descriptor length of exactly 83 bytes. If the hid descriptor is
larger, which is the case for newer apple mice, the fixup is not
applied.
This fix checks for hid descriptor sizes greater/equal 83 bytes which
applies the fixup for newer devices as well.
Signed-off-by: Julius Lehmann <lehmanju@devpi.de>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Romain Sioen <romain.sioen@microchip.com>
Date: Fri Feb 6 17:32:58 2026 +0100
HID: mcp2221: cancel last I2C command on read error
[ Upstream commit e31b556c0ba21f20c298aa61181b96541140b7b9 ]
When an I2C SMBus read operation fails, the MCP2221 internal state machine
may not reset correctly, causing subsequent transactions to fail.
By adding a short delay and explicitly cancelling the last command,
we ensure the device is ready for the next operation.
Fix an issue where i2cdetect was not able to detect all devices correctly
on the bus.
Signed-off-by: Romain Sioen <romain.sioen@microchip.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sanman Pradhan <psanman@juniper.net>
Date: Wed Mar 25 05:13:06 2026 +0000
hwmon: (adm1177) fix sysfs ABI violation and current unit conversion
[ Upstream commit bf08749a6abb6d1959bfdc0edc32c640df407558 ]
The adm1177 driver exposes the current alert threshold through
hwmon_curr_max_alarm. This violates the hwmon sysfs ABI, where
*_alarm attributes are read-only status flags and writable thresholds
must use currN_max.
The driver also stores the threshold internally in microamps, while
currN_max is defined in milliamps. Convert the threshold accordingly
on both the read and write paths.
Widen the cached threshold and related calculations to 64 bits so
that small shunt resistor values do not cause truncation or overflow.
Also use 64-bit arithmetic for the mA/uA conversions, clamp writes
to the range the hardware can represent, and propagate failures from
adm1177_write_alert_thr() instead of silently ignoring them.
Update the hwmon documentation to reflect the attribute rename and
the correct units returned by the driver.
Fixes: 09b08ac9e8d5 ("hwmon: (adm1177) Add ADM1177 Hot Swap Controller and Digital Power Monitor driver")
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20260325051246.28262-1-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sanman Pradhan <psanman@juniper.net>
Date: Mon Mar 23 00:24:25 2026 +0000
hwmon: (peci/cputemp) Fix crit_hyst returning delta instead of absolute temperature
commit 0adc752b4f7d82af7bd14f7cad3091b3b5d702ba upstream.
The hwmon sysfs ABI expects tempN_crit_hyst to report the temperature at
which the critical condition clears, not the hysteresis delta from the
critical limit.
The peci cputemp driver currently returns tjmax - tcontrol for
crit_hyst_type, which is the hysteresis margin rather than the
corresponding absolute temperature.
Return tcontrol directly, and update the documentation accordingly.
Fixes: bf3608f338e9 ("hwmon: peci: Add cputemp driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260323002352.93417-2-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sanman Pradhan <psanman@juniper.net>
Date: Mon Mar 23 00:24:37 2026 +0000
hwmon: (peci/cputemp) Fix off-by-one in cputemp_is_visible()
commit b0c9d8ae71509f25690d57f2efddebf7f4b12194 upstream.
cputemp_is_visible() validates the channel index against
CPUTEMP_CHANNEL_NUMS, but currently uses '>' instead of '>='.
As a result, channel == CPUTEMP_CHANNEL_NUMS is not rejected even though
valid indices are 0 .. CPUTEMP_CHANNEL_NUMS - 1.
Fix the bounds check by using '>=' so invalid channel indices are
rejected before indexing the core bitmap.
Fixes: bf3608f338e9 ("hwmon: peci: Add cputemp driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260323002352.93417-3-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Guenter Roeck <linux@roeck-us.net>
Date: Tue Mar 24 18:54:11 2026 -0700
hwmon: (pmbus) Introduce the concept of "write-only" attributes
[ Upstream commit cd658475e7694d58e1c40dabc1dacf8431ccedb2 ]
Attributes intended to clear sensor history are intended to be writeable
only. Reading those attributes today results in reporting more or less
random values. To avoid ABI surprises, have those attributes explicitly
return 0 when reading.
Fixes: 787c095edaa9d ("hwmon: (pmbus/core) Add support for rated attributes")
Reviewed-by: Sanman Pradhan <psanman@juniper.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guenter Roeck <linux@roeck-us.net>
Date: Tue Mar 24 16:41:07 2026 -0700
hwmon: (pmbus) Mark lowest/average/highest/rated attributes as read-only
[ Upstream commit 805a5bd1c3f307d45ae4e9cf8915ef16d585a54a ]
Writing those attributes is not supported, so mark them as read-only.
Prior to this change, attempts to write into these attributes returned
an error.
Mark boolean fields in struct pmbus_limit_attr and in struct
pmbus_sensor_attr as bit fields to reduce configuration data size.
The data is scanned only while probing, so performance is not a concern.
Fixes: 6f183d33a02e6 ("hwmon: (pmbus) Add support for peak attributes")
Reviewed-by: Sanman Pradhan <psanman@juniper.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guenter Roeck <linux@roeck-us.net>
Date: Sun Mar 22 09:12:33 2026 -0700
hwmon: (pmbus/core) Protect regulator operations with mutex
[ Upstream commit 754bd2b4a084b90b5e7b630e1f423061a9b9b761 ]
The regulator operations pmbus_regulator_get_voltage(),
pmbus_regulator_set_voltage(), and pmbus_regulator_list_voltage()
access PMBus registers and shared data but were not protected by
the update_lock mutex. This could lead to race conditions.
However, adding mutex protection directly to these functions causes
a deadlock because pmbus_regulator_notify() (which calls
regulator_notifier_call_chain()) is often called with the mutex
already held (e.g., from pmbus_fault_handler()). If a regulator
callback then calls one of the now-protected voltage functions,
it will attempt to acquire the same mutex.
Rework pmbus_regulator_notify() to utilize a worker function to
send notifications outside of the mutex protection. Events are
stored as atomics in a per-page bitmask and processed by the worker.
Initialize the worker and its associated data during regulator
registration, and ensure it is cancelled on device removal using
devm_add_action_or_reset().
While at it, remove the unnecessary include of linux/of.h.
Cc: Sanman Pradhan <psanman@juniper.net>
Fixes: ddbb4db4ced1b ("hwmon: (pmbus) Add regulator support")
Reviewed-by: Sanman Pradhan <psanman@juniper.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sanman Pradhan <psanman@juniper.net>
Date: Thu Mar 19 17:31:19 2026 +0000
hwmon: (pmbus/ina233) Fix error handling and sign extension in shunt voltage read
commit f7e775c4694782844c66da5316fed82881835cf8 upstream.
ina233_read_word_data() reads MFR_READ_VSHUNT via pmbus_read_word_data()
but has two issues:
1. The return value is not checked for errors before being used in
arithmetic. A negative error code from a failed I2C transaction is
passed directly to DIV_ROUND_CLOSEST(), producing garbage data.
2. MFR_READ_VSHUNT is a 16-bit two's complement value. Negative shunt
voltages (values with bit 15 set) are treated as large positive
values since pmbus_read_word_data() returns them zero-extended in an
int. This leads to incorrect scaling in the VIN coefficient
conversion.
Fix both issues by adding an error check, casting to s16 for proper
sign extension, and clamping the result to a valid non-negative range.
The clamp is necessary because read_word_data callbacks must return
non-negative values on success (negative values indicate errors to the
pmbus core).
Fixes: b64b6cb163f16 ("hwmon: Add driver for TI INA233 Current and Power Monitor")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260319173055.125271-2-sanman.pradhan@hpe.com
[groeck: Fixed clamp to avoid losing the sign bit]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sanman Pradhan <psanman@juniper.net>
Date: Thu Mar 19 17:31:29 2026 +0000
hwmon: (pmbus/isl68137) Add mutex protection for AVS enable sysfs attributes
commit 3075a3951f7708da5a8ab47b0b7d068a32f69e58 upstream.
The custom avs0_enable and avs1_enable sysfs attributes access PMBus
registers through the exported API helpers (pmbus_read_byte_data,
pmbus_read_word_data, pmbus_write_word_data, pmbus_update_byte_data)
without holding the PMBus update_lock mutex. These exported helpers do
not acquire the mutex internally, unlike the core's internal callers
which hold the lock before invoking them.
The store callback is especially vulnerable: it performs a multi-step
read-modify-write sequence (read VOUT_COMMAND, write VOUT_COMMAND, then
update OPERATION) where concurrent access from another thread could
interleave and corrupt the register state.
Add pmbus_lock_interruptible()/pmbus_unlock() around both the show and
store callbacks to serialize PMBus register access with the rest of the
driver.
Fixes: 038a9c3d1e424 ("hwmon: (pmbus/isl68137) Add driver for Intersil ISL68137 PWM Controller")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260319173055.125271-3-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 3 12:53:20 2026 +0100
hwmon: axi-fan: don't use driver_override as IRQ name
[ Upstream commit 813bbc4d33d2ca5b0da63e70ae13b60874f20d37 ]
Do not use driver_override as IRQ name, as it is not guaranteed to point
to a valid string; use NULL instead (which makes the devm IRQ helpers
use dev_name()).
Fixes: 8412b410fa5e ("hwmon: Support ADI Fan Control IP")
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260303115720.48783-4-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pratap Nirujogi <pratap.nirujogi@amd.com>
Date: Fri Mar 20 16:12:22 2026 -0400
i2c: designware: amdisp: Fix resume-probe race condition issue
commit e2f1ada8e089dd5a331bcd8b88125ae2af8d188f upstream.
Identified resume-probe race condition in kernel v7.0 with the commit
38fa29b01a6a ("i2c: designware: Combine the init functions"),but this
issue existed from the beginning though not detected.
The amdisp i2c device requires ISP to be in power-on state for probe
to succeed. To meet this requirement, this device is added to genpd
to control ISP power using runtime PM. The pm_runtime_get_sync() called
before i2c_dw_probe() triggers PM resume, which powers on ISP and also
invokes the amdisp i2c runtime resume before the probe completes resulting
in this race condition and a NULL dereferencing issue in v7.0
Fix this race condition by using the genpd APIs directly during probe:
- Call dev_pm_genpd_resume() to Power ON ISP before probe
- Call dev_pm_genpd_suspend() to Power OFF ISP after probe
- Set the device to suspended state with pm_runtime_set_suspended()
- Enable runtime PM only after the device is fully initialized
Fixes: d6263c468a761 ("i2c: amd-isp: Add ISP i2c-designware driver")
Co-developed-by: Bin Du <bin.du@amd.com>
Signed-off-by: Bin Du <bin.du@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Cc: <stable@vger.kernel.org> # v6.16+
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260320201302.3490570-1-pratap.nirujogi@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Date: Wed Feb 18 16:08:50 2026 +0100
i2c: imx: ensure no clock is generated after last read
commit 13101db735bdb29c5f60e95fb578690bd178b30f upstream.
When reading from the I2DR register, right after releasing the bus by
clearing MSTA and MTX, the I2C controller might still generate an
additional clock cycle which can cause devices to misbehave. Ensure to
only read from I2DR after the bus is not busy anymore. Because this
requires polling, the read of the last byte is moved outside of the
interrupt handler.
An example for such a failing transfer is this:
i2ctransfer -y -a 0 w1@0x00 0x02 r1
Error: Sending messages failed: Connection timed out
It does not happen with every device because not all devices react to
the additional clock cycle.
Fixes: 5f5c2d4579ca ("i2c: imx: prevent rescheduling in non dma mode")
Cc: stable@vger.kernel.org # v6.13+
Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260218150940.131354-3-eichest@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Date: Wed Feb 18 16:08:49 2026 +0100
i2c: imx: fix i2c issue when reading multiple messages
commit f88e2e748a1fc3cb4b8d163a9be790812f578850 upstream.
When reading multiple messages, meaning a repeated start is required,
polling the bus busy bit must be avoided. This must only be done for
the last message. Otherwise, the driver will timeout.
Here an example of such a sequence that fails with an error:
i2ctransfer -y -a 0 w1@0x00 0x02 r1 w1@0x00 0x02 r1
Error: Sending messages failed: Connection timed out
Fixes: 5f5c2d4579ca ("i2c: imx: prevent rescheduling in non dma mode")
Cc: stable@vger.kernel.org # v6.13+
Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260218150940.131354-2-eichest@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Peter Yin <peteryin.openbmc@gmail.com>
Date: Mon Mar 2 15:56:42 2026 +0800
i3c: master: dw-i3c: Fix missing of_node for virtual I2C adapter
[ Upstream commit f26ecaa0f0abfe5db173416214098a00d3b7db79 ]
The DesignWare I3C master driver creates a virtual I2C adapter to
provide backward compatibility with I2C devices. However, the current
implementation does not associate this virtual adapter with any
Device Tree node.
Propagate the of_node from the I3C master platform device to the
virtual I2C adapter's device structure. This ensures that standard
I2C aliases are correctly resolved and bus numbering remains consistent.
Signed-off-by: Peter Yin <peteryin.openbmc@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260302075645.1492766-1-peteryin.openbmc@gmail.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kohei Enju <kohei@enjuk.jp>
Date: Sat Feb 14 19:14:25 2026 +0000
iavf: fix out-of-bounds writes in iavf_get_ethtool_stats()
[ Upstream commit fecacfc95f195b99c71c579a472120d0b4ed65fa ]
iavf incorrectly uses real_num_tx_queues for ETH_SS_STATS. Since the
value could change in runtime, we should use num_tx_queues instead.
Moreover iavf_get_ethtool_stats() uses num_active_queues while
iavf_get_sset_count() and iavf_get_stat_strings() use
real_num_tx_queues, which triggers out-of-bounds writes when we do
"ethtool -L" and "ethtool -S" simultaneously [1].
For example when we change channels from 1 to 8, Thread 3 could be
scheduled before Thread 2, and out-of-bounds writes could be triggered
in Thread 3:
Thread 1 (ethtool -L) Thread 2 (work) Thread 3 (ethtool -S)
iavf_set_channels()
...
iavf_alloc_queues()
-> num_active_queues = 8
iavf_schedule_finish_config()
iavf_get_sset_count()
real_num_tx_queues: 1
-> buffer for 1 queue
iavf_get_ethtool_stats()
num_active_queues: 8
-> out-of-bounds!
iavf_finish_config()
-> real_num_tx_queues = 8
Use immutable num_tx_queues in all related functions to avoid the issue.
[1]
BUG: KASAN: vmalloc-out-of-bounds in iavf_add_one_ethtool_stat+0x200/0x270
Write of size 8 at addr ffffc900031c9080 by task ethtool/5800
CPU: 1 UID: 0 PID: 5800 Comm: ethtool Not tainted 6.19.0-enjuk-08403-g8137e3db7f1c #241 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x6f/0xb0
print_report+0x170/0x4f3
kasan_report+0xe1/0x180
iavf_add_one_ethtool_stat+0x200/0x270
iavf_get_ethtool_stats+0x14c/0x2e0
__dev_ethtool+0x3d0c/0x5830
dev_ethtool+0x12d/0x270
dev_ioctl+0x53c/0xe30
sock_do_ioctl+0x1a9/0x270
sock_ioctl+0x3d4/0x5e0
__x64_sys_ioctl+0x137/0x1c0
do_syscall_64+0xf3/0x690
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7da0e6e36d
...
</TASK>
The buggy address belongs to a 1-page vmalloc region starting at 0xffffc900031c9000 allocated at __dev_ethtool+0x3cc9/0x5830
The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff88813a013de0 pfn:0x13a013
flags: 0x200000000000000(node=0|zone=2)
raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
raw: ffff88813a013de0 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffffc900031c8f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
ffffc900031c9000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffc900031c9080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
^
ffffc900031c9100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
ffffc900031c9180: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
Fixes: 64430f70ba6f ("iavf: Fix displaying queue statistics shown by ethtool")
Signed-off-by: Kohei Enju <kohei@enjuk.jp>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Oros <poros@redhat.com>
Date: Thu Feb 12 08:53:10 2026 +0100
ice: fix inverted ready check for VF representors
[ Upstream commit ad85de0fc09eb3236e73df5acb2bc257625103f5 ]
Commit 0f00a897c9fcbd ("ice: check if SF is ready in ethtool ops")
refactored the VF readiness check into a generic repr->ops.ready()
callback but implemented ice_repr_ready_vf() with inverted logic:
return !ice_check_vf_ready_for_cfg(repr->vf);
ice_check_vf_ready_for_cfg() returns 0 on success, so the negation
makes ready() return non-zero when the VF is ready. All callers treat
non-zero as "not ready, skip", causing ndo_get_stats64, get_drvinfo,
get_strings and get_ethtool_stats to always bail out in switchdev mode.
Remove the erroneous negation. The SF variant ice_repr_ready_sf() is
already correct (returns !active, i.e. non-zero when not active).
Fixes: 0f00a897c9fcbd ("ice: check if SF is ready in ethtool ops")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Patryk Holda <patryk.holda@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Oros <poros@redhat.com>
Date: Thu Feb 12 08:53:11 2026 +0100
ice: use ice_update_eth_stats() for representor stats
[ Upstream commit 2526e440df2725e7328d59b835a164826f179b93 ]
ice_repr_get_stats64() and __ice_get_ethtool_stats() call
ice_update_vsi_stats() on the VF's src_vsi. This always returns early
because ICE_VSI_DOWN is permanently set for VF VSIs - ice_up() is never
called on them since queues are managed by iavf through virtchnl.
In __ice_get_ethtool_stats() the original code called
ice_update_vsi_stats() for all VSIs including representors, iterated
over ice_gstrings_vsi_stats[] to populate the data, and then bailed out
with an early return before the per-queue ring stats section. That early
return was necessary because representor VSIs have no rings on the PF
side - the rings belong to the VF driver (iavf), so accessing per-queue
stats would be invalid.
Move the representor handling to the top of __ice_get_ethtool_stats()
and call ice_update_eth_stats() directly to read the hardware GLV_*
counters. This matches ice_get_vf_stats() which already uses
ice_update_eth_stats() for the same VF VSI in legacy mode. Apply the
same fix to ice_repr_get_stats64().
Note that ice_gstrings_vsi_stats[] contains five software ring counters
(rx_buf_failed, rx_page_failed, tx_linearize, tx_busy, tx_restart) that
are always zero for representors since the PF never processes packets on
VF rings. This is pre-existing behavior unchanged by this patch.
Fixes: 7aae80cef7ba ("ice: add port representor ethtool ops and stats")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Patryk Holda <patryk.holda@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mohammad Heib <mheib@redhat.com>
Date: Tue Mar 17 19:08:06 2026 +0200
ionic: fix persistent MAC address override on PF
[ Upstream commit cbcb3cfcdc436d6f91a3d95ecfa9c831abe14aed ]
The use of IONIC_CMD_LIF_SETATTR in the MAC address update path causes
the ionic firmware to update the LIF's identity in its persistent state.
Since the firmware state is maintained across host warm boots and driver
reloads, any MAC change on the Physical Function (PF) becomes "sticky.
This is problematic because it causes ethtool -P to report the
user-configured MAC as the permanent factory address, which breaks
system management tools that rely on a stable hardware identity.
While Virtual Functions (VFs) need this hardware-level programming to
properly handle MAC assignments in guest environments, the PF should
maintain standard transient behavior. This patch gates the
ionic_program_mac call using is_virtfn so that PF MAC changes remain
local to the netdev filters and do not overwrite the firmware's
permanent identity block.
Fixes: 19058be7c48c ("ionic: VF initial random MAC address if no assigned mac")
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260317170806.35390-1-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuniyuki Iwashima <kuniyu@google.com>
Date: Fri Mar 20 07:23:00 2026 +0000
ipv6: Don't remove permanent routes with exceptions from tb6_gc_hlist.
[ Upstream commit 4be7b99c253f0c85a255cc1db7127ba3232dfa30 ]
The cited commit mechanically put fib6_remove_gc_list()
just after every fib6_clean_expires() call.
When a temporary route is promoted to a permanent route,
there may already be exception routes tied to it.
If fib6_remove_gc_list() removes the route from tb6_gc_hlist,
such exception routes will no longer be aged.
Let's replace fib6_remove_gc_list() with a new helper
fib6_may_remove_gc_list() and use fib6_age_exceptions() there.
Note that net->ipv6 is only compiled when CONFIG_IPV6 is
enabled, so fib6_{add,remove,may_remove}_gc_list() are guarded.
Fixes: 5eb902b8e719 ("net/ipv6: Remove expired routes with a separated list of routes.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260320072317.2561779-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuniyuki Iwashima <kuniyu@google.com>
Date: Fri Mar 20 07:22:59 2026 +0000
ipv6: Remove permanent routes from tb6_gc_hlist when all exceptions expire.
[ Upstream commit 6af51e9f31336632263c4680b2a3712295103e1f ]
Commit 5eb902b8e719 ("net/ipv6: Remove expired routes with a
separated list of routes.") introduced a per-table GC list and
changed GC to iterate over that list instead of traversing
the entire route table.
However, it forgot to add permanent routes to tb6_gc_hlist
when exception routes are added.
Commit cfe82469a00f ("ipv6: add exception routes to GC list
in rt6_insert_exception") fixed that issue but introduced
another one.
Even after all exception routes expire, the permanent routes
remain in tb6_gc_hlist, potentially negating the performance
benefits intended by the initial change.
Let's count gc_args->more before and after rt6_age_exceptions()
and remove the permanent route when the delta is 0.
Note that the next patch will reuse fib6_age_exceptions().
Fixes: cfe82469a00f ("ipv6: add exception routes to GC list in rt6_insert_exception")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260320072317.2561779-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jassi Brar <jassisinghbrar@gmail.com>
Date: Sun Mar 22 12:15:33 2026 -0500
irqchip/qcom-mpm: Add missing mailbox TX done acknowledgment
commit cfe02147e86307a17057ee4e3604f5f5919571d2 upstream.
The mbox_client for qcom-mpm sends NULL doorbell messages via
mbox_send_message() but never signals TX completion.
Set knows_txdone=true and call mbox_client_txdone() after a successful
send, matching the pattern used by other Qualcomm mailbox clients (smp2p,
smsm, qcom_aoss etc).
Fixes: a6199bb514d8a6 "irqchip: Add Qualcomm MPM controller driver"
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260322171533.608436-1-jassisinghbrar@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Biju Das <biju.das.jz@bp.renesas.com>
Date: Mon Mar 23 12:49:14 2026 +0000
irqchip/renesas-rzv2h: Fix error path in rzv2h_icu_probe_common()
[ Upstream commit 897cf98926429c8671a9009442883c2f62deae96 ]
Replace pm_runtime_put() with pm_runtime_put_sync() when
irq_domain_create_hierarchy() fails to ensure the device suspends
synchronously before devres cleanup disables runtime PM via
pm_runtime_disable().
Fixes: 5ec8cabc3b86 ("irqchip/renesas-rzv2h: Use devm_pm_runtime_enable()")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260323124917.41602-1-biju.das.jz@bp.renesas.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Milos Nikic <nikic.milos@gmail.com>
Date: Tue Mar 10 21:15:48 2026 -0700
jbd2: gracefully abort on checkpointing state corruptions
commit bac3190a8e79beff6ed221975e0c9b1b5f2a21da upstream.
This patch targets two internal state machine invariants in checkpoint.c
residing inside functions that natively return integer error codes.
- In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
and a graceful journal abort, returning -EFSCORRUPTED.
- In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
an unexpected buffer_jwrite state. If the warning triggers, we
explicitly drop the just-taken get_bh() reference and call __flush_batch()
to safely clean up any previously queued buffers in the j_chkpt_bhs array,
preventing a memory leak before returning -EFSCORRUPTED.
Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260311041548.159424-1-nikic.milos@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Charles Mirabile <cmirabil@redhat.com>
Date: Sat Mar 7 23:43:30 2026 -0500
kbuild: Delete .builtin-dtbs.S when running make clean
commit a76e30c2479ce6ffa2aa6c8a8462897afc82bc90 upstream.
The makefile tries to delete a file named ".builtin-dtb.S" but the file
created by scripts/Makefile.vmlinux is actually called ".builtin-dtbs.S".
Fixes: 654102df2ac2a ("kbuild: add generic support for built-in boot DTBs")
Cc: stable@vger.kernel.org
Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260308044338.181403-1-cmirabil@redhat.com
[nathan: Small commit message adjustments]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Date: Thu Feb 26 08:41:48 2026 +0100
kbuild: install-extmod-build: Package resolve_btfids if necessary
[ Upstream commit 459cb3c054c2352bb321648744b620259a716b60 ]
When CONFIG_DEBUG_INFO_BTF_MODULES is enabled and vmlinux is available,
Makefile.modfinal and gen-btf.sh will try to use resolve_btfids on the
module .ko. install-extmod-build currently does not package
resolve_btfids, so that step fails.
Package resolve_btfids if it may be used.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260226-kbuild-resolve_btfids-v1-1-2bf38b93dfe7@linutronix.de
[nathan: Small commit message tweaks]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hyunwoo Kim <imv4bel@gmail.com>
Date: Tue Mar 17 08:52:01 2026 +0900
ksmbd: do not expire session on binding failure
commit 9bbb19d21ded7d78645506f20d8c44895e3d0fb9 upstream.
When a multichannel session binding request fails (e.g. wrong password),
the error path unconditionally sets sess->state = SMB2_SESSION_EXPIRED.
However, during binding, sess points to the target session looked up via
ksmbd_session_lookup_slowpath() -- which belongs to another connection's
user. This allows a remote attacker to invalidate any active session by
simply sending a binding request with a wrong password (DoS).
Fix this by skipping session expiration when the failed request was
a binding attempt, since the session does not belong to the current
connection. The reference taken by ksmbd_session_lookup_slowpath() is
still correctly released via ksmbd_user_session_put().
Cc: stable@vger.kernel.org
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Werner Kasselman <werner@verivus.com>
Date: Tue Mar 17 07:55:37 2026 +0000
ksmbd: fix memory leaks and NULL deref in smb2_lock()
commit 309b44ed684496ed3f9c5715d10b899338623512 upstream.
smb2_lock() has three error handling issues after list_del() detaches
smb_lock from lock_list at no_check_cl:
1) If vfs_lock_file() returns an unexpected error in the non-UNLOCK
path, goto out leaks smb_lock and its flock because the out:
handler only iterates lock_list and rollback_list, neither of
which contains the detached smb_lock.
2) If vfs_lock_file() returns -ENOENT in the UNLOCK path, goto out
leaks smb_lock and flock for the same reason. The error code
returned to the dispatcher is also stale.
3) In the rollback path, smb_flock_init() can return NULL on
allocation failure. The result is dereferenced unconditionally,
causing a kernel NULL pointer dereference. Add a NULL check to
prevent the crash and clean up the bookkeeping; the VFS lock
itself cannot be rolled back without the allocation and will be
released at file or connection teardown.
Fix cases 1 and 2 by hoisting the locks_free_lock()/kfree() to before
the if(!rc) check in the UNLOCK branch so all exit paths share one
free site, and by freeing smb_lock and flock before goto out in the
non-UNLOCK branch. Propagate the correct error code in both cases.
Fix case 3 by wrapping the VFS unlock in an if(rlock) guard and adding
a NULL check for locks_free_lock(rlock) in the shared cleanup.
Found via call-graph analysis using sqry.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Suggested-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Werner Kasselman <werner@verivus.com>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Thu Mar 19 21:00:02 2026 +0900
ksmbd: fix potencial OOB in get_file_all_info() for compound requests
commit beef2634f81f1c086208191f7228bce1d366493d upstream.
When a compound request consists of QUERY_DIRECTORY + QUERY_INFO
(FILE_ALL_INFORMATION) and the first command consumes nearly the entire
max_trans_size, get_file_all_info() would blindly call smbConvertToUTF16()
with PATH_MAX, causing out-of-bounds write beyond the response buffer.
In get_file_all_info(), there was a missing validation check for
the client-provided OutputBufferLength before copying the filename into
FileName field of the smb2_file_all_info structure.
If the filename length exceeds the available buffer space, it could lead to
potential buffer overflows or memory corruption during smbConvertToUTF16
conversion. This calculating the actual free buffer size using
smb2_calc_max_out_buf_len() and returning -EINVAL if the buffer is
insufficient and updating smbConvertToUTF16 to use the actual filename
length (clamped by PATH_MAX) to ensure a safe copy operation.
Cc: stable@vger.kernel.org
Fixes: e2b76ab8b5c9 ("ksmbd: add support for read compound")
Reported-by: Asim Viladi Oglu Manizada <manizada@pm.me>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Werner Kasselman <werner@verivus.com>
Date: Mon Mar 30 18:50:51 2026 -0400
ksmbd: fix use-after-free and NULL deref in smb_grant_oplock()
[ Upstream commit 48623ec358c1c600fa1e38368746f933e0f1a617 ]
smb_grant_oplock() has two issues in the oplock publication sequence:
1) opinfo is linked into ci->m_op_list (via opinfo_add) before
add_lease_global_list() is called. If add_lease_global_list()
fails (kmalloc returns NULL), the error path frees the opinfo
via __free_opinfo() while it is still linked in ci->m_op_list.
Concurrent m_op_list readers (opinfo_get_list, or direct iteration
in smb_break_all_levII_oplock) dereference the freed node.
2) opinfo->o_fp is assigned after add_lease_global_list() publishes
the opinfo on the global lease list. A concurrent
find_same_lease_key() can walk the lease list and dereference
opinfo->o_fp->f_ci while o_fp is still NULL.
Fix by restructuring the publication sequence to eliminate post-publish
failure:
- Set opinfo->o_fp before any list publication (fixes NULL deref).
- Preallocate lease_table via alloc_lease_table() before opinfo_add()
so add_lease_global_list() becomes infallible after publication.
- Keep the original m_op_list publication order (opinfo_add before
lease list) so concurrent opens via same_client_has_lease() and
opinfo_get_list() still see the in-flight grant.
- Use opinfo_put() instead of __free_opinfo() on err_out so that
the RCU-deferred free path is used.
This also requires splitting add_lease_global_list() to take a
preallocated lease_table and changing its return type from int to void,
since it can no longer fail.
Fixes: 1dfd062caa16 ("ksmbd: fix use-after-free by using call_rcu() for oplock_info")
Cc: stable@vger.kernel.org
Signed-off-by: Werner Kasselman <werner@verivus.com>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
[ adapted kmalloc_obj() macro to kmalloc(sizeof()) ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Fri Mar 13 14:45:58 2026 +0900
ksmbd: replace hardcoded hdr2_len with offsetof() in smb2_calc_max_out_buf_len()
commit 0e55f63dd08f09651d39e1b709a91705a8a0ddcb upstream.
After this commit (e2b76ab8b5c9 "ksmbd: add support for read compound"),
response buffer management was changed to use dynamic iov array.
In the new design, smb2_calc_max_out_buf_len() expects the second
argument (hdr2_len) to be the offset of ->Buffer field in the
response structure, not a hardcoded magic number.
Fix the remaining call sites to use the correct offsetof() value.
Cc: stable@vger.kernel.org
Fixes: e2b76ab8b5c9 ("ksmbd: add support for read compound")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Marc Zyngier <maz@kernel.org>
Date: Thu Mar 12 14:08:50 2026 +0000
KVM: arm64: Discard PC update state on vcpu reset
commit 1744a6ef48b9a48f017e3e1a0d05de0a6978396e upstream.
Our vcpu reset suffers from a particularly interesting flaw, as it
does not correctly deal with state that will have an effect on the
execution flow out of reset.
Take the following completely random example, never seen in the wild
and that never resulted in a couple of sleepless nights: /s
- vcpu-A issues a PSCI_CPU_OFF using the SMC conduit
- SMC being a trapped instruction (as opposed to HVC which is always
normally executed), we annotate the vcpu as needing to skip the
next instruction, which is the SMC itself
- vcpu-A is now safely off
- vcpu-B issues a PSCI_CPU_ON for vcpu-A, providing a starting PC
- vcpu-A gets reset, get the new PC, and is sent on its merry way
- right at the point of entering the guest, we notice that a PC
increment is pending (remember the earlier SMC?)
- vcpu-A skips its first instruction...
What could possibly go wrong?
Well, I'm glad you asked. For pKVM as a NV guest, that first instruction
is extremely significant, as it indicates whether the CPU is booting
or resuming. Having skipped that instruction, nothing makes any sense
anymore, and CPU hotplugging fails.
This is all caused by the decoupling of PC update from the handling
of an exception that triggers such update, making it non-obvious
what affects what when.
Fix this train wreck by discarding all the PC-affecting state on
vcpu reset.
Fixes: f5e30680616ab ("KVM: arm64: Move __adjust_pc out of line")
Cc: stable@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://patch.msgid.link/20260312140850.822968-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sean Christopherson <seanjc@google.com>
Date: Thu Mar 5 17:28:04 2026 -0800
KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE
commit aad885e774966e97b675dfe928da164214a71605 upstream.
When installing an emulated MMIO SPTE, do so *after* dropping/zapping the
existing SPTE (if it's shadow-present). While commit a54aa15c6bda3 was
right about it being impossible to convert a shadow-present SPTE to an
MMIO SPTE due to a _guest_ write, it failed to account for writes to guest
memory that are outside the scope of KVM.
E.g. if host userspace modifies a shadowed gPTE to switch from a memslot
to emulted MMIO and then the guest hits a relevant page fault, KVM will
install the MMIO SPTE without first zapping the shadow-present SPTE.
------------[ cut here ]------------
is_shadow_present_pte(*sptep)
WARNING: arch/x86/kvm/mmu/mmu.c:484 at mark_mmio_spte+0xb2/0xc0 [kvm], CPU#0: vmx_ept_stale_r/4292
Modules linked in: kvm_intel kvm irqbypass
CPU: 0 UID: 1000 PID: 4292 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:mark_mmio_spte+0xb2/0xc0 [kvm]
Call Trace:
<TASK>
mmu_set_spte+0x237/0x440 [kvm]
ept_page_fault+0x535/0x7f0 [kvm]
kvm_mmu_do_page_fault+0xee/0x1f0 [kvm]
kvm_mmu_page_fault+0x8d/0x620 [kvm]
vmx_handle_exit+0x18c/0x5a0 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm]
kvm_vcpu_ioctl+0x2d5/0x980 [kvm]
__x64_sys_ioctl+0x8a/0xd0
do_syscall_64+0xb5/0x730
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x47fa3f
</TASK>
---[ end trace 0000000000000000 ]---
Reported-by: Alexander Bulekov <bkov@amazon.com>
Debugged-by: Alexander Bulekov <bkov@amazon.com>
Suggested-by: Fred Griffoul <fgriffo@amazon.co.uk>
Fixes: a54aa15c6bda3 ("KVM: x86/mmu: Handle MMIO SPTEs directly in mmu_set_spte()")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sean Christopherson <seanjc@google.com>
Date: Thu Mar 5 17:42:14 2026 -0800
KVM: x86/mmu: Only WARN in direct MMUs when overwriting shadow-present SPTE
commit df83746075778958954aa0460cca55f4b3fc9c02 upstream.
Adjust KVM's sanity check against overwriting a shadow-present SPTE with a
another SPTE with a different target PFN to only apply to direct MMUs,
i.e. only to MMUs without shadowed gPTEs. While it's impossible for KVM
to overwrite a shadow-present SPTE in response to a guest write, writes
from outside the scope of KVM, e.g. from host userspace, aren't detected
by KVM's write tracking and so can break KVM's shadow paging rules.
------------[ cut here ]------------
pfn != spte_to_pfn(*sptep)
WARNING: arch/x86/kvm/mmu/mmu.c:3069 at mmu_set_spte+0x1e4/0x440 [kvm], CPU#0: vmx_ept_stale_r/872
Modules linked in: kvm_intel kvm irqbypass
CPU: 0 UID: 1000 PID: 872 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:mmu_set_spte+0x1e4/0x440 [kvm]
Call Trace:
<TASK>
ept_page_fault+0x535/0x7f0 [kvm]
kvm_mmu_do_page_fault+0xee/0x1f0 [kvm]
kvm_mmu_page_fault+0x8d/0x620 [kvm]
vmx_handle_exit+0x18c/0x5a0 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm]
kvm_vcpu_ioctl+0x2d5/0x980 [kvm]
__x64_sys_ioctl+0x8a/0xd0
do_syscall_64+0xb5/0x730
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
---[ end trace 0000000000000000 ]---
Fixes: 11d45175111d ("KVM: x86/mmu: Warn if PFN changes on shadow-present SPTE in shadow MMU")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Thu Apr 2 13:23:33 2026 +0200
Linux 6.18.21
Link: https://lore.kernel.org/r/20260331161753.468533260@linuxfoundation.org
Tested-by: Dileep Malepu <dileep.debian@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Wentao Guan <guanwentao@uniontech.com>
Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Tested-by: Brett A C Sheffield <bacs@librecast.net>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Barry K. Nathan <barryn@pobox.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Li Jun <lijun01@kylinos.cn>
Date: Thu Mar 26 14:29:08 2026 +0800
LoongArch: Fix missing NULL checks for kstrdup()
commit 3a28daa9b7d7c2ddf2c722e9e95d7e0928bf0cd1 upstream.
1. Replace "of_find_node_by_path("/")" with "of_root" to avoid multiple
calls to "of_node_put()".
2. Fix a potential kernel oops during early boot when memory allocation
fails while parsing CPU model from device tree.
Cc: stable@vger.kernel.org
Signed-off-by: Li Jun <lijun01@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Huacai Chen <chenhuacai@kernel.org>
Date: Thu Mar 26 14:29:09 2026 +0800
LoongArch: KVM: Handle the case that EIOINTC's coremap is empty
commit b97bd69eb0f67b5f961b304d28e9ba45e202d841 upstream.
EIOINTC's coremap in eiointc_update_sw_coremap() can be empty, currently
we get a cpuid with -1 in this case, but we actually need 0 because it's
similar as the case that cpuid >= 4.
This fix an out-of-bounds access to kvm_arch::phyid_map::phys_map[].
Cc: <stable@vger.kernel.org>
Fixes: 3956a52bc05bd81 ("LoongArch: KVM: Add EIOINTC read and write functions")
Reported-by: Aurelien Jarno <aurel32@debian.org>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Huacai Chen <chenhuacai@kernel.org>
Date: Thu Mar 26 14:29:09 2026 +0800
LoongArch: KVM: Make kvm_get_vcpu_by_cpuid() more robust
commit 2db06c15d8c7a0ccb6108524e16cd9163753f354 upstream.
kvm_get_vcpu_by_cpuid() takes a cpuid parameter whose type is int, so
cpuid can be negative. Let kvm_get_vcpu_by_cpuid() return NULL for this
case so as to make it more robust.
This fix an out-of-bounds access to kvm_arch::phyid_map::phys_map[].
Cc: <stable@vger.kernel.org>
Fixes: 73516e9da512adc ("LoongArch: KVM: Add vcpu mapping from physical cpuid")
Reported-by: Aurelien Jarno <aurel32@debian.org>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xi Ruoyao <xry111@xry111.site>
Date: Thu Mar 26 14:29:09 2026 +0800
LoongArch: vDSO: Emit GNU_EH_FRAME correctly
commit e4878c37f6679fdea91b27a0f4e60a871f0b7bad upstream.
With -fno-asynchronous-unwind-tables and --no-eh-frame-hdr (the default
of the linker), the GNU_EH_FRAME segment (specified by vdso.lds.S) is
empty. This is not valid, as the current DWARF specification mandates
the first byte of the EH frame to be the version number 1. It causes
some unwinders to complain, for example the ClickHouse query profiler
spams the log with messages:
clickhouse-server[365854]: libunwind: unsupported .eh_frame_hdr
version: 127 at 7ffffffb0000
Here "127" is just the byte located at the p_vaddr (0, i.e. the
beginning of the vDSO) of the empty GNU_EH_FRAME segment. Cross-
checking with /proc/365854/maps has also proven 7ffffffb0000 is the
start of vDSO in the process VM image.
In LoongArch the -fno-asynchronous-unwind-tables option seems just a
MIPS legacy, and MIPS only uses this option to satisfy the MIPS-specific
"genvdso" program, per the commit cfd75c2db17e ("MIPS: VDSO: Explicitly
use -fno-asynchronous-unwind-tables"). IIRC it indicates some inherent
limitation of the MIPS ELF ABI and has nothing to do with LoongArch. So
we can simply flip it over to -fasynchronous-unwind-tables and pass
--eh-frame-hdr for linking the vDSO, allowing the profilers to unwind the
stack for statistics even if the sample point is taken when the PC is in
the vDSO.
However simply adjusting the options above would exploit an issue: when
the libgcc unwinder saw the invalid GNU_EH_FRAME segment, it silently
falled back to a machine-specific routine to match the code pattern of
rt_sigreturn() and extract the registers saved in the sigframe if the
code pattern is matched. As unwinding from signal handlers is vital for
libgcc to support pthread cancellation etc., the fall-back routine had
been silently keeping the LoongArch Linux systems functioning since
Linux 5.19. But when we start to emit GNU_EH_FRAME with the correct
format, fall-back routine will no longer be used and libgcc will fail
to unwind the sigframe, and unwinding from signal handlers will no
longer work, causing dozens of glibc test failures. To make it possible
to unwind from signal handlers again, it's necessary to code the unwind
info in __vdso_rt_sigreturn via .cfi_* directives.
The offsets in the .cfi_* directives depend on the layout of struct
sigframe, notably the offset of sigcontext in the sigframe. To use the
offset in the assembly file, factor out struct sigframe into a header to
allow asm-offsets.c to output the offset for assembly.
To work around a long-term issue in the libgcc unwinder (the pc is
unconditionally substracted by 1: doing so is technically incorrect for
a signal frame), a nop instruction is included with the two real
instructions in __vdso_rt_sigreturn in the same FDE PC range. The same
hack has been used on x86 for a long time.
Cc: stable@vger.kernel.org
Fixes: c6b99bed6b8f ("LoongArch: Add VDSO and VSYSCALL support")
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Huacai Chen <chenhuacai@kernel.org>
Date: Thu Mar 26 14:29:09 2026 +0800
LoongArch: Workaround LS2K/LS7A GPU DMA hang bug
commit 95db0c9f526d583634cddb2e5914718570fbac87 upstream.
1. Hardware limitation: GPU, DC and VPU are typically PCI device 06.0,
06.1 and 06.2. They share some hardware resources, so when configure the
PCI 06.0 device BAR1, DMA memory access cannot be performed through this
BAR, otherwise it will cause hardware abnormalities.
2. In typical scenarios of reboot or S3/S4, DC access to memory through
BAR is not prohibited, resulting in GPU DMA hangs.
3. Workaround method: When configuring the 06.0 device BAR1, turn off
the memory access of DC, GPU and VPU (via DC's CRTC registers).
Cc: stable@vger.kernel.org
Signed-off-by: Qianhai Wu <wuqianhai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yuchan Nam <entropy1110@gmail.com>
Date: Fri Mar 6 21:52:23 2026 +0900
media: mc, v4l2: serialize REINIT and REQBUFS with req_queue_mutex
commit bef4f4a88b73e4cc550d25f665b8a9952af22773 upstream.
MEDIA_REQUEST_IOC_REINIT can run concurrently with VIDIOC_REQBUFS(0)
queue teardown paths. This can race request object cleanup against vb2
queue cancellation and lead to use-after-free reports.
We already serialize request queueing against STREAMON/OFF with
req_queue_mutex. Extend that serialization to REQBUFS, and also take
the same mutex in media_request_ioctl_reinit() so REINIT is in the
same exclusion domain.
This keeps request cleanup and queue cancellation from running in
parallel for request-capable devices.
Fixes: 6093d3002eab ("media: vb2: keep a reference to the request until dqbuf")
Cc: stable@vger.kernel.org
Signed-off-by: Yuchan Nam <entropy1110@gmail.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: SeongJae Park <sj@kernel.org>
Date: Thu Mar 19 07:52:17 2026 -0700
mm/damon/core: avoid use of half-online-committed context
commit 26f775a054c3cda86ad465a64141894a90a9e145 upstream.
One major usage of damon_call() is online DAMON parameters update. It is
done by calling damon_commit_ctx() inside the damon_call() callback
function. damon_commit_ctx() can fail for two reasons: 1) invalid
parameters and 2) internal memory allocation failures. In case of
failures, the damon_ctx that attempted to be updated (commit destination)
can be partially updated (or, corrupted from a perspective), and therefore
shouldn't be used anymore. The function only ensures the damon_ctx object
can safely deallocated using damon_destroy_ctx().
The API callers are, however, calling damon_commit_ctx() only after
asserting the parameters are valid, to avoid damon_commit_ctx() fails due
to invalid input parameters. But it can still theoretically fail if the
internal memory allocation fails. In the case, DAMON may run with the
partially updated damon_ctx. This can result in unexpected behaviors
including even NULL pointer dereference in case of damos_commit_dests()
failure [1]. Such allocation failure is arguably too small to fail, so
the real world impact would be rare. But, given the bad consequence, this
needs to be fixed.
Avoid such partially-committed (maybe-corrupted) damon_ctx use by saving
the damon_commit_ctx() failure on the damon_ctx object. For this,
introduce damon_ctx->maybe_corrupted field. damon_commit_ctx() sets it
when it is failed. kdamond_call() checks if the field is set after each
damon_call_control->fn() is executed. If it is set, ignore remaining
callback requests and return. All kdamond_call() callers including
kdamond_fn() also check the maybe_corrupted field right after
kdamond_call() invocations. If the field is set, break the kdamond_fn()
main loop so that DAMON sill doesn't use the context that might be
corrupted.
[sj@kernel.org: let kdamond_call() with cancel regardless of maybe_corrupted]
Link: https://lkml.kernel.org/r/20260320031553.2479-1-sj@kernel.org
Link: https://sashiko.dev/#/patchset/20260319145218.86197-1-sj%40kernel.org
Link: https://lkml.kernel.org/r/20260319145218.86197-1-sj@kernel.org
Link: https://lore.kernel.org/20260319043309.97966-1-sj@kernel.org [1]
Fixes: 3301f1861d34 ("mm/damon/sysfs: handle commit command using damon_call()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: SeongJae Park <sj@kernel.org>
Date: Mon Mar 16 16:51:17 2026 -0700
mm/damon/stat: monitor all System RAM resources
commit 84481e705ab07ed46e56587fe846af194acacafe upstream.
DAMON_STAT usage document (Documentation/admin-guide/mm/damon/stat.rst)
says it monitors the system's entire physical memory. But, it is
monitoring only the biggest System RAM resource of the system. When there
are multiple System RAM resources, this results in monitoring only an
unexpectedly small fraction of the physical memory. For example, suppose
the system has a 500 GiB System RAM, 10 MiB non-System RAM, and 500 GiB
System RAM resources in order on the physical address space. DAMON_STAT
will monitor only the first 500 GiB System RAM. This situation is
particularly common on NUMA systems.
Select a physical address range that covers all System RAM areas of the
system, to fix this issue and make it work as documented.
[sj@kernel.org: return error if monitoring target region is invalid]
Link: https://lkml.kernel.org/r/20260317053631.87907-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20260316235118.873-1-sj@kernel.org
Fixes: 369c415e6073 ("mm/damon: introduce DAMON_STAT module")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Josh Law <objecting@objecting.org>
Date: Sat Mar 21 10:54:25 2026 -0700
mm/damon/sysfs: check contexts->nr before accessing contexts_arr[0]
commit 1bfe9fb5ed2667fb075682408b776b5273162615 upstream.
Multiple sysfs command paths dereference contexts_arr[0] without first
verifying that kdamond->contexts->nr == 1. A user can set nr_contexts to
0 via sysfs while DAMON is running, causing NULL pointer dereferences.
In more detail, the issue can be triggered by privileged users like
below.
First, start DAMON and make contexts directory empty
(kdamond->contexts->nr == 0).
# damo start
# cd /sys/kernel/mm/damon/admin/kdamonds/0
# echo 0 > contexts/nr_contexts
Then, each of below commands will cause the NULL pointer dereference.
# echo update_schemes_stats > state
# echo update_schemes_tried_regions > state
# echo update_schemes_tried_bytes > state
# echo update_schemes_effective_quotas > state
# echo update_tuned_intervals > state
Guard all commands (except OFF) at the entry point of
damon_sysfs_handle_cmd().
Link: https://lkml.kernel.org/r/20260321175427.86000-3-sj@kernel.org
Fixes: 0ac32b8affb5 ("mm/damon/sysfs: support DAMOS stats")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [5.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Josh Law <objecting@objecting.org>
Date: Sat Mar 21 10:54:26 2026 -0700
mm/damon/sysfs: check contexts->nr in repeat_call_fn
commit 6557004a8b59c7701e695f02be03c7e20ed1cc15 upstream.
damon_sysfs_repeat_call_fn() calls damon_sysfs_upd_tuned_intervals(),
damon_sysfs_upd_schemes_stats(), and
damon_sysfs_upd_schemes_effective_quotas() without checking contexts->nr.
If nr_contexts is set to 0 via sysfs while DAMON is running, these
functions dereference contexts_arr[0] and cause a NULL pointer
dereference. Add the missing check.
For example, the issue can be reproduced using DAMON sysfs interface and
DAMON user-space tool (damo) [1] like below.
$ sudo damo start --refresh_interval 1s
$ echo 0 | sudo tee \
/sys/kernel/mm/damon/admin/kdamonds/0/contexts/nr_contexts
Link: https://patch.msgid.link/20260320163559.178101-3-objecting@objecting.org
Link: https://lkml.kernel.org/r/20260321175427.86000-4-sj@kernel.org
Link: https://github.com/damonitor/damo [1]
Fixes: d809a7c64ba8 ("mm/damon/sysfs: implement refresh_ms file internal work")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Josh Law <objecting@objecting.org>
Date: Sat Mar 21 10:54:24 2026 -0700
mm/damon/sysfs: fix param_ctx leak on damon_sysfs_new_test_ctx() failure
commit 7fe000eb32904758a85e62f6ea9483f89d5dabfc upstream.
Patch series "mm/damon/sysfs: fix memory leak and NULL dereference
issues", v4.
DAMON_SYSFS can leak memory under allocation failure, and do NULL pointer
dereference when a privileged user make wrong sequences of control. Fix
those.
This patch (of 3):
When damon_sysfs_new_test_ctx() fails in damon_sysfs_commit_input(),
param_ctx is leaked because the early return skips the cleanup at the out
label. Destroy param_ctx before returning.
Link: https://lkml.kernel.org/r/20260321175427.86000-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20260321175427.86000-2-sj@kernel.org
Fixes: f0c5118ebb0e ("mm/damon/sysfs: catch commit test ctx alloc failure")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jinjiang Tu <tujinjiang@huawei.com>
Date: Mon Mar 30 21:09:28 2026 -0400
mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
[ Upstream commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669 ]
On arm64 server, we found folio that get from migration entry isn't locked
in softleaf_to_folio(). This issue triggers when mTHP splitting and
zap_nonpresent_ptes() races, and the root cause is lack of memory barrier
in softleaf_to_folio(). The race is as follows:
CPU0 CPU1
deferred_split_scan() zap_nonpresent_ptes()
lock folio
split_folio()
unmap_folio()
change ptes to migration entries
__split_folio_to_order() softleaf_to_folio()
set flags(including PG_locked) for tail pages folio = pfn_folio(softleaf_to_pfn(entry))
smp_wmb() VM_WARN_ON_ONCE(!folio_test_locked(folio))
prep_compound_page() for tail pages
In __split_folio_to_order(), smp_wmb() guarantees page flags of tail pages
are visible before the tail page becomes non-compound. smp_wmb() should
be paired with smp_rmb() in softleaf_to_folio(), which is missed. As a
result, if zap_nonpresent_ptes() accesses migration entry that stores tail
pfn, softleaf_to_folio() may see the updated compound_head of tail page
before page->flags.
This issue will trigger VM_WARN_ON_ONCE() in pfn_swap_entry_folio()
because of the race between folio split and zap_nonpresent_ptes()
leading to a folio incorrectly undergoing modification without a folio
lock being held.
This is a BUG_ON() before commit 93976a20345b ("mm: eliminate further
swapops predicates"), which in merged in v6.19-rc1.
To fix it, add missing smp_rmb() if the softleaf entry is migration entry
in softleaf_to_folio() and softleaf_to_page().
[tujinjiang@huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ applied fix to swapops.h using old pfn_swap_entry/swp_entry_t naming ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Date: Fri Mar 27 17:31:04 2026 +0000
mm/mseal: update VMA end correctly on merge
commit 2697dd8ae721db4f6a53d4f4cbd438212a80f8dc upstream.
Previously we stored the end of the current VMA in curr_end, and then upon
iterating to the next VMA updated curr_start to curr_end to advance to the
next VMA.
However, this doesn't take into account the fact that a VMA might be
updated due to a merge by vma_modify_flags(), which can result in curr_end
being stale and thus, upon setting curr_start to curr_end, ending up with
an incorrect curr_start on the next iteration.
Resolve the issue by setting curr_end to vma->vm_end unconditionally to
ensure this value remains updated should this occur.
While we're here, eliminate this entire class of bug by simply setting
const curr_[start/end] to be clamped to the input range and VMAs, which
also happens to simplify the logic.
Link: https://lkml.kernel.org/r/20260327173104.322405-1-ljs@kernel.org
Fixes: 6c2da14ae1e0 ("mm/mseal: rework mseal apply logic")
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reported-by: Antonius <antonius@bluedragonsec.com>
Closes: https://lore.kernel.org/linux-mm/CAK8a0jwWGj9-SgFk0yKFh7i8jMkwKm5b0ao9=kmXWjO54veX2g@mail.gmail.com/
Suggested-by: David Hildenbrand (ARM) <david@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Max Boone <mboone@akamai.com>
Date: Wed Mar 25 10:59:16 2026 +0100
mm/pagewalk: fix race between concurrent split and refault
commit 3b89863c3fa482912911cd65a12a3aeef662c250 upstream.
The splitting of a PUD entry in walk_pud_range() can race with a
concurrent thread refaulting the PUD leaf entry causing it to try walking
a PMD range that has disappeared.
An example and reproduction of this is to try reading numa_maps of a
process while VFIO-PCI is setting up DMA (specifically the
vfio_pin_pages_remote call) on a large BAR for that process.
This will trigger a kernel BUG:
vfio-pci 0000:03:00.0: enabling device (0000 -> 0002)
BUG: unable to handle page fault for address: ffffa23980000000
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
...
RIP: 0010:walk_pgd_range+0x3b5/0x7a0
Code: 8d 43 ff 48 89 44 24 28 4d 89 ce 4d 8d a7 00 00 20 00 48 8b 4c 24
28 49 81 e4 00 00 e0 ff 49 8d 44 24 ff 48 39 c8 4c 0f 43 e3 <49> f7 06
9f ff ff ff 75 3b 48 8b 44 24 20 48 8b 40 28 48 85 c0 74
RSP: 0018:ffffac23e1ecf808 EFLAGS: 00010287
RAX: 00007f44c01fffff RBX: 00007f4500000000 RCX: 00007f44ffffffff
RDX: 0000000000000000 RSI: 000ffffffffff000 RDI: ffffffff93378fe0
RBP: ffffac23e1ecf918 R08: 0000000000000004 R09: ffffa23980000000
R10: 0000000000000020 R11: 0000000000000004 R12: 00007f44c0200000
R13: 00007f44c0000000 R14: ffffa23980000000 R15: 00007f44c0000000
FS: 00007fe884739580(0000) GS:ffff9b7d7a9c0000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa23980000000 CR3: 000000c0650e2005 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
__walk_page_range+0x195/0x1b0
walk_page_vma+0x62/0xc0
show_numa_map+0x12b/0x3b0
seq_read_iter+0x297/0x440
seq_read+0x11d/0x140
vfs_read+0xc2/0x340
ksys_read+0x5f/0xe0
do_syscall_64+0x68/0x130
? get_page_from_freelist+0x5c2/0x17e0
? mas_store_prealloc+0x17e/0x360
? vma_set_page_prot+0x4c/0xa0
? __alloc_pages_noprof+0x14e/0x2d0
? __mod_memcg_lruvec_state+0x8d/0x140
? __lruvec_stat_mod_folio+0x76/0xb0
? __folio_mod_stat+0x26/0x80
? do_anonymous_page+0x705/0x900
? __handle_mm_fault+0xa8d/0x1000
? __count_memcg_events+0x53/0xf0
? handle_mm_fault+0xa5/0x360
? do_user_addr_fault+0x342/0x640
? arch_exit_to_user_mode_prepare.constprop.0+0x16/0xa0
? irqentry_exit_to_user_mode+0x24/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fe88464f47e
Code: c0 e9 b6 fe ff ff 50 48 8d 3d be 07 0b 00 e8 69 01 02 00 66 0f 1f
84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00
f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
RSP: 002b:00007ffe6cd9a9b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fe88464f47e
RDX: 0000000000020000 RSI: 00007fe884543000 RDI: 0000000000000003
RBP: 00007fe884543000 R08: 00007fe884542010 R09: 0000000000000000
R10: fffffffffffffbc5 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
</TASK>
Fix this by validating the PUD entry in walk_pmd_range() using a stable
snapshot (pudp_get()). If the PUD is not present or is a leaf, retry the
walk via ACTION_AGAIN instead of descending further. This mirrors the
retry logic in walk_pte_range(), which lets walk_pmd_range() retry if the
PTE is not being got by pte_offset_map_lock().
Link: https://lkml.kernel.org/r/20260325-pagewalk-check-pmd-refault-v2-1-707bff33bc60@akamai.com
Fixes: f9e54c3a2f5b ("vfio/pci: implement huge_fault support")
Co-developed-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Max Boone <mboone@akamai.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ihor Solodrai <ihor.solodrai@linux.dev>
Date: Tue Dec 30 10:32:08 2025 -0800
module: Fix kernel panic when a symbol st_shndx is out of bounds
[ Upstream commit f9d69d5e7bde2295eb7488a56f094ac8f5383b92 ]
The module loader doesn't check for bounds of the ELF section index in
simplify_symbols():
for (i = 1; i < symsec->sh_size / sizeof(Elf_Sym); i++) {
const char *name = info->strtab + sym[i].st_name;
switch (sym[i].st_shndx) {
case SHN_COMMON:
[...]
default:
/* Divert to percpu allocation if a percpu var. */
if (sym[i].st_shndx == info->index.pcpu)
secbase = (unsigned long)mod_percpu(mod);
else
/** HERE --> **/ secbase = info->sechdrs[sym[i].st_shndx].sh_addr;
sym[i].st_value += secbase;
break;
}
}
A symbol with an out-of-bounds st_shndx value, for example 0xffff
(known as SHN_XINDEX or SHN_HIRESERVE), may cause a kernel panic:
BUG: unable to handle page fault for address: ...
RIP: 0010:simplify_symbols+0x2b2/0x480
...
Kernel panic - not syncing: Fatal exception
This can happen when module ELF is legitimately using SHN_XINDEX or
when it is corrupted.
Add a bounds check in simplify_symbols() to validate that st_shndx is
within the valid range before using it.
This issue was discovered due to a bug in llvm-objcopy, see relevant
discussion for details [1].
[1] https://lore.kernel.org/linux-modules/20251224005752.201911-1-ihor.solodrai@linux.dev/
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qi Tang <tpluszz77@gmail.com>
Date: Wed Mar 18 14:48:47 2026 +0800
net/smc: fix double-free of smc_spd_priv when tee() duplicates splice pipe buffer
[ Upstream commit 24dd586bb4cbba1889a50abe74143817a095c1c9 ]
smc_rx_splice() allocates one smc_spd_priv per pipe_buffer and stores
the pointer in pipe_buffer.private. The pipe_buf_operations for these
buffers used .get = generic_pipe_buf_get, which only increments the page
reference count when tee(2) duplicates a pipe buffer. The smc_spd_priv
pointer itself was not handled, so after tee() both the original and the
cloned pipe_buffer share the same smc_spd_priv *.
When both pipes are subsequently released, smc_rx_pipe_buf_release() is
called twice against the same object:
1st call: kfree(priv) sock_put(sk) smc_rx_update_cons() [correct]
2nd call: kfree(priv) sock_put(sk) smc_rx_update_cons() [UAF]
KASAN reports a slab-use-after-free in smc_rx_pipe_buf_release(), which
then escalates to a NULL-pointer dereference and kernel panic via
smc_rx_update_consumer() when it chases the freed priv->smc pointer:
BUG: KASAN: slab-use-after-free in smc_rx_pipe_buf_release+0x78/0x2a0
Read of size 8 at addr ffff888004a45740 by task smc_splice_tee_/74
Call Trace:
<TASK>
dump_stack_lvl+0x53/0x70
print_report+0xce/0x650
kasan_report+0xc6/0x100
smc_rx_pipe_buf_release+0x78/0x2a0
free_pipe_info+0xd4/0x130
pipe_release+0x142/0x160
__fput+0x1c6/0x490
__x64_sys_close+0x4f/0x90
do_syscall_64+0xa6/0x1a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
BUG: kernel NULL pointer dereference, address: 0000000000000020
RIP: 0010:smc_rx_update_consumer+0x8d/0x350
Call Trace:
<TASK>
smc_rx_pipe_buf_release+0x121/0x2a0
free_pipe_info+0xd4/0x130
pipe_release+0x142/0x160
__fput+0x1c6/0x490
__x64_sys_close+0x4f/0x90
do_syscall_64+0xa6/0x1a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Kernel panic - not syncing: Fatal exception
Beyond the memory-safety problem, duplicating an SMC splice buffer is
semantically questionable: smc_rx_update_cons() would advance the
consumer cursor twice for the same data, corrupting receive-window
accounting. A refcount on smc_spd_priv could fix the double-free, but
the cursor-accounting issue would still need to be addressed separately.
The .get callback is invoked by both tee(2) and splice_pipe_to_pipe()
for partial transfers; both will now return -EFAULT. Users who need
to duplicate SMC socket data must use a copy-based read path.
Fixes: 9014db202cb7 ("smc: add support for splice()")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
Link: https://patch.msgid.link/20260318064847.23341-1-tpluszz77@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qingfang Deng <dqfext@gmail.com>
Date: Fri Mar 20 17:43:15 2026 +0800
net: airoha: add RCU lock around dev_fill_forward_path
[ Upstream commit 1065913dedfd3a8269816835bfe810b6e2c28579 ]
Since 0417adf367a0 ("ppp: fix race conditions in ppp_fill_forward_path")
dev_fill_forward_path() should be called with RCU read lock held. This
fix was applied to net, while the Airoha flowtable commit was applied to
net-next, so it hadn't been an issue until net was merged into net-next.
Fixes: a8bdd935d1dd ("net: airoha: Add wlan flowtable TX offload")
Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260320094315.525126-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Chen <justin.chen@broadcom.com>
Date: Thu Mar 19 16:48:13 2026 -0700
net: bcmasp: fix double disable of clk
[ Upstream commit 27dfe9030acbc601c260b42ecdbb4e5858a97b53 ]
Switch to devm_clk_get_optional() so we can manage the clock ourselves.
We dynamically control the clocks depending on the state of the interface
for power savings. The default state is clock disabled, so unbinding the
driver causes a double disable.
Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260319234813.1937315-3-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Chen <justin.chen@broadcom.com>
Date: Thu Mar 19 16:48:12 2026 -0700
net: bcmasp: fix double free of WoL irq
[ Upstream commit cbfa5be2bf64511d49b854a0f9fd6d0b5118621f ]
We do not need to free wol_irq since it was instantiated with
devm_request_irq(). So devres will free for us.
Fixes: a2f0751206b0 ("net: bcmasp: Add support for WoL magic packet")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260319234813.1937315-2-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Chen <justin.chen@broadcom.com>
Date: Thu Jan 22 11:49:49 2026 -0800
net: bcmasp: streamline early exit in probe
[ Upstream commit 1fd1281250c38408d793863c8dcaa43c7de8932c ]
Streamline the bcmasp_probe early exit. As support for other
functionality is added(i.e. ptp), it is easier to keep track of early
exit cleanup when it is all in one place.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260122194949.1145107-3-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: cbfa5be2bf64 ("net: bcmasp: fix double free of WoL irq")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wei Fang <wei.fang@nxp.com>
Date: Fri Mar 20 17:42:22 2026 +0800
net: enetc: fix the output issue of 'ethtool --show-ring'
[ Upstream commit 70b439bf06f6a12e491f827fa81a9887a11501f9 ]
Currently, enetc_get_ringparam() only provides rx_pending and tx_pending,
but 'ethtool --show-ring' no longer displays these fields. Because the
ringparam retrieval path has moved to the new netlink interface, where
rings_fill_reply() emits the *x_pending only if the *x_max_pending values
are non-zero. So rx_max_pending and tx_max_pending to are added to
enetc_get_ringparam() to fix the issue.
Note that the maximum tx/rx ring size of hardware is 64K, but we haven't
added set_ringparam() to make the ring size configurable. To avoid users
mistakenly believing that the ring size can be increased, so set
the *x_max_pending to priv->*x_bd_count.
Fixes: e4a1717b677c ("ethtool: provide ring sizes with RINGS_GET request")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260320094222.706339-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yochai Eisenrich <echelonh@gmail.com>
Date: Thu Mar 19 22:06:10 2026 +0200
net: fix fanout UAF in packet_release() via NETDEV_UP race
[ Upstream commit 42156f93d123436f2a27c468f18c966b7e5db796 ]
`packet_release()` has a race window where `NETDEV_UP` can re-register a
socket into a fanout group's `arr[]` array. The re-registration is not
cleaned up by `fanout_release()`, leaving a dangling pointer in the fanout
array.
`packet_release()` does NOT zero `po->num` in its `bind_lock` section.
After releasing `bind_lock`, `po->num` is still non-zero and `po->ifindex`
still matches the bound device. A concurrent `packet_notifier(NETDEV_UP)`
that already found the socket in `sklist` can re-register the hook.
For fanout sockets, this re-registration calls `__fanout_link(sk, po)`
which adds the socket back into `f->arr[]` and increments `f->num_members`,
but does NOT increment `f->sk_ref`.
The fix sets `po->num` to zero in `packet_release` while `bind_lock` is
held to prevent NETDEV_UP from linking, preventing the race window.
This bug was found following an additional audit with Claude Code based
on CVE-2025-38617.
Fixes: ce06b03e60fc ("packet: Add helpers to register/unregister ->prot_hook")
Link: https://blog.calif.io/p/a-race-within-a-race-exploiting-cve
Signed-off-by: Yochai Eisenrich <echelonh@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260319200610.25101-1-echelonh@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thangaraj Samynathan <thangaraj.s@microchip.com>
Date: Mon Mar 23 12:23:45 2026 +0530
net: lan743x: fix duplex configuration in mac_link_up
[ Upstream commit 71399707876b93240f236f48b8062f3423a5fe97 ]
The driver does not explicitly configure the MAC duplex mode when
bringing the link up. As a result, the MAC may retain a stale duplex
setting from a previous link state, leading to duplex mismatches with
the link partner and degraded network performance.
Update lan743x_phylink_mac_link_up() to set or clear the MAC_CR_DPX_
bit according to the negotiated duplex mode.
This ensures the MAC configuration is consistent with the phylink
resolved state.
Fixes: a5f199a8d8a03 ("net: lan743x: Migrate phylib to phylink")
Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20260323065345.144915-1-thangaraj.s@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kevin Hao <haokexin@gmail.com>
Date: Wed Mar 18 14:36:58 2026 +0800
net: macb: Move devm_{free,request}_irq() out of spin lock area
commit 317e49358ebbf6390fa439ef3c142f9239dd25fb upstream.
The devm_free_irq() and devm_request_irq() functions should not be
executed in an atomic context.
During device suspend, all userspace processes and most kernel threads
are frozen. Additionally, we flush all tx/rx status, disable all macb
interrupts, and halt rx operations. Therefore, it is safe to split the
region protected by bp->lock into two independent sections, allowing
devm_free_irq() and devm_request_irq() to run in a non-atomic context.
This modification resolves the following lockdep warning:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:591
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 501, name: rtcwake
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 0
7 locks held by rtcwake/501:
#0: ffff0008038c3408 (sb_writers#5){.+.+}-{0:0}, at: vfs_write+0xf8/0x368
#1: ffff0008049a5e88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0xbc/0x1c8
#2: ffff00080098d588 (kn->active#70){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xcc/0x1c8
#3: ffff800081c84888 (system_transition_mutex){+.+.}-{4:4}, at: pm_suspend+0x1ec/0x290
#4: ffff0008009ba0f8 (&dev->mutex){....}-{4:4}, at: device_suspend+0x118/0x4f0
#5: ffff800081d00458 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
#6: ffff0008031fb9e0 (&bp->lock){-.-.}-{3:3}, at: macb_suspend+0x144/0x558
irq event stamp: 8682
hardirqs last enabled at (8681): [<ffff8000813c7d7c>] _raw_spin_unlock_irqrestore+0x44/0x88
hardirqs last disabled at (8682): [<ffff8000813c7b58>] _raw_spin_lock_irqsave+0x38/0x98
softirqs last enabled at (7322): [<ffff8000800f1b4c>] handle_softirqs+0x52c/0x588
softirqs last disabled at (7317): [<ffff800080010310>] __do_softirq+0x20/0x2c
CPU: 1 UID: 0 PID: 501 Comm: rtcwake Not tainted 7.0.0-rc3-next-20260310-yocto-standard+ #125 PREEMPT
Hardware name: ZynqMP ZCU102 Rev1.1 (DT)
Call trace:
show_stack+0x24/0x38 (C)
__dump_stack+0x28/0x38
dump_stack_lvl+0x64/0x88
dump_stack+0x18/0x24
__might_resched+0x200/0x218
__might_sleep+0x38/0x98
__mutex_lock_common+0x7c/0x1378
mutex_lock_nested+0x38/0x50
free_irq+0x68/0x2b0
devm_irq_release+0x24/0x38
devres_release+0x40/0x80
devm_free_irq+0x48/0x88
macb_suspend+0x298/0x558
device_suspend+0x218/0x4f0
dpm_suspend+0x244/0x3a0
dpm_suspend_start+0x50/0x78
suspend_devices_and_enter+0xec/0x560
pm_suspend+0x194/0x290
state_store+0x110/0x158
kobj_attr_store+0x1c/0x30
sysfs_kf_write+0xa8/0xd0
kernfs_fop_write_iter+0x11c/0x1c8
vfs_write+0x248/0x368
ksys_write+0x7c/0xf8
__arm64_sys_write+0x28/0x40
invoke_syscall+0x4c/0xe8
el0_svc_common+0x98/0xf0
do_el0_svc+0x28/0x40
el0_svc+0x54/0x1e0
el0t_64_sync_handler+0x84/0x130
el0t_64_sync+0x198/0x1a0
Fixes: 558e35ccfe95 ("net: macb: WoL support for GEM type of Ethernet controller")
Cc: stable@vger.kernel.org
Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://patch.msgid.link/20260318-macb-irq-v2-1-f1179768ab24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kevin Hao <haokexin@gmail.com>
Date: Wed Mar 18 14:36:59 2026 +0800
net: macb: Protect access to net_device::ip_ptr with RCU lock
commit baa35a698cea26930679a20a7550bbb4c8319725 upstream.
Access to net_device::ip_ptr and its associated members must be
protected by an RCU lock. Since we are modifying this piece of code,
let's also move it to execute only when WAKE_ARP is enabled.
To minimize the duration of the RCU lock, a local variable is used to
temporarily store the IP address. This change resolves the following
RCU check warning:
WARNING: suspicious RCU usage
7.0.0-rc3-next-20260310-yocto-standard+ #122 Not tainted
-----------------------------
drivers/net/ethernet/cadence/macb_main.c:5944 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
5 locks held by rtcwake/518:
#0: ffff000803ab1408 (sb_writers#5){.+.+}-{0:0}, at: vfs_write+0xf8/0x368
#1: ffff0008090bf088 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0xbc/0x1c8
#2: ffff00080098d588 (kn->active#70){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xcc/0x1c8
#3: ffff800081c84888 (system_transition_mutex){+.+.}-{4:4}, at: pm_suspend+0x1ec/0x290
#4: ffff0008009ba0f8 (&dev->mutex){....}-{4:4}, at: device_suspend+0x118/0x4f0
stack backtrace:
CPU: 3 UID: 0 PID: 518 Comm: rtcwake Not tainted 7.0.0-rc3-next-20260310-yocto-standard+ #122 PREEMPT
Hardware name: ZynqMP ZCU102 Rev1.1 (DT)
Call trace:
show_stack+0x24/0x38 (C)
__dump_stack+0x28/0x38
dump_stack_lvl+0x64/0x88
dump_stack+0x18/0x24
lockdep_rcu_suspicious+0x134/0x1d8
macb_suspend+0xd8/0x4c0
device_suspend+0x218/0x4f0
dpm_suspend+0x244/0x3a0
dpm_suspend_start+0x50/0x78
suspend_devices_and_enter+0xec/0x560
pm_suspend+0x194/0x290
state_store+0x110/0x158
kobj_attr_store+0x1c/0x30
sysfs_kf_write+0xa8/0xd0
kernfs_fop_write_iter+0x11c/0x1c8
vfs_write+0x248/0x368
ksys_write+0x7c/0xf8
__arm64_sys_write+0x28/0x40
invoke_syscall+0x4c/0xe8
el0_svc_common+0x98/0xf0
do_el0_svc+0x28/0x40
el0_svc+0x54/0x1e0
el0t_64_sync_handler+0x84/0x130
el0t_64_sync+0x198/0x1a0
Fixes: 0cb8de39a776 ("net: macb: Add ARP support to WOL")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com>
Link: https://patch.msgid.link/20260318-macb-irq-v2-2-f1179768ab24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kevin Hao <haokexin@gmail.com>
Date: Sat Mar 21 22:04:41 2026 +0800
net: macb: Use dev_consume_skb_any() to free TX SKBs
commit 647b8a2fe474474704110db6bd07f7a139e621eb upstream.
The napi_consume_skb() function is not intended to be called in an IRQ
disabled context. However, after commit 6bc8a5098bf4 ("net: macb: Fix
tx_ptr_lock locking"), the freeing of TX SKBs is performed with IRQs
disabled. To resolve the following call trace, use dev_consume_skb_any()
for freeing TX SKBs:
WARNING: kernel/softirq.c:430 at __local_bh_enable_ip+0x174/0x188, CPU#0: ksoftirqd/0/15
Modules linked in:
CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 7.0.0-rc4-next-20260319-yocto-standard-dirty #37 PREEMPT
Hardware name: ZynqMP ZCU102 Rev1.1 (DT)
pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __local_bh_enable_ip+0x174/0x188
lr : local_bh_enable+0x24/0x38
sp : ffff800082b3bb10
x29: ffff800082b3bb10 x28: ffff0008031f3c00 x27: 000000000011ede0
x26: ffff000800a7ff00 x25: ffff800083937ce8 x24: 0000000000017a80
x23: ffff000803243a78 x22: 0000000000000040 x21: 0000000000000000
x20: ffff000800394c80 x19: 0000000000000200 x18: 0000000000000001
x17: 0000000000000001 x16: ffff000803240000 x15: 0000000000000000
x14: ffffffffffffffff x13: 0000000000000028 x12: ffff000800395650
x11: ffff8000821d1528 x10: ffff800081c2bc08 x9 : ffff800081c1e258
x8 : 0000000100000301 x7 : ffff8000810426ec x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000001 x3 : 0000000000000000
x2 : 0000000000000008 x1 : 0000000000000200 x0 : ffff8000810428dc
Call trace:
__local_bh_enable_ip+0x174/0x188 (P)
local_bh_enable+0x24/0x38
skb_attempt_defer_free+0x190/0x1d8
napi_consume_skb+0x58/0x108
macb_tx_poll+0x1a4/0x558
__napi_poll+0x50/0x198
net_rx_action+0x1f4/0x3d8
handle_softirqs+0x16c/0x560
run_ksoftirqd+0x44/0x80
smpboot_thread_fn+0x1d8/0x338
kthread+0x120/0x150
ret_from_fork+0x10/0x20
irq event stamp: 29751
hardirqs last enabled at (29750): [<ffff8000813be184>] _raw_spin_unlock_irqrestore+0x44/0x88
hardirqs last disabled at (29751): [<ffff8000813bdf60>] _raw_spin_lock_irqsave+0x38/0x98
softirqs last enabled at (29150): [<ffff8000800f1aec>] handle_softirqs+0x504/0x560
softirqs last disabled at (29153): [<ffff8000800f2fec>] run_ksoftirqd+0x44/0x80
Fixes: 6bc8a5098bf4 ("net: macb: Fix tx_ptr_lock locking")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260321-macb-tx-v1-1-b383a58dd4e6@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Paolo Valerio <pvalerio@redhat.com>
Date: Mon Mar 23 20:16:34 2026 +0100
net: macb: use the current queue number for stats
[ Upstream commit 72d96e4e24bbefdcfbc68bdb9341a05d8f5cb6e5 ]
There's a potential mismatch between the memory reserved for statistics
and the amount of memory written.
gem_get_sset_count() correctly computes the number of stats based on the
active queues, whereas gem_get_ethtool_stats() indiscriminately copies
data using the maximum number of queues, and in the case the number of
active queues is less than MACB_MAX_QUEUES, this results in a OOB write
as observed in the KASAN splat.
==================================================================
BUG: KASAN: vmalloc-out-of-bounds in gem_get_ethtool_stats+0x54/0x78
[macb]
Write of size 760 at addr ffff80008080b000 by task ethtool/1027
CPU: [...]
Tainted: [E]=UNSIGNED_MODULE
Hardware name: raspberrypi rpi/rpi, BIOS 2025.10 10/01/2025
Call trace:
show_stack+0x20/0x38 (C)
dump_stack_lvl+0x80/0xf8
print_report+0x384/0x5e0
kasan_report+0xa0/0xf0
kasan_check_range+0xe8/0x190
__asan_memcpy+0x54/0x98
gem_get_ethtool_stats+0x54/0x78 [macb
926c13f3af83b0c6fe64badb21ec87d5e93fcf65]
dev_ethtool+0x1220/0x38c0
dev_ioctl+0x4ac/0xca8
sock_do_ioctl+0x170/0x1d8
sock_ioctl+0x484/0x5d8
__arm64_sys_ioctl+0x12c/0x1b8
invoke_syscall+0xd4/0x258
el0_svc_common.constprop.0+0xb4/0x240
do_el0_svc+0x48/0x68
el0_svc+0x40/0xf8
el0t_64_sync_handler+0xa0/0xe8
el0t_64_sync+0x1b0/0x1b8
The buggy address belongs to a 1-page vmalloc region starting at
0xffff80008080b000 allocated at dev_ethtool+0x11f0/0x38c0
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff00000a333000 pfn:0xa333
flags: 0x7fffc000000000(node=0|zone=0|lastcpupid=0x1ffff)
raw: 007fffc000000000 0000000000000000 dead000000000122 0000000000000000
raw: ffff00000a333000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff80008080b080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff80008080b100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff80008080b180: 00 00 00 00 00 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
^
ffff80008080b200: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
ffff80008080b280: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================
Fix it by making sure the copied size only considers the active number of
queues.
Fixes: 512286bbd4b7 ("net: macb: Added some queue statistics")
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260323191634.2185840-1-pvalerio@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Wed Mar 18 16:55:51 2026 +0100
net: openvswitch: Avoid releasing netdev before teardown completes
[ Upstream commit 7c770dadfda5cbbde6aa3c4363ed513f1d212bf8 ]
The patch cited in the Fixes tag below changed the teardown code for
OVS ports to no longer unconditionally take the RTNL. After this change,
the netdev_destroy() callback can proceed immediately to the call_rcu()
invocation if the IFF_OVS_DATAPATH flag is already cleared on the
netdev.
The ovs_netdev_detach_dev() function clears the flag before completing
the unregistration, and if it gets preempted after clearing the flag (as
can happen on an -rt kernel), netdev_destroy() can complete and the
device can be freed before the unregistration completes. This leads to a
splat like:
[ 998.393867] Oops: general protection fault, probably for non-canonical address 0xff00000001000239: 0000 [#1] SMP PTI
[ 998.393877] CPU: 42 UID: 0 PID: 55177 Comm: ip Kdump: loaded Not tainted 6.12.0-211.1.1.el10_2.x86_64+rt #1 PREEMPT_RT
[ 998.393886] Hardware name: Dell Inc. PowerEdge R740/0JMK61, BIOS 2.24.0 03/27/2025
[ 998.393889] RIP: 0010:dev_set_promiscuity+0x8d/0xa0
[ 998.393901] Code: 00 00 75 d8 48 8b 53 08 48 83 ba b0 02 00 00 00 75 ca 48 83 c4 08 5b c3 cc cc cc cc 48 83 bf 48 09 00 00 00 75 91 48 8b 47 08 <48> 83 b8 b0 02 00 00 00 74 97 eb 81 0f 1f 80 00 00 00 00 90 90 90
[ 998.393906] RSP: 0018:ffffce5864a5f6a0 EFLAGS: 00010246
[ 998.393912] RAX: ff00000000ffff89 RBX: ffff894d0adf5a05 RCX: 0000000000000000
[ 998.393917] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: ffff894d0adf5a05
[ 998.393921] RBP: ffff894d19252000 R08: ffff894d19252000 R09: 0000000000000000
[ 998.393924] R10: ffff894d19252000 R11: ffff894d192521b8 R12: 0000000000000006
[ 998.393927] R13: ffffce5864a5f738 R14: 00000000ffffffe2 R15: 0000000000000000
[ 998.393931] FS: 00007fad61971800(0000) GS:ffff894cc0140000(0000) knlGS:0000000000000000
[ 998.393936] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 998.393940] CR2: 000055df0a2a6e40 CR3: 000000011c7fe003 CR4: 00000000007726f0
[ 998.393944] PKRU: 55555554
[ 998.393946] Call Trace:
[ 998.393949] <TASK>
[ 998.393952] ? show_trace_log_lvl+0x1b0/0x2f0
[ 998.393961] ? show_trace_log_lvl+0x1b0/0x2f0
[ 998.393975] ? dp_device_event+0x41/0x80 [openvswitch]
[ 998.394009] ? __die_body.cold+0x8/0x12
[ 998.394016] ? die_addr+0x3c/0x60
[ 998.394027] ? exc_general_protection+0x16d/0x390
[ 998.394042] ? asm_exc_general_protection+0x26/0x30
[ 998.394058] ? dev_set_promiscuity+0x8d/0xa0
[ 998.394066] ? ovs_netdev_detach_dev+0x3a/0x80 [openvswitch]
[ 998.394092] dp_device_event+0x41/0x80 [openvswitch]
[ 998.394102] notifier_call_chain+0x5a/0xd0
[ 998.394106] unregister_netdevice_many_notify+0x51b/0xa60
[ 998.394110] rtnl_dellink+0x169/0x3e0
[ 998.394121] ? rt_mutex_slowlock.constprop.0+0x95/0xd0
[ 998.394125] rtnetlink_rcv_msg+0x142/0x3f0
[ 998.394128] ? avc_has_perm_noaudit+0x69/0xf0
[ 998.394130] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 998.394132] netlink_rcv_skb+0x50/0x100
[ 998.394138] netlink_unicast+0x292/0x3f0
[ 998.394141] netlink_sendmsg+0x21b/0x470
[ 998.394145] ____sys_sendmsg+0x39d/0x3d0
[ 998.394149] ___sys_sendmsg+0x9a/0xe0
[ 998.394156] __sys_sendmsg+0x7a/0xd0
[ 998.394160] do_syscall_64+0x7f/0x170
[ 998.394162] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 998.394165] RIP: 0033:0x7fad61bf4724
[ 998.394188] Code: 89 02 b8 ff ff ff ff eb bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 80 3d c5 e9 0c 00 00 74 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 89 54 24 1c 48 89
[ 998.394189] RSP: 002b:00007ffd7e2f7cb8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
[ 998.394191] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fad61bf4724
[ 998.394193] RDX: 0000000000000000 RSI: 00007ffd7e2f7d20 RDI: 0000000000000003
[ 998.394194] RBP: 00007ffd7e2f7d90 R08: 0000000000000010 R09: 000000000000003f
[ 998.394195] R10: 000055df11558010 R11: 0000000000000202 R12: 00007ffd7e2f8380
[ 998.394196] R13: 0000000069b233d7 R14: 000055df0a256040 R15: 0000000000000000
[ 998.394200] </TASK>
To fix this, reorder the operations in ovs_netdev_detach_dev() to only
clear the flag after completing the other operations, and introduce an
smp_wmb() to make the ordering requirement explicit. The smp_wmb() is
paired with a full smp_mb() in netdev_destroy() to make sure the
call_rcu() invocation does not happen before the unregister operations
are visible.
Reported-by: Minxi Hou <mhou@redhat.com>
Tested-by: Minxi Hou <mhou@redhat.com>
Fixes: 549822767630 ("net: openvswitch: Avoid needlessly taking the RTNL on vport destroy")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260318155554.1133405-1-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Valentin Spreckels <valentin@spreckels.dev>
Date: Thu Feb 26 20:54:09 2026 +0100
net: usb: r8152: add TRENDnet TUC-ET2G
[ Upstream commit 15fba71533bcdfaa8eeba69a5a5a2927afdf664a ]
The TRENDnet TUC-ET2G is a RTL8156 based usb ethernet adapter. Add its
vendor and product IDs.
Signed-off-by: Valentin Spreckels <valentin@spreckels.dev>
Link: https://patch.msgid.link/20260226195409.7891-2-valentin@spreckels.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Carlier <devnexen@gmail.com>
Date: Wed Mar 25 14:11:08 2026 +0100
netfilter: ctnetlink: use netlink policy range checks
[ Upstream commit 8f15b5071b4548b0aafc03b366eb45c9c6566704 ]
Replace manual range and mask validations with netlink policy
annotations in ctnetlink code paths, so that the netlink core rejects
invalid values early and can generate extack errors.
- CTA_PROTOINFO_TCP_STATE: reject values > TCP_CONNTRACK_SYN_SENT2 at
policy level, removing the manual >= TCP_CONNTRACK_MAX check.
- CTA_PROTOINFO_TCP_WSCALE_ORIGINAL/REPLY: reject values > TCP_MAX_WSCALE
(14). The normal TCP option parsing path already clamps to this value,
but the ctnetlink path accepted 0-255, causing undefined behavior when
used as a u32 shift count.
- CTA_FILTER_ORIG_FLAGS/REPLY_FLAGS: use NLA_POLICY_MASK with
CTA_FILTER_F_ALL, removing the manual mask checks.
- CTA_EXPECT_FLAGS: use NLA_POLICY_MASK with NF_CT_EXPECT_MASK, adding
a new mask define grouping all valid expect flags.
Extracted from a broader nf-next patch by Florian Westphal, scoped to
ctnetlink for the fixes tree.
Fixes: c8e2078cfe41 ("[NETFILTER]: ctnetlink: add support for internal tcp connection tracking flags handling")
Signed-off-by: David Carlier <devnexen@gmail.com>
Co-developed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ren Wei <n05ec@lzu.edu.cn>
Date: Wed Mar 25 14:11:00 2026 +0100
netfilter: ip6t_rt: reject oversized addrnr in rt_mt6_check()
[ Upstream commit 9d3f027327c2fa265f7f85ead41294792c3296ed ]
Reject rt match rules whose addrnr exceeds IP6T_RT_HOPS.
rt_mt6() expects addrnr to stay within the bounds of rtinfo->addrs[].
Validate addrnr during rule installation so malformed rules are rejected
before the match logic can use an out-of-range value.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Co-developed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Yuan Tan <yuantan098@gmail.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Tested-by: Yuhang Zheng <z1652074432@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed Mar 25 14:11:06 2026 +0100
netfilter: nf_conntrack_expect: skip expectations in other netns via proc
[ Upstream commit 3db5647984de03d9cae0dcddb509b058351f0ee4 ]
Skip expectations that do not reside in this netns.
Similar to e77e6ff502ea ("netfilter: conntrack: do not dump other netns's
conntrack entries via proc").
Fixes: 9b03f38d0487 ("netfilter: netns nf_conntrack: per-netns expectations")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Wed Mar 25 14:11:07 2026 +0100
netfilter: nf_conntrack_sip: fix use of uninitialized rtp_addr in process_sdp
[ Upstream commit 6a2b724460cb67caed500c508c2ae5cf012e4db4 ]
process_sdp() declares union nf_inet_addr rtp_addr on the stack and
passes it to the nf_nat_sip sdp_session hook after walking the SDP
media descriptions. However rtp_addr is only initialized inside the
media loop when a recognized media type with a non-zero port is found.
If the SDP body contains no m= lines, only inactive media sections
(m=audio 0 ...) or only unrecognized media types, rtp_addr is never
assigned. Despite that, the function still calls hooks->sdp_session()
with &rtp_addr, causing nf_nat_sdp_session() to format the stale stack
value as an IP address and rewrite the SDP session owner and connection
lines with it.
With CONFIG_INIT_STACK_ALL_ZERO (default on most distributions) this
results in the session-level o= and c= addresses being rewritten to
0.0.0.0 for inactive SDP sessions. Without stack auto-init the
rewritten address is whatever happened to be on the stack.
Fix this by pre-initializing rtp_addr from the session-level connection
address (caddr) when available, and tracking via a have_rtp_addr flag
whether any valid address was established. Skip the sdp_session hook
entirely when no valid address exists.
Fixes: 4ab9e64e5e3c ("[NETFILTER]: nf_nat_sip: split up SDP mangling")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Wed Mar 25 14:10:58 2026 +0100
netfilter: nfnetlink_log: fix uninitialized padding leak in NFULA_PAYLOAD
[ Upstream commit 52025ebaa29f4eb4ed8bf92ce83a68f24ab7fdf7 ]
__build_packet_message() manually constructs the NFULA_PAYLOAD netlink
attribute using skb_put() and skb_copy_bits(), bypassing the standard
nla_reserve()/nla_put() helpers. While nla_total_size(data_len) bytes
are allocated (including NLA alignment padding), only data_len bytes
of actual packet data are copied. The trailing nla_padlen(data_len)
bytes (1-3 when data_len is not 4-byte aligned) are never initialized,
leaking stale heap contents to userspace via the NFLOG netlink socket.
Replace the manual attribute construction with nla_reserve(), which
handles the tailroom check, header setup, and padding zeroing via
__nla_reserve(). The subsequent skb_copy_bits() fills in the payload
data on top of the properly initialized attribute.
Fixes: df6fb868d611 ("[NETFILTER]: nfnetlink: convert to generic netlink attribute functions")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed Mar 25 14:11:01 2026 +0100
netfilter: nft_set_rbtree: revisit array resize logic
[ Upstream commit fafdd92b9e30fe057740c5bb5cd4f92ecea9bf26 ]
Chris Arges reports high memory consumption with thousands of
containers, this patch revisits the array allocation logic.
For anonymous sets, start by 16 slots (which takes 256 bytes on x86_64).
Expand it by x2 until threshold of 512 slots is reached, over that
threshold, expand it by x1.5.
For non-anonymous set, start by 1024 slots in the array (which takes 16
Kbytes initially on x86_64). Expand it by x1.5.
Use set->ndeact to subtract deactivated elements when calculating the
number of the slots in the array, otherwise the array size array gets
increased artifically. Add special case shrink logic to deal with flush
set too.
The shrink logic is skipped by anonymous sets.
Use check_add_overflow() to calculate the new array size.
Add a WARN_ON_ONCE check to make sure elements fit into the new array
size.
Reported-by: Chris Arges <carges@cloudflare.com>
Fixes: 7e43e0a1141d ("netfilter: nft_set_rbtree: translate rbtree to array for binary search")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Deepanshu Kartikey <kartikey406@gmail.com>
Date: Sat Mar 7 14:30:41 2026 +0530
netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
[ Upstream commit 67e467a11f62ff64ad219dc6aa5459e132c79d14 ]
When a process crashes and the kernel writes a core dump to a 9P
filesystem, __kernel_write() creates an ITER_KVEC iterator. This
iterator reaches netfs_limit_iter() via netfs_unbuffered_write(), which
only handles ITER_FOLIOQ, ITER_BVEC and ITER_XARRAY iterator types,
hitting the BUG() for any other type.
Fix this by adding netfs_limit_kvec() following the same pattern as
netfs_limit_bvec(), since both kvec and bvec are simple segment arrays
with pointer and length fields. Dispatch it from netfs_limit_iter() when
the iterator type is ITER_KVEC.
Fixes: cae932d3aee5 ("netfs: Add func to calculate pagecount/size-limited span of an iterator")
Reported-by: syzbot+9c058f0d63475adc97fd@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9c058f0d63475adc97fd
Tested-by: syzbot+9c058f0d63475adc97fd@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Link: https://patch.msgid.link/20260307090041.359870-1-kartikey406@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Deepanshu Kartikey <kartikey406@gmail.com>
Date: Sat Mar 7 10:09:47 2026 +0530
netfs: Fix NULL pointer dereference in netfs_unbuffered_write() on retry
[ Upstream commit e9075e420a1eb3b52c60f3b95893a55e77419ce8 ]
When a write subrequest is marked NETFS_SREQ_NEED_RETRY, the retry path
in netfs_unbuffered_write() unconditionally calls stream->prepare_write()
without checking if it is NULL.
Filesystems such as 9P do not set the prepare_write operation, so
stream->prepare_write remains NULL. When get_user_pages() fails with
-EFAULT and the subrequest is flagged for retry, this results in a NULL
pointer dereference at fs/netfs/direct_write.c:189.
Fix this by mirroring the pattern already used in write_retry.c: if
stream->prepare_write is NULL, skip renegotiation and directly reissue
the subrequest via netfs_reissue_write(), which handles iterator reset,
IN_PROGRESS flag, stats update and reissue internally.
Fixes: a0b4c7a49137 ("netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence")
Reported-by: syzbot+7227db0fbac9f348dba0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7227db0fbac9f348dba0
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Link: https://patch.msgid.link/20260307043947.347092-1-kartikey406@gmail.com
Tested-by: syzbot+7227db0fbac9f348dba0@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Wed Mar 18 15:38:58 2026 +0000
netfs: Fix read abandonment during retry
[ Upstream commit 7e57523490cd2efb52b1ea97f2e0a74c0fb634cd ]
Under certain circumstances, all the remaining subrequests from a read
request will get abandoned during retry. The abandonment process expects
the 'subreq' variable to be set to the place to start abandonment from, but
it doesn't always have a useful value (it will be uninitialised on the
first pass through the loop and it may point to a deleted subrequest on
later passes).
Fix the first jump to "abandon:" to set subreq to the start of the first
subrequest expected to need retry (which, in this abandonment case, turned
out unexpectedly to no longer have NEED_RETRY set).
Also clear the subreq pointer after discarding superfluous retryable
subrequests to cause an oops if we do try to access it.
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/3775287.1773848338@warthog.procyon.org.uk
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Wed Mar 25 08:20:17 2026 +0000
netfs: Fix the handling of stream->front by removing it
[ Upstream commit 0e764b9d46071668969410ec5429be0e2f38c6d3 ]
The netfs_io_stream::front member is meant to point to the subrequest
currently being collected on a stream, but it isn't actually used this way
by direct write (which mostly ignores it). However, there's a tracepoint
which looks at it. Further, stream->front is actually redundant with
stream->subrequests.next.
Fix the potential problem in the direct code by just removing the member
and using stream->subrequests.next instead, thereby also simplifying the
code.
Fixes: a0b4c7a49137 ("netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence")
Reported-by: Paulo Alcantara <pc@manguebit.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/4158599.1774426817@warthog.procyon.org.uk
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Tue Mar 17 12:33:34 2026 -0700
nfc: nci: fix circular locking dependency in nci_close_device
[ Upstream commit 4527025d440ce84bf56e75ce1df2e84cb8178616 ]
nci_close_device() flushes rx_wq and tx_wq while holding req_lock.
This causes a circular locking dependency because nci_rx_work()
running on rx_wq can end up taking req_lock too:
nci_rx_work -> nci_rx_data_packet -> nci_data_exchange_complete
-> __sk_destruct -> rawsock_destruct -> nfc_deactivate_target
-> nci_deactivate_target -> nci_request -> mutex_lock(&ndev->req_lock)
Move the flush of rx_wq after req_lock has been released.
This should safe (I think) because NCI_UP has already been cleared
and the transport is closed, so the work will see it and return
-ENETDOWN.
NIPA has been hitting this running the nci selftest with a debug
kernel on roughly 4% of the runs.
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation")
Reviewed-by: Ian Ray <ian.ray@gehealthcare.com>
Link: https://patch.msgid.link/20260317193334.988609-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Hodges <hodgesd@meta.com>
Date: Sat Jan 31 19:08:40 2026 -0800
nvme-fabrics: use kfree_sensitive() for DHCHAP secrets
[ Upstream commit 0a1fc2f301529ac75aec0ce80d5ab9d9e4dc4b16 ]
The DHCHAP secrets (dhchap_secret and dhchap_ctrl_secret) contain
authentication key material for NVMe-oF. Use kfree_sensitive() instead
of kfree() in nvmf_free_options() to ensure secrets are zeroed before
the memory is freed, preventing recovery from freed pages.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Daniel Hodges <hodgesd@meta.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Keith Busch <kbusch@kernel.org>
Date: Tue Feb 10 11:00:12 2026 -0800
nvme-pci: cap queue creation to used queues
[ Upstream commit 4735b510a00fb2d4ac9e8d21a8c9552cb281f585 ]
If the user reduces the special queue count at runtime and resets the
controller, we need to reduce the number of queues and interrupts
requested accordingly rather than start with the pre-allocated queue
count.
Tested-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Keith Busch <kbusch@kernel.org>
Date: Tue Feb 10 09:26:54 2026 -0800
nvme-pci: ensure we're polling a polled queue
[ Upstream commit 166e31d7dbf6aa44829b98aa446bda5c9580f12a ]
A user can change the polled queue count at run time. There's a brief
window during a reset where a hipri task may try to poll that queue
before the block layer has updated the queue maps, which would race with
the now interrupt driven queue and may cause double completions.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chaitanya Kulkarni <kch@nvidia.com>
Date: Wed Feb 25 20:30:03 2026 -0800
nvmet: move async event work off nvmet-wq
[ Upstream commit 2922e3507f6d5caa7f1d07f145e186fc6f317a4e ]
For target nvmet_ctrl_free() flushes ctrl->async_event_work.
If nvmet_ctrl_free() runs on nvmet-wq, the flush re-enters workqueue
completion for the same worker:-
A. Async event work queued on nvmet-wq (prior to disconnect):
nvmet_execute_async_event()
queue_work(nvmet_wq, &ctrl->async_event_work)
nvmet_add_async_event()
queue_work(nvmet_wq, &ctrl->async_event_work)
B. Full pre-work chain (RDMA CM path):
nvmet_rdma_cm_handler()
nvmet_rdma_queue_disconnect()
__nvmet_rdma_queue_disconnect()
queue_work(nvmet_wq, &queue->release_work)
process_one_work()
lock((wq_completion)nvmet-wq) <--------- 1st
nvmet_rdma_release_queue_work()
C. Recursive path (same worker):
nvmet_rdma_release_queue_work()
nvmet_rdma_free_queue()
nvmet_sq_destroy()
nvmet_ctrl_put()
nvmet_ctrl_free()
flush_work(&ctrl->async_event_work)
__flush_work()
touch_wq_lockdep_map()
lock((wq_completion)nvmet-wq) <--------- 2nd
Lockdep splat:
============================================
WARNING: possible recursive locking detected
6.19.0-rc3nvme+ #14 Tainted: G N
--------------------------------------------
kworker/u192:42/44933 is trying to acquire lock:
ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90
but task is already holding lock:
ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x53e/0x660
3 locks held by kworker/u192:42/44933:
#0: ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x53e/0x660
#1: ffffc9000e6cbe28 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x1c5/0x660
#2: ffffffff82d4db60 (rcu_read_lock){....}-{1:3}, at: __flush_work+0x62/0x530
Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma]
Call Trace:
__flush_work+0x268/0x530
nvmet_ctrl_free+0x140/0x310 [nvmet]
nvmet_cq_put+0x74/0x90 [nvmet]
nvmet_rdma_free_queue+0x23/0xe0 [nvmet_rdma]
nvmet_rdma_release_queue_work+0x19/0x50 [nvmet_rdma]
process_one_work+0x206/0x660
worker_thread+0x184/0x320
kthread+0x10c/0x240
ret_from_fork+0x319/0x390
Move async event work to a dedicated nvmet-aen-wq to avoid reentrant
flush on nvmet-wq.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date: Fri Mar 6 09:35:06 2026 -0800
objtool: Handle Clang RSP musical chairs
[ Upstream commit 7fdaa640c810cb42090a182c33f905bcc47a616a ]
For no apparent reason (possibly related to CONFIG_KMSAN), Clang can
randomly pass the value of RSP to other registers and then back again to
RSP. Handle that accordingly.
Fixes the following warnings:
drivers/input/misc/uinput.o: warning: objtool: uinput_str_to_user+0x165: undefined stack state
drivers/input/misc/uinput.o: warning: objtool: uinput_str_to_user+0x165: unknown CFA base reg -1
Reported-by: Arnd Bergmann <arnd@arndb.de>
Closes: https://lore.kernel.org/90956545-2066-46e3-b547-10c884582eb0@app.fastmail.com
Link: https://patch.msgid.link/240e6a172cc73292499334a3724d02ccb3247fc7.1772818491.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yang Yang <n05ec@lzu.edu.cn>
Date: Thu Mar 19 07:42:41 2026 +0000
openvswitch: defer tunnel netdev_put to RCU release
[ Upstream commit 6931d21f87bc6d657f145798fad0bf077b82486c ]
ovs_netdev_tunnel_destroy() may run after NETDEV_UNREGISTER already
detached the device. Dropping the netdev reference in destroy can race
with concurrent readers that still observe vport->dev.
Do not release vport->dev in ovs_netdev_tunnel_destroy(). Instead, let
vport_netdev_free() drop the reference from the RCU callback, matching
the non-tunnel destroy path and avoiding additional synchronization
under RTNL.
Fixes: a9020fde67a6 ("openvswitch: Move tunnel destroy function to oppenvswitch module.")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Tested-by: Ao Zhou <n05ec@lzu.edu.cn>
Co-developed-by: Yuan Tan <tanyuan98@outlook.com>
Signed-off-by: Yuan Tan <tanyuan98@outlook.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Yang Yang <n05ec@lzu.edu.cn>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20260319074241.3405262-1-n05ec@lzu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yang Yang <n05ec@lzu.edu.cn>
Date: Thu Mar 19 08:02:27 2026 +0000
openvswitch: validate MPLS set/set_masked payload length
[ Upstream commit 546b68ac893595877ffbd7751e5c55fd1c43ede6 ]
validate_set() accepted OVS_KEY_ATTR_MPLS as variable-sized payload for
SET/SET_MASKED actions. In action handling, OVS expects fixed-size
MPLS key data (struct ovs_key_mpls).
Use the already normalized key_len (masked case included) and reject
non-matching MPLS action key sizes.
Reject invalid MPLS action payload lengths early.
Fixes: fbdcdd78da7c ("Change in Openvswitch to support MPLS label depth of 3 in ingress direction")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Tested-by: Ao Zhou <n05ec@lzu.edu.cn>
Co-developed-by: Yuan Tan <tanyuan98@outlook.com>
Signed-off-by: Yuan Tan <tanyuan98@outlook.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Yang Yang <n05ec@lzu.edu.cn>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20260319080228.3423307-1-n05ec@lzu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amir Goldstein <amir73il@gmail.com>
Date: Sun Mar 8 12:02:21 2026 +0100
ovl: fix wrong detection of 32bit inode numbers
commit 53a7c171e9dd833f0a96b545adcb89bd57387239 upstream.
The implicit FILEID_INO32_GEN encoder was changed to be explicit,
so we need to fix the detection.
When mounting overlayfs with upperdir and lowerdir on different ext4
filesystems, the expected kmsg log is:
overlayfs: "xino" feature enabled using 32 upper inode bits.
But instead, since the regressing commit, the kmsg log was:
overlayfs: "xino" feature enabled using 2 upper inode bits.
Fixes: e21fc2038c1b9 ("exportfs: make ->encode_fh() a mandatory method for NFS export")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Fei Lv <feilv@asrmicro.com>
Date: Mon Jul 22 18:14:43 2024 +0800
ovl: make fsync after metadata copy-up opt-in mount option
commit 1f6ee9be92f8df85a8c9a5a78c20fd39c0c21a95 upstream.
Commit 7d6899fb69d25 ("ovl: fsync after metadata copy-up") was done to
fix durability of overlayfs copy up on an upper filesystem which does
not enforce ordering on storing of metadata changes (e.g. ubifs).
In an earlier revision of the regressing commit by Lei Lv, the metadata
fsync behavior was opt-in via a new "fsync=strict" mount option.
We were hoping that the opt-in mount option could be avoided, so the
change was only made to depend on metacopy=off, in the hope of not
hurting performance of metadata heavy workloads, which are more likely
to be using metacopy=on.
This hope was proven wrong by a performance regression report from Google
COS workload after upgrade to kernel 6.12.
This is an adaptation of Lei's original "fsync=strict" mount option
to the existing upstream code.
The new mount option is mutually exclusive with the "volatile" mount
option, so the latter is now an alias to the "fsync=volatile" mount
option.
Reported-by: Chenglong Tang <chenglongtang@google.com>
Closes: https://lore.kernel.org/linux-unionfs/CAOdxtTadAFH01Vui1FvWfcmQ8jH1O45owTzUcpYbNvBxnLeM7Q@mail.gmail.com/
Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxgKC1SgjMWre=fUb00v8rxtd6sQi-S+dxR8oDzAuiGu8g@mail.gmail.com/
Fixes: 7d6899fb69d25 ("ovl: fsync after metadata copy-up")
Depends: 50e638beb67e0 ("ovl: Use str_on_off() helper in ovl_show_options()")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Fei Lv <feilv@asrmicro.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Mon Mar 9 13:55:46 2026 +0100
perf: Make sure to use pmu_ctx->pmu for groups
[ Upstream commit 4b9ce671960627b2505b3f64742544ae9801df97 ]
Oliver reported that x86_pmu_del() ended up doing an out-of-bound memory access
when group_sched_in() fails and needs to roll back.
This *should* be handled by the transaction callbacks, but he found that when
the group leader is a software event, the transaction handlers of the wrong PMU
are used. Despite the move_group case in perf_event_open() and group_sched_in()
using pmu_ctx->pmu.
Turns out, inherit uses event->pmu to clone the events, effectively undoing the
move_group case for all inherited contexts. Fix this by also making inherit use
pmu_ctx->pmu, ensuring all inherited counters end up in the same pmu context.
Similarly, __perf_event_read() should use equally use pmu_ctx->pmu for the
group case.
Fixes: bd2756811766 ("perf: Rewrite core context handling")
Reported-by: Oliver Rosenberg <olrose55@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://patch.msgid.link/20260309133713.GB606826@noisy.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Abel Vesa <abel.vesa@oss.qualcomm.com>
Date: Thu Feb 19 13:11:48 2026 +0200
phy: qcom: qmp-ufs: Fix SM8650 PCS table for Gear 4
commit 81af9e40e2e4e1aa95f09fb34811760be6742c58 upstream.
According to internal documentation, on SM8650, when the PHY is configured
in Gear 4, the QPHY_V6_PCS_UFS_PLL_CNTL register needs to have the same
value as for Gear 5.
At the moment, there is no board that comes with a UFS 3.x device, so
this issue doesn't show up, but with the new Eliza SoC, which uses the
same init sequence as SM8650, on the MTP board, the link startup fails
with the current Gear 4 PCS table.
So fix that by moving the entry into the PCS generic table instead,
while keeping the value from Gear 5 configuration.
Cc: stable@vger.kernel.org # v6.10
Fixes: b9251e64a96f ("phy: qcom: qmp-ufs: update SM8650 tables for Gear 4 & 5")
Suggested-by: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK
Link: https://patch.msgid.link/20260219-phy-qcom-qmp-ufs-fix-sm8650-pcs-g4-table-v1-1-f136505b57f6@oss.qualcomm.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Thu Feb 12 18:39:19 2026 +0800
phy: ti: j721e-wiz: Fix device node reference leak in wiz_get_lane_phy_types()
[ Upstream commit 584b457f4166293bdfa50f930228e9fb91a38392 ]
The serdes device_node is obtained using of_get_child_by_name(),
which increments the reference count. However, it is never put,
leading to a reference leak.
Add the missing of_node_put() calls to ensure the reference count is
properly balanced.
Fixes: 7ae14cf581f2 ("phy: ti: j721e-wiz: Implement DisplayPort mode to the wiz driver")
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20260212-wiz-v2-1-6e8bd4cc7a4a@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Leonardo Scorcia <l.scorcia@gmail.com>
Date: Tue Mar 17 11:02:06 2026 +0000
pinctrl: mediatek: common: Fix probe failure for devices without EINT
[ Upstream commit 8f9f64c8f90dca07d3b9f1d7ce5d34ccd246c9dd ]
Some pinctrl devices like mt6397 or mt6392 don't support EINT at all, but
the mtk_eint_init function is always called and returns -ENODEV, which
then bubbles up and causes probe failure.
To address this only call mtk_eint_init if EINT pins are present.
Tested on Xiaomi Mi Smart Clock x04g (mt6392).
Fixes: e46df235b4e6 ("pinctrl: mediatek: refactor EINT related code for all MediaTek pinctrl can fit")
Signed-off-by: Luca Leonardo Scorcia <l.scorcia@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: Wed Feb 18 12:58:09 2026 -0800
pinctrl: renesas: rza1: Normalize return value of gpio_get()
[ Upstream commit fb22bb9701d48c4b0e81fe204c2f96a37a520568 ]
The GPIO .get() callback is expected to return 0 or 1 (or a negative
error code). Ensure that the value returned by rza1_gpio_get() is
normalized to the [0, 1] range.
Fixes: 86ef402d805d606a ("gpiolib: sanitize the return value of gpio_chip::get()")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/aZYnyl-Nf4S1U2yj@google.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Tue Jan 27 00:35:47 2026 +0800
pinctrl: renesas: rzt2h: Fix device node leak in rzt2h_gpio_register()
[ Upstream commit e825c79ef914bd55cf7c2476ddcfb2738eb689c3 ]
When calling of_parse_phandle_with_fixed_args(), the caller is
responsible for calling of_node_put() to release the device node
reference.
In rzt2h_gpio_register(), the driver fails to call of_node_put() to
release the reference in of_args.np, which causes a memory leak.
Add the missing of_node_put() call to fix the leak.
Fixes: 34d4d093077a ("pinctrl: renesas: Add support for RZ/T2H")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260127-rzt2h-v1-1-86472e7421b8@gmail.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amelie Delaunay <amelie.delaunay@foss.st.com>
Date: Tue Mar 17 11:06:54 2026 +0100
pinctrl: stm32: fix HDP driver dependency on GPIO_GENERIC
[ Upstream commit c8cfeb4b9dda2cdfce79519aee4aaff16310a7b6 ]
The HDP driver uses the generic GPIO chip API, but this configuration
may not be enabled.
Ensure it is enabled by selecting the appropriate option.
Fixes: 4bcff9c05b9d ("pinctrl: stm32: use new generic GPIO chip API")
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alok Tiwari <alok.a.tiwari@oracle.com>
Date: Tue Mar 10 06:01:35 2026 -0700
platform/olpc: olpc-xo175-ec: Fix overflow error message to print inlen
[ Upstream commit 2061f7b042f88d372cca79615f8425f3564c0b40 ]
The command length check validates inlen (> 5), but the error message
incorrectly printed resp_len. Print inlen so the log reflects the
actual command length.
Fixes: 0c3d931b3ab9e ("Platform: OLPC: Add XO-1.75 EC driver")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Acked-by: Lubomir Rintel <lkundrak@v3.sk>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260310130138.700687-1-alok.a.tiwari@oracle.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Krishna Chomal <krishna.chomal108@gmail.com>
Date: Mon Feb 16 12:50:03 2026 +0530
platform/x86: hp-wmi: Add Omen 16-wf0xxx fan and thermal support
[ Upstream commit 13fa3aaf02edaad9b41fc61d7f6326d2b6a4bf80 ]
The HP Omen 16-wf0xxx (board ID: 8BAB) has the same WMI interface as
other Victus S boards, but requires quirks for correctly switching
thermal profile (similar to HP Omen 16-wf1xxx, board ID: 8C78).
Add the DMI board name to victus_s_thermal_profile_boards[] table and
map it to omen_v1_thermal_params.
Testing on HP Omen 16-wf0xxx confirmed that platform profile is
registered successfully and fan RPMs are readable and controllable.
Suggested-by: Noah Provenzano <noahpro@gmail.com>
Tested-by: Juan Martin Morales <juanm4morales@gmail.com>
Reported-by: Juan Martin Morales <juanm4morales@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220639
Signed-off-by: Krishna Chomal <krishna.chomal108@gmail.com>
Link: https://patch.msgid.link/20260216072003.90151-1-krishna.chomal108@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Krishna Chomal <krishna.chomal108@gmail.com>
Date: Wed Feb 18 10:32:35 2026 +0530
platform/x86: hp-wmi: Add Omen 16-xd0xxx fan and thermal support
[ Upstream commit 3c99a545b372c77b5d39715968a141f523eccbf2 ]
The HP Omen 16-xd0xxx (board ID: 8BCD) has the same WMI interface as
other Victus S boards, but requires quirks for correctly switching
thermal profile (similar to HP Omen 16-wf1xxx, board ID: 8C78).
Add the DMI board name to victus_s_thermal_profile_boards[] table and
map it to omen_v1_thermal_params.
Testing on HP Omen 16-xd0xxx confirmed that platform profile is
registered successfully and fan RPMs are readable and controllable.
Tested-by: Varad Amol Pisale <varadpisale.work@gmail.com>
Signed-off-by: Krishna Chomal <krishna.chomal108@gmail.com>
Link: https://patch.msgid.link/20260218050235.94687-1-krishna.chomal108@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Metz <peter.metz@unarin.com>
Date: Thu Feb 12 23:46:27 2026 -0500
platform/x86: intel-hid: Add Dell 14 Plus 2-in-1 to dmi_vgbs_allow_list
[ Upstream commit 6b3fa0615cd8432148581de62a52f83847af3d70 ]
The Dell 14 Plus 2-in-1 (model DB04250) requires the VGBS allow list
entry to correctly enable the tablet mode switch. Without this, the
chassis state is not reported, and the hinge rotation only emits
unknown scancodes.
Verified on Dell 14 Plus 2-in-1 DB04250.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221090
Signed-off-by: Peter Metz <peter.metz@unarin.com>
Reviewed-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Link: https://patch.msgid.link/20260213044627.203638-1-peter.metz@unarin.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David McFarland <corngood@gmail.com>
Date: Thu Feb 5 19:16:24 2026 -0400
platform/x86: intel-hid: disable wakeup_mode during hibernation
[ Upstream commit e02ea3ae8ee40d5835a845884c7b161a27c10bcb ]
Add a freeze handler which clears wakeup_mode. This fixes aborted hibernation on
Dell Precision 3880.
Wakeup event detected during hibernation, rolling back
This system sends power button events during hibernation, even when triggered by
software.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218634
Fixes: 0c4cae1bc00d ("PM: hibernate: Avoid missing wakeup events during hibernation")
Signed-off-by: David McFarland <corngood@gmail.com>
Link: https://patch.msgid.link/20260205231629.1336348-1-corngood@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Leif Skunberg <diamondback@cohunt.app>
Date: Tue Feb 10 09:56:25 2026 +0100
platform/x86: intel-hid: Enable 5-button array on ThinkPad X1 Fold 16 Gen 1
[ Upstream commit b38d478dad79e61e8a65931021bdfd7a71741212 ]
The Lenovo ThinkPad X1 Fold 16 Gen 1 has physical volume up/down
buttons that are handled through the intel-hid 5-button array
interface. The firmware does not advertise 5-button array support via
HEBC, so the driver relies on a DMI allowlist to enable it.
Add the ThinkPad X1 Fold 16 Gen 1 to the button_array_table so the
volume buttons work out of the box.
Signed-off-by: Leif Skunberg <diamondback@cohunt.app>
Reviewed-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Link: https://patch.msgid.link/20260210085625.34380-1-diamondback@cohunt.app
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Li RongQing <lirongqing@baidu.com>
Date: Tue Mar 3 02:46:35 2026 -0500
platform/x86: ISST: Check HWP support before MSR access
[ Upstream commit 9f11d9b15efb5f77e810b6dfbeb01b4650a79eae ]
On some systems, HWP can be explicitly disabled in the BIOS settings
When HWP is disabled by firmware, the HWP CPUID bit is not set, and
attempting to read MSR_PM_ENABLE will result in a General Protection
(GP) fault.
unchecked MSR access error: RDMSR from 0x770 at rIP: 0xffffffffc33db92e (disable_dynamic_sst_features+0xe/0x50 [isst_tpmi_core])
Call Trace:
<TASK>
? ex_handler_msr+0xf6/0x150
? fixup_exception+0x1ad/0x340
? gp_try_fixup_and_notify+0x1e/0xb0
? exc_general_protection+0xc9/0x390
? terminate_walk+0x64/0x100
? asm_exc_general_protection+0x22/0x30
? disable_dynamic_sst_features+0xe/0x50 [isst_tpmi_core]
isst_if_def_ioctl+0xece/0x1050 [isst_tpmi_core]
? ioctl_has_perm.constprop.42+0xe0/0x130
isst_if_def_ioctl+0x10d/0x1a0 [isst_if_common]
__se_sys_ioctl+0x86/0xc0
do_syscall_64+0x8a/0x100
entry_SYSCALL_64_after_hwframe+0x78/0xe2
RIP: 0033:0x7f36eaef54a7
Add a check for X86_FEATURE_HWP before accessing the MSR. If HWP is
not available, return true safely.
Fixes: 12a7d2cb811d ("platform/x86: ISST: Add SST-CP support via TPMI")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20260303074635.2218-1-lirongqing@baidu.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date: Mon Mar 23 08:36:35 2026 -0700
platform/x86: ISST: Correct locked bit width
commit fbddf68d7b4e1e6da7a78dd7fbd8ec376536584a upstream.
SST-PP locked bit width is set to three bits. It should be only one bit.
Use SST_PP_LOCK_WIDTH define instead of SST_PP_LEVEL_WIDTH.
Fixes: ea009e4769fa ("platform/x86: ISST: Add SST-PP support via TPMI")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260323153635.3263828-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nathan Chancellor <nathan@kernel.org>
Date: Fri Mar 13 14:06:34 2026 -0700
platform/x86: lenovo: wmi-gamezone: Drop gz_chain_head
[ Upstream commit 5a3955f3602950d1888df743a5b1889e43b5cb60 ]
The gz_chain_head variable has been unused since the driver's initial
addition to the tree. Its use was eliminated between v3 and v4 during
development but due to the reference of gz_chain_head's wait_list
member, the compiler could not warn that it was unused.
After a (tip) commit ("locking/rwsem: Remove the list_head from struct
rw_semaphore"), which removed a reference to the variable passed to
__RWSEM_INITIALIZER(), certain configurations show an unused variable
warning from the Lenovo wmi-gamezone driver:
drivers/platform/x86/lenovo/wmi-gamezone.c:34:31: warning: 'gz_chain_head' defined but not used [-Wunused-variable]
34 | static BLOCKING_NOTIFIER_HEAD(gz_chain_head);
| ^~~~~~~~~~~~~
include/linux/notifier.h:119:39: note: in definition of macro 'BLOCKING_NOTIFIER_HEAD'
119 | struct blocking_notifier_head name = \
| ^~~~
Remove the variable to prevent the warning from showing up.
Fixes: 22024ac5366f ("platform/x86: Add Lenovo Gamezone WMI Driver")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://patch.msgid.link/20260313-lenovo-wmi-gamezone-remove-gz_chain_head-v1-1-ce5231f0c6fa@kernel.org
[ij: reorganized the changelog]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Antheas Kapenekakis <lkml@antheas.dev>
Date: Mon Feb 23 19:29:53 2026 +0100
platform/x86: oxpec: Add support for Aokzoe A2 Pro
[ Upstream commit cd0883055b04586770dab43c64159348bf480a3e ]
Aokzoe A2 Pro is an older device that the oxpec driver is missing the
quirk for. It has the same behavior as the AOKZOE A1 devices. Add a
quirk for it to the oxpec driver.
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Link: https://patch.msgid.link/20260223183004.2696892-5-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Antheas Kapenekakis <lkml@antheas.dev>
Date: Mon Feb 23 19:29:50 2026 +0100
platform/x86: oxpec: Add support for OneXPlayer APEX
[ Upstream commit 3385ea97c14d271dcb0c6e6fcf16972f819eecd8 ]
OneXPlayer Apex is a new Strix Halo handheld. It uses the same registers
as the OneXPlayer Fly devices. Add a quirk for it to the oxpec driver.
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Link: https://patch.msgid.link/20260223183004.2696892-2-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Antheas Kapenekakis <lkml@antheas.dev>
Date: Mon Feb 23 19:29:52 2026 +0100
platform/x86: oxpec: Add support for OneXPlayer X1 Air
[ Upstream commit 2a3b4a8c10a64a62c4243007139d253dc1324dfd ]
X1 Air is an X1 variant with a newer Intel chipset. It uses the same
registers as the X1. Add a quirk for it to the oxpec driver.
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Link: https://patch.msgid.link/20260223183004.2696892-4-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Antheas Kapenekakis <lkml@antheas.dev>
Date: Mon Feb 23 19:29:51 2026 +0100
platform/x86: oxpec: Add support for OneXPlayer X1z
[ Upstream commit 4049c46edb5d44c0de045f6f504371705dd603dd ]
X1z is a variant of OneXPlayer X1 A with 8840U. It seems that only one
user has this one. Add a quirk for it to the oxpec driver.
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Link: https://patch.msgid.link/20260223183004.2696892-3-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans de Goede <johannes.goede@oss.qualcomm.com>
Date: Tue Feb 17 14:23:46 2026 +0100
platform/x86: touchscreen_dmi: Add quirk for y-inverted Goodix touchscreen on SUPI S10
[ Upstream commit 7d87ed70fc95482c12edf9493c249b6413be485e ]
The touchscreen on the SUPI S10 tablet reports inverted Y coordinates,
causing touch input to be mirrored vertically relative to the display.
Add a quirk to set the "touchscreen-inverted-y" boolean device-property
on the touchscreen device, so that the goodix_ts driver will fixup
the coordinates.
Reported-by: Yajat Kumar <yajatapps3@gmail.com>
Closes: https://lore.kernel.org/linux-input/20251230221639.582406-1-yajatapps3@gmail.com/
Tested-by: Yajat Kumar <yajatapps3@gmail.com>
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Link: https://patch.msgid.link/20260217132346.34535-1-johannes.goede@oss.qualcomm.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alberto Garcia <berto@igalia.com>
Date: Mon Mar 9 18:39:41 2026 +0100
PM: hibernate: Drain trailing zero pages on userspace restore
[ Upstream commit 734eba62cd32cb9ceffa09e57cdc03d761528525 ]
Commit 005e8dddd497 ("PM: hibernate: don't store zero pages in the
image file") added an optimization to skip zero-filled pages in the
hibernation image. On restore, zero pages are handled internally by
snapshot_write_next() in a loop that processes them without returning
to the caller.
With the userspace restore interface, writing the last non-zero page
to /dev/snapshot is followed by the SNAPSHOT_ATOMIC_RESTORE ioctl. At
this point there are no more calls to snapshot_write_next() so any
trailing zero pages are not processed, snapshot_image_loaded() fails
because handle->cur is smaller than expected, the ioctl returns -EPERM
and the image is not restored.
The in-kernel restore path is not affected by this because the loop in
load_image() in swap.c calls snapshot_write_next() until it returns 0.
It is this final call that drains any trailing zero pages.
Fixed by calling snapshot_write_next() in snapshot_write_finalize(),
giving the kernel the chance to drain any trailing zero pages.
Fixes: 005e8dddd497 ("PM: hibernate: don't store zero pages in the image file")
Signed-off-by: Alberto Garcia <berto@igalia.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Link: https://patch.msgid.link/ef5a7c5e3e3dbd17dcb20efaa0c53a47a23498bb.1773075892.git.berto@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Youngjun Park <youngjun.park@lge.com>
Date: Sun Mar 22 21:05:28 2026 +0900
PM: sleep: Drop spurious WARN_ON() from pm_restore_gfp_mask()
[ Upstream commit a8d51efb5929ae308895455a3e496b5eca2cd143 ]
Commit 35e4a69b2003f ("PM: sleep: Allow pm_restrict_gfp_mask()
stacking") introduced refcount-based GFP mask management that warns
when pm_restore_gfp_mask() is called with saved_gfp_count == 0.
Some hibernation paths call pm_restore_gfp_mask() defensively where
the GFP mask may or may not be restricted depending on the execution
path. For example, the uswsusp interface invokes it in
SNAPSHOT_CREATE_IMAGE, SNAPSHOT_UNFREEZE, and snapshot_release().
Before the stacking change this was a silent no-op; it now triggers
a spurious WARNING.
Remove the WARN_ON() wrapper from the !saved_gfp_count check while
retaining the check itself, so that defensive calls remain harmless
without producing false warnings.
Fixes: 35e4a69b2003f ("PM: sleep: Allow pm_restrict_gfp_mask() stacking")
Signed-off-by: Youngjun Park <youngjun.park@lge.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260322120528.750178-1-youngjun.park@lge.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Tue Mar 3 23:40:25 2026 +0530
powerpc64/bpf: do not increment tailcall count when prog is NULL
commit 521bd39d9d28ce54cbfec7f9b89c94ad4fdb8350 upstream.
Do not increment tailcall count, if tailcall did not succeed due to
missing BPF program.
Fixes: ce0761419fae ("powerpc/bpf: Implement support for tail calls")
Cc: stable@vger.kernel.org
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260303181031.390073-2-hbathini@linux.ibm.com
[ Conflict due to missing feature commit 2ed2d8f6fb38 ("powerpc64/bpf:
Support tailcalls with subprogs") resolved accordingly. ]
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hari Bathini <hbathini@linux.ibm.com>
Date: Tue Jan 27 14:19:25 2026 +0530
powerpc64/ftrace: fix OOL stub count with clang
[ Upstream commit 875612a7745013a43c67493cb0583ee3f7476344 ]
The total number of out-of-line (OOL) stubs required for function
tracing is determined using the following command:
$(OBJDUMP) -r -j __patchable_function_entries vmlinux.o
While this works correctly with GNU objdump, llvm-objdump does not
list the expected relocation records for this section. Fix this by
using the -d option and counting R_PPC64_ADDR64 relocation entries.
This works as desired with both objdump and llvm-objdump.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260127084926.34497-3-hbathini@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yonatan Nachum <ynachum@amazon.com>
Date: Wed Dec 10 13:06:13 2025 +0000
RDMA/efa: Check stored completion CTX command ID with received one
[ Upstream commit 4b01ec0f133b3fe1038dc538d6bfcbd72462d2f0 ]
In admin command completion, we receive a CQE with the command ID which
is constructed from context index and entropy bits from the admin queue
producer counter. To try to detect memory corruptions in the received
CQE, validate the full command ID of the fetched context with the CQE
command ID. If there is a mismatch, complete the CQE with error.
Also use LSBs of the admin queue producer counter to better detect
entropy mismatch between smaller number of commands.
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20251210130614.36460-2-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Stable-dep-of: ef3b06742c8a ("RDMA/efa: Fix use of completion ctx after free")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ethan Tidmore <ethantidmore06@gmail.com>
Date: Fri Mar 13 23:57:30 2026 -0500
RDMA/efa: Fix possible deadlock
[ Upstream commit 0f2055db7b630559870afb40fc84490816ab8ec5 ]
In the error path for efa_com_alloc_comp_ctx() the semaphore assigned to
&aq->avail_cmds is not released.
Detected by Smatch:
drivers/infiniband/hw/efa/efa_com.c:662 efa_com_cmd_exec() warn:
inconsistent returns '&aq->avail_cmds'
Add release for &aq->avail_cmds in efa_com_alloc_comp_ctx() error path.
Fixes: ef3b06742c8a2 ("RDMA/efa: Fix use of completion ctx after free")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Link: https://patch.msgid.link/20260314045730.1143862-1-ethantidmore06@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yonatan Nachum <ynachum@amazon.com>
Date: Sun Mar 8 16:53:50 2026 +0000
RDMA/efa: Fix use of completion ctx after free
[ Upstream commit ef3b06742c8a201d0e83edc9a33a89a4fe3009f8 ]
On admin queue completion handling, if the admin command completed with
error we print data from the completion context. The issue is that we
already freed the completion context in polling/interrupts handler which
means we print data from context in an unknown state (it might be
already used again).
Change the admin submission flow so alloc/dealloc of the context will be
symmetric and dealloc will be called after any potential use of the
context.
Fixes: 68fb9f3e312a ("RDMA/efa: Remove redundant NULL pointer check of CQE")
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20260308165350.18219-1-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yonatan Nachum <ynachum@amazon.com>
Date: Wed Dec 10 13:06:14 2025 +0000
RDMA/efa: Improve admin completion context state machine
[ Upstream commit dab5825491f7b0ea92a09390f39df0a51100f12f ]
Add a new unused state to the admin completion contexts state machine
instead of the occupied field. This improves the completion validity
check because it now enforce the context to be in submitted state prior
to completing it. Also add allocated state as a intermediate state
between unused and submitted.
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20251210130614.36460-3-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Stable-dep-of: ef3b06742c8a ("RDMA/efa: Fix use of completion ctx after free")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Abhijit Gangurde <abhijit.gangurde@amd.com>
Date: Fri Feb 27 11:48:09 2026 +0530
RDMA/ionic: Preserve and set Ethernet source MAC after ib_ud_header_init()
commit a08aaf3968aec5d05cd32c801b8cc0c61da69c41 upstream.
ionic_build_hdr() populated the Ethernet source MAC (hdr->eth.smac_h) by
passing the header’s storage directly to rdma_read_gid_l2_fields().
However, ib_ud_header_init() is called after that and re-initializes the
UD header, which wipes the previously written smac_h. As a result, packets
are emitted with an zero source MAC address on the wire.
Correct the source MAC by reading the GID-derived smac into a temporary
buffer and copy it after ib_ud_header_init() completes.
Fixes: e8521822c733 ("RDMA/ionic: Register device ops for control path")
Cc: stable@vger.kernel.org # 6.18
Signed-off-by: Abhijit Gangurde <abhijit.gangurde@amd.com>
Link: https://patch.msgid.link/20260227061809.2979990-1-abhijit.gangurde@amd.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ivan Barrera <ivan.d.barrera@intel.com>
Date: Mon Mar 16 13:39:43 2026 -0500
RDMA/irdma: Clean up unnecessary dereference of event->cm_node
[ Upstream commit b415399c9a024d574b65479636f0d4eb625b9abd ]
The cm_node is available and the usage of cm_node and event->cm_node
seems arbitrary. Clean up unnecessary dereference of event->cm_node.
Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager")
Signed-off-by: Ivan Barrera <ivan.d.barrera@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Anil Samal <anil.samal@intel.com>
Date: Mon Mar 16 13:39:45 2026 -0500
RDMA/irdma: Fix deadlock during netdev reset with active connections
[ Upstream commit 6f52370970ac07d352a7af4089e55e0e6425f827 ]
Resolve deadlock that occurs when user executes netdev reset while RDMA
applications (e.g., rping) are active. The netdev reset causes ice
driver to remove irdma auxiliary driver, triggering device_delete and
subsequent client removal. During client removal, uverbs_client waits
for QP reference count to reach zero while cma_client holds the final
reference, creating circular dependency and indefinite wait in iWARP
mode. Skip QP reference count wait during device reset to prevent
deadlock.
Fixes: c8f304d75f6c ("RDMA/irdma: Prevent QP use after free")
Signed-off-by: Anil Samal <anil.samal@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shiraz Saleem <shiraz.saleem@intel.com>
Date: Mon Mar 16 13:39:47 2026 -0500
RDMA/irdma: Harden depth calculation functions
[ Upstream commit e37afcb56ae070477741fe2d6e61fc0c542cce2d ]
An issue was exposed where OS can pass in U32_MAX for SQ/RQ/SRQ size.
This can cause integer overflow and truncation of SQ/RQ/SRQ depth
returning a success when it should have failed.
Harden the functions to do all depth calculations and boundary
checking in u64 sizes.
Fixes: 563e1feb5f6e ("RDMA/irdma: Add SRQ support")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jacob Moroni <jmoroni@google.com>
Date: Mon Mar 16 13:39:38 2026 -0500
RDMA/irdma: Initialize free_qp completion before using it
[ Upstream commit 11a95521fb93c91e2d4ef9d53dc80ef0a755549b ]
In irdma_create_qp, if ib_copy_to_udata fails, it will call
irdma_destroy_qp to clean up which will attempt to wait on
the free_qp completion, which is not initialized yet. Fix this
by initializing the completion before the ib_copy_to_udata call.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Date: Mon Mar 16 13:39:42 2026 -0500
RDMA/irdma: Remove a NOP wait_event() in irdma_modify_qp_roce()
[ Upstream commit 5e8f0239731a83753473b7aa91bda67bbdff5053 ]
Remove a NOP wait_event() in irdma_modify_qp_roce() which is relevant
for iWARP and likely a copy and paste artifact for RoCEv2. The wait event
is for sending a reset on a TCP connection, after the reset has been
requested in irdma_modify_qp(), which occurs only in iWarp mode.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Date: Mon Mar 16 13:39:44 2026 -0500
RDMA/irdma: Remove reset check from irdma_modify_qp_to_err()
[ Upstream commit c45c6ebd693b944f1ffe429fdfb6cc1674c237be ]
During reset, irdma_modify_qp() to error should be called to disconnect
the QP. Without this fix, if not preceded by irdma_modify_qp() to error, the
API call irdma_destroy_qp() gets stuck waiting for the QP refcount to go
to zero, because the cm_node associated with this QP isn't disconnected.
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Date: Mon Mar 16 13:39:46 2026 -0500
RDMA/irdma: Return EINVAL for invalid arp index error
[ Upstream commit 7221f581eefa79ead06e171044f393fb7ee22f87 ]
When rdma_connect() fails due to an invalid arp index, user space rdma core
reports ENOMEM which is confusing. Modify irdma_make_cm_node() to return the
correct error code.
Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Date: Mon Mar 16 13:39:41 2026 -0500
RDMA/irdma: Update ibqp state to error if QP is already in error state
[ Upstream commit 8c1f19a2225cf37b3f8ab0b5a8a5322291cda620 ]
In irdma_modify_qp() update ibqp state to error if the irdma QP is already
in error state, otherwise the ibqp state which is visible to the consumer
app remains stale.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Fri Mar 13 15:41:58 2026 -0400
RDMA/rw: Fall back to direct SGE on MR pool exhaustion
[ Upstream commit 00da250c21b074ea9494c375d0117b69e5b1d0a4 ]
When IOMMU passthrough mode is active, ib_dma_map_sgtable_attrs()
produces no coalescing: each scatterlist page maps 1:1 to a DMA
entry, so sgt.nents equals the raw page count. A 1 MB transfer
yields 256 DMA entries. If that count exceeds the device's
max_sgl_rd threshold (an optimization hint from mlx5 firmware),
rdma_rw_io_needs_mr() steers the operation into the MR
registration path. Each such operation consumes one or more MRs
from a pool sized at max_rdma_ctxs -- roughly one MR per
concurrent context. Under write-intensive workloads that issue
many concurrent RDMA READs, the pool is rapidly exhausted,
ib_mr_pool_get() returns NULL, and rdma_rw_init_one_mr() returns
-EAGAIN. Upper layer protocols treat this as a fatal DMA mapping
failure and tear down the connection.
The max_sgl_rd check is a performance optimization, not a
correctness requirement: the device can handle large SGE counts
via direct posting, just less efficiently than with MR
registration. When the MR pool cannot satisfy a request, falling
back to the direct SGE (map_wrs) path avoids the connection
reset while preserving the MR optimization for the common case
where pool resources are available.
Add a fallback in rdma_rw_ctx_init() so that -EAGAIN from
rdma_rw_init_mr_wrs() triggers direct SGE posting instead of
propagating the error. iWARP devices, which mandate MR
registration for RDMA READs, and force_mr debug mode continue
to treat -EAGAIN as terminal.
Fixes: 00bd1439f464 ("RDMA/rw: Support threshold for registration vs scattering to local pages")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260313194201.5818-2-cel@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Mon Mar 2 19:43:31 2026 +0100
regmap: Synchronize cache for the page selector
[ Upstream commit 09e70e4f119ff650d24c96161fd2f62ac7e424b0 ]
If the selector register is represented in each page, its value
according to the debugfs is stale because it gets synchronized
only after the real page switch happens. Hence the regmap cache
initialisation from the HW inherits outdated data in the selector
register.
Synchronize cache for the page selector just in time.
Before (offset followed by hexdump, the first byte is selector):
// Real registers
18: 05 ff 00 00 ff 0f 00 00 f0 00 00 00
...
// Virtual (per port)
40: 05 ff 00 00 e0 e0 00 00 00 00 00 1f
50: 00 ff 00 00 e0 e0 00 00 00 00 00 1f
60: 01 ff 00 00 ff ff 00 00 00 00 00 00
70: 02 ff 00 00 cf f3 00 00 00 00 00 0c
80: 03 ff 00 00 00 00 00 00 00 00 00 ff
90: 04 ff 00 00 ff 0f 00 00 f0 00 00 00
After:
// Real registers
18: 05 ff 00 00 ff 0f 00 00 f0 00 00 00
...
// Virtual (per port)
40: 00 ff 00 00 e0 e0 00 00 00 00 00 1f
50: 01 ff 00 00 e0 e0 00 00 00 00 00 1f
60: 02 ff 00 00 ff ff 00 00 00 00 00 00
70: 03 ff 00 00 cf f3 00 00 00 00 00 0c
80: 04 ff 00 00 00 00 00 00 00 00 00 ff
90: 05 ff 00 00 ff 0f 00 00 f0 00 00 00
Fixes: 6863ca622759 ("regmap: Add support for register indirect addressing.")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260302184753.2693803-1-andriy.shevchenko@linux.intel.com
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mario Limonciello <mario.limonciello@amd.com>
Date: Thu Mar 26 14:05:38 2026 -0500
Revert "ALSA: hda/intel: Add MSI X870E Tomahawk to denylist"
commit ed4da361bf943b9041fc63e5cb6af01b3c0de978 upstream.
commit 30b3211aa2416 ("ALSA: hda/intel: Add MSI X870E Tomahawk
to denylist") was added to silence a warning, but this effectively
reintroduced commit df42ee7e22f03 ("ALSA: hda: Add ASRock
X670E Taichi to denylist") which was already reported to cause
problems and reverted in commit ee8f1613596ad ("Revert "ALSA: hda:
Add ASRock X670E Taichi to denylist"")
Revert it yet again.
Cc: stable@vger.kernel.org
Reported-by: Juhyun Song <juju6985@outlook.kr>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221274
Cc: Stuart Hayhurst <stuart.a.hayhurst@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20260326190542.524515-1-mario.limonciello@amd.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sabrina Dubroca <sd@queasysnail.net>
Date: Fri Mar 20 00:02:53 2026 +0100
rtnetlink: count IFLA_INFO_SLAVE_KIND in if_nlmsg_size
[ Upstream commit ee00a12593ffb69db4dd1a1c00ecb0253376874a ]
rtnl_link_get_slave_info_data_size counts IFLA_INFO_SLAVE_DATA, but
rtnl_link_slave_info_fill adds both IFLA_INFO_SLAVE_DATA and
IFLA_INFO_SLAVE_KIND.
Fixes: ba7d49b1f0f8 ("rtnetlink: provide api for getting and setting slave info")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/049843b532e23cde7ddba263c0bbe35ba6f0d26d.1773919462.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sabrina Dubroca <sd@queasysnail.net>
Date: Fri Mar 20 00:02:52 2026 +0100
rtnetlink: count IFLA_PARENT_DEV_{NAME,BUS_NAME} in if_nlmsg_size
[ Upstream commit 52501989c76206462d9b11a8485beef40ef41821 ]
Commit 00e77ed8e64d ("rtnetlink: add IFLA_PARENT_[DEV|DEV_BUS]_NAME")
added those attributes to rtnl_fill_ifinfo, but forgot to extend
if_nlmsg_size.
Fixes: 00e77ed8e64d ("rtnetlink: add IFLA_PARENT_[DEV|DEV_BUS]_NAME")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/0b849da95562af45487080528d60f578636aba5c.1773919462.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sabrina Dubroca <sd@queasysnail.net>
Date: Mon Mar 23 16:19:43 2026 +0100
rtnetlink: fix leak of SRCU struct in rtnl_link_register
[ Upstream commit 09474055f2619be9445ba4245e4013741ed01a5e ]
Commit 6b57ff21a310 ("rtnetlink: Protect link_ops by mutex.") swapped
the EEXIST check with the init_srcu_struct, but didn't add cleanup of
the SRCU struct we just allocated in case of error.
Fixes: 6b57ff21a310 ("rtnetlink: Protect link_ops by mutex.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/e77fe499f9a58c547b33b5212b3596dad417cec6.1774025341.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Benno Lossin <lossin@kernel.org>
Date: Mon Mar 2 15:04:15 2026 +0100
rust: pin-init: internal: init: document load-bearing fact of field accessors
commit 580cc37b1de4fcd9997c48d7080e744533f09f36 upstream.
The functions `[Pin]Init::__[pinned_]init` and `ptr::write` called from
the `init!` macro require the passed pointer to be aligned. This fact is
ensured by the creation of field accessors to previously initialized
fields.
Since we missed this very important fact from the beginning [1],
document it in the code.
Link: https://rust-for-linux.zulipchat.com/#narrow/channel/561532-pin-init/topic/initialized.20field.20accessor.20detection/with/576210658 [1]
Fixes: 90e53c5e70a6 ("rust: add pin-init API core")
Cc: <stable@vger.kernel.org> # 6.6.y, 6.12.y: 42415d163e5d: rust: pin-init: add references to previously initialized fields
Cc: <stable@vger.kernel.org> # 6.6.y, 6.12.y, 6.18.y, 6.19.y
Signed-off-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260302140424.4097655-2-lossin@kernel.org
[ Updated Cc: stable@ tags as discussed. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
[ Moved changes to the declarative macro, because 6.19.y and earlier do not
have `syn`. Also duplicated the comment for all field accessor creations.
- Benno ]
Signed-off-by: Benno Lossin <lossin@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alice Ryhl <aliceryhl@google.com>
Date: Tue Mar 24 10:49:59 2026 +0000
rust: regulator: do not assume that regulator_get() returns non-null
[ Upstream commit 8121353a4bf8e38afee26299419a78ec108e14a6 ]
The Rust `Regulator` abstraction uses `NonNull` to wrap the underlying
`struct regulator` pointer. When `CONFIG_REGULATOR` is disabled, the C
stub for `regulator_get` returns `NULL`. `from_err_ptr` does not treat
`NULL` as an error, so it was passed to `NonNull::new_unchecked`,
causing undefined behavior.
Fix this by using a raw pointer `*mut bindings::regulator` instead of
`NonNull`. This allows `inner` to be `NULL` when `CONFIG_REGULATOR` is
disabled, and leverages the C stubs which are designed to handle `NULL`
or are no-ops.
Fixes: 9b614ceada7c ("rust: regulator: add a bare minimum regulator abstraction")
Reported-by: Miguel Ojeda <ojeda@kernel.org>
Closes: https://lore.kernel.org/r/20260322193830.89324-1-ojeda@kernel.org
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://patch.msgid.link/20260324-regulator-fix-v1-1-a5244afa3c15@google.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasily Gorbik <gor@linux.ibm.com>
Date: Thu Mar 26 14:38:44 2026 +0100
s390/barrier: Make array_index_mask_nospec() __always_inline
commit c5c0a268b38adffbb2e70e6957017537ff54c157 upstream.
Mark array_index_mask_nospec() as __always_inline to guarantee the
mitigation is emitted inline regardless of compiler inlining decisions.
Fixes: e2dd833389cc ("s390: add optimized array_index_mask_nospec")
Cc: stable@kernel.org
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Vasily Gorbik <gor@linux.ibm.com>
Date: Thu Mar 26 19:50:14 2026 +0100
s390/entry: Scrub r12 register on kernel entry
commit 0738d395aab8fae3b5a3ad3fc640630c91693c27 upstream.
Before commit f33f2d4c7c80 ("s390/bp: remove TIF_ISOLATE_BP"),
all entry handlers loaded r12 with the current task pointer
(lg %r12,__LC_CURRENT) for use by the BPENTER/BPEXIT macros. That
commit removed TIF_ISOLATE_BP, dropping both the branch prediction
macros and the r12 load, but did not add r12 to the register clearing
sequence.
Add the missing xgr %r12,%r12 to make the register scrub consistent
across all entry points.
Fixes: f33f2d4c7c80 ("s390/bp: remove TIF_ISOLATE_BP")
Cc: stable@kernel.org
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Janosch Frank <frankja@linux.ibm.com>
Date: Wed Mar 4 10:18:37 2026 +0000
s390/mm: Add missing secure storage access fixups for donated memory
[ Upstream commit b00be77302d7ec4ad0367bb236494fce7172b730 ]
There are special cases where secure storage access exceptions happen
in a kernel context for pages that don't have the PG_arch_1 bit
set. That bit is set for non-exported guest secure storage (memory)
but is absent on storage donated to the Ultravisor since the kernel
isn't allowed to export donated pages.
Prior to this patch we would try to export the page by calling
arch_make_folio_accessible() which would instantly return since the
arch bit is absent signifying that the page was already exported and
no further action is necessary. This leads to secure storage access
exception loops which can never be resolved.
With this patch we unconditionally try to export and if that fails we
fixup.
Fixes: 084ea4d611a3 ("s390/mm: add (non)secure page access exceptions handlers")
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Tue Mar 24 17:34:05 2026 +0100
s390/syscalls: Add spectre boundary for syscall dispatch table
commit 48b8814e25d073dd84daf990a879a820bad2bcbd upstream.
The s390 syscall number is directly controlled by userspace, but does
not have an array_index_nospec() boundary to prevent access past the
syscall function pointer tables.
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Fixes: 56e62a737028 ("s390: convert to generic entry")
Cc: stable@kernel.org
Assisted-by: gkh_clanker_2000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/2026032404-sterling-swoosh-43e6@gregkh
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: zhidao su <soolaugust@gmail.com>
Date: Wed Mar 4 13:37:30 2026 +0800
sched_ext: Use WRITE_ONCE() for the write side of dsq->seq update
[ Upstream commit 7a8464555d2e5f038758bb19e72ab4710b79e9cd ]
bpf_iter_scx_dsq_new() reads dsq->seq via READ_ONCE() without holding
any lock, making dsq->seq a lock-free concurrently accessed variable.
However, dispatch_enqueue(), the sole writer of dsq->seq, uses a plain
increment without the matching WRITE_ONCE() on the write side:
dsq->seq++;
^^^^^^^^^^^
plain write -- KCSAN data race
The KCSAN documentation requires that if one accessor uses READ_ONCE()
or WRITE_ONCE() on a variable to annotate lock-free access, all other
accesses must also use the appropriate accessor. A plain write leaves
the pair incomplete and will trigger KCSAN warnings.
Fix by using WRITE_ONCE() for the write side of the update:
WRITE_ONCE(dsq->seq, dsq->seq + 1);
This is consistent with bpf_iter_scx_dsq_new() and makes the
concurrent access annotation complete and KCSAN-clean.
Signed-off-by: zhidao su <suzhidao@xiaomi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Fuchs <fuchsfl@gmail.com>
Date: Fri Feb 27 19:18:23 2026 +0100
scsi: devinfo: Add BLIST_SKIP_IO_HINTS for Iomega ZIP
[ Upstream commit 80bf3b28d32b431f84f244a8469488eb6d96afbb ]
The Iomega ZIP 100 (Z100P2) can't process IO Advice Hints Grouping mode
page query. It immediately switches to the status phase 0xb8 after
receiving the subpage code 0x05 of MODE_SENSE_10 command, which fails
imm_out() and turns into DID_ERROR of this command, which leads to unusable
device. This was tested with an Iomega ZIP 100 (Z100P2) connected with a
StarTech PEX1P2 AX99100 PCIe parallel port card.
Prior to this fix, Test Unit Ready fails and the drive can't be used:
IMM: returned SCSI status b8
sd 7:0:6:0: [sdh] Test Unit Ready failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Signed-off-by: Florian Fuchs <fuchsfl@gmail.com>
Link: https://patch.msgid.link/20260227181823.892932-1-fuchsfl@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tyllis Xu <livelycarpet87@gmail.com>
Date: Sat Mar 14 12:01:50 2026 -0500
scsi: ibmvfc: Fix OOB access in ibmvfc_discover_targets_done()
commit 61d099ac4a7a8fb11ebdb6e2ec8d77f38e77362f upstream.
A malicious or compromised VIO server can return a num_written value in the
discover targets MAD response that exceeds max_targets. This value is
stored directly in vhost->num_targets without validation, and is then used
as the loop bound in ibmvfc_alloc_targets() to index into disc_buf[], which
is only allocated for max_targets entries. Indices at or beyond max_targets
access kernel memory outside the DMA-coherent allocation. The
out-of-bounds data is subsequently embedded in Implicit Logout and PLOGI
MADs that are sent back to the VIO server, leaking kernel memory.
Fix by clamping num_written to max_targets before storing it.
Fixes: 072b91f9c651 ("[SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tyllis Xu <LivelyCarpet87@gmail.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://patch.msgid.link/20260314170151.548614-1-LivelyCarpet87@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ranjan Kumar <ranjan.kumar@broadcom.com>
Date: Wed Feb 25 13:56:22 2026 +0530
scsi: mpi3mr: Clear reset history on ready and recheck state after timeout
[ Upstream commit dbd53975ed4132d161b6a97ebe785a262380182d ]
The driver retains reset history even after the IOC has successfully
reached the READY state. That leaves stale reset information active during
normal operation and can mislead recovery and diagnostics. In addition, if
the IOC becomes READY just as the ready timeout loop exits, the driver
still follows the failure path and may retry or report failure incorrectly.
Clear reset history once READY is confirmed so driver state matches actual
IOC status. After the timeout loop, recheck the IOC state and treat READY
as success instead of failing.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://patch.msgid.link/20260225082622.82588-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yihang Li <liyihang9@huawei.com>
Date: Tue Mar 17 14:31:47 2026 +0800
scsi: scsi_transport_sas: Fix the maximum channel scanning issue
[ Upstream commit d71afa9deb4d413232ba16d693f7d43b321931b4 ]
After commit 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard
and multi-channel scans"), if the device supports multiple channels (0 to
shost->max_channel), user_scan() invokes updated sas_user_scan() to perform
the scan behavior for a specific transfer. However, when the user
specifies shost->max_channel, it will return -EINVAL, which is not
expected.
Fix and support specifying the scan shost->max_channel for scanning.
Fixes: 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard and multi-channel scans")
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20260317063147.2182562-1-liyihang9@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Mon Feb 23 16:44:59 2026 +0100
scsi: ses: Handle positive SCSI error from ses_recv_diag()
commit 7a9f448d44127217fabc4065c5ba070d4e0b5d37 upstream.
ses_recv_diag() can return a positive value, which also means that an
error happened, so do not only test for negative values.
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable <stable@kernel.org>
Assisted-by: gkh_clanker_2000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/2026022301-bony-overstock-a07f@gregkh
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Christian Brauner <brauner@kernel.org>
Date: Tue Mar 17 16:59:45 2026 +0100
selftests/mount_setattr: increase tmpfs size for idmapped mount tests
[ Upstream commit c465f5591aa84a6f85d66d152e28b92844a45d4f ]
The mount_setattr_idmapped fixture mounts a 2 MB tmpfs at /mnt and then
creates a 2 GB sparse ext4 image at /mnt/C/ext4.img. While ftruncate()
succeeds (sparse file), mkfs.ext4 needs to write actual metadata blocks
(inode tables, journal, bitmaps) which easily exceeds the 2 MB tmpfs
limit, causing ENOSPC and failing the fixture setup for all
mount_setattr_idmapped tests.
This was introduced by commit d37d4720c3e7 ("selftests/mount_settattr:
ensure that ext4 filesystem can be created") which increased the image
size from 2 MB to 2 GB but didn't adjust the tmpfs size.
Bump the tmpfs size to 256 MB which is sufficient for the ext4 metadata.
Fixes: d37d4720c3e7 ("selftests/mount_settattr: ensure that ext4 filesystem can be created")
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 17 00:37:15 2026 +0100
sh: platform_early: remove pdev->driver_override check
[ Upstream commit c5f60e3f07b6609562d21efda878e83ce8860728 ]
In commit 507fd01d5333 ("drivers: move the early platform device support to
arch/sh") platform_match() was copied over to the sh platform_early
code, accidentally including the driver_override check.
This check does not make sense for platform_early, as sysfs is not even
available in first place at this point in the boot process, hence remove
the check.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Fixes: 507fd01d5333 ("drivers: move the early platform device support to arch/sh")
Link: https://lore.kernel.org/all/DH4M3DJ4P58T.1BGVAVXN71Z09@kernel.org/
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com>
Date: Mon Mar 9 16:37:03 2026 +0100
spi: intel-pci: Add support for Nova Lake mobile SPI flash
[ Upstream commit 85b731ad4bbf6eb3fedf267ab00be3596f148432 ]
Add Intel Nova Lake PCD-H SPI serial flash PCI ID to the list of
supported devices.
Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://patch.msgid.link/20260309153703.74282-1-alan.borzeszkowski@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Sun Mar 22 21:29:56 2026 +0800
spi: meson-spicc: Fix double-put in remove path
[ Upstream commit 63542bb402b7013171c9f621c28b609eda4dbf1f ]
meson_spicc_probe() registers the controller with
devm_spi_register_controller(), so teardown already drops the
controller reference via devm cleanup.
Calling spi_controller_put() again in meson_spicc_remove()
causes a double-put.
Fixes: 8311ee2164c5 ("spi: meson-spicc: fix memory leak in meson_spicc_remove")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260322-rockchip-v1-1-fac3f0c6dad8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Thu Mar 19 00:12:34 2026 +0800
spi: sn-f-ospi: Fix resource leak in f_ospi_probe()
[ Upstream commit ef3d549e1deb3466c61f3b01d22fc3fe3e5efb08 ]
In f_ospi_probe(), when num_cs validation fails, it returns without
calling spi_controller_put() on the SPI controller, which causes a
resource leak.
Use devm_spi_alloc_host() instead of spi_alloc_host() to ensure the
SPI controller is properly freed when probe fails.
Fixes: 1b74dd64c861 ("spi: Add Socionext F_OSPI SPI flash controller driver")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260319-sn-f-v1-1-33a6738d2da8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Thu Mar 19 19:38:12 2026 +0100
spi: spi-fsl-lpspi: fix teardown order issue (UAF)
[ Upstream commit b341c1176f2e001b3adf0b47154fc31589f7410e ]
There is a teardown order issue in the driver. The SPI controller is
registered using devm_spi_register_controller(), which delays
unregistration of the SPI controller until after the fsl_lpspi_remove()
function returns.
As the fsl_lpspi_remove() function synchronously tears down the DMA
channels, a running SPI transfer triggers the following NULL pointer
dereference due to use after free:
| fsl_lpspi 42550000.spi: I/O Error in DMA RX
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[...]
| Call trace:
| fsl_lpspi_dma_transfer+0x260/0x340 [spi_fsl_lpspi]
| fsl_lpspi_transfer_one+0x198/0x448 [spi_fsl_lpspi]
| spi_transfer_one_message+0x49c/0x7c8
| __spi_pump_transfer_message+0x120/0x420
| __spi_sync+0x2c4/0x520
| spi_sync+0x34/0x60
| spidev_message+0x20c/0x378 [spidev]
| spidev_ioctl+0x398/0x750 [spidev]
[...]
Switch from devm_spi_register_controller() to spi_register_controller() in
fsl_lpspi_probe() and add the corresponding spi_unregister_controller() in
fsl_lpspi_remove().
Fixes: 5314987de5e5 ("spi: imx: add lpspi bus driver")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://patch.msgid.link/20260319-spi-fsl-lpspi-fixes-v1-1-b433e435b2d8@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 24 01:59:15 2026 +0100
spi: use generic driver_override infrastructure
[ Upstream commit cc34d77dd48708d810c12bfd6f5bf03304f6c824 ]
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]
Also note that we do not enable the driver_override feature of struct
bus_type, as SPI - in contrast to most other buses - passes "" to
sysfs_emit() when the driver_override pointer is NULL. Thus, printing
"\n" instead of "(null)\n".
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 5039563e7c25 ("spi: Add driver_override SPI device attribute")
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260324005919.2408620-12-dakr@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marc Buerg <buermarc@googlemail.com>
Date: Wed Mar 25 23:29:50 2026 +0100
sysctl: fix uninitialized variable in proc_do_large_bitmap
[ Upstream commit f63a9df7e3f9f842945d292a19d9938924f066f9 ]
proc_do_large_bitmap() does not initialize variable c, which is expected
to be set to a trailing character by proc_get_long().
However, proc_get_long() only sets c when the input buffer contains a
trailing character after the parsed value.
If c is not initialized it may happen to contain a '-'. If this is the
case proc_do_large_bitmap() expects to be able to parse a second part of
the input buffer. If there is no second part an unjustified -EINVAL will
be returned.
Initialize c to 0 to prevent returning -EINVAL on valid input.
Fixes: 9f977fb7ae9d ("sysctl: add proc_do_large_bitmap")
Signed-off-by: Marc Buerg <buermarc@googlemail.com>
Reviewed-by: Joel Granados <joel.granados@kernel.org>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiayuan Chen <jiayuan.chen@shopee.com>
Date: Fri Mar 20 15:21:26 2026 +0800
team: fix header_ops type confusion with non-Ethernet ports
[ Upstream commit 425000dbf17373a4ab8be9428f5dc055ef870a56 ]
Similar to commit 950803f72547 ("bonding: fix type confusion in
bond_setup_by_slave()") team has the same class of header_ops type
confusion.
For non-Ethernet ports, team_setup_by_port() copies port_dev->header_ops
directly. When the team device later calls dev_hard_header() or
dev_parse_header(), these callbacks can run with the team net_device
instead of the real lower device, so netdev_priv(dev) is interpreted as
the wrong private type and can crash.
The syzbot report shows a crash in bond_header_create(), but the root
cause is in team: the topology is gre -> bond -> team, and team calls
the inherited header_ops with its own net_device instead of the lower
device, so bond_header_create() receives a team device and interprets
netdev_priv() as bonding private data, causing a type confusion crash.
Fix this by introducing team header_ops wrappers for create/parse,
selecting a team port under RCU, and calling the lower device callbacks
with port->dev, so each callback always sees the correct net_device
context.
Also pass the selected lower device to the lower parse callback, so
recursion is bounded in stacked non-Ethernet topologies and parse
callbacks always run with the correct device context.
Fixes: 1d76efe1577b ("team: add support for non-ethernet devices")
Reported-by: syzbot+3d8bc31c45e11450f24c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69b46af7.050a0220.36eb34.000e.GAE@google.com/T/
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://patch.msgid.link/20260320072139.134249-2-jiayuan.chen@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date: Tue Mar 24 10:23:46 2026 -0700
thermal: intel: int340x: soc_slider: Set offset only for balanced mode
commit 7dfe9846016b15816e287a4650be1ff1b48c5ab4 upstream.
The slider offset can be set via debugfs for balanced mode. The offset
should be only applicable in balanced mode. For other modes, it should
be 0 when writing to MMIO offset,
Fixes: 8306bcaba06d ("thermal: intel: int340x: Add module parameter to change slider offset")
Tested-by: Erin Park <erin.park@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 6.18+ <stable@vger.kernel.org> # 6.18+
[ rjw: Subject and changelog tweaks ]
Link: https://patch.msgid.link/20260324172346.3317145-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Tue Mar 24 08:53:23 2026 -0400
tls: Purge async_hold in tls_decrypt_async_wait()
[ Upstream commit 84a8335d8300576f1b377ae24abca1d9f197807f ]
The async_hold queue pins encrypted input skbs while
the AEAD engine references their scatterlist data. Once
tls_decrypt_async_wait() returns, every AEAD operation
has completed and the engine no longer references those
skbs, so they can be freed unconditionally.
A subsequent patch adds batch async decryption to
tls_sw_read_sock(), introducing a new call site that
must drain pending AEAD operations and release held
skbs. Move __skb_queue_purge(&ctx->async_hold) into
tls_decrypt_async_wait() so the purge is centralized
and every caller -- recvmsg's drain path, the -EBUSY
fallback in tls_do_decryption(), and the new read_sock
batch path -- releases held skbs on synchronization
without each site managing the purge independently.
This fixes a leak when tls_strp_msg_hold() fails part-way through,
after having added some cloned skbs to the async_hold
queue. tls_decrypt_sg() will then call tls_decrypt_async_wait() to
process all pending decrypts, and drop back to synchronous mode, but
tls_sw_recvmsg() only flushes the async_hold queue when one record has
been processed in "fully-async" mode, which may not be the case here.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Fixes: b8a6ff84abbc ("tls: wait for pending async decryptions if tls_strp_msg_hold fails")
Link: https://patch.msgid.link/20260324-tls-read-sock-v5-1-5408befe5774@oracle.com
[pabeni@redhat.com: added leak comment]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luo Haiyang <luo.haiyang@zte.com.cn>
Date: Thu Mar 26 14:19:53 2026 +0800
tracing: Fix potential deadlock in cpu hotplug with osnoise
commit 1f9885732248d22f788e4992c739a98c88ab8a55 upstream.
The following sequence may leads deadlock in cpu hotplug:
task1 task2 task3
----- ----- -----
mutex_lock(&interface_lock)
[CPU GOING OFFLINE]
cpus_write_lock();
osnoise_cpu_die();
kthread_stop(task3);
wait_for_completion();
osnoise_sleep();
mutex_lock(&interface_lock);
cpus_read_lock();
[DEAD LOCK]
Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock).
Cc: stable@vger.kernel.org
Cc: <mathieu.desnoyers@efficios.com>
Cc: <zhang.run@zte.com.cn>
Cc: <yang.tao172@zte.com.cn>
Cc: <ran.xiaokai@zte.com.cn>
Fixes: bce29ac9ce0bb ("trace: Add osnoise tracer")
Link: https://patch.msgid.link/20260326141953414bVSj33dAYktqp9Oiyizq8@zte.com.cn
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Luo Haiyang <luo.haiyang@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xuewen Yan <xuewen.yan@unisoc.com>
Date: Fri Mar 6 15:59:54 2026 +0800
tracing: Revert "tracing: Remove pid in task_rename tracing output"
[ Upstream commit a6f22e50c7d51aa225c392c62c33f0fae11f734d ]
This reverts commit e3f6a42272e028c46695acc83fc7d7c42f2750ad.
The commit says that the tracepoint only deals with the current task,
however the following case is not current task:
comm_write() {
p = get_proc_task(inode);
if (!p)
return -ESRCH;
if (same_thread_group(current, p))
set_task_comm(p, buffer);
}
where set_task_comm() calls __set_task_comm() which records
the update of p and not current.
So revert the patch to show pid.
Cc: <mhiramat@kernel.org>
Cc: <mathieu.desnoyers@efficios.com>
Cc: <elver@google.com>
Cc: <kees@kernel.org>
Link: https://patch.msgid.link/20260306075954.4533-1-xuewen.yan@unisoc.com
Fixes: e3f6a42272e0 ("tracing: Remove pid in task_rename tracing output")
Reported-by: Guohua Yan <guohua.yan@unisoc.com>
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Martin KaFai Lau <martin.lau@kernel.org>
Date: Thu Mar 19 11:18:17 2026 -0700
udp: Fix wildcard bind conflict check when using hash2
[ Upstream commit e537dd15d0d4ad989d56a1021290f0c674dd8b28 ]
When binding a udp_sock to a local address and port, UDP uses
two hashes (udptable->hash and udptable->hash2) for collision
detection. The current code switches to "hash2" when
hslot->count > 10.
"hash2" is keyed by local address and local port.
"hash" is keyed by local port only.
The issue can be shown in the following bind sequence (pseudo code):
bind(fd1, "[fd00::1]:8888")
bind(fd2, "[fd00::2]:8888")
bind(fd3, "[fd00::3]:8888")
bind(fd4, "[fd00::4]:8888")
bind(fd5, "[fd00::5]:8888")
bind(fd6, "[fd00::6]:8888")
bind(fd7, "[fd00::7]:8888")
bind(fd8, "[fd00::8]:8888")
bind(fd9, "[fd00::9]:8888")
bind(fd10, "[fd00::10]:8888")
/* Correctly return -EADDRINUSE because "hash" is used
* instead of "hash2". udp_lib_lport_inuse() detects the
* conflict.
*/
bind(fail_fd, "[::]:8888")
/* After one more socket is bound to "[fd00::11]:8888",
* hslot->count exceeds 10 and "hash2" is used instead.
*/
bind(fd11, "[fd00::11]:8888")
bind(fail_fd, "[::]:8888") /* succeeds unexpectedly */
The same issue applies to the IPv4 wildcard address "0.0.0.0"
and the IPv4-mapped wildcard address "::ffff:0.0.0.0". For
example, if there are existing sockets bound to
"192.168.1.[1-11]:8888", then binding "0.0.0.0:8888" or
"[::ffff:0.0.0.0]:8888" can also miss the conflict when
hslot->count > 10.
TCP inet_csk_get_port() already has the correct check in
inet_use_bhash2_on_bind(). Rename it to
inet_use_hash2_on_bind() and move it to inet_hashtables.h
so udp.c can reuse it in this fix.
Fixes: 30fff9231fad ("udp: bind() optimisation")
Reported-by: Andrew Onyshchuk <oandrew@meta.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260319181817.1901357-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Wed Oct 29 14:24:57 2025 +0100
unwind_user/x86: Fix arch=um build
commit aa7387e79a5cff0585cd1b9091944142a06872b6 upstream.
Add CONFIG_HAVE_UNWIND_USER_FP guards to make sure this code
doesn't break arch=um builds.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/oe-kbuild-all/202510291919.FFGyU7nq-lkp@intel.com/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jie Deng <dengjie03@kylinos.cn>
Date: Fri Feb 27 16:49:31 2026 +0800
usb: core: new quirk to handle devices with zero configurations
[ Upstream commit 9f6a983cfa22ac662c86e60816d3a357d4b551e9 ]
Some USB devices incorrectly report bNumConfigurations as 0 in their
device descriptor, which causes the USB core to reject them during
enumeration.
logs:
usb 1-2: device descriptor read/64, error -71
usb 1-2: no configurations
usb 1-2: can't read configurations, error -22
However, these devices actually work correctly when
treated as having a single configuration.
Add a new quirk USB_QUIRK_FORCE_ONE_CONFIG to handle such devices.
When this quirk is set, assume the device has 1 configuration instead
of failing with -EINVAL.
This quirk is applied to the device with VID:PID 5131:2007 which
exhibits this behavior.
Signed-off-by: Jie Deng <dengjie03@kylinos.cn>
Link: https://patch.msgid.link/20260227084931.1527461-1-dengjie03@kylinos.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zubin Mithra <zsm@google.com>
Date: Wed Mar 18 13:40:13 2026 +0000
virt: tdx-guest: Fix handling of host controlled 'quote' buffer length
commit c3fd16c3b98ed726294feab2f94f876290bf7b61 upstream.
Validate host controlled value `quote_buf->out_len` that determines how
many bytes of the quote are copied out to guest userspace. In TDX
environments with remote attestation, quotes are not considered private,
and can be forwarded to an attestation server.
Catch scenarios where the host specifies a response length larger than
the guest's allocation, or otherwise races modifying the response while
the guest consumes it.
This prevents contents beyond the pages allocated for `quote_buf`
(up to TSM_REPORT_OUTBLOB_MAX) from being read out to guest userspace,
and possibly forwarded in attestation requests.
Recall that some deployments want per-container configs-tsm-report
interfaces, so the leak may cross container protection boundaries, not
just local root.
Fixes: f4738f56d1dc ("virt: tdx-guest: Add Quote generation support using TSM_REPORTS")
Cc: stable@vger.kernel.org
Signed-off-by: Zubin Mithra <zsm@google.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date: Fri Mar 20 10:18:18 2026 +0800
virtio-net: correct hdr_len handling for tunnel gso
[ Upstream commit 6c860dc02a8e60b438e26940227dfa641fcdb66a ]
The commit a2fb4bc4e2a6a03 ("net: implement virtio helpers to handle UDP
GSO tunneling.") introduces support for the UDP GSO tunnel feature in
virtio-net.
The virtio spec says:
If the \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, \field{hdr_len} accounts for
all the headers up to and including the inner transport.
The commit did not update the hdr_len to include the inner transport.
I observed that the "hdr_len" is 116 for this packet:
17:36:18.241105 52:55:00:d1:27:0a > 2e:2c:df:46:a9:e1, ethertype IPv4 (0x0800), length 2912: (tos 0x0, ttl 64, id 45197, offset 0, flags [none], proto UDP (17), length 2898)
192.168.122.100.50613 > 192.168.122.1.4789: [bad udp cksum 0x8106 -> 0x26a0!] VXLAN, flags [I] (0x08), vni 1
fa:c3:ba:82:05:ee > ce:85:0c:31:77:e5, ethertype IPv4 (0x0800), length 2862: (tos 0x0, ttl 64, id 14678, offset 0, flags [DF], proto TCP (6), length 2848)
192.168.3.1.49880 > 192.168.3.2.9898: Flags [P.], cksum 0x9266 (incorrect -> 0xaa20), seq 515667:518463, ack 1, win 64, options [nop,nop,TS val 2990048824 ecr 2798801412], length 2796
116 = 14(mac) + 20(ip) + 8(udp) + 8(vxlan) + 14(inner mac) + 20(inner ip) + 32(innner tcp)
Fixes: a2fb4bc4e2a6a03 ("net: implement virtio helpers to handle UDP GSO tunneling.")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20260320021818.111741-3-xuanzhuo@linux.alibaba.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date: Fri Mar 20 10:18:17 2026 +0800
virtio-net: correct hdr_len handling for VIRTIO_NET_F_GUEST_HDRLEN
[ Upstream commit 38ec410b99a5ee6566f75650ce3d4fd632940fd0 ]
The commit be50da3e9d4a ("net: virtio_net: implement exact header length
guest feature") introduces support for the VIRTIO_NET_F_GUEST_HDRLEN
feature in virtio-net.
This feature requires virtio-net to set hdr_len to the actual header
length of the packet when transmitting, the number of
bytes from the start of the packet to the beginning of the
transport-layer payload.
However, in practice, hdr_len was being set using skb_headlen(skb),
which is clearly incorrect. This commit fixes that issue.
Fixes: be50da3e9d4a ("net: virtio_net: implement exact header length guest feature")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20260320021818.111741-2-xuanzhuo@linux.alibaba.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: xietangxin <xietangxin@yeah.net>
Date: Thu Mar 12 10:54:06 2026 +0800
virtio_net: Fix UAF on dst_ops when IFF_XMIT_DST_RELEASE is cleared and napi_tx is false
commit ba8bda9a0896746053aa97ac6c3e08168729172c upstream.
A UAF issue occurs when the virtio_net driver is configured with napi_tx=N
and the device's IFF_XMIT_DST_RELEASE flag is cleared
(e.g., during the configuration of tc route filter rules).
When IFF_XMIT_DST_RELEASE is removed from the net_device, the network stack
expects the driver to hold the reference to skb->dst until the packet
is fully transmitted and freed. In virtio_net with napi_tx=N,
skbs may remain in the virtio transmit ring for an extended period.
If the network namespace is destroyed while these skbs are still pending,
the corresponding dst_ops structure has freed. When a subsequent packet
is transmitted, free_old_xmit() is triggered to clean up old skbs.
It then calls dst_release() on the skb associated with the stale dst_entry.
Since the dst_ops (referenced by the dst_entry) has already been freed,
a UAF kernel paging request occurs.
fix it by adds skb_dst_drop(skb) in start_xmit to explicitly release
the dst reference before the skb is queued in virtio_net.
Call Trace:
Unable to handle kernel paging request at virtual address ffff80007e150000
CPU: 2 UID: 0 PID: 6236 Comm: ping Kdump: loaded Not tainted 7.0.0-rc1+ #6 PREEMPT
...
percpu_counter_add_batch+0x3c/0x158 lib/percpu_counter.c:98 (P)
dst_release+0xe0/0x110 net/core/dst.c:177
skb_release_head_state+0xe8/0x108 net/core/skbuff.c:1177
sk_skb_reason_drop+0x54/0x2d8 net/core/skbuff.c:1255
dev_kfree_skb_any_reason+0x64/0x78 net/core/dev.c:3469
napi_consume_skb+0x1c4/0x3a0 net/core/skbuff.c:1527
__free_old_xmit+0x164/0x230 drivers/net/virtio_net.c:611 [virtio_net]
free_old_xmit drivers/net/virtio_net.c:1081 [virtio_net]
start_xmit+0x7c/0x530 drivers/net/virtio_net.c:3329 [virtio_net]
...
Reproduction Steps:
NETDEV="enp3s0"
config_qdisc_route_filter() {
tc qdisc del dev $NETDEV root
tc qdisc add dev $NETDEV root handle 1: prio
tc filter add dev $NETDEV parent 1:0 \
protocol ip prio 100 route to 100 flowid 1:1
ip route add 192.168.1.100/32 dev $NETDEV realm 100
}
test_ns() {
ip netns add testns
ip link set $NETDEV netns testns
ip netns exec testns ifconfig $NETDEV 10.0.32.46/24
ip netns exec testns ping -c 1 10.0.32.1
ip netns del testns
}
config_qdisc_route_filter
test_ns
sleep 2
test_ns
Fixes: f2fc6a54585a ("[NETNS][IPV6] route6 - move ip6_dst_ops inside the network namespace")
Cc: stable@vger.kernel.org
Signed-off-by: xietangxin <xietangxin@yeah.net>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Fixes: 0287587884b1 ("net: better IFF_XMIT_DST_RELEASE support")
Link: https://patch.msgid.link/20260312025406.15641-1-xietangxin@yeah.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Joanne Koong <joannelkoong@gmail.com>
Date: Thu Mar 19 17:51:45 2026 -0700
writeback: don't block sync for filesystems with no data integrity guarantees
commit 76f9377cd2ab7a9220c25d33940d9ca20d368172 upstream.
Add a SB_I_NO_DATA_INTEGRITY superblock flag for filesystems that cannot
guarantee data persistence on sync (eg fuse). For superblocks with this
flag set, sync kicks off writeback of dirty inodes but does not wait
for the flusher threads to complete the writeback.
This replaces the per-inode AS_NO_DATA_INTEGRITY mapping flag added in
commit f9a49aa302a0 ("fs/writeback: skip AS_NO_DATA_INTEGRITY mappings
in wait_sb_inodes()"). The flag belongs at the superblock level because
data integrity is a filesystem-wide property, not a per-inode one.
Having this flag at the superblock level also allows us to skip having
to iterate every dirty inode in wait_sb_inodes() only to skip each inode
individually.
Prior to this commit, mappings with no data integrity guarantees skipped
waiting on writeback completion but still waited on the flusher threads
to finish initiating the writeback. Waiting on the flusher threads is
unnecessary. This commit kicks off writeback but does not wait on the
flusher threads. This change properly addresses a recent report [1] for
a suspend-to-RAM hang seen on fuse-overlayfs that was caused by waiting
on the flusher threads to finish:
Workqueue: pm_fs_sync pm_fs_sync_work_fn
Call Trace:
<TASK>
__schedule+0x457/0x1720
schedule+0x27/0xd0
wb_wait_for_completion+0x97/0xe0
sync_inodes_sb+0xf8/0x2e0
__iterate_supers+0xdc/0x160
ksys_sync+0x43/0xb0
pm_fs_sync_work_fn+0x17/0xa0
process_one_work+0x193/0x350
worker_thread+0x1a1/0x310
kthread+0xfc/0x240
ret_from_fork+0x243/0x280
ret_from_fork_asm+0x1a/0x30
</TASK>
On fuse this is problematic because there are paths that may cause the
flusher thread to block (eg if systemd freezes the user session cgroups
first, which freezes the fuse daemon, before invoking the kernel
suspend. The kernel suspend triggers ->write_node() which on fuse issues
a synchronous setattr request, which cannot be processed since the
daemon is frozen. Or if the daemon is buggy and cannot properly complete
writeback, initiating writeback on a dirty folio already under writeback
leads to writeback_get_folio() -> folio_prepare_writeback() ->
unconditional wait on writeback to finish, which will cause a hang).
This commit restores fuse to its prior behavior before tmp folios were
removed, where sync was essentially a no-op.
[1] https://lore.kernel.org/linux-fsdevel/CAJnrk1a-asuvfrbKXbEwwDSctvemF+6zfhdnuzO65Pt8HsFSRw@mail.gmail.com/T/#m632c4648e9cafc4239299887109ebd880ac6c5c1
Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
Reported-by: John <therealgraysky@proton.me>
Cc: stable@vger.kernel.org
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://patch.msgid.link/20260320005145.2483161-2-joannelkoong@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nikunj A Dadhania <nikunj@amd.com>
Date: Wed Mar 18 07:56:52 2026 +0000
x86/cpu: Enable FSGSBASE early in cpu_init_exception_handling()
commit 05243d490bb7852a8acca7b5b5658019c7797a52 upstream.
Move FSGSBASE enablement from identify_cpu() to cpu_init_exception_handling()
to ensure it is enabled before any exceptions can occur on both boot and
secondary CPUs.
== Background ==
Exception entry code (paranoid_entry()) uses ALTERNATIVE patching based on
X86_FEATURE_FSGSBASE to decide whether to use RDGSBASE/WRGSBASE instructions
or the slower RDMSR/SWAPGS sequence for saving/restoring GSBASE.
On boot CPU, ALTERNATIVE patching happens after enabling FSGSBASE in CR4.
When the feature is available, the code is permanently patched to use
RDGSBASE/WRGSBASE, which require CR4.FSGSBASE=1 to execute without triggering
== Boot Sequence ==
Boot CPU (with CR pinning enabled):
trap_init()
cpu_init() <- Uses unpatched code (RDMSR/SWAPGS)
x2apic_setup()
...
arch_cpu_finalize_init()
identify_boot_cpu()
identify_cpu()
cr4_set_bits(X86_CR4_FSGSBASE) # Enables the feature
# This becomes part of cr4_pinned_bits
...
alternative_instructions() <- Patches code to use RDGSBASE/WRGSBASE
Secondary CPUs (with CR pinning enabled):
start_secondary()
cr4_init() <- Code already patched, CR4.FSGSBASE=1
set implicitly via cr4_pinned_bits
cpu_init() <- exceptions work because FSGSBASE is
already enabled
Secondary CPU (with CR pinning disabled):
start_secondary()
cr4_init() <- Code already patched, CR4.FSGSBASE=0
cpu_init()
x2apic_setup()
rdmsrq(MSR_IA32_APICBASE) <- Triggers #VC in SNP guests
exc_vmm_communication()
paranoid_entry() <- Uses RDGSBASE with CR4.FSGSBASE=0
(patched code)
...
ap_starting()
identify_secondary_cpu()
identify_cpu()
cr4_set_bits(X86_CR4_FSGSBASE) <- Enables the feature, which is
too late
== CR Pinning ==
Currently, for secondary CPUs, CR4.FSGSBASE is set implicitly through
CR-pinning: the boot CPU sets it during identify_cpu(), it becomes part of
cr4_pinned_bits, and cr4_init() applies those pinned bits to secondary CPUs.
This works but creates an undocumented dependency between cr4_init() and the
pinning mechanism.
== Problem ==
Secondary CPUs boot after alternatives have been applied globally. They
execute already-patched paranoid_entry() code that uses RDGSBASE/WRGSBASE
instructions, which require CR4.FSGSBASE=1. Upcoming changes to CR pinning
behavior will break the implicit dependency, causing secondary CPUs to
generate #UD.
This issue manifests itself on AMD SEV-SNP guests, where the rdmsrq() in
x2apic_setup() triggers a #VC exception early during cpu_init(). The #VC
handler (exc_vmm_communication()) executes the patched paranoid_entry() path.
Without CR4.FSGSBASE enabled, RDGSBASE instructions trigger #UD.
== Fix ==
Enable FSGSBASE explicitly in cpu_init_exception_handling() before loading
exception handlers. This makes the dependency explicit and ensures both
boot and secondary CPUs have FSGSBASE enabled before paranoid_entry()
executes.
Fixes: c82965f9e530 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
Reported-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: <stable@kernel.org>
Link: https://patch.msgid.link/20260318075654.1792916-2-nikunj@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Borislav Petkov (AMD) <bp@alien8.de>
Date: Thu Mar 19 12:07:59 2026 +0100
x86/cpu: Remove X86_CR4_FRED from the CR4 pinned bits mask
commit 411df123c017169922cc767affce76282b8e6c85 upstream.
Commit in Fixes added the FRED CR4 bit to the CR4 pinned bits mask so
that whenever something else modifies CR4, that bit remains set. Which
in itself is a perfectly fine idea.
However, there's an issue when during boot FRED is initialized: first on
the BSP and later on the APs. Thus, there's a window in time when
exceptions cannot be handled.
This becomes particularly nasty when running as SEV-{ES,SNP} or TDX
guests which, when they manage to trigger exceptions during that short
window described above, triple fault due to FRED MSRs not being set up
yet.
See Link tag below for a much more detailed explanation of the
situation.
So, as a result, the commit in that Link URL tried to address this
shortcoming by temporarily disabling CR4 pinning when an AP is not
online yet.
However, that is a problem in itself because in this case, an attack on
the kernel needs to only modify the online bit - a single bit in RW
memory - and then disable CR4 pinning and then disable SM*P, leading to
more and worse things to happen to the system.
So, instead, remove the FRED bit from the CR4 pinning mask, thus
obviating the need to temporarily disable CR4 pinning.
If someone manages to disable FRED when poking at CR4, then
idt_invalidate() would make sure the system would crash'n'burn on the
first exception triggered, which is a much better outcome security-wise.
Fixes: ff45746fbf00 ("x86/cpu: Add X86_CR4_FRED macro")
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org> # 6.12+
Link: https://lore.kernel.org/r/177385987098.1647592.3381141860481415647.tip-bot2@tip-bot2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mike Rapoport (Microsoft) <rppt@kernel.org>
Date: Fri Mar 20 15:59:48 2026 +0200
x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size
[ Upstream commit 217c0a5c177a3d4f7c8497950cbf5c36756e8bbb ]
ranges_to_free array should have enough room to store the entire EFI
memmap plus an extra element for NULL entry.
The calculation of this array size wrongly adds 1 to the overall size
instead of adding 1 to the number of elements.
Add parentheses to properly size the array.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: a4b0bf6a40f3 ("x86/efi: defer freeing of boot services memory")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nikunj A Dadhania <nikunj@amd.com>
Date: Wed Mar 18 07:56:54 2026 +0000
x86/fred: Fix early boot failures on SEV-ES/SNP guests
commit 3645eb7e3915990a149460c151a00894cb586253 upstream.
FRED-enabled SEV-(ES,SNP) guests fail to boot due to the following issues
in the early boot sequence:
* FRED does not have a #VC exception handler in the dispatch logic
* Early FRED #VC exceptions attempt to use uninitialized per-CPU GHCBs
instead of boot_ghcb
Add X86_TRAP_VC case to fred_hwexc() with a new exc_vmm_communication()
function that provides the unified entry point FRED requires, dispatching
to existing user/kernel handlers based on privilege level. The function is
already declared via DECLARE_IDTENTRY_VC().
Fix early GHCB access by falling back to boot_ghcb in
__sev_{get,put}_ghcb() when per-CPU GHCBs are not yet initialized.
Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org> # 6.12+
Link: https://patch.msgid.link/20260318075654.1792916-4-nikunj@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Wed Mar 11 21:29:14 2026 +0100
x86/perf: Make sure to program the counter value for stopped events on migration
[ Upstream commit f1cac6ac62d28a9a57b17f51ac5795bf250c12d3 ]
Both Mi Dapeng and Ian Rogers noted that not everything that sets HES_STOPPED
is required to EF_UPDATE. Specifically the 'step 1' loop of rescheduling
explicitly does EF_UPDATE to ensure the counter value is read.
However, then 'step 2' simply leaves the new counter uninitialized when
HES_STOPPED, even though, as noted above, the thing that stopped them might not
be aware it needs to EF_RELOAD -- since it didn't EF_UPDATE on stop.
One such location that is affected is throttling, throttle does pmu->stop(, 0);
and unthrottle does pmu->start(, 0); possibly restarting an uninitialized counter.
Fixes: a4eaf7f14675 ("perf: Rework the PMU methods")
Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://patch.msgid.link/20260311204035.GX606826@noisy.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: GuoHan Zhao <zhaoguohan@kylinos.cn>
Date: Wed Mar 25 20:02:46 2026 +0800
xen/privcmd: unregister xenstore notifier on module exit
[ Upstream commit cd7e1fef5a1ca1c4fcd232211962ac2395601636 ]
Commit 453b8fb68f36 ("xen/privcmd: restrict usage in
unprivileged domU") added a xenstore notifier to defer setting the
restriction target until Xenstore is ready.
XEN_PRIVCMD can be built as a module, but privcmd_exit() leaves that
notifier behind. Balance the notifier lifecycle by unregistering it on
module exit.
This is harmless even if xenstore was already ready at registration
time and the notifier was never queued on the chain.
Fixes: 453b8fb68f3641fe ("xen/privcmd: restrict usage in unprivileged domU")
Signed-off-by: GuoHan Zhao <zhaoguohan@kylinos.cn>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260325120246.252899-1-zhaoguohan@kylinos.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sabrina Dubroca <sd@queasysnail.net>
Date: Tue Feb 24 00:05:11 2026 +0100
xfrm: add missing extack for XFRMA_SA_PCPU in add_acquire and allocspi
[ Upstream commit aa8a3f3c67235422a0c3608a8772f69ca3b7b63f ]
We're returning an error caused by invalid user input without setting
an extack. Add one.
Fixes: 1ddf9916ac09 ("xfrm: Add support for per cpu xfrm state handling.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sabrina Dubroca <sd@queasysnail.net>
Date: Tue Feb 24 00:05:13 2026 +0100
xfrm: call xdo_dev_state_delete during state update
[ Upstream commit 7d2fc41f91bc69acb6e01b0fa23cd7d0109a6a23 ]
When we update an SA, we construct a new state and call
xdo_dev_state_add, but never insert it. The existing state is updated,
then we immediately destroy the new state. Since we haven't added it,
we don't go through the standard state delete code, and we're skipping
removing it from the device (but xdo_dev_state_free will get called
when we destroy the temporary state).
This is similar to commit c5d4d7d83165 ("xfrm: Fix deletion of
offloaded SAs on failure.").
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sabrina Dubroca <sd@queasysnail.net>
Date: Tue Feb 24 00:05:12 2026 +0100
xfrm: fix the condition on x->pcpu_num in xfrm_sa_len
[ Upstream commit b57defcf8f109da5ba9cf59b2a736606faf3d846 ]
pcpu_num = 0 is a valid value. The marker for "unset pcpu_num" which
makes copy_to_user_state_extra not add the XFRMA_SA_PCPU attribute is
UINT_MAX.
Fixes: 1ddf9916ac09 ("xfrm: Add support for per cpu xfrm state handling.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hyunwoo Kim <imv4bel@gmail.com>
Date: Wed Mar 11 03:16:29 2026 +0900
xfrm: Fix work re-schedule after cancel in xfrm_nat_keepalive_net_fini()
[ Upstream commit daf8e3b253aa760ff9e96c7768a464bc1d6b3c90 ]
After cancel_delayed_work_sync() is called from
xfrm_nat_keepalive_net_fini(), xfrm_state_fini() flushes remaining
states via __xfrm_state_delete(), which calls
xfrm_nat_keepalive_state_updated() to re-schedule nat_keepalive_work.
The following is a simple race scenario:
cpu0 cpu1
cleanup_net() [Round 1]
ops_undo_list()
xfrm_net_exit()
xfrm_nat_keepalive_net_fini()
cancel_delayed_work_sync(nat_keepalive_work);
xfrm_state_fini()
xfrm_state_flush()
xfrm_state_delete(x)
__xfrm_state_delete(x)
xfrm_nat_keepalive_state_updated(x)
schedule_delayed_work(nat_keepalive_work);
rcu_barrier();
net_complete_free();
net_passive_dec(net);
llist_add(&net->defer_free_list, &defer_free_list);
cleanup_net() [Round 2]
rcu_barrier();
net_complete_free()
kmem_cache_free(net_cachep, net);
nat_keepalive_work()
// on freed net
To prevent this, cancel_delayed_work_sync() is replaced with
disable_delayed_work_sync().
Fixes: f531d13bdfe3 ("xfrm: support sending NAT keepalives in ESP in UDP states")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fernando Fernandez Mancera <fmancera@suse.de>
Date: Wed Mar 4 15:09:35 2026 +0100
xfrm: iptfs: fix skb_put() panic on non-linear skb during reassembly
[ Upstream commit 0b352f83cabfefdaafa806d6471f0eca117dc7d5 ]
In iptfs_reassem_cont(), IP-TFS attempts to append data to the new inner
packet 'newskb' that is being reassembled. First a zero-copy approach is
tried if it succeeds then newskb becomes non-linear.
When a subsequent fragment in the same datagram does not meet the
fast-path conditions, a memory copy is performed. It calls skb_put() to
append the data and as newskb is non-linear it triggers
SKB_LINEAR_ASSERT check.
Oops: invalid opcode: 0000 [#1] SMP NOPTI
[...]
RIP: 0010:skb_put+0x3c/0x40
[...]
Call Trace:
<IRQ>
iptfs_reassem_cont+0x1ab/0x5e0 [xfrm_iptfs]
iptfs_input_ordered+0x2af/0x380 [xfrm_iptfs]
iptfs_input+0x122/0x3e0 [xfrm_iptfs]
xfrm_input+0x91e/0x1a50
xfrm4_esp_rcv+0x3a/0x110
ip_protocol_deliver_rcu+0x1d7/0x1f0
ip_local_deliver_finish+0xbe/0x1e0
__netif_receive_skb_core.constprop.0+0xb56/0x1120
__netif_receive_skb_list_core+0x133/0x2b0
netif_receive_skb_list_internal+0x1ff/0x3f0
napi_complete_done+0x81/0x220
virtnet_poll+0x9d6/0x116e [virtio_net]
__napi_poll.constprop.0+0x2b/0x270
net_rx_action+0x162/0x360
handle_softirqs+0xdc/0x510
__irq_exit_rcu+0xe7/0x110
irq_exit_rcu+0xe/0x20
common_interrupt+0x85/0xa0
</IRQ>
<TASK>
Fix this by checking if the skb is non-linear. If it is, linearize it by
calling skb_linearize(). As the initial allocation of newskb originally
reserved enough tailroom for the entire reassembled packet we do not
need to check if we have enough tailroom or extend it.
Fixes: 5f2b6a909574 ("xfrm: iptfs: add skb-fragment sharing code")
Reported-by: Hao Long <me@imlonghao.com>
Closes: https://lore.kernel.org/netdev/DGRCO9SL0T5U.JTINSHJQ9KPK@imlonghao.com/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paul Moses <p@1g4.org>
Date: Mon Mar 16 14:56:51 2026 +0000
xfrm: iptfs: only publish mode_data after clone setup
commit d849a2f7309fc0616e79d13b008b0a47e0458b6e upstream.
iptfs_clone_state() stores x->mode_data before allocating the reorder
window. If that allocation fails, the code frees the cloned state and
returns -ENOMEM, leaving x->mode_data pointing at freed memory.
The xfrm clone unwind later runs destroy_state() through x->mode_data,
so the failed clone path tears down IPTFS state that clone_state()
already freed.
Keep the cloned IPTFS state private until all allocations succeed so
failed clones leave x->mode_data unset. The destroy path already
handles a NULL mode_data pointer.
Fixes: 6be02e3e4f37 ("xfrm: iptfs: handle reordering of received packets")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moses <p@1g4.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Roshan Kumar <roshaen09@gmail.com>
Date: Sun Mar 1 10:56:38 2026 +0000
xfrm: iptfs: validate inner IPv4 header length in IPTFS payload
commit 0d10393d5eac33cbd92f7a41fddca12c41d3cb7e upstream.
Add validation of the inner IPv4 packet tot_len and ihl fields parsed
from decrypted IPTFS payloads in __input_process_payload(). A crafted
ESP packet containing an inner IPv4 header with tot_len=0 causes an
infinite loop: iplen=0 leads to capturelen=min(0, remaining)=0, so the
data offset never advances and the while(data < tail) loop never
terminates, spinning forever in softirq context.
Reject inner IPv4 packets where tot_len < ihl*4 or ihl*4 < sizeof(struct
iphdr), which catches both the tot_len=0 case and malformed ihl values.
The normal IP stack performs this validation in ip_rcv_core(), but IPTFS
extracts and processes inner packets before they reach that layer.
Reported-by: Roshan Kumar <roshaen09@gmail.com>
Fixes: 6c82d2433671 ("xfrm: iptfs: add basic receive packet (tunnel egress) handling")
Cc: stable@vger.kernel.org
Signed-off-by: Roshan Kumar <roshaen09@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Minwoo Ra <raminwo0202@gmail.com>
Date: Sat Mar 14 00:58:44 2026 +0900
xfrm: prevent policy_hthresh.work from racing with netns teardown
[ Upstream commit 29fe3a61bcdce398ee3955101c39f89c01a8a77e ]
A XFRM_MSG_NEWSPDINFO request can queue the per-net work item
policy_hthresh.work onto the system workqueue.
The queued callback, xfrm_hash_rebuild(), retrieves the enclosing
struct net via container_of(). If the net namespace is torn down
before that work runs, the associated struct net may already have
been freed, and xfrm_hash_rebuild() may then dereference stale memory.
xfrm_policy_fini() already flushes policy_hash_work during teardown,
but it does not synchronize policy_hthresh.work.
Synchronize policy_hthresh.work in xfrm_policy_fini() as well, so the
queued work cannot outlive the net namespace teardown and access a
freed struct net.
Fixes: 880a6fab8f6b ("xfrm: configure policy hash table thresholds by netlink")
Signed-off-by: Minwoo Ra <raminwo0202@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuto Ohnuki <ytohnuki@amazon.com>
Date: Tue Mar 10 18:38:38 2026 +0000
xfs: avoid dereferencing log items after push callbacks
commit 79ef34ec0554ec04bdbafafbc9836423734e1bd6 upstream.
After xfsaild_push_item() calls iop_push(), the log item may have been
freed if the AIL lock was dropped during the push. Background inode
reclaim or the dquot shrinker can free the log item while the AIL lock
is not held, and the tracepoints in the switch statement dereference
the log item after iop_push() returns.
Fix this by capturing the log item type, flags, and LSN before calling
xfsaild_push_item(), and introducing a new xfs_ail_push_class trace
event class that takes these pre-captured values and the ailp pointer
instead of the log item pointer.
Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c
Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary")
Cc: stable@vger.kernel.org # v5.9
Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Darrick J. Wong <djwong@kernel.org>
Date: Mon Mar 23 14:01:57 2026 -0700
xfs: don't irele after failing to iget in xfs_attri_recover_work
commit 70685c291ef82269180758130394ecdc4496b52c upstream.
xlog_recovery_iget* never set @ip to a valid pointer if they return
an error, so this irele will walk off a dangling pointer. Fix that.
Cc: stable@vger.kernel.org # v6.10
Fixes: ae673f534a3097 ("xfs: record inode generation in xattr update log intent items")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Long Li <leo.lilong@huawei.com>
Date: Fri Mar 20 10:11:29 2026 +0800
xfs: fix ri_total validation in xlog_recover_attri_commit_pass2
commit d72f2084e30966097c8eae762e31986a33c3c0ae upstream.
The ri_total checks for SET/REPLACE operations are hardcoded to 3,
but xfs_attri_item_size() only emits a value iovec when value_len > 0,
so ri_total is 2 when value_len == 0.
For PPTR_SET/PPTR_REMOVE/PPTR_REPLACE, value_len is validated by
xfs_attri_validate() to be exactly sizeof(struct xfs_parent_rec) and
is never zero, so their hardcoded checks remain correct.
This problem may cause log recovery failures. The following script can be
used to reproduce the problem:
#!/bin/bash
mkfs.xfs -f /dev/sda
mount /dev/sda /mnt/test/
touch /mnt/test/file
for i in {1..200}; do
attr -s "user.attr_$i" -V "value_$i" /mnt/test/file > /dev/null
done
echo 1 > /sys/fs/xfs/debug/larp
echo 1 > /sys/fs/xfs/sda/errortag/larp
attr -s "user.zero" -V "" /mnt/test/file
echo 0 > /sys/fs/xfs/sda/errortag/larp
umount /mnt/test
mount /dev/sda /mnt/test/ # mount failed
Fix this by deriving the expected count dynamically as "2 + !!value_len"
for SET/REPLACE operations.
Cc: stable@vger.kernel.org # v6.9
Fixes: ad206ae50eca ("xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2")
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Darrick J. Wong <djwong@kernel.org>
Date: Mon Mar 23 14:04:33 2026 -0700
xfs: remove file_path tracepoint data
commit e31c53a8060e134111ed095783fee0aa0c43b080 upstream.
The xfile/xmbuf shmem file descriptions are no longer as detailed as
they were when online fsck was first merged, because moving to static
strings in commit 60382993a2e180 ("xfs: get rid of the
xchk_xfile_*_descr calls") removed a memory allocation and hence a
source of failure.
However this makes encoding the description in the tracepoints sort of a
waste of memory. David Laight also points out that file_path doesn't
zero the whole buffer which causes exposure of stale trace bytes, and
Steven Rostedt wonders why we're not using a dynamic array for the file
path.
I don't think this is worth fixing, so let's just rip it out.
Cc: rostedt@goodmis.org
Cc: david.laight.linux@gmail.com
Link: https://lore.kernel.org/linux-xfs/20260323172204.work.979-kees@kernel.org/
Cc: stable@vger.kernel.org # v6.11
Fixes: 19ebc8f84ea12e ("xfs: fix file_path handling in tracepoints")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yuto Ohnuki <ytohnuki@amazon.com>
Date: Tue Mar 10 18:38:39 2026 +0000
xfs: save ailp before dropping the AIL lock in push callbacks
commit 394d70b86fae9fe865e7e6d9540b7696f73aa9b6 upstream.
In xfs_inode_item_push() and xfs_qm_dquot_logitem_push(), the AIL lock
is dropped to perform buffer IO. Once the cluster buffer no longer
protects the log item from reclaim, the log item may be freed by
background reclaim or the dquot shrinker. The subsequent spin_lock()
call dereferences lip->li_ailp, which is a use-after-free.
Fix this by saving the ailp pointer in a local variable while the AIL
lock is held and the log item is guaranteed to be valid.
Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c
Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary")
Cc: stable@vger.kernel.org # v5.9
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: hongao <hongao@uniontech.com>
Date: Thu Mar 12 20:10:26 2026 +0800
xfs: scrub: unlock dquot before early return in quota scrub
commit 268378b6ad20569af0d1957992de1c8b16c6e900 upstream.
xchk_quota_item can return early after calling xchk_fblock_process_error.
When that helper returns false, the function returned immediately without
dropping dq->q_qlock, which can leave the dquot lock held and risk lock
leaks or deadlocks in later quota operations.
Fix this by unlocking dq->q_qlock before the early return.
Signed-off-by: hongao <hongao@uniontech.com>
Fixes: 7d1f0e167a067e ("xfs: check the ondisk space mapping behind a dquot")
Cc: <stable@vger.kernel.org> # v6.8
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yuto Ohnuki <ytohnuki@amazon.com>
Date: Tue Mar 10 18:38:37 2026 +0000
xfs: stop reclaim before pushing AIL during unmount
commit 4f24a767e3d64a5f58c595b5c29b6063a201f1e3 upstream.
The unmount sequence in xfs_unmount_flush_inodes() pushed the AIL while
background reclaim and inodegc are still running. This is broken
independently of any use-after-free issues - background reclaim and
inodegc should not be running while the AIL is being pushed during
unmount, as inodegc can dirty and insert inodes into the AIL during the
flush, and background reclaim can race to abort and free dirty inodes.
Reorder xfs_unmount_flush_inodes() to stop inodegc and cancel background
reclaim before pushing the AIL. Stop inodegc before cancelling
m_reclaim_work because the inodegc worker can re-queue m_reclaim_work
via xfs_inodegc_set_reclaimable.
Reported-by: syzbot+652af2b3c5569c4ab63c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=652af2b3c5569c4ab63c
Fixes: 90c60e164012 ("xfs: xfs_iflush() is no longer necessary")
Cc: stable@vger.kernel.org # v5.9
Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>