Author: Hans de Goede <hdegoede@redhat.com>
Date: Fri Dec 20 19:13:52 2024 +0100
ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]
commit 66d337fede44dcbab4107d37684af8fcab3d648e upstream.
Like the Vivobook X1704VAP the X1504VAP has its keyboard IRQ (1) described
as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh which
breaks the keyboard.
Add the X1504VAP to the irq1_level_low_skip_override[] quirk table to fix
this.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patch.msgid.link/20241220181352.25974-1-hdegoede@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hans de Goede <hdegoede@redhat.com>
Date: Sat Dec 28 17:48:45 2024 +0100
ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[]
commit 7ed4e4a659d99499dc6968c61970d41b64feeac0 upstream.
The TongFang GM5HG0A is a TongFang barebone design which is sold under
various brand names.
The ACPI IRQ override for the keyboard IRQ must be used on these AMD Zen
laptops in order for the IRQ to work.
At least on the SKIKK Vanaheim variant the DMI product- and board-name
strings have been replaced by the OEM with "Vanaheim" so checking that
board-name contains "GM5HG0A" as is usually done for TongFang barebones
quirks does not work.
The DMI OEM strings do contain "GM5HG0A". I have looked at the dmidecode
for a few other TongFang devices and the TongFang code-name string being
in the OEM strings seems to be something which is consistently true.
Add a quirk checking one of the DMI_OEM_STRING(s) is "GM5HG0A" in the hope
that this will work for other OEM versions of the "GM5HG0A" too.
Link: https://www.skikk.eu/en/laptops/vanaheim-15-rtx-4060
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219614
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patch.msgid.link/20241228164845.42381-1-hdegoede@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Howells <dhowells@redhat.com>
Date: Mon Jan 6 16:21:00 2025 +0000
afs: Fix the maximum cell name length
[ Upstream commit 8fd56ad6e7c90ac2bddb0741c6b248c8c5d56ac8 ]
The kafs filesystem limits the maximum length of a cell to 256 bytes, but a
problem occurs if someone actually does that: kafs tries to create a
directory under /proc/net/afs/ with the name of the cell, but that fails
with a warning:
WARNING: CPU: 0 PID: 9 at fs/proc/generic.c:405
because procfs limits the maximum filename length to 255.
However, the DNS limits the maximum lookup length and, by extension, the
maximum cell name, to 255 less two (length count and trailing NUL).
Fix this by limiting the maximum acceptable cellname length to 253. This
also allows us to be sure we can create the "/afs/.<cell>/" mountpoint too.
Further, split the YFS VL record cell name maximum to be the 256 allowed by
the protocol and ignore the record retrieved by YFSVL.GetCellName if it
exceeds 253.
Fixes: c3e9f888263b ("afs: Implement client support for the YFSVL.GetCellName RPC op")
Reported-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/6776d25d.050a0220.3a8527.0048.GAE@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/376236.1736180460@warthog.procyon.org.uk
Tested-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wei Fang <wei.fang@nxp.com>
Date: Tue Nov 5 13:46:02 2024 +0800
arm64: dts: imx95: correct the address length of netcmix_blk_ctrl
[ Upstream commit c5b8d2c370842e3f9a15655893d8c597e2d981d9 ]
The netc_blk_ctrl is controlled by the imx95-blk-ctl clock driver and
provides relevant clock configurations for NETC, SAI and MQS. Its address
length should be 8 bytes instead of 0x1000.
Fixes: 7764fef26ea9 ("arm64: dts: imx95: Add NETCMIX block control support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jie Gan <quic_jiegan@quicinc.com>
Date: Thu Dec 19 10:52:16 2024 +0800
arm64: dts: qcom: sa8775p: fix the secure device bootup issue
[ Upstream commit 8a6442ec3437083348f32a6159b9a67bf66417bc ]
The secure device(fused) cannot bootup with TPDM_DCC device. So
disable it in DT.
Fixes: 6596118ccdcd ("arm64: dts: qcom: Add coresight nodes for SA8775p")
Signed-off-by: Jie Gan <quic_jiegan@quicinc.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20241219025216.3463527-1-quic_jiegan@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Date: Thu Nov 28 20:21:47 2024 +0530
arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions
commit e60b14f47d779edc38bc1f14d2c995d477cec6f9 upstream.
For both the controller instances, size of the 'addr_space' region should
be 0x1fe00000 as per the hardware memory layout.
Otherwise, endpoint drivers cannot request even reasonable BAR size of 1MB.
Cc: stable@vger.kernel.org # 6.11
Fixes: c5f5de8434ec ("arm64: dts: qcom: sa8775p: Add ep pcie1 controller node")
Fixes: 1924f5518224 ("arm64: dts: qcom: sa8775p: Add ep pcie0 controller node")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20241128145147.145618-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Qiang Yu <quic_qianyu@quicinc.com>
Date: Wed Nov 13 00:05:08 2024 -0800
arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a
commit fb8e7b33c2174e00dfa411361eeed21eeaf3634b upstream.
As per memory map table, the region for PCIe6a is 64MByte. Hence, set the
size of 32 bit non-prefetchable memory region beginning on address
0x70300000 as 0x3d00000 so that BAR space assigned to BAR registers can be
allocated from 0x70300000 to 0x74000000.
Fixes: 7af141850012 ("arm64: dts: qcom: x1e80100: Fix up BAR spaces")
Cc: stable@vger.kernel.org
Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20241113080508.3458849-1-quic_qianyu@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Peter Geis <pgwipeout@gmail.com>
Date: Sat Dec 14 22:43:39 2024 +0000
arm64: dts: rockchip: add hevc power domain clock to rk3328
[ Upstream commit 3699f2c43ea9984e00d70463f8c29baaf260ea97 ]
There is a race condition at startup between disabling power domains not
used and disabling clocks not used on the rk3328. When the clocks are
disabled first, the hevc power domain fails to shut off leading to a
splat of failures. Add the hevc core clock to the rk3328 power domain
node to prevent this condition.
rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-.... }
1087 jiffies s: 89 root: 0x8/.
rcu: blocking rcu_node structures (internal RCU debug):
Sending NMI from CPU 0 to CPUs 3:
NMI backtrace for cpu 3
CPU: 3 UID: 0 PID: 86 Comm: kworker/3:3 Not tainted 6.12.0-rc5+ #53
Hardware name: Firefly ROC-RK3328-CC (DT)
Workqueue: pm genpd_power_off_work_fn
pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : regmap_unlock_spinlock+0x18/0x30
lr : regmap_read+0x60/0x88
sp : ffff800081123c00
x29: ffff800081123c00 x28: ffff2fa4c62cad80 x27: 0000000000000000
x26: ffffd74e6e660eb8 x25: ffff2fa4c62cae00 x24: 0000000000000040
x23: ffffd74e6d2f3ab8 x22: 0000000000000001 x21: ffff800081123c74
x20: 0000000000000000 x19: ffff2fa4c0412000 x18: 0000000000000000
x17: 77202c31203d2065 x16: 6c6469203a72656c x15: 6c6f72746e6f632d
x14: 7265776f703a6e6f x13: 2063766568206e69 x12: 616d6f64202c3431
x11: 347830206f742030 x10: 3430303034783020 x9 : ffffd74e6c7369e0
x8 : 3030316666206e69 x7 : 205d383738353733 x6 : 332e31202020205b
x5 : ffffd74e6c73fc88 x4 : ffffd74e6c73fcd4 x3 : ffffd74e6c740b40
x2 : ffff800080015484 x1 : 0000000000000000 x0 : ffff2fa4c0412000
Call trace:
regmap_unlock_spinlock+0x18/0x30
rockchip_pmu_set_idle_request+0xac/0x2c0
rockchip_pd_power+0x144/0x5f8
rockchip_pd_power_off+0x1c/0x30
_genpd_power_off+0x9c/0x180
genpd_power_off.part.0.isra.0+0x130/0x2a8
genpd_power_off_work_fn+0x6c/0x98
process_one_work+0x170/0x3f0
worker_thread+0x290/0x4a8
kthread+0xec/0xf8
ret_from_fork+0x10/0x20
rockchip-pm-domain ff100000.syscon:power-controller: failed to get ack on domain 'hevc', val=0x88220
Fixes: 52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/20241214224339.24674-1-pgwipeout@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jesse Taube <Mr.Bossman075@gmail.com>
Date: Mon Nov 18 10:36:41 2024 -0500
ARM: dts: imxrt1050: Fix clocks for mmc
[ Upstream commit 5f122030061db3e5d2bddd9cf5c583deaa6c54ff ]
One of the usdhc1 controller's clocks should be IMXRT1050_CLK_AHB_PODF not
IMXRT1050_CLK_OSC.
Fixes: 1c4f01be3490 ("ARM: dts: imx: Add i.MXRT1050-EVK support")
Signed-off-by: Jesse Taube <Mr.Bossman075@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chen-Yu Tsai <wenst@chromium.org>
Date: Thu Dec 19 18:53:02 2024 +0800
ASoC: mediatek: disable buffer pre-allocation
[ Upstream commit 32c9c06adb5b157ef259233775a063a43746d699 ]
On Chromebooks based on Mediatek MT8195 or MT8188, the audio frontend
(AFE) is limited to accessing a very small window (1 MiB) of memory,
which is described as a reserved memory region in the device tree.
On these two platforms, the maximum buffer size is given as 512 KiB.
The MediaTek common code uses the same value for preallocations. This
means that only the first two PCM substreams get preallocations, and
then the whole space is exhausted, barring any other substreams from
working. Since the substreams used are not always the first two, this
means audio won't work correctly.
This is observed on the MT8188 Geralt Chromebooks, on which the
"mediatek,dai-link" property was dropped when it was upstreamed. That
property causes the driver to only register the PCM substreams listed
in the property, and in the order given.
Instead of trying to compute an optimal value and figuring out which
streams are used, simply disable preallocation. The PCM buffers are
managed by the core and are allocated and released on the fly. There
should be no impact to any of the other MediaTek platforms.
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/20241219105303.548437-1-wenst@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shuming Fan <shumingf@realtek.com>
Date: Wed Dec 18 17:13:07 2024 +0800
ASoC: rt722: add delay time to wait for the calibration procedure
[ Upstream commit c9e3ebdc52ebe028f238c9df5162ae92483bedd5 ]
The calibration procedure needs some time to finish.
This patch adds the delay time to ensure the calibration procedure is completed correctly.
Signed-off-by: Shuming Fan <shumingf@realtek.com>
Link: https://patch.msgid.link/20241218091307.96656-1-shumingf@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yu Kuai <yukuai3@huawei.com>
Date: Wed Jan 8 16:41:48 2025 +0800
block, bfq: fix waker_bfqq UAF after bfq_split_bfqq()
[ Upstream commit fcede1f0a043ccefe9bc6ad57f12718e42f63f1d ]
Our syzkaller report a following UAF for v6.6:
BUG: KASAN: slab-use-after-free in bfq_init_rq+0x175d/0x17a0 block/bfq-iosched.c:6958
Read of size 8 at addr ffff8881b57147d8 by task fsstress/232726
CPU: 2 PID: 232726 Comm: fsstress Not tainted 6.6.0-g3629d1885222 #39
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x91/0xf0 lib/dump_stack.c:106
print_address_description.constprop.0+0x66/0x300 mm/kasan/report.c:364
print_report+0x3e/0x70 mm/kasan/report.c:475
kasan_report+0xb8/0xf0 mm/kasan/report.c:588
hlist_add_head include/linux/list.h:1023 [inline]
bfq_init_rq+0x175d/0x17a0 block/bfq-iosched.c:6958
bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271
bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323
blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660
blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143
__submit_bio+0xa0/0x6b0 block/blk-core.c:639
__submit_bio_noacct_mq block/blk-core.c:718 [inline]
submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747
submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847
__ext4_read_bh fs/ext4/super.c:205 [inline]
ext4_read_bh+0x15e/0x2e0 fs/ext4/super.c:230
__read_extent_tree_block+0x304/0x6f0 fs/ext4/extents.c:567
ext4_find_extent+0x479/0xd20 fs/ext4/extents.c:947
ext4_ext_map_blocks+0x1a3/0x2680 fs/ext4/extents.c:4182
ext4_map_blocks+0x929/0x15a0 fs/ext4/inode.c:660
ext4_iomap_begin_report+0x298/0x480 fs/ext4/inode.c:3569
iomap_iter+0x3dd/0x1010 fs/iomap/iter.c:91
iomap_fiemap+0x1f4/0x360 fs/iomap/fiemap.c:80
ext4_fiemap+0x181/0x210 fs/ext4/extents.c:5051
ioctl_fiemap.isra.0+0x1b4/0x290 fs/ioctl.c:220
do_vfs_ioctl+0x31c/0x11a0 fs/ioctl.c:811
__do_sys_ioctl fs/ioctl.c:869 [inline]
__se_sys_ioctl+0xae/0x190 fs/ioctl.c:857
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x78/0xe2
Allocated by task 232719:
kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
__kasan_slab_alloc+0x87/0x90 mm/kasan/common.c:328
kasan_slab_alloc include/linux/kasan.h:188 [inline]
slab_post_alloc_hook mm/slab.h:768 [inline]
slab_alloc_node mm/slub.c:3492 [inline]
kmem_cache_alloc_node+0x1b8/0x6f0 mm/slub.c:3537
bfq_get_queue+0x215/0x1f00 block/bfq-iosched.c:5869
bfq_get_bfqq_handle_split+0x167/0x5f0 block/bfq-iosched.c:6776
bfq_init_rq+0x13a4/0x17a0 block/bfq-iosched.c:6938
bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271
bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323
blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660
blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143
__submit_bio+0xa0/0x6b0 block/blk-core.c:639
__submit_bio_noacct_mq block/blk-core.c:718 [inline]
submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747
submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847
__ext4_read_bh fs/ext4/super.c:205 [inline]
ext4_read_bh_nowait+0x15a/0x240 fs/ext4/super.c:217
ext4_read_bh_lock+0xac/0xd0 fs/ext4/super.c:242
ext4_bread_batch+0x268/0x500 fs/ext4/inode.c:958
__ext4_find_entry+0x448/0x10f0 fs/ext4/namei.c:1671
ext4_lookup_entry fs/ext4/namei.c:1774 [inline]
ext4_lookup.part.0+0x359/0x6f0 fs/ext4/namei.c:1842
ext4_lookup+0x72/0x90 fs/ext4/namei.c:1839
__lookup_slow+0x257/0x480 fs/namei.c:1696
lookup_slow fs/namei.c:1713 [inline]
walk_component+0x454/0x5c0 fs/namei.c:2004
link_path_walk.part.0+0x773/0xda0 fs/namei.c:2331
link_path_walk fs/namei.c:3826 [inline]
path_openat+0x1b9/0x520 fs/namei.c:3826
do_filp_open+0x1b7/0x400 fs/namei.c:3857
do_sys_openat2+0x5dc/0x6e0 fs/open.c:1428
do_sys_open fs/open.c:1443 [inline]
__do_sys_openat fs/open.c:1459 [inline]
__se_sys_openat fs/open.c:1454 [inline]
__x64_sys_openat+0x148/0x200 fs/open.c:1454
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x78/0xe2
Freed by task 232726:
kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2b/0x50 mm/kasan/generic.c:522
____kasan_slab_free mm/kasan/common.c:236 [inline]
__kasan_slab_free+0x12a/0x1b0 mm/kasan/common.c:244
kasan_slab_free include/linux/kasan.h:164 [inline]
slab_free_hook mm/slub.c:1827 [inline]
slab_free_freelist_hook mm/slub.c:1853 [inline]
slab_free mm/slub.c:3820 [inline]
kmem_cache_free+0x110/0x760 mm/slub.c:3842
bfq_put_queue+0x6a7/0xfb0 block/bfq-iosched.c:5428
bfq_forget_entity block/bfq-wf2q.c:634 [inline]
bfq_put_idle_entity+0x142/0x240 block/bfq-wf2q.c:645
bfq_forget_idle+0x189/0x1e0 block/bfq-wf2q.c:671
bfq_update_vtime block/bfq-wf2q.c:1280 [inline]
__bfq_lookup_next_entity block/bfq-wf2q.c:1374 [inline]
bfq_lookup_next_entity+0x350/0x480 block/bfq-wf2q.c:1433
bfq_update_next_in_service+0x1c0/0x4f0 block/bfq-wf2q.c:128
bfq_deactivate_entity+0x10a/0x240 block/bfq-wf2q.c:1188
bfq_deactivate_bfqq block/bfq-wf2q.c:1592 [inline]
bfq_del_bfqq_busy+0x2e8/0xad0 block/bfq-wf2q.c:1659
bfq_release_process_ref+0x1cc/0x220 block/bfq-iosched.c:3139
bfq_split_bfqq+0x481/0xdf0 block/bfq-iosched.c:6754
bfq_init_rq+0xf29/0x17a0 block/bfq-iosched.c:6934
bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271
bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323
blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660
blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143
__submit_bio+0xa0/0x6b0 block/blk-core.c:639
__submit_bio_noacct_mq block/blk-core.c:718 [inline]
submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747
submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847
__ext4_read_bh fs/ext4/super.c:205 [inline]
ext4_read_bh+0x15e/0x2e0 fs/ext4/super.c:230
__read_extent_tree_block+0x304/0x6f0 fs/ext4/extents.c:567
ext4_find_extent+0x479/0xd20 fs/ext4/extents.c:947
ext4_ext_map_blocks+0x1a3/0x2680 fs/ext4/extents.c:4182
ext4_map_blocks+0x929/0x15a0 fs/ext4/inode.c:660
ext4_iomap_begin_report+0x298/0x480 fs/ext4/inode.c:3569
iomap_iter+0x3dd/0x1010 fs/iomap/iter.c:91
iomap_fiemap+0x1f4/0x360 fs/iomap/fiemap.c:80
ext4_fiemap+0x181/0x210 fs/ext4/extents.c:5051
ioctl_fiemap.isra.0+0x1b4/0x290 fs/ioctl.c:220
do_vfs_ioctl+0x31c/0x11a0 fs/ioctl.c:811
__do_sys_ioctl fs/ioctl.c:869 [inline]
__se_sys_ioctl+0xae/0x190 fs/ioctl.c:857
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x78/0xe2
commit 1ba0403ac644 ("block, bfq: fix uaf for accessing waker_bfqq after
splitting") fix the problem that if waker_bfqq is in the merge chain,
and current is the only procress, waker_bfqq can be freed from
bfq_split_bfqq(). However, the case that waker_bfqq is not in the merge
chain is missed, and if the procress reference of waker_bfqq is 0,
waker_bfqq can be freed as well.
Fix the problem by checking procress reference if waker_bfqq is not in
the merge_chain.
Fixes: 1ba0403ac644 ("block, bfq: fix uaf for accessing waker_bfqq after splitting")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20250108084148.1549973-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chris Lu <chris.lu@mediatek.com>
Date: Wed Jan 8 17:50:28 2025 +0800
Bluetooth: btmtk: Fix failed to send func ctrl for MediaTek devices.
[ Upstream commit 67dba2c28fe0af7e25ea1aeade677162ed05310a ]
Use usb_autopm_get_interface() and usb_autopm_put_interface()
in btmtk_usb_shutdown(), it could send func ctrl after enabling
autosuspend.
Bluetooth: btmtk_usb_hci_wmt_sync() hci0: Execution of wmt command
timed out
Bluetooth: btmtk_usb_shutdown() hci0: Failed to send wmt func ctrl
(-110)
Fixes: 5c5e8c52e3ca ("Bluetooth: btmtk: move btusb_mtk_[setup, shutdown] to btmtk.c")
Signed-off-by: Chris Lu <chris.lu@mediatek.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Date: Fri Dec 20 18:32:52 2024 +0530
Bluetooth: btnxpuart: Fix driver sending truncated data
[ Upstream commit 8023dd2204254a70887f5ee58d914bf70a060b9d ]
This fixes the apparent controller hang issue seen during stress test
where the host sends a truncated payload, followed by HCI commands. The
controller treats these HCI commands as a part of previously truncated
payload, leading to command timeouts.
Adding a serdev_device_wait_until_sent() call after
serdev_device_write_buf() fixed the issue.
Fixes: 689ca16e5232 ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets")
Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date: Mon Nov 25 15:42:09 2024 -0500
Bluetooth: hci_sync: Fix not setting Random Address when required
[ Upstream commit c2994b008492db033d40bd767be1620229a3035e ]
This fixes errors such as the following when Own address type is set to
Random Address but it has not been programmed yet due to either be
advertising or connecting:
< HCI Command: LE Set Exte.. (0x08|0x0041) plen 13
Own address type: Random (0x03)
Filter policy: Ignore not in accept list (0x01)
PHYs: 0x05
Entry 0: LE 1M
Type: Passive (0x00)
Interval: 60.000 msec (0x0060)
Window: 30.000 msec (0x0030)
Entry 1: LE Coded
Type: Passive (0x00)
Interval: 180.000 msec (0x0120)
Window: 90.000 msec (0x0090)
> HCI Event: Command Complete (0x0e) plen 4
LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1
Status: Success (0x00)
< HCI Command: LE Set Exten.. (0x08|0x0042) plen 6
Extended scan: Enabled (0x01)
Filter duplicates: Enabled (0x01)
Duration: 0 msec (0x0000)
Period: 0.00 sec (0x0000)
> HCI Event: Command Complete (0x0e) plen 4
LE Set Extended Scan Enable (0x08|0x0042) ncmd 1
Status: Invalid HCI Command Parameters (0x12)
Fixes: c45074d68a9b ("Bluetooth: Fix not generating RPA when required")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date: Mon Nov 25 15:42:10 2024 -0500
Bluetooth: MGMT: Fix Add Device to responding before completing
[ Upstream commit a182d9c84f9c52fb5db895ecceeee8b3a1bf661e ]
Add Device with LE type requires updating resolving/accept list which
requires quite a number of commands to complete and each of them may
fail, so instead of pretending it would always work this checks the
return of hci_update_passive_scan_sync which indicates if everything
worked as intended.
Fixes: e8907f76544f ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 3")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Chan <michael.chan@broadcom.com>
Date: Fri Jan 3 20:38:48 2025 -0800
bnxt_en: Fix DIM shutdown
[ Upstream commit 40452969a50652e3cbf89dac83d54eebf2206d27 ]
DIM work will call the firmware to adjust the coalescing parameters on
the RX rings. We should cancel DIM work before we call the firmware
to free the RX rings. Otherwise, FW will reject the call from DIM
work if the RX ring has been freed. This will generate an error
message like this:
bnxt_en 0000:21:00.1 ens2f1np1: hwrm req_type 0x53 seq id 0x6fca error 0x2
and cause unnecessary concern for the user. It is also possible to
modify the coalescing parameters of the wrong ring if the ring has
been re-allocated.
To prevent this, cancel DIM work right before freeing the RX rings.
We also have to add a check in NAPI poll to not schedule DIM if the
RX rings are shutting down. Check that the VNIC is active before we
schedule DIM. The VNIC is always disabled before we free the RX rings.
Fixes: 0bc0b97fca73 ("bnxt_en: cleanup DIM work on device shutdown")
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250104043849.3482067-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Date: Fri Jan 3 20:38:47 2025 -0800
bnxt_en: Fix possible memory leak when hwrm_req_replace fails
[ Upstream commit c8dafb0e4398dacc362832098a04b97da3b0395b ]
When hwrm_req_replace() fails, the driver is not invoking bnxt_req_drop()
which could cause a memory leak.
Fixes: bbf33d1d9805 ("bnxt_en: update all firmware calls to use the new APIs")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250104043849.3482067-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qu Wenruo <wqu@suse.com>
Date: Thu Jan 2 14:44:16 2025 +1030
btrfs: avoid NULL pointer dereference if no valid extent tree
[ Upstream commit 6aecd91a5c5b68939cf4169e32bc49f3cd2dd329 ]
[BUG]
Syzbot reported a crash with the following call trace:
BTRFS info (device loop0): scrub: started on devid 1
BUG: kernel NULL pointer dereference, address: 0000000000000208
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 106e70067 P4D 106e70067 PUD 107143067 PMD 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 UID: 0 PID: 689 Comm: repro Kdump: loaded Tainted: G O 6.13.0-rc4-custom+ #206
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
RIP: 0010:find_first_extent_item+0x26/0x1f0 [btrfs]
Call Trace:
<TASK>
scrub_find_fill_first_stripe+0x13d/0x3b0 [btrfs]
scrub_simple_mirror+0x175/0x260 [btrfs]
scrub_stripe+0x5d4/0x6c0 [btrfs]
scrub_chunk+0xbb/0x170 [btrfs]
scrub_enumerate_chunks+0x2f4/0x5f0 [btrfs]
btrfs_scrub_dev+0x240/0x600 [btrfs]
btrfs_ioctl+0x1dc8/0x2fa0 [btrfs]
? do_sys_openat2+0xa5/0xf0
__x64_sys_ioctl+0x97/0xc0
do_syscall_64+0x4f/0x120
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
[CAUSE]
The reproducer is using a corrupted image where extent tree root is
corrupted, thus forcing to use "rescue=all,ro" mount option to mount the
image.
Then it triggered a scrub, but since scrub relies on extent tree to find
where the data/metadata extents are, scrub_find_fill_first_stripe()
relies on an non-empty extent root.
But unfortunately scrub_find_fill_first_stripe() doesn't really expect
an NULL pointer for extent root, it use extent_root to grab fs_info and
triggered a NULL pointer dereference.
[FIX]
Add an extra check for a valid extent root at the beginning of
scrub_find_fill_first_stripe().
The new error path is introduced by 42437a6386ff ("btrfs: introduce
mount option rescue=ignorebadroots"), but that's pretty old, and later
commit b979547513ff ("btrfs: scrub: introduce helper to find and fill
sector info for a scrub_stripe") changed how we do scrub.
So for kernels older than 6.6, the fix will need manual backport.
Reported-by: syzbot+339e9dbe3a2ca419b85d@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/67756935.050a0220.25abdd.0a12.GAE@google.com/
Fixes: 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots")
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Date: Wed Dec 18 11:32:51 2024 +0100
btrfs: zlib: fix avail_in bytes for s390 zlib HW compression path
commit 0ee4736c003daded513de0ff112d4a1e9c85bbab upstream.
Since the input data length passed to zlib_compress_folios() can be
arbitrary, always setting strm.avail_in to a multiple of PAGE_SIZE may
cause read-in bytes to exceed the input range. Currently this triggers
an assert in btrfs_compress_folios() on the debug kernel (see below).
Fix strm.avail_in calculation for S390 hardware acceleration path.
assertion failed: *total_in <= orig_len, in fs/btrfs/compression.c:1041
------------[ cut here ]------------
kernel BUG at fs/btrfs/compression.c:1041!
monitor event: 0040 ilc:2 [#1] PREEMPT SMP
CPU: 16 UID: 0 PID: 325 Comm: kworker/u273:3 Not tainted 6.13.0-20241204.rc1.git6.fae3b21430ca.300.fc41.s390x+debug #1
Hardware name: IBM 3931 A01 703 (z/VM 7.4.0)
Workqueue: btrfs-delalloc btrfs_work_helper
Krnl PSW : 0704d00180000000 0000021761df6538 (btrfs_compress_folios+0x198/0x1a0)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: 0000000080000000 0000000000000001 0000000000000047 0000000000000000
0000000000000006 ffffff01757bb000 000001976232fcc0 000000000000130c
000001976232fcd0 000001976232fcc8 00000118ff4a0e30 0000000000000001
00000111821ab400 0000011100000000 0000021761df6534 000001976232fb58
Krnl Code: 0000021761df6528: c020006f5ef4 larl %r2,0000021762be2310
0000021761df652e: c0e5ffbd09d5 brasl %r14,00000217615978d8
#0000021761df6534: af000000 mc 0,0
>0000021761df6538: 0707 bcr 0,%r7
0000021761df653a: 0707 bcr 0,%r7
0000021761df653c: 0707 bcr 0,%r7
0000021761df653e: 0707 bcr 0,%r7
0000021761df6540: c004004bb7ec brcl 0,000002176276d518
Call Trace:
[<0000021761df6538>] btrfs_compress_folios+0x198/0x1a0
([<0000021761df6534>] btrfs_compress_folios+0x194/0x1a0)
[<0000021761d97788>] compress_file_range+0x3b8/0x6d0
[<0000021761dcee7c>] btrfs_work_helper+0x10c/0x160
[<0000021761645760>] process_one_work+0x2b0/0x5d0
[<000002176164637e>] worker_thread+0x20e/0x3e0
[<000002176165221a>] kthread+0x15a/0x170
[<00000217615b859c>] __ret_from_fork+0x3c/0x60
[<00000217626e72d2>] ret_from_fork+0xa/0x38
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<0000021761597924>] _printk+0x4c/0x58
Kernel panic - not syncing: Fatal exception: panic_on_oops
Fixes: fd1e75d0105d ("btrfs: make compression path to be subpage compatible")
CC: stable@vger.kernel.org # 6.12+
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Waiman Long <longman@redhat.com>
Date: Thu Dec 5 14:51:01 2024 -0500
cgroup/cpuset: Prevent leakage of isolated CPUs into sched domains
[ Upstream commit 9b496a8bbed9cc292b0dfd796f38ec58b6d0375f ]
Isolated CPUs are not allowed to be used in a non-isolated partition.
The only exception is the top cpuset which is allowed to contain boot
time isolated CPUs.
Commit ccac8e8de99c ("cgroup/cpuset: Fix remote root partition creation
problem") introduces a simplified scheme of including only partition
roots in sched domain generation. However, it does not properly account
for this exception case. This can result in leakage of isolated CPUs
into a sched domain.
Fix it by making sure that isolated CPUs are excluded from the top
cpuset before generating sched domains.
Also update the way the boot time isolated CPUs are handled in
test_cpuset_prs.sh to make sure that those isolated CPUs are really
isolated instead of just skipping them in the tests.
Fixes: ccac8e8de99c ("cgroup/cpuset: Fix remote root partition creation problem")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chen Ridong <chenridong@huawei.com>
Date: Mon Jan 6 08:19:04 2025 +0000
cgroup/cpuset: remove kernfs active break
[ Upstream commit 3cb97a927fffe443e1e7e8eddbfebfdb062e86ed ]
A warning was found:
WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828
CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G
RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0
RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202
RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04
RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180
R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08
R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0
FS: 00007f84cd0be740(0000) GS:ffff8887ddc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
kernfs_drain+0x15e/0x2f0
__kernfs_remove+0x165/0x300
kernfs_remove_by_name_ns+0x7b/0xc0
cgroup_rm_file+0x154/0x1c0
cgroup_addrm_files+0x1c2/0x1f0
css_clear_dir+0x77/0x110
kill_css+0x4c/0x1b0
cgroup_destroy_locked+0x194/0x380
cgroup_rmdir+0x2a/0x140
It can be explained by:
rmdir echo 1 > cpuset.cpus
kernfs_fop_write_iter // active=0
cgroup_rm_file
kernfs_remove_by_name_ns kernfs_get_active // active=1
__kernfs_remove // active=0x80000002
kernfs_drain cpuset_write_resmask
wait_event
//waiting (active == 0x80000001)
kernfs_break_active_protection
// active = 0x80000001
// continue
kernfs_unbreak_active_protection
// active = 0x80000002
...
kernfs_should_drain_open_files
// warning occurs
kernfs_put_active
This warning is caused by 'kernfs_break_active_protection' when it is
writing to cpuset.cpus, and the cgroup is removed concurrently.
The commit 3a5a6d0c2b03 ("cpuset: don't nest cgroup_mutex inside
get_online_cpus()") made cpuset_hotplug_workfn asynchronous, This change
involves calling flush_work(), which can create a multiple processes
circular locking dependency that involve cgroup_mutex, potentially leading
to a deadlock. To avoid deadlock. the commit 76bb5ab8f6e3 ("cpuset: break
kernfs active protection in cpuset_write_resmask()") added
'kernfs_break_active_protection' in the cpuset_write_resmask. This could
lead to this warning.
After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug
processing synchronous"), the cpuset_write_resmask no longer needs to
wait the hotplug to finish, which means that concurrent hotplug and cpuset
operations are no longer possible. Therefore, the deadlock doesn't exist
anymore and it does not have to 'break active protection' now. To fix this
warning, just remove kernfs_break_active_protection operation in the
'cpuset_write_resmask'.
Fixes: bdb2fd7fc56e ("kernfs: Skip kernfs_drain_open_files() more aggressively")
Fixes: 76bb5ab8f6e3 ("cpuset: break kernfs active protection in cpuset_write_resmask()")
Reported-by: Ji Fa <jifa@huawei.com>
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Sat Nov 16 00:32:39 2024 +0100
cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu
[ Upstream commit 7e25044b804581b9c029d5a28d8800aebde18043 ]
The 'np' device_node is initialized via of_cpu_device_node_get(), which
requires explicit calls to of_node_put() when it is no longer required
to avoid leaking the resource.
Instead of adding the missing calls to of_node_put() in all execution
paths, use the cleanup attribute for 'np' by means of the __free()
macro, which automatically calls of_node_put() when the variable goes
out of scope. Given that 'np' is only used within the
for_each_possible_cpu(), reduce its scope to release the nood after
every iteration of the loop.
Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver")
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://lore.kernel.org/r/20241116-cpuidle-riscv-sbi-cleanup-v3-1-a3a46372ce08@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Anumula Murali Mohan Reddy <anumula@chelsio.com>
Date: Fri Jan 3 14:53:27 2025 +0530
cxgb4: Avoid removal of uninserted tid
[ Upstream commit 4c1224501e9d6c5fd12d83752f1c1b444e0e3418 ]
During ARP failure, tid is not inserted but _c4iw_free_ep()
attempts to remove tid which results in error.
This patch fixes the issue by avoiding removal of uninserted tid.
Fixes: 59437d78f088 ("cxgb4/chtls: fix ULD connection failures due to wrong TID base")
Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Link: https://patch.msgid.link/20250103092327.1011925-1-anumula@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Thu Dec 5 19:41:53 2024 +0800
dm array: fix cursor index when skipping across block boundaries
[ Upstream commit 0bb1968da2737ba68fd63857d1af2b301a18d3bf ]
dm_array_cursor_skip() seeks to the target position by loading array
blocks iteratively until the specified number of entries to skip is
reached. When seeking across block boundaries, it uses
dm_array_cursor_next() to step into the next block.
dm_array_cursor_skip() must first move the cursor index to the end
of the current block; otherwise, the cursor position could incorrectly
remain in the same block, causing the actual number of skipped entries
to be much smaller than expected.
This bug affects cache resizing in v2 metadata and could lead to data
loss if the fast device is shrunk during the first-time resume. For
example:
1. create a cache metadata consists of 32768 blocks, with a dirty block
assigned to the second bitmap block. cache_restore v1.0 is required.
cat <<EOF >> cmeta.xml
<superblock uuid="" block_size="64" nr_cache_blocks="32768" \
policy="smq" hint_width="4">
<mappings>
<mapping cache_block="32767" origin_block="0" dirty="true"/>
</mappings>
</superblock>
EOF
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
cache_restore -i cmeta.xml -o /dev/mapper/cmeta --metadata-version=2
2. bring up the cache while attempt to discard all the blocks belonging
to the second bitmap block (block# 32576 to 32767). The last command
is expected to fail, but it actually succeeds.
dmsetup create cdata --table "0 2084864 linear /dev/sdc 8192"
dmsetup create corig --table "0 65536 linear /dev/sdc 2105344"
dmsetup create cache --table "0 65536 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 64 2 metadata2 writeback smq \
2 migration_threshold 0"
In addition to the reproducer described above, this fix can be
verified using the "array_cursor/skip" tests in dm-unit:
dm-unit run /pdata/array_cursor/skip/ --kernel-dir <KERNEL_DIR>
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Fixes: 9b696229aa7d ("dm persistent data: add cursor skip functions to the cursor APIs")
Reviewed-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Thu Dec 5 19:41:51 2024 +0800
dm array: fix releasing a faulty array block twice in dm_array_cursor_end
[ Upstream commit f2893c0804d86230ffb8f1c8703fdbb18648abc8 ]
When dm_bm_read_lock() fails due to locking or checksum errors, it
releases the faulty block implicitly while leaving an invalid output
pointer behind. The caller of dm_bm_read_lock() should not operate on
this invalid dm_block pointer, or it will lead to undefined result.
For example, the dm_array_cursor incorrectly caches the invalid pointer
on reading a faulty array block, causing a double release in
dm_array_cursor_end(), then hitting the BUG_ON in dm-bufio cache_put().
Reproduce steps:
1. initialize a cache device
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
dmsetup create corig --table "0 524288 linear /dev/sdc $262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1
dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
2. wipe the second array block offline
dmsteup remove cache cmeta cdata corig
mapping_root=$(dd if=/dev/sdc bs=1c count=8 skip=192 \
2>/dev/null | hexdump -e '1/8 "%u\n"')
ablock=$(dd if=/dev/sdc bs=1c count=8 skip=$((4096*mapping_root+2056)) \
2>/dev/null | hexdump -e '1/8 "%u\n"')
dd if=/dev/zero of=/dev/sdc bs=4k count=1 seek=$ablock
3. try reopen the cache device
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
dmsetup create corig --table "0 524288 linear /dev/sdc $262144"
dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
Kernel logs:
(snip)
device-mapper: array: array_block_check failed: blocknr 0 != wanted 10
device-mapper: block manager: array validator check failed for block 10
device-mapper: array: get_ablock failed
device-mapper: cache metadata: dm_array_cursor_next for mapping failed
------------[ cut here ]------------
kernel BUG at drivers/md/dm-bufio.c:638!
Fix by setting the cached block pointer to NULL on errors.
In addition to the reproducer described above, this fix can be
verified using the "array_cursor/damaged" test in dm-unit:
dm-unit run /pdata/array_cursor/damaged --kernel-dir <KERNEL_DIR>
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Fixes: fdd1315aa5f0 ("dm array: introduce cursor api")
Reviewed-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Thu Dec 5 19:41:52 2024 +0800
dm array: fix unreleased btree blocks on closing a faulty array cursor
[ Upstream commit 626f128ee9c4133b1cfce4be2b34a1508949370e ]
The cached block pointer in dm_array_cursor might be NULL if it reaches
an unreadable array block, or the array is empty. Therefore,
dm_array_cursor_end() should call dm_btree_cursor_end() unconditionally,
to prevent leaving unreleased btree blocks.
This fix can be verified using the "array_cursor/iterate/empty" test
in dm-unit:
dm-unit run /pdata/array_cursor/iterate/empty --kernel-dir <KERNEL_DIR>
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Fixes: fdd1315aa5f0 ("dm array: introduce cursor api")
Reviewed-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Krister Johansen <kjlx@templeofstupid.com>
Date: Tue Jan 7 15:24:58 2025 -0800
dm thin: make get_first_thin use rcu-safe list first function
commit 80f130bfad1dab93b95683fc39b87235682b8f72 upstream.
The documentation in rculist.h explains the absence of list_empty_rcu()
and cautions programmers against relying on a list_empty() ->
list_first() sequence in RCU safe code. This is because each of these
functions performs its own READ_ONCE() of the list head. This can lead
to a situation where the list_empty() sees a valid list entry, but the
subsequent list_first() sees a different view of list head state after a
modification.
In the case of dm-thin, this author had a production box crash from a GP
fault in the process_deferred_bios path. This function saw a valid list
head in get_first_thin() but when it subsequently dereferenced that and
turned it into a thin_c, it got the inside of the struct pool, since the
list was now empty and referring to itself. The kernel on which this
occurred printed both a warning about a refcount_t being saturated, and
a UBSAN error for an out-of-bounds cpuid access in the queued spinlock,
prior to the fault itself. When the resulting kdump was examined, it
was possible to see another thread patiently waiting in thin_dtr's
synchronize_rcu.
The thin_dtr call managed to pull the thin_c out of the active thins
list (and have it be the last entry in the active_thins list) at just
the wrong moment which lead to this crash.
Fortunately, the fix here is straight forward. Switch get_first_thin()
function to use list_first_or_null_rcu() which performs just a single
READ_ONCE() and returns NULL if the list is already empty.
This was run against the devicemapper test suite's thin-provisioning
suites for delete and suspend and no regressions were observed.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Fixes: b10ebd34ccca ("dm thin: fix rcu_read_lock being held in code that can sleep")
Cc: stable@vger.kernel.org
Acked-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue Jan 7 17:47:01 2025 +0100
dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY
commit 47f33c27fc9565fb0bc7dfb76be08d445cd3d236 upstream.
dm-ebs uses dm-bufio to process requests that are not aligned on logical
sector size. dm-bufio doesn't support passing integrity data (and it is
unclear how should it do it), so we shouldn't set the
DM_TARGET_PASSES_INTEGRITY flag.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: d3c7b35c20d6 ("dm: add emulated block size target")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Milan Broz <gmazyland@gmail.com>
Date: Wed Dec 18 13:56:58 2024 +0100
dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2)
commit 6df90c02bae468a3a6110bafbc659884d0c4966c upstream.
This patch fixes an issue that was fixed in the commit
df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to block size")
but later broken again in the commit
8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO")
If the Reed-Solomon roots setting spans multiple blocks, the code does not
use proper parity bytes and randomly fails to repair even trivial errors.
This bug cannot happen if the sector size is multiple of RS roots
setting (Android case with roots 2).
The previous solution was to find a dm-bufio block size that is multiple
of the device sector size and roots size. Unfortunately, the optimization
in commit 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO")
is incorrect and uses data block size for some roots (for example, it uses
4096 block size for roots = 20).
This patch uses a different approach:
- It always uses a configured data block size for dm-bufio to avoid
possible misaligned IOs.
- and it caches the processed parity bytes, so it can join it
if it spans two blocks.
As the RS calculation is called only if an error is detected and
the process is computationally intensive, copying a few more bytes
should not introduce performance issues.
The issue was reported to cryptsetup with trivial reproducer
https://gitlab.com/cryptsetup/cryptsetup/-/issues/923
Reproducer (with roots=20):
# create verity device with RS FEC
dd if=/dev/urandom of=data.img bs=4096 count=8 status=none
veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=20 | \
awk '/^Root hash/{ print $3 }' >roothash
# create an erasure that should always be repairable with this roots setting
dd if=/dev/zero of=data.img conv=notrunc bs=1 count=4 seek=4 status=none
# try to read it through dm-verity
veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=20 $(cat roothash)
dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer
Even now the log says it cannot repair it:
: verity-fec: 7:1: FEC 0: failed to correct: -74
: device-mapper: verity: 7:1: data block 0 is corrupted
...
With this fix, errors are properly repaired.
: verity-fec: 7:1: FEC 0: corrected 4 errors
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Fixes: 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Atish Patra <atishp@rivosinc.com>
Date: Thu Dec 12 16:09:32 2024 -0800
drivers/perf: riscv: Fix Platform firmware event data
[ Upstream commit fc58db9aeb15e89b69ff5e9abc69ecf9e5f888ed ]
Platform firmware event data field is allowed to be 62 bits for
Linux as uppper most two bits are reserved to indicate SBI fw or
platform specific firmware events.
However, the event data field is masked as per the hardware raw
event mask which is not correct.
Fix the platform firmware event data field with proper mask.
Fixes: f0c9363db2dd ("perf/riscv-sbi: Add platform specific firmware event handling")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Atish Patra <atishp@rivosinc.com>
Date: Thu Dec 12 16:09:33 2024 -0800
drivers/perf: riscv: Return error for default case
[ Upstream commit 2c206cdede567f53035c622e846678a996f39d69 ]
If the upper two bits has an invalid valid (0x1), the event mapping
is not reliable as it returns an uninitialized variable.
Return appropriate value for the default case.
Fixes: f0c9363db2dd ("perf/riscv-sbi: Add platform specific firmware event handling")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-2-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Roman Li <Roman.Li@amd.com>
Date: Fri Dec 13 13:51:07 2024 -0500
drm/amd/display: Add check for granularity in dml ceil/floor helpers
commit 0881fbc4fd62e00a2b8e102725f76d10351b2ea8 upstream.
[Why]
Wrapper functions for dcn_bw_ceil2() and dcn_bw_floor2()
should check for granularity is non zero to avoid assert and
divide-by-zero error in dcn_bw_ functions.
[How]
Add check for granularity 0.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f6e09701c3eb2ccb8cb0518e0b67f1c69742a4ec)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Melissa Wen <mwen@igalia.com>
Date: Tue Dec 17 17:45:05 2024 -0300
drm/amd/display: fix divide error in DM plane scale calcs
commit 5225fd2a26211d012533acf98a6ad3f983885817 upstream.
dm_get_plane_scale doesn't take into account plane scaled size equal to
zero, leading to a kernel oops due to division by zero. Fix by setting
out-scale size as zero when the dst size is zero, similar to what is
done by drm_calc_scale(). This issue started with the introduction of
cursor ovelay mode that uses this function to assess cursor mode changes
via dm_crtc_get_cursor_mode() before checking plane state.
[Dec17 17:14] Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI
[ +0.000018] CPU: 5 PID: 1660 Comm: surface-DP-1 Not tainted 6.10.0+ #231
[ +0.000007] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0131 01/30/2024
[ +0.000004] RIP: 0010:dm_get_plane_scale+0x3f/0x60 [amdgpu]
[ +0.000553] Code: 44 0f b7 41 3a 44 0f b7 49 3e 83 e0 0f 48 0f a3 c2 73 21 69 41 28 e8 03 00 00 31 d2 41 f7 f1 31 d2 89 06 69 41 2c e8 03 00 00 <41> f7 f0 89 07 e9 d7 d8 7e e9 44 89 c8 45 89 c1 41 89 c0 eb d4 66
[ +0.000005] RSP: 0018:ffffa8df0de6b8a0 EFLAGS: 00010246
[ +0.000006] RAX: 00000000000003e8 RBX: ffff9ac65c1f6e00 RCX: ffff9ac65d055500
[ +0.000003] RDX: 0000000000000000 RSI: ffffa8df0de6b8b0 RDI: ffffa8df0de6b8b4
[ +0.000004] RBP: ffff9ac64e7a5800 R08: 0000000000000000 R09: 0000000000000a00
[ +0.000003] R10: 00000000000000ff R11: 0000000000000054 R12: ffff9ac6d0700010
[ +0.000003] R13: ffff9ac65d054f00 R14: ffff9ac65d055500 R15: ffff9ac64e7a60a0
[ +0.000004] FS: 00007f869ea00640(0000) GS:ffff9ac970080000(0000) knlGS:0000000000000000
[ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000003] CR2: 000055ca701becd0 CR3: 000000010e7f2000 CR4: 0000000000350ef0
[ +0.000004] Call Trace:
[ +0.000007] <TASK>
[ +0.000006] ? __die_body.cold+0x19/0x27
[ +0.000009] ? die+0x2e/0x50
[ +0.000007] ? do_trap+0xca/0x110
[ +0.000007] ? do_error_trap+0x6a/0x90
[ +0.000006] ? dm_get_plane_scale+0x3f/0x60 [amdgpu]
[ +0.000504] ? exc_divide_error+0x38/0x50
[ +0.000005] ? dm_get_plane_scale+0x3f/0x60 [amdgpu]
[ +0.000488] ? asm_exc_divide_error+0x1a/0x20
[ +0.000011] ? dm_get_plane_scale+0x3f/0x60 [amdgpu]
[ +0.000593] dm_crtc_get_cursor_mode+0x33f/0x430 [amdgpu]
[ +0.000562] amdgpu_dm_atomic_check+0x2ef/0x1770 [amdgpu]
[ +0.000501] drm_atomic_check_only+0x5e1/0xa30 [drm]
[ +0.000047] drm_mode_atomic_ioctl+0x832/0xcb0 [drm]
[ +0.000050] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
[ +0.000047] drm_ioctl_kernel+0xb3/0x100 [drm]
[ +0.000062] drm_ioctl+0x27a/0x4f0 [drm]
[ +0.000049] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
[ +0.000055] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu]
[ +0.000360] __x64_sys_ioctl+0x97/0xd0
[ +0.000010] do_syscall_64+0x82/0x190
[ +0.000008] ? __pfx_drm_mode_createblob_ioctl+0x10/0x10 [drm]
[ +0.000044] ? srso_return_thunk+0x5/0x5f
[ +0.000006] ? drm_ioctl_kernel+0xb3/0x100 [drm]
[ +0.000040] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? __check_object_size+0x50/0x220
[ +0.000007] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? drm_ioctl+0x2a4/0x4f0 [drm]
[ +0.000039] ? __pfx_drm_mode_createblob_ioctl+0x10/0x10 [drm]
[ +0.000043] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? __pm_runtime_suspend+0x69/0xc0
[ +0.000006] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? amdgpu_drm_ioctl+0x71/0x90 [amdgpu]
[ +0.000366] ? srso_return_thunk+0x5/0x5f
[ +0.000006] ? syscall_exit_to_user_mode+0x77/0x210
[ +0.000007] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? do_syscall_64+0x8e/0x190
[ +0.000006] ? srso_return_thunk+0x5/0x5f
[ +0.000006] ? do_syscall_64+0x8e/0x190
[ +0.000006] ? srso_return_thunk+0x5/0x5f
[ +0.000007] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000008] RIP: 0033:0x55bb7cd962bc
[ +0.000007] Code: 4c 89 6c 24 18 4c 89 64 24 20 4c 89 74 24 28 0f 57 c0 0f 11 44 24 30 89 c7 48 8d 54 24 08 b8 10 00 00 00 be bc 64 38 c0 0f 05 <49> 89 c7 48 83 3b 00 74 09 4c 89 c7 ff 15 62 64 99 00 48 83 7b 18
[ +0.000005] RSP: 002b:00007f869e9f4da0 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
[ +0.000007] RAX: ffffffffffffffda RBX: 00007f869e9f4fb8 RCX: 000055bb7cd962bc
[ +0.000004] RDX: 00007f869e9f4da8 RSI: 00000000c03864bc RDI: 000000000000003b
[ +0.000003] RBP: 000055bb9ddcbcc0 R08: 00007f86541b9920 R09: 0000000000000009
[ +0.000004] R10: 0000000000000004 R11: 0000000000000217 R12: 00007f865406c6b0
[ +0.000003] R13: 00007f86541b5290 R14: 00007f865410b700 R15: 000055bb9ddcbc18
[ +0.000009] </TASK>
Fixes: 1b04dcca4fb1 ("drm/amd/display: Introduce overlay cursor mode")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3729
Reported-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Co-developed-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ab75a0d2e07942ae15d32c0a5092fd336451378c)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Melissa Wen <mwen@igalia.com>
Date: Tue Dec 17 17:45:03 2024 -0300
drm/amd/display: fix page fault due to max surface definition mismatch
commit 7de8d5c90be9ad9f6575e818a674801db2ada794 upstream.
DC driver is using two different values to define the maximum number of
surfaces: MAX_SURFACES and MAX_SURFACE_NUM. Consolidate MAX_SURFACES as
the unique definition for surface updates across DC.
It fixes page fault faced by Cosmic users on AMD display versions that
support two overlay planes, since the introduction of cursor overlay
mode.
[Nov26 21:33] BUG: unable to handle page fault for address: 0000000051d0f08b
[ +0.000015] #PF: supervisor read access in kernel mode
[ +0.000006] #PF: error_code(0x0000) - not-present page
[ +0.000005] PGD 0 P4D 0
[ +0.000007] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[ +0.000006] CPU: 4 PID: 71 Comm: kworker/u32:6 Not tainted 6.10.0+ #300
[ +0.000006] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0131 01/30/2024
[ +0.000007] Workqueue: events_unbound commit_work [drm_kms_helper]
[ +0.000040] RIP: 0010:copy_stream_update_to_stream.isra.0+0x30d/0x750 [amdgpu]
[ +0.000847] Code: 8b 10 49 89 94 24 f8 00 00 00 48 8b 50 08 49 89 94 24 00 01 00 00 8b 40 10 41 89 84 24 08 01 00 00 49 8b 45 78 48 85 c0 74 0b <0f> b6 00 41 88 84 24 90 64 00 00 49 8b 45 60 48 85 c0 74 3b 48 8b
[ +0.000010] RSP: 0018:ffffc203802f79a0 EFLAGS: 00010206
[ +0.000009] RAX: 0000000051d0f08b RBX: 0000000000000004 RCX: ffff9f964f0a8070
[ +0.000004] RDX: ffff9f9710f90e40 RSI: ffff9f96600c8000 RDI: ffff9f964f000000
[ +0.000004] RBP: ffffc203802f79f8 R08: 0000000000000000 R09: 0000000000000000
[ +0.000005] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9f96600c8000
[ +0.000004] R13: ffff9f9710f90e40 R14: ffff9f964f000000 R15: ffff9f96600c8000
[ +0.000004] FS: 0000000000000000(0000) GS:ffff9f9970000000(0000) knlGS:0000000000000000
[ +0.000005] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000005] CR2: 0000000051d0f08b CR3: 00000002e6a20000 CR4: 0000000000350ef0
[ +0.000005] Call Trace:
[ +0.000011] <TASK>
[ +0.000010] ? __die_body.cold+0x19/0x27
[ +0.000012] ? page_fault_oops+0x15a/0x2d0
[ +0.000014] ? exc_page_fault+0x7e/0x180
[ +0.000009] ? asm_exc_page_fault+0x26/0x30
[ +0.000013] ? copy_stream_update_to_stream.isra.0+0x30d/0x750 [amdgpu]
[ +0.000739] ? dc_commit_state_no_check+0xd6c/0xe70 [amdgpu]
[ +0.000470] update_planes_and_stream_state+0x49b/0x4f0 [amdgpu]
[ +0.000450] ? srso_return_thunk+0x5/0x5f
[ +0.000009] ? commit_minimal_transition_state+0x239/0x3d0 [amdgpu]
[ +0.000446] update_planes_and_stream_v2+0x24a/0x590 [amdgpu]
[ +0.000464] ? srso_return_thunk+0x5/0x5f
[ +0.000009] ? sort+0x31/0x50
[ +0.000007] ? amdgpu_dm_atomic_commit_tail+0x159f/0x3a30 [amdgpu]
[ +0.000508] ? srso_return_thunk+0x5/0x5f
[ +0.000009] ? amdgpu_crtc_get_scanout_position+0x28/0x40 [amdgpu]
[ +0.000377] ? srso_return_thunk+0x5/0x5f
[ +0.000009] ? drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x160/0x390 [drm]
[ +0.000058] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? dma_fence_default_wait+0x8c/0x260
[ +0.000010] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? wait_for_completion_timeout+0x13b/0x170
[ +0.000006] ? srso_return_thunk+0x5/0x5f
[ +0.000005] ? dma_fence_wait_timeout+0x108/0x140
[ +0.000010] ? commit_tail+0x94/0x130 [drm_kms_helper]
[ +0.000024] ? process_one_work+0x177/0x330
[ +0.000008] ? worker_thread+0x266/0x3a0
[ +0.000006] ? __pfx_worker_thread+0x10/0x10
[ +0.000004] ? kthread+0xd2/0x100
[ +0.000006] ? __pfx_kthread+0x10/0x10
[ +0.000006] ? ret_from_fork+0x34/0x50
[ +0.000004] ? __pfx_kthread+0x10/0x10
[ +0.000005] ? ret_from_fork_asm+0x1a/0x30
[ +0.000011] </TASK>
Fixes: 1b04dcca4fb1 ("drm/amd/display: Introduce overlay cursor mode")
Suggested-by: Leo Li <sunpeng.li@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1c86c81a86c60f9b15d3e3f43af0363cf56063e7)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Melissa Wen <mwen@igalia.com>
Date: Tue Dec 17 17:45:04 2024 -0300
drm/amd/display: increase MAX_SURFACES to the value supported by hw
commit 21541bc6b44241e3f791f9e552352d8440b2b29e upstream.
As the hw supports up to 4 surfaces, increase the maximum number of
surfaces to prevent the DC error when trying to use more than three
planes.
[drm:dc_state_add_plane [amdgpu]] *ERROR* Surface: can not attach plane_state 000000003e2cb82c! Maximum is: 3
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b8d6daffc871a42026c3c20bff7b8fa0302298c1)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alex Hung <alex.hung@amd.com>
Date: Tue Dec 17 14:03:50 2024 -0700
drm/amd/display: Remove unnecessary amdgpu_irq_get/put
commit 5009628d8509dbb90e1b88e01eda00430fa24b4b upstream.
[WHY & HOW]
commit 7fb363c57522 ("drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts")
lets drm_crtc_vblank_* to manage interrupts in amdgpu_dm_crtc_set_vblank,
and amdgpu_irq_get/put do not need to be called here. Part of that
patch got lost somehow, so fix it up.
Fixes: 7fb363c57522 ("drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3782305ce5807c18fbf092124b9e8303cf1723ae)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kun Liu <Kun.Liu2@amd.com>
Date: Fri Dec 27 11:43:22 2024 +0800
drm/amd/pm: fix BUG: scheduling while atomic
commit 2a238b09bfd04e8155a7a323364bce1c38b28c0f upstream.
atomic scheduling will be triggered in interrupt handler for
AC/DC mode switch as following backtrace.
Call Trace:
<IRQ>
dump_stack_lvl
__schedule_bug
__schedule
schedule
schedule_preempt_disabled
__mutex_lock
smu_cmn_send_smc_msg_with_param
smu_v13_0_irq_process
amdgpu_irq_dispatch
amdgpu_ih_process
amdgpu_irq_handler
__handle_irq_event_percpu
handle_irq_event
handle_edge_irq
__common_interrupt
common_interrupt
</IRQ>
<TASK>
asm_common_interrupt
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Kun Liu <Kun.Liu2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 03cc84b102d1a832e8dfc59344346dedcebcdf42)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Date: Tue Dec 10 12:50:08 2024 +0530
drm/amdgpu: Add a lock when accessing the buddy trim function
commit 75c8b703e5bded1e33b08fb09b829e7c2c1ed50a upstream.
When running YouTube videos and Steam games simultaneously,
the tester found a system hang / race condition issue with
the multi-display configuration setting. Adding a lock to
the buddy allocator's trim function would be the solution.
<log snip>
[ 7197.250436] general protection fault, probably for non-canonical address 0xdead000000000108
[ 7197.250447] RIP: 0010:__alloc_range+0x8b/0x340 [amddrm_buddy]
[ 7197.250470] Call Trace:
[ 7197.250472] <TASK>
[ 7197.250475] ? show_regs+0x6d/0x80
[ 7197.250481] ? die_addr+0x37/0xa0
[ 7197.250483] ? exc_general_protection+0x1db/0x480
[ 7197.250488] ? drm_suballoc_new+0x13c/0x93d [drm_suballoc_helper]
[ 7197.250493] ? asm_exc_general_protection+0x27/0x30
[ 7197.250498] ? __alloc_range+0x8b/0x340 [amddrm_buddy]
[ 7197.250501] ? __alloc_range+0x109/0x340 [amddrm_buddy]
[ 7197.250506] amddrm_buddy_block_trim+0x1b5/0x260 [amddrm_buddy]
[ 7197.250511] amdgpu_vram_mgr_new+0x4f5/0x590 [amdgpu]
[ 7197.250682] amdttm_resource_alloc+0x46/0xb0 [amdttm]
[ 7197.250689] ttm_bo_alloc_resource+0xe4/0x370 [amdttm]
[ 7197.250696] amdttm_bo_validate+0x9d/0x180 [amdttm]
[ 7197.250701] amdgpu_bo_pin+0x15a/0x2f0 [amdgpu]
[ 7197.250831] amdgpu_dm_plane_helper_prepare_fb+0xb2/0x360 [amdgpu]
[ 7197.251025] ? try_wait_for_completion+0x59/0x70
[ 7197.251030] drm_atomic_helper_prepare_planes.part.0+0x2f/0x1e0
[ 7197.251035] drm_atomic_helper_prepare_planes+0x5d/0x70
[ 7197.251037] drm_atomic_helper_commit+0x84/0x160
[ 7197.251040] drm_atomic_nonblocking_commit+0x59/0x70
[ 7197.251043] drm_mode_atomic_ioctl+0x720/0x850
[ 7197.251047] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
[ 7197.251049] drm_ioctl_kernel+0xb9/0x120
[ 7197.251053] ? srso_alias_return_thunk+0x5/0xfbef5
[ 7197.251056] drm_ioctl+0x2d4/0x550
[ 7197.251058] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
[ 7197.251063] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu]
[ 7197.251186] __x64_sys_ioctl+0xa0/0xf0
[ 7197.251190] x64_sys_call+0x143b/0x25c0
[ 7197.251193] do_syscall_64+0x7f/0x180
[ 7197.251197] ? srso_alias_return_thunk+0x5/0xfbef5
[ 7197.251199] ? amdgpu_display_user_framebuffer_create+0x215/0x320 [amdgpu]
[ 7197.251329] ? drm_internal_framebuffer_create+0xb7/0x1a0
[ 7197.251332] ? srso_alias_return_thunk+0x5/0xfbef5
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Fixes: 4a5ad08f5377 ("drm/amdgpu: Add address alignment support to DCC buffers")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3318ba94e56b9183d0304577c74b33b6b01ce516)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jesse.zhang@amd.com <Jesse.zhang@amd.com>
Date: Wed Dec 18 18:23:52 2024 +0800
drm/amdkfd: fixed page fault when enable MES shader debugger
commit 9738609449c3e44d1afb73eecab4763362b57930 upstream.
Initialize the process context address before setting the shader debugger.
[ 260.781212] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0)
[ 260.781236] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 260.781255] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A40
[ 260.781270] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[ 260.781284] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 260.781296] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 260.781308] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4
[ 260.781320] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 260.781332] amdgpu 0000:03:00.0: amdgpu: RW: 0x1
[ 260.782017] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0)
[ 260.782039] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 260.782058] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A41
[ 260.782073] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[ 260.782087] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 260.782098] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 260.782110] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4
[ 260.782122] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 260.782137] amdgpu 0000:03:00.0: amdgpu: RW: 0x1
[ 260.782155] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0)
[ 260.782166] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
Fixes: 438b39ac74e2 ("drm/amdkfd: pause autosuspend when creating pdd")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3849
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5b231f5bc9ff02ec5737f2ec95cdf15ac95088e9)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zhu Lingshan <lingshan.zhu@amd.com>
Date: Wed Dec 11 11:51:13 2024 +0800
drm/amdkfd: wq_release signals dma_fence only when available
commit a993d319aebb7cce8a10c6e685344b7c2ad5c4c2 upstream.
kfd_process_wq_release() signals eviction fence by
dma_fence_signal() which wanrs if dma_fence
is NULL.
kfd_process->ef is initialized by kfd_process_device_init_vm()
through ioctl. That means the fence is NULL for a new
created kfd_process, and close a kfd_process right
after open it will trigger the warning.
This commit conditionally signals the eviction fence
in kfd_process_wq_release() only when it is available.
[ 503.660882] WARNING: CPU: 0 PID: 9 at drivers/dma-buf/dma-fence.c:467 dma_fence_signal+0x74/0xa0
[ 503.782940] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu]
[ 503.789640] RIP: 0010:dma_fence_signal+0x74/0xa0
[ 503.877620] Call Trace:
[ 503.880066] <TASK>
[ 503.882168] ? __warn+0xcd/0x260
[ 503.885407] ? dma_fence_signal+0x74/0xa0
[ 503.889416] ? report_bug+0x288/0x2d0
[ 503.893089] ? handle_bug+0x53/0xa0
[ 503.896587] ? exc_invalid_op+0x14/0x50
[ 503.900424] ? asm_exc_invalid_op+0x16/0x20
[ 503.904616] ? dma_fence_signal+0x74/0xa0
[ 503.908626] kfd_process_wq_release+0x6b/0x370 [amdgpu]
[ 503.914081] process_one_work+0x654/0x10a0
[ 503.918186] worker_thread+0x6c3/0xe70
[ 503.921943] ? srso_alias_return_thunk+0x5/0xfbef5
[ 503.926735] ? srso_alias_return_thunk+0x5/0xfbef5
[ 503.931527] ? __kthread_parkme+0x82/0x140
[ 503.935631] ? __pfx_worker_thread+0x10/0x10
[ 503.939904] kthread+0x2a8/0x380
[ 503.943132] ? __pfx_kthread+0x10/0x10
[ 503.946882] ret_from_fork+0x2d/0x70
[ 503.950458] ? __pfx_kthread+0x10/0x10
[ 503.954210] ret_from_fork_asm+0x1a/0x30
[ 503.958142] </TASK>
[ 503.960328] ---[ end trace 0000000000000000 ]---
Fixes: 967d226eaae8 ("dma-buf: add WARN_ON() illegal dma-fence signaling")
Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2774ef7625adb5fb9e9265c26a59dca7b8fd171e)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Liankun Yang <liankun.yang@mediatek.com>
Date: Wed Dec 18 19:34:07 2024 +0800
drm/mediatek: Add return value check when reading DPCD
[ Upstream commit 522908140645865dc3e2fac70fd3b28834dfa7be ]
Check the return value of drm_dp_dpcd_readb() to confirm that
AUX communication is successful. To simplify the code, replace
drm_dp_dpcd_readb() and DP_GET_SINK_COUNT() with drm_dp_read_sink_count().
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
Signed-off-by: Liankun Yang <liankun.yang@mediatek.com>
Reviewed-by: Guillaume Ranquet <granquet@baylibre.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241218113448.2992-1-liankun.yang@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jason-JH.Lin <jason-jh.lin@mediatek.com>
Date: Mon Nov 18 10:51:26 2024 +0800
drm/mediatek: Add support for 180-degree rotation in the display driver
[ Upstream commit 5c9d7e79ba154e8e1f0bfdeb7b495f454c1a3eba ]
mediatek-drm driver reported the capability of 180-degree rotation by
adding `DRM_MODE_ROTATE_180` to the plane property, as flip-x combined
with flip-y equals a 180-degree rotation. However, we did not handle
the rotation property in the driver and lead to rotation issues.
Fixes: 74608d8feefd ("drm/mediatek: Add DRM_MODE_ROTATE_0 to rotation property")
Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241118025126.30808-1-jason-jh.lin@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Liankun Yang <liankun.yang@mediatek.com>
Date: Fri Oct 25 16:28:28 2024 +0800
drm/mediatek: Fix mode valid issue for dp
[ Upstream commit 0d68b55887cedc7487036ed34cb4c2097c4228f1 ]
Fix dp mode valid issue to avoid abnormal display of limit state.
After DP passes link training, it can express the lane count of the
current link status is good. Calculate the maximum bandwidth supported
by DP using the current lane count.
The color format will select the best one based on the bandwidth
requirements of the current timing mode. If the current timing mode
uses RGB and meets the DP link bandwidth requirements, RGB will be used.
If the timing mode uses RGB but does not meet the DP link bandwidthi
requirements, it will continue to check whether YUV422 meets
the DP link bandwidth.
FEC overhead is approximately 2.4% from DP 1.4a spec 2.2.1.4.2.
The down-spread amplitude shall either be disabled (0.0%) or up
to 0.5% from 1.4a 3.5.2.6. Add up to approximately 3% total overhead.
Because rate is already divided by 10,
mode->clock does not need to be multiplied by 10.
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
Signed-off-by: Liankun Yang <liankun.yang@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-3-liankun.yang@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Liankun Yang <liankun.yang@mediatek.com>
Date: Fri Oct 25 16:28:27 2024 +0800
drm/mediatek: Fix YCbCr422 color format issue for DP
[ Upstream commit ef24fbd8f12015ff827973fffefed3902ffd61cc ]
Setting up misc0 for Pixel Encoding Format.
According to the definition of YCbCr in spec 1.2a Table 2-96,
0x1 << 1 should be written to the register.
Use switch case to distinguish RGB, YCbCr422,
and unsupported color formats.
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
Signed-off-by: Liankun Yang <liankun.yang@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-2-liankun.yang@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jason-JH.Lin <jason-jh.lin@mediatek.com>
Date: Wed Dec 11 11:47:16 2024 +0800
drm/mediatek: Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb()
[ Upstream commit da03801ad08f2488c01e684509cd89e1aa5d17ec ]
mtk_crtc_finish_page_flip() is used to notify userspace that a
page flip has been completed, allowing userspace to free the frame
buffer of the last frame and commit the next frame.
In MediaTek's hardware design for configuring display hardware by using
GCE, `DRM_EVENT_FLIP_COMPLETE` should be notified to userspace after
GCE has finished configuring all display hardware settings for each
atomic_commit().
Currently, mtk_crtc_finish_page_flip() cannot guarantee that GCE has
configured all the display hardware settings of the last frame.
Therefore, to increase the accuracy of the timing for notifying
`DRM_EVENT_FLIP_COMPLETE` to userspace, mtk_crtc_finish_page_flip()
should be moved to ddp_cmdq_cb().
Fixes: 7f82d9c43879 ("drm/mediatek: Clear pending flag when cmdq packet is done")
Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241211034716.29241-1-jason-jh.lin@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Date: Thu Dec 19 12:27:33 2024 +0100
drm/mediatek: mtk_dsi: Add registers to pdata to fix MT8186/MT8188
[ Upstream commit 76aed5e00ff2625e0ec4b40c75f3514bdb27fae4 ]
Registers DSI_VM_CMD and DSI_SHADOW_DEBUG start at different
addresses in both MT8186 and MT8188 compared to the older IPs.
Add two members in struct mtk_dsi_driver_data to specify the
offsets for these two registers on a per-SoC basis, then do
specify those in all of the currently present SoC driver data.
This fixes writes to the Video Mode Command Packet Control
register, fixing enablement of command packet transmission
(VM_CMD_EN) and allowance of this transmission during the
VFP period (TS_VFP_EN) on both MT8186 and MT8188.
Fixes: 03d7adc41027 ("drm/mediatek: Add mt8186 dsi compatible to mtk_dsi.c")
Fixes: 814d5341f314 ("drm/mediatek: Add mt8188 dsi compatible to mtk_dsi.c")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241219112733.47907-1-angelogioacchino.delregno@collabora.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Golle <daniel@makrotopia.org>
Date: Tue Dec 17 01:18:01 2024 +0000
drm/mediatek: Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported
[ Upstream commit f8d9b91739e1fb436447c437a346a36deb676a36 ]
Touching DISP_REG_OVL_PITCH_MSB leads to video overlay on MT2701, MT7623N
and probably other older SoCs being broken.
Move setting up AFBC layer configuration into a separate function only
being called on hardware which actually supports AFBC which restores the
behavior as it was before commit c410fa9b07c3 ("drm/mediatek: Add AFBC
support to Mediatek DRM driver") on non-AFBC hardware.
Fixes: c410fa9b07c3 ("drm/mediatek: Add AFBC support to Mediatek DRM driver")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/c7fbd3c3e633c0b7dd6d1cd78ccbdded31e1ca0f.1734397800.git.daniel@makrotopia.org/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guoqing Jiang <guoqing.jiang@canonical.com>
Date: Mon Dec 23 10:32:27 2024 +0800
drm/mediatek: Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err
[ Upstream commit 36684e9d88a2e2401ae26715a2e217cb4295cea7 ]
The pointer need to be set to NULL, otherwise KASAN complains about
use-after-free. Because in mtk_drm_bind, all private's drm are set
as follows.
private->all_drm_private[i]->drm = drm;
And drm will be released by drm_dev_put in case mtk_drm_kms_init returns
failure. However, the shutdown path still accesses the previous allocated
memory in drm_atomic_helper_shutdown.
[ 84.874820] watchdog: watchdog0: watchdog did not stop!
[ 86.512054] ==================================================================
[ 86.513162] BUG: KASAN: use-after-free in drm_atomic_helper_shutdown+0x33c/0x378
[ 86.514258] Read of size 8 at addr ffff0000d46fc068 by task shutdown/1
[ 86.515213]
[ 86.515455] CPU: 1 UID: 0 PID: 1 Comm: shutdown Not tainted 6.13.0-rc1-mtk+gfa1a78e5d24b-dirty #55
[ 86.516752] Hardware name: Unknown Product/Unknown Product, BIOS 2022.10 10/01/2022
[ 86.517960] Call trace:
[ 86.518333] show_stack+0x20/0x38 (C)
[ 86.518891] dump_stack_lvl+0x90/0xd0
[ 86.519443] print_report+0xf8/0x5b0
[ 86.519985] kasan_report+0xb4/0x100
[ 86.520526] __asan_report_load8_noabort+0x20/0x30
[ 86.521240] drm_atomic_helper_shutdown+0x33c/0x378
[ 86.521966] mtk_drm_shutdown+0x54/0x80
[ 86.522546] platform_shutdown+0x64/0x90
[ 86.523137] device_shutdown+0x260/0x5b8
[ 86.523728] kernel_restart+0x78/0xf0
[ 86.524282] __do_sys_reboot+0x258/0x2f0
[ 86.524871] __arm64_sys_reboot+0x90/0xd8
[ 86.525473] invoke_syscall+0x74/0x268
[ 86.526041] el0_svc_common.constprop.0+0xb0/0x240
[ 86.526751] do_el0_svc+0x4c/0x70
[ 86.527251] el0_svc+0x4c/0xc0
[ 86.527719] el0t_64_sync_handler+0x144/0x168
[ 86.528367] el0t_64_sync+0x198/0x1a0
[ 86.528920]
[ 86.529157] The buggy address belongs to the physical page:
[ 86.529972] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff0000d46fd4d0 pfn:0x1146fc
[ 86.531319] flags: 0xbfffc0000000000(node=0|zone=2|lastcpupid=0xffff)
[ 86.532267] raw: 0bfffc0000000000 0000000000000000 dead000000000122 0000000000000000
[ 86.533390] raw: ffff0000d46fd4d0 0000000000000000 00000000ffffffff 0000000000000000
[ 86.534511] page dumped because: kasan: bad access detected
[ 86.535323]
[ 86.535559] Memory state around the buggy address:
[ 86.536265] ffff0000d46fbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.537314] ffff0000d46fbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.538363] >ffff0000d46fc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.544733] ^
[ 86.551057] ffff0000d46fc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.557510] ffff0000d46fc100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.563928] ==================================================================
[ 86.571093] Disabling lock debugging due to kernel taint
[ 86.577642] Unable to handle kernel paging request at virtual address e0e9c0920000000b
[ 86.581834] KASAN: maybe wild-memory-access in range [0x0752049000000058-0x075204900000005f]
...
Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support")
Signed-off-by: Guoqing Jiang <guoqing.jiang@canonical.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241223023227.1258112-1-guoqing.jiang@canonical.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Wed Dec 18 09:58:31 2024 +0100
drm/mediatek: stop selecting foreign drivers
[ Upstream commit 924d66011f2401a4145e2e814842c5c4572e439f ]
The PHY portion of the mediatek hdmi driver was originally part of
the driver it self and later split out into drivers/phy, which a
'select' to keep the prior behavior.
However, this leads to build failures when the PHY driver cannot
be built:
WARNING: unmet direct dependencies detected for PHY_MTK_HDMI
Depends on [n]: (ARCH_MEDIATEK || COMPILE_TEST [=y]) && COMMON_CLK [=y] && OF [=y] && REGULATOR [=n]
Selected by [m]:
- DRM_MEDIATEK_HDMI [=m] && HAS_IOMEM [=y] && DRM [=m] && DRM_MEDIATEK [=m]
ERROR: modpost: "devm_regulator_register" [drivers/phy/mediatek/phy-mtk-hdmi-drv.ko] undefined!
ERROR: modpost: "rdev_get_drvdata" [drivers/phy/mediatek/phy-mtk-hdmi-drv.ko] undefined!
The best option here is to just not select the phy driver and leave that
up to the defconfig. Do the same for the other PHY and memory drivers
selected here as well for consistency.
Fixes: a481bf2f0ca4 ("drm/mediatek: Separate mtk_hdmi_phy to an independent module")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241218085837.2670434-1-arnd@kernel.org/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lucas De Marchi <lucas.demarchi@intel.com>
Date: Thu Jan 2 16:11:10 2025 -0800
drm/xe: Fix tlb invalidation when wedging
[ Upstream commit 9ab4981552930a9c45682d62424ba610edc3992d ]
If GuC fails to load, the driver wedges, but in the process it tries to
do stuff that may not be initialized yet. This moves the
xe_gt_tlb_invalidation_init() to be done earlier: as its own doc says,
it's a software-only initialization and should had been named with the
_early() suffix.
Move it to be called by xe_gt_init_early(), so the locks and seqno are
initialized, avoiding a NULL ptr deref when wedging:
xe 0000:03:00.0: [drm] *ERROR* GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
xe 0000:03:00.0: [drm] *ERROR* GT0: firmware signature verification failed
xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
...
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 9 UID: 0 PID: 3908 Comm: modprobe Tainted: G U W 6.13.0-rc4-xe+ #3
Tainted: [U]=USER, [W]=WARN
Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DDR5 UDIMM CRB, BIOS ADLSFWI1.R00.3275.A00.2207010640 07/01/2022
RIP: 0010:xe_gt_tlb_invalidation_reset+0x75/0x110 [xe]
This can be easily triggered by poking the GuC binary to force a
signature failure. There will still be an extra message,
xe 0000:03:00.0: [drm] *ERROR* GT0: GuC mmio request 0x4100: no reply 0x4100
but that's better than a NULL ptr deref.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3956
Fixes: c9474b726b93 ("drm/xe: Wedge the entire device")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250103001111.331684-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 5001ef3af8f2c972d6fd9c5221a8457556f8bea6)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Mon Jan 6 10:02:10 2025 -0800
eth: gve: use appropriate helper to set xdp_features
[ Upstream commit db78475ba0d3c66d430f7ded2388cc041078a542 ]
Commit f85949f98206 ("xdp: add xdp_set_features_flag utility routine")
added routines to inform the core about XDP flag changes.
GVE support was added around the same time and missed using them.
GVE only changes the flags on error recover or resume.
Presumably the flags may change during resume if VM migrated.
User would not get the notification and upper devices would
not get a chance to recalculate their flags.
Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format")
Reviewed-By: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250106180210.1861784-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date: Mon Dec 16 13:39:42 2024 +0800
exfat: fix the infinite loop in __exfat_free_cluster()
[ Upstream commit a5324b3a488d883aa2d42f72260054e87d0940a0 ]
In __exfat_free_cluster(), the cluster chain is traversed until the
EOF cluster. If the cluster chain includes a loop due to file system
corruption, the EOF cluster cannot be traversed, resulting in an
infinite loop.
This commit uses the total number of clusters to prevent this infinite
loop.
Reported-by: syzbot+1de5a37cb85a2d536330@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1de5a37cb85a2d536330
Tested-by: syzbot+1de5a37cb85a2d536330@syzkaller.appspotmail.com
Fixes: 31023864e67a ("exfat: add fat entry operations")
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date: Fri Dec 13 13:08:37 2024 +0800
exfat: fix the infinite loop in exfat_readdir()
[ Upstream commit fee873761bd978d077d8c55334b4966ac4cb7b59 ]
If the file system is corrupted so that a cluster is linked to
itself in the cluster chain, and there is an unused directory
entry in the cluster, 'dentry' will not be incremented, causing
condition 'dentry < max_dentries' unable to prevent an infinite
loop.
This infinite loop causes s_lock not to be released, and other
tasks will hang, such as exfat_sync_fs().
This commit stops traversing the cluster chain when there is unused
directory entry in the cluster to avoid this infinite loop.
Reported-by: syzbot+205c2644abdff9d3f9fc@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=205c2644abdff9d3f9fc
Tested-by: syzbot+205c2644abdff9d3f9fc@syzkaller.appspotmail.com
Fixes: ca06197382bd ("exfat: add directory operations")
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuezhang Mo <Yuezhang.Mo@sony.com>
Date: Thu Dec 12 16:29:23 2024 +0800
exfat: fix the new buffer was not zeroed before writing
[ Upstream commit 98e2fb26d1a9eafe79f46d15d54e68e014d81d8c ]
Before writing, if a buffer_head marked as new, its data must
be zeroed, otherwise uninitialized data in the page cache will
be written.
So this commit uses folio_zero_new_buffers() to zero the new
buffers before ->write_end().
Fixes: 6630ea49103c ("exfat: move extend valid_size into ->page_mkwrite()")
Reported-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=91ae49e1c1a2634d20c0
Tested-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: guanjing <guanjing@cmss.chinamobile.com>
Date: Fri Dec 20 09:33:35 2024 +0100
firewall: remove misplaced semicolon from stm32_firewall_get_firewall
[ Upstream commit 155c5bf26f983e9988333eeb0ef217138304d13b ]
Remove misplaced colon in stm32_firewall_get_firewall()
which results in a syntax error when the code is compiled
without CONFIG_STM32_FIREWALL.
Fixes: 5c9668cfc6d7 ("firewall: introduce stm32_firewall framework")
Signed-off-by: guanjing <guanjing@cmss.chinamobile.com>
Reviewed-by: Gatien Chevallier <gatien.chevallier@foss.st.com>
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pankaj Raghav <p.raghav@samsung.com>
Date: Thu Sep 26 16:01:21 2024 +0200
fs/writeback: convert wbc_account_cgroup_owner to take a folio
[ Upstream commit 30dac24e14b52e1787572d1d4e06eeabe8a63630 ]
Most of the callers of wbc_account_cgroup_owner() are converting a folio
to page before calling the function. wbc_account_cgroup_owner() is
converting the page back to a folio to call mem_cgroup_css_from_folio().
Convert wbc_account_cgroup_owner() to take a folio instead of a page,
and convert all callers to pass a folio directly except f2fs.
Convert the page to folio for all the callers from f2fs as they were the
only callers calling wbc_account_cgroup_owner() with a page. As f2fs is
already in the process of converting to folios, these call sites might
also soon be calling wbc_account_cgroup_owner() with a folio directly in
the future.
No functional changes. Only compile tested.
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Link: https://lore.kernel.org/r/20240926140121.203821-1-kernel@pankajraghav.com
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 51d20d1dacbe ("iomap: fix zero padding data issue in concurrent append writes")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Miklos Szeredi <mszeredi@redhat.com>
Date: Wed Dec 11 13:11:17 2024 +0100
fs: fix is_mnt_ns_file()
commit aa21f333c86c8a09d39189de87abb0153d338190 upstream.
Commit 1fa08aece425 ("nsfs: convert to path_from_stashed() helper") reused
nsfs dentry's d_fsdata, which no longer contains a pointer to
proc_ns_operations.
Fix the remaining use in is_mnt_ns_file().
Fixes: 1fa08aece425 ("nsfs: convert to path_from_stashed() helper")
Cc: stable@vger.kernel.org # v6.9
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://lore.kernel.org/r/20241211121118.85268-1-mszeredi@redhat.com
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Christian Brauner <brauner@kernel.org>
Date: Sun Dec 15 21:17:05 2024 +0100
fs: kill MNT_ONRB
commit 344bac8f0d73fe970cd9f5b2f132906317d29e8b upstream.
Move mnt->mnt_node into the union with mnt->mnt_rcu and mnt->mnt_llist
instead of keeping it with mnt->mnt_list. This allows us to use
RB_CLEAR_NODE(&mnt->mnt_node) in umount_tree() as well as
list_empty(&mnt->mnt_node). That in turn allows us to remove MNT_ONRB.
This also fixes the bug reported in [1] where seemingly MNT_ONRB wasn't
set in @mnt->mnt_flags even though the mount was present in the mount
rbtree of the mount namespace.
The root cause is the following race. When a btrfs subvolume is mounted
a temporary mount is created:
btrfs_get_tree_subvol()
{
mnt = fc_mount()
// Register the newly allocated mount with sb->mounts:
lock_mount_hash();
list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts);
unlock_mount_hash();
}
and registered on sb->s_mounts. Later it is added to an anonymous mount
namespace via mount_subvol():
-> mount_subvol()
-> mount_subtree()
-> alloc_mnt_ns()
mnt_add_to_ns()
vfs_path_lookup()
put_mnt_ns()
The mnt_add_to_ns() call raises MNT_ONRB in @mnt->mnt_flags. If someone
concurrently does a ro remount:
reconfigure_super()
-> sb_prepare_remount_readonly()
{
list_for_each_entry(mnt, &sb->s_mounts, mnt_instance) {
}
all mounts registered in sb->s_mounts are visited and first
MNT_WRITE_HOLD is raised, then MNT_READONLY is raised, and finally
MNT_WRITE_HOLD is removed again.
The flag modification for MNT_WRITE_HOLD/MNT_READONLY and MNT_ONRB race
so MNT_ONRB might be lost.
Fixes: 2eea9ce4310d ("mounts: keep list of mounts in an rbtree")
Cc: <stable@kernel.org> # v6.8+
Link: https://lore.kernel.org/r/20241215-vfs-6-14-mount-work-v1-1-fd55922c4af8@kernel.org
Link: https://lore.kernel.org/r/ec6784ed-8722-4695-980a-4400d4e7bd1a@gmx.com [1]
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Amir Goldstein <amir73il@gmail.com>
Date: Thu Dec 19 12:53:01 2024 +0100
fs: relax assertions on failure to encode file handles
commit 974e3fe0ac61de85015bbe5a4990cf4127b304b2 upstream.
Encoding file handles is usually performed by a filesystem >encode_fh()
method that may fail for various reasons.
The legacy users of exportfs_encode_fh(), namely, nfsd and
name_to_handle_at(2) syscall are ready to cope with the possibility
of failure to encode a file handle.
There are a few other users of exportfs_encode_{fh,fid}() that
currently have a WARN_ON() assertion when ->encode_fh() fails.
Relax those assertions because they are wrong.
The second linked bug report states commit 16aac5ad1fa9 ("ovl: support
encoding non-decodable file handles") in v6.6 as the regressing commit,
but this is not accurate.
The aforementioned commit only increases the chances of the assertion
and allows triggering the assertion with the reproducer using overlayfs,
inotify and drop_caches.
Triggering this assertion was always possible with other filesystems and
other reasons of ->encode_fh() failures and more particularly, it was
also possible with the exact same reproducer using overlayfs that is
mounted with options index=on,nfs_export=on also on kernels < v6.6.
Therefore, I am not listing the aforementioned commit as a Fixes commit.
Backport hint: this patch will have a trivial conflict applying to
v6.6.y, and other trivial conflicts applying to stable kernels < v6.6.
Reported-by: syzbot+ec07f6f5ce62b858579f@syzkaller.appspotmail.com
Tested-by: syzbot+ec07f6f5ce62b858579f@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-unionfs/671fd40c.050a0220.4735a.024f.GAE@google.com/
Reported-by: Dmitry Safonov <dima@arista.com>
Closes: https://lore.kernel.org/linux-fsdevel/CAGrbwDTLt6drB9eaUagnQVgdPBmhLfqqxAf3F+Juqy_o6oP8uw@mail.gmail.com/
Cc: stable@vger.kernel.org
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20241219115301.465396-1-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Amir Goldstein <amir73il@gmail.com>
Date: Wed Jan 1 14:00:37 2025 +0100
fuse: respect FOPEN_KEEP_CACHE on opendir
[ Upstream commit 03f275adb8fbd7b4ebe96a1ad5044d8e602692dc ]
The re-factoring of fuse_dir_open() missed the need to invalidate
directory inode page cache with open flag FOPEN_KEEP_CACHE.
Fixes: 7de64d521bf92 ("fuse: break up fuse_open_common()")
Reported-by: Prince Kumar <princer@google.com>
Closes: https://lore.kernel.org/linux-fsdevel/CAEW=TRr7CYb4LtsvQPLj-zx5Y+EYBmGfM24SuzwyDoGVNoKm7w@mail.gmail.com/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250101130037.96680-1-amir73il@gmail.com
Reviewed-by: Bernd Schubert <bernd.schubert@fastmail.fm>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Binbin Zhou <zhoubinbin@loongson.cn>
Date: Tue Jan 7 18:38:56 2025 +0800
gpio: loongson: Fix Loongson-2K2000 ACPI GPIO register offset
commit e59f4c97172de0c302894cfd5616161c1f0c4d85 upstream.
Since commit 3feb70a61740 ("gpio: loongson: add more gpio chip
support"), the Loongson-2K2000 GPIO is supported.
However, according to the firmware development specification, the
Loongson-2K2000 ACPI GPIO register offsets in the driver do not match
the register base addresses in the firmware, resulting in the registers
not being accessed properly.
Now, we fix it to ensure the GPIO function works properly.
Cc: stable@vger.kernel.org
Cc: Yinbo Zhu <zhuyinbo@loongson.cn>
Fixes: 3feb70a61740 ("gpio: loongson: add more gpio chip support")
Co-developed-by: Hongliang Wang <wanghongliang@loongson.cn>
Signed-off-by: Hongliang Wang <wanghongliang@loongson.cn>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Link: https://lore.kernel.org/r/20250107103856.1037222-1-zhoubinbin@loongson.cn
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Koichiro Den <koichiro.den@canonical.com>
Date: Fri Jan 3 23:18:27 2025 +0900
gpio: virtuser: fix handling of multiple conn_ids in lookup table
[ Upstream commit 656cc2e892f128b03ea9ef19bd11d70f71d5472b ]
Creating a virtuser device via configfs with multiple conn_ids fails due
to incorrect indexing of lookup entries. Correct the indexing logic to
ensure proper functionality when multiple gpio_virtuser_lookup are
created.
Fixes: 91581c4b3f29 ("gpio: virtuser: new virtual testing driver for the GPIO API")
Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
Link: https://lore.kernel.org/r/20250103141829.430662-3-koichiro.den@canonical.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Koichiro Den <koichiro.den@canonical.com>
Date: Fri Jan 3 23:18:26 2025 +0900
gpio: virtuser: fix missing lookup table cleanups
[ Upstream commit a619cba8c69c434258ff4101d463322cd63e1bdc ]
When a virtuser device is created via configfs and the probe fails due
to an incorrect lookup table, the table is not removed. This prevents
subsequent probe attempts from succeeding, even if the issue is
corrected, unless the device is released. Additionally, cleanup is also
needed in the less likely case of platform_device_register_full()
failure.
Besides, a consistent memory leak in lookup_table->dev_id was spotted
using kmemleak by toggling the live state between 0 and 1 with a correct
lookup table.
Introduce gpio_virtuser_remove_lookup_table() as the counterpart to the
existing gpio_virtuser_make_lookup_table() and call it from all
necessary points to ensure proper cleanup.
Fixes: 91581c4b3f29 ("gpio: virtuser: new virtual testing driver for the GPIO API")
Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
Link: https://lore.kernel.org/r/20250103141829.430662-2-koichiro.den@canonical.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniil Stas <daniil.stas@posteo.net>
Date: Sun Jan 5 21:36:18 2025 +0000
hwmon: (drivetemp) Fix driver producing garbage data when SCSI errors occur
[ Upstream commit 82163d63ae7a4c36142cd252388737205bb7e4b9 ]
scsi_execute_cmd() function can return both negative (linux codes) and
positive (scsi_cmnd result field) error codes.
Currently the driver just passes error codes of scsi_execute_cmd() to
hwmon core, which is incorrect because hwmon only checks for negative
error codes. This leads to hwmon reporting uninitialized data to
userspace in case of SCSI errors (for example if the disk drive was
disconnected).
This patch checks scsi_execute_cmd() output and returns -EIO if it's
error code is positive.
Fixes: 5b46903d8bf37 ("hwmon: Driver for disk and solid state drives with temperature sensors")
Signed-off-by: Daniil Stas <daniil.stas@posteo.net>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-ide@vger.kernel.org
Cc: linux-hwmon@vger.kernel.org
Link: https://lore.kernel.org/r/20250105213618.531691-1-daniil.stas@posteo.net
[groeck: Avoid inline variable declaration for portability]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Przemyslaw Korba <przemyslaw.korba@intel.com>
Date: Wed Dec 4 14:22:18 2024 +0100
ice: fix incorrect PHY settings for 100 GB/s
[ Upstream commit 6c5b989116083a98f45aada548ff54e7a83a9c2d ]
ptp4l application reports too high offset when ran on E823 device
with a 100GB/s link. Those values cannot go under 100ns, like in a
working case when using 100 GB/s cable.
This is due to incorrect frequency settings on the PHY clocks for
100 GB/s speed. Changes are introduced to align with the internal
hardware documentation, and correctly initialize frequency in PHY
clocks with the frequency values that are in our HW spec.
To reproduce the issue run ptp4l as a Time Receiver on E823 device,
and observe the offset, which will never approach values seen
in the PTP working case.
Reproduction output:
ptp4l -i enp137s0f3 -m -2 -s -f /etc/ptp4l_8275.conf
ptp4l[5278.775]: master offset 12470 s2 freq +41288 path delay -3002
ptp4l[5278.837]: master offset 10525 s2 freq +39202 path delay -3002
ptp4l[5278.900]: master offset -24840 s2 freq -20130 path delay -3002
ptp4l[5278.963]: master offset 10597 s2 freq +37908 path delay -3002
ptp4l[5279.025]: master offset 8883 s2 freq +36031 path delay -3002
ptp4l[5279.088]: master offset 7267 s2 freq +34151 path delay -3002
ptp4l[5279.150]: master offset 5771 s2 freq +32316 path delay -3002
ptp4l[5279.213]: master offset 4388 s2 freq +30526 path delay -3002
ptp4l[5279.275]: master offset -30434 s2 freq -28485 path delay -3002
ptp4l[5279.338]: master offset -28041 s2 freq -27412 path delay -3002
ptp4l[5279.400]: master offset 7870 s2 freq +31118 path delay -3002
Fixes: 3a7496234d17 ("ice: implement basic E822 PTP support")
Reviewed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Date: Wed Nov 20 08:51:12 2024 +0100
ice: fix max values for dpll pin phase adjust
[ Upstream commit 65104599b3a8ed42d85b3f8f27be650afe1f3a7e ]
Mask admin command returned max phase adjust value for both input and
output pins. Only 31 bits are relevant, last released data sheet wrongly
points that 32 bits are valid - see [1] 3.2.6.4.1 Get CCU Capabilities
Command for reference. Fix of the datasheet itself is in progress.
Fix the min/max assignment logic, previously the value was wrongly
considered as negative value due to most significant bit being set.
Example of previous broken behavior:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
--do pin-get --json '{"id":1}'| grep phase-adjust
'phase-adjust': 0,
'phase-adjust-max': 16723,
'phase-adjust-min': -16723,
Correct behavior with the fix:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
--do pin-get --json '{"id":1}'| grep phase-adjust
'phase-adjust': 0,
'phase-adjust-max': 2147466925,
'phase-adjust-min': -2147466925,
[1] https://cdrdv2.intel.com/v1/dl/getContent/613875?explicitVersion=true
Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Keisuke Nishimura <keisuke.nishimura@inria.fr>
Date: Tue Oct 29 19:27:12 2024 +0100
ieee802154: ca8210: Add missing check for kfifo_alloc() in ca8210_probe()
[ Upstream commit 2c87309ea741341c6722efdf1fb3f50dd427c823 ]
ca8210_test_interface_init() returns the result of kfifo_alloc(),
which can be non-zero in case of an error. The caller, ca8210_probe(),
should check the return value and do error-handling if it fails.
Fixes: ded845a781a5 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver")
Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/20241029182712.318271-1-keisuke.nishimura@inria.fr
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: En-Wei Wu <en-wei.wu@canonical.com>
Date: Wed Dec 18 10:37:42 2024 +0800
igc: return early when failing to read EECD register
[ Upstream commit bd2776e39c2a82ef4681d02678bb77b3d41e79be ]
When booting with a dock connected, the igc driver may get stuck for ~40
seconds if PCIe link is lost during initialization.
This happens because the driver access device after EECD register reads
return all F's, indicating failed reads. Consequently, hw->hw_addr is set
to NULL, which impacts subsequent rd32() reads. This leads to the driver
hanging in igc_get_hw_semaphore_i225(), as the invalid hw->hw_addr
prevents retrieving the expected value.
To address this, a validation check and a corresponding return value
catch is added for the EECD register read result. If all F's are
returned, indicating PCIe link loss, the driver will return -ENXIO
immediately. This avoids the 40-second hang and significantly improves
boot time when using a dock with an igc NIC.
Log before the patch:
[ 0.911913] igc 0000:70:00.0: enabling device (0000 -> 0002)
[ 0.912386] igc 0000:70:00.0: PTM enabled, 4ns granularity
[ 1.571098] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
[ 43.449095] igc_get_hw_semaphore_i225: igc 0000:70:00.0 (unnamed net_device) (uninitialized): Driver can't access device - SMBI bit is set.
[ 43.449186] igc 0000:70:00.0: probe with driver igc failed with error -13
[ 46.345701] igc 0000:70:00.0: enabling device (0000 -> 0002)
[ 46.345777] igc 0000:70:00.0: PTM enabled, 4ns granularity
Log after the patch:
[ 1.031000] igc 0000:70:00.0: enabling device (0000 -> 0002)
[ 1.032097] igc 0000:70:00.0: PTM enabled, 4ns granularity
[ 1.642291] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
[ 5.480490] igc 0000:70:00.0: enabling device (0000 -> 0002)
[ 5.480516] igc 0000:70:00.0: PTM enabled, 4ns granularity
Fixes: ab4056126813 ("igc: Add NVM support")
Cc: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Date: Mon Nov 4 11:19:04 2024 +0100
iio: adc: ad7124: Disable all channels at probe time
commit 4be339af334c283a1a1af3cb28e7e448a0aa8a7c upstream.
When during a measurement two channels are enabled, two measurements are
done that are reported sequencially in the DATA register. As the code
triggered by reading one of the sysfs properties expects that only one
channel is enabled it only reads the first data set which might or might
not belong to the intended channel.
To prevent this situation disable all channels during probe. This fixes
a problem in practise because the reset default for channel 0 is
enabled. So all measurements before the first measurement on channel 0
(which disables channel 0 at the end) might report wrong values.
Fixes: 7b8d045e497a ("iio: adc: ad7124: allow more than 8 channels")
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20241104101905.845737-2-u.kleine-koenig@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Lechner <dlechner@baylibre.com>
Date: Wed Nov 27 14:01:53 2024 -0600
iio: adc: ad7173: fix using shared static info struct
commit 36a44e05cd807a54e5ffad4b96d0d67f68ad8576 upstream.
Fix a possible race condition during driver probe in the ad7173 driver
due to using a shared static info struct. If more that one instance of
the driver is probed at the same time, some of the info could be
overwritten by the other instance, leading to incorrect operation.
To fix this, make the static info struct const so that it is read-only
and make a copy of the info struct for each instance of the driver that
can be modified.
Reported-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Fixes: 76a1e6a42802 ("iio: adc: ad7173: add AD7173 driver")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Tested-by: Guillaume Ranquet <granquet@baylibre.com>
Link: https://patch.msgid.link/20241127-iio-adc-ad7313-fix-non-const-info-struct-v2-1-b6d7022b7466@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date: Sat Dec 7 13:30:45 2024 +0900
iio: adc: at91: call input_free_device() on allocated iio_dev
commit de6a73bad1743e9e81ea5a24c178c67429ff510b upstream.
Current implementation of at91_ts_register() calls input_free_deivce()
on st->ts_input, however, the err label can be reached before the
allocated iio_dev is stored to st->ts_input. Thus call
input_free_device() on input instead of st->ts_input.
Fixes: 84882b060301 ("iio: adc: at91_adc: Add support for touchscreens without TSMR")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Link: https://patch.msgid.link/20241207043045.1255409-1-joe@pf.is.s.u-tokyo.ac.jp
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Nov 25 22:16:12 2024 +0100
iio: adc: rockchip_saradc: fix information leak in triggered buffer
commit 38724591364e1e3b278b4053f102b49ea06ee17c upstream.
The 'data' local struct is used to push data to user space from a
triggered buffer, but it does not set values for inactive channels, as
it only uses iio_for_each_active_channel() to assign new values.
Initialize the struct to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable@vger.kernel.org
Fixes: 4e130dc7b413 ("iio: adc: rockchip_saradc: Add support iio buffers")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Nov 25 22:16:10 2024 +0100
iio: adc: ti-ads1119: fix information leak in triggered buffer
commit 75f339d3ecd38cb1ce05357d647189d4a7f7ed08 upstream.
The 'scan' local struct is used to push data to user space from a
triggered buffer, but it has a hole between the sample (unsigned int)
and the timestamp. This hole is never initialized.
Initialize the struct to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable@vger.kernel.org
Fixes: a9306887eba4 ("iio: adc: ti-ads1119: Add driver")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-2-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Dec 2 20:18:44 2024 +0100
iio: adc: ti-ads1119: fix sample size in scan struct for triggered buffer
commit 54d394905c92b9ecc65c1f9b2692c8e10716d8e1 upstream.
This device returns signed, 16-bit samples as stated in its datasheet
(see 8.5.2 Data Format). That is in line with the scan_type definition
for the IIO_VOLTAGE channel, but 'unsigned int' is being used to read
and push the data to userspace.
Given that the size of that type depends on the architecture (at least
2 bytes to store values up to 65535, but its actual size is often 4
bytes), use the 's16' type to provide the same structure in all cases.
Fixes: a9306887eba4 ("iio: adc: ti-ads1119: Add driver")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Link: https://patch.msgid.link/20241202-ti-ads1119_s16_chan-v1-1-fafe3136dc90@gmail.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Fabio Estevam <festevam@gmail.com>
Date: Fri Nov 22 13:43:08 2024 -0300
iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep()
commit 2a8e34096ec70d73ebb6d9920688ea312700cbd9 upstream.
Using gpiod_set_value() to control the reset GPIO causes some verbose
warnings during boot when the reset GPIO is controlled by an I2C IO
expander.
As the caller can sleep, use the gpiod_set_value_cansleep() variant to
fix the issue.
Tested on a custom i.MX93 board with a ADS124S08 ADC.
Cc: stable@kernel.org
Fixes: e717f8c6dfec ("iio: adc: Add the TI ads124s08 ADC code")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20241122164308.390340-1-festevam@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Charles Han <hanchunchao@inspur.com>
Date: Mon Nov 18 17:02:08 2024 +0800
iio: adc: ti-ads1298: Add NULL check in ads1298_init
commit bcb394bb28e55312cace75362b8e489eb0e02a30 upstream.
devm_kasprintf() can return a NULL pointer on failure. A check on the
return value of such a call in ads1298_init() is missing. Add it.
Fixes: 00ef7708fa60 ("iio: adc: ti-ads1298: Add driver")
Signed-off-by: Charles Han <hanchunchao@inspur.com>
Link: https://patch.msgid.link/20241118090208.14586-1-hanchunchao@inspur.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Nov 25 22:16:16 2024 +0100
iio: adc: ti-ads8688: fix information leak in triggered buffer
commit 2a7377ccfd940cd6e9201756aff1e7852c266e69 upstream.
The 'buffer' local array is used to push data to user space from a
triggered buffer, but it does not set values for inactive channels, as
it only uses iio_for_each_active_channel() to assign new values.
Initialize the array to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable@vger.kernel.org
Fixes: 61fa5dfa5f52 ("iio: adc: ti-ads8688: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-8-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Nov 25 22:16:17 2024 +0100
iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer
commit 333be433ee908a53f283beb95585dfc14c8ffb46 upstream.
The 'data' array is allocated via kmalloc() and it is used to push data
to user space from a triggered buffer, but it does not set values for
inactive channels, as it only uses iio_for_each_active_channel()
to assign new values.
Use kzalloc for the memory allocation to avoid pushing uninitialized
information to userspace.
Cc: stable@vger.kernel.org
Fixes: 415f79244757 ("iio: Move IIO Dummy Driver out of staging")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-9-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Carlos Song <carlos.song@nxp.com>
Date: Sat Nov 16 10:29:45 2024 -0500
iio: gyro: fxas21002c: Fix missing data update in trigger handler
commit fa13ac6cdf9b6c358e7d77c29fb60145c7a87965 upstream.
The fxas21002c_trigger_handler() may fail to acquire sample data because
the runtime PM enters the autosuspend state and sensor can not return
sample data in standby mode..
Resume the sensor before reading the sample data into the buffer within the
trigger handler. After the data is read, place the sensor back into the
autosuspend state.
Fixes: a0701b6263ae ("iio: gyro: add core driver for fxas21002c")
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20241116152945.4006374-1-Frank.Li@nxp.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Date: Tue Nov 12 10:30:10 2024 +0100
iio: imu: inv_icm42600: fix spi burst write not supported
commit c0f866de4ce447bca3191b9cefac60c4b36a7922 upstream.
Burst write with SPI is not working for all icm42600 chips. It was
only used for setting user offsets with regmap_bulk_write.
Add specific SPI regmap config for using only single write with SPI.
Fixes: 9f9ff91b775b ("iio: imu: inv_icm42600: add SPI driver for inv_icm42600 driver")
Cc: stable@vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Link: https://patch.msgid.link/20241112-inv-icm42600-fix-spi-burst-write-not-supported-v2-1-97690dc03607@tdk.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Date: Wed Nov 13 21:25:45 2024 +0100
iio: imu: inv_icm42600: fix timestamps after suspend if sensor is on
commit 65a60a590142c54a3f3be11ff162db2d5b0e1e06 upstream.
Currently suspending while sensors are one will result in timestamping
continuing without gap at resume. It can work with monotonic clock but
not with other clocks. Fix that by resetting timestamping.
Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping")
Cc: stable@vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend-v1-1-dfc77c394173@tdk.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Nov 25 22:16:13 2024 +0100
iio: imu: kmx61: fix information leak in triggered buffer
commit 6ae053113f6a226a2303caa4936a4c37f3bfff7b upstream.
The 'buffer' local array is used to push data to user space from a
triggered buffer, but it does not set values for inactive channels, as
it only uses iio_for_each_active_channel() to assign new values.
Initialize the array to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable@vger.kernel.org
Fixes: c3a23ecc0901 ("iio: imu: kmx61: Add support for data ready triggers")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-5-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date: Wed Dec 4 20:13:42 2024 +0900
iio: inkern: call iio_device_put() only on mapped devices
commit 64f43895b4457532a3cc524ab250b7a30739a1b1 upstream.
In the error path of iio_channel_get_all(), iio_device_put() is called
on all IIO devices, which can cause a refcount imbalance. Fix this error
by calling iio_device_put() only on IIO devices whose refcounts were
previously incremented by iio_device_get().
Fixes: 314be14bb893 ("iio: Rename _st_ functions to loose the bit that meant the staging version.")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Link: https://patch.msgid.link/20241204111342.1246706-1-joe@pf.is.s.u-tokyo.ac.jp
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Nov 25 22:16:15 2024 +0100
iio: light: bh1745: fix information leak in triggered buffer
commit b62fbe3b8eedd3cf3c9ad0b7cb9f72c3f40815f0 upstream.
The 'scan' local struct is used to push data to user space from a
triggered buffer, but it does not set values for inactive channels, as
it only uses iio_for_each_active_channel() to assign new values.
Initialize the struct to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable@vger.kernel.org
Fixes: eab35358aae7 ("iio: light: ROHM BH1745 colour sensor")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-7-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Nov 25 22:16:14 2024 +0100
iio: light: vcnl4035: fix information leak in triggered buffer
commit 47b43e53c0a0edf5578d5d12f5fc71c019649279 upstream.
The 'buffer' local array is used to push data to userspace from a
triggered buffer, but it does not set an initial value for the single
data element, which is an u16 aligned to 8 bytes. That leaves at least
4 bytes uninitialized even after writing an integer value with
regmap_read().
Initialize the array to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable@vger.kernel.org
Fixes: ec90b52c07c0 ("iio: light: vcnl4035: Fix buffer alignment in iio_push_to_buffers_with_timestamp()")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-6-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date: Mon Nov 25 22:16:11 2024 +0100
iio: pressure: zpa2326: fix information leak in triggered buffer
commit 6007d10c5262f6f71479627c1216899ea7f09073 upstream.
The 'sample' local struct is used to push data to user space from a
triggered buffer, but it has a hole between the temperature and the
timestamp (u32 pressure, u16 temperature, GAP, u64 timestamp).
This hole is never initialized.
Initialize the struct to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable@vger.kernel.org
Fixes: 03b262f2bbf4 ("iio:pressure: initial zpa2326 barometer support")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-3-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jens Axboe <axboe@kernel.dk>
Date: Wed Jan 8 10:28:05 2025 -0700
io_uring/eventfd: ensure io_eventfd_signal() defers another RCU period
Commit c9a40292a44e78f71258b8522655bffaf5753bdb upstream.
io_eventfd_do_signal() is invoked from an RCU callback, but when
dropping the reference to the io_ev_fd, it calls io_eventfd_free()
directly if the refcount drops to zero. This isn't correct, as any
potential freeing of the io_ev_fd should be deferred another RCU grace
period.
Just call io_eventfd_put() rather than open-code the dec-and-test and
free, which will correctly defer it another RCU grace period.
Fixes: 21a091b970cd ("io_uring: signal registered eventfd to process deferred task work")
Reported-by: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Fri Jan 10 14:31:23 2025 +0000
io_uring/sqpoll: zero sqd->thread on tctx errors
commit 4b7cfa8b6c28a9fa22b86894166a1a34f6d630ba upstream.
Syzkeller reports:
BUG: KASAN: slab-use-after-free in thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341
Read of size 8 at addr ffff88803578c510 by task syz.2.3223/27552
Call Trace:
<TASK>
...
kasan_report+0x143/0x180 mm/kasan/report.c:602
thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341
thread_group_cputime_adjusted+0xa6/0x340 kernel/sched/cputime.c:639
getrusage+0x1000/0x1340 kernel/sys.c:1863
io_uring_show_fdinfo+0xdfe/0x1770 io_uring/fdinfo.c:197
seq_show+0x608/0x770 fs/proc/fd.c:68
...
That's due to sqd->task not being cleared properly in cases where
SQPOLL task tctx setup fails, which can essentially only happen with
fault injection to insert allocation errors.
Cc: stable@vger.kernel.org
Fixes: 1251d2025c3e1 ("io_uring/sqpoll: early exit thread if task_context wasn't allocated")
Reported-by: syzbot+3d92cfcfa84070b0a470@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/efc7ec7010784463b2e7466d7b5c02c2cb381635.1736519461.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Sat Jan 4 18:29:02 2025 +0000
io_uring/timeout: fix multishot updates
commit c83c846231db8b153bfcb44d552d373c34f78245 upstream.
After update only the first shot of a multishot timeout request adheres
to the new timeout value while all subsequent retries continue to use
the old value. Don't forget to update the timeout stored in struct
io_timeout_data.
Cc: stable@vger.kernel.org
Fixes: ea97f6c8558e8 ("io_uring: add support for multishot timeouts")
Reported-by: Christian Mazakas <christian.mazakas@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e6516c3304eb654ec234cfa65c88a9579861e597.1736015288.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Fri Jan 10 20:36:45 2025 +0000
io_uring: don't touch sqd->thread off tw add
commit bd2703b42decebdcddf76e277ba76b4c4a142d73 upstream.
With IORING_SETUP_SQPOLL all requests are created by the SQPOLL task,
which means that req->task should always match sqd->thread. Since
accesses to sqd->thread should be separately protected, use req->task
in io_req_normal_work_add() instead.
Note, in the eyes of io_req_normal_work_add(), the SQPOLL task struct
is always pinned and alive, and sqd->thread can either be the task or
NULL. It's only problematic if the compiler decides to reload the value
after the null check, which is not so likely.
Cc: stable@vger.kernel.org
Cc: Bui Quang Minh <minhquangbui99@gmail.com>
Reported-by: lizetao <lizetao1@huawei.com>
Fixes: 78f9b61bd8e54 ("io_uring: wake SQPOLL task when task_work is added to an empty queue")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1cbbe72cf32c45a8fee96026463024cd8564a7d7.1736541357.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Long Li <leo.lilong@huawei.com>
Date: Mon Dec 9 19:42:40 2024 +0800
iomap: fix zero padding data issue in concurrent append writes
[ Upstream commit 51d20d1dacbec589d459e11fc88fbca419f84a99 ]
During concurrent append writes to XFS filesystem, zero padding data
may appear in the file after power failure. This happens due to imprecise
disk size updates when handling write completion.
Consider this scenario with concurrent append writes same file:
Thread 1: Thread 2:
------------ -----------
write [A, A+B]
update inode size to A+B
submit I/O [A, A+BS]
write [A+B, A+B+C]
update inode size to A+B+C
<I/O completes, updates disk size to min(A+B+C, A+BS)>
<power failure>
After reboot:
1) with A+B+C < A+BS, the file has zero padding in range [A+B, A+B+C]
|< Block Size (BS) >|
|DDDDDDDDDDDDDDDD0000000000000000|
^ ^ ^
A A+B A+B+C
(EOF)
2) with A+B+C > A+BS, the file has zero padding in range [A+B, A+BS]
|< Block Size (BS) >|< Block Size (BS) >|
|DDDDDDDDDDDDDDDD0000000000000000|00000000000000000000000000000000|
^ ^ ^ ^
A A+B A+BS A+B+C
(EOF)
D = Valid Data
0 = Zero Padding
The issue stems from disk size being set to min(io_offset + io_size,
inode->i_size) at I/O completion. Since io_offset+io_size is block
size granularity, it may exceed the actual valid file data size. In
the case of concurrent append writes, inode->i_size may be larger
than the actual range of valid file data written to disk, leading to
inaccurate disk size updates.
This patch modifies the meaning of io_size to represent the size of
valid data within EOF in an ioend. If the ioend spans beyond i_size,
io_size will be trimmed to provide the file with more accurate size
information. This is particularly useful for on-disk size updates
at completion time.
After this change, ioends that span i_size will not grow or merge with
other ioends in concurrent scenarios. However, these cases that need
growth/merging rarely occur and it seems no noticeable performance impact.
Although rounding up io_size could enable ioend growth/merging in these
scenarios, we decided to keep the code simple after discussion [1].
Another benefit is that it makes the xfs_ioend_is_append() check more
accurate, which can reduce unnecessary end bio callbacks of xfs_end_bio()
in certain scenarios, such as repeated writes at the file tail without
extending the file size.
Link [1]: https://patchwork.kernel.org/project/xfs/patch/20241113091907.56937-1-leo.lilong@huawei.com
Fixes: ae259a9c8593 ("fs: introduce iomap infrastructure") # goes further back than this
Signed-off-by: Long Li <leo.lilong@huawei.com>
Link: https://lore.kernel.org/r/20241209114241.3725722-3-leo.lilong@huawei.com
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Long Li <leo.lilong@huawei.com>
Date: Mon Dec 9 19:42:39 2024 +0800
iomap: pass byte granular end position to iomap_add_to_ioend
[ Upstream commit b44679c63e4d3ac820998b6bd59fba89a72ad3e7 ]
This is a preparatory patch for fixing zero padding issues in concurrent
append write scenarios. In the following patches, we need to obtain
byte-granular writeback end position for io_size trimming after EOF
handling.
Due to concurrent writeback and truncate operations, inode size may
shrink. Resampling inode size would force writeback code to handle the
newly appeared post-EOF blocks, which is undesirable. As Dave
explained in [1]:
"Really, the issue is that writeback mappings have to be able to
handle the range being mapped suddenly appear to be beyond EOF.
This behaviour is a longstanding writeback constraint, and is what
iomap_writepage_handle_eof() is attempting to handle.
We handle this by only sampling i_size_read() whilst we have the
folio locked and can determine the action we should take with that
folio (i.e. nothing, partial zeroing, or skip altogether). Once
we've made the decision that the folio is within EOF and taken
action on it (i.e. moved the folio to writeback state), we cannot
then resample the inode size because a truncate may have started
and changed the inode size."
To avoid resampling inode size after EOF handling, we convert end_pos
to byte-granular writeback position and return it from EOF handling
function.
Since iomap_set_range_dirty() can handle unaligned lengths, this
conversion has no impact on it. However, iomap_find_dirty_range()
requires aligned start and end range to find dirty blocks within the
given range, so the end position needs to be rounded up when passed
to it.
LINK [1]: https://lore.kernel.org/linux-xfs/Z1Gg0pAa54MoeYME@localhost.localdomain/
Signed-off-by: Long Li <leo.lilong@huawei.com>
Link: https://lore.kernel.org/r/20241209114241.3725722-2-leo.lilong@huawei.com
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 51d20d1dacbe ("iomap: fix zero padding data issue in concurrent append writes")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Mon Jan 6 16:19:11 2025 +0900
ipvlan: Fix use-after-free in ipvlan_get_iflink().
[ Upstream commit cb358ff94154774d031159b018adf45e17673941 ]
syzbot presented an use-after-free report [0] regarding ipvlan and
linkwatch.
ipvlan does not hold a refcnt of the lower device unlike vlan and
macvlan.
If the linkwatch work is triggered for the ipvlan dev, the lower dev
might have already been freed, resulting in UAF of ipvlan->phy_dev in
ipvlan_get_iflink().
We can delay the lower dev unregistration like vlan and macvlan by
holding the lower dev's refcnt in dev->netdev_ops->ndo_init() and
releasing it in dev->priv_destructor().
Jakub pointed out calling .ndo_XXX after unregister_netdevice() has
returned is error prone and suggested [1] addressing this UAF in the
core by taking commit 750e51603395 ("net: avoid potential UAF in
default_operstate()") further.
Let's assume unregistering devices DOWN and use RCU protection in
default_operstate() not to race with the device unregistration.
[0]:
BUG: KASAN: slab-use-after-free in ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353
Read of size 4 at addr ffff0000d768c0e0 by task kworker/u8:35/6944
CPU: 0 UID: 0 PID: 6944 Comm: kworker/u8:35 Not tainted 6.13.0-rc2-g9bc5c9515b48 #12 4c3cb9e8b4565456f6a355f312ff91f4f29b3c47
Hardware name: linux,dummy-virt (DT)
Workqueue: events_unbound linkwatch_event
Call trace:
show_stack+0x38/0x50 arch/arm64/kernel/stacktrace.c:484 (C)
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0xbc/0x108 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0x16c/0x6f0 mm/kasan/report.c:489
kasan_report+0xc0/0x120 mm/kasan/report.c:602
__asan_report_load4_noabort+0x20/0x30 mm/kasan/report_generic.c:380
ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353
dev_get_iflink+0x7c/0xd8 net/core/dev.c:674
default_operstate net/core/link_watch.c:45 [inline]
rfc2863_policy+0x144/0x360 net/core/link_watch.c:72
linkwatch_do_dev+0x60/0x228 net/core/link_watch.c:175
__linkwatch_run_queue+0x2f4/0x5b8 net/core/link_watch.c:239
linkwatch_event+0x64/0xa8 net/core/link_watch.c:282
process_one_work+0x700/0x1398 kernel/workqueue.c:3229
process_scheduled_works kernel/workqueue.c:3310 [inline]
worker_thread+0x8c4/0xe10 kernel/workqueue.c:3391
kthread+0x2b0/0x360 kernel/kthread.c:389
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
Allocated by task 9303:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x30/0x68 mm/kasan/common.c:68
kasan_save_alloc_info+0x44/0x58 mm/kasan/generic.c:568
poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
__kasan_kmalloc+0x84/0xa0 mm/kasan/common.c:394
kasan_kmalloc include/linux/kasan.h:260 [inline]
__do_kmalloc_node mm/slub.c:4283 [inline]
__kmalloc_node_noprof+0x2a0/0x560 mm/slub.c:4289
__kvmalloc_node_noprof+0x9c/0x230 mm/util.c:650
alloc_netdev_mqs+0xb4/0x1118 net/core/dev.c:11209
rtnl_create_link+0x2b8/0xb60 net/core/rtnetlink.c:3595
rtnl_newlink_create+0x19c/0x868 net/core/rtnetlink.c:3771
__rtnl_newlink net/core/rtnetlink.c:3896 [inline]
rtnl_newlink+0x122c/0x15c0 net/core/rtnetlink.c:4011
rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901
netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542
rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928
netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347
netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg net/socket.c:726 [inline]
__sys_sendto+0x2ec/0x438 net/socket.c:2197
__do_sys_sendto net/socket.c:2204 [inline]
__se_sys_sendto net/socket.c:2200 [inline]
__arm64_sys_sendto+0xe4/0x110 net/socket.c:2200
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132
do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151
el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744
el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762
el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600
Freed by task 10200:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x30/0x68 mm/kasan/common.c:68
kasan_save_free_info+0x58/0x70 mm/kasan/generic.c:582
poison_slab_object mm/kasan/common.c:247 [inline]
__kasan_slab_free+0x48/0x68 mm/kasan/common.c:264
kasan_slab_free include/linux/kasan.h:233 [inline]
slab_free_hook mm/slub.c:2338 [inline]
slab_free mm/slub.c:4598 [inline]
kfree+0x140/0x420 mm/slub.c:4746
kvfree+0x4c/0x68 mm/util.c:693
netdev_release+0x94/0xc8 net/core/net-sysfs.c:2034
device_release+0x98/0x1c0
kobject_cleanup lib/kobject.c:689 [inline]
kobject_release lib/kobject.c:720 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x2b0/0x438 lib/kobject.c:737
netdev_run_todo+0xdd8/0xf48 net/core/dev.c:10924
rtnl_unlock net/core/rtnetlink.c:152 [inline]
rtnl_net_unlock net/core/rtnetlink.c:209 [inline]
rtnl_dellink+0x484/0x680 net/core/rtnetlink.c:3526
rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901
netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542
rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928
netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347
netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg net/socket.c:726 [inline]
____sys_sendmsg+0x410/0x708 net/socket.c:2583
___sys_sendmsg+0x178/0x1d8 net/socket.c:2637
__sys_sendmsg net/socket.c:2669 [inline]
__do_sys_sendmsg net/socket.c:2674 [inline]
__se_sys_sendmsg net/socket.c:2672 [inline]
__arm64_sys_sendmsg+0x12c/0x1c8 net/socket.c:2672
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132
do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151
el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744
el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762
el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600
The buggy address belongs to the object at ffff0000d768c000
which belongs to the cache kmalloc-cg-4k of size 4096
The buggy address is located 224 bytes inside of
freed 4096-byte region [ffff0000d768c000, ffff0000d768d000)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x117688
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
memcg:ffff0000c77ef981
flags: 0xbfffe0000000040(head|node=0|zone=2|lastcpupid=0x1ffff)
page_type: f5(slab)
raw: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122
raw: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981
head: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122
head: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981
head: 0bfffe0000000003 fffffdffc35da201 ffffffffffffffff 0000000000000000
head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff0000d768bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff0000d768c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff0000d768c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff0000d768c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff0000d768c180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: 8c55facecd7a ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/netdev/20250102174400.085fd8ac@kernel.org/ [1]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250106071911.64355-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhang Yi <yi.zhang@huawei.com>
Date: Tue Dec 3 09:44:07 2024 +0800
jbd2: flush filesystem device before updating tail sequence
[ Upstream commit a0851ea9cd555c333795b85ddd908898b937c4e1 ]
When committing transaction in jbd2_journal_commit_transaction(), the
disk caches for the filesystem device should be flushed before updating
the journal tail sequence. However, this step is missed if the journal
is not located on the filesystem device. As a result, the filesystem may
become inconsistent following a power failure or system crash. Fix it by
ensuring that the filesystem device is flushed appropriately.
Fixes: 3339578f0578 ("jbd2: cleanup journal tail after transaction commit")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://lore.kernel.org/r/20241203014407.805916-3-yi.zhang@huaweicloud.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhang Yi <yi.zhang@huawei.com>
Date: Tue Dec 3 09:44:06 2024 +0800
jbd2: increase IO priority for writing revoke records
[ Upstream commit ac1e21bd8c883aeac2f1835fc93b39c1e6838b35 ]
Commit '6a3afb6ac6df ("jbd2: increase the journal IO's priority")'
increases the priority of journal I/O by marking I/O with the
JBD2_JOURNAL_REQ_FLAGS. However, that commit missed the revoke buffers,
so also addresses that kind of I/Os.
Fixes: 6a3afb6ac6df ("jbd2: increase the journal IO's priority")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://lore.kernel.org/r/20241203014407.805916-2-yi.zhang@huaweicloud.com
Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wentao Liang <liangwentao@iscas.ac.cn>
Date: Mon Dec 23 23:30:50 2024 +0800
ksmbd: fix a missing return value check bug
[ Upstream commit 4c16e1cadcbcaf3c82d5fc310fbd34d0f5d0db7c ]
In the smb2_send_interim_resp(), if ksmbd_alloc_work_struct()
fails to allocate a node, it returns a NULL pointer to the
in_work pointer. This can lead to an illegal memory write of
in_work->response_buf when allocate_interim_rsp_buf() attempts
to perform a kzalloc() on it.
To address this issue, incorporating a check for the return
value of ksmbd_alloc_work_struct() ensures that the function
returns immediately upon allocation failure, thereby preventing
the aforementioned illegal memory access.
Fixes: 041bba4414cd ("ksmbd: fix wrong interim response on compound")
Signed-off-by: Wentao Liang <liangwentao@iscas.ac.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: He Wang <xw897002528@gmail.com>
Date: Mon Jan 6 03:39:54 2025 +0000
ksmbd: fix unexpectedly changed path in ksmbd_vfs_kern_path_locked
[ Upstream commit 2ac538e40278a2c0c051cca81bcaafc547d61372 ]
When `ksmbd_vfs_kern_path_locked` met an error and it is not the last
entry, it will exit without restoring changed path buffer. But later this
buffer may be used as the filename for creation.
Fixes: c5a709f08d40 ("ksmbd: handle caseless file creation")
Signed-off-by: He Wang <xw897002528@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Namjae Jeon <linkinjeon@kernel.org>
Date: Tue Jan 7 17:41:21 2025 +0900
ksmbd: Implement new SMB3 POSIX type
commit e8580b4c600e085b3c8e6404392de2f822d4c132 upstream.
As SMB3 posix extension specification, Give posix file type to posix
mode.
https://www.samba.org/~slow/SMB3_POSIX/fscc_posix_extensions.html#posix-file-type-definition
Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Fri Jan 17 13:41:00 2025 +0100
Linux 6.12.10
Link: https://lore.kernel.org/r/20250115103606.357764746@linuxfoundation.org
Tested-by: Luna Jernberg <droidbittin@gmail.com>
Tested-by: Mark Brown <broonie@kernel.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Ronald Warsow <rwarsow@gmx.de>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Hardik Garg <hargar@linux.microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Leo Yang <leo.yang.sy0@gmail.com>
Date: Tue Jan 7 11:15:30 2025 +0800
mctp i3c: fix MCTP I3C driver multi-thread issue
[ Upstream commit 2d2d4f60ed266a8f340a721102d035252606980b ]
We found a timeout problem with the pldm command on our system. The
reason is that the MCTP-I3C driver has a race condition when receiving
multiple-packet messages in multi-thread, resulting in a wrong packet
order problem.
We identified this problem by adding a debug message to the
mctp_i3c_read function.
According to the MCTP spec, a multiple-packet message must be composed
in sequence, and if there is a wrong sequence, the whole message will be
discarded and wait for the next SOM.
For example, SOM → Pkt Seq #2 → Pkt Seq #1 → Pkt Seq #3 → EOM.
Therefore, we try to solve this problem by adding a mutex to the
mctp_i3c_read function. Before the modification, when a command
requesting a multiple-packet message response is sent consecutively, an
error usually occurs within 100 loops. After the mutex, it can go
through 40000 loops without any error, and it seems to run well.
Fixes: c8755b29b58e ("mctp i3c: MCTP I3C driver")
Signed-off-by: Leo Yang <Leo-Yang@quantatw.com>
Link: https://patch.msgid.link/20250107031529.3296094-1-Leo-Yang@quantatw.com
[pabeni@redhat.com: dropped already answered question from changelog]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rengarajan S <rengarajan.s@microchip.com>
Date: Thu Dec 5 19:06:25 2024 +0530
misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling
commit 194f9f94a5169547d682e9bbcc5ae6d18a564735 upstream.
Resolve kernel panic caused by improper handling of IRQs while
accessing GPIO values. This is done by replacing generic_handle_irq with
handle_nested_irq.
Fixes: 1f4d8ae231f4 ("misc: microchip: pci1xxxx: Add gpio irq handler and irq helper functions irq_ack, irq_mask, irq_unmask and irq_set_type of irq_chip.")
Cc: stable <stable@kernel.org>
Signed-off-by: Rengarajan S <rengarajan.s@microchip.com>
Link: https://lore.kernel.org/r/20241205133626.1483499-2-rengarajan.s@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Rengarajan S <rengarajan.s@microchip.com>
Date: Thu Dec 5 19:06:26 2024 +0530
misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set config
commit c7a5378a0f707686de3ddb489f1653c523bb7dcc upstream.
Driver returns -EOPNOTSUPPORTED on unsupported parameters case in set
config. Upper level driver checks for -ENOTSUPP. Because of the return
code mismatch, the ioctls from userspace fail. Resolve the issue by
passing -ENOTSUPP during unsupported case.
Fixes: 7d3e4d807df2 ("misc: microchip: pci1xxxx: load gpio driver for the gpio controller auxiliary device enumerated by the auxiliary bus driver.")
Cc: stable <stable@kernel.org>
Signed-off-by: Rengarajan S <rengarajan.s@microchip.com>
Link: https://lore.kernel.org/r/20241205133626.1483499-3-rengarajan.s@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:29 2025 +0100
mptcp: sysctl: avail sched: remove write access
commit 771ec78dc8b48d562e6015bb535ed3cd37043d78 upstream.
'net.mptcp.available_schedulers' sysctl knob is there to list available
schedulers, not to modify this list.
There are then no reasons to give write access to it.
Nothing would have been written anyway, but no errors would have been
returned, which is unexpected.
Fixes: 73c900aa3660 ("mptcp: add net.mptcp.available_schedulers")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-1-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:31 2025 +0100
mptcp: sysctl: blackhole timeout: avoid using current->nsproxy
commit 92cf7a51bdae24a32c592adcdd59a773ae149289 upstream.
As mentioned in the previous commit, using the 'net' structure via
'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'pernet' structure can be obtained from the table->data using
container_of().
Fixes: 27069e7cb3d1 ("mptcp: disable active MPTCP in case of blackhole")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-3-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:30 2025 +0100
mptcp: sysctl: sched: avoid using current->nsproxy
commit d38e26e36206ae3d544d496513212ae931d1da0a upstream.
Using the 'net' structure via 'current' is not recommended for different
reasons.
First, if the goal is to use it to read or write per-netns data, this is
inconsistent with how the "generic" sysctl entries are doing: directly
by only using pointers set to the table entry, e.g. table->data. Linked
to that, the per-netns data should always be obtained from the table
linked to the netns it had been created for, which may not coincide with
the reader's or writer's netns.
Another reason is that access to current->nsproxy->netns can oops if
attempted when current->nsproxy had been dropped when the current task
is exiting. This is what syzbot found, when using acct(2):
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
CPU: 1 UID: 0 PID: 5924 Comm: syz-executor Not tainted 6.13.0-rc5-syzkaller-00004-gccb98ccef0e5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125
Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00
RSP: 0018:ffffc900034774e8 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620
RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028
RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040
R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000
R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
proc_sys_call_handler+0x403/0x5d0 fs/proc/proc_sysctl.c:601
__kernel_write_iter+0x318/0xa80 fs/read_write.c:612
__kernel_write+0xf6/0x140 fs/read_write.c:632
do_acct_process+0xcb0/0x14a0 kernel/acct.c:539
acct_pin_kill+0x2d/0x100 kernel/acct.c:192
pin_kill+0x194/0x7c0 fs/fs_pin.c:44
mnt_pin_kill+0x61/0x1e0 fs/fs_pin.c:81
cleanup_mnt+0x3ac/0x450 fs/namespace.c:1366
task_work_run+0x14e/0x250 kernel/task_work.c:239
exit_task_work include/linux/task_work.h:43 [inline]
do_exit+0xad8/0x2d70 kernel/exit.c:938
do_group_exit+0xd3/0x2a0 kernel/exit.c:1087
get_signal+0x2576/0x2610 kernel/signal.c:3017
arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337
exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218
do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fee3cb87a6a
Code: Unable to access opcode bytes at 0x7fee3cb87a40.
RSP: 002b:00007fffcccac688 EFLAGS: 00000202 ORIG_RAX: 0000000000000037
RAX: 0000000000000000 RBX: 00007fffcccac710 RCX: 00007fee3cb87a6a
RDX: 0000000000000041 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000003 R08: 00007fffcccac6ac R09: 00007fffcccacac7
R10: 00007fffcccac710 R11: 0000000000000202 R12: 00007fee3cd49500
R13: 00007fffcccac6ac R14: 0000000000000000 R15: 00007fee3cd4b000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125
Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00
RSP: 0018:ffffc900034774e8 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620
RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028
RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040
R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000
R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess), 1 bytes skipped:
0: 42 80 3c 38 00 cmpb $0x0,(%rax,%r15,1)
5: 0f 85 fe 02 00 00 jne 0x309
b: 4d 8b a4 24 08 09 00 mov 0x908(%r12),%r12
12: 00
13: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
1a: fc ff df
1d: 49 8d 7c 24 28 lea 0x28(%r12),%rdi
22: 48 89 fa mov %rdi,%rdx
25: 48 c1 ea 03 shr $0x3,%rdx
* 29: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction
2d: 0f 85 cc 02 00 00 jne 0x2ff
33: 4d 8b 7c 24 28 mov 0x28(%r12),%r15
38: 48 rex.W
39: 8d .byte 0x8d
3a: 84 24 c8 test %ah,(%rax,%rcx,8)
Here with 'net.mptcp.scheduler', the 'net' structure is not really
needed, because the table->data already has a pointer to the current
scheduler, the only thing needed from the per-netns data.
Simply use 'data', instead of getting (most of the time) the same thing,
but from a longer and indirect way.
Fixes: 6963c508fd7a ("mptcp: only allow set existing scheduler for net.mptcp.scheduler")
Cc: stable@vger.kernel.org
Reported-by: syzbot+e364f774c6f57f2c86d1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-2-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chenguang Zhao <zhaochenguang@kylinos.cn>
Date: Wed Jan 8 11:00:09 2025 +0800
net/mlx5: Fix variable not being completed when function returns
[ Upstream commit 0e2909c6bec9048f49d0c8e16887c63b50b14647 ]
When cmd_alloc_index(), fails cmd_work_handler() needs
to complete ent->slotted before returning early.
Otherwise the task which issued the command may hang:
mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/13:2 D 0 4055883 2 0x00000228
Workqueue: events mlx5e_tx_dim_work [mlx5_core]
Call trace:
__switch_to+0xe8/0x150
__schedule+0x2a8/0x9b8
schedule+0x2c/0x88
schedule_timeout+0x204/0x478
wait_for_common+0x154/0x250
wait_for_completion+0x28/0x38
cmd_exec+0x7a0/0xa00 [mlx5_core]
mlx5_cmd_exec+0x54/0x80 [mlx5_core]
mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
process_one_work+0x1b0/0x448
worker_thread+0x54/0x468
kthread+0x134/0x138
ret_from_fork+0x10/0x18
Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250108030009.68520-1-zhaochenguang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Antonio Pastor <antonio.pastor@gmail.com>
Date: Thu Jan 2 20:23:00 2025 -0500
net: 802: LLC+SNAP OID:PID lookup on start of skb data
[ Upstream commit 1e9b0e1c550c42c13c111d1a31e822057232abc4 ]
802.2+LLC+SNAP frames received by napi_complete_done() with GRO and DSA
have skb->transport_header set two bytes short, or pointing 2 bytes
before network_header & skb->data. This was an issue as snap_rcv()
expected offset to point to SNAP header (OID:PID), causing packet to
be dropped.
A fix at llc_fixup_skb() (a024e377efed) resets transport_header for any
LLC consumers that may care about it, and stops SNAP packets from being
dropped, but doesn't fix the problem which is that LLC and SNAP should
not use transport_header offset.
Ths patch eliminates the use of transport_header offset for SNAP lookup
of OID:PID so that SNAP does not rely on the offset at all.
The offset is reset after pull for any SNAP packet consumers that may
(but shouldn't) use it.
Fixes: fda55eca5a33 ("net: introduce skb_transport_header_was_set()")
Signed-off-by: Antonio Pastor <antonio.pastor@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250103012303.746521-1-antonio.pastor@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Fri Jan 3 10:32:07 2025 -0800
net: don't dump Tx and uninitialized NAPIs
[ Upstream commit fd48f071a3d6d51e737e953bb43fe69785cf59a9 ]
We use NAPI ID as the key for continuing dumps. We also depend
on the NAPIs being sorted by ID within the driver list. Tx NAPIs
(which don't have an ID assigned) break this expectation, it's
not currently possible to dump them reliably. Since Tx NAPIs
are relatively rare, and can't be used in doit (GET or SET)
hide them from the dump API as well.
Fixes: 27f91aaf49b3 ("netdev-genl: Add netlink framework functions for napi")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250103183207.1216004-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jian Shen <shenjian15@huawei.com>
Date: Mon Jan 6 22:36:39 2025 +0800
net: hns3: don't auto enable misc vector
[ Upstream commit 98b1e3b27734139c76295754b6c317aa4df6d32e ]
Currently, there is a time window between misc irq enabled
and service task inited. If an interrupte is reported at
this time, it will cause warning like below:
[ 16.324639] Call trace:
[ 16.324641] __queue_delayed_work+0xb8/0xe0
[ 16.324643] mod_delayed_work_on+0x78/0xd0
[ 16.324655] hclge_errhand_task_schedule+0x58/0x90 [hclge]
[ 16.324662] hclge_misc_irq_handle+0x168/0x240 [hclge]
[ 16.324666] __handle_irq_event_percpu+0x64/0x1e0
[ 16.324667] handle_irq_event+0x80/0x170
[ 16.324670] handle_fasteoi_edge_irq+0x110/0x2bc
[ 16.324671] __handle_domain_irq+0x84/0xfc
[ 16.324673] gic_handle_irq+0x88/0x2c0
[ 16.324674] el1_irq+0xb8/0x140
[ 16.324677] arch_cpu_idle+0x18/0x40
[ 16.324679] default_idle_call+0x5c/0x1bc
[ 16.324682] cpuidle_idle_call+0x18c/0x1c4
[ 16.324684] do_idle+0x174/0x17c
[ 16.324685] cpu_startup_entry+0x30/0x6c
[ 16.324687] secondary_start_kernel+0x1a4/0x280
[ 16.324688] ---[ end trace 6aa0bff672a964aa ]---
So don't auto enable misc vector when request irq..
Fixes: 7be1b9f3e99f ("net: hns3: make hclge_service use delayed workqueue")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250106143642.539698-5-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jie Wang <wangjie125@huawei.com>
Date: Mon Jan 6 22:36:42 2025 +0800
net: hns3: fix kernel crash when 1588 is sent on HIP08 devices
[ Upstream commit 9741e72b2286de8b38de9db685588ac421a95c87 ]
Currently, HIP08 devices does not register the ptp devices, so the
hdev->ptp is NULL. But the tx process would still try to set hardware time
stamp info with SKBTX_HW_TSTAMP flag and cause a kernel crash.
[ 128.087798] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
...
[ 128.280251] pc : hclge_ptp_set_tx_info+0x2c/0x140 [hclge]
[ 128.286600] lr : hclge_ptp_set_tx_info+0x20/0x140 [hclge]
[ 128.292938] sp : ffff800059b93140
[ 128.297200] x29: ffff800059b93140 x28: 0000000000003280
[ 128.303455] x27: ffff800020d48280 x26: ffff0cb9dc814080
[ 128.309715] x25: ffff0cb9cde93fa0 x24: 0000000000000001
[ 128.315969] x23: 0000000000000000 x22: 0000000000000194
[ 128.322219] x21: ffff0cd94f986000 x20: 0000000000000000
[ 128.328462] x19: ffff0cb9d2a166c0 x18: 0000000000000000
[ 128.334698] x17: 0000000000000000 x16: ffffcf1fc523ed24
[ 128.340934] x15: 0000ffffd530a518 x14: 0000000000000000
[ 128.347162] x13: ffff0cd6bdb31310 x12: 0000000000000368
[ 128.353388] x11: ffff0cb9cfbc7070 x10: ffff2cf55dd11e02
[ 128.359606] x9 : ffffcf1f85a212b4 x8 : ffff0cd7cf27dab0
[ 128.365831] x7 : 0000000000000a20 x6 : ffff0cd7cf27d000
[ 128.372040] x5 : 0000000000000000 x4 : 000000000000ffff
[ 128.378243] x3 : 0000000000000400 x2 : ffffcf1f85a21294
[ 128.384437] x1 : ffff0cb9db520080 x0 : ffff0cb9db500080
[ 128.390626] Call trace:
[ 128.393964] hclge_ptp_set_tx_info+0x2c/0x140 [hclge]
[ 128.399893] hns3_nic_net_xmit+0x39c/0x4c4 [hns3]
[ 128.405468] xmit_one.constprop.0+0xc4/0x200
[ 128.410600] dev_hard_start_xmit+0x54/0xf0
[ 128.415556] sch_direct_xmit+0xe8/0x634
[ 128.420246] __dev_queue_xmit+0x224/0xc70
[ 128.425101] dev_queue_xmit+0x1c/0x40
[ 128.429608] ovs_vport_send+0xac/0x1a0 [openvswitch]
[ 128.435409] do_output+0x60/0x17c [openvswitch]
[ 128.440770] do_execute_actions+0x898/0x8c4 [openvswitch]
[ 128.446993] ovs_execute_actions+0x64/0xf0 [openvswitch]
[ 128.453129] ovs_dp_process_packet+0xa0/0x224 [openvswitch]
[ 128.459530] ovs_vport_receive+0x7c/0xfc [openvswitch]
[ 128.465497] internal_dev_xmit+0x34/0xb0 [openvswitch]
[ 128.471460] xmit_one.constprop.0+0xc4/0x200
[ 128.476561] dev_hard_start_xmit+0x54/0xf0
[ 128.481489] __dev_queue_xmit+0x968/0xc70
[ 128.486330] dev_queue_xmit+0x1c/0x40
[ 128.490856] ip_finish_output2+0x250/0x570
[ 128.495810] __ip_finish_output+0x170/0x1e0
[ 128.500832] ip_finish_output+0x3c/0xf0
[ 128.505504] ip_output+0xbc/0x160
[ 128.509654] ip_send_skb+0x58/0xd4
[ 128.513892] udp_send_skb+0x12c/0x354
[ 128.518387] udp_sendmsg+0x7a8/0x9c0
[ 128.522793] inet_sendmsg+0x4c/0x8c
[ 128.527116] __sock_sendmsg+0x48/0x80
[ 128.531609] __sys_sendto+0x124/0x164
[ 128.536099] __arm64_sys_sendto+0x30/0x5c
[ 128.540935] invoke_syscall+0x50/0x130
[ 128.545508] el0_svc_common.constprop.0+0x10c/0x124
[ 128.551205] do_el0_svc+0x34/0xdc
[ 128.555347] el0_svc+0x20/0x30
[ 128.559227] el0_sync_handler+0xb8/0xc0
[ 128.563883] el0_sync+0x160/0x180
Fixes: 0bf5eb788512 ("net: hns3: add support for PTP")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250106143642.539698-8-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hao Lan <lanhao@huawei.com>
Date: Mon Jan 6 22:36:37 2025 +0800
net: hns3: fix missing features due to dev->features configuration too early
[ Upstream commit ac1e2836fe294c2007ca81cf7006862c3bdf0510 ]
Currently, the netdev->features is configured in hns3_nic_set_features.
As a result, __netdev_update_features considers that there is no feature
difference, and the procedures of the real features are missing.
Fixes: 2a7556bb2b73 ("net: hns3: implement ndo_features_check ops for hns3 driver")
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250106143642.539698-3-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hao Lan <lanhao@huawei.com>
Date: Mon Jan 6 22:36:41 2025 +0800
net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue
[ Upstream commit 7997ddd46c54408bcba5e37fe18b4d832e45d4d4 ]
The TQP BAR space is divided into two segments. TQPs 0-1023 and TQPs
1024-1279 are in different BAR space addresses. However,
hclge_fetch_pf_reg does not distinguish the tqp space information when
reading the tqp space information. When the number of TQPs is greater
than 1024, access bar space overwriting occurs.
The problem of different segments has been considered during the
initialization of tqp.io_base. Therefore, tqp.io_base is directly used
when the queue is read in hclge_fetch_pf_reg.
The error message:
Unable to handle kernel paging request at virtual address ffff800037200000
pc : hclge_fetch_pf_reg+0x138/0x250 [hclge]
lr : hclge_get_regs+0x84/0x1d0 [hclge]
Call trace:
hclge_fetch_pf_reg+0x138/0x250 [hclge]
hclge_get_regs+0x84/0x1d0 [hclge]
hns3_get_regs+0x2c/0x50 [hns3]
ethtool_get_regs+0xf4/0x270
dev_ethtool+0x674/0x8a0
dev_ioctl+0x270/0x36c
sock_do_ioctl+0x110/0x2a0
sock_ioctl+0x2ac/0x530
__arm64_sys_ioctl+0xa8/0x100
invoke_syscall+0x4c/0x124
el0_svc_common.constprop.0+0x140/0x15c
do_el0_svc+0x30/0xd0
el0_svc+0x1c/0x2c
el0_sync_handler+0xb0/0xb4
el0_sync+0x168/0x180
Fixes: 939ccd107ffc ("net: hns3: move dump regs function to a separate file")
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250106143642.539698-7-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hao Lan <lanhao@huawei.com>
Date: Mon Jan 6 22:36:36 2025 +0800
net: hns3: fixed reset failure issues caused by the incorrect reset type
[ Upstream commit 5a4b584c67699a69981f0740618a144965a63237 ]
When a reset type that is not supported by the driver is input, a reset
pending flag bit of the HNAE3_NONE_RESET type is generated in
reset_pending. The driver does not have a mechanism to clear this type
of error. As a result, the driver considers that the reset is not
complete. This patch provides a mechanism to clear the
HNAE3_NONE_RESET flag and the parameter of
hnae3_ae_ops.set_default_reset_request is verified.
The error message:
hns3 0000:39:01.0: cmd failed -16
hns3 0000:39:01.0: hclge device re-init failed, VF is disabled!
hns3 0000:39:01.0: failed to reset VF stack
hns3 0000:39:01.0: failed to reset VF(4)
hns3 0000:39:01.0: prepare reset(2) wait done
hns3 0000:39:01.0 eth4: already uninitialized
Use the crash tool to view struct hclgevf_dev:
struct hclgevf_dev {
...
default_reset_request = 0x20,
reset_level = HNAE3_NONE_RESET,
reset_pending = 0x100,
reset_type = HNAE3_NONE_RESET,
...
};
Fixes: 720bd5837e37 ("net: hns3: add set_default_reset_request in the hnae3_ae_ops")
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250106143642.539698-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jian Shen <shenjian15@huawei.com>
Date: Mon Jan 6 22:36:40 2025 +0800
net: hns3: initialize reset_timer before hclgevf_misc_irq_init()
[ Upstream commit 247fd1e33e1cd156aabe444e932d2648d33f1245 ]
Currently the misc irq is initialized before reset_timer setup. But
it will access the reset_timer in the irq handler. So initialize
the reset_timer earlier.
Fixes: ff200099d271 ("net: hns3: remove unnecessary work in hclgevf_main")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250106143642.539698-6-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hao Lan <lanhao@huawei.com>
Date: Mon Jan 6 22:36:38 2025 +0800
net: hns3: Resolved the issue that the debugfs query result is inconsistent.
[ Upstream commit 5191a8d3c2ab5bc01930ea3425e06a739af5b0e9 ]
This patch modifies the implementation of debugfs:
When the user process stops unexpectedly, not all data of the file system
is read. In this case, the save_buf pointer is not released. When the
user process is called next time, save_buf is used to copy the cached
data to the user space. As a result, the queried data is stale.
To solve this problem, this patch implements .open() and .release() handler
for debugfs file_operations. moving allocation buffer and execution
of the cmd to the .open() handler and freeing in to the .release() handler.
Allocate separate buffer for each reader and associate the buffer
with the file pointer.
When different user read processes no longer share the buffer,
the stale data problem is fixed.
Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process")
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Guangwei Zhang <zhangwangwei6@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250106143642.539698-4-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiawen Wu <jiawenwu@trustnetic.com>
Date: Fri Jan 3 16:10:13 2025 +0800
net: libwx: fix firmware mailbox abnormal return
[ Upstream commit 8ce4f287524c74a118b0af1eebd4b24a8efca57a ]
The existing SW-FW interaction flow on the driver is wrong. Follow this
wrong flow, driver would never return error if there is a unknown command.
Since firmware writes back 'firmware ready' and 'unknown command' in the
mailbox message if there is an unknown command sent by driver. So reading
'firmware ready' does not timeout. Then driver would mistakenly believe
that the interaction has completed successfully.
It tends to happen with the use of custom firmware. Move the check for
'unknown command' out of the poll timeout for 'firmware ready'. And adjust
the debug log so that mailbox messages are always printed when commands
timeout.
Fixes: 1efa9bfe58c5 ("net: libwx: Implement interaction with firmware")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/20250103081013.1995939-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Parker Newman <pnewman@connecttech.com>
Date: Tue Jan 7 16:24:59 2025 -0500
net: stmmac: dwmac-tegra: Read iommu stream id from device tree
[ Upstream commit 426046e2d62dd19533808661e912b8e8a9eaec16 ]
Nvidia's Tegra MGBE controllers require the IOMMU "Stream ID" (SID) to be
written to the MGBE_WRAP_AXI_ASID0_CTRL register.
The current driver is hard coded to use MGBE0's SID for all controllers.
This causes softirq time outs and kernel panics when using controllers
other than MGBE0.
Example dmesg errors when an ethernet cable is connected to MGBE1:
[ 116.133290] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
[ 121.851283] tegra-mgbe 6910000.ethernet eth1: NETDEV WATCHDOG: CPU: 5: transmit queue 0 timed out 5690 ms
[ 121.851782] tegra-mgbe 6910000.ethernet eth1: Reset adapter.
[ 121.892464] tegra-mgbe 6910000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0
[ 121.905920] tegra-mgbe 6910000.ethernet eth1: PHY [stmmac-1:00] driver [Aquantia AQR113] (irq=171)
[ 121.907356] tegra-mgbe 6910000.ethernet eth1: Enabling Safety Features
[ 121.907578] tegra-mgbe 6910000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported
[ 121.908399] tegra-mgbe 6910000.ethernet eth1: registered PTP clock
[ 121.908582] tegra-mgbe 6910000.ethernet eth1: configuring for phy/10gbase-r link mode
[ 125.961292] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
[ 181.921198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 181.921404] rcu: 7-....: (1 GPs behind) idle=540c/1/0x4000000000000002 softirq=1748/1749 fqs=2337
[ 181.921684] rcu: (detected by 4, t=6002 jiffies, g=1357, q=1254 ncpus=8)
[ 181.921878] Sending NMI from CPU 4 to CPUs 7:
[ 181.921886] NMI backtrace for cpu 7
[ 181.922131] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6
[ 181.922390] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024
[ 181.922658] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 181.922847] pc : handle_softirqs+0x98/0x368
[ 181.922978] lr : __do_softirq+0x18/0x20
[ 181.923095] sp : ffff80008003bf50
[ 181.923189] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000
[ 181.923379] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0
[ 181.924486] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70
[ 181.925568] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000
[ 181.926655] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000
[ 181.931455] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d
[ 181.938628] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160
[ 181.945804] x8 : ffff8000827b3160 x7 : f9157b241586f343 x6 : eeb6502a01c81c74
[ 181.953068] x5 : a4acfcdd2e8096bb x4 : ffffce78ea277340 x3 : 00000000ffffd1e1
[ 181.960329] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000
[ 181.967591] Call trace:
[ 181.970043] handle_softirqs+0x98/0x368 (P)
[ 181.974240] __do_softirq+0x18/0x20
[ 181.977743] ____do_softirq+0x14/0x28
[ 181.981415] call_on_irq_stack+0x24/0x30
[ 181.985180] do_softirq_own_stack+0x20/0x30
[ 181.989379] __irq_exit_rcu+0x114/0x140
[ 181.993142] irq_exit_rcu+0x14/0x28
[ 181.996816] el1_interrupt+0x44/0xb8
[ 182.000316] el1h_64_irq_handler+0x14/0x20
[ 182.004343] el1h_64_irq+0x80/0x88
[ 182.007755] cpuidle_enter_state+0xc4/0x4a8 (P)
[ 182.012305] cpuidle_enter+0x3c/0x58
[ 182.015980] cpuidle_idle_call+0x128/0x1c0
[ 182.020005] do_idle+0xe0/0xf0
[ 182.023155] cpu_startup_entry+0x3c/0x48
[ 182.026917] secondary_start_kernel+0xdc/0x120
[ 182.031379] __secondary_switched+0x74/0x78
[ 212.971162] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-.... } 6103 jiffies s: 417 root: 0x80/.
[ 212.985935] rcu: blocking rcu_node structures (internal RCU debug):
[ 212.992758] Sending NMI from CPU 0 to CPUs 7:
[ 212.998539] NMI backtrace for cpu 7
[ 213.004304] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6
[ 213.016116] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024
[ 213.030817] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 213.040528] pc : handle_softirqs+0x98/0x368
[ 213.046563] lr : __do_softirq+0x18/0x20
[ 213.051293] sp : ffff80008003bf50
[ 213.055839] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000
[ 213.067304] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0
[ 213.077014] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70
[ 213.087339] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000
[ 213.097313] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000
[ 213.107201] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d
[ 213.116651] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160
[ 213.127500] x8 : ffff8000827b3160 x7 : 0a37b344852820af x6 : 3f049caedd1ff608
[ 213.138002] x5 : cff7cfdbfaf31291 x4 : ffffce78ea277340 x3 : 00000000ffffde04
[ 213.150428] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000
[ 213.162063] Call trace:
[ 213.165494] handle_softirqs+0x98/0x368 (P)
[ 213.171256] __do_softirq+0x18/0x20
[ 213.177291] ____do_softirq+0x14/0x28
[ 213.182017] call_on_irq_stack+0x24/0x30
[ 213.186565] do_softirq_own_stack+0x20/0x30
[ 213.191815] __irq_exit_rcu+0x114/0x140
[ 213.196891] irq_exit_rcu+0x14/0x28
[ 213.202401] el1_interrupt+0x44/0xb8
[ 213.207741] el1h_64_irq_handler+0x14/0x20
[ 213.213519] el1h_64_irq+0x80/0x88
[ 213.217541] cpuidle_enter_state+0xc4/0x4a8 (P)
[ 213.224364] cpuidle_enter+0x3c/0x58
[ 213.228653] cpuidle_idle_call+0x128/0x1c0
[ 213.233993] do_idle+0xe0/0xf0
[ 213.237928] cpu_startup_entry+0x3c/0x48
[ 213.243791] secondary_start_kernel+0xdc/0x120
[ 213.249830] __secondary_switched+0x74/0x78
This bug has existed since the dwmac-tegra driver was added in Dec 2022
(See Fixes tag below for commit hash).
The Tegra234 SOC has 4 MGBE controllers, however Nvidia's Developer Kit
only uses MGBE0 which is why the bug was not found previously. Connect Tech
has many products that use 2 (or more) MGBE controllers.
The solution is to read the controller's SID from the existing "iommus"
device tree property. The 2nd field of the "iommus" device tree property
is the controller's SID.
Device tree snippet from tegra234.dtsi showing MGBE1's "iommus" property:
smmu_niso0: iommu@12000000 {
compatible = "nvidia,tegra234-smmu", "nvidia,smmu-500";
...
}
/* MGBE1 */
ethernet@6900000 {
compatible = "nvidia,tegra234-mgbe";
...
iommus = <&smmu_niso0 TEGRA234_SID_MGBE_VF1>;
...
}
Nvidia's arm-smmu driver reads the "iommus" property and stores the SID in
the MGBE device's "fwspec" struct. The dwmac-tegra driver can access the
SID using the tegra_dev_iommu_get_stream_id() helper function found in
linux/iommu.h.
Calling tegra_dev_iommu_get_stream_id() should not fail unless the "iommus"
property is removed from the device tree or the IOMMU is disabled.
While the Tegra234 SOC technically supports bypassing the IOMMU, it is not
supported by the current firmware, has not been tested and not recommended.
More detailed discussion with Thierry Reding from Nvidia linked below.
Fixes: d8ca113724e7 ("net: stmmac: tegra: Add MGBE support")
Link: https://lore.kernel.org/netdev/cover.1731685185.git.pnewman@connecttech.com
Signed-off-by: Parker Newman <pnewman@connecttech.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/6fb97f32cf4accb4f7cf92846f6b60064ba0a3bd.1736284360.git.pnewman@connecttech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Fri Jan 3 10:45:46 2025 +0000
net_sched: cls_flow: validate TCA_FLOW_RSHIFT attribute
[ Upstream commit a039e54397c6a75b713b9ce7894a62e06956aa92 ]
syzbot found that TCA_FLOW_RSHIFT attribute was not validated.
Right shitfing a 32bit integer is undefined for large shift values.
UBSAN: shift-out-of-bounds in net/sched/cls_flow.c:329:23
shift exponent 9445 is too large for 32-bit type 'u32' (aka 'unsigned int')
CPU: 1 UID: 0 PID: 54 Comm: kworker/u8:3 Not tainted 6.13.0-rc3-syzkaller-00180-g4f619d518db9 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: ipv6_addrconf addrconf_dad_work
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
ubsan_epilogue lib/ubsan.c:231 [inline]
__ubsan_handle_shift_out_of_bounds+0x3c8/0x420 lib/ubsan.c:468
flow_classify+0x24d5/0x25b0 net/sched/cls_flow.c:329
tc_classify include/net/tc_wrapper.h:197 [inline]
__tcf_classify net/sched/cls_api.c:1771 [inline]
tcf_classify+0x420/0x1160 net/sched/cls_api.c:1867
sfb_classify net/sched/sch_sfb.c:260 [inline]
sfb_enqueue+0x3ad/0x18b0 net/sched/sch_sfb.c:318
dev_qdisc_enqueue+0x4b/0x290 net/core/dev.c:3793
__dev_xmit_skb net/core/dev.c:3889 [inline]
__dev_queue_xmit+0xf0e/0x3f50 net/core/dev.c:4400
dev_queue_xmit include/linux/netdevice.h:3168 [inline]
neigh_hh_output include/net/neighbour.h:523 [inline]
neigh_output include/net/neighbour.h:537 [inline]
ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236
iptunnel_xmit+0x55d/0x9b0 net/ipv4/ip_tunnel_core.c:82
udp_tunnel_xmit_skb+0x262/0x3b0 net/ipv4/udp_tunnel_core.c:173
geneve_xmit_skb drivers/net/geneve.c:916 [inline]
geneve_xmit+0x21dc/0x2d00 drivers/net/geneve.c:1039
__netdev_start_xmit include/linux/netdevice.h:5002 [inline]
netdev_start_xmit include/linux/netdevice.h:5011 [inline]
xmit_one net/core/dev.c:3590 [inline]
dev_hard_start_xmit+0x27a/0x7d0 net/core/dev.c:3606
__dev_queue_xmit+0x1b73/0x3f50 net/core/dev.c:4434
Fixes: e5dfb815181f ("[NET_SCHED]: Add flow classifier")
Reported-by: syzbot+1dbb57d994e54aaa04d2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6777bf49.050a0220.178762.0040.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250103104546.3714168-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Mon Jan 6 10:01:36 2025 -0800
netdev: prevent accessing NAPI instances from another namespace
commit d1cacd74776895f6435941f86a1130e58f6dd226 upstream.
The NAPI IDs were not fully exposed to user space prior to the netlink
API, so they were never namespaced. The netlink API must ensure that
at the very least NAPI instance belongs to the same netns as the owner
of the genl sock.
napi_by_id() can become static now, but it needs to move because of
dev_get_by_napi_id().
Cc: stable@vger.kernel.org
Fixes: 1287c1ae0fc2 ("netdev-genl: Support setting per-NAPI config values")
Fixes: 27f91aaf49b3 ("netdev-genl: Add netlink framework functions for napi")
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250106180137.1861472-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed Jan 8 22:56:33 2025 +0100
netfilter: conntrack: clamp maximum hashtable size to INT_MAX
[ Upstream commit b541ba7d1f5a5b7b3e2e22dc9e40e18a7d6dbc13 ]
Use INT_MAX as maximum size for the conntrack hashtable. Otherwise, it
is possible to hit WARN_ON_ONCE in __kvmalloc_node_noprof() when
resizing hashtable because __GFP_NOWARN is unset. See:
0708a0afe291 ("mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls")
Note: hashtable resize is only possible from init_netns.
Fixes: 9cc1c73ad666 ("netfilter: conntrack: avoid integer overflow when resizing")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu Jan 2 13:01:13 2025 +0100
netfilter: nf_tables: imbalance in flowtable binding
[ Upstream commit 13210fc63f353fe78584048079343413a3cdf819 ]
All these cases cause imbalance between BIND and UNBIND calls:
- Delete an interface from a flowtable with multiple interfaces
- Add a (device to a) flowtable with --check flag
- Delete a netns containing a flowtable
- In an interactive nft session, create a table with owner flag and
flowtable inside, then quit.
Fix it by calling FLOW_BLOCK_UNBIND when unregistering hooks, then
remove late FLOW_BLOCK_UNBIND call when destroying flowtable.
Fixes: ff4bf2f42a40 ("netfilter: nf_tables: add nft_unregister_flowtable_hook()")
Reported-by: Phil Sutter <phil@nwl.cc>
Tested-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Fri Dec 13 13:50:09 2024 +0000
netfs: Fix ceph copy to cache on write-begin
[ Upstream commit 38cf8e945721ffe708fa675507465da7f4f2a9f7 ]
At the end of netfs_unlock_read_folio() in which folios are marked
appropriately for copying to the cache (either with by being marked dirty
and having their private data set or by having PG_private_2 set) and then
unlocked, the folio_queue struct has the entry pointing to the folio
cleared. This presents a problem for netfs_pgpriv2_write_to_the_cache(),
which is used to write folios marked with PG_private_2 to the cache as it
expects to be able to trawl the folio_queue list thereafter to find the
relevant folios, leading to a hang.
Fix this by not clearing the folio_queue entry if we're going to do the
deprecated copy-to-cache. The clearance will be done instead as the folios
are written to the cache.
This can be reproduced by starting cachefiles, mounting a ceph filesystem
with "-o fsc" and writing to it.
Fixes: 796a4049640b ("netfs: In readahead, put the folio refs as soon extracted")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Closes: https://lore.kernel.org/r/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20241213135013.2964079-10-dhowells@redhat.com
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
cc: Jeff Layton <jlayton@kernel.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: netfs@lists.linux.dev
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Fri Dec 13 13:50:03 2024 +0000
netfs: Fix enomem handling in buffered reads
[ Upstream commit 105549d09a539a876b7c3330ab52d8aceedad358 ]
If netfs_read_to_pagecache() gets an error from either ->prepare_read() or
from netfs_prepare_read_iterator(), it needs to decrement ->nr_outstanding,
cancel the subrequest and break out of the issuing loop. Currently, it
only does this for two of the cases, but there are two more that aren't
handled.
Fix this by moving the handling to a common place and jumping to it from
all four places. This is in preference to inserting a wrapper around
netfs_prepare_read_iterator() as proposed by Dmitry Antipov[1].
Link: https://lore.kernel.org/r/20241202093943.227786-1-dmantipov@yandex.ru/ [1]
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: syzbot+404b4b745080b6210c6c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=404b4b745080b6210c6c
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20241213135013.2964079-4-dhowells@redhat.com
Tested-by: syzbot+404b4b745080b6210c6c@syzkaller.appspotmail.com
cc: Dmitry Antipov <dmantipov@yandex.ru>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Mon Dec 16 20:34:45 2024 +0000
netfs: Fix is-caching check in read-retry
[ Upstream commit d4e338de17cb6532bf805fae00db8b41e914009b ]
netfs: Fix is-caching check in read-retry
The read-retry code checks the NETFS_RREQ_COPY_TO_CACHE flag to determine
if there might be failed reads from the cache that need turning into reads
from the server, with the intention of skipping the complicated part if it
can. The code that set the flag, however, got lost during the read-side
rewrite.
Fix the check to see if the cache_resources are valid instead. The flag
can then be removed.
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/3752048.1734381285@warthog.procyon.org.uk
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue Jan 7 18:39:27 2025 +0000
netfs: Fix kernel async DIO
[ Upstream commit 3f6bc9e3ab9b127171d39f9ac6eca1abb693b731 ]
Netfslib needs to be able to handle kernel-initiated asynchronous DIO that
is supplied with a bio_vec[] array. Currently, because of the async flag,
this gets passed to netfs_extract_user_iter() which throws a warning and
fails because it only handles IOVEC and UBUF iterators. This can be
triggered through a combination of cifs and a loopback blockdev with
something like:
mount //my/cifs/share /foo
dd if=/dev/zero of=/foo/m0 bs=4K count=1K
losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0
echo hello >/dev/loop2046
This causes the following to appear in syslog:
WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x170/0x250 [netfs]
and the write to fail.
Fix this by removing the check in netfs_unbuffered_write_iter_locked() that
causes async kernel DIO writes to be handled as userspace writes. Note
that this change relies on the kernel caller maintaining the existence of
the bio_vec array (or kvec[] or folio_queue) until the op is complete.
Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support")
Reported-by: Nicolas Baranger <nicolas.baranger@3xo.fr>
Closes: https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/608725.1736275167@warthog.procyon.org.uk
Tested-by: Nicolas Baranger <nicolas.baranger@3xo.fr>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
cc: Steve French <smfrench@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Fri Dec 13 13:50:07 2024 +0000
netfs: Fix missing barriers by using clear_and_wake_up_bit()
[ Upstream commit aa3956418985bda1f68313eadde3267921847978 ]
Use clear_and_wake_up_bit() rather than something like:
clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);
as there needs to be a barrier inserted between which is present in
clear_and_wake_up_bit().
Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20241213135013.2964079-8-dhowells@redhat.com
Reviewed-by: Akira Yokosawa <akiyks@gmail.com>
cc: Zilin Guan <zilin@seu.edu.cn>
cc: Akira Yokosawa <akiyks@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue Jan 7 14:43:30 2025 +0000
netfs: Fix read-retry for fs with no ->prepare_read()
[ Upstream commit 904abff4b1b94184aaa0e9f5fce7821f7b5b81a3 ]
Fix netfslib's read-retry to only call ->prepare_read() in the backing
filesystem such a function is provided. We can get to this point if a
there's an active cache as failed reads from the cache need negotiating
with the server instead.
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/529329.1736261010@warthog.procyon.org.uk
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Fri Dec 13 13:50:10 2024 +0000
netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled
[ Upstream commit d0327c824338cdccad058723a31d038ecd553409 ]
When the caching for a cookie is temporarily disabled (e.g. due to a DIO
write on that file), future copying to the cache for that file is disabled
until all fds open on that file are closed. However, if netfslib is using
the deprecated PG_private_2 method (such as is currently used by ceph), and
decides it wants to copy to the cache, netfs_advance_write() will just bail
at the first check seeing that the cache stream is unavailable, and
indicate that it dealt with all the content.
This means that we have no subrequests to provide notifications to drive
the state machine or even to pin the request and the request just gets
discarded, leaving the folios with PG_private_2 set.
Fix this by jumping directly to cancel the request if the cache is not
available. That way, we don't remove mark3 from the folio_queue list and
netfs_pgpriv2_cancel() will clean up the folios.
This was found by running the generic/013 xfstest against ceph with an
active cache and the "-o fsc" option passed to ceph. That would usually
hang
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Closes: https://lore.kernel.org/r/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20241213135013.2964079-11-dhowells@redhat.com
cc: Jeff Layton <jlayton@kernel.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: netfs@lists.linux.dev
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Howells <dhowells@redhat.com>
Date: Fri Dec 13 13:50:04 2024 +0000
nfs: Fix oops in nfs_netfs_init_request() when copying to cache
[ Upstream commit 86ad1a58f6a9453f49e06ef957a40a8dac00a13f ]
When netfslib wants to copy some data that has just been read on behalf of
nfs, it creates a new write request and calls nfs_netfs_init_request() to
initialise it, but with a NULL file pointer. This causes
nfs_file_open_context() to oops - however, we don't actually need the nfs
context as we're only going to write to the cache.
Fix this by just returning if we aren't given a file pointer and emit a
warning if the request was for something other than copy-to-cache.
Further, fix nfs_netfs_free_request() so that it doesn't try to free the
context if the pointer is NULL.
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Closes: https://lore.kernel.org/r/CAKPOu+9DyMbKLhyJb7aMLDTb=Fh0T8Teb9sjuf_pze+XWT1VaQ@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20241213135013.2964079-5-dhowells@redhat.com
cc: Trond Myklebust <trondmy@kernel.org>
cc: Anna Schumaker <anna@kernel.org>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-nfs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amir Goldstein <amir73il@gmail.com>
Date: Sun Jan 5 17:24:03 2025 +0100
ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
[ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
We want to be able to encode an fid from an inode with no alias.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250105162404.357058-2-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amir Goldstein <amir73il@gmail.com>
Date: Sun Jan 5 17:24:04 2025 +0100
ovl: support encoding fid from inode with no alias
[ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
Dmitry Safonov reported that a WARN_ON() assertion can be trigered by
userspace when calling inotify_show_fdinfo() for an overlayfs watched
inode, whose dentry aliases were discarded with drop_caches.
The WARN_ON() assertion in inotify_show_fdinfo() was removed, because
it is possible for encoding file handle to fail for other reason, but
the impact of failing to encode an overlayfs file handle goes beyond
this assertion.
As shown in the LTP test case mentioned in the link below, failure to
encode an overlayfs file handle from a non-aliased inode also leads to
failure to report an fid with FAN_DELETE_SELF fanotify events.
As Dmitry notes in his analyzis of the problem, ovl_encode_fh() fails
if it cannot find an alias for the inode, but this failure can be fixed.
ovl_encode_fh() seldom uses the alias and in the case of non-decodable
file handles, as is often the case with fanotify fid info,
ovl_encode_fh() never needs to use the alias to encode a file handle.
Defer finding an alias until it is actually needed so ovl_encode_fh()
will not fail in the common case of FAN_DELETE_SELF fanotify events.
Fixes: 16aac5ad1fa9 ("ovl: support encoding non-decodable file handles")
Reported-by: Dmitry Safonov <dima@arista.com>
Closes: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250105162404.357058-3-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shannon Nelson <shannon.nelson@amd.com>
Date: Fri Jan 3 11:51:47 2025 -0800
pds_core: limit loop over fw name list
[ Upstream commit 8c817eb26230dc0ae553cee16ff43a4a895f6756 ]
Add an array size limit to the for-loop to be sure we don't try
to reference a fw_version string off the end of the fw info names
array. We know that our firmware only has a limited number
of firmware slot names, but we shouldn't leave this unchecked.
Fixes: 45d76f492938 ("pds_core: set up device and adminq")
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250103195147.7408-1-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Date: Mon Jan 6 18:40:34 2025 +0100
platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it
[ Upstream commit dd410d784402c5775f66faf8b624e85e41c38aaf ]
Wakeup for IRQ1 should be disabled only in cases where i8042 had
actually enabled it, otherwise "wake_depth" for this IRQ will try to
drop below zero and there will be an unpleasant WARN() logged:
kernel: atkbd serio0: Disabling IRQ1 wakeup source to avoid platform firmware bug
kernel: ------------[ cut here ]------------
kernel: Unbalanced IRQ 1 wake disable
kernel: WARNING: CPU: 10 PID: 6431 at kernel/irq/manage.c:920 irq_set_irq_wake+0x147/0x1a0
The PMC driver uses DEFINE_SIMPLE_DEV_PM_OPS() to define its dev_pm_ops
which sets amd_pmc_suspend_handler() to the .suspend, .freeze, and
.poweroff handlers. i8042_pm_suspend(), however, is only set as
the .suspend handler.
Fix the issue by call PMC suspend handler only from the same set of
dev_pm_ops handlers as i8042_pm_suspend(), which currently means just
the .suspend handler.
To reproduce this issue try hibernating (S4) the machine after a fresh boot
without putting it into s2idle first.
Fixes: 8e60615e8932 ("platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Link: https://lore.kernel.org/r/c8f28c002ca3c66fbeeb850904a1f43118e17200.1736184606.git.mail@maciej.szmigiero.name
[ij: edited the commit message.]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David E. Box <david.e.box@linux.intel.com>
Date: Mon Jan 6 09:46:52 2025 -0800
platform/x86: intel/pmc: Fix ioremap() of bad address
[ Upstream commit 1d7461d0c8330689117286169106af6531a747ed ]
In pmc_core_ssram_get_pmc(), the physical addresses for hidden SSRAM
devices are retrieved from the MMIO region of the primary SSRAM device.
If additional devices are not present, the address returned is zero.
Currently, the code does not check for this condition, resulting in
ioremap() incorrectly attempting to map address 0.
Add a check for a zero address and return 0 if no additional devices
are found, as it is not an error for the device to be absent.
Fixes: a01486dc4bb1 ("platform/x86/intel/pmc: Cleanup SSRAM discovery")
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20250106174653.1497128-1-david.e.box@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:37 2025 +0100
rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
commit 7f5611cbc4871c7fb1ad36c2e5a9edad63dca95c upstream.
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The per-netns structure can be obtained from the table->data using
container_of(), then the 'net' one can be retrieved from the listen
socket (if available).
Fixes: c6a58ffed536 ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-9-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Date: Thu Dec 12 00:19:08 2024 +0000
Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing"
commit d08555758fb1dbfb48f0cb58176fdc98009e6070 upstream.
This reverts commit 417d8c47271d5cf1a705e997065873b2a9a36fd4.
With that patch the panel in the Tentacruel ASUS Chromebook CM14
(CM1402F) flickers. There are 1 or 2 times per second a black panel.
Stable Kernel 6.11.5 and mainline 6.12-rc4 works only when reverse
that patch.
Fixes: 417d8c47271d ("drm/mediatek: dsi: Correct calculation formula of PHY Timing")
Cc: stable@vger.kernel.org
Cc: Shuijing Li <shuijing.li@mediatek.com>
Reported-by: Jens Ziller <zillerbaer@gmx.de>
Closes: https://patchwork.kernel.org/project/dri-devel/patch/20240412031208.30688-1-shuijing.li@mediatek.com/
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241212001908.6056-1-chunkuang.hu@kernel.org/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nam Cao <namcao@linutronix.de>
Date: Mon Nov 18 10:13:33 2024 +0100
riscv: Fix sleeping in invalid context in die()
commit 6a97f4118ac07cfdc316433f385dbdc12af5025e upstream.
die() can be called in exception handler, and therefore cannot sleep.
However, die() takes spinlock_t which can sleep with PREEMPT_RT enabled.
That causes the following warning:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex
preempt_count: 110001, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
dump_backtrace+0x1c/0x24
show_stack+0x2c/0x38
dump_stack_lvl+0x5a/0x72
dump_stack+0x14/0x1c
__might_resched+0x130/0x13a
rt_spin_lock+0x2a/0x5c
die+0x24/0x112
do_trap_insn_illegal+0xa0/0xea
_new_vmalloc_restore_context_a0+0xcc/0xd8
Oops - illegal instruction [#1]
Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT
enabled.
Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nam Cao <namcao@linutronix.de>
Date: Tue Nov 19 12:10:56 2024 +0100
riscv: kprobes: Fix incorrect address calculation
commit 13134cc949148e1dfa540a0fe5dc73569bc62155 upstream.
p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are
multiplied by four. This is clearly undesirable for this case.
Cast it to (void *) first before any calculation.
Below is a sample before/after. The dumped memory is two kprobe slots, the
first slot has
- c.addiw a0, 0x1c (0x7125)
- ebreak (0x00100073)
and the second slot has:
- c.addiw a0, -4 (0x7135)
- ebreak (0x00100073)
Before this patch:
(gdb) x/16xh 0xff20000000135000
0xff20000000135000: 0x7125 0x0000 0x0000 0x0000 0x7135 0x0010 0x0000 0x0000
0xff20000000135010: 0x0073 0x0010 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
After this patch:
(gdb) x/16xh 0xff20000000125000
0xff20000000125000: 0x7125 0x0073 0x0010 0x0000 0x7135 0x0073 0x0010 0x0000
0xff20000000125010: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
Fixes: b1756750a397 ("riscv: kprobes: Use patch_text_nosync() for insn slots")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xu Lu <luxu.kernel@bytedance.com>
Date: Mon Dec 9 20:26:17 2024 +0800
riscv: mm: Fix the out of bound issue of vmemmap address
[ Upstream commit f754f27e98f88428aaf6be6e00f5cbce97f62d4b ]
In sparse vmemmap model, the virtual address of vmemmap is calculated as:
((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)).
And the struct page's va can be calculated with an offset:
(vmemmap + (pfn)).
However, when initializing struct pages, kernel actually starts from the
first page from the same section that phys_ram_base belongs to. If the
first page's physical address is not (phys_ram_base >> PAGE_SHIFT), then
we get an va below VMEMMAP_START when calculating va for it's struct page.
For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the
first page in the same section is actually pfn 0x80000. During
init_unavailable_range(), we will initialize struct page for pfn 0x80000
with virtual address ((struct page *)VMEMMAP_START - 0x2000), which is
below VMEMMAP_START as well as PCI_IO_END.
This commit fixes this bug by introducing a new variable
'vmemmap_start_pfn' which is aligned with memory section size and using
it to calculate vmemmap address instead of phys_ram_base.
Fixes: a11dd49dcb93 ("riscv: Sparse-Memory/vmemmap out-of-bounds fix")
Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Clément Léger <cleger@rivosinc.com>
Date: Thu Nov 28 09:16:34 2024 +0100
riscv: module: remove relocation_head rel_entry member allocation
[ Upstream commit 03f0b548537f758830bdb2dc3f2aba713069cef2 ]
relocation_head's list_head member, rel_entry, doesn't need to be
allocated, its storage can just be part of the allocated relocation_head.
Remove the pointer which allows to get rid of the allocation as well as
an existing memory leak found by Kai Zhang using kmemleak.
Fixes: 8fd6c5142395 ("riscv: Add remaining module relocations")
Reported-by: Kai Zhang <zhangkai@iscas.ac.cn>
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241128081636.3620468-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Clément Léger <cleger@rivosinc.com>
Date: Mon Dec 9 16:57:12 2024 +0100
riscv: stacktrace: fix backtracing through exceptions
[ Upstream commit 51356ce60e5915a6bd812873186ed54e45c2699d ]
Prior to commit 5d5fc33ce58e ("riscv: Improve exception and system call
latency"), backtrace through exception worked since ra was filled with
ret_from_exception symbol address and the stacktrace code checked 'pc' to
be equal to that symbol. Now that handle_exception uses regular 'call'
instructions, this isn't working anymore and backtrace stops at
handle_exception(). Since there are multiple call site to C code in the
exception handling path, rather than checking multiple potential return
addresses, add a new symbol at the end of exception handling and check pc
to be in that range.
Fixes: 5d5fc33ce58e ("riscv: Improve exception and system call latency")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241209155714.1239665-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Clément Léger <cleger@rivosinc.com>
Date: Fri Jan 3 15:17:58 2025 +0100
riscv: use local label names instead of global ones in assembly
[ Upstream commit 5cd900b8b7e42c492431eb4261c18927768db1f9 ]
Local labels should be prefix by '.L' or they'll be exported in the
symbol table. Additionally, this messes up the backtrace by displaying
an incorrect symbol:
...
[ 12.751810] [<ffffffff80441628>] _copy_from_user+0x28/0xc2
[ 12.752035] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
[ 12.752310] [<ffffffff80a033e8>] do_trap_load_misaligned+0x24/0xee
[ 12.752596] [<ffffffff80a0dcae>] _new_vmalloc_restore_context_a0+0xc2/0xce
After:
...
[ 10.243916] [<ffffffff804415e4>] _copy_from_user+0x28/0xc2
[ 10.244026] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
[ 10.244150] [<ffffffff80a033a0>] do_trap_load_misaligned+0x24/0xee
[ 10.244268] [<ffffffff80a0dc66>] handle_exception+0x146/0x152
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 503638e0babf3 ("riscv: Stop emitting preventive sfence.vma for new vmalloc mappings")
Link: https://lore.kernel.org/r/20250103141814.508865-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Wed Jan 8 12:15:53 2025 +0300
rtase: Fix a check for error in rtase_alloc_msix()
[ Upstream commit 2055272e3ae01a954e41a5afb437c5d76f758e0b ]
The pci_irq_vector() function never returns zero. It returns negative
error codes or a positive non-zero IRQ number. Fix the error checking to
test for negatives.
Fixes: a36e9f5cfe9e ("rtase: Add support for a pci table in this module")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/f2ecc88d-af13-4651-9820-7cc665230019@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Tue Jan 7 13:01:05 2025 +0100
sched: sch_cake: add bounds checks to host bulk flow fairness counts
[ Upstream commit 737d4d91d35b5f7fa5bb442651472277318b0bfd ]
Even though we fixed a logic error in the commit cited below, syzbot
still managed to trigger an underflow of the per-host bulk flow
counters, leading to an out of bounds memory access.
To avoid any such logic errors causing out of bounds memory accesses,
this commit factors out all accesses to the per-host bulk flow counters
to a series of helpers that perform bounds-checking before any
increments and decrements. This also has the benefit of improving
readability by moving the conditional checks for the flow mode into
these helpers, instead of having them spread out throughout the
code (which was the cause of the original logic error).
As part of this change, the flow quantum calculation is consolidated
into a helper function, which means that the dithering applied to the
ost load scaling is now applied both in the DRR rotation and when a
sparse flow's quantum is first initiated. The only user-visible effect
of this is that the maximum packet size that can be sent while a flow
stays sparse will now vary with +/- one byte in some cases. This should
not make a noticeable difference in practice, and thus it's not worth
complicating the code to preserve the old behaviour.
Fixes: 546ea84d07e3 ("sched: sch_cake: fix bulk flow accounting logic for host fairness")
Reported-by: syzbot+f63600d288bfb7057424@syzkaller.appspotmail.com
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Link: https://patch.msgid.link/20250107120105.70685-1-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andrea Righi <arighi@nvidia.com>
Date: Fri Jan 10 23:16:31 2025 +0100
sched_ext: idle: Refresh idle masks during idle-to-idle transitions
[ Upstream commit a2a3374c47c428c0edb0bbc693638d4783f81e31 ]
With the consolidation of put_prev_task/set_next_task(), see
commit 436f3eed5c69 ("sched: Combine the last put_prev_task() and the
first set_next_task()"), we are now skipping the transition between
these two functions when the previous and the next tasks are the same.
As a result, the scx idle state of a CPU is updated only when
transitioning to or from the idle thread. While this is generally
correct, it can lead to uneven and inefficient core utilization in
certain scenarios [1].
A typical scenario involves proactive wake-ups: scx_bpf_pick_idle_cpu()
selects and marks an idle CPU as busy, followed by a wake-up via
scx_bpf_kick_cpu(), without dispatching any tasks. In this case, the CPU
continues running the idle thread, returns to idle, but remains marked
as busy, preventing it from being selected again as an idle CPU (until a
task eventually runs on it and releases the CPU).
For example, running a workload that uses 20% of each CPU, combined with
an scx scheduler using proactive wake-ups, results in the following core
utilization:
CPU 0: 25.7%
CPU 1: 29.3%
CPU 2: 26.5%
CPU 3: 25.5%
CPU 4: 0.0%
CPU 5: 25.5%
CPU 6: 0.0%
CPU 7: 10.5%
To address this, refresh the idle state also in pick_task_idle(), during
idle-to-idle transitions, but only trigger ops.update_idle() on actual
state changes to prevent unnecessary updates to the scx scheduler and
maintain balanced state transitions.
With this change in place, the core utilization in the previous example
becomes the following:
CPU 0: 18.8%
CPU 1: 19.4%
CPU 2: 18.0%
CPU 3: 18.7%
CPU 4: 19.3%
CPU 5: 18.9%
CPU 6: 18.7%
CPU 7: 19.3%
[1] https://github.com/sched-ext/scx/pull/1139
Fixes: 7c65ae81ea86 ("sched_ext: Don't call put_prev_task_scx() before picking the next task")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Changwoo Min <changwoo@igalia.com>
Date: Thu Jan 9 00:08:06 2025 +0900
sched_ext: Replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass()
[ Upstream commit 6268d5bc10354fc2ab8d44a0cd3b042d49a0417e ]
scx_ops_bypass() iterates all CPUs to re-enqueue all the scx tasks.
For each CPU, it acquires a lock using rq_lock() regardless of whether
a CPU is offline or the CPU is currently running a task in a higher
scheduler class (e.g., deadline). The rq_lock() is supposed to be used
for online CPUs, and the use of rq_lock() may trigger an unnecessary
warning in rq_pin_lock(). Therefore, replace rq_lock() to
raw_spin_rq_lock() in scx_ops_bypass().
Without this change, we observe the following warning:
===== START =====
[ 6.615205] rq->balance_callback && rq->balance_callback != &balance_push_callback
[ 6.615208] WARNING: CPU: 2 PID: 0 at kernel/sched/sched.h:1730 __schedule+0x1130/0x1c90
===== END =====
Fixes: 0e7ffff1b811 ("scx: Fix raciness in scx_ops_bypass()")
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Honglei Wang <jameshongleiwang@126.com>
Date: Wed Jan 8 10:33:28 2025 +0800
sched_ext: switch class when preempted by higher priority scheduler
[ Upstream commit 68e449d849fd50bd5e61d8bd32b3458dbd3a3df6 ]
ops.cpu_release() function, if defined, must be invoked when preempted by
a higher priority scheduler class task. This scenario was skipped in
commit f422316d7466 ("sched_ext: Remove switch_class_scx()"). Let's fix
it.
Fixes: f422316d7466 ("sched_ext: Remove switch_class_scx()")
Signed-off-by: Honglei Wang <jameshongleiwang@126.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Date: Thu Dec 19 22:20:41 2024 +0530
scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence()
commit 7bac65687510038390a0a54cbe14fba08d037e46 upstream.
PHY might already be powered on during ufs_qcom_power_up_sequence() in a
couple of cases:
1. During UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH quirk
2. Resuming from spm_lvl = 5 suspend
In those cases, it is necessary to call phy_power_off() and phy_exit() in
ufs_qcom_power_up_sequence() function to power off the PHY before calling
phy_init() and phy_power_on().
Case (1) is doing it via ufs_qcom_reinit_notify() callback, but case (2) is
not handled. So to satisfy both cases, call phy_power_off() and phy_exit()
if the phy_count is non-zero. And with this change, the reinit_notify()
callback is no longer needed.
This fixes the below UFS resume failure with spm_lvl = 5:
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5
ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume returns -5
ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error -5
Cc: stable@vger.kernel.org # 6.3
Fixes: baf5ddac90dc ("scsi: ufs: ufs-qcom: Add support for reinitializing the UFS device")
Reported-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com>
Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-1-63c4b95a70b9@linaro.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:34 2025 +0100
sctp: sysctl: auth_enable: avoid using current->nsproxy
commit 15649fd5415eda664ef35780c2013adeb5d9c695 upstream.
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, but that would
increase the size of this fix, while 'sctp.ctl_sock' still needs to be
retrieved from 'net' structure.
Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-6-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:32 2025 +0100
sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
commit ea62dd1383913b5999f3d16ae99d411f41b528d4 upstream.
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, as this is the only
member needed from the 'net' structure, but that would increase the size
of this fix, to use '*data' everywhere 'net->sctp.sctp_hmac_alg' is
used.
Fixes: 3c68198e7511 ("sctp: Make hmac algorithm selection for cookie generation dynamic")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-4-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:36 2025 +0100
sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
commit 6259d2484d0ceff42245d1f09cc8cb6ee72d847a upstream.
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, as this is the only
member needed from the 'net' structure, but that would increase the size
of this fix, to use '*data' everywhere 'net->sctp.probe_interval' is
used.
Fixes: d1e462a7a5f3 ("sctp: add probe_interval in sysctl and sock/asoc/transport")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-8-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:33 2025 +0100
sctp: sysctl: rto_min/max: avoid using current->nsproxy
commit 9fc17b76fc70763780aa78b38fcf4742384044a5 upstream.
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, as this is the only
member needed from the 'net' structure, but that would increase the size
of this fix, to use '*data' everywhere 'net->sctp.rto_min/max' is used.
Fixes: 4f3fdf3bc59c ("sctp: add check rto_min and rto_max in sysctl")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-5-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Wed Jan 8 16:34:35 2025 +0100
sctp: sysctl: udp_port: avoid using current->nsproxy
commit c10377bbc1972d858eaf0ab366a311b39f8ef1b6 upstream.
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, but that would
increase the size of this fix, while 'sctp.ctl_sock' still needs to be
retrieved from 'net' structure.
Fixes: 046c052b475e ("sctp: enable udp tunneling socks")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1]
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-7-5df34b2083e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Li Zhijian <lizhijian@fujitsu.com>
Date: Wed Dec 18 10:59:31 2024 +0800
selftests/alsa: Fix circular dependency involving global-timer
[ Upstream commit 55853cb829dc707427c3519f6b8686682a204368 ]
The pattern rule `$(OUTPUT)/%: %.c` inadvertently included a circular
dependency on the global-timer target due to its inclusion in
$(TEST_GEN_PROGS_EXTENDED). This resulted in a circular dependency
warning during the build process.
To resolve this, the dependency on $(TEST_GEN_PROGS_EXTENDED) has been
replaced with an explicit dependency on $(OUTPUT)/libatest.so. This change
ensures that libatest.so is built before any other targets that require it,
without creating a circular dependency.
This fix addresses the following warning:
make[4]: Entering directory 'tools/testing/selftests/alsa'
make[4]: Circular default_modconfig/kselftest/alsa/global-timer <- default_modconfig/kselftest/alsa/global-timer dependency dropped.
make[4]: Nothing to be done for 'all'.
make[4]: Leaving directory 'tools/testing/selftests/alsa'
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://patch.msgid.link/20241218025931.914164-1-lizhijian@fujitsu.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Date: Mon Dec 16 09:53:23 2024 -0500
serial: stm32: use port lock wrappers for break control
commit 0cfc36ea51684b5932cd3951ded523777d807af2 upstream.
Commit 30e945861f3b ("serial: stm32: add support for break control")
added another usage of the port lock, but was merged on the same day as
c5d06662551c ("serial: stm32: Use port lock wrappers"), therefore the
latter did not update this usage to use the port lock wrappers.
Fixes: c5d06662551c ("serial: stm32: Use port lock wrappers")
Cc: stable <stable@kernel.org>
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Link: https://lore.kernel.org/r/20241216145323.111612-1-ben.wolsieffer@hefring.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Meetakshi Setiya <msetiya@microsoft.com>
Date: Wed Jan 8 05:10:34 2025 -0500
smb: client: sync the root session and superblock context passwords before automounting
commit 20b1aa912316ffb7fbb5f407f17c330f2a22ddff upstream.
In some cases, when password2 becomes the working password, the
client swaps the two password fields in the root session struct, but
not in the smb3_fs_context struct in cifs_sb. DFS automounts inherit
fs context from their parent mounts. Therefore, they might end up
getting the passwords in the stale order.
The automount should succeed, because the mount function will end up
retrying with the actual password anyway. But to reduce these
unnecessary session setup retries for automounts, we can sync the
parent context's passwords with the root session's passwords before
duplicating it to the child's fs context.
Cc: stable@vger.kernel.org
Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zicheng Qu <quzicheng@huawei.com>
Date: Thu Nov 7 01:10:15 2024 +0000
staging: iio: ad9832: Correct phase range check
commit 4636e859ebe0011f41e35fa79bab585b8004e9a3 upstream.
User Perspective:
When a user sets the phase value, the ad9832_write_phase() is called.
The phase register has a 12-bit resolution, so the valid range is 0 to
4095. If the phase offset value of 4096 is input, it effectively exactly
equals 0 in the lower 12 bits, meaning no offset.
Reasons for the Change:
1) Original Condition (phase > BIT(AD9832_PHASE_BITS)):
This condition allows a phase value equal to 2^12, which is 4096.
However, this value exceeds the valid 12-bit range, as the maximum valid
phase value should be 4095.
2) Modified Condition (phase >= BIT(AD9832_PHASE_BITS)):
Ensures that the phase value is within the valid range, preventing
invalid datafrom being written.
Impact on Subsequent Logic: st->data = cpu_to_be16(addr | phase):
If the phase value is 2^12, i.e., 4096 (0001 0000 0000 0000), and addr
is AD9832_REG_PHASE0 (1100 0000 0000 0000), then addr | phase results in
1101 0000 0000 0000, occupying DB12. According to the section of WRITING
TO A PHASE REGISTER in the datasheet, the MSB 12 PHASE0 bits should be
DB11. The original condition leads to incorrect DB12 usage, which
contradicts the datasheet and could pose potential issues for future
updates if DB12 is used in such related cases.
Fixes: ea707584bac1 ("Staging: IIO: DDS: AD9832 / AD9835 driver")
Cc: stable@vger.kernel.org
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Link: https://patch.msgid.link/20241107011015.2472600-3-quzicheng@huawei.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zicheng Qu <quzicheng@huawei.com>
Date: Thu Nov 7 01:10:14 2024 +0000
staging: iio: ad9834: Correct phase range check
commit c0599762f0c7e260b99c6b7bceb8eae69b804c94 upstream.
User Perspective:
When a user sets the phase value, the ad9834_write_phase() is called.
The phase register has a 12-bit resolution, so the valid range is 0 to
4095. If the phase offset value of 4096 is input, it effectively exactly
equals 0 in the lower 12 bits, meaning no offset.
Reasons for the Change:
1) Original Condition (phase > BIT(AD9834_PHASE_BITS)):
This condition allows a phase value equal to 2^12, which is 4096.
However, this value exceeds the valid 12-bit range, as the maximum valid
phase value should be 4095.
2) Modified Condition (phase >= BIT(AD9834_PHASE_BITS)):
Ensures that the phase value is within the valid range, preventing
invalid datafrom being written.
Impact on Subsequent Logic: st->data = cpu_to_be16(addr | phase):
If the phase value is 2^12, i.e., 4096 (0001 0000 0000 0000), and addr
is AD9834_REG_PHASE0 (1100 0000 0000 0000), then addr | phase results in
1101 0000 0000 0000, occupying DB12. According to the section of WRITING
TO A PHASE REGISTER in the datasheet, the MSB 12 PHASE0 bits should be
DB11. The original condition leads to incorrect DB12 usage, which
contradicts the datasheet and could pose potential issues for future
updates if DB12 is used in such related cases.
Fixes: 12b9d5bf76bf ("Staging: IIO: DDS: AD9833 / AD9834 driver")
Cc: stable@vger.kernel.org
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/20241107011015.2472600-2-quzicheng@huawei.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zhongqiu Duan <dzq.aishenghu0@gmail.com>
Date: Thu Jan 2 17:14:26 2025 +0000
tcp/dccp: allow a connection when sk_max_ack_backlog is zero
[ Upstream commit 3479c7549fb1dfa7a1db4efb7347c7b8ef50de4b ]
If the backlog of listen() is set to zero, sk_acceptq_is_full() allows
one connection to be made, but inet_csk_reqsk_queue_is_full() does not.
When the net.ipv4.tcp_syncookies is zero, inet_csk_reqsk_queue_is_full()
will cause an immediate drop before the sk_acceptq_is_full() check in
tcp_conn_request(), resulting in no connection can be made.
This patch tries to keep consistent with 64a146513f8f ("[NET]: Revert
incorrect accept queue backlog changes.").
Link: https://lore.kernel.org/netdev/20250102080258.53858-1-kuniyu@amazon.com/
Fixes: ef547f2ac16b ("tcp: remove max_qlen_log")
Signed-off-by: Zhongqiu Duan <dzq.aishenghu0@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250102171426.915276-1-dzq.aishenghu0@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Borkmann <daniel@iogearbox.net>
Date: Tue Jan 7 11:14:39 2025 +0100
tcp: Annotate data-race around sk->sk_mark in tcp_v4_send_reset
[ Upstream commit 80fb40baba19e25a1b6f3ecff6fc5c0171806bde ]
This is a follow-up to 3c5b4d69c358 ("net: annotate data-races around
sk->sk_mark"). sk->sk_mark can be read and written without holding
the socket lock. IPv6 equivalent is already covered with READ_ONCE()
annotation in tcp_v6_send_response().
Fixes: 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/f459d1fc44f205e13f6d8bdca2c8bfb9902ffac9.1736244569.git.daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date: Tue Dec 24 12:18:09 2024 +0900
thermal: of: fix OF node leak in of_thermal_zone_find()
[ Upstream commit 9164e0912af206a72ddac4915f7784e470a04ace ]
of_thermal_zone_find() calls of_parse_phandle_with_args(), but does not
release the OF node reference obtained by it.
Add a of_node_put() call when the call is successful.
Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Link: https://patch.msgid.link/20241224031809.950461-1-joe@pf.is.s.u-tokyo.ac.jp
[ rjw: Changelog edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Benjamin Coddington <bcodding@redhat.com>
Date: Sat Jan 4 10:29:45 2025 -0500
tls: Fix tls_sw_sendmsg error handling
[ Upstream commit b341ca51d2679829d26a3f6a4aa9aee9abd94f92 ]
We've noticed that NFS can hang when using RPC over TLS on an unstable
connection, and investigation shows that the RPC layer is stuck in a tight
loop attempting to transmit, but forever getting -EBADMSG back from the
underlying network. The loop begins when tcp_sendmsg_locked() returns
-EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when
calling the socket's error reporting handler.
Instead of converting errors from tcp_sendmsg_locked(), let's pass them
along in this path. The RPC layer handles -EPIPE by reconnecting the
transport, which prevents the endless attempts to transmit on a broken
connection.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Link: https://patch.msgid.link/9594185559881679d81f071b181a10eb07cd079f.1736004079.git.bcodding@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Li Huafei <lihuafei1@huawei.com>
Date: Thu Nov 14 19:01:41 2024 +0800
topology: Keep the cpumask unchanged when printing cpumap
commit cbd399f78e23ad4492c174fc5e6b3676dba74a52 upstream.
During fuzz testing, the following warning was discovered:
different return values (15 and 11) from vsnprintf("%*pbl
", ...)
test:keyward is WARNING in kvasprintf
WARNING: CPU: 55 PID: 1168477 at lib/kasprintf.c:30 kvasprintf+0x121/0x130
Call Trace:
kvasprintf+0x121/0x130
kasprintf+0xa6/0xe0
bitmap_print_to_buf+0x89/0x100
core_siblings_list_read+0x7e/0xb0
kernfs_file_read_iter+0x15b/0x270
new_sync_read+0x153/0x260
vfs_read+0x215/0x290
ksys_read+0xb9/0x160
do_syscall_64+0x56/0x100
entry_SYSCALL_64_after_hwframe+0x78/0xe2
The call trace shows that kvasprintf() reported this warning during the
printing of core_siblings_list. kvasprintf() has several steps:
(1) First, calculate the length of the resulting formatted string.
(2) Allocate a buffer based on the returned length.
(3) Then, perform the actual string formatting.
(4) Check whether the lengths of the formatted strings returned in
steps (1) and (2) are consistent.
If the core_cpumask is modified between steps (1) and (3), the lengths
obtained in these two steps may not match. Indeed our test includes cpu
hotplugging, which should modify core_cpumask while printing.
To fix this issue, cache the cpumask into a temporary variable before
calling cpumap_print_{list, cpumask}_to_buf(), to keep it unchanged
during the printing process.
Fixes: bb9ec13d156e ("topology: use bin_attribute to break the size limitation of cpumap ABI")
Cc: stable <stable@kernel.org>
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20241114110141.94725-1-lihuafei1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date: Tue Dec 10 19:01:20 2024 +0200
tty: serial: 8250: Fix another runtime PM usage counter underflow
commit ed2761958ad77e54791802b07095786150eab844 upstream.
The commit f9b11229b79c ("serial: 8250: Fix PM usage_count for console
handover") fixed one runtime PM usage counter balance problem that
occurs because .dev is not set during univ8250 setup preventing call to
pm_runtime_get_sync(). Later, univ8250_console_exit() will trigger the
runtime PM usage counter underflow as .dev is already set at that time.
Call pm_runtime_get_sync() to balance the RPM usage counter also in
serial8250_register_8250_port() before trying to add the port.
Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
Fixes: bedb404e91bb ("serial: 8250_port: Don't use power management for kernel console")
Cc: stable <stable@kernel.org>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20241210170120.2231-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Lubomir Rintel <lrintel@redhat.com>
Date: Wed Jan 1 22:22:06 2025 +0100
usb-storage: Add max sectors quirk for Nokia 208
commit cdef30e0774802df2f87024d68a9d86c3b99ca2a upstream.
This fixes data corruption when accessing the internal SD card in mass
storage mode.
I am actually not too sure why. I didn't figure a straightforward way to
reproduce the issue, but i seem to get garbage when issuing a lot (over 50)
of large reads (over 120 sectors) are done in a quick succession. That is,
time seems to matter here -- larger reads are fine if they are done with
some delay between them.
But I'm not great at understanding this sort of things, so I'll assume
the issue other, smarter, folks were seeing with similar phones is the
same problem and I'll just put my quirk next to theirs.
The "Software details" screen on the phone is as follows:
V 04.06
07-08-13
RM-849
(c) Nokia
TL;DR version of the device descriptor:
idVendor 0x0421 Nokia Mobile Phones
idProduct 0x06c2
bcdDevice 4.06
iManufacturer 1 Nokia
iProduct 2 Nokia 208
The patch assumes older firmwares are broken too (I'm unable to test, but
no biggie if they aren't I guess), and I have no idea if newer firmware
exists.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: stable <stable@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20250101212206.2386207-1-lkundrak@v3.sk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Date: Mon Dec 16 10:55:39 2024 +0900
usb: chipidea: ci_hdrc_imx: decrement device's refcount in .remove() and in the error path of .probe()
commit 74adad500346fb07d69af2c79acbff4adb061134 upstream.
Current implementation of ci_hdrc_imx_driver does not decrement the
refcount of the device obtained in usbmisc_get_init_data(). Add a
put_device() call in .remove() and in .probe() before returning an
error.
This bug was found by an experimental static analysis tool that I am
developing.
Cc: stable <stable@kernel.org>
Fixes: f40017e0f332 ("chipidea: usbmisc_imx: Add USB support for VF610 SoCs")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Acked-by: Peter Chen <peter.chen@kernel.org>
Link: https://lore.kernel.org/r/20241216015539.352579-1-joe@pf.is.s.u-tokyo.ac.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kai-Heng Feng <kaihengf@nvidia.com>
Date: Fri Dec 6 15:48:17 2024 +0800
USB: core: Disable LPM only for non-suspended ports
commit 59bfeaf5454b7e764288d84802577f4a99bf0819 upstream.
There's USB error when tegra board is shutting down:
[ 180.919315] usb 2-3: Failed to set U1 timeout to 0x0,error code -113
[ 180.919995] usb 2-3: Failed to set U1 timeout to 0xa,error code -113
[ 180.920512] usb 2-3: Failed to set U2 timeout to 0x4,error code -113
[ 186.157172] tegra-xusb 3610000.usb: xHCI host controller not responding, assume dead
[ 186.157858] tegra-xusb 3610000.usb: HC died; cleaning up
[ 186.317280] tegra-xusb 3610000.usb: Timeout while waiting for evaluate context command
The issue is caused by disabling LPM on already suspended ports.
For USB2 LPM, the LPM is already disabled during port suspend. For USB3
LPM, port won't transit to U1/U2 when it's already suspended in U3,
hence disabling LPM is only needed for ports that are not suspended.
Cc: Wayne Chang <waynec@nvidia.com>
Cc: stable <stable@kernel.org>
Fixes: d920a2ed8620 ("usb: Disable USB3 LPM at shutdown")
Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20241206074817.89189-1-kaihengf@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Prashanth K <quic_prashk@quicinc.com>
Date: Mon Dec 9 16:27:28 2024 +0530
usb: dwc3-am62: Disable autosuspend during remove
commit 625e70ccb7bbbb2cc912e23c63390946170c085c upstream.
Runtime PM documentation (Section 5) mentions, during remove()
callbacks, drivers should undo the runtime PM changes done in
probe(). Usually this means calling pm_runtime_disable(),
pm_runtime_dont_use_autosuspend() etc. Hence add missing
function to disable autosuspend on dwc3-am62 driver unbind.
Fixes: e8784c0aec03 ("drivers: usb: dwc3: Add AM62 USB wrapper driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20241209105728.3216872-1-quic_prashk@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: André Draszik <andre.draszik@linaro.org>
Date: Mon Dec 9 11:49:53 2024 +0000
usb: dwc3: gadget: fix writing NYET threshold
commit 01ea6bf5cb58b20cc1bd159f0cf74a76cf04bb69 upstream.
Before writing a new value to the register, the old value needs to be
masked out for the new value to be programmed as intended, because at
least in some cases the reset value of that field is 0xf (max value).
At the moment, the dwc3 core initialises the threshold to the maximum
value (0xf), with the option to override it via a DT. No upstream DTs
seem to override it, therefore this commit doesn't change behaviour for
any upstream platform. Nevertheless, the code should be fixed to have
the desired outcome.
Do so.
Fixes: 80caf7d21adc ("usb: dwc3: add lpm erratum support")
Cc: stable@vger.kernel.org # 5.10+ (needs adjustment for 5.4)
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20241209-dwc3-nyet-fix-v2-1-02755683345b@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ma Ke <make_ruc2021@163.com>
Date: Wed Dec 18 15:13:46 2024 +0800
usb: fix reference leak in usb_new_device()
commit 0df11fa8cee5a9cf8753d4e2672bb3667138c652 upstream.
When device_add(&udev->dev) succeeds and a later call fails,
usb_new_device() does not properly call device_del(). As comment of
device_add() says, 'if device_add() succeeds, you should call
device_del() when you want to get rid of it. If device_add() has not
succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable <stable@kernel.org>
Fixes: 9f8b17e643fe ("USB: make usbdevices export their device nodes instead of using a separate class")
Signed-off-by: Ma Ke <make_ruc2021@163.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20241218071346.2973980-1-make_ruc2021@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ingo Rohloff <ingo.rohloff@lauterbach.com>
Date: Thu Dec 12 16:41:14 2024 +0100
usb: gadget: configfs: Ignore trailing LF for user strings to cdev
commit 9466545720e231fc02acd69b5f4e9138e09a26f6 upstream.
Since commit c033563220e0f7a8
("usb: gadget: configfs: Attach arbitrary strings to cdev")
a user can provide extra string descriptors to a USB gadget via configfs.
For "manufacturer", "product", "serialnumber", setting the string via
configfs ignores a trailing LF.
For the arbitrary strings the LF was not ignored.
This patch ignores a trailing LF to make this consistent with the existing
behavior for "manufacturer", ... string descriptors.
Fixes: c033563220e0 ("usb: gadget: configfs: Attach arbitrary strings to cdev")
Cc: stable <stable@kernel.org>
Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com>
Link: https://lore.kernel.org/r/20241212154114.29295-1-ingo.rohloff@lauterbach.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Akash M <akash.m5@samsung.com>
Date: Thu Dec 19 18:22:19 2024 +0530
usb: gadget: f_fs: Remove WARN_ON in functionfs_bind
commit dfc51e48bca475bbee984e90f33fdc537ce09699 upstream.
This commit addresses an issue related to below kernel panic where
panic_on_warn is enabled. It is caused by the unnecessary use of WARN_ON
in functionsfs_bind, which easily leads to the following scenarios.
1.adb_write in adbd 2. UDC write via configfs
================= =====================
->usb_ffs_open_thread() ->UDC write
->open_functionfs() ->configfs_write_iter()
->adb_open() ->gadget_dev_desc_UDC_store()
->adb_write() ->usb_gadget_register_driver_owner
->driver_register()
->StartMonitor() ->bus_add_driver()
->adb_read() ->gadget_bind_driver()
<times-out without BIND event> ->configfs_composite_bind()
->usb_add_function()
->open_functionfs() ->ffs_func_bind()
->adb_open() ->functionfs_bind()
<ffs->state !=FFS_ACTIVE>
The adb_open, adb_read, and adb_write operations are invoked from the
daemon, but trying to bind the function is a process that is invoked by
UDC write through configfs, which opens up the possibility of a race
condition between the two paths. In this race scenario, the kernel panic
occurs due to the WARN_ON from functionfs_bind when panic_on_warn is
enabled. This commit fixes the kernel panic by removing the unnecessary
WARN_ON.
Kernel panic - not syncing: kernel: panic_on_warn set ...
[ 14.542395] Call trace:
[ 14.542464] ffs_func_bind+0x1c8/0x14a8
[ 14.542468] usb_add_function+0xcc/0x1f0
[ 14.542473] configfs_composite_bind+0x468/0x588
[ 14.542478] gadget_bind_driver+0x108/0x27c
[ 14.542483] really_probe+0x190/0x374
[ 14.542488] __driver_probe_device+0xa0/0x12c
[ 14.542492] driver_probe_device+0x3c/0x220
[ 14.542498] __driver_attach+0x11c/0x1fc
[ 14.542502] bus_for_each_dev+0x104/0x160
[ 14.542506] driver_attach+0x24/0x34
[ 14.542510] bus_add_driver+0x154/0x270
[ 14.542514] driver_register+0x68/0x104
[ 14.542518] usb_gadget_register_driver_owner+0x48/0xf4
[ 14.542523] gadget_dev_desc_UDC_store+0xf8/0x144
[ 14.542526] configfs_write_iter+0xf0/0x138
Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Akash M <akash.m5@samsung.com>
Link: https://lore.kernel.org/r/20241219125221.1679-1-akash.m5@samsung.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Prashanth K <quic_prashk@quicinc.com>
Date: Wed Dec 11 17:29:15 2024 +0530
usb: gadget: f_uac2: Fix incorrect setting of bNumEndpoints
commit 057bd54dfcf68b1f67e6dfc32a47a72e12198495 upstream.
Currently afunc_bind sets std_ac_if_desc.bNumEndpoints to 1 if
controls (mute/volume) are enabled. During next afunc_bind call,
bNumEndpoints would be unchanged and incorrectly set to 1 even
if the controls aren't enabled.
Fix this by resetting the value of bNumEndpoints to 0 on every
afunc_bind call.
Fixes: eaf6cbe09920 ("usb: gadget: f_uac2: add volume and mute support")
Cc: stable <stable@kernel.org>
Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
Link: https://lore.kernel.org/r/20241211115915.159864-1-quic_prashk@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Wed Jan 1 14:11:19 2025 +0100
usb: gadget: midi2: Reverse-select at the right place
commit 6f660ffce7c938f2a5d8473c0e0b45e4fb25ef7f upstream.
We should do reverse selection of other components from
CONFIG_USB_F_MIDI2 which is tristate, instead of
CONFIG_USB_CONFIGFS_F_MIDI2 which is bool, for satisfying subtle
module dependencies.
Fixes: 8b645922b223 ("usb: gadget: Add support for USB MIDI 2.0 function driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20250101131124.27599-1-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Lianqin Hu <hulianqin@vivo.com>
Date: Tue Dec 17 07:58:44 2024 +0000
usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null
commit 13014969cbf07f18d62ceea40bd8ca8ec9d36cec upstream.
Considering that in some extreme cases, when performing the
unbinding operation, gserial_disconnect has cleared gser->ioport,
which triggers gadget reconfiguration, and then calls gs_read_complete,
resulting in access to a null pointer. Therefore, ep is disabled before
gserial_disconnect sets port to null to prevent this from happening.
Call trace:
gs_read_complete+0x58/0x240
usb_gadget_giveback_request+0x40/0x160
dwc3_remove_requests+0x170/0x484
dwc3_ep0_out_start+0xb0/0x1d4
__dwc3_gadget_start+0x25c/0x720
kretprobe_trampoline.cfi_jt+0x0/0x8
kretprobe_trampoline.cfi_jt+0x0/0x8
udc_bind_to_driver+0x1d8/0x300
usb_gadget_probe_driver+0xa8/0x1dc
gadget_dev_desc_UDC_store+0x13c/0x188
configfs_write_iter+0x160/0x1f4
vfs_write+0x2d0/0x40c
ksys_write+0x7c/0xf0
__arm64_sys_write+0x20/0x30
invoke_syscall+0x60/0x150
el0_svc_common+0x8c/0xf8
do_el0_svc+0x28/0xa0
el0_svc+0x24/0x84
Fixes: c1dca562be8a ("usb gadget: split out serial core")
Cc: stable <stable@kernel.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lianqin Hu <hulianqin@vivo.com>
Link: https://lore.kernel.org/r/TYUPR06MB621733B5AC690DBDF80A0DCCD2042@TYUPR06MB6217.apcprd06.prod.outlook.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xu Yang <xu.yang_2@nxp.com>
Date: Mon Dec 9 19:14:23 2024 +0800
usb: host: xhci-plat: set skip_phy_initialization if software node has XHCI_SKIP_PHY_INIT property
commit e19852d0bfecbc80976b1423cf2af87ca514a58c upstream.
The source of quirk XHCI_SKIP_PHY_INIT comes from xhci_plat_priv.quirks or
software node property. This will set skip_phy_initialization if software
node also has XHCI_SKIP_PHY_INIT property.
Fixes: a6cd2b3fa894 ("usb: host: xhci-plat: Parse xhci-missing_cas_quirk and apply quirk")
Cc: stable <stable@kernel.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20241209111423.4085548-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Johan Hovold <johan@kernel.org>
Date: Wed Jan 8 11:24:36 2025 +0100
USB: serial: cp210x: add Phoenix Contact UPS Device
commit 854eee93bd6e3dca619d47087af4d65b2045828e upstream.
Phoenix Contact sells UPS Quint devices [1] with a custom datacable [2]
that embeds a Silicon Labs converter:
Bus 001 Device 003: ID 1b93:1013 Silicon Labs Phoenix Contact UPS Device
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 0
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 64
idVendor 0x1b93
idProduct 0x1013
bcdDevice 1.00
iManufacturer 1 Silicon Labs
iProduct 2 Phoenix Contact UPS Device
iSerial 3 <redacted>
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 0x0020
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0x80
(Bus Powered)
MaxPower 100mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 0
bInterfaceProtocol 0
iInterface 2 Phoenix Contact UPS Device
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x01 EP 1 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x82 EP 2 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 0
[1] https://www.phoenixcontact.com/en-pc/products/power-supply-unit-quint-ps-1ac-24dc-10-2866763
[2] https://www.phoenixcontact.com/en-il/products/data-cable-preassembled-ifs-usb-datacable-2320500
Reported-by: Giuseppe Corbelli <giuseppe.corbelli@antaresvision.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chukun Pan <amadeus@jmu.edu.cn>
Date: Sun Dec 15 18:00:27 2024 +0800
USB: serial: option: add MeiG Smart SRM815
commit c1947d244f807b1f95605b75a4059e7b37b5dcc3 upstream.
It looks like SRM815 shares ID with SRM825L.
T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=2dee ProdID=4d22 Rev= 4.14
S: Manufacturer=MEIG
S: Product=LTE-A Module
S: SerialNumber=123456
C:* #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn>
Link: https://lore.kernel.org/lkml/20241215100027.1970930-1-amadeus@jmu.edu.cn/
Link: https://lore.kernel.org/all/4333b4d0-281f-439d-9944-5570cbc4971d@gmail.com/
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michal Hrusecky <michal.hrusecky@turris.com>
Date: Tue Jan 7 17:08:29 2025 +0100
USB: serial: option: add Neoway N723-EA support
commit f5b435be70cb126866fa92ffc6f89cda9e112c75 upstream.
Update the USB serial option driver to support Neoway N723-EA.
ID 2949:8700 Marvell Mobile Composite Device Bus
T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=2949 ProdID=8700 Rev= 1.00
S: Manufacturer=Marvell
S: Product=Mobile Composite Device Bus
S: SerialNumber=200806006809080000
C:* #Ifs= 5 Cfg#= 1 Atr=c0 MxPwr=500mA
A: FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03
I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host
E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms
E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Tested successfully connecting to the Internet via rndis interface after
dialing via AT commands on If#=4 or If#=6.
Not sure of the purpose of the other serial interface.
Signed-off-by: Michal Hrusecky <michal.hrusecky@turris.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: GONG Ruiqi <gongruiqi1@huawei.com>
Date: Tue Jan 7 09:57:50 2025 +0800
usb: typec: fix pm usage counter imbalance in ucsi_ccg_sync_control()
commit b0e525d7a22ea350e75e2aec22e47fcfafa4cacd upstream.
The error handling for the case `con_index == 0` should involve dropping
the pm usage counter, as ucsi_ccg_sync_control() gets it at the
beginning. Fix it.
Cc: stable <stable@kernel.org>
Fixes: e56aac6e5a25 ("usb: typec: fix potential array underflow in ucsi_ccg_sync_control()")
Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20250107015750.2778646-1-gongruiqi1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xu Yang <xu.yang_2@nxp.com>
Date: Wed Dec 18 17:53:28 2024 +0800
usb: typec: tcpci: fix NULL pointer issue on shared irq case
commit 862a9c0f68487fd6ced15622d9cdcec48f8b5aaa upstream.
The tcpci_irq() may meet below NULL pointer dereference issue:
[ 2.641851] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
[ 2.641951] status 0x1, 0x37f
[ 2.650659] Mem abort info:
[ 2.656490] ESR = 0x0000000096000004
[ 2.660230] EC = 0x25: DABT (current EL), IL = 32 bits
[ 2.665532] SET = 0, FnV = 0
[ 2.668579] EA = 0, S1PTW = 0
[ 2.671715] FSC = 0x04: level 0 translation fault
[ 2.676584] Data abort info:
[ 2.679459] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 2.684936] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 2.689980] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 2.695284] [0000000000000010] user address but active_mm is swapper
[ 2.701632] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 2.707883] Modules linked in:
[ 2.710936] CPU: 1 UID: 0 PID: 87 Comm: irq/111-2-0051 Not tainted 6.12.0-rc6-06316-g7f63786ad3d1-dirty #4
[ 2.720570] Hardware name: NXP i.MX93 11X11 EVK board (DT)
[ 2.726040] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2.732989] pc : tcpci_irq+0x38/0x318
[ 2.736647] lr : _tcpci_irq+0x14/0x20
[ 2.740295] sp : ffff80008324bd30
[ 2.743597] x29: ffff80008324bd70 x28: ffff800080107894 x27: ffff800082198f70
[ 2.750721] x26: ffff0000050e6680 x25: ffff000004d172ac x24: ffff0000050f0000
[ 2.757845] x23: ffff000004d17200 x22: 0000000000000001 x21: ffff0000050f0000
[ 2.764969] x20: ffff000004d17200 x19: 0000000000000000 x18: 0000000000000001
[ 2.772093] x17: 0000000000000000 x16: ffff80008183d8a0 x15: ffff00007fbab040
[ 2.779217] x14: ffff00007fb918c0 x13: 0000000000000000 x12: 000000000000017a
[ 2.786341] x11: 0000000000000001 x10: 0000000000000a90 x9 : ffff80008324bd00
[ 2.793465] x8 : ffff0000050f0af0 x7 : ffff00007fbaa840 x6 : 0000000000000031
[ 2.800589] x5 : 000000000000017a x4 : 0000000000000002 x3 : 0000000000000002
[ 2.807713] x2 : ffff80008324bd3a x1 : 0000000000000010 x0 : 0000000000000000
[ 2.814838] Call trace:
[ 2.817273] tcpci_irq+0x38/0x318
[ 2.820583] _tcpci_irq+0x14/0x20
[ 2.823885] irq_thread_fn+0x2c/0xa8
[ 2.827456] irq_thread+0x16c/0x2f4
[ 2.830940] kthread+0x110/0x114
[ 2.834164] ret_from_fork+0x10/0x20
[ 2.837738] Code: f9426420 f9001fe0 d2800000 52800201 (f9400a60)
This may happen on shared irq case. Such as two Type-C ports share one
irq. After the first port finished tcpci_register_port(), it may trigger
interrupt. However, if the interrupt comes by chance the 2nd port finishes
devm_request_threaded_irq(), the 2nd port interrupt handler will run at
first. Then the above issue happens due to tcpci is still a NULL pointer
in tcpci_irq() when dereference to regmap.
devm_request_threaded_irq()
<-- port1 irq comes
disable_irq(client->irq);
tcpci_register_port()
This will restore the logic to the state before commit (77e85107a771 "usb:
typec: tcpci: support edge irq").
However, moving tcpci_register_port() earlier creates a problem when use
edge irq because tcpci_init() will be called before
devm_request_threaded_irq(). The tcpci_init() writes the ALERT_MASK to
the hardware to tell it to start generating interrupts but we're not ready
to deal with them yet, then the ALERT events may be missed and ALERT line
will not recover to high level forever. To avoid the issue, this will also
set ALERT_MASK register after devm_request_threaded_irq() return.
Fixes: 77e85107a771 ("usb: typec: tcpci: support edge irq")
Cc: stable <stable@kernel.org>
Tested-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/20241218095328.2604607-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Fri Dec 6 16:09:18 2024 +0300
usb: typec: tcpm/tcpci_maxim: fix error code in max_contaminant_read_resistance_kohm()
commit b9711ff7cde0cfbcdd44cb1fac55b6eec496e690 upstream.
If max_contaminant_read_adc_mv() fails, then return the error code. Don't
return zero.
Fixes: 02b332a06397 ("usb: typec: maxim_contaminant: Implement check_contaminant callback")
Cc: stable <stable@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: André Draszik <andre.draszik@linaro.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/f1bf3768-419e-40dd-989c-f7f455d6c824@stanley.mountain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jun Yan <jerrysteve1101@gmail.com>
Date: Thu Dec 12 22:38:52 2024 +0800
USB: usblp: return error when setting unsupported protocol
commit 7a3d76a0b60b3f6fc3375e4de2174bab43f64545 upstream.
Fix the regression introduced by commit d8c6edfa3f4e ("USB:
usblp: don't call usb_set_interface if there's a single alt"),
which causes that unsupported protocols can also be set via
ioctl when the num_altsetting of the device is 1.
Move the check for protocol support to the earlier stage.
Fixes: d8c6edfa3f4e ("USB: usblp: don't call usb_set_interface if there's a single alt")
Cc: stable <stable@kernel.org>
Signed-off-by: Jun Yan <jerrysteve1101@gmail.com>
Link: https://lore.kernel.org/r/20241212143852.671889-1-jerrysteve1101@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alex Williamson <alex.williamson@redhat.com>
Date: Thu Jan 2 11:32:54 2025 -0700
vfio/pci: Fallback huge faults for unaligned pfn
commit 09dfc8a5f2ce897005a94bf66cca4f91e4e03700 upstream.
The PFN must also be aligned to the fault order to insert a huge
pfnmap. Test the alignment and fallback when unaligned.
Fixes: f9e54c3a2f5b ("vfio/pci: implement huge_fault support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219619
Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com>
Reported-by: Precific <precification@posteo.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Precific <precification@posteo.de>
Link: https://lore.kernel.org/r/20250102183416.1841878-1-alex.williamson@redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Rick Edgecombe <rick.p.edgecombe@intel.com>
Date: Tue Jan 7 15:30:56 2025 -0800
x86/fpu: Ensure shadow stack is active before "getting" registers
commit a9d9c33132d49329ada647e4514d210d15e31d81 upstream.
The x86 shadow stack support has its own set of registers. Those registers
are XSAVE-managed, but they are "supervisor state components" which means
that userspace can not touch them with XSAVE/XRSTOR. It also means that
they are not accessible from the existing ptrace ABI for XSAVE state.
Thus, there is a new ptrace get/set interface for it.
The regset code that ptrace uses provides an ->active() handler in
addition to the get/set ones. For shadow stack this ->active() handler
verifies that shadow stack is enabled via the ARCH_SHSTK_SHSTK bit in the
thread struct. The ->active() handler is checked from some call sites of
the regset get/set handlers, but not the ptrace ones. This was not
understood when shadow stack support was put in place.
As a result, both the set/get handlers can be called with
XFEATURE_CET_USER in its init state, which would cause get_xsave_addr() to
return NULL and trigger a WARN_ON(). The ssp_set() handler luckily has an
ssp_active() check to avoid surprising the kernel with shadow stack
behavior when the kernel is not ready for it (ARCH_SHSTK_SHSTK==0). That
check just happened to avoid the warning.
But the ->get() side wasn't so lucky. It can be called with shadow stacks
disabled, triggering the warning in practice, as reported by Christina
Schimpe:
WARNING: CPU: 5 PID: 1773 at arch/x86/kernel/fpu/regset.c:198 ssp_get+0x89/0xa0
[...]
Call Trace:
<TASK>
? show_regs+0x6e/0x80
? ssp_get+0x89/0xa0
? __warn+0x91/0x150
? ssp_get+0x89/0xa0
? report_bug+0x19d/0x1b0
? handle_bug+0x46/0x80
? exc_invalid_op+0x1d/0x80
? asm_exc_invalid_op+0x1f/0x30
? __pfx_ssp_get+0x10/0x10
? ssp_get+0x89/0xa0
? ssp_get+0x52/0xa0
__regset_get+0xad/0xf0
copy_regset_to_user+0x52/0xc0
ptrace_regset+0x119/0x140
ptrace_request+0x13c/0x850
? wait_task_inactive+0x142/0x1d0
? do_syscall_64+0x6d/0x90
arch_ptrace+0x102/0x300
[...]
Ensure that shadow stacks are active in a thread before looking them up
in the XSAVE buffer. Since ARCH_SHSTK_SHSTK and user_ssp[SHSTK_EN] are
set at the same time, the active check ensures that there will be
something to find in the XSAVE buffer.
[ dhansen: changelog/subject tweaks ]
Fixes: 2fab02b25ae7 ("x86: Add PTRACE interface for shadow stack")
Reported-by: Christina Schimpe <christina.schimpe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Christina Schimpe <christina.schimpe@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250107233056.235536-1-rick.p.edgecombe%40intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>