Changelog for kernel 6.7.9

af_unix: Drop oob_skb ref before purging queue in GC. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon Feb 19 09:46:57 2024 -0800

    af_unix: Drop oob_skb ref before purging queue in GC.
    
    commit aa82ac51d63328714645c827775d64dbfd9941f3 upstream.
    
    syzbot reported another task hung in __unix_gc().  [0]
    
    The current while loop assumes that all of the left candidates
    have oob_skb and calling kfree_skb(oob_skb) releases the remaining
    candidates.
    
    However, I missed a case that oob_skb has self-referencing fd and
    another fd and the latter sk is placed before the former in the
    candidate list.  Then, the while loop never proceeds, resulting in
    a hung task.
    
    __unix_gc() has the same loop just before purging the collected skb,
    so we can call kfree_skb(oob_skb) there and let __skb_queue_purge()
    release all inflight sockets.
    
    [0]:
    Sending NMI from CPU 0 to CPUs 1:
    NMI backtrace for cpu 1
    CPU: 1 PID: 2784 Comm: kworker/u4:8 Not tainted 6.8.0-rc4-syzkaller-01028-g71b605d32017 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
    Workqueue: events_unbound __unix_gc
    RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x70 kernel/kcov.c:200
    Code: 89 fb e8 23 00 00 00 48 8b 3d 84 f5 1a 0c 48 89 de 5b e9 43 26 57 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <f3> 0f 1e fa 48 8b 04 24 65 48 8b 0d 90 52 70 7e 65 8b 15 91 52 70
    RSP: 0018:ffffc9000a17fa78 EFLAGS: 00000287
    RAX: ffffffff8a0a6108 RBX: ffff88802b6c2640 RCX: ffff88802c0b3b80
    RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
    RBP: ffffc9000a17fbf0 R08: ffffffff89383f1d R09: 1ffff1100ee5ff84
    R10: dffffc0000000000 R11: ffffed100ee5ff85 R12: 1ffff110056d84ee
    R13: ffffc9000a17fae0 R14: 0000000000000000 R15: ffffffff8f47b840
    FS:  0000000000000000(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ffef5687ff8 CR3: 0000000029b34000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <NMI>
     </NMI>
     <TASK>
     __unix_gc+0xe69/0xf40 net/unix/garbage.c:343
     process_one_work kernel/workqueue.c:2633 [inline]
     process_scheduled_works+0x913/0x1420 kernel/workqueue.c:2706
     worker_thread+0xa5f/0x1000 kernel/workqueue.c:2787
     kthread+0x2ef/0x390 kernel/kthread.c:388
     ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:242
     </TASK>
    
    Reported-and-tested-by: syzbot+ecab4d36f920c3574bf9@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=ecab4d36f920c3574bf9
    Fixes: 25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.")
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

af_unix: Fix task hung while purging oob_skb in GC. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Feb 9 14:04:53 2024 -0800

    af_unix: Fix task hung while purging oob_skb in GC.
    
    commit 25236c91b5ab4a26a56ba2e79b8060cf4e047839 upstream.
    
    syzbot reported a task hung; at the same time, GC was looping infinitely
    in list_for_each_entry_safe() for OOB skb.  [0]
    
    syzbot demonstrated that the list_for_each_entry_safe() was not actually
    safe in this case.
    
    A single skb could have references for multiple sockets.  If we free such
    a skb in the list_for_each_entry_safe(), the current and next sockets could
    be unlinked in a single iteration.
    
    unix_notinflight() uses list_del_init() to unlink the socket, so the
    prefetched next socket forms a loop itself and list_for_each_entry_safe()
    never stops.
    
    Here, we must use while() and make sure we always fetch the first socket.
    
    [0]:
    Sending NMI from CPU 0 to CPUs 1:
    NMI backtrace for cpu 1
    CPU: 1 PID: 5065 Comm: syz-executor236 Not tainted 6.8.0-rc3-syzkaller-00136-g1f719a2f3fa6 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
    RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:26 [inline]
    RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
    RIP: 0010:__sanitizer_cov_trace_pc+0xd/0x60 kernel/kcov.c:207
    Code: cc cc cc cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 65 48 8b 14 25 40 c2 03 00 <65> 8b 05 b4 7c 78 7e a9 00 01 ff 00 48 8b 34 24 74 0f f6 c4 01 74
    RSP: 0018:ffffc900033efa58 EFLAGS: 00000283
    RAX: ffff88807b077800 RBX: ffff88807b077800 RCX: 1ffffffff27b1189
    RDX: ffff88802a5a3b80 RSI: ffffffff8968488d RDI: ffff88807b077f70
    RBP: ffffc900033efbb0 R08: 0000000000000001 R09: fffffbfff27a900c
    R10: ffffffff93d48067 R11: ffffffff8ae000eb R12: ffff88807b077800
    R13: dffffc0000000000 R14: ffff88807b077e40 R15: 0000000000000001
    FS:  0000000000000000(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000564f4fc1e3a8 CR3: 000000000d57a000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <NMI>
     </NMI>
     <TASK>
     unix_gc+0x563/0x13b0 net/unix/garbage.c:319
     unix_release_sock+0xa93/0xf80 net/unix/af_unix.c:683
     unix_release+0x91/0xf0 net/unix/af_unix.c:1064
     __sock_release+0xb0/0x270 net/socket.c:659
     sock_close+0x1c/0x30 net/socket.c:1421
     __fput+0x270/0xb80 fs/file_table.c:376
     task_work_run+0x14f/0x250 kernel/task_work.c:180
     exit_task_work include/linux/task_work.h:38 [inline]
     do_exit+0xa8a/0x2ad0 kernel/exit.c:871
     do_group_exit+0xd4/0x2a0 kernel/exit.c:1020
     __do_sys_exit_group kernel/exit.c:1031 [inline]
     __se_sys_exit_group kernel/exit.c:1029 [inline]
     __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1029
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xd5/0x270 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x6f/0x77
    RIP: 0033:0x7f9d6cbdac09
    Code: Unable to access opcode bytes at 0x7f9d6cbdabdf.
    RSP: 002b:00007fff5952feb8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9d6cbdac09
    RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
    RBP: 00007f9d6cc552b0 R08: ffffffffffffffb8 R09: 0000000000000006
    R10: 0000000000000006 R11: 0000000000000246 R12: 00007f9d6cc552b0
    R13: 0000000000000000 R14: 00007f9d6cc55d00 R15: 00007f9d6cbabe70
     </TASK>
    
    Reported-by: syzbot+4fa4a2d1f5a5ee06f006@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=4fa4a2d1f5a5ee06f006
    Fixes: 1279f9d9dec2 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.")
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://lore.kernel.org/r/20240209220453.96053-1-kuniyu@amazon.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
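
The failure mode described in this commit can be reproduced in userspace. The following is a minimal sketch, not the kernel code: the list helpers only imitate the kernel's list API, and `struct candidate`, `release()` and `drain()` are hypothetical names chosen for illustration. Releasing one entry may also unlink a second entry, which is why the loop re-reads the first entry on every pass instead of caching a "next" pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Like list_del_init(): the unlinked node points at itself afterwards. */
static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Hypothetical candidate: releasing it may also unlink a peer entry. */
struct candidate {
    struct list_head link;
    struct candidate *peer;
};

static void release(struct candidate *c)
{
    list_del_init(&c->link);
    if (c->peer)
        list_del_init(&c->peer->link);
}

/* Drain the list by always fetching the first entry; returns the pass count. */
static int drain(struct list_head *head)
{
    int passes = 0;

    assert(head != NULL);
    while (!list_empty(head)) {
        struct candidate *c = (struct candidate *)
            ((char *)head->next - offsetof(struct candidate, link));
        release(c);
        passes++;
    }
    return passes;
}
```

With a cached "next" pointer, the peer unlinked by `release()` would point at itself and the iteration would never reach the list head again; re-reading `head->next` avoids that.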

afs: Fix endless loop in directory parsing [+ + +]

Author: David Howells <dhowells@redhat.com>
Date:   Fri Feb 23 13:15:02 2024 +0000

    afs: Fix endless loop in directory parsing
    
    [ Upstream commit 5f7a07646655fb4108da527565dcdc80124b14c4 ]
    
    If a directory has a block with only ".__afsXXXX" files in it (from
    uncompleted silly-rename), these .__afsXXXX files are skipped but without
    advancing the file position in the dir_context.  This leads to
    afs_dir_iterate() repeating the block again and again.
    
    Fix this by making the code that skips the .__afsXXXX file also manually
    advance the file position.
    
    The symptom is a soft lockup:
    
            watchdog: BUG: soft lockup - CPU#3 stuck for 52s! [check:5737]
            ...
            RIP: 0010:afs_dir_iterate_block+0x39/0x1fd
            ...
             ? watchdog_timer_fn+0x1a6/0x213
            ...
             ? asm_sysvec_apic_timer_interrupt+0x16/0x20
             ? afs_dir_iterate_block+0x39/0x1fd
             afs_dir_iterate+0x10a/0x148
             afs_readdir+0x30/0x4a
             iterate_dir+0x93/0xd3
             __do_sys_getdents64+0x6b/0xd4
    
    This is almost certainly the actual fix for:
    
            https://bugzilla.kernel.org/show_bug.cgi?id=218496
    
    Fixes: 57e9d49c5452 ("afs: Hide silly-rename files from userspace")
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/786185.1708694102@warthog.procyon.org.uk
    Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: Markus Suvanto <markus.suvanto@gmail.com>
    cc: linux-afs@lists.infradead.org
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
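
The idea behind this fix can be sketched in a few lines of userspace C. This is an illustration under assumptions, not the afs code: `struct cursor` and `iterate_block()` are hypothetical, and the position is a simple entry index rather than a byte offset. The point is that a skipped entry must still advance the position, otherwise a block containing only skipped entries is replayed forever.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor, loosely modelled on the position in a dir_context. */
struct cursor { size_t pos; };

/* Emit visible names from one block; returns how many were emitted. */
static size_t iterate_block(const char *const *names, size_t n,
                            struct cursor *ctx,
                            const char **out, size_t out_cap)
{
    size_t emitted = 0;

    assert(names != NULL && ctx != NULL);
    for (size_t i = ctx->pos; i < n; i++) {
        if (strncmp(names[i], ".__afs", 6) == 0) {
            /* Skipped entry: advance the position anyway. */
            ctx->pos = i + 1;
            continue;
        }
        if (emitted == out_cap)
            break;
        out[emitted++] = names[i];
        ctx->pos = i + 1;
    }
    return emitted;
}
```

A caller that loops until the position reaches the end of the block terminates even when every entry in the block is a silly-rename file.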

ALSA: Drop leftover snd-rtctimer stuff from Makefile [+ + +]

Author: Takashi Iwai <tiwai@suse.de>
Date:   Wed Feb 21 10:21:56 2024 +0100

    ALSA: Drop leftover snd-rtctimer stuff from Makefile
    
    [ Upstream commit 4df49712eb54141be00a9312547436d55677f092 ]
    
    We forgot to remove the line for snd-rtctimer from Makefile while
    dropping the functionality.  Get rid of the stale line.
    
    Fixes: 34ce71a96dcb ("ALSA: timer: remove legacy rtctimer")
    Link: https://lore.kernel.org/r/20240221092156.28695-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: firewire-lib: fix to check cycle continuity [+ + +]

Author: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Date:   Sun Feb 18 12:30:26 2024 +0900

    ALSA: firewire-lib: fix to check cycle continuity
    
    commit 77ce96543b03f437c6b45f286d8110db2b6622a3 upstream.
    
    The local helper function compares the given pair of cycle counts. If
    the left value is less than the right value, the function returns a
    negative value.
    
    If the safe cycle is less than the current cycle, a cycle has been
    lost. However, this case is not currently handled properly.
    
    This commit fixes the bug.
    
    Cc: <stable@vger.kernel.org>
    Fixes: 705794c53b00 ("ALSA: firewire-lib: check cycle continuity")
    Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
    Link: https://lore.kernel.org/r/20240218033026.72577-1-o-takashi@sakamocchi.jp
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: Add special fixup for Lenovo 14IRP8 [+ + +]

Author: Willian Wang <git@willian.wang>
Date:   Sat Feb 24 13:11:49 2024 -0300

    ALSA: hda/realtek: Add special fixup for Lenovo 14IRP8
    
    commit 0ac32a396e4f41e88df76ce2282423188a2d2ed0 upstream.
    
    Lenovo Slim/Yoga Pro 9 14IRP8 requires a special fixup because there is
    a collision of its PCI SSID (17aa:3802) with Lenovo Yoga DuetITL 2021
    codec SSID.
    
    Fixes: 3babae915f4c ("ALSA: hda/tas2781: Add tas2781 HDA driver")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=208555
    Link: https://lore.kernel.org/all/d5b42e483566a3815d229270abd668131a0d9f3a.camel@irl.hu
    Cc: stable@vger.kernel.org
    Signed-off-by: Willian Wang <git@willian.wang>
    Reviewed-by: Gergo Koteles <soyer@irl.hu>
    Link: https://lore.kernel.org/r/170879111795.8.6687687359006700715.273812184@willian.wang
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: Enable Mute LED on HP 840 G8 (MB 8AB8) [+ + +]

Author: Hans Peter <flurry123@gmx.ch>
Date:   Mon Feb 19 17:38:49 2024 +0100

    ALSA: hda/realtek: Enable Mute LED on HP 840 G8 (MB 8AB8)
    
    commit 1fdf4e8be7059e7784fec11d30cd32784f0bdc83 upstream.
    
    On my EliteBook 840 G8 Notebook PC (ProdId 5S7R6EC#ABD; built 2022 for
    the German market) the Mute LED is always on. The mute button itself works
    as expected. alsa-info.sh shows a different subsystem-id 0x8ab9 for
    Realtek ALC285 Codec, thus the existing quirks for HP 840 G8 don't work.
    Therefore, add a new quirk for this type of EliteBook.
    
    Signed-off-by: Hans Peter <flurry123@gmx.ch>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20240219164518.4099-1-flurry123@gmx.ch
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: fix mute/micmute LED For HP mt440 [+ + +]

Author: Eniac Zhang <eniac-xw.zhang@hp.com>
Date:   Tue Feb 20 17:58:12 2024 +0000

    ALSA: hda/realtek: fix mute/micmute LED For HP mt440
    
    commit 67c3d7717efbd46092f217b1f811df1b205cce06 upstream.
    
    The HP mt440 Thin Client uses an ALC236 codec and needs the
    ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk to make the mute and
    micmute LEDs work.
    
    There are two variants of the USB-C PD chip on this device. Each uses
    a different BIOS and board ID, hence the two entries.
    
    Signed-off-by: Eniac Zhang <eniac-xw.zhang@hp.com>
    Signed-off-by: Alexandru Gagniuc <alexandru.gagniuc@hp.com>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20240220175812.782687-1-alexandru.gagniuc@hp.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: Fix top speaker connection on Dell Inspiron 16 Plus 7630 [+ + +]

Author: Jay Ajit Mate <jay.mate15@gmail.com>
Date:   Mon Feb 19 15:34:04 2024 +0530

    ALSA: hda/realtek: Fix top speaker connection on Dell Inspiron 16 Plus 7630
    
    commit 89a0dff6105e06067bdc57595982dbf6d6dd4959 upstream.
    
    The Dell Inspiron 16 Plus 7630, similar to its predecessors (7620 models),
    experiences an issue with unconnected top speakers. Since the controller
    remains unchanged, this commit addresses the problem by correctly
    connecting the speakers on NID 0x17 to the DAC on NID 0x03.
    
    Signed-off-by: Jay Ajit Mate <jay.mate15@gmail.com>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20240219100404.9573-1-jay.mate15@gmail.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: tas2781: enable subwoofer volume control [+ + +]

Author: Gergo Koteles <soyer@irl.hu>
Date:   Fri Feb 23 12:34:30 2024 +0100

    ALSA: hda/realtek: tas2781: enable subwoofer volume control
    
    commit c1947ce61ff4cd4de2fe5f72423abedb6dc83011 upstream.
    
    The volume of subwoofer channels is always at maximum with the
    ALC269_FIXUP_THINKPAD_ACPI chain.
    
    Use ALC285_FIXUP_THINKPAD_HEADSET_JACK to align it to the master volume.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=208555#c827
    
    Fixes: 3babae915f4c ("ALSA: hda/tas2781: Add tas2781 HDA driver")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Gergo Koteles <soyer@irl.hu>
    Link: https://lore.kernel.org/r/7ffae10ebba58601d25fe2ff8381a6ae3a926e62.1708687813.git.soyer@irl.hu
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: ump: Fix the discard error code from snd_ump_legacy_open() [+ + +]

Author: Takashi Iwai <tiwai@suse.de>
Date:   Tue Feb 20 16:08:43 2024 +0100

    ALSA: ump: Fix the discard error code from snd_ump_legacy_open()
    
    commit 49cbb7b7d36ec3ba73ce1daf7ae1d71d435453b8 upstream.
    
    snd_ump_legacy_open() didn't return the error code properly even when
    the open failed.  Fix it.
    
    Fixes: 0b5288f5fe63 ("ALSA: ump: Add legacy raw MIDI support")
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20240220150843.28630-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ASoC: cs35l56: cs35l56_component_remove() must clean up wm_adsp [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Mon Jan 29 16:27:23 2024 +0000

    ASoC: cs35l56: cs35l56_component_remove() must clean up wm_adsp
    
    [ Upstream commit cd38ccbecdace1469b4e0cfb3ddeec72a3fad226 ]
    
    cs35l56_component_remove() must call wm_adsp_power_down() and
    wm_adsp2_component_remove().
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: e49611252900 ("ASoC: cs35l56: Add driver for Cirrus Logic CS35L56")
    Link: https://msgid.link/r/20240129162737.497-5-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: eba2eb2495f4 ("ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: cs35l56: cs35l56_component_remove() must clear cs35l56->component [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Mon Jan 29 16:27:22 2024 +0000

    ASoC: cs35l56: cs35l56_component_remove() must clear cs35l56->component
    
    [ Upstream commit ae861c466ee57e15a29d97629e1c564e3f714a4f ]
    
    The cs35l56->component pointer is used by the suspend-resume handling to
    know whether the driver is fully instantiated. This is to prevent it
    queuing dsp_work which would result in calling wm_adsp when the driver
    is not an instantiated ASoC component. So this pointer must be cleared
    by cs35l56_component_remove().
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: e49611252900 ("ASoC: cs35l56: Add driver for Cirrus Logic CS35L56")
    Link: https://msgid.link/r/20240129162737.497-4-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: eba2eb2495f4 ("ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: cs35l56: Don't add the same register patch multiple times [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Mon Jan 29 16:27:24 2024 +0000

    ASoC: cs35l56: Don't add the same register patch multiple times
    
    [ Upstream commit 07687cd0539f8185b6ba0c0afba8473517116d6a ]
    
    Move the call to cs35l56_set_patch() earlier in cs35l56_init() so
    that it only adds the register patch on first-time initialization.
    
    The call was after the post_soft_reset label, so every time this
    function was run to re-initialize the hardware after a reset it would
    call regmap_register_patch() and add the same reg_sequence again.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: 898673b905b9 ("ASoC: cs35l56: Move shared data into a common data structure")
    Link: https://msgid.link/r/20240129162737.497-6-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: eba2eb2495f4 ("ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: cs35l56: Fix deadlock in ASP1 mixer register initialization [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Thu Feb 8 12:37:42 2024 +0000

    ASoC: cs35l56: Fix deadlock in ASP1 mixer register initialization
    
    [ Upstream commit c14f09f010cc569ae7e2f6ef02374f6bfef9917e ]
    
    Rewrite the handling of ASP1 TX mixer mux initialization to prevent a
    deadlock during component_remove().
    
    The firmware can overwrite the ASP1 TX mixer registers with
    system-specific settings. This is mainly for hardware that uses the
    ASP as a chip-to-chip link controlled by the firmware. Because of this
    the driver cannot know the starting state of the ASP1 mixer muxes until
    the firmware has been downloaded and rebooted.
    
    The original workaround for this was to queue a work function from the
    dsp_work() job. This work then read the register values (populating the
    regmap cache the first time around) and then called
    snd_soc_dapm_mux_update_power(). The problem with this is that it was
    ultimately triggered by cs35l56_component_probe() queueing dsp_work,
    which meant that it would be running in parallel with the rest of the
    ASoC component and card initialization. To prevent accessing DAPM before
    it was fully initialized the work function took the card mutex. But this
    would deadlock if cs35l56_component_remove() was called before the work job
    had completed, because ASoC calls component_remove() with the card mutex
    held.
    
    This new version removes the work function. Instead the regmap cache and
    DAPM mux widgets are initialized the first time any of the associated ALSA
    controls is read or written.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: 07f7d6e7a124 ("ASoC: cs35l56: Fix for initializing ASP1 mixer registers")
    Link: https://lore.kernel.org/r/20240208123742.1278104-1-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: eba2eb2495f4 ("ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: cs35l56: Fix for initializing ASP1 mixer registers [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Mon Jan 29 16:27:29 2024 +0000

    ASoC: cs35l56: Fix for initializing ASP1 mixer registers
    
    [ Upstream commit 07f7d6e7a124d3e4de36771e2a4926d0e31c2258 ]
    
    Defer initializing the state of the ASP1 mixer registers until
    the firmware has been downloaded and rebooted.
    
    On a SoundWire system the ASP is free for use as a chip-to-chip
    interconnect. This can be either for the firmware on multiple
    CS35L56 to share reference audio; or as a bridge to another
    device. If it is a firmware interconnect it is owned by the
    firmware and the Linux driver should avoid writing the registers.
    However, if it is a bridge then Linux may take over and handle
    it as a normal codec-to-codec link. Even if the ASP is used
    as a firmware-firmware interconnect it is useful to have
    ALSA controls for the ASP mixer. They are at least useful for
    debugging.
    
    CS35L56 is designed for SDCA and a generic SDCA driver would
    know nothing about these chip-specific registers. So if the
    ASP is being used on a SoundWire system the firmware sets up the
    ASP mixer registers. This means that we can't assume the default
    state of these registers. But we don't know the initial state
    that the firmware set them to until after the firmware has been
    downloaded and booted, which can take several seconds when
    downloading multiple amps.
    
    DAPM normally reads the initial state of mux registers during
    probe() but this would mean blocking probe() for several seconds
    until the firmware has initialized them. To avoid this, the
    mixer muxes are set SND_SOC_NOPM to prevent DAPM trying to read
    the register state. Custom get/set callbacks are implemented for
    ALSA control access, and these can safely block waiting for the
    firmware download.
    
    After the firmware download has completed, the state of the
    mux registers is known so a work job is queued to call
    snd_soc_dapm_mux_update_power() on each of the mux widgets.
    
    Backport note:
    This won't apply cleanly to kernels older than v6.6.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: e49611252900 ("ASoC: cs35l56: Add driver for Cirrus Logic CS35L56")
    Link: https://msgid.link/r/20240129162737.497-11-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: eba2eb2495f4 ("ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: cs35l56: Fix misuse of wm_adsp 'part' string for silicon revision [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Mon Jan 29 16:27:30 2024 +0000

    ASoC: cs35l56: Fix misuse of wm_adsp 'part' string for silicon revision
    
    [ Upstream commit f6c967941c5d6fa526fdd64733a8d86bf2bfab31 ]
    
    Put the silicon revision and secured flag in the wm_adsp fwf_name
    string instead of including them in the part string.
    
    This changes the format of the firmware name string from
    
     cs35l56[s]-rev-misc[-system_name]
    
    to
     cs35l56-rev[-s]-misc[-system_name]
    
    No firmware files have been published, so this doesn't cause a
    compatibility break.
    
    Silicon revision and secured flag are included in the firmware
    filename to pick a firmware compatible with the part. These strings
    were being added to the part string, but that is a misuse of the
    string. The correct place for these is the fwf_name string, which
    is specifically intended to select between multiple firmware files
    for the same part.
    
    Backport note:
    This won't apply to kernels older than v6.6.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: 608f1b0dbdde ("ASoC: cs35l56: Move DSP part string generation so that it is done only once")
    Link: https://msgid.link/r/20240129162737.497-12-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: eba2eb2495f4 ("ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: cs35l56: fix reversed if statement in cs35l56_dspwait_asp1tx_put() [+ + +]

Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Mon Feb 5 15:44:30 2024 +0300

    ASoC: cs35l56: fix reversed if statement in cs35l56_dspwait_asp1tx_put()
    
    commit 4703b014f28bf7a2e56d1da238ee95ef6c5ce76b upstream.
    
    It looks like the "!" character was added accidentally.  The
    regmap_update_bits_check() function is normally going to succeed.  This
    means the rest of the function is unreachable and we don't handle the
    situation where "changed" is true correctly.
    
    Fixes: 07f7d6e7a124 ("ASoC: cs35l56: Fix for initializing ASP1 mixer registers")
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Link: https://lore.kernel.org/r/0c254c07-d1c0-4a5c-a22b-7e135cab032c@moroto.mountain
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ASoC: cs35l56: Must clear HALO_STATE before issuing SYSTEM_RESET [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Fri Feb 16 14:05:35 2024 +0000

    ASoC: cs35l56: Must clear HALO_STATE before issuing SYSTEM_RESET
    
    [ Upstream commit e33625c84b75e4f078d7f9bf58f01fe71ab99642 ]
    
    The driver must write 0 to HALO_STATE before sending the SYSTEM_RESET
    command to the firmware.
    
    HALO_STATE is in DSP memory, which is preserved across a soft reset.
    The SYSTEM_RESET command does not change the value of HALO_STATE.
    There is a period of time while the CS35L56 is resetting, before the
    firmware has started to boot, where a read of HALO_STATE will return
    the value it had before the SYSTEM_RESET. If the driver does not
    clear HALO_STATE, this would return BOOT_DONE status even though the
    firmware has not booted.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: 8a731fd37f8b ("ASoC: cs35l56: Move utility functions to shared file")
    Link: https://msgid.link/r/20240216140535.1434933-1-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: qcom: Fix uninitialized pointer dmactl [+ + +]

Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Wed Feb 21 13:48:04 2024 +0000

    ASoC: qcom: Fix uninitialized pointer dmactl
    
    [ Upstream commit 1382d8b55129875b2e07c4d2a7ebc790183769ee ]
    
    When __lpass_get_dmactl_handle is called with an invalid driver id
    dai_id, the pointer dmactl is never assigned a value. Because it has
    not been initialized it contains a garbage value, so the null check may
    not work. Fix this by initializing dmactl to NULL. One could argue that
    modern compilers will set this to zero, but it is useful to keep this
    initialized in the same way as in the functions
    __lpass_platform_codec_intf_init and lpass_cdc_dma_daiops_hw_params.
    
    Cleans up clang scan build warning:
    sound/soc/qcom/lpass-cdc-dma.c:275:7: warning: Branch condition
    evaluates to a garbage value [core.uninitialized.Branch]
    
    Fixes: b81af585ea54 ("ASoC: qcom: Add lpass CPU driver for codec dma control")
    Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
    Link: https://msgid.link/r/20240221134804.3475989-1-colin.i.king@gmail.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
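
The pattern can be illustrated with a short userspace sketch. The names `struct dmactl`, `rx_ctl`, `tx_ctl` and `get_dmactl()` are hypothetical and only mirror the shape of the problem: a switch that assigns a pointer for known ids and falls through for unknown ones, followed by a null check that is only meaningful if the pointer started out as NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical DMA control descriptor, for illustration only. */
struct dmactl { int id; };

static struct dmactl rx_ctl = { 1 };
static struct dmactl tx_ctl = { 2 };

static int get_dmactl(int dai_id, struct dmactl **out)
{
    struct dmactl *dmactl = NULL; /* defined value for the default case */

    assert(out != NULL);
    switch (dai_id) {
    case 0:
        dmactl = &rx_ctl;
        break;
    case 1:
        dmactl = &tx_ctl;
        break;
    default:
        break; /* invalid id: dmactl stays NULL */
    }

    if (!dmactl)
        return -EINVAL;

    *out = dmactl;
    return 0;
}
```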

ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol() [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Wed Feb 21 12:37:10 2024 +0000

    ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol()
    
    [ Upstream commit eba2eb2495f47690400331c722868902784e59de ]
    
    snd_soc_card_get_kcontrol() must be holding a read lock on
    card->controls_rwsem while walking the controls list.
    
    Compare with snd_ctl_find_numid().
    
    The existing function is renamed snd_soc_card_get_kcontrol_locked()
    so that it can be called from contexts that are already holding
    card->controls_rwsem (for example, control get/put functions).
    
    There are few direct or indirect callers of
    snd_soc_card_get_kcontrol(), and most are safe. Three require
    changes, which have been included in this patch:
    
    codecs/cs35l45.c:
      cs35l45_activate_ctl() is called from a control put() function so
      is changed to call snd_soc_card_get_kcontrol_locked().
    
    codecs/cs35l56.c:
      cs35l56_sync_asp1_mixer_widgets_with_firmware() is called from
      control get()/put() functions so is changed to call
      snd_soc_card_get_kcontrol_locked().
    
    fsl/fsl_xcvr.c:
      fsl_xcvr_activate_ctl() is called from three places, one of which
      already holds card->controls_rwsem:
      1. fsl_xcvr_mode_put(), a control put function, which will
         already be holding card->controls_rwsem.
      2. fsl_xcvr_startup(), a DAI startup function.
      3. fsl_xcvr_shutdown(), a DAI shutdown function.
    
      To fix this, fsl_xcvr_activate_ctl() has been changed to call
      snd_soc_card_get_kcontrol_locked() so that it is safe to call
      directly from fsl_xcvr_mode_put().
      The fsl_xcvr_startup() and fsl_xcvr_shutdown() functions have been
      changed to take a read lock on card->controls_rwsem around calls
      to fsl_xcvr_activate_ctl(). While this is not very elegant, it
      keeps the change small, to avoid this patch creating a large
      collateral churn in fsl/fsl_xcvr.c.
    
    Analysis of the other callers of snd_soc_card_get_kcontrol() shows that
    they do not need any changes because they are not holding
    card->controls_rwsem when they call snd_soc_card_get_kcontrol().
    
    Direct callers of snd_soc_card_get_kcontrol():
      fsl/fsl_spdif.c: fsl_spdif_dai_probe() - DAI probe function
      fsl/fsl_micfil.c: voice_detected_fn() - IRQ handler
    
    Indirect callers via soc_component_notify_control():
      codecs/cs42l43: cs42l43_mic_shutter() - IRQ handler
      codecs/cs42l43: cs42l43_spk_shutter() - IRQ handler
      codecs/ak4118.c: ak4118_irq_handler() - IRQ handler
      codecs/wm_adsp.c: wm_adsp_write_ctl() - not currently used
    
    Indirect callers via snd_soc_limit_volume():
      qcom/sc8280xp.c: sc8280xp_snd_init() - DAIlink init function
      ti/rx51.c: rx51_aic34_init() - DAI init function
    
    I don't have hardware to test the fsl/*, qcom/sc8280xp.c, ti/rx51.c
    and ak4118.c changes.
    
    Backport note:
    The fsl/, qcom/, cs35l45, cs35l56 and cs42l43 callers were added
    since the Fixes commit so won't all be present on older kernels.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: 209c6cdfd283 ("ASoC: soc-card: move snd_soc_card_get_kcontrol() to soc-card")
    Link: https://lore.kernel.org/r/20240221123710.690224-1-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

block: define bvec_iter as __packed __aligned(4) [+ + +]

Author: Ming Lei <ming.lei@redhat.com>
Date:   Sun Feb 25 11:01:41 2024 +0800

    block: define bvec_iter as __packed __aligned(4)
    
    [ Upstream commit 7838b4656110d950afdd92a081cc0f33e23e0ea8 ]
    
    In commit 19416123ab3e ("block: define 'struct bvec_iter' as packed"),
    the goal was to save the 4 bytes of padding and to keep `bio` from
    spreading onto one extra cache line.
    
    It is enough to define it as '__packed __aligned(4)', as '__packed'
    alone means byte aligned, and can cause the compiler to generate
    horrible code on architectures that don't support unaligned access
    when bvec_iter is embedded in other structures.
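
    The size and alignment arithmetic behind this can be sketched as
    follows. This is a simplified model of C layout rules, not a compiler,
    and the member list (one 8-byte sector_t followed by three 4-byte
    unsigned ints, on a typical 64-bit ABI) is an assumption made for
    illustration; struct_layout() is a hypothetical helper.

```python
# Simplified model of C struct layout rules (not a compiler).
# Assumed member sizes for struct bvec_iter: one 8-byte field followed by
# three 4-byte fields; each scalar's natural alignment equals its size.
BVEC_ITER_MEMBERS = [8, 4, 4, 4]


def struct_layout(member_sizes, packed=False, aligned=None):
    """Return (sizeof, alignof) for a struct of naturally aligned scalars.

    packed=True models __packed: members need only 1-byte alignment.
    aligned=N   models __aligned(N): struct alignment is at least N.
    """
    offset = 0
    struct_align = 1
    for size in member_sizes:
        member_align = 1 if packed else size
        # Pad so the member starts on a multiple of its alignment.
        offset = (offset + member_align - 1) // member_align * member_align
        offset += size
        struct_align = max(struct_align, member_align)
    if aligned is not None:
        struct_align = max(struct_align, aligned)
    # Tail padding: sizeof is rounded up to a multiple of the alignment.
    total = (offset + struct_align - 1) // struct_align * struct_align
    return total, struct_align
```

    Under these assumptions the natural layout is 24 bytes with 8-byte
    alignment, '__packed' alone gives 20 bytes with 1-byte alignment, and
    '__packed __aligned(4)' keeps the 20-byte size while restoring 4-byte
    alignment.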
    
    Cc: Mikulas Patocka <mpatocka@redhat.com>
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Fixes: 19416123ab3e ("block: define 'struct bvec_iter' as packed")
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: Avoid potential use-after-free in hci_error_reset [+ + +]

Author: Ying Hsu <yinghsu@chromium.org>
Date:   Thu Jan 4 11:56:32 2024 +0000

    Bluetooth: Avoid potential use-after-free in hci_error_reset
    
    [ Upstream commit 2449007d3f73b2842c9734f45f0aadb522daf592 ]
    
    While handling the HCI_EV_HARDWARE_ERROR event, if the underlying
    BT controller is not responding, the GPIO reset mechanism would
    free the hci_dev and lead to a use-after-free in hci_error_reset.
    
    Here's the call trace observed on a ChromeOS device with Intel AX201:
       queue_work_on+0x3e/0x6c
       __hci_cmd_sync_sk+0x2ee/0x4c0 [bluetooth <HASH:3b4a6>]
       ? init_wait_entry+0x31/0x31
       __hci_cmd_sync+0x16/0x20 [bluetooth <HASH:3b4a6>]
       hci_error_reset+0x4f/0xa4 [bluetooth <HASH:3b4a6>]
       process_one_work+0x1d8/0x33f
       worker_thread+0x21b/0x373
       kthread+0x13a/0x152
       ? pr_cont_work+0x54/0x54
       ? kthread_blkcg+0x31/0x31
       ret_from_fork+0x1f/0x30
    
    This patch holds a reference on the hci_dev while processing
    an HCI_EV_HARDWARE_ERROR event to avoid a potential crash.
    
    Fixes: c7741d16a57c ("Bluetooth: Perform a power cycle when receiving hardware error event")
    Signed-off-by: Ying Hsu <yinghsu@chromium.org>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: Enforce validation on max value of connection interval [+ + +]

Author: Kai-Heng Feng <kai.heng.feng@canonical.com>
Date:   Thu Jan 25 14:50:28 2024 +0800

    Bluetooth: Enforce validation on max value of connection interval
    
    [ Upstream commit e4b019515f950b4e6e5b74b2e1bb03a90cb33039 ]
    
    Right now the Linux BT stack cannot pass test case "GAP/CONN/CPUP/BV-05-C
    'Connection Parameter Update Procedure Invalid Parameters Central
    Responder'" in Bluetooth Test Suite revision GAP.TS.p44. [0]
    
    That was resolved by commit c49a8682fc5d ("Bluetooth: validate BLE
    connection interval updates"), but the fix was later reverted because
    devices like keyboards and mice may require a low connection interval.
    
    So only validate the max value of the connection interval to pass the
    Test Suite, and let devices request a low connection interval if needed.
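
    A minimal sketch of the idea, not the kernel code: only the upper
    bound of the requested interval is compared against the locally
    configured maximum. The function and parameter names are hypothetical.

```python
def accept_conn_param_update(req_min, req_max, configured_max):
    """Return True to accept a connection parameter update request.

    Intervals are in units of 1.25 ms. Only the requested maximum is
    checked against the local setting; a low minimum is still allowed,
    so devices that need short intervals are not rejected.
    """
    if req_min > req_max:
        return False  # malformed range
    if req_max > configured_max:
        return False  # exceeds the maximum this host allows
    return True
```

    With a configured maximum of 56, a request for [6, 12] is accepted
    while a request for [6, 100] is rejected.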
    
    [0] https://www.bluetooth.org/docman/handlers/DownloadDoc.ashx?doc_id=229869
    
    Fixes: 68d19d7d9957 ("Revert "Bluetooth: validate BLE connection interval updates"")
    Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_bcm4377: do not mark valid bd_addr as invalid [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 27 11:10:03 2023 +0100

    Bluetooth: hci_bcm4377: do not mark valid bd_addr as invalid
    
    commit c17d2a7b216e168c3ba62d93482179c01b369ac7 upstream.
    
    A recent commit restored the original (and still documented) semantics
    for the HCI_QUIRK_USE_BDADDR_PROPERTY quirk so that the device address
    is considered invalid unless an address is provided by firmware.
    
    This specifically means that this flag must only be set for devices with
    invalid addresses, but the Broadcom BCM4377 driver has so far been
    setting this flag unconditionally.
    
    Fortunately the driver already checks for invalid addresses during setup
    and sets the HCI_QUIRK_INVALID_BDADDR flag, which can simply be replaced
    with HCI_QUIRK_USE_BDADDR_PROPERTY to indicate that the default address
    is invalid but can be overridden by firmware (long term, this should
    probably just always be allowed).
    
    Fixes: 6945795bc81a ("Bluetooth: fix use-bdaddr-property quirk")
    Cc: stable@vger.kernel.org      # 6.5
    Reported-by: Felix Zhang <mrman@mrman314.tech>
    Link: https://lore.kernel.org/r/77419ffacc5b4875e920e038332575a2a5bff29f.camel@mrman314.tech/
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Neal Gompa <neal@gompa.dev>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST [+ + +]

Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Mon Jan 22 09:02:47 2024 -0500

    Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
    
    [ Upstream commit 7e74aa53a68bf60f6019bd5d9a9a1406ec4d4865 ]
    
    If we receive HCI_EV_IO_CAPA_REQUEST while
    HCI_OP_READ_REMOTE_EXT_FEATURES has yet to be answered, assume the
    remote does support SSP, since otherwise this event shouldn't be
    generated.
    
    Link: https://lore.kernel.org/linux-bluetooth/CABBYNZ+9UdG1cMZVmdtN3U2aS16AKMCyTARZZyFX7xTEDWcMOw@mail.gmail.com/T/#t
    Fixes: c7f59461f5a7 ("Bluetooth: Fix a refcnt underflow problem for hci_conn")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_event: Fix wrongly recorded wakeup BD_ADDR [+ + +]

Author: Zijun Hu <quic_zijuhu@quicinc.com>
Date:   Tue Jan 9 19:03:23 2024 +0800

    Bluetooth: hci_event: Fix wrongly recorded wakeup BD_ADDR
    
    [ Upstream commit 61a5ab72edea7ebc3ad2c6beea29d966f528ebfb ]
    
    hci_store_wake_reason() wrongly parses event HCI_Connection_Request
    as HCI_Connection_Complete and HCI_Connection_Complete as
    HCI_Connection_Request, which records the wrong wakeup BD_ADDR and
    may cause stability issues. Fix it by using the correct field.
    
    Fixes: 2f20216c1d6f ("Bluetooth: Emit controller suspend and resume events")
    Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT [+ + +]

Author: Janaki Ramaiah Thota <quic_janathot@quicinc.com>
Date:   Wed Jan 24 20:00:42 2024 +0530

    Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT
    
    [ Upstream commit 7dcd3e014aa7faeeaf4047190b22d8a19a0db696 ]
    
    The BT adapter goes into the UNCONFIGURED state during BT turn ON
    when the devicetree has no local-bd-address node.
    
    Bluetooth will not work out of the box on such devices. To avoid this
    problem, add a check that sets HCI_QUIRK_USE_BDADDR_PROPERTY based on
    the presence of a local-bd-address node entry.
    
    When this quirk is not set, the public Bluetooth address read by the
    host from the controller through the HCI Read BD Address command is
    considered valid.
    
    Fixes: e668eb1e1578 ("Bluetooth: hci_core: Don't stop BT if the BD address missing in dts")
    Signed-off-by: Janaki Ramaiah Thota <quic_janathot@quicinc.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_sync: Check the correct flag before starting a scan [+ + +]

Author: Jonas Dreßler <verdre@v0yd.nl>
Date:   Tue Jan 2 19:08:08 2024 +0100

    Bluetooth: hci_sync: Check the correct flag before starting a scan
    
    [ Upstream commit 6b3899be24b16ff8ee0cb25f0bd59b01b15ba1d1 ]
    
    There's a very confusing mistake in the code starting a HCI inquiry: We're
    calling hci_dev_test_flag() to test for HCI_INQUIRY, but hci_dev_test_flag()
    checks hdev->dev_flags instead of hdev->flags. HCI_INQUIRY is a bit that's
    set on hdev->flags, not on hdev->dev_flags though.
    
    HCI_INQUIRY equals the integer 7, and in hdev->dev_flags, 7 means
    HCI_BONDABLE, so we were actually checking for HCI_BONDABLE here.
    
    The mistake is only present in the synchronous code for starting an inquiry,
    not in the async one. Also devices are typically bondable while doing an
    inquiry, so that might be the reason why nobody noticed it so far.
    
    Fixes: abfeea476c68 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY")
    Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_sync: Fix accept_list when attempting to suspend [+ + +]

Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Fri Jan 5 10:43:26 2024 -0500

    Bluetooth: hci_sync: Fix accept_list when attempting to suspend
    
    [ Upstream commit e5469adb2a7e930d96813316592302d9f8f1df4e ]
    
    During suspend, only wakeable devices can be in the accept list, so if
    the device was previously added it needs to be removed; otherwise the
    device can end up waking up the system prematurely.
    
    Fixes: 3b42055388c3 ("Bluetooth: hci_sync: Fix attempting to suspend with unfiltered passive scan")
    Signed-off-by: Clancy Shang <clancy.shang@quectel.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: qca: Fix triggering coredump implementation [+ + +]

Author: Zijun Hu <quic_zijuhu@quicinc.com>
Date:   Fri Jan 26 17:00:24 2024 +0800

    Bluetooth: qca: Fix triggering coredump implementation
    
    [ Upstream commit 6abf9dd26bb1699c17d601b9a292577d01827c0e ]
    
    hci_coredump_qca() uses __hci_cmd_sync() to send a vendor-specific
    command to trigger a firmware coredump, but the command does not have
    any event as its sync response, so __hci_cmd_sync() is not suitable.
    Fix this by using __hci_cmd_send().
    
    Fixes: 06d3fdfcdf5c ("Bluetooth: hci_qca: Add qcom devcoredump support")
    Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: qca: Fix wrong event type for patch config command [+ + +]

Author: Zijun Hu <quic_zijuhu@quicinc.com>
Date:   Fri Jan 19 17:45:30 2024 +0800

    Bluetooth: qca: Fix wrong event type for patch config command
    
    [ Upstream commit c0dbc56077ae759f2dd602c7561480bc2b1b712c ]
    
    The vendor-specific patch config command has an HCI_Command_Complete
    event as its response, but qca_send_patch_config_cmd() wrongly expects
    a vendor-specific event for the command. Fix this by using the right
    event type.
    
    The btmon log for the vendor-specific command is shown below:
    < HCI Command: Vendor (0x3f|0x0000) plen 5
            28 01 00 00 00
    > HCI Event: Command Complete (0x0e) plen 5
          Vendor (0x3f|0x0000) ncmd 1
            Status: Success (0x00)
            28
    
    Fixes: 4fac8a7ac80b ("Bluetooth: btqca: sequential validation")
    Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

btrfs: dev-replace: properly validate device names [+ + +]

Author: David Sterba <dsterba@suse.com>
Date:   Wed Feb 14 16:19:24 2024 +0100

    btrfs: dev-replace: properly validate device names
    
    commit 9845664b9ee47ce7ee7ea93caf47d39a9d4552c4 upstream.
    
    There's a syzbot report that device name buffers passed to device
    replace are not properly checked for string termination which could lead
    to a read out of bounds in getname_kernel().
    
    Add a helper that validates both source and target device name buffers.
    When the source is given by devid, initialize the name buffer to an
    empty string in case something tries to read it later.
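
    The kind of check involved can be sketched like this. It is an
    illustration only: the function names and the buffer size are
    assumptions, and the model just tests whether a fixed-size buffer
    contains a terminating NUL byte before it is treated as a string.

```python
NAME_BUF_SIZE = 1024  # assumed fixed buffer size, for illustration only


def name_is_terminated(buf: bytes) -> bool:
    """True if the fixed-size buffer contains a NUL byte somewhere."""
    return b"\0" in buf


def validate_replace_args(srcdev_name: bytes, tgtdev_name: bytes) -> bool:
    """Accept the request only if both name buffers are NUL-terminated."""
    return name_is_terminated(srcdev_name) and name_is_terminated(tgtdev_name)
```

    A buffer filled entirely with non-NUL bytes fails the check, which is
    the case that would otherwise let a string read run past the end of
    the buffer.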
    
    This was originally analyzed and fixed in a different way by Edward Adam
    Davis (see links).
    
    Link: https://lore.kernel.org/linux-btrfs/000000000000d1a1d1060cc9c5e7@google.com/
    Link: https://lore.kernel.org/linux-btrfs/tencent_44CA0665C9836EF9EEC80CB9E7E206DF5206@qq.com/
    CC: stable@vger.kernel.org # 4.19+
    CC: Edward Adam Davis <eadavis@qq.com>
    Reported-and-tested-by: syzbot+33f23b49ac24f986c9e8@syzkaller.appspotmail.com
    Reviewed-by: Boris Burkov <boris@bur.io>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: fix double free of anonymous device after snapshot creation failure [+ + +]

Author: Filipe Manana <fdmanana@suse.com>
Date:   Fri Feb 23 16:38:43 2024 +0000

    btrfs: fix double free of anonymous device after snapshot creation failure
    
    commit e2b54eaf28df0c978626c9736b94f003b523b451 upstream.
    
    When creating a snapshot we may do a double free of an anonymous device
    in case there's an error committing the transaction. The second free may
    result in freeing an anonymous device number that was allocated by some
    other subsystem in the kernel or another btrfs filesystem.
    
    The steps that lead to this:
    
    1) At ioctl.c:create_snapshot() we allocate an anonymous device number
       and assign it to pending_snapshot->anon_dev;
    
    2) Then we call btrfs_commit_transaction() and end up at
       transaction.c:create_pending_snapshot();
    
    3) There we call btrfs_get_new_fs_root() and pass it the anonymous device
       number stored in pending_snapshot->anon_dev;
    
    4) btrfs_get_new_fs_root() frees that anonymous device number because
       btrfs_lookup_fs_root() returned a root - someone else did a lookup
       of the new root already, which could be some task doing backref walking;
    
    5) After that some error happens in the transaction commit path, and at
       ioctl.c:create_snapshot() we jump to the 'fail' label, and after
       that we free again the same anonymous device number, which in the
       meanwhile may have been reallocated somewhere else, because
       pending_snapshot->anon_dev still has the same value as in step 1.
    
    Recently syzbot ran into this and reported the following trace:
    
      ------------[ cut here ]------------
      ida_free called for id=51 which is not allocated.
      WARNING: CPU: 1 PID: 31038 at lib/idr.c:525 ida_free+0x370/0x420 lib/idr.c:525
      Modules linked in:
      CPU: 1 PID: 31038 Comm: syz-executor.2 Not tainted 6.8.0-rc4-syzkaller-00410-gc02197fc9076 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
      RIP: 0010:ida_free+0x370/0x420 lib/idr.c:525
      Code: 10 42 80 3c 28 (...)
      RSP: 0018:ffffc90015a67300 EFLAGS: 00010246
      RAX: be5130472f5dd000 RBX: 0000000000000033 RCX: 0000000000040000
      RDX: ffffc90009a7a000 RSI: 000000000003ffff RDI: 0000000000040000
      RBP: ffffc90015a673f0 R08: ffffffff81577992 R09: 1ffff92002b4cdb4
      R10: dffffc0000000000 R11: fffff52002b4cdb5 R12: 0000000000000246
      R13: dffffc0000000000 R14: ffffffff8e256b80 R15: 0000000000000246
      FS:  00007fca3f4b46c0(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f167a17b978 CR3: 000000001ed26000 CR4: 0000000000350ef0
      Call Trace:
       <TASK>
       btrfs_get_root_ref+0xa48/0xaf0 fs/btrfs/disk-io.c:1346
       create_pending_snapshot+0xff2/0x2bc0 fs/btrfs/transaction.c:1837
       create_pending_snapshots+0x195/0x1d0 fs/btrfs/transaction.c:1931
       btrfs_commit_transaction+0xf1c/0x3740 fs/btrfs/transaction.c:2404
       create_snapshot+0x507/0x880 fs/btrfs/ioctl.c:848
       btrfs_mksubvol+0x5d0/0x750 fs/btrfs/ioctl.c:998
       btrfs_mksnapshot+0xb5/0xf0 fs/btrfs/ioctl.c:1044
       __btrfs_ioctl_snap_create+0x387/0x4b0 fs/btrfs/ioctl.c:1306
       btrfs_ioctl_snap_create_v2+0x1ca/0x400 fs/btrfs/ioctl.c:1393
       btrfs_ioctl+0xa74/0xd40
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:871 [inline]
       __se_sys_ioctl+0xfe/0x170 fs/ioctl.c:857
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x6f/0x77
      RIP: 0033:0x7fca3e67dda9
      Code: 28 00 00 00 (...)
      RSP: 002b:00007fca3f4b40c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 00007fca3e7abf80 RCX: 00007fca3e67dda9
      RDX: 00000000200005c0 RSI: 0000000050009417 RDI: 0000000000000003
      RBP: 00007fca3e6ca47a R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 000000000000000b R14: 00007fca3e7abf80 R15: 00007fff6bf95658
       </TASK>
    
    Here we get an explicit message that we attempted to free an anonymous
    device number that is not currently allocated. It happens in a different
    code path from the example above, at btrfs_get_root_ref(), so this change
    may not fix the case triggered by syzbot.
    
    To fix at least the code path from the example above, change
    btrfs_get_root_ref() and its callers to receive a dev_t pointer argument
    for the anonymous device number, so that in case it frees the number, it
    also resets it to 0, so that up in the call chain we don't attempt to do
    the double free.
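
    The ownership hand-off can be modelled with a toy allocator. This is a
    sketch under stated assumptions, not the btrfs code: Ida,
    get_root_ref() and create_snapshot() are hypothetical stand-ins, and a
    one-element list plays the role of the dev_t pointer.

```python
class Ida:
    """Toy ID allocator; freeing an unallocated ID raises an error."""

    def __init__(self):
        self.allocated = set()
        self.next_id = 1

    def alloc(self):
        ident = self.next_id
        self.next_id += 1
        self.allocated.add(ident)
        return ident

    def free(self, ident):
        if ident not in self.allocated:
            raise RuntimeError(f"free called for id={ident} which is not allocated")
        self.allocated.remove(ident)


def get_root_ref(ida, anon_dev_ref, root_already_exists):
    """anon_dev_ref is a one-element list standing in for a dev_t pointer."""
    if root_already_exists and anon_dev_ref[0]:
        ida.free(anon_dev_ref[0])
        anon_dev_ref[0] = 0  # tell the caller the number is gone


def create_snapshot(ida, root_already_exists, commit_fails):
    anon_dev = [ida.alloc()]
    get_root_ref(ida, anon_dev, root_already_exists)
    if commit_fails and anon_dev[0]:
        ida.free(anon_dev[0])  # error path frees only what is still owned
        anon_dev[0] = 0
    return anon_dev[0]
```

    Because the callee resets the number to 0 after freeing it, the error
    path in the caller sees 0 and skips the second free.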
    
    CC: stable@vger.kernel.org # 5.10+
    Link: https://lore.kernel.org/linux-btrfs/000000000000f673a1061202f630@google.com/
    Fixes: e03ee2fe873e ("btrfs: do not ASSERT() if the newly created subvolume already got read")
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: fix race between ordered extent completion and fiemap [+ + +]

Author: Filipe Manana <fdmanana@suse.com>
Date:   Thu Feb 22 12:29:26 2024 +0000

    btrfs: fix race between ordered extent completion and fiemap
    
    [ Upstream commit a1a4a9ca77f143c00fce69c1239887ff8b813bec ]
    
    For fiemap we recently stopped locking the target extent range for the
    whole duration of the fiemap call, in order to avoid a deadlock in a
    scenario where the fiemap buffer happens to be a memory mapped range of
    the same file. This use case is very unlikely to be useful in practice but
    it may be triggered by fuzz testing (syzbot, etc).
    
    However by not locking the target extent range for the whole duration of
    the fiemap call we can race with an ordered extent. This happens like
    this:
    
    1) The fiemap task finishes processing a file extent item that covers
       the file range [512K, 1M[, and that file extent item is the last item
       in the leaf currently being processed;
    
    2) An ordered extent for the file range [768K, 2M[, in COW mode,
       completes (btrfs_finish_one_ordered()) and the file extent item
       covering the range [512K, 1M[ is trimmed to cover the range
       [512K, 768K[ and then a new file extent item for the range [768K, 2M[
       is inserted in the inode's subvolume tree;
    
    3) The fiemap task calls fiemap_next_leaf_item(), which then calls
       btrfs_next_leaf() to find the next leaf / item. This finds that the
       next key following the one we previously processed (its type is
       BTRFS_EXTENT_DATA_KEY and its offset is 512K), is the key corresponding
       to the new file extent item inserted by the ordered extent, which has
       a type of BTRFS_EXTENT_DATA_KEY and an offset of 768K;
    
    4) Later the fiemap code ends up at emit_fiemap_extent() and triggers
       the warning:
    
          if (cache->offset + cache->len > offset) {
                   WARN_ON(1);
                   return -EINVAL;
          }
    
       Since we get 1M > 768K, because the previously emitted entry for the
       old extent covering the file range [512K, 1M[ ends at an offset that
       is greater than the new extent's start offset (768K). This makes fiemap
       fail with -EINVAL besides triggering the warning that produces a stack
       trace like the following:
    
         [1621.677651] ------------[ cut here ]------------
         [1621.677656] WARNING: CPU: 1 PID: 204366 at fs/btrfs/extent_io.c:2492 emit_fiemap_extent+0x84/0x90 [btrfs]
         [1621.677899] Modules linked in: btrfs blake2b_generic (...)
         [1621.677951] CPU: 1 PID: 204366 Comm: pool Not tainted 6.8.0-rc5-btrfs-next-151+ #1
         [1621.677954] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
         [1621.677956] RIP: 0010:emit_fiemap_extent+0x84/0x90 [btrfs]
         [1621.678033] Code: 2b 4c 89 63 (...)
         [1621.678035] RSP: 0018:ffffab16089ffd20 EFLAGS: 00010206
         [1621.678037] RAX: 00000000004fa000 RBX: ffffab16089ffe08 RCX: 0000000000009000
         [1621.678039] RDX: 00000000004f9000 RSI: 00000000004f1000 RDI: ffffab16089ffe90
         [1621.678040] RBP: 00000000004f9000 R08: 0000000000001000 R09: 0000000000000000
         [1621.678041] R10: 0000000000000000 R11: 0000000000001000 R12: 0000000041d78000
         [1621.678043] R13: 0000000000001000 R14: 0000000000000000 R15: ffff9434f0b17850
         [1621.678044] FS:  00007fa6e20006c0(0000) GS:ffff943bdfa40000(0000) knlGS:0000000000000000
         [1621.678046] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
         [1621.678048] CR2: 00007fa6b0801000 CR3: 000000012d404002 CR4: 0000000000370ef0
         [1621.678053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
         [1621.678055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
         [1621.678056] Call Trace:
         [1621.678074]  <TASK>
         [1621.678076]  ? __warn+0x80/0x130
         [1621.678082]  ? emit_fiemap_extent+0x84/0x90 [btrfs]
         [1621.678159]  ? report_bug+0x1f4/0x200
         [1621.678164]  ? handle_bug+0x42/0x70
         [1621.678167]  ? exc_invalid_op+0x14/0x70
         [1621.678170]  ? asm_exc_invalid_op+0x16/0x20
         [1621.678178]  ? emit_fiemap_extent+0x84/0x90 [btrfs]
         [1621.678253]  extent_fiemap+0x766/0xa30 [btrfs]
         [1621.678339]  btrfs_fiemap+0x45/0x80 [btrfs]
         [1621.678420]  do_vfs_ioctl+0x1e4/0x870
         [1621.678431]  __x64_sys_ioctl+0x6a/0xc0
         [1621.678434]  do_syscall_64+0x52/0x120
         [1621.678445]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
    There's also another case where before calling btrfs_next_leaf() we are
    processing a hole or a prealloc extent and we had several delalloc ranges
    within that hole or prealloc extent. In that case if the ordered extents
    complete before we find the next key, we may end up finding an extent item
    with an offset smaller than (or equal to) the offset in cache->offset.
    
    So fix this by changing emit_fiemap_extent() to address these three
    scenarios like this:
    
    1) For the first case, steps listed above, adjust the length of the
       previously cached extent so that it does not overlap with the current
       extent, emit the previous one and cache the current file extent item;
    
    2) For the second case where we had a hole or prealloc extent with
       multiple delalloc ranges inside the hole or prealloc extent's range,
       and the current file extent item has an offset that matches the offset
       in the fiemap cache, just discard what we have in the fiemap cache and
       assign the current file extent item to the cache, since it's more up
       to date;
    
    3) For the third case where we had a hole or prealloc extent with
       multiple delalloc ranges inside the hole or prealloc extent's range
       and the offset of the file extent item we just found is smaller than
       what we have in the cache, just skip the current file extent item
       if its range ends at or before the cached extent's end, because we may
       have emitted (to the fiemap user space buffer) delalloc ranges that
       overlap with the current file extent item's range. If the file extent
       item's range goes beyond the end offset of the cached extent, just
       emit the cached extent and cache a subrange of the file extent item,
       that goes from the end offset of the cached extent to the end offset
       of the file extent item.
    
    Dealing with those cases in those ways makes everything consistent by
    reflecting the current state of file extent items in the btree and
    without emitting extents that have overlapping ranges (which would be
    confusing and violating expectations).
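
    The three cases can be sketched as a small model of the cache logic.
    This is a simplification and an assumption-laden illustration, not the
    kernel function: it tracks only logical offsets and lengths, ignores
    physical addresses, flags and merging of adjacent extents, and uses a
    plain list in place of the fiemap user space buffer.

```python
def emit_fiemap_extent(cache, emitted, offset, length):
    """Feed one file extent item to the cache and return the new cache.

    cache   -- None or an (offset, length) tuple for the pending extent
    emitted -- list that receives extents handed to user space
    """
    if cache is None:
        return (offset, length)

    cache_off, cache_len = cache
    cache_end = cache_off + cache_len
    if cache_end <= offset:
        # No overlap: emit the pending extent and cache the new one.
        emitted.append(cache)
        return (offset, length)

    if offset > cache_off:
        # Case 1: trim the cached extent so it ends where the new one starts.
        emitted.append((cache_off, offset - cache_off))
        return (offset, length)
    if offset == cache_off:
        # Case 2: same start offset; the new item is more up to date.
        return (offset, length)

    # Case 3: the new item starts before the cached extent.
    range_end = offset + length
    if range_end <= cache_end:
        return cache  # fully covered by what is cached, skip it
    emitted.append(cache)
    return (cache_end, range_end - cache_end)
```

    For the race described in the steps above, a cached extent
    [512K, 1M[ followed by a new item at 768K results in [512K, 768K[
    being emitted and [768K, 2M[ being cached, with no overlap.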
    
    This issue could be triggered often with test case generic/561, and was
    also hit and reported by Wang Yugui.
    
    Reported-by: Wang Yugui <wangyugui@e16-tech.com>
    Link: https://lore.kernel.org/linux-btrfs/20240223104619.701F.409509F4@e16-tech.com/
    Fixes: b0ad381fa769 ("btrfs: fix deadlock with fiemap and extent locking")
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

btrfs: send: don't issue unnecessary zero writes for trailing hole [+ + +]

Author: Filipe Manana <fdmanana@suse.com>
Date:   Fri Feb 16 22:17:10 2024 +0000

    btrfs: send: don't issue unnecessary zero writes for trailing hole
    
    commit 5897710b28cabab04ea6c7547f27b7989de646ae upstream.
    
    If we have a sparse file with a trailing hole (from the last extent's end
    to i_size) and then create an extent in the file that ends before the
    file's i_size, then when doing an incremental send we will issue a write
    full of zeroes for the range that starts immediately after the new extent
    ends up to i_size. While this isn't incorrect because the file ends up
    with exactly the same data, it unnecessarily results in using extra space
    at the destination with one or more extents full of zeroes instead of
    having a hole. In some cases this results in using megabytes or even
    gigabytes of unnecessary space.
    
    Example reproducer:
    
       $ cat test.sh
       #!/bin/bash
    
       DEV=/dev/sdh
       MNT=/mnt/sdh
    
       mkfs.btrfs -f $DEV
       mount $DEV $MNT
    
       # Create 1G sparse file.
       xfs_io -f -c "truncate 1G" $MNT/foobar
    
       # Create base snapshot.
       btrfs subvolume snapshot -r $MNT $MNT/mysnap1
    
       # Create send stream (full send) for the base snapshot.
       btrfs send -f /tmp/1.snap $MNT/mysnap1
    
       # Now write one extent at the beginning of the file and one somewhere
       # in the middle, leaving a gap between the end of this second extent
       # and the file's size.
       xfs_io -c "pwrite -S 0xab 0 128K" \
              -c "pwrite -S 0xcd 512M 128K" \
              $MNT/foobar
    
       # Now create a second snapshot which is going to be used for an
       # incremental send operation.
       btrfs subvolume snapshot -r $MNT $MNT/mysnap2
    
       # Create send stream (incremental send) for the second snapshot.
       btrfs send -p $MNT/mysnap1 -f /tmp/2.snap $MNT/mysnap2
    
       # Now recreate the filesystem by receiving both send streams and
       # verify we get the same content that the original filesystem had
       # and file foobar has only two extents with a size of 128K each.
       umount $MNT
       mkfs.btrfs -f $DEV
       mount $DEV $MNT
    
       btrfs receive -f /tmp/1.snap $MNT
       btrfs receive -f /tmp/2.snap $MNT
    
       echo -e "\nFile fiemap in the second snapshot:"
       # Should have:
       #
       # 128K extent at file range [0, 128K[
       # hole at file range [128K, 512M[
       # 128K extent file range [512M, 512M + 128K[
       # hole at file range [512M + 128K, 1G[
       xfs_io -r -c "fiemap -v" $MNT/mysnap2/foobar
    
       # File should be using 256K of data (two 128K extents).
       echo -e "\nSpace used by the file: $(du -h $MNT/mysnap2/foobar | cut -f 1)"
    
       umount $MNT
    
    Running the test, we can see with fiemap that we get an extent for the
    range [512M, 1G[, while in the source filesystem we have an extent for
    the range [512M, 512M + 128K[ and a hole for the rest of the file (the
    range [512M + 128K, 1G[):
    
       $ ./test.sh
       (...)
       File fiemap in the second snapshot:
       /mnt/sdh/mysnap2/foobar:
        EXT: FILE-OFFSET          BLOCK-RANGE        TOTAL FLAGS
          0: [0..255]:            26624..26879         256   0x0
          1: [256..1048575]:      hole             1048320
          2: [1048576..2097151]:  2156544..3205119 1048576   0x1
    
       Space used by the file: 513M
    
    This happens because once we finish processing an inode, at
    finish_inode_if_needed(), we always issue a hole (write operations full
    of zeros) if there's a gap between the end of the last processed extent
    and the file's size, even if that range is already a hole in the parent
    snapshot. Fix this by issuing the hole only if the range is not already
    a hole.
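
    The decision can be sketched as follows. This is a model of the idea
    only, with hypothetical helper names; extents are plain
    (offset, length) tuples describing the file in the parent snapshot.

```python
def range_is_hole(extents, start, end):
    """True if no extent in `extents` overlaps the byte range [start, end).

    extents -- iterable of (offset, length) tuples for the parent's file
    """
    return all(off + length <= start or off >= end for off, length in extents)


def needs_trailing_hole(parent_extents, last_extent_end, i_size):
    """Decide whether zeroes must be sent for [last_extent_end, i_size)."""
    if last_extent_end >= i_size:
        return False  # no gap at all
    # Only send zeroes when the parent had data there that must be cleared.
    return not range_is_hole(parent_extents, last_extent_end, i_size)
```

    In the reproducer above the parent snapshot's file has no extents, so
    the range [512M + 128K, 1G[ is already a hole and nothing is sent for
    it.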
    
    After this change, running the test above, we get the expected layout:
    
       $ ./test.sh
       (...)
       File fiemap in the second snapshot:
       /mnt/sdh/mysnap2/foobar:
        EXT: FILE-OFFSET          BLOCK-RANGE        TOTAL FLAGS
          0: [0..255]:            26624..26879         256   0x0
          1: [256..1048575]:      hole             1048320
          2: [1048576..1048831]:  26880..27135         256   0x1
          3: [1048832..2097151]:  hole             1048320
    
       Space used by the file: 256K
    
    A test case for fstests will follow soon.
    
    CC: stable@vger.kernel.org # 6.1+
    Reported-by: Dorai Ashok S A <dash.btrfs@inix.me>
    Link: https://lore.kernel.org/linux-btrfs/c0bf7818-9c45-46a8-b3d3-513230d0c86e@inix.me/
    Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ceph: switch to corrected encoding of max_xattr_size in mdsmap [+ + +]

Author: Xiubo Li <xiubli@redhat.com>
Date:   Mon Feb 19 13:14:32 2024 +0800

    ceph: switch to corrected encoding of max_xattr_size in mdsmap
    
    commit 51d31149a88b5c5a8d2d33f06df93f6187a25b4c upstream.
    
    The addition of bal_rank_mask with encoding version 17 was merged
    into ceph.git in Oct 2022 and made it into v18.2.0 release normally.
    A few months later, the much delayed addition of max_xattr_size got
    merged, also with encoding version 17, placed before bal_rank_mask
    in the encoding -- but it didn't make v18.2.0 release.
    
    The way this ended up being resolved on the MDS side is that
    bal_rank_mask will continue to be encoded in version 17 while
    max_xattr_size is now encoded in version 18.  This does mean that
    older kernels will misdecode version 17, but this is also true for
    v18.2.0 and v18.2.1 clients in userspace.
    
    The best we can do is backport this adjustment -- see ceph.git
    commit 78abfeaff27fee343fb664db633de5b221699a73 for details.
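
    A version-gated decoder for the resolved layout might look like the
    sketch below. It is illustrative only: decode_tail() is a hypothetical
    name, and the wire format (bal_rank_mask as a little-endian u32 length
    followed by the string bytes, max_xattr_size as a little-endian u64) is
    an assumption of this sketch, not taken from the commit message.

```python
import struct


def decode_tail(version, buf):
    """Decode the version-dependent tail fields from buf.

    Returns (bal_rank_mask, max_xattr_size); a field is None when the
    encoding version does not carry it.
    """
    pos = 0
    bal_rank_mask = None
    max_xattr_size = None
    if version >= 17:
        # Version 17 carries bal_rank_mask (assumed: u32 length + bytes).
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        bal_rank_mask = buf[pos:pos + length].decode("ascii")
        pos += length
    if version >= 18:
        # Version 18 adds max_xattr_size after it (assumed: u64).
        (max_xattr_size,) = struct.unpack_from("<Q", buf, pos)
        pos += 8
    return bal_rank_mask, max_xattr_size
```

    With this ordering a version 17 map yields only bal_rank_mask, and
    max_xattr_size is read only from version 18 onwards.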
    
    [ idryomov: changelog ]
    
    Cc: stable@vger.kernel.org
    Link: https://tracker.ceph.com/issues/64440
    Fixes: d93231a6bc8a ("ceph: prevent a client from exceeding the MDS maximum xattr size")
    Signed-off-by: Xiubo Li <xiubli@redhat.com>
    Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
    Reviewed-by: Venky Shankar <vshankar@redhat.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cpufreq: intel_pstate: fix pstate limits enforcement for adjust_perf call back [+ + +]

Author: Doug Smythies <dsmythies@telus.net>
Date:   Sat Feb 17 13:30:10 2024 -0800

    cpufreq: intel_pstate: fix pstate limits enforcement for adjust_perf call back
    
    [ Upstream commit f0a0fc10abb062d122db5ac4ed42f6d1ca342649 ]
    
    There is a loophole in pstate limit clamping for the intel_cpufreq CPU
    frequency scaling driver (intel_pstate in passive mode), schedutil CPU
    frequency scaling governor, HWP (HardWare Pstate) control enabled, when
    the adjust_perf call back path is used.
    
    Fix it.
    
    Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
    Signed-off-by: Doug Smythies <dsmythies@telus.net>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

crypto: arm64/neonbs - fix out-of-bounds access on short input [+ + +]

Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Fri Feb 23 14:20:35 2024 +0100

    crypto: arm64/neonbs - fix out-of-bounds access on short input
    
    commit 1c0cf6d19690141002889d72622b90fc01562ce4 upstream.
    
    The bit-sliced implementation of AES-CTR operates on blocks of 128
    bytes, and will fall back to the plain NEON version for tail blocks or
    inputs that are shorter than 128 bytes to begin with.
    
    It will call straight into the plain NEON asm helper, which performs all
    memory accesses in granules of 16 bytes (the size of a NEON register).
    For this reason, the associated plain NEON glue code will copy inputs
    shorter than 16 bytes into a temporary buffer, given that this is a rare
    occurrence and it is not worth the effort to work around this in the asm
    code.
    
    The fallback from the bit-sliced NEON version fails to take this into
    account, potentially resulting in out-of-bounds accesses. So clone the
    same workaround, and use a temp buffer for short in/outputs.
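
    The temporary-buffer workaround can be illustrated with a small model.
    This is a sketch under stated assumptions, not the kernel glue code:
    xor_granules() is a hypothetical stand-in for a helper that needs at
    least one full 16-byte granule, and the XOR is a placeholder for the
    real cipher.

```python
GRANULE = 16  # bytes touched per access by the stand-in helper


def xor_granules(buf, key_byte):
    """Stand-in for a helper that needs at least one full 16-byte granule."""
    if len(buf) < GRANULE:
        raise ValueError("helper would access memory beyond the buffer")
    return bytes(b ^ key_byte for b in buf)


def process(data, key_byte):
    """Route inputs shorter than one granule through a temporary buffer."""
    n = len(data)
    if n >= GRANULE:
        return xor_granules(data, key_byte)
    tmp = bytearray(GRANULE)
    tmp[:n] = data  # copy the short input into a full-size buffer
    out = xor_granules(bytes(tmp), key_byte)
    return out[:n]  # hand back only the bytes the caller asked for
```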
    
    Fixes: fc074e130051 ("crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk")
    Cc: <stable@vger.kernel.org>
    Reported-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
    Reviewed-by: Eric Biggers <ebiggers@google.com>
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: dw-edma: Add HDMA remote interrupt configuration [+ + +]

Author: Kory Maincent <kory.maincent@bootlin.com>
Date:   Mon Jan 29 17:26:00 2024 +0100

    dmaengine: dw-edma: Add HDMA remote interrupt configuration
    
    [ Upstream commit e2f6a5789051ee9c632f27a12d0f01f0cbf78aac ]
    
    Only the local interrupt was configured; the remote interrupt was left
    out. Fix this by setting the stop and abort remote interrupts when
    the DW_EDMA_CHIP_LOCAL flag is not set.
    
    Fixes: e74c39573d35 ("dmaengine: dw-edma: Add support for native HDMA")
    Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
    Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
    Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20240129-b4-feature_hdma_mainline-v7-4-8e8c1acb7a46@bootlin.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: dw-edma: eDMA: Add sync read before starting the DMA transfer in remote setup [+ + +]

Author: Kory Maincent <kory.maincent@bootlin.com>
Date:   Mon Jan 29 17:26:02 2024 +0100

    dmaengine: dw-edma: eDMA: Add sync read before starting the DMA transfer in remote setup
    
    [ Upstream commit bbcc1c83f343e580c3aa1f2a8593343bf7b55bba ]
    
    The linked list elements and pointer are not stored in the same memory as
    the eDMA controller registers. If the doorbell register is toggled before
    the linked list has been fully written, a race condition occurs.
    In a remote setup, the only way to ensure the write has completed is to
    issue a readl() on that memory.
    
    Fixes: 7e4b8a4fbe2c ("dmaengine: Add Synopsys eDMA IP version 0 support")
    Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
    Link: https://lore.kernel.org/r/20240129-b4-feature_hdma_mainline-v7-6-8e8c1acb7a46@bootlin.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: dw-edma: Fix the ch_count hdma callback [+ + +]

Author: Kory Maincent <kory.maincent@bootlin.com>
Date:   Mon Jan 29 17:25:57 2024 +0100

    dmaengine: dw-edma: Fix the ch_count hdma callback
    
    [ Upstream commit cd665bfc757c71e9b7e0abff0f362d8abd38a805 ]
    
    The current check, which counts the ch_en registers that are set to
    determine the maximum number of available hardware channels, is wrong
    because all of them are unset at probe time. This register is only set in
    dw_hdma_v0_core_start(), which runs later, just before a DMA transfer.
    
    Unlike the eDMA IP, the HDMA IP has no way to report the number of
    hardware channels available, so set it to the maximum number of channels
    and let the platform set the right number.
    
    Fixes: e74c39573d35 ("dmaengine: dw-edma: Add support for native HDMA")
    Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
    Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
    Link: https://lore.kernel.org/r/20240129-b4-feature_hdma_mainline-v7-1-8e8c1acb7a46@bootlin.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: dw-edma: Fix wrong interrupt bit set for HDMA [+ + +]

Author: Kory Maincent <kory.maincent@bootlin.com>
Date:   Mon Jan 29 17:25:58 2024 +0100

    dmaengine: dw-edma: Fix wrong interrupt bit set for HDMA
    
    [ Upstream commit 7b52ba8616e978bf4f38f207f11a8176517244d0 ]
    
    Instead of setting HDMA_V0_LOCAL_ABORT_INT_EN bit, HDMA_V0_LOCAL_STOP_INT_EN
    bit got set twice, due to which the abort interrupt is not getting generated for
    HDMA. Fix it by setting the correct interrupt enable bit.
    
    Fixes: e74c39573d35 ("dmaengine: dw-edma: Add support for native HDMA")
    Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
    Link: https://lore.kernel.org/r/20240129-b4-feature_hdma_mainline-v7-2-8e8c1acb7a46@bootlin.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: dw-edma: HDMA: Add sync read before starting the DMA transfer in remote setup [+ + +]

Author: Kory Maincent <kory.maincent@bootlin.com>
Date:   Mon Jan 29 17:26:01 2024 +0100

    dmaengine: dw-edma: HDMA: Add sync read before starting the DMA transfer in remote setup
    
    [ Upstream commit 712a92a48158e02155b4b6b21e03a817f78c9b7e ]
    
    The linked list elements and pointer are not stored in the same memory as
    the HDMA controller registers. If the doorbell register is toggled before
    the linked list has been fully written, a race condition occurs.
    In a remote setup, the only way to ensure the write has completed is to
    issue a readl() on that memory.
    
    Fixes: e74c39573d35 ("dmaengine: dw-edma: Add support for native HDMA")
    Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
    Link: https://lore.kernel.org/r/20240129-b4-feature_hdma_mainline-v7-5-8e8c1acb7a46@bootlin.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: dw-edma: HDMA_V0_REMOTEL_STOP_INT_EN typo fix [+ + +]

Author: Kory Maincent <kory.maincent@bootlin.com>
Date:   Mon Jan 29 17:25:59 2024 +0100

    dmaengine: dw-edma: HDMA_V0_REMOTEL_STOP_INT_EN typo fix
    
    [ Upstream commit 930a8a015dcfde4b8906351ff081066dc277748c ]
    
    Fix "HDMA_V0_REMOTEL_STOP_INT_EN" typo error
    
    Fixes: e74c39573d35 ("dmaengine: dw-edma: Add support for native HDMA")
    Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
    Link: https://lore.kernel.org/r/20240129-b4-feature_hdma_mainline-v7-3-8e8c1acb7a46@bootlin.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: fsl-edma: correct calculation of 'nbytes' in multi-fifo scenario [+ + +]

Author: Joy Zou <joy.zou@nxp.com>
Date:   Wed Jan 31 11:33:18 2024 -0500

    dmaengine: fsl-edma: correct calculation of 'nbytes' in multi-fifo scenario
    
    commit 9ba17defd9edd87970b701085402bc8ecc3a11d4 upstream.
    
    The 'nbytes' should be equivalent to burst * width in audio multi-fifo
    setups. Given that the FIFO width is fixed at 32 bits, adjust the burst
    size for multi-fifo configurations to match the slave maxburst in the
    configuration.
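
    As a worked example of the arithmetic: with a 32-bit (4-byte) FIFO and a
    slave maxburst of 8, nbytes comes to 32. The helper name below is
    hypothetical and only mirrors the formula stated above.

```python
FIFO_WIDTH_BYTES = 4  # 32-bit FIFO, as stated in the commit message


def multi_fifo_nbytes(maxburst):
    """nbytes = burst * width, with the burst taken from the slave maxburst."""
    return maxburst * FIFO_WIDTH_BYTES
```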
    
    Cc: stable@vger.kernel.org
    Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support")
    Signed-off-by: Joy Zou <joy.zou@nxp.com>
    Signed-off-by: Frank Li <Frank.Li@nxp.com>
    Link: https://lore.kernel.org/r/20240131163318.360315-1-Frank.Li@nxp.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: fsl-qdma: fix SoC may hang on 16 byte unaligned read [+ + +]

Author: Peng Ma <peng.ma@nxp.com>
Date:   Thu Feb 1 16:50:07 2024 -0500

    dmaengine: fsl-qdma: fix SoC may hang on 16 byte unaligned read
    
    commit 9d739bccf261dd93ec1babf82f5c5d71dd4caa3e upstream.
    
    There is a chip (ls1028a) erratum:
    
    The SoC may hang on 16 byte unaligned read transactions by QDMA.
    
    Unaligned read transactions initiated by QDMA may stall in the NOC
    (Network On-Chip), causing a deadlock condition. Stalled transactions will
    trigger completion timeouts in PCIe controller.
    
    Workaround:
    Enable prefetch by setting the source descriptor prefetchable bit
    ( SD[PF] = 1 ).
    
    Implement this workaround.
    
    Cc: stable@vger.kernel.org
    Fixes: b092529e0aa0 ("dmaengine: fsl-qdma: Add qDMA controller driver for Layerscape SoCs")
    Signed-off-by: Peng Ma <peng.ma@nxp.com>
    Signed-off-by: Frank Li <Frank.Li@nxp.com>
    Link: https://lore.kernel.org/r/20240201215007.439503-1-Frank.Li@nxp.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: fsl-qdma: init irq after reg initialization [+ + +]

Author: Curtis Klein <curtis.klein@hpe.com>
Date:   Thu Feb 1 17:04:06 2024 -0500

    dmaengine: fsl-qdma: init irq after reg initialization
    
    commit 87a39071e0b639f45e05d296cc0538eef44ec0bd upstream.
    
    Initialize the qDMA irqs after the registers are configured so that
    interrupts that may have been pending from a primary kernel don't get
    processed by the irq handler before it is ready, which causes a panic
    with the following trace:
    
      Call trace:
       fsl_qdma_queue_handler+0xf8/0x3e8
       __handle_irq_event_percpu+0x78/0x2b0
       handle_irq_event_percpu+0x1c/0x68
       handle_irq_event+0x44/0x78
       handle_fasteoi_irq+0xc8/0x178
       generic_handle_irq+0x24/0x38
       __handle_domain_irq+0x90/0x100
       gic_handle_irq+0x5c/0xb8
       el1_irq+0xb8/0x180
       _raw_spin_unlock_irqrestore+0x14/0x40
       __setup_irq+0x4bc/0x798
       request_threaded_irq+0xd8/0x190
       devm_request_threaded_irq+0x74/0xe8
       fsl_qdma_probe+0x4d4/0xca8
       platform_drv_probe+0x50/0xa0
       really_probe+0xe0/0x3f8
       driver_probe_device+0x64/0x130
       device_driver_attach+0x6c/0x78
       __driver_attach+0xbc/0x158
       bus_for_each_dev+0x5c/0x98
       driver_attach+0x20/0x28
       bus_add_driver+0x158/0x220
       driver_register+0x60/0x110
       __platform_driver_register+0x44/0x50
       fsl_qdma_driver_init+0x18/0x20
       do_one_initcall+0x48/0x258
       kernel_init_freeable+0x1a4/0x23c
       kernel_init+0x10/0xf8
       ret_from_fork+0x10/0x18
    
    Cc: stable@vger.kernel.org
    Fixes: b092529e0aa0 ("dmaengine: fsl-qdma: Add qDMA controller driver for Layerscape SoCs")
    Signed-off-by: Curtis Klein <curtis.klein@hpe.com>
    Signed-off-by: Yi Zhao <yi.zhao@nxp.com>
    Signed-off-by: Frank Li <Frank.Li@nxp.com>
    Link: https://lore.kernel.org/r/20240201220406.440145-1-Frank.Li@nxp.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: idxd: Ensure safe user copy of completion record [+ + +]

Author: Fenghua Yu <fenghua.yu@intel.com>
Date:   Fri Feb 9 11:14:12 2024 -0800

    dmaengine: idxd: Ensure safe user copy of completion record
    
    [ Upstream commit d3ea125df37dc37972d581b74a5d3785c3f283ab ]
    
    If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
    event log cache to user triggers a kernel bug.
    
    [ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)!
    [ 1987.170845] ------------[ cut here ]------------
    [ 1987.176086] kernel BUG at mm/usercopy.c:102!
    [ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
    [ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5
    [ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
    [ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
    [ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
    [ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f
    [ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
    [ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000
    [ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff
    [ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff
    [ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a
    [ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899
    [ 1987.284710] FS:  0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000
    [ 1987.293850] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0
    [ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    [ 1987.324527] PKRU: 55555554
    [ 1987.327622] Call Trace:
    [ 1987.330424]  <TASK>
    [ 1987.332826]  ? show_regs+0x6e/0x80
    [ 1987.336703]  ? die+0x3c/0xa0
    [ 1987.339988]  ? do_trap+0xd4/0xf0
    [ 1987.343662]  ? do_error_trap+0x75/0xa0
    [ 1987.347922]  ? usercopy_abort+0x72/0x90
    [ 1987.352277]  ? exc_invalid_op+0x57/0x80
    [ 1987.356634]  ? usercopy_abort+0x72/0x90
    [ 1987.360988]  ? asm_exc_invalid_op+0x1f/0x30
    [ 1987.365734]  ? usercopy_abort+0x72/0x90
    [ 1987.370088]  __check_heap_object+0xb7/0xd0
    [ 1987.374739]  __check_object_size+0x175/0x2d0
    [ 1987.379588]  idxd_copy_cr+0xa9/0x130 [idxd]
    [ 1987.384341]  idxd_evl_fault_work+0x127/0x390 [idxd]
    [ 1987.389878]  process_one_work+0x13e/0x300
    [ 1987.394435]  ? __pfx_worker_thread+0x10/0x10
    [ 1987.399284]  worker_thread+0x2f7/0x420
    [ 1987.403544]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
    [ 1987.409171]  ? __pfx_worker_thread+0x10/0x10
    [ 1987.414019]  kthread+0x107/0x140
    [ 1987.417693]  ? __pfx_kthread+0x10/0x10
    [ 1987.421954]  ret_from_fork+0x3d/0x60
    [ 1987.426019]  ? __pfx_kthread+0x10/0x10
    [ 1987.430281]  ret_from_fork_asm+0x1b/0x30
    [ 1987.434744]  </TASK>
    
    The issue arises because event log cache is created using
    kmem_cache_create() which is not suitable for user copy.
    
    Fix the issue by creating event log cache with
    kmem_cache_create_usercopy(), ensuring safe user copy.
    
    Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log fault items")
    Reported-by: Tony Zhu <tony.zhu@intel.com>
    Tested-by: Tony Zhu <tony.zhu@intel.com>
    Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
    Reviewed-by: Lijun Pan <lijun.pan@intel.com>
    Reviewed-by: Dave Jiang <dave.jiang@intel.com>
    Link: https://lore.kernel.org/r/20240209191412.1050270-1-fenghua.yu@intel.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: idxd: Remove shadow Event Log head stored in idxd [+ + +]

Author: Fenghua Yu <fenghua.yu@intel.com>
Date:   Wed Feb 14 18:49:31 2024 -0800

    dmaengine: idxd: Remove shadow Event Log head stored in idxd
    
    [ Upstream commit ecec7c9f29a7114a3e23a14020b1149ea7dffb4f ]
    
    head is defined in idxd->evl as a shadow of head in the EVLSTATUS register.
    There are two issues related to the shadow head:
    
    1. Mismatch between the shadow head and the state of the EVLSTATUS
       register:
       If Event Log is supported, upon completion of the Enable Device command,
       the Event Log head in the variable idxd->evl->head should be cleared to
       match the state of the EVLSTATUS register. But the variable is not reset
       currently, leading to a mismatch between the variable and the register state.
       The mismatch causes incorrect processing of Event Log entries.
    
    2. Unnecessary shadow head definition:
       The shadow head is unnecessary as head can be read directly from the
       EVLSTATUS register. Reading head from the register incurs no additional
       cost because event log head and tail are always read together and
       tail is already read directly from the register as required by hardware.
    
    Remove the shadow Event Log head stored in idxd->evl to address the
    mentioned issues.
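
    The first issue can be reproduced in a toy model. This is a sketch only:
    the EventLog class and its fields are hypothetical stand-ins for the
    EVLSTATUS head and the software shadow, and wraparound is ignored.

```python
class EventLog:
    """Toy model: a hardware head register plus a software shadow copy."""

    def __init__(self):
        self.reg_head = 0     # stands in for the head field of EVLSTATUS
        self.shadow_head = 0  # software copy that can go stale

    def consume(self, count):
        """Process `count` entries and advance both heads."""
        self.reg_head += count
        self.shadow_head = self.reg_head

    def enable_device(self):
        """Enabling the device resets the register; the shadow is untouched."""
        self.reg_head = 0
```

    After consume(3) followed by enable_device(), the register head is 0
    while the shadow still holds 3. Reading the head from the register every
    time removes that stale state.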
    
    Fixes: 244da66cda35 ("dmaengine: idxd: setup event log configuration")
    Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
    Reviewed-by: Dave Jiang <dave.jiang@intel.com>
    Link: https://lore.kernel.org/r/20240215024931.1739621-1-fenghua.yu@intel.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: ptdma: use consistent DMA masks [+ + +]

Author: Tadeusz Struk <tstruk@gigaio.com>
Date:   Thu Feb 22 17:30:53 2024 +0100

    dmaengine: ptdma: use consistent DMA masks
    
    commit df2515a17914ecfc2a0594509deaf7fcb8d191ac upstream.
    
    The PTDMA driver sets DMA masks in two different places for the same
    device inconsistently. First call is in pt_pci_probe(), where it uses
    48bit mask. The second call is in pt_dmaengine_register(), where it
    uses a 64bit mask. Using 64bit dma mask causes IO_PAGE_FAULT errors
    on DMA transfers between main memory and other devices.
    Without the extra call it works fine. Additionally the second call
    doesn't check the return value so it can silently fail.
    Remove the superfluous dma_set_mask() call and only use 48bit mask.
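
    The difference between the two masks can be shown with a short
    calculation. dma_bit_mask() below mirrors what the kernel's
    DMA_BIT_MASK() macro computes; addressable() is a hypothetical helper,
    and the sketch only shows which addresses each mask admits, not how the
    reported faults arise.

```python
def dma_bit_mask(bits):
    """Return a mask with the low `bits` bits set."""
    return (1 << bits) - 1


def addressable(addr, bits):
    """True if addr fits entirely within a mask of the given width."""
    return (addr & ~dma_bit_mask(bits)) == 0
```

    An address such as 1 << 50 is accepted under a 64-bit mask but rejected
    under a 48-bit one.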
    
    Cc: stable@vger.kernel.org
    Fixes: b0b4a6b10577 ("dmaengine: ptdma: register PTDMA controller as a DMA resource")
    Reviewed-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
    Signed-off-by: Tadeusz Struk <tstruk@gigaio.com>
    Link: https://lore.kernel.org/r/20240222163053.13842-1-tstruk@gigaio.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drivers: perf: added capabilities for legacy PMU [+ + +]

Author: Vadim Shakirov <vadim.shakirov@syntacore.com>
Date:   Tue Feb 27 20:00:01 2024 +0300

    drivers: perf: added capabilities for legacy PMU
    
    [ Upstream commit 65730fe8f4fb039683d76fa8ea7e8d18a53c6cc6 ]
    
    Added the PERF_PMU_CAP_NO_INTERRUPT flag because the legacy pmu driver
    does not provide sampling capabilities
    
    Added the PERF_PMU_CAP_NO_EXCLUDE flag because the legacy pmu driver
    does not provide the ability to disable counter incrementation in
    different privilege modes
    
    Suggested-by: Atish Patra <atishp@rivosinc.com>
    Signed-off-by: Vadim Shakirov <vadim.shakirov@syntacore.com>
    Reviewed-by: Atish Patra <atishp@rivosinc.com>
    Fixes: 9b3e150e310e ("RISC-V: Add a simple platform driver for RISC-V legacy perf")
    Link: https://lore.kernel.org/r/20240227170002.188671-2-vadim.shakirov@syntacore.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drivers: perf: ctr_get_width function for legacy is not defined [+ + +]

Author: Vadim Shakirov <vadim.shakirov@syntacore.com>
Date:   Tue Feb 27 20:00:02 2024 +0300

    drivers: perf: ctr_get_width function for legacy is not defined
    
    [ Upstream commit 682dc133f83e0194796e6ea72eb642df1c03dfbe ]
    
    With CONFIG_RISCV_PMU_LEGACY=y and CONFIG_RISCV_PMU_SBI=n, the Linux
    kernel crashes when running perf record:
    
    $ perf record ls
    [ 46.749286] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    [ 46.750199] Oops [#1]
    [ 46.750342] Modules linked in:
    [ 46.750608] CPU: 0 PID: 107 Comm: perf-exec Not tainted 6.6.0 #2
    [ 46.750906] Hardware name: riscv-virtio,qemu (DT)
    [ 46.751184] epc : 0x0
    [ 46.751430] ra : arch_perf_update_userpage+0x54/0x13e
    [ 46.751680] epc : 0000000000000000 ra : ffffffff8072ee52 sp : ff2000000022b8f0
    [ 46.751958] gp : ffffffff81505988 tp : ff6000000290d400 t0 : ff2000000022b9c0
    [ 46.752229] t1 : 0000000000000001 t2 : 0000000000000003 s0 : ff2000000022b930
    [ 46.752451] s1 : ff600000028fb000 a0 : 0000000000000000 a1 : ff600000028fb000
    [ 46.752673] a2 : 0000000ae2751268 a3 : 00000000004fb708 a4 : 0000000000000004
    [ 46.752895] a5 : 0000000000000000 a6 : 000000000017ffe3 a7 : 00000000000000d2
    [ 46.753117] s2 : ff600000028fb000 s3 : 0000000ae2751268 s4 : 0000000000000000
    [ 46.753338] s5 : ffffffff8153e290 s6 : ff600000863b9000 s7 : ff60000002961078
    [ 46.753562] s8 : ff60000002961048 s9 : ff60000002961058 s10: 0000000000000001
    [ 46.753783] s11: 0000000000000018 t3 : ffffffffffffffff t4 : ffffffffffffffff
    [ 46.754005] t5 : ff6000000292270c t6 : ff2000000022bb30
    [ 46.754179] status: 0000000200000100 badaddr: 0000000000000000 cause: 000000000000000c
    [ 46.754653] Code: Unable to access instruction at 0xffffffffffffffec.
    [ 46.754939] ---[ end trace 0000000000000000 ]---
    [ 46.755131] note: perf-exec[107] exited with irqs disabled
    [ 46.755546] note: perf-exec[107] exited with preempt_count 4
    
    This happens because in the legacy case the ctr_get_width function was not
    defined, but it is used in arch_perf_update_userpage.
    
    Also remove the extra check in riscv_pmu_ctr_get_width_mask().
    
    Signed-off-by: Vadim Shakirov <vadim.shakirov@syntacore.com>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Reviewed-by: Atish Patra <atishp@rivosinc.com>
    Fixes: cc4c07c89aad ("drivers: perf: Implement perf event mmap support in the SBI backend")
    Link: https://lore.kernel.org/r/20240227170002.188671-3-vadim.shakirov@syntacore.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Add monitor patch for specific eDP [+ + +]

Author: Ryan Lin <tsung-hua.lin@amd.com>
Date:   Wed Feb 28 11:39:21 2024 -0700

    drm/amd/display: Add monitor patch for specific eDP
    
    commit b7cdccc6a849568775f738b1e233f751a8fed013 upstream.
    
    [WHY]
    Some eDP panels' ext caps don't write initial values. The value of
    dpcd_addr (0x317) can be random and the backlight control interface
    will be incorrect.
    
    [HOW]
    Add new panel patches to remove sink ext caps.
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org # 6.5.x
    Cc: Tsung-hua Lin <tsung-hua.lin@amd.com>
    Cc: Chris Chi <moukong.chi@amd.com>
    Reviewed-by: Wayne Lin <wayne.lin@amd.com>
    Acked-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Ryan Lin <tsung-hua.lin@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Prevent potential buffer overflow in map_hw_resources [+ + +]

Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Tue Feb 20 09:14:25 2024 +0530

    drm/amd/display: Prevent potential buffer overflow in map_hw_resources
    
    [ Upstream commit 0f8ca019544a252d1afb468ce840c6dcbac73af4 ]
    
    Adds a check in the map_hw_resources function to prevent a potential
    buffer overflow. The function was accessing arrays using an index that
    could potentially be greater than the size of the arrays, leading to a
    buffer overflow.
    
    Adds a check to ensure that the index is within the bounds of the
    arrays. If the index is out of bounds, an error message is printed and
    the loop breaks early, so execution continues while ignoring the extra
    data and the buffer overflow is prevented.
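
    The pattern of checking the index and breaking out early can be sketched
    as below. map_ids() and its arguments are hypothetical and do not
    reflect the actual dml2 data structures.

```python
def map_ids(src_ids, capacity):
    """Copy ids into a fixed-size table, ignoring entries beyond capacity."""
    table = [None] * capacity
    for i, value in enumerate(src_ids):
        if i >= capacity:
            # Report the problem and stop instead of writing past the end.
            print(f"index {i} out of bounds for table of size {capacity}")
            break
        table[i] = value
    return table
```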
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml2_wrapper.c:79 map_hw_resources() error: buffer overflow 'dml2->v20.scratch.dml_to_dc_pipe_mapping.disp_cfg_to_stream_id' 6 <= 7
    drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml2_wrapper.c:81 map_hw_resources() error: buffer overflow 'dml2->v20.scratch.dml_to_dc_pipe_mapping.disp_cfg_to_plane_id' 6 <= 7
    
    Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2")
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Roman Li <roman.li@amd.com>
    Cc: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Suggested-by: Roman Li <roman.li@amd.com>
    Reviewed-by: Roman Li <roman.li@amd.com>
    Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu/pm: Fix the power1_min_cap value [+ + +]

Author: Ma Jun <Jun.Ma2@amd.com>
Date:   Thu Feb 22 17:08:42 2024 +0800

    drm/amdgpu/pm: Fix the power1_min_cap value
    
    commit 7968e9748fbbd7ae49770d9f8a8231d8bce2aebb upstream.
    
    It's unreasonable to use 0 as the power1_min_cap when
    OD is disabled. So, use the same lower limit as the value
    used when OD is enabled.
    
    Fixes: 1958946858a6 ("drm/amd/pm: Support for getting power1_cap_min value")
    Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/buddy: fix range bias [+ + +]

Author: Matthew Auld <matthew.auld@intel.com>
Date:   Mon Feb 19 12:18:52 2024 +0000

    drm/buddy: fix range bias
    
    commit f41900e4a6ef019d64a70394b0e0c3bd048d4ec8 upstream.
    
    There is a corner case here where start/end is after/before the block
    range we are currently checking. If so we need to be sure that splitting
    the block will eventually give us the block size we need. To do that we
    should adjust the block range to account for the start/end, and only
    continue with the split if the size/alignment will fit the requested
    size. Not doing so can result in leaving split blocks unmerged when it
    eventually fails.
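
    One way to express such a check is sketched below. The function names
    and the use of inclusive end offsets are assumptions of this sketch; it
    illustrates the idea of clamping the block to the requested range and
    testing whether an aligned chunk of the requested size still fits, and
    is not the drm_buddy implementation.

```python
def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def round_down(value, align):
    """Round value down to the previous multiple of align."""
    return value // align * align


def split_can_fit(block_start, block_end, start, end, req_size):
    """Return True if splitting the block can yield an aligned req_size chunk.

    block_end and end are inclusive offsets.
    """
    adjusted_start = max(block_start, start)
    adjusted_end = min(block_end, end)
    return round_down(adjusted_end + 1, req_size) > round_up(adjusted_start, req_size)
```

    For a block covering 0..4095 and a requested range starting at 1024, a
    4096-byte request cannot fit, so the block is not worth splitting, while
    a 1024-byte request can.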
    
    Fixes: afea229fe102 ("drm: improve drm_buddy_alloc function")
    Signed-off-by: Matthew Auld <matthew.auld@intel.com>
    Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: <stable@vger.kernel.org> # v5.18+
    Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240219121851.25774-4-matthew.auld@intel.com
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau: don't fini scheduler before entity flush [+ + +]

Author: Danilo Krummrich <dakr@redhat.com>
Date:   Mon Mar 4 18:01:46 2024 +0100

    drm/nouveau: don't fini scheduler before entity flush
    
    This bug is present in v6.7 only, since the scheduler design has been
    re-worked in v6.8.
    
    Client scheduler entities must be flushed before an associated GPU
    scheduler is torn down. Otherwise the entity might still hold a
    pointer to the scheduler's runqueue, which is already freed at scheduler
    teardown.
    
    [  305.224293] ==================================================================
    [  305.224297] BUG: KASAN: slab-use-after-free in drm_sched_entity_flush+0x6c4/0x7b0 [gpu_sched]
    [  305.224310] Read of size 8 at addr ffff8881440a8f48 by task rmmod/4436
    
    [  305.224317] CPU: 10 PID: 4436 Comm: rmmod Tainted: G     U             6.7.6-100.fc38.x86_64+debug #1
    [  305.224321] Hardware name: Dell Inc. Precision 7550/01PXFR, BIOS 1.27.0 11/08/2023
    [  305.224324] Call Trace:
    [  305.224327]  <TASK>
    [  305.224329]  dump_stack_lvl+0x76/0xd0
    [  305.224336]  print_report+0xcf/0x670
    [  305.224342]  ? drm_sched_entity_flush+0x6c4/0x7b0 [gpu_sched]
    [  305.224352]  ? __virt_addr_valid+0x215/0x410
    [  305.224359]  ? drm_sched_entity_flush+0x6c4/0x7b0 [gpu_sched]
    [  305.224368]  kasan_report+0xa6/0xe0
    [  305.224373]  ? drm_sched_entity_flush+0x6c4/0x7b0 [gpu_sched]
    [  305.224385]  drm_sched_entity_flush+0x6c4/0x7b0 [gpu_sched]
    [  305.224395]  ? __pfx_drm_sched_entity_flush+0x10/0x10 [gpu_sched]
    [  305.224406]  ? rcu_is_watching+0x15/0xb0
    [  305.224413]  drm_sched_entity_destroy+0x17/0x20 [gpu_sched]
    [  305.224422]  nouveau_cli_fini+0x6c/0x120 [nouveau]
    [  305.224658]  nouveau_drm_device_fini+0x2ac/0x490 [nouveau]
    [  305.224871]  nouveau_drm_remove+0x18e/0x220 [nouveau]
    [  305.225082]  ? __pfx_nouveau_drm_remove+0x10/0x10 [nouveau]
    [  305.225290]  ? rcu_is_watching+0x15/0xb0
    [  305.225295]  ? _raw_spin_unlock_irqrestore+0x66/0x80
    [  305.225299]  ? trace_hardirqs_on+0x16/0x100
    [  305.225304]  ? _raw_spin_unlock_irqrestore+0x4f/0x80
    [  305.225310]  pci_device_remove+0xa3/0x1d0
    [  305.225316]  device_release_driver_internal+0x379/0x540
    [  305.225322]  driver_detach+0xc5/0x180
    [  305.225327]  bus_remove_driver+0x11e/0x2a0
    [  305.225333]  pci_unregister_driver+0x2a/0x250
    [  305.225339]  nouveau_drm_exit+0x1f/0x970 [nouveau]
    [  305.225548]  __do_sys_delete_module+0x350/0x580
    [  305.225554]  ? __pfx___do_sys_delete_module+0x10/0x10
    [  305.225562]  ? syscall_enter_from_user_mode+0x26/0x90
    [  305.225567]  ? rcu_is_watching+0x15/0xb0
    [  305.225571]  ? syscall_enter_from_user_mode+0x26/0x90
    [  305.225575]  ? trace_hardirqs_on+0x16/0x100
    [  305.225580]  do_syscall_64+0x61/0xe0
    [  305.225584]  ? rcu_is_watching+0x15/0xb0
    [  305.225587]  ? syscall_exit_to_user_mode+0x1f/0x50
    [  305.225592]  ? trace_hardirqs_on_prepare+0xe3/0x100
    [  305.225596]  ? do_syscall_64+0x70/0xe0
    [  305.225600]  ? trace_hardirqs_on_prepare+0xe3/0x100
    [  305.225604]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    [  305.225609] RIP: 0033:0x7f6148f3592b
    [  305.225650] Code: 73 01 c3 48 8b 0d dd 04 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ad 04 0c 00 f7 d8 64 89 01 48
    [  305.225653] RSP: 002b:00007ffe89986f08 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
    [  305.225659] RAX: ffffffffffffffda RBX: 000055cbb036e900 RCX: 00007f6148f3592b
    [  305.225662] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055cbb036e968
    [  305.225664] RBP: 00007ffe89986f30 R08: 1999999999999999 R09: 0000000000000000
    [  305.225667] R10: 00007f6148fa6ac0 R11: 0000000000000206 R12: 0000000000000000
    [  305.225670] R13: 00007ffe89987190 R14: 000055cbb036e900 R15: 0000000000000000
    [  305.225678]  </TASK>
    
    [  305.225683] Allocated by task 484:
    [  305.225685]  kasan_save_stack+0x33/0x60
    [  305.225690]  kasan_set_track+0x25/0x30
    [  305.225693]  __kasan_kmalloc+0x8f/0xa0
    [  305.225696]  drm_sched_init+0x3c7/0xce0 [gpu_sched]
    [  305.225705]  nouveau_sched_init+0xd2/0x110 [nouveau]
    [  305.225913]  nouveau_drm_device_init+0x130/0x3290 [nouveau]
    [  305.226121]  nouveau_drm_probe+0x1ab/0x6b0 [nouveau]
    [  305.226329]  local_pci_probe+0xda/0x190
    [  305.226333]  pci_device_probe+0x23a/0x780
    [  305.226337]  really_probe+0x3df/0xb80
    [  305.226341]  __driver_probe_device+0x18c/0x450
    [  305.226345]  driver_probe_device+0x4a/0x120
    [  305.226348]  __driver_attach+0x1e5/0x4a0
    [  305.226351]  bus_for_each_dev+0x106/0x190
    [  305.226355]  bus_add_driver+0x2a1/0x570
    [  305.226358]  driver_register+0x134/0x460
    [  305.226361]  do_one_initcall+0xd3/0x430
    [  305.226366]  do_init_module+0x238/0x770
    [  305.226370]  load_module+0x5581/0x6f10
    [  305.226374]  __do_sys_init_module+0x1f2/0x220
    [  305.226377]  do_syscall_64+0x61/0xe0
    [  305.226381]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
    [  305.226387] Freed by task 4436:
    [  305.226389]  kasan_save_stack+0x33/0x60
    [  305.226392]  kasan_set_track+0x25/0x30
    [  305.226396]  kasan_save_free_info+0x2b/0x50
    [  305.226399]  __kasan_slab_free+0x10b/0x1a0
    [  305.226402]  slab_free_freelist_hook+0x12b/0x1e0
    [  305.226406]  __kmem_cache_free+0xd4/0x1d0
    [  305.226410]  drm_sched_fini+0x178/0x320 [gpu_sched]
    [  305.226418]  nouveau_drm_device_fini+0x2a0/0x490 [nouveau]
    [  305.226624]  nouveau_drm_remove+0x18e/0x220 [nouveau]
    [  305.226832]  pci_device_remove+0xa3/0x1d0
    [  305.226836]  device_release_driver_internal+0x379/0x540
    [  305.226840]  driver_detach+0xc5/0x180
    [  305.226843]  bus_remove_driver+0x11e/0x2a0
    [  305.226847]  pci_unregister_driver+0x2a/0x250
    [  305.226850]  nouveau_drm_exit+0x1f/0x970 [nouveau]
    [  305.227056]  __do_sys_delete_module+0x350/0x580
    [  305.227060]  do_syscall_64+0x61/0xe0
    [  305.227064]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
    [  305.227070] The buggy address belongs to the object at ffff8881440a8f00
                    which belongs to the cache kmalloc-128 of size 128
    [  305.227073] The buggy address is located 72 bytes inside of
                    freed 128-byte region [ffff8881440a8f00, ffff8881440a8f80)
    
    [  305.227078] The buggy address belongs to the physical page:
    [  305.227081] page:00000000627efa0a refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1440a8
    [  305.227085] head:00000000627efa0a order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0
    [  305.227088] flags: 0x17ffffc0000840(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
    [  305.227093] page_type: 0xffffffff()
    [  305.227097] raw: 0017ffffc0000840 ffff8881000428c0 ffffea0005b33500 dead000000000002
    [  305.227100] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
    [  305.227102] page dumped because: kasan: bad access detected
    
    [  305.227106] Memory state around the buggy address:
    [  305.227109]  ffff8881440a8e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [  305.227112]  ffff8881440a8e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [  305.227114] >ffff8881440a8f00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [  305.227117]                                               ^
    [  305.227120]  ffff8881440a8f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [  305.227122]  ffff8881440a9000: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
    [  305.227125] ==================================================================
    
    Cc: <stable@vger.kernel.org> # v6.7 only
    Reported-by: Karol Herbst <kherbst@redhat.com>
    Closes: https://gist.githubusercontent.com/karolherbst/a20eb0f937a06ed6aabe2ac2ca3d11b5/raw/9cd8b1dc5894872d0eeebbee3dd0fdd28bb576bc/gistfile1.txt
    Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
    Signed-off-by: Danilo Krummrich <dakr@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
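
The ordering rule from this entry (flush the entities first, tear the scheduler down afterwards) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical and none of it is the drm_sched or nouveau API. The scheduler owns a heap-allocated runqueue, and the entity keeps a borrowed pointer to it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the drm_sched structures. */
struct toy_rq { int pending; };
struct toy_sched { struct toy_rq *rq; };    /* owns the runqueue */
struct toy_entity { struct toy_rq *rq; };   /* borrows the runqueue */

static int toy_sched_init(struct toy_sched *s)
{
    s->rq = calloc(1, sizeof(*s->rq));
    return s->rq ? 0 : -1;
}

static void toy_entity_init(struct toy_entity *e, struct toy_sched *s)
{
    e->rq = s->rq;
}

/* Reads the runqueue, so it must run while the scheduler is still alive. */
static int toy_entity_flush(struct toy_entity *e)
{
    int drained;

    assert(e->rq != NULL);
    drained = e->rq->pending;
    e->rq->pending = 0;
    e->rq = NULL;               /* drop the borrowed pointer */
    return drained;
}

/* Frees the runqueue; no entity may touch it afterwards. */
static void toy_sched_fini(struct toy_sched *s)
{
    free(s->rq);
    s->rq = NULL;
}

/* Teardown in the safe order; returns the number of drained jobs or -1. */
static int toy_teardown_demo(int queued)
{
    struct toy_sched s;
    struct toy_entity e;
    int drained;

    if (toy_sched_init(&s))
        return -1;
    toy_entity_init(&e, &s);
    s.rq->pending = queued;

    drained = toy_entity_flush(&e);   /* 1. flush the entity */
    toy_sched_fini(&s);               /* 2. only then free the runqueue */
    return drained;
}
```

Swapping the two numbered calls in `toy_teardown_demo()` would make the flush read freed memory, which is the same class of access the KASAN report above flags.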

drm/nouveau: keep DMA buffers required for suspend/resume [+ + +]

Author: Sid Pranjale <sidpranjale127@protonmail.com>
Date:   Thu Feb 29 21:52:05 2024 +0530

    drm/nouveau: keep DMA buffers required for suspend/resume
    
    [ Upstream commit f6ecfdad359a01c7fd8a3bcfde3ef0acdf107e6e ]
    
    Nouveau deallocates a few buffers post GPU init which are required for GPU suspend/resume to function correctly.
    This is likely not as big an issue on systems where the NVGPU is the only GPU, but on multi-GPU setups it leads to a regression where the kernel module errors out and causes a system-wide rendering freeze.
    
    This commit addresses that regression by moving the two buffers required for suspend and resume to be deallocated at driver unload instead of post init.
    
    Fixes: 042b5f83841fb ("drm/nouveau: fix several DMA buffer leaks")
    Signed-off-by: Sid Pranjale <sidpranjale127@protonmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/tegra: Remove existing framebuffer only if we support display [+ + +]

Author: Thierry Reding <treding@nvidia.com>
Date:   Fri Feb 23 16:03:33 2024 +0100

    drm/tegra: Remove existing framebuffer only if we support display
    
    [ Upstream commit 86bf8cfda6d2a6720fa2e6e676c98f0882c9d3d7 ]
    
    Tegra DRM doesn't support display on Tegra234 and later, so make sure
    not to remove any existing framebuffers in that case.
    
    v2: - add comments explaining how this situation can come about
        - clear DRIVER_MODESET and DRIVER_ATOMIC feature bits
    
    Fixes: 6848c291a54f ("drm/aperture: Convert drivers to aperture interfaces")
    Signed-off-by: Thierry Reding <treding@nvidia.com>
    Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
    Signed-off-by: Robert Foss <rfoss@kernel.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240223150333.1401582-1-thierry.reding@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

efi/capsule-loader: fix incorrect allocation size [+ + +]

Author: Arnd Bergmann <arnd@arndb.de>
Date:   Mon Feb 12 12:24:40 2024 +0100

    efi/capsule-loader: fix incorrect allocation size
    
    [ Upstream commit fccfa646ef3628097d59f7d9c1a3e84d4b6bb45e ]
    
    gcc-14 notices that the allocation with sizeof(void *) on 32-bit architectures
    is not enough for a 64-bit phys_addr_t:
    
    drivers/firmware/efi/capsule-loader.c: In function 'efi_capsule_open':
    drivers/firmware/efi/capsule-loader.c:295:24: error: allocation of insufficient size '4' for type 'phys_addr_t' {aka 'long long unsigned int'} with size '8' [-Werror=alloc-size]
      295 |         cap_info->phys = kzalloc(sizeof(void *), GFP_KERNEL);
          |                        ^
    
    Use the correct type instead here.
    
    Fixes: f24c4d478013 ("efi/capsule-loader: Reinstate virtual capsule mapping")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
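
The size mismatch in this entry can be avoided by sizing the allocation from the pointee rather than from a pointer type. The following user-space sketch assumes a hypothetical `toy_phys_addr_t` that stands in for a 64-bit `phys_addr_t`, and uses `calloc()` in place of `kzalloc()`; it is not the capsule loader code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 64-bit phys_addr_t. */
typedef uint64_t toy_phys_addr_t;

/* Size of one slot when sized by the element type. */
static size_t toy_slot_size(void)
{
    toy_phys_addr_t *phys = NULL;

    return sizeof(*phys);       /* sizeof does not evaluate its operand */
}

/* Allocate n zeroed slots; sizeof(*phys) follows the pointee type. */
static toy_phys_addr_t *toy_alloc_phys(size_t n)
{
    toy_phys_addr_t *phys;

    assert(n > 0);
    phys = calloc(n, sizeof(*phys));
    return phys;
}
```

With `sizeof(void *)` the slot would be 4 bytes on a 32-bit build, too small for the 8-byte address; `sizeof(*phys)` stays correct if the element type ever changes.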

efivarfs: Request at most 512 bytes for variable names [+ + +]

Author: Tim Schumacher <timschumi@gmx.de>
Date:   Fri Jan 26 17:25:23 2024 +0100

    efivarfs: Request at most 512 bytes for variable names
    
    commit f45812cc23fb74bef62d4eb8a69fe7218f4b9f2a upstream.
    
    Work around a quirk in a few old (2011-ish) UEFI implementations, where
    a call to `GetNextVariableName` with a buffer size larger than 512 bytes
    will always return EFI_INVALID_PARAMETER.
    
    There is some lore around EFI variable names being up to 1024 bytes in
    size, but this has no basis in the UEFI specification, and the upper
    bounds are typically platform specific, and apply to the entire variable
    (name plus payload).
    
    Given that Linux does not permit creating files with names longer than
    NAME_MAX (255) bytes, 512 bytes (== 256 UTF-16 characters) is a
    reasonable limit.
    
    Cc: <stable@vger.kernel.org> # 6.1+
    Signed-off-by: Tim Schumacher <timschumi@gmx.de>
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
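
The "512 bytes == 256 UTF-16 characters" arithmetic from this entry, written out as a short sketch. The macro and function names are hypothetical and are not taken from efivarfs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A UTF-16 code unit is expected to occupy two bytes. */
static_assert(sizeof(uint16_t) == 2, "UTF-16 code units are two bytes");

#define TOY_NAME_BUF_BYTES 512u   /* limit named in the commit message */

/* Number of UTF-16 code units that fit in a buffer of `bytes` bytes. */
static size_t toy_utf16_units(size_t bytes)
{
    return bytes / sizeof(uint16_t);
}
```

`toy_utf16_units(TOY_NAME_BUF_BYTES)` evaluates to 256, which is enough for a NAME_MAX (255) character name.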

fbcon: always restore the old font data in fbcon_do_set_font() [+ + +]

Author: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Date:   Thu Feb 8 12:44:11 2024 +0100

    fbcon: always restore the old font data in fbcon_do_set_font()
    
    [ Upstream commit 00d6a284fcf3fad1b7e1b5bc3cd87cbfb60ce03f ]
    
    Commit a5a923038d70 (fbdev: fbcon: Properly revert changes when
    vc_resize() failed) started restoring old font data upon failure (of
    vc_resize()). But it does so only for user fonts. This means that the
    "system"/internal fonts are not restored at all. As a result, the very
    first call to fbcon_do_set_font() performs no restore at all upon
    failing vc_resize().
    
    This can be reproduced by Syzkaller to crash the system on the next
    invocation of font_get(). It's rather hard to hit the allocation failure
    in vc_resize() on the first font_set(), but not impossible. Esp. if
    fault injection is used to aid the execution/failure. It was
    demonstrated by Sirius:
      BUG: unable to handle page fault for address: fffffffffffffff8
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not-present page
      PGD cb7b067 P4D cb7b067 PUD cb7d067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 8007 Comm: poc Not tainted 6.7.0-g9d1694dc91ce #20
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      RIP: 0010:fbcon_get_font+0x229/0x800 drivers/video/fbdev/core/fbcon.c:2286
      Call Trace:
       <TASK>
       con_font_get drivers/tty/vt/vt.c:4558 [inline]
       con_font_op+0x1fc/0xf20 drivers/tty/vt/vt.c:4673
       vt_k_ioctl drivers/tty/vt/vt_ioctl.c:474 [inline]
       vt_ioctl+0x632/0x2ec0 drivers/tty/vt/vt_ioctl.c:752
       tty_ioctl+0x6f8/0x1570 drivers/tty/tty_io.c:2803
       vfs_ioctl fs/ioctl.c:51 [inline]
      ...
    
    So restore the font data in any case, not only for user fonts. Note the
    later 'if' is now protected by 'old_userfont' and not 'old_data' as the
    latter is always set now. (And it is supposed to be non-NULL. Otherwise
    we would see the bug above again.)
    
    Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
    Fixes: a5a923038d70 ("fbdev: fbcon: Properly revert changes when vc_resize() failed")
    Reported-and-tested-by: Ubisectech Sirius <bugreport@ubisectech.com>
    Cc: Ubisectech Sirius <bugreport@ubisectech.com>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: Helge Deller <deller@gmx.de>
    Cc: linux-fbdev@vger.kernel.org
    Cc: dri-devel@lists.freedesktop.org
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240208114411.14604-1-jirislaby@kernel.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fprobe: Fix to allocate entry_data_size buffer with rethook instances [+ + +]

Author: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Date:   Fri Mar 1 09:18:24 2024 +0900

    fprobe: Fix to allocate entry_data_size buffer with rethook instances
    
    commit 6572786006fa96ad2c35bb31757f1f861298093b upstream.
    
    Fix to allocate fprobe::entry_data_size buffer with rethook instances.
    If fprobe doesn't allocate entry_data_size buffer for each rethook instance,
    fprobe entry handler can cause a buffer overrun when storing entry data in
    entry handler.
    
    Link: https://lore.kernel.org/all/170920576727.107552.638161246679734051.stgit@devnote2/
    
    Reported-by: Jiri Olsa <olsajiri@gmail.com>
    Closes: https://lore.kernel.org/all/Zd9eBn2FTQzYyg7L@krava/
    Fixes: 4bbd93455659 ("kprobes: kretprobe scalability improvement")
    Cc: stable@vger.kernel.org
    Tested-by: Jiri Olsa <olsajiri@gmail.com>
    Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
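
The idea behind this fix, reserving the entry-data area together with each per-instance node, can be sketched in user space with a flexible array member. The `toy_*` names are hypothetical; this is not the fprobe or rethook code, only an illustration of sizing the allocation as header plus data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-instance node with a trailing entry-data area. */
struct toy_node {
    size_t data_size;
    unsigned char data[];       /* flexible array member */
};

/* Allocate the node and its entry-data buffer in one block. */
static struct toy_node *toy_node_alloc(size_t entry_data_size)
{
    struct toy_node *n = calloc(1, sizeof(*n) + entry_data_size);

    if (n)
        n->data_size = entry_data_size;
    return n;
}

/* Entry handler stand-in: stores len bytes and refuses to overrun. */
static int toy_store_entry_data(struct toy_node *n, const void *src, size_t len)
{
    assert(n != NULL);
    if (len > n->data_size)
        return -1;
    memcpy(n->data, src, len);
    return 0;
}
```

If the allocation were only `sizeof(*n)`, any store into `data` would land past the end of the block, which is the overrun the commit message describes.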

gpio: 74x164: Enable output pins after registers are reset [+ + +]

Author: Arturas Moskvinas <arturas.moskvinas@gmail.com>
Date:   Fri Mar 1 09:12:04 2024 +0200

    gpio: 74x164: Enable output pins after registers are reset
    
    [ Upstream commit 530b1dbd97846b110ea8a94c7cc903eca21786e5 ]
    
    Chip outputs are enabled[1] before actual reset is performed[2] which might
    cause pin output value to flip flop if previous pin value was set to 1.
    Fix that behavior by making sure chip is fully reset before all outputs are
    enabled.
    
    The flip-flop can be observed when the module is removed and inserted again
    and one of the pins was set to 1 before removal. The 100 microsecond flip is
    noticeable on an oscilloscope (100 kHz SPI bus).
    
    For a properly reset chip, output is enabled around 100 microseconds (on a
    100 kHz SPI bus) later during the probing process, so the behavioral change
    should be irrelevant.
    
    Fixes: 7ebc194d0fd4 (gpio: 74x164: Introduce 'enable-gpios' property)
    Link: https://elixir.bootlin.com/linux/v6.7.4/source/drivers/gpio/gpio-74x164.c#L130 [1]
    Link: https://elixir.bootlin.com/linux/v6.7.4/source/drivers/gpio/gpio-74x164.c#L150 [2]
    Signed-off-by: Arturas Moskvinas <arturas.moskvinas@gmail.com>
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

gpio: fix resource unwinding order in error path [+ + +]

Author: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Date:   Thu Feb 29 18:25:49 2024 +0100

    gpio: fix resource unwinding order in error path
    
    [ Upstream commit ec5c54a9d3c4f9c15e647b049fea401ee5258696 ]
    
    Hogs are added *after* ACPI so should be removed *before* in error path.
    
    Fixes: a411e81e61df ("gpiolib: add hogs support for machine code")
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
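
The rule in this entry (whatever was set up last is torn down first) is the usual goto-based unwinding pattern. The sketch below records each step in a small log so the order can be inspected. All names are hypothetical and the steps are placeholders, not the gpiolib functions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LOG_MAX 8

/* Records setup and teardown steps as a NUL-terminated string. */
struct toy_log {
    char steps[TOY_LOG_MAX];
    size_t n;
};

static void toy_record(struct toy_log *log, char step)
{
    assert(log->n < TOY_LOG_MAX - 1);
    log->steps[log->n++] = step;
    log->steps[log->n] = '\0';
}

/*
 * 'A'/'a' = ACPI add/remove, 'H'/'h' = hogs add/remove.
 * fail_at: 0 = success, 1 = hog setup fails, 2 = a later step fails.
 */
static int toy_add_chip(struct toy_log *log, int fail_at)
{
    toy_record(log, 'A');           /* ACPI is set up first */
    if (fail_at == 1)
        goto err_remove_acpi;       /* hogs were never added */
    toy_record(log, 'H');           /* hogs are added after ACPI */
    if (fail_at == 2)
        goto err_remove_hogs;
    return 0;

err_remove_hogs:
    toy_record(log, 'h');           /* added last, removed first */
err_remove_acpi:
    toy_record(log, 'a');
    return -1;
}
```

A late failure yields the log "AHha": the labels fall through in reverse setup order, so each label only undoes what had been set up when the jump was taken.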

gpiolib: Fix the error path order in gpiochip_add_data_with_key() [+ + +]

Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Wed Feb 21 21:28:46 2024 +0200

    gpiolib: Fix the error path order in gpiochip_add_data_with_key()
    
    [ Upstream commit e4aec4daa8c009057b5e063db1b7322252c92dc8 ]
    
    After shuffling the code, error path wasn't updated correctly.
    Fix it here.
    
    Fixes: 2f4133bb5f14 ("gpiolib: No need to call gpiochip_remove_pin_ranges() twice")
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

gpu: host1x: Skip reset assert on Tegra186 [+ + +]

Author: Mikko Perttunen <mperttunen@nvidia.com>
Date:   Thu Feb 22 03:05:16 2024 +0200

    gpu: host1x: Skip reset assert on Tegra186
    
    [ Upstream commit 1fa8d07ae1a5fa4e87de42c338e8fc27f46d8bb6 ]
    
    On Tegra186, secure world applications may need to access host1x
    during suspend/resume, and rely on the kernel to keep Host1x out
    of reset during the suspend cycle. As such, as a quirk,
    skip asserting Host1x's reset on Tegra186.
    
    We don't need to keep the clocks enabled, as BPMP ensures the clock
    stays on while Host1x is being used. On newer SoC's, the reset line
    is inaccessible, so there is no need for the quirk.
    
    Fixes: b7c00cdf6df5 ("gpu: host1x: Enable system suspend callbacks")
    Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
    Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Signed-off-by: Thierry Reding <treding@nvidia.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240222010517.1573931-1-cyndis@kapsi.fi
    Signed-off-by: Sasha Levin <sashal@kernel.org>

gtp: fix use-after-free and null-ptr-deref in gtp_newlink() [+ + +]

Author: Alexander Ofitserov <oficerovas@altlinux.org>
Date:   Wed Feb 28 14:47:03 2024 +0300

    gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
    
    commit 616d82c3cfa2a2146dd7e3ae47bda7e877ee549e upstream.
    
    The gtp_link_ops operations structure for the subsystem must be
    registered after registering the gtp_net_ops pernet operations structure.
    
    Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug:
    
    [ 1010.702740] gtp: GTP module unloaded
    [ 1010.715877] general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] SMP KASAN NOPTI
    [ 1010.715888] KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
    [ 1010.715895] CPU: 1 PID: 128616 Comm: a.out Not tainted 6.8.0-rc6-std-def-alt1 #1
    [ 1010.715899] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-alt1 04/01/2014
    [ 1010.715908] RIP: 0010:gtp_newlink+0x4d7/0x9c0 [gtp]
    [ 1010.715915] Code: 80 3c 02 00 0f 85 41 04 00 00 48 8b bb d8 05 00 00 e8 ed f6 ff ff 48 89 c2 48 89 c5 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 4f 04 00 00 4c 89 e2 4c 8b 6d 00 48 b8 00 00 00
    [ 1010.715920] RSP: 0018:ffff888020fbf180 EFLAGS: 00010203
    [ 1010.715929] RAX: dffffc0000000000 RBX: ffff88800399c000 RCX: 0000000000000000
    [ 1010.715933] RDX: 0000000000000001 RSI: ffffffff84805280 RDI: 0000000000000282
    [ 1010.715938] RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
    [ 1010.715942] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88800399cc80
    [ 1010.715947] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000400
    [ 1010.715953] FS:  00007fd1509ab5c0(0000) GS:ffff88805b300000(0000) knlGS:0000000000000000
    [ 1010.715958] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 1010.715962] CR2: 0000000000000000 CR3: 000000001c07a000 CR4: 0000000000750ee0
    [ 1010.715968] PKRU: 55555554
    [ 1010.715972] Call Trace:
    [ 1010.715985]  ? __die_body.cold+0x1a/0x1f
    [ 1010.715995]  ? die_addr+0x43/0x70
    [ 1010.716002]  ? exc_general_protection+0x199/0x2f0
    [ 1010.716016]  ? asm_exc_general_protection+0x1e/0x30
    [ 1010.716026]  ? gtp_newlink+0x4d7/0x9c0 [gtp]
    [ 1010.716034]  ? gtp_net_exit+0x150/0x150 [gtp]
    [ 1010.716042]  __rtnl_newlink+0x1063/0x1700
    [ 1010.716051]  ? rtnl_setlink+0x3c0/0x3c0
    [ 1010.716063]  ? is_bpf_text_address+0xc0/0x1f0
    [ 1010.716070]  ? kernel_text_address.part.0+0xbb/0xd0
    [ 1010.716076]  ? __kernel_text_address+0x56/0xa0
    [ 1010.716084]  ? unwind_get_return_address+0x5a/0xa0
    [ 1010.716091]  ? create_prof_cpu_mask+0x30/0x30
    [ 1010.716098]  ? arch_stack_walk+0x9e/0xf0
    [ 1010.716106]  ? stack_trace_save+0x91/0xd0
    [ 1010.716113]  ? stack_trace_consume_entry+0x170/0x170
    [ 1010.716121]  ? __lock_acquire+0x15c5/0x5380
    [ 1010.716139]  ? mark_held_locks+0x9e/0xe0
    [ 1010.716148]  ? kmem_cache_alloc_trace+0x35f/0x3c0
    [ 1010.716155]  ? __rtnl_newlink+0x1700/0x1700
    [ 1010.716160]  rtnl_newlink+0x69/0xa0
    [ 1010.716166]  rtnetlink_rcv_msg+0x43b/0xc50
    [ 1010.716172]  ? rtnl_fdb_dump+0x9f0/0x9f0
    [ 1010.716179]  ? lock_acquire+0x1fe/0x560
    [ 1010.716188]  ? netlink_deliver_tap+0x12f/0xd50
    [ 1010.716196]  netlink_rcv_skb+0x14d/0x440
    [ 1010.716202]  ? rtnl_fdb_dump+0x9f0/0x9f0
    [ 1010.716208]  ? netlink_ack+0xab0/0xab0
    [ 1010.716213]  ? netlink_deliver_tap+0x202/0xd50
    [ 1010.716220]  ? netlink_deliver_tap+0x218/0xd50
    [ 1010.716226]  ? __virt_addr_valid+0x30b/0x590
    [ 1010.716233]  netlink_unicast+0x54b/0x800
    [ 1010.716240]  ? netlink_attachskb+0x870/0x870
    [ 1010.716248]  ? __check_object_size+0x2de/0x3b0
    [ 1010.716254]  netlink_sendmsg+0x938/0xe40
    [ 1010.716261]  ? netlink_unicast+0x800/0x800
    [ 1010.716269]  ? __import_iovec+0x292/0x510
    [ 1010.716276]  ? netlink_unicast+0x800/0x800
    [ 1010.716284]  __sock_sendmsg+0x159/0x190
    [ 1010.716290]  ____sys_sendmsg+0x712/0x880
    [ 1010.716297]  ? sock_write_iter+0x3d0/0x3d0
    [ 1010.716304]  ? __ia32_sys_recvmmsg+0x270/0x270
    [ 1010.716309]  ? lock_acquire+0x1fe/0x560
    [ 1010.716315]  ? drain_array_locked+0x90/0x90
    [ 1010.716324]  ___sys_sendmsg+0xf8/0x170
    [ 1010.716331]  ? sendmsg_copy_msghdr+0x170/0x170
    [ 1010.716337]  ? lockdep_init_map_type+0x2c7/0x860
    [ 1010.716343]  ? lockdep_hardirqs_on_prepare+0x430/0x430
    [ 1010.716350]  ? debug_mutex_init+0x33/0x70
    [ 1010.716360]  ? percpu_counter_add_batch+0x8b/0x140
    [ 1010.716367]  ? lock_acquire+0x1fe/0x560
    [ 1010.716373]  ? find_held_lock+0x2c/0x110
    [ 1010.716384]  ? __fd_install+0x1b6/0x6f0
    [ 1010.716389]  ? lock_downgrade+0x810/0x810
    [ 1010.716396]  ? __fget_light+0x222/0x290
    [ 1010.716403]  __sys_sendmsg+0xea/0x1b0
    [ 1010.716409]  ? __sys_sendmsg_sock+0x40/0x40
    [ 1010.716419]  ? lockdep_hardirqs_on_prepare+0x2b3/0x430
    [ 1010.716425]  ? syscall_enter_from_user_mode+0x1d/0x60
    [ 1010.716432]  do_syscall_64+0x30/0x40
    [ 1010.716438]  entry_SYSCALL_64_after_hwframe+0x62/0xc7
    [ 1010.716444] RIP: 0033:0x7fd1508cbd49
    [ 1010.716452] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ef 70 0d 00 f7 d8 64 89 01 48
    [ 1010.716456] RSP: 002b:00007fff18872348 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
    [ 1010.716463] RAX: ffffffffffffffda RBX: 000055f72bf0eac0 RCX: 00007fd1508cbd49
    [ 1010.716468] RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000006
    [ 1010.716473] RBP: 00007fff18872360 R08: 00007fff18872360 R09: 00007fff18872360
    [ 1010.716478] R10: 00007fff18872360 R11: 0000000000000202 R12: 000055f72bf0e1b0
    [ 1010.716482] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    [ 1010.716491] Modules linked in: gtp(+) udp_tunnel ib_core uinput af_packet rfkill qrtr joydev hid_generic usbhid hid kvm_intel iTCO_wdt intel_pmc_bxt iTCO_vendor_support kvm snd_hda_codec_generic ledtrig_audio irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_intel nls_utf8 snd_intel_dspcfg nls_cp866 psmouse aesni_intel vfat crypto_simd fat cryptd glue_helper snd_hda_codec pcspkr snd_hda_core i2c_i801 snd_hwdep i2c_smbus xhci_pci snd_pcm lpc_ich xhci_pci_renesas xhci_hcd qemu_fw_cfg tiny_power_button button sch_fq_codel vboxvideo drm_vram_helper drm_ttm_helper ttm vboxsf vboxguest snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi snd_seq_device snd_timer snd soundcore msr fuse efi_pstore dm_mod ip_tables x_tables autofs4 virtio_gpu virtio_dma_buf drm_kms_helper cec rc_core drm virtio_rng virtio_scsi rng_core virtio_balloon virtio_blk virtio_net virtio_console net_failover failover ahci libahci libata evdev scsi_mod input_leds serio_raw virtio_pci intel_agp
    [ 1010.716674]  virtio_ring intel_gtt virtio [last unloaded: gtp]
    [ 1010.716693] ---[ end trace 04990a4ce61e174b ]---
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Alexander Ofitserov <oficerovas@altlinux.org>
    Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Link: https://lore.kernel.org/r/20240228114703.465107-1-oficerovas@altlinux.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
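
The registration order required by this entry can be modeled in user space: the link operations make the newlink callback reachable, and that callback relies on per-namespace state that only exists once the pernet operations have been registered. The names below are hypothetical stand-ins, not the gtp driver or the rtnetlink API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-namespace state, set up by the pernet registration. */
struct toy_pernet { int dev_count; };

static struct toy_pernet toy_pernet_storage;
static struct toy_pernet *toy_net_data;     /* NULL until pernet init ran */
static int toy_link_ops_registered;

static int toy_register_pernet(void)
{
    toy_pernet_storage.dev_count = 0;
    toy_net_data = &toy_pernet_storage;
    return 0;
}

static int toy_register_link_ops(void)
{
    toy_link_ops_registered = 1;
    return 0;
}

/* newlink stand-in: callable as soon as the link ops are registered. */
static int toy_newlink(void)
{
    if (!toy_link_ops_registered)
        return -1;
    assert(toy_net_data != NULL);   /* holds only with the fixed order */
    return ++toy_net_data->dev_count;
}

/* Init in the fixed order: per-net state first, link ops last. */
static int toy_module_init(void)
{
    int err = toy_register_pernet();

    if (err)
        return err;
    return toy_register_link_ops();
}
```

If `toy_register_link_ops()` ran first, a `toy_newlink()` call arriving in between would dereference per-net state that does not exist yet, which mirrors the null-ptr-deref in the trace above.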

ice: fix connection state of DPLL and out pin [+ + +]

Author: Yochai Hagvi <yochai.hagvi@intel.com>
Date:   Thu Jan 25 15:40:55 2024 +0200

    ice: fix connection state of DPLL and out pin
    
    [ Upstream commit e8335ef57c6816d81b24173ba88cc9b3f043687f ]
    
    Fix the connection state between source DPLL and output pin, updating the
    attribute 'state' of 'parent_device'. Previously, the connection state
    was broken, and didn't reflect the correct state.
    
    When 'state_on_dpll_set' is called with the value
    'DPLL_PIN_STATE_CONNECTED' (1), the output pin will switch to the given
    DPLL, and the state of the given DPLL will be set to connected.
    E.g.:
            --do pin-set --json '{"id":2, "parent-device":{"parent-id":1,
                                                           "state": 1 }}'
    This command will connect DPLL device with id 1 to output pin with id 2.
    
    When 'state_on_dpll_set' is called with the value
    'DPLL_PIN_STATE_DISCONNECTED' (2) and the given DPLL is currently
    connected, then the output pin will be disabled.
    E.g.:
            --do pin-set --json '{"id":2, "parent-device":{"parent-id":1,
                                                           "state": 2 }}'
    This command will disable output pin with id 2 if DPLL device with ID 1 is
    connected to it; otherwise, the command is ignored.
    
    Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu")
    Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
    Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
    Signed-off-by: Yochai Hagvi <yochai.hagvi@intel.com>
    Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: fix dpll and dpll_pin data access on PF reset [+ + +]

Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Date:   Fri Feb 9 22:24:30 2024 +0100

    ice: fix dpll and dpll_pin data access on PF reset
    
    [ Upstream commit fc7fd1a10a9d2d38378b42e9a508da4c68018453 ]
    
    Do not allow acquiring data or altering the configuration of dpll and
    pins through firmware while a PF reset is in progress. Doing so would
    cause confusing netlink extack errors, as the firmware cannot respond
    to or process the request properly during the reset.
    
    Return -EBUSY and an extack error to the user who tries to access or
    modify the config of dpll/pin through firmware during the reset.
    
    The PF reset and kernel access to dpll data are both asynchronous. It is
    not possible to guard all the possible reset paths with any deterministic
    approach. I.e., it is possible that reset starts after reset check is
    performed (or if the reset would be checked after mutex is locked), but at
    the same time it is not possible to wait for dpll mutex unlock in the
    reset flow.
    This is a best-effort solution to at least give a clue to the user
    what is happening in most of the cases, knowing that there are possible
    race conditions where the user could see a different error received
    from firmware due to reset unexpectedly starting.
    
    Test by looping execution of below steps until netlink error appears:
    - perform PF reset
    $ echo 1 > /sys/class/net/<ice PF>/device/reset
    - i.e. try to alter/read dpll/pin config:
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
            --dump pin-get
    
    Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu")
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
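
The best-effort guard described in this entry amounts to an early check that returns -EBUSY with a readable message before any firmware access. The sketch below assumes hypothetical names (`toy_pf`, `toy_pin_get`) and a plain string in place of a netlink extack; it is not the ice driver code, and like the real change it does not close the race with a reset that starts right after the check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; not the ice driver's structures. */
struct toy_pf {
    bool reset_in_progress;
    int phase_adjust;
};

/*
 * Read a pin value unless a reset is running. On refusal, *msg points at
 * a short explanation (standing in for an extack message).
 */
static int toy_pin_get(const struct toy_pf *pf, int *val, const char **msg)
{
    assert(pf != NULL && val != NULL && msg != NULL);

    if (pf->reset_in_progress) {
        *msg = "PF reset in progress";
        return -EBUSY;
    }

    *msg = NULL;
    *val = pf->phase_adjust;    /* the firmware query would go here */
    return 0;
}
```

A caller that receives -EBUSY can retry once the reset has finished instead of interpreting an unrelated firmware error.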

ice: fix dpll input pin phase_adjust value updates [+ + +]

Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Date:   Thu Feb 8 23:56:31 2024 +0100

    ice: fix dpll input pin phase_adjust value updates
    
    [ Upstream commit 3b14430c65b4f510b2a310ca4f18ed6ca7184b00 ]
    
    The value of phase_adjust for input pin shall be updated in
    ice_dpll_pin_state_update(..). Fix by adding proper argument to the
    firmware query function call - a pin's struct field pointer where the
    phase_adjust value during driver runtime is stored.
    
    Previously the phase_adjust used to misinform user about actual
    phase_adjust value. I.e., if phase_adjust was set to a non zero value and
    if driver was reloaded, the user would see the value equal 0, which is
    not correct - the actual value is equal to value set before driver reload.
    
    Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks")
    Reviewed-by: Alan Brady <alan.brady@intel.com>
    Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: fix dpll periodic work data updates on PF reset [+ + +]

Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Date:   Fri Feb 9 22:24:31 2024 +0100

    ice: fix dpll periodic work data updates on PF reset
    
    [ Upstream commit 9a8385fe14bcb250a3889e744dc54e9c411d8400 ]
    
    Do not allow dpll periodic work function to acquire data from firmware
    if PF reset is in progress. Acquiring data will cause dmesg errors as the
    firmware cannot respond or process the request properly during the reset
    time.
    
    Test by looping execution of below step until dmesg error appears:
    - perform PF reset
    $ echo 1 > /sys/class/net/<ice PF>/device/reset
    
    Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu")
    Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
    Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: fix pin phase adjust updates on PF reset [+ + +]

Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Date:   Fri Feb 9 22:24:32 2024 +0100

    ice: fix pin phase adjust updates on PF reset
    
    [ Upstream commit ee89921da471edcb4b1e67f5bbfedddf39749782 ]
    
    Do not allow setting the phase adjust value for a pin while a PF reset is
    in progress. Doing so would cause confusing netlink extack errors, as the
    firmware cannot process the request properly during the reset.
    
    Return -EBUSY and report an extack error to a user who tries to configure
    pin phase adjust during the reset.
    
    Test by looping execution of below steps until netlink error appears:
    - perform PF reset
    $ echo 1 > /sys/class/net/<ice PF>/device/reset
    - change pin phase adjust value:
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
            --do pin-set --json '{"id":0, "phase-adjust":1000}'
    
    Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks")
    Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
    Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
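
The guard described in the entry above (refuse the request with -EBUSY and an explanatory message while a reset is running) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `demo_pf`, `demo_set_phase_adjust()` and the message text are hypothetical stand-ins, not the ice driver's structures, callbacks or extack strings.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for driver state; not the ice data structures. */
struct demo_pf {
	bool reset_in_progress;
	int phase_adjust;
};

/*
 * Reject the request early while a reset is in progress and tell the
 * caller why, instead of forwarding it to firmware that cannot answer.
 */
static int demo_set_phase_adjust(struct demo_pf *pf, int value,
				 char *errmsg, size_t errmsg_len)
{
	assert(pf && errmsg && errmsg_len > 0);

	if (pf->reset_in_progress) {
		snprintf(errmsg, errmsg_len, "PF reset in progress");
		return -EBUSY;
	}

	pf->phase_adjust = value;	/* stands in for the firmware request */
	errmsg[0] = '\0';
	return 0;
}
```

A caller that receives -EBUSY can simply retry once the reset has finished; the stored value is left untouched in the meantime.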

igb: extend PTP timestamp adjustments to i211 [+ + +]

Author: Oleksij Rempel <o.rempel@pengutronix.de>
Date:   Tue Feb 27 10:49:41 2024 -0800

    igb: extend PTP timestamp adjustments to i211
    
    [ Upstream commit 0bb7b09392eb74b152719ae87b1ba5e4bf910ef0 ]
    
    The i211 requires the same PTP timestamp adjustments as the i210,
    according to its datasheet. To ensure consistent timestamping across
    different platforms, this change extends the existing adjustments to
    include the i211.
    
    The adjustment results were tested and are comparable for i210- and
    i211-based systems.
    
    Fixes: 3f544d2a4d5c ("igb: adjust PTP timestamps for Tx/Rx latency")
    Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Link: https://lore.kernel.org/r/20240227184942.362710-1-anthony.l.nguyen@intel.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iommufd: Fix iopt_access_list_id overwrite bug [+ + +]

Author: Nicolin Chen <nicolinc@nvidia.com>
Date:   Thu Feb 22 13:23:45 2024 -0800

    iommufd: Fix iopt_access_list_id overwrite bug
    
    commit aeb004c0cd6958e910123a1607634401009c9539 upstream.
    
    Syzkaller reported the following WARN_ON:
      WARNING: CPU: 1 PID: 4738 at drivers/iommu/iommufd/io_pagetable.c:1360
    
      Call Trace:
       iommufd_access_change_ioas+0x2fe/0x4e0
       iommufd_access_destroy_object+0x50/0xb0
       iommufd_object_remove+0x2a3/0x490
       iommufd_object_destroy_user
       iommufd_access_destroy+0x71/0xb0
       iommufd_test_staccess_release+0x89/0xd0
       __fput+0x272/0xb50
       __fput_sync+0x4b/0x60
       __do_sys_close
       __se_sys_close
       __x64_sys_close+0x8b/0x110
       do_syscall_x64
    
    The mismatch between the access pointer in the list and the passed-in
    pointer results from an overwrite of access->iopt_access_list_id in
    iopt_add_access(), which is called from iommufd_access_change_ioas(),
    when xa_alloc() succeeds but iopt_calculate_iova_alignment() fails.
    
    Add a new_id in iopt_add_access() and only update iopt_access_list_id when
    returning successfully.
    
    Cc: stable@vger.kernel.org
    Fixes: 9227da7816dd ("iommufd: Add iommufd_access_change_ioas(_id) helpers")
    Link: https://lore.kernel.org/r/2dda7acb25b8562ec5f1310de828ef5da9ef509c.1708636627.git.nicolinc@nvidia.com
    Reported-by: Jason Gunthorpe <jgg@nvidia.com>
    Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
    Reviewed-by: Kevin Tian <kevin.tian@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
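
The fix in the entry above is an instance of a "commit only on success" pattern: the newly allocated ID is held in a local variable and copied into the object only after every later step has succeeded. A minimal userspace sketch follows, assuming hypothetical helpers (`demo_alloc_id()`, `demo_free_id()`) and a boolean that simulates the later failing step; it is not the iommufd code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct demo_access {
	unsigned int list_id;	/* plays the role of iopt_access_list_id */
};

/* Hypothetical ID allocator: always succeeds and hands out fresh IDs. */
static unsigned int demo_next_id = 1000;

static int demo_alloc_id(unsigned int *id)
{
	*id = demo_next_id++;
	return 0;
}

static void demo_free_id(unsigned int id)
{
	(void)id;	/* nothing to release in this sketch */
}

static int demo_add_access(struct demo_access *access, bool later_step_fails)
{
	unsigned int new_id;
	int rc;

	assert(access);

	rc = demo_alloc_id(&new_id);
	if (rc)
		return rc;

	if (later_step_fails) {
		/* Undo the allocation; access->list_id was never touched. */
		demo_free_id(new_id);
		return -EINVAL;
	}

	/* Every step succeeded: only now publish the new ID. */
	access->list_id = new_id;
	return 0;
}
```

Writing the allocator's result straight into `access->list_id` would leave a stale value behind on the failure path, which is the kind of mismatch the WARN_ON reported.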

iommufd: Fix protection fault in iommufd_test_syz_conv_iova [+ + +]

Author: Nicolin Chen <nicolinc@nvidia.com>
Date:   Thu Feb 22 13:23:47 2024 -0800

    iommufd: Fix protection fault in iommufd_test_syz_conv_iova
    
    commit cf7c2789822db8b5efa34f5ebcf1621bc0008d48 upstream.
    
    Syzkaller reported the following bug:
    
      general protection fault, probably for non-canonical address 0xdffffc0000000038: 0000 [#1] SMP KASAN
      KASAN: null-ptr-deref in range [0x00000000000001c0-0x00000000000001c7]
      Call Trace:
       lock_acquire
       lock_acquire+0x1ce/0x4f0
       down_read+0x93/0x4a0
       iommufd_test_syz_conv_iova+0x56/0x1f0
       iommufd_test_access_rw.isra.0+0x2ec/0x390
       iommufd_test+0x1058/0x1e30
       iommufd_fops_ioctl+0x381/0x510
       vfs_ioctl
       __do_sys_ioctl
       __se_sys_ioctl
       __x64_sys_ioctl+0x170/0x1e0
       do_syscall_x64
       do_syscall_64+0x71/0x140
    
    This is because the new iommufd_access_change_ioas() sets access->ioas to
    NULL during its process, so the lock might be gone in a concurrent racing
    context.
    
    Fix this by doing the same access->ioas sanity check as the
    iommufd_access_rw() and iommufd_access_pin_pages() functions do.
    
    Cc: stable@vger.kernel.org
    Fixes: 9227da7816dd ("iommufd: Add iommufd_access_change_ioas(_id) helpers")
    Link: https://lore.kernel.org/r/3f1932acaf1dd494d404c04364d73ce8f57f3e5e.1708636627.git.nicolinc@nvidia.com
    Reported-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
    Reviewed-by: Kevin Tian <kevin.tian@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: fix potential "struct net" leak in inet6_rtm_getaddr() [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Feb 22 12:17:47 2024 +0000

    ipv6: fix potential "struct net" leak in inet6_rtm_getaddr()
    
    [ Upstream commit 10bfd453da64a057bcfd1a49fb6b271c48653cdb ]
    
    It seems that if userspace provides a correct IFA_TARGET_NETNSID value
    but no IFA_ADDRESS and IFA_LOCAL attributes, inet6_rtm_getaddr()
    returns -EINVAL with an elevated "struct net" refcount.
    
    Fixes: 6ecf4c37eb3e ("ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: David Ahern <dsahern@kernel.org>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
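
The leak described in the entry above is the classic "early return after taking a reference" mistake. The sketch below shows the usual remedy, a single exit label that always drops the reference. The counter and the names `demo_get_net()`, `demo_put_net()` and `demo_getaddr()` are hypothetical, and the shape of the actual upstream patch is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for "struct net". */
struct demo_net {
	int refcount;
};

static struct demo_net *demo_get_net(struct demo_net *net)
{
	net->refcount++;
	return net;
}

static void demo_put_net(struct demo_net *net)
{
	assert(net->refcount > 0);
	net->refcount--;
}

static int demo_getaddr(struct demo_net *target, bool have_address)
{
	struct demo_net *net = demo_get_net(target);
	int err = 0;

	if (!have_address) {
		err = -EINVAL;
		goto errout;	/* a bare "return -EINVAL" here would leak net */
	}

	/* ... look up the address and build the reply ... */

errout:
	demo_put_net(net);
	return err;
}
```

Both the success path and the error path leave the reference count where it started, which is the property the fix restores.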

kbuild: Add -Wa,--fatal-warnings to as-instr invocation [+ + +]

Author: Nathan Chancellor <nathan@kernel.org>
Date:   Thu Jan 25 10:32:11 2024 -0700

    kbuild: Add -Wa,--fatal-warnings to as-instr invocation
    
    commit 0ee695a471a750cad4fff22286d91e038b1ef62f upstream.
    
    Certain assembler instruction tests may only induce warnings from the
    assembler on an unsupported instruction or option, which causes as-instr
    to succeed when it was expected to fail. Some tests work around this
    limitation by additionally testing that invalid input fails as expected.
    However, this is fragile if the assembler is changed to accept the
    invalid input, as it will cause the instruction/option to be treated as
    unsupported even when it is supported.
    
    Use '-Wa,--fatal-warnings' in the as-instr macro to turn these warnings
    into hard errors, which avoids this fragility and makes tests more
    robust and well formed.
    
    Cc: stable@vger.kernel.org
    Suggested-by: Eric Biggers <ebiggers@kernel.org>
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Tested-by: Eric Biggers <ebiggers@google.com>
    Tested-by: Andy Chiu <andybnac@gmail.com>
    Reviewed-by: Andy Chiu <andybnac@gmail.com>
    Tested-by: Conor Dooley <conor.dooley@microchip.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Acked-by: Masahiro Yamada <masahiroy@kernel.org>
    Link: https://lore.kernel.org/r/20240125-fix-riscv-option-arch-llvm-18-v1-1-390ac9cc3cd0@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM/VMX: Move VERW closer to VMentry for MDS mitigation [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Sun Mar 3 20:24:30 2024 -0800

    KVM/VMX: Move VERW closer to VMentry for MDS mitigation
    
    commit 43fb862de8f628c5db5e96831c915b9aebf62d33 upstream.
    
    During VMentry VERW is executed to mitigate MDS. After VERW, any memory
    access like register push onto stack may put host data in MDS affected
    CPU buffers. A guest can then use MDS to sample host data.
    
    Although the likelihood of secrets surviving in registers at the current
    VERW callsite is low, it can't be ruled out. Harden the MDS mitigation
    by moving the VERW mitigation late in the VMentry path.
    
    Note that VERW for MMIO Stale Data mitigation is unchanged because of
    the complexity of per-guest conditional VERW which is not easy to handle
    that late in asm with no GPRs available. If the CPU is also affected by
    MDS, VERW is unconditionally executed late in asm regardless of guest
    having MMIO access.
    
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Acked-by: Sean Christopherson <seanjc@google.com>
    Link: https://lore.kernel.org/all/20240213-delay-verw-v8-6-a6216d83edb7%40linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Sun Mar 3 20:24:24 2024 -0800

    KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
    
    From: Sean Christopherson <seanjc@google.com>
    
    commit 706a189dcf74d3b3f955e9384785e726ed6c7c80 upstream.
    
    Use EFLAGS.CF instead of EFLAGS.ZF to track whether to use VMRESUME versus
    VMLAUNCH.  Freeing up EFLAGS.ZF will allow doing VERW, which clobbers ZF,
    for MDS mitigations as late as possible without needing to duplicate VERW
    for both paths.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
    Link: https://lore.kernel.org/all/20240213-delay-verw-v8-5-a6216d83edb7%40linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

lan78xx: enable auto speed configuration for LAN7850 if no EEPROM is detected [+ + +]

Author: Oleksij Rempel <o.rempel@pengutronix.de>
Date:   Thu Feb 22 13:38:38 2024 +0100

    lan78xx: enable auto speed configuration for LAN7850 if no EEPROM is detected
    
    [ Upstream commit 0e67899abfbfdea0c3c0ed3fd263ffc601c5c157 ]
    
    Same as LAN7800, LAN7850 can be used without EEPROM. If EEPROM is not
    present or not flashed, LAN7850 will fail to sync the speed detected by the PHY
    with the MAC. In case link speed is 100Mbit, it will accidentally work,
    otherwise no data can be transferred.
    
    A better way would be to implement the link_up callback, or to set auto
    speed configuration unconditionally. But these changes would be more
    intrusive. So, for now, set it only if no EEPROM is found.
    
    Fixes: e69647a19c87 ("lan78xx: Set ASD in MAC_CR when EEE is enabled.")
    Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
    Link: https://lore.kernel.org/r/20240222123839.2816561-1-o.rempel@pengutronix.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

landlock: Fix asymmetric private inodes referring [+ + +]

Author: Mickaël Salaün <mic@digikod.net>
Date:   Mon Feb 19 20:03:45 2024 +0100

    landlock: Fix asymmetric private inodes referring
    
    commit d9818b3e906a0ee1ab02ea79e74a2f755fc5461a upstream.
    
    When linking or renaming a file, if only one of the source or
    destination directory is backed by an S_PRIVATE inode, then the related
    set of layer masks would be used uninitialized by
    is_access_to_paths_allowed().  This would result in nondeterministic
    access for one side instead of always being allowed.
    
    This bug could only be triggered with a mounted filesystem containing
    both S_PRIVATE and !S_PRIVATE inodes, which doesn't seem possible.
    
    The collect_domain_accesses() calls return early if
    is_nouser_or_private() returns true, which means that the directory's
    superblock has SB_NOUSER or its inode has S_PRIVATE.  Because rename or
    link actions are only allowed on the same mounted filesystem, the
    superblock is always the same for both source and destination
    directories.  However, it might be possible in theory to have an
    S_PRIVATE parent source inode with an !S_PRIVATE parent destination
    inode, or vice versa.
    
    To make sure this case is not an issue, explicitly initialize both sets
    of layer masks to 0, which means to allow all actions on the related
    side.  If at least one side has !S_PRIVATE, then
    collect_domain_accesses() and is_access_to_paths_allowed() check for the
    required access rights.
    
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: Günther Noack <gnoack@google.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Shervin Oloumi <enlightened@chromium.org>
    Cc: stable@vger.kernel.org
    Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
    Link: https://lore.kernel.org/r/20240219190345.2928627-1-mic@digikod.net
    Signed-off-by: Mickaël Salaün <mic@digikod.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux: Linux 6.7.9 [+ + +]

Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed Mar 6 14:54:01 2024 +0000

    Linux 6.7.9
    
    Link: https://lore.kernel.org/r/20240304211551.833500257@linuxfoundation.org
    Tested-by: SeongJae Park <sj@kernel.org>
    Tested-by: Luna Jernberg <droidbittin@gmail.com>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Tested-by: Ronald Warsow <rwarsow@gmx.de>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Link: https://lore.kernel.org/r/20240305074649.580820283@linuxfoundation.org
    Tested-by: Luna Jernberg <droidbittin@gmail.com>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Link: https://lore.kernel.org/r/20240305112824.448003471@linuxfoundation.org
    Tested-by: Luna Jernberg <droidbittin@gmail.com>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Ronald Warsow <rwarsow@gmx.de>
    Tested-by: Ricardo B. Marliere <ricardo@marliere.net>
    Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mfd: twl6030-irq: Revert to use of_match_device() [+ + +]

Author: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Date:   Sun Oct 29 13:48:43 2023 +0200

    mfd: twl6030-irq: Revert to use of_match_device()
    
    commit 7a29fa05aeca2c16193f00a883c56ffc7c25b6c5 upstream.
    
    The core twl chip is probed via i2c and the dev->driver->of_match_table is
    NULL, causing the driver to fail to probe.
    
    This partially reverts:
    
      commit 1e0c866887f4 ("mfd: Use device_get_match_data() in a bunch of drivers")
    
    Fixes: 1e0c866887f4 ("mfd: Use device_get_match_data() in a bunch of drivers")
    Signed-off-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
    Link: https://lore.kernel.org/r/20231029114843.15553-1-peter.ujfalusi@gmail.com
    Signed-off-by: Lee Jones <lee@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/debug_vm_pgtable: fix BUG_ON with pud advanced test [+ + +]

Author: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org>
Date:   Mon Jan 29 11:30:22 2024 +0530

    mm/debug_vm_pgtable: fix BUG_ON with pud advanced test
    
    commit 720da1e593b85a550593b415bf1d79a053133451 upstream.
    
    Architectures like powerpc add debug checks to ensure we find only devmap
    PUD pte entries.  These debug checks are only done with CONFIG_DEBUG_VM.
    This patch marks the ptes used for the PUD advanced test as devmap pte
    entries so that we don't hit debug checks on architectures like ppc64,
    as shown below.
    
    WARNING: CPU: 2 PID: 1 at arch/powerpc/mm/book3s64/radix_pgtable.c:1382 radix__pud_hugepage_update+0x38/0x138
    ....
    NIP [c0000000000a7004] radix__pud_hugepage_update+0x38/0x138
    LR [c0000000000a77a8] radix__pudp_huge_get_and_clear+0x28/0x60
    Call Trace:
    [c000000004a2f950] [c000000004a2f9a0] 0xc000000004a2f9a0 (unreliable)
    [c000000004a2f980] [000d34c100000000] 0xd34c100000000
    [c000000004a2f9a0] [c00000000206ba98] pud_advanced_tests+0x118/0x334
    [c000000004a2fa40] [c00000000206db34] debug_vm_pgtable+0xcbc/0x1c48
    [c000000004a2fc10] [c00000000000fd28] do_one_initcall+0x60/0x388
    
    Also
    
     kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:202!
     ....
    
     NIP [c000000000096510] pudp_huge_get_and_clear_full+0x98/0x174
     LR [c00000000206bb34] pud_advanced_tests+0x1b4/0x334
     Call Trace:
     [c000000004a2f950] [000d34c100000000] 0xd34c100000000 (unreliable)
     [c000000004a2f9a0] [c00000000206bb34] pud_advanced_tests+0x1b4/0x334
     [c000000004a2fa40] [c00000000206db34] debug_vm_pgtable+0xcbc/0x1c48
     [c000000004a2fc10] [c00000000000fd28] do_one_initcall+0x60/0x388
    
    Link: https://lkml.kernel.org/r/20240129060022.68044-1-aneesh.kumar@kernel.org
    Fixes: 27af67f35631 ("powerpc/book3s64/mm: enable transparent pud hugepage")
    Signed-off-by: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org>
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/vmscan: fix a bug calling wakeup_kswapd() with a wrong zone index [+ + +]

Author: Byungchul Park <byungchul@sk.com>
Date:   Fri Feb 16 20:15:02 2024 +0900

    mm/vmscan: fix a bug calling wakeup_kswapd() with a wrong zone index
    
    commit 2774f256e7c0219e2b0a0894af1c76bdabc4f974 upstream.
    
    With NUMA balancing on, the following oops has been observed on a NUMA
    system where a node has no local memory and therefore no managed zones.
    This happens because wakeup_kswapd() is called with a wrong zone index,
    -1.  Fix it by checking the index before calling wakeup_kswapd().
    
    > BUG: unable to handle page fault for address: 00000000000033f3
    > #PF: supervisor read access in kernel mode
    > #PF: error_code(0x0000) - not-present page
    > PGD 0 P4D 0
    > Oops: 0000 [#1] PREEMPT SMP NOPTI
    > CPU: 2 PID: 895 Comm: masim Not tainted 6.6.0-dirty #255
    > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
    >    rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    > RIP: 0010:wakeup_kswapd (./linux/mm/vmscan.c:7812)
    > Code: (omitted)
    > RSP: 0000:ffffc90004257d58 EFLAGS: 00010286
    > RAX: ffffffffffffffff RBX: ffff88883fff0480 RCX: 0000000000000003
    > RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88883fff0480
    > RBP: ffffffffffffffff R08: ff0003ffffffffff R09: ffffffffffffffff
    > R10: ffff888106c95540 R11: 0000000055555554 R12: 0000000000000003
    > R13: 0000000000000000 R14: 0000000000000000 R15: ffff88883fff0940
    > FS:  00007fc4b8124740(0000) GS:ffff888827c00000(0000) knlGS:0000000000000000
    > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    > CR2: 00000000000033f3 CR3: 000000026cc08004 CR4: 0000000000770ee0
    > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    > PKRU: 55555554
    > Call Trace:
    >  <TASK>
    > ? __die
    > ? page_fault_oops
    > ? __pte_offset_map_lock
    > ? exc_page_fault
    > ? asm_exc_page_fault
    > ? wakeup_kswapd
    > migrate_misplaced_page
    > __handle_mm_fault
    > handle_mm_fault
    > do_user_addr_fault
    > exc_page_fault
    > asm_exc_page_fault
    > RIP: 0033:0x55b897ba0808
    > Code: (omitted)
    > RSP: 002b:00007ffeefa821a0 EFLAGS: 00010287
    > RAX: 000055b89983acd0 RBX: 00007ffeefa823f8 RCX: 000055b89983acd0
    > RDX: 00007fc2f8122010 RSI: 0000000000020000 RDI: 000055b89983acd0
    > RBP: 00007ffeefa821a0 R08: 0000000000000037 R09: 0000000000000075
    > R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
    > R13: 00007ffeefa82410 R14: 000055b897ba5dd8 R15: 00007fc4b8340000
    >  </TASK>
    
    Link: https://lkml.kernel.org/r/20240216111502.79759-1-byungchul@sk.com
    Signed-off-by: Byungchul Park <byungchul@sk.com>
    Reported-by: Hyeongtak Ji <hyeongtak.ji@sk.com>
    Fixes: c574bbe917036 ("NUMA balancing: optimize page placement for memory tiering system")
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: "Huang, Ying" <ying.huang@intel.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
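
The check described in the entry above can be illustrated with a small userspace model: a search for the highest managed zone yields -1 on a memoryless node, and the caller must test for that before using the value as an index. All names here (`demo_node`, `demo_highest_managed_zone()`, `demo_try_wakeup()`) are hypothetical, and the model says nothing about how the kernel actually tracks zones.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_ZONES 4

/* Hypothetical model of a NUMA node; not the kernel's pg_data_t. */
struct demo_node {
	bool zone_managed[DEMO_MAX_ZONES];
	int wakeups;
};

/* Returns the highest managed zone index, or -1 if the node has none. */
static int demo_highest_managed_zone(const struct demo_node *node)
{
	for (int z = DEMO_MAX_ZONES - 1; z >= 0; z--) {
		if (node->zone_managed[z])
			return z;
	}
	return -1;
}

/* Stand-in for the wakeup call: only valid for a real zone index. */
static void demo_wakeup(struct demo_node *node, int zone_idx)
{
	assert(zone_idx >= 0 && zone_idx < DEMO_MAX_ZONES);
	node->wakeups++;
}

/* Returns true if a wakeup was issued, false if the node has no zones. */
static bool demo_try_wakeup(struct demo_node *node)
{
	int z = demo_highest_managed_zone(node);

	if (z < 0)		/* memoryless node: nothing to wake up for */
		return false;

	demo_wakeup(node, z);
	return true;
}
```

Without the `z < 0` test, the -1 would be passed on as an index, which corresponds to the out-of-range access seen in the oops.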

mm: cachestat: fix folio read-after-free in cache walk [+ + +]

Author: Nhat Pham <nphamcs@gmail.com>
Date:   Mon Feb 19 19:01:21 2024 -0800

    mm: cachestat: fix folio read-after-free in cache walk
    
    commit 3a75cb05d53f4a6823a32deb078de1366954a804 upstream.
    
    In cachestat, we access the folio from the page cache's xarray to compute
    its page offset, and check for its dirty and writeback flags.  However, we
    do not hold a reference to the folio before performing these actions,
    which means the folio can concurrently be released and reused as another
    folio/page/slab.
    
    Get around this altogether by just using xarray's existing machinery for
    the folio page offsets and dirty/writeback states.
    
    This changes behavior for tmpfs files to now always report zeroes in their
    dirty and writeback counters.  This is okay as tmpfs doesn't follow
    conventional writeback cache behavior: its pages get "cleaned" during
    swapout, after which they're no longer resident etc.
    
    Link: https://lkml.kernel.org/r/20240220153409.GA216065@cmpxchg.org
    Fixes: cf264e1329fb ("cachestat: implement cachestat syscall")
    Reported-by: Jann Horn <jannh@google.com>
    Suggested-by: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Nhat Pham <nphamcs@gmail.com>
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Tested-by: Jann Horn <jannh@google.com>
    Cc: <stable@vger.kernel.org>    [6.4+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: core: Fix eMMC initialization with 1-bit bus connection [+ + +]

Author: Ivan Semenov <ivan@semenov.dev>
Date:   Tue Feb 6 19:28:45 2024 +0200

    mmc: core: Fix eMMC initialization with 1-bit bus connection
    
    commit ff3206d2186d84e4f77e1378ba1d225633f17b9b upstream.
    
    Initializing an eMMC that's connected via a 1-bit bus is currently failing
    if the HW (DT) reports that a 4-bit bus is supported. This is in fact a
    regression, as we were earlier capable of falling back to 1-bit mode when
    switching to a 4/8-bit bus failed. Therefore, let's restore the behaviour.
    
    Log for Samsung eMMC 5.1 chip connected via 1bit bus (only D0 pin)
    Before patch:
    [134509.044225] mmc0: switch to bus width 4 failed
    [134509.044509] mmc0: new high speed MMC card at address 0001
    [134509.054594] mmcblk0: mmc0:0001 BGUF4R 29.1 GiB
    [134509.281602] mmc0: switch to bus width 4 failed
    [134509.282638] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    [134509.282657] Buffer I/O error on dev mmcblk0, logical block 0, async page read
    [134509.284598] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    [134509.284602] Buffer I/O error on dev mmcblk0, logical block 0, async page read
    [134509.284609] ldm_validate_partition_table(): Disk read failed.
    [134509.286495] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    [134509.286500] Buffer I/O error on dev mmcblk0, logical block 0, async page read
    [134509.288303] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    [134509.288308] Buffer I/O error on dev mmcblk0, logical block 0, async page read
    [134509.289540] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    [134509.289544] Buffer I/O error on dev mmcblk0, logical block 0, async page read
    [134509.289553]  mmcblk0: unable to read partition table
    [134509.289728] mmcblk0boot0: mmc0:0001 BGUF4R 31.9 MiB
    [134509.290283] mmcblk0boot1: mmc0:0001 BGUF4R 31.9 MiB
    [134509.294577] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
    [134509.295835] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    [134509.295841] Buffer I/O error on dev mmcblk0, logical block 0, async page read
    
    After patch:
    
    [134551.089613] mmc0: switch to bus width 4 failed
    [134551.090377] mmc0: new high speed MMC card at address 0001
    [134551.102271] mmcblk0: mmc0:0001 BGUF4R 29.1 GiB
    [134551.113365]  mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 p20 p21
    [134551.114262] mmcblk0boot0: mmc0:0001 BGUF4R 31.9 MiB
    [134551.114925] mmcblk0boot1: mmc0:0001 BGUF4R 31.9 MiB
    
    Fixes: 577fb13199b1 ("mmc: rework selection of bus speed mode")
    Cc: stable@vger.kernel.org
    Signed-off-by: Ivan Semenov <ivan@semenov.dev>
    Link: https://lore.kernel.org/r/20240206172845.34316-1-ivan@semenov.dev
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
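
The fallback behaviour restored by the entry above can be pictured as "try the widest advertised bus first, and keep 1-bit mode if every wider switch fails". The sketch below models this with a hypothetical host whose `wired_lines` field says how many data lines are physically connected. The names and the error code are assumptions for illustration and do not mirror the mmc core.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical host model: how many data lines are actually wired. */
struct demo_host {
	unsigned int wired_lines;
	unsigned int bus_width;
};

/* Switching fails when the board has fewer lines than requested. */
static int demo_switch_width(struct demo_host *host, unsigned int width)
{
	if (width > host->wired_lines)
		return -EIO;
	host->bus_width = width;
	return 0;
}

/* Try wide modes from widest to narrowest; fall back to 1-bit mode. */
static unsigned int demo_select_bus_width(struct demo_host *host,
					  unsigned int max_advertised)
{
	static const unsigned int widths[] = { 8, 4 };

	assert(host);

	for (size_t i = 0; i < sizeof(widths) / sizeof(widths[0]); i++) {
		if (widths[i] > max_advertised)
			continue;
		if (demo_switch_width(host, widths[i]) == 0)
			return widths[i];
	}

	/* No wide mode worked: stay usable on the single data line. */
	host->bus_width = 1;
	return 1;
}
```

With `wired_lines = 1` and an advertised maximum of 4, the function ends in 1-bit mode rather than reporting failure, which matches the "After patch" log where the partition table is read successfully.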

mmc: mmci: stm32: fix DMA API overlapping mappings warning [+ + +]

Author: Christophe Kerello <christophe.kerello@foss.st.com>
Date:   Wed Feb 7 15:39:51 2024 +0100

    mmc: mmci: stm32: fix DMA API overlapping mappings warning
    
    commit 6b1ba3f9040be5efc4396d86c9752cdc564730be upstream.
    
    Turning on CONFIG_DMA_API_DEBUG_SG results in the following warning:
    
    DMA-API: mmci-pl18x 48220000.mmc: cacheline tracking EEXIST,
    overlapping mappings aren't supported
    WARNING: CPU: 1 PID: 51 at kernel/dma/debug.c:568
    add_dma_entry+0x234/0x2f4
    Modules linked in:
    CPU: 1 PID: 51 Comm: kworker/1:2 Not tainted 6.1.28 #1
    Hardware name: STMicroelectronics STM32MP257F-EV1 Evaluation Board (DT)
    Workqueue: events_freezable mmc_rescan
    Call trace:
    add_dma_entry+0x234/0x2f4
    debug_dma_map_sg+0x198/0x350
    __dma_map_sg_attrs+0xa0/0x110
    dma_map_sg_attrs+0x10/0x2c
    sdmmc_idma_prep_data+0x80/0xc0
    mmci_prep_data+0x38/0x84
    mmci_start_data+0x108/0x2dc
    mmci_request+0xe4/0x190
    __mmc_start_request+0x68/0x140
    mmc_start_request+0x94/0xc0
    mmc_wait_for_req+0x70/0x100
    mmc_send_tuning+0x108/0x1ac
    sdmmc_execute_tuning+0x14c/0x210
    mmc_execute_tuning+0x48/0xec
    mmc_sd_init_uhs_card.part.0+0x208/0x464
    mmc_sd_init_card+0x318/0x89c
    mmc_attach_sd+0xe4/0x180
    mmc_rescan+0x244/0x320
    
    DMA API debug brings to light leaking dma-mappings as dma_map_sg and
    dma_unmap_sg are not correctly balanced.
    
    If an error occurs in mmci_cmd_irq function, only mmci_dma_error
    function is called and as this API is not managed on stm32 variant,
    dma_unmap_sg is never called in this error path.
    
    Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>
    Fixes: 46b723dd867d ("mmc: mmci: add stm32 sdmmc variant")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240207143951.938144-1-christophe.kerello@foss.st.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sdhci-xenon: add timeout for PHY init complete [+ + +]

Author: Elad Nachman <enachman@marvell.com>
Date:   Thu Feb 22 21:17:14 2024 +0200

    mmc: sdhci-xenon: add timeout for PHY init complete
    
    commit 09e23823ae9a3e2d5d20f2e1efe0d6e48cef9129 upstream.
    
    AC5X spec says PHY init complete bit must be polled until zero.
    We see cases in which timeout can take longer than the standard
    calculation on AC5X, which is expected following the spec comment above.
    According to the spec, we must wait as long as it takes for that bit to
    toggle on AC5X.
    Cap that with 100 delay loops so we won't get stuck forever.
    
    Fixes: 06c8b667ff5b ("mmc: sdhci-xenon: Add support to PHYs of Marvell Xenon SDHC")
    Acked-by: Adrian Hunter <adrian.hunter@intel.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Elad Nachman <enachman@marvell.com>
    Link: https://lore.kernel.org/r/20240222191714.1216470-3-enachman@marvell.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
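
The entry above describes polling a status bit until it reads zero, capped at 100 iterations so the loop cannot spin forever. A minimal sketch of such a bounded poll is shown below. The simulated device (`demo_phy`), the helper names and the choice of -ETIMEDOUT as the error code are assumptions for illustration; a real driver would read a hardware register and delay between reads.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_MAX_POLLS 100	/* cap taken from the commit description */

/* Hypothetical device: the busy bit clears on read number polls_until_done. */
struct demo_phy {
	unsigned int polls_until_done;
	unsigned int polls_seen;
};

/* Returns 1 while initialization is still running, 0 once it is done. */
static int demo_phy_init_bit(struct demo_phy *phy)
{
	phy->polls_seen++;
	return phy->polls_seen < phy->polls_until_done;
}

/* Poll until the bit reads zero, giving up after DEMO_MAX_POLLS reads. */
static int demo_wait_phy_init(struct demo_phy *phy)
{
	assert(phy);

	for (unsigned int i = 0; i < DEMO_MAX_POLLS; i++) {
		if (!demo_phy_init_bit(phy))
			return 0;
		/* a real driver would sleep or delay here between reads */
	}

	return -ETIMEDOUT;
}
```

The cap turns a potential hang into an error the caller can report, at the cost of having to pick a limit that is generous enough for slow hardware.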

mmc: sdhci-xenon: fix PHY init clock stability [+ + +]

Author: Elad Nachman <enachman@marvell.com>
Date:   Thu Feb 22 22:09:30 2024 +0200

    mmc: sdhci-xenon: fix PHY init clock stability
    
    commit 8e9f25a290ae0016353c9ea13314c95fb3207812 upstream.
    
    When the SD/MMC PHY is initialized, it sometimes fails to
    complete its initialization, which results in a timeout
    error. Per the HW spec, a stable SD clock is a prerequisite
    for attempting PHY initialization.
    
    Fixes: 06c8b667ff5b ("mmc: sdhci-xenon: Add support to PHYs of Marvell Xenon SDHC")
    Acked-by: Adrian Hunter <adrian.hunter@intel.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Elad Nachman <enachman@marvell.com>
    Link: https://lore.kernel.org/r/20240222200930.1277665-1-enachman@marvell.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: avoid printing warning once on client side [+ + +]

Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Fri Feb 23 17:14:13 2024 +0100

    mptcp: avoid printing warning once on client side
    
    commit 5b49c41ac8f27aa3a63a1712b1f54f91015c18f2 upstream.
    
    After the 'Fixes' commit mentioned below, the client side might print
    the following warning once when a subflow is fully established at the
    reception of any valid additional ack:
    
      MPTCP: bogus mpc option on established client sk
    
    That's a normal situation, and no warning should be printed for that. We
    can then skip the check when the label is used.
    
    Fixes: e4a0fa47e816 ("mptcp: corner case locking for rx path fields initialization")
    Cc: stable@vger.kernel.org
    Suggested-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-3-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: fix double-free on socket dismantle [+ + +]

Author: Davide Caratti <dcaratti@redhat.com>
Date:   Fri Feb 23 17:14:18 2024 +0100

    mptcp: fix double-free on socket dismantle
    
    commit 10048689def7e40a4405acda16fdc6477d4ecc5c upstream.
    
    When an MPTCP server accepts an incoming connection, it clones its listener
    socket. However, the pointer to 'inet_opt' for the new socket has the same
    value as the original one: as a consequence, on program exit it's possible
    to observe the following splat:
    
      BUG: KASAN: double-free in inet_sock_destruct+0x54f/0x8b0
      Free of addr ffff888485950880 by task swapper/25/0
    
      CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Not tainted 6.8.0-rc1+ #609
      Hardware name: Supermicro SYS-6027R-72RF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0  07/26/2013
      Call Trace:
       <IRQ>
       dump_stack_lvl+0x32/0x50
       print_report+0xca/0x620
       kasan_report_invalid_free+0x64/0x90
       __kasan_slab_free+0x1aa/0x1f0
       kfree+0xed/0x2e0
       inet_sock_destruct+0x54f/0x8b0
       __sk_destruct+0x48/0x5b0
       rcu_do_batch+0x34e/0xd90
       rcu_core+0x559/0xac0
       __do_softirq+0x183/0x5a4
       irq_exit_rcu+0x12d/0x170
       sysvec_apic_timer_interrupt+0x6b/0x80
       </IRQ>
       <TASK>
       asm_sysvec_apic_timer_interrupt+0x16/0x20
      RIP: 0010:cpuidle_enter_state+0x175/0x300
      Code: 30 00 0f 84 1f 01 00 00 83 e8 01 83 f8 ff 75 e5 48 83 c4 18 44 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc fb 45 85 ed <0f> 89 60 ff ff ff 48 c1 e5 06 48 c7 43 18 00 00 00 00 48 83 44 2b
      RSP: 0018:ffff888481cf7d90 EFLAGS: 00000202
      RAX: 0000000000000000 RBX: ffff88887facddc8 RCX: 0000000000000000
      RDX: 1ffff1110ff588b1 RSI: 0000000000000019 RDI: ffff88887fac4588
      RBP: 0000000000000004 R08: 0000000000000002 R09: 0000000000043080
      R10: 0009b02ea273363f R11: ffff88887fabf42b R12: ffffffff932592e0
      R13: 0000000000000004 R14: 0000000000000000 R15: 00000022c880ec80
       cpuidle_enter+0x4a/0xa0
       do_idle+0x310/0x410
       cpu_startup_entry+0x51/0x60
       start_secondary+0x211/0x270
       secondary_startup_64_no_verify+0x184/0x18b
       </TASK>
    
      Allocated by task 6853:
       kasan_save_stack+0x1c/0x40
       kasan_save_track+0x10/0x30
       __kasan_kmalloc+0xa6/0xb0
       __kmalloc+0x1eb/0x450
       cipso_v4_sock_setattr+0x96/0x360
       netlbl_sock_setattr+0x132/0x1f0
       selinux_netlbl_socket_post_create+0x6c/0x110
       selinux_socket_post_create+0x37b/0x7f0
       security_socket_post_create+0x63/0xb0
       __sock_create+0x305/0x450
       __sys_socket_create.part.23+0xbd/0x130
       __sys_socket+0x37/0xb0
       __x64_sys_socket+0x6f/0xb0
       do_syscall_64+0x83/0x160
       entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
      Freed by task 6858:
       kasan_save_stack+0x1c/0x40
       kasan_save_track+0x10/0x30
       kasan_save_free_info+0x3b/0x60
       __kasan_slab_free+0x12c/0x1f0
       kfree+0xed/0x2e0
       inet_sock_destruct+0x54f/0x8b0
       __sk_destruct+0x48/0x5b0
       subflow_ulp_release+0x1f0/0x250
       tcp_cleanup_ulp+0x6e/0x110
       tcp_v4_destroy_sock+0x5a/0x3a0
       inet_csk_destroy_sock+0x135/0x390
       tcp_fin+0x416/0x5c0
       tcp_data_queue+0x1bc8/0x4310
       tcp_rcv_state_process+0x15a3/0x47b0
       tcp_v4_do_rcv+0x2c1/0x990
       tcp_v4_rcv+0x41fb/0x5ed0
       ip_protocol_deliver_rcu+0x6d/0x9f0
       ip_local_deliver_finish+0x278/0x360
       ip_local_deliver+0x182/0x2c0
       ip_rcv+0xb5/0x1c0
       __netif_receive_skb_one_core+0x16e/0x1b0
       process_backlog+0x1e3/0x650
       __napi_poll+0xa6/0x500
       net_rx_action+0x740/0xbb0
       __do_softirq+0x183/0x5a4
    
      The buggy address belongs to the object at ffff888485950880
       which belongs to the cache kmalloc-64 of size 64
      The buggy address is located 0 bytes inside of
       64-byte region [ffff888485950880, ffff8884859508c0)
    
      The buggy address belongs to the physical page:
      page:0000000056d1e95e refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888485950700 pfn:0x485950
      flags: 0x57ffffc0000800(slab|node=1|zone=2|lastcpupid=0x1fffff)
      page_type: 0xffffffff()
      raw: 0057ffffc0000800 ffff88810004c640 ffffea00121b8ac0 dead000000000006
      raw: ffff888485950700 0000000000200019 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
    
      Memory state around the buggy address:
       ffff888485950780: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff888485950800: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      >ffff888485950880: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
                         ^
       ffff888485950900: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff888485950980: 00 00 00 00 00 01 fc fc fc fc fc fc fc fc fc fc
    
    Something similar (a refcount underflow) happens with CALIPSO/IPv6. Fix
    this by duplicating IP / IPv6 options after clone, so that
    ip{,6}_sock_destruct() doesn't end up freeing the same memory area twice.
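    
    The sharing problem and the fix can be modelled in userspace. This is
    only a sketch: 'toy_sock', 'toy_opt' and the helper names are
    hypothetical and do not correspond to the kernel structures or to the
    code in the patch.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a socket and its separately allocated options. */
struct toy_opt { size_t len; unsigned char data[8]; };
struct toy_sock { struct toy_opt *opt; };

/* Plain structure copy: the clone ends up pointing at the same options. */
static struct toy_sock toy_clone_shallow(const struct toy_sock *orig)
{
	return *orig;
}

/* Copy the structure, then give the clone its own copy of the options. */
static int toy_clone_dup(const struct toy_sock *orig, struct toy_sock *clone)
{
	*clone = *orig;
	if (orig->opt) {
		clone->opt = malloc(sizeof(*clone->opt));
		if (!clone->opt)
			return -ENOMEM;
		memcpy(clone->opt, orig->opt, sizeof(*clone->opt));
	}
	return 0;
}

/* Destructor frees the options; safe only if no other socket shares them. */
static void toy_destruct(struct toy_sock *sk)
{
	free(sk->opt);
	sk->opt = NULL;
}
```

    With toy_clone_shallow(), destructing both the listener and the clone
    would free the same pointer twice. With toy_clone_dup(), each socket
    owns its own allocation and both destructors are safe.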
    
    Fixes: cf7da0d66cc1 ("mptcp: Create SUBFLOW socket for incoming connections")
    Cc: stable@vger.kernel.org
    Signed-off-by: Davide Caratti <dcaratti@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-8-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: fix possible deadlock in subflow diag [+ + +]

Author: Paolo Abeni <pabeni@redhat.com>
Date:   Fri Feb 23 17:14:19 2024 +0100

    mptcp: fix possible deadlock in subflow diag
    
    commit d6a9608af9a75d13243d217f6ce1e30e57d56ffe upstream.
    
    Syzbot and Eric reported a lockdep splat in the subflow diag:
    
       WARNING: possible circular locking dependency detected
       6.8.0-rc4-syzkaller-00212-g40b9385dd8e6 #0 Not tainted
    
       syz-executor.2/24141 is trying to acquire lock:
       ffff888045870130 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at:
       tcp_diag_put_ulp net/ipv4/tcp_diag.c:100 [inline]
       ffff888045870130 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at:
       tcp_diag_get_aux+0x738/0x830 net/ipv4/tcp_diag.c:137
    
       but task is already holding lock:
       ffffc9000135e488 (&h->lhash2[i].lock){+.+.}-{2:2}, at: spin_lock
       include/linux/spinlock.h:351 [inline]
       ffffc9000135e488 (&h->lhash2[i].lock){+.+.}-{2:2}, at:
       inet_diag_dump_icsk+0x39f/0x1f80 net/ipv4/inet_diag.c:1038
    
       which lock already depends on the new lock.
    
       the existing dependency chain (in reverse order) is:
    
       -> #1 (&h->lhash2[i].lock){+.+.}-{2:2}:
       lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       __inet_hash+0x335/0xbe0 net/ipv4/inet_hashtables.c:743
       inet_csk_listen_start+0x23a/0x320 net/ipv4/inet_connection_sock.c:1261
       __inet_listen_sk+0x2a2/0x770 net/ipv4/af_inet.c:217
       inet_listen+0xa3/0x110 net/ipv4/af_inet.c:239
       rds_tcp_listen_init+0x3fd/0x5a0 net/rds/tcp_listen.c:316
       rds_tcp_init_net+0x141/0x320 net/rds/tcp.c:577
       ops_init+0x352/0x610 net/core/net_namespace.c:136
       __register_pernet_operations net/core/net_namespace.c:1214 [inline]
       register_pernet_operations+0x2cb/0x660 net/core/net_namespace.c:1283
       register_pernet_device+0x33/0x80 net/core/net_namespace.c:1370
       rds_tcp_init+0x62/0xd0 net/rds/tcp.c:735
       do_one_initcall+0x238/0x830 init/main.c:1236
       do_initcall_level+0x157/0x210 init/main.c:1298
       do_initcalls+0x3f/0x80 init/main.c:1314
       kernel_init_freeable+0x42f/0x5d0 init/main.c:1551
       kernel_init+0x1d/0x2a0 init/main.c:1441
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:242
    
       -> #0 (k-sk_lock-AF_INET6){+.+.}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18ca/0x58e0 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
       lock_sock_fast include/net/sock.h:1723 [inline]
       subflow_get_info+0x166/0xd20 net/mptcp/diag.c:28
       tcp_diag_put_ulp net/ipv4/tcp_diag.c:100 [inline]
       tcp_diag_get_aux+0x738/0x830 net/ipv4/tcp_diag.c:137
       inet_sk_diag_fill+0x10ed/0x1e00 net/ipv4/inet_diag.c:345
       inet_diag_dump_icsk+0x55b/0x1f80 net/ipv4/inet_diag.c:1061
       __inet_diag_dump+0x211/0x3a0 net/ipv4/inet_diag.c:1263
       inet_diag_dump_compat+0x1c1/0x2d0 net/ipv4/inet_diag.c:1371
       netlink_dump+0x59b/0xc80 net/netlink/af_netlink.c:2264
       __netlink_dump_start+0x5df/0x790 net/netlink/af_netlink.c:2370
       netlink_dump_start include/linux/netlink.h:338 [inline]
       inet_diag_rcv_msg_compat+0x209/0x4c0 net/ipv4/inet_diag.c:1405
       sock_diag_rcv_msg+0xe7/0x410
       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
       sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280
       netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
       netlink_sendmsg+0xa3b/0xd70 net/netlink/af_netlink.c:1908
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x221/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
       ___sys_sendmsg net/socket.c:2638 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
       do_syscall_64+0xf9/0x240
       entry_SYSCALL_64_after_hwframe+0x6f/0x77
    
    As noted by Eric, we can break the lock dependency chain by not
    dumping any extended info for the mptcp subflow listener:
    nothing actually useful is presented there.
    
    Fixes: b8adb69a7d29 ("mptcp: fix lockless access in subflow ULP diag")
    Cc: stable@vger.kernel.org
    Reported-by: Eric Dumazet <edumazet@google.com>
    Closes: https://lore.kernel.org/netdev/CANn89iJ=Oecw6OZDwmSYc9HJKQ_G32uN11L+oUcMu+TOD5Xiaw@mail.gmail.com/
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-9-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: fix potential wake-up event loss [+ + +]

Author: Paolo Abeni <pabeni@redhat.com>
Date:   Fri Feb 23 17:14:16 2024 +0100

    mptcp: fix potential wake-up event loss
    
    commit b111d8fbd2cbc63e05f3adfbbe0d4df655dfcc5b upstream.
    
    After the blamed commit below, the send buffer auto-tuning can
    happen after mptcp_propagate_sndbuf() completes, via the
    delegated action infrastructure.
    
    We must check for write space even after such a change or we risk
    missing the wake-up event.
    
    Fixes: 8005184fd1ca ("mptcp: refactor sndbuf auto-tuning")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-6-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: fix snd_wnd initialization for passive socket [+ + +]

Author: Paolo Abeni <pabeni@redhat.com>
Date:   Fri Feb 23 17:14:15 2024 +0100

    mptcp: fix snd_wnd initialization for passive socket
    
    commit adf1bb78dab55e36d4d557aa2fb446ebcfe9e5ce upstream.
    
    The snd_wnd value should be inherited from the first subflow, but
    passive sockets always used 'rsk_rcv_wnd'.
    
    Fixes: 6f8a612a33e4 ("mptcp: keep track of advertised windows right edge")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-5-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: map v4 address to v6 when destroying subflow [+ + +]

Author: Geliang Tang <tanggeliang@kylinos.cn>
Date:   Fri Feb 23 17:14:11 2024 +0100

    mptcp: map v4 address to v6 when destroying subflow
    
    commit 535d620ea5ff1a033dc64ee3d912acadc7470619 upstream.
    
    The address family on the server side does not match that on the client
    side, as in the "userspace pm add & remove address" test:
    
        userspace_pm_add_addr $ns1 10.0.2.1 10
        userspace_pm_rm_sf $ns1 "::ffff:10.0.2.1" $SUB_ESTABLISHED
    
    That's because on the server side, the family is set to AF_INET6 and the
    v4 address is mapped in a v6 one.
    
    This patch fixes this issue. In mptcp_pm_nl_subflow_destroy_doit(), before
    comparing the local address family with the remote address family, map an
    IPv4 address to an IPv6 address if its peer is a v4-mapped address.
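    
    The mapping itself can be illustrated in userspace with the standard
    socket headers. This is a sketch, not the kernel code: the helper names
    'toy_v4_to_mapped' and 'toy_addr_equal_v4_v6' are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Build the v4-mapped IPv6 form (::ffff:a.b.c.d) of an IPv4 address. */
static struct in6_addr toy_v4_to_mapped(struct in_addr v4)
{
	struct in6_addr v6;

	memset(&v6, 0, sizeof(v6));
	v6.s6_addr[10] = 0xff;
	v6.s6_addr[11] = 0xff;
	memcpy(&v6.s6_addr[12], &v4.s_addr, sizeof(v4.s_addr));
	return v6;
}

/* Compare an IPv4 address with an IPv6 one; they match only when the
 * IPv6 address is the v4-mapped form of the IPv4 address. */
static bool toy_addr_equal_v4_v6(struct in_addr v4, const struct in6_addr *v6)
{
	struct in6_addr mapped;

	if (!IN6_IS_ADDR_V4MAPPED(v6))
		return false;
	mapped = toy_v4_to_mapped(v4);
	return memcmp(&mapped, v6, sizeof(mapped)) == 0;
}
```

    Parsing "10.0.2.1" and "::ffff:10.0.2.1" with inet_pton() and passing
    both to toy_addr_equal_v4_v6() reports a match, which is the situation
    in the test quoted above.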
    
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/387
    Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
    Cc: stable@vger.kernel.org
    Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-1-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: push at DSS boundaries [+ + +]

Author: Paolo Abeni <pabeni@redhat.com>
Date:   Fri Feb 23 17:14:14 2024 +0100

    mptcp: push at DSS boundaries
    
    commit b9cd26f640a308ea314ad23532de9a8592cd09d2 upstream.
    
    When inserting non-contiguous data in the subflow write queue,
    the protocol creates a new skb and prevents the TCP stack from
    merging it later with already queued skbs by setting the EOR marker.
    
    Still, no push flag is explicitly set at the end of the previous GSO
    packet, making the aggregation on the receiver side sub-optimal
    and packetdrill self-tests less predictable.
    
    Explicitly mark the end of non-contiguous DSS with the push flag.
    
    Fixes: 6d0060f600ad ("mptcp: Write MPTCP DSS headers to outgoing data packets")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-4-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: rawnand: marvell: fix layouts [+ + +]

Author: Elad Nachman <enachman@marvell.com>
Date:   Mon Feb 5 15:44:35 2024 +0200

    mtd: rawnand: marvell: fix layouts
    
    commit e6a30d0c48a1e8a68f1cc413bee65302ab03ddfb upstream.
    
    nand_scan_tail() in nand_base.c contains the following check:
    (ecc->steps * ecc->size != mtd->writesize), which fails for some NAND chips.
    Remove the ECC entries in this driver which are not integral multiples,
    and adjust the number of chunks for entries which fail the above
    calculation so that it computes correctly (this adjustment was previously
    done automatically before the check and was removed in a later commit).
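    
    The condition quoted above can be tried out in isolation. A minimal
    sketch, assuming a hypothetical 'toy_ecc' structure that carries only
    the two fields named in the check; the page and chunk sizes used with
    it are illustrative, not taken from the driver's layout table.

```c
#include <stdbool.h>

/* Hypothetical subset of the ECC parameters used by the check. */
struct toy_ecc {
	unsigned int steps; /* number of ECC chunks per page */
	unsigned int size;  /* data bytes covered by one chunk */
};

/* True when the chunks cover the page exactly, i.e. the quoted
 * "steps * size != writesize" test would not fire. */
static bool toy_ecc_covers_page(const struct toy_ecc *ecc, unsigned int writesize)
{
	return ecc->steps * ecc->size == writesize;
}

/* Number of chunks for a page, or 0 when the chunk size does not
 * divide the page size and no valid step count exists. */
static unsigned int toy_ecc_steps(unsigned int writesize, unsigned int chunk)
{
	if (chunk == 0 || writesize % chunk != 0)
		return 0;
	return writesize / chunk;
}
```

    For example, 4 chunks of 512 bytes cover a 2048-byte page, while
    3 chunks of 1024 bytes do not cover a 4096-byte page and would be
    rejected by the check.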
    
    Fixes: 68c18dae6888 ("mtd: rawnand: marvell: add missing layouts")
    Cc: stable@vger.kernel.org
    Signed-off-by: Elad Nachman <enachman@marvell.com>
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: spinand: gigadevice: Fix the get ecc status issue [+ + +]

Author: Han Xu <han.xu@nxp.com>
Date:   Wed Nov 8 09:07:01 2023 -0600

    mtd: spinand: gigadevice: Fix the get ecc status issue
    
    [ Upstream commit 59950610c0c00c7a06d8a75d2ee5d73dba4274cf ]
    
    Some GigaDevice ecc_get_status functions use an on-stack buffer for
    spi_mem_op, which causes spi_mem_check_op to fail. Fix the issue by
    using the spinand scratchbuf.
    
    Fixes: c40c7a990a46 ("mtd: spinand: Add support for GigaDevice GD5F1GQ4UExxG")
    Signed-off-by: Han Xu <han.xu@nxp.com>
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Link: https://lore.kernel.org/linux-mtd/20231108150701.593912-1-han.xu@nxp.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dpaa: fman_memac: accept phy-interface-type = "10gbase-r" in the device tree [+ + +]

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Wed Feb 21 00:34:42 2024 +0200

    net: dpaa: fman_memac: accept phy-interface-type = "10gbase-r" in the device tree
    
    [ Upstream commit 734f06db599f66d6a159c78abfdbadfea3b7d43b ]
    
    Since commit 5d93cfcf7360 ("net: dpaa: Convert to phylink"), we support
    the "10gbase-r" phy-mode through a driver-based conversion of "xgmii",
    but we still don't actually support it when the device tree specifies
    "10gbase-r" proper.
    
    This is because boards such as LS1046A-RDB do not define pcs-handle-names
    (for whatever reason) in the ethernet@f0000 device tree node, and the
    code enters through this code path:
    
            err = of_property_match_string(mac_node, "pcs-handle-names", "xfi");
            // code takes neither branch and falls through
            if (err >= 0) {
                    (...)
            } else if (err != -EINVAL && err != -ENODATA) {
                    goto _return_fm_mac_free;
            }
    
            (...)
    
            /* For compatibility, if pcs-handle-names is missing, we assume this
             * phy is the first one in pcsphy-handle
             */
            err = of_property_match_string(mac_node, "pcs-handle-names", "sgmii");
            if (err == -EINVAL || err == -ENODATA)
                    pcs = memac_pcs_create(mac_node, 0); // code takes this branch
            else if (err < 0)
                    goto _return_fm_mac_free;
            else
                    pcs = memac_pcs_create(mac_node, err);
    
            // A default PCS is created and saved in "pcs"
    
            // This determination fails and mistakenly saves the default PCS
            // memac->sgmii_pcs instead of memac->xfi_pcs, because at this
            // stage, mac_dev->phy_if == PHY_INTERFACE_MODE_10GBASER.
            if (err && mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII)
                    memac->xfi_pcs = pcs;
            else
                    memac->sgmii_pcs = pcs;
    
    In other words, in the absence of pcs-handle-names, the default
    xfi_pcs assignment logic only works when in the device tree we have
    PHY_INTERFACE_MODE_XGMII.
    
    By reversing the order between the fallback xfi_pcs assignment and the
    "xgmii" overwrite with "10gbase-r", we are able to support both values
    in the device tree, with identical behavior.
    
    Currently, it is impossible to make the s/xgmii/10gbase-r/ device tree
    conversion, because it would break forward compatibility (new device
    tree with old kernel). The only way to modify existing device trees to
    phy-interface-mode = "10gbase-r" is to fix stable kernels to accept this
    value and handle it properly.
    
    One reason why the conversion is desirable is because with pre-phylink
    kernels, the Aquantia PHY driver used to warn about the improper use
    of PHY_INTERFACE_MODE_XGMII [1]. It is best to have a single (latest)
    device tree that works with all supported stable kernel versions.
    
    Note that the blamed commit does not constitute a regression per se.
    Older stable kernels like 6.1 still do not work with "10gbase-r", but
    for a different reason. That is a battle for another time.
    
    [1] https://lore.kernel.org/netdev/20240214-ls1046-dts-use-10gbase-r-v1-1-8c2d68547393@concurrent-rt.com/
    
    Fixes: 5d93cfcf7360 ("net: dpaa: Convert to phylink")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Sean Anderson <sean.anderson@seco.com>
    Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hsr: Use correct offset for HSR TLV values in supervisory HSR frames [+ + +]

Author: Lukasz Majewski <lukma@denx.de>
Date:   Wed Feb 28 09:56:44 2024 +0100

    net: hsr: Use correct offset for HSR TLV values in supervisory HSR frames
    
    [ Upstream commit 51dd4ee0372228ffb0f7709fa7aa0678d4199d06 ]
    
    The current HSR implementation uses the following supervisory frame
    (even for HSRv1 the HSR tag is not present):
    
    00000000: 01 15 4e 00 01 2d XX YY ZZ 94 77 10 88 fb 00 01
    00000010: 7e 1c 17 06 XX YY ZZ 94 77 10 1e 06 XX YY ZZ 94
    00000020: 77 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000030: 00 00 00 00 00 00 00 00 00 00 00 00
    
    The current code adds two extra bytes (i.e. sizeof(struct hsr_sup_tlv))
    when the offset for skb_pull() is calculated.
    This is wrong, as both 'struct hsrv1_ethhdr_sp' and 'hsrv0_ethhdr_sp'
    already have 'struct hsr_sup_tag' defined in them, so there is no need
    to add two extra bytes.
    
    This code appeared to work correctly because, with no RedBox support,
    the check for HSR_TLV_EOT (0x00) was off by two bytes, and those bytes
    were zeroed padding bytes for the minimal packet size.
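    
    The effect of the two extra bytes can be reproduced on the dump above
    in userspace. This is a sketch under stated assumptions: the field
    sizes are read off the dump (14-byte Ethernet header, 4 bytes of
    supervision tag, 2-byte TLV header), the XX YY ZZ bytes are replaced
    by the placeholders aa bb cc, the second frame without a RedBox TLV
    is constructed for illustration, and 'toy_tlv_type_after_first' is a
    hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_HDR_LEN 14 /* destination MAC, source MAC, EtherType */
#define TOY_SUP_TAG_LEN  4 /* "00 01 7e 1c" in the dump */
#define TOY_TLV_HDR_LEN  2 /* one type byte, one length byte */

/* Frame from the dump, with XX YY ZZ replaced by aa bb cc. */
static const uint8_t toy_redbox_frame[60] = {
	0x01, 0x15, 0x4e, 0x00, 0x01, 0x2d, 0xaa, 0xbb,
	0xcc, 0x94, 0x77, 0x10, 0x88, 0xfb, 0x00, 0x01,
	0x7e, 0x1c, 0x17, 0x06, 0xaa, 0xbb, 0xcc, 0x94,
	0x77, 0x10, 0x1e, 0x06, 0xaa, 0xbb, 0xcc, 0x94,
	0x77, 0x10, /* remaining bytes are zero padding */
};

/* Illustrative frame carrying only the first TLV (no RedBox TLV). */
static const uint8_t toy_plain_frame[60] = {
	0x01, 0x15, 0x4e, 0x00, 0x01, 0x2d, 0xaa, 0xbb,
	0xcc, 0x94, 0x77, 0x10, 0x88, 0xfb, 0x00, 0x01,
	0x7e, 0x1c, 0x17, 0x06, 0xaa, 0xbb, 0xcc, 0x94,
	0x77, 0x10, /* remaining bytes are zero padding */
};

/* Return the type byte of the TLV that follows the first one.
 * 'extra' models additional bytes wrongly added to the offset. */
static uint8_t toy_tlv_type_after_first(const uint8_t *frame, size_t extra)
{
	size_t first = TOY_ETH_HDR_LEN + TOY_SUP_TAG_LEN;
	size_t next = first + TOY_TLV_HDR_LEN + frame[first + 1] + extra;

	return frame[next];
}
```

    On the plain frame both extra = 0 and extra = 2 return 0x00, so the
    end-of-TLV check passes either way. On the RedBox frame extra = 0
    returns 0x1e, while extra = 2 lands inside the MAC address that
    follows it.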
    
    Fixes: eafaa88b3eb7 ("net: hsr: Add support for redbox supervision frames")
    Signed-off-by: Lukasz Majewski <lukma@denx.de>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Link: https://lore.kernel.org/r/20240228085644.3618044-1-lukma@denx.de
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ip_tunnel: prevent perpetual headroom growth [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Tue Feb 20 14:56:02 2024 +0100

    net: ip_tunnel: prevent perpetual headroom growth
    
    [ Upstream commit 5ae1e9922bbdbaeb9cfbe91085ab75927488ac0f ]
    
    syzkaller triggered the following KASAN splat:
    BUG: KASAN: use-after-free in __skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
    Read of size 1 at addr ffff88812fb4000e by task syz-executor183/5191
    [..]
     kasan_report+0xda/0x110 mm/kasan/report.c:588
     __skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
     skb_flow_dissect_flow_keys include/linux/skbuff.h:1514 [inline]
     ___skb_get_hash net/core/flow_dissector.c:1791 [inline]
     __skb_get_hash+0xc7/0x540 net/core/flow_dissector.c:1856
     skb_get_hash include/linux/skbuff.h:1556 [inline]
     ip_tunnel_xmit+0x1855/0x33c0 net/ipv4/ip_tunnel.c:748
     ipip_tunnel_xmit+0x3cc/0x4e0 net/ipv4/ipip.c:308
     __netdev_start_xmit include/linux/netdevice.h:4940 [inline]
     netdev_start_xmit include/linux/netdevice.h:4954 [inline]
     xmit_one net/core/dev.c:3548 [inline]
     dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
     __dev_queue_xmit+0x7c1/0x3d60 net/core/dev.c:4349
     dev_queue_xmit include/linux/netdevice.h:3134 [inline]
     neigh_connected_output+0x42c/0x5d0 net/core/neighbour.c:1592
     ...
     ip_finish_output2+0x833/0x2550 net/ipv4/ip_output.c:235
     ip_finish_output+0x31/0x310 net/ipv4/ip_output.c:323
     ..
     iptunnel_xmit+0x5b4/0x9b0 net/ipv4/ip_tunnel_core.c:82
     ip_tunnel_xmit+0x1dbc/0x33c0 net/ipv4/ip_tunnel.c:831
     ipgre_xmit+0x4a1/0x980 net/ipv4/ip_gre.c:665
     __netdev_start_xmit include/linux/netdevice.h:4940 [inline]
     netdev_start_xmit include/linux/netdevice.h:4954 [inline]
     xmit_one net/core/dev.c:3548 [inline]
     dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
     ...
    
    The splat occurs because skb->data points past skb->head allocated area.
    This is because neigh layer does:
      __skb_pull(skb, skb_network_offset(skb));
    
    ... but skb_network_offset() returns a negative offset and __skb_pull()
    arg is unsigned.  IOW, skb->data gets "adjusted" by a huge value.
    
    The negative value is returned because skb->head and skb->data distance is
    more than 64k and skb->network_header (u16) has wrapped around.
    
    The bug is in the ip_tunnel infrastructure, which can cause
    dev->needed_headroom to increment ad infinitum.
    
    The syzkaller reproducer consists of packets getting routed via a gre
    tunnel, and route of gre encapsulated packets pointing at another (ipip)
    tunnel.  The ipip encapsulation finds gre0 as next output device.
    
    This results in the following pattern:
    
    1) The first packet is to be sent out via gre0.
       Route lookup finds an output device, ipip0.
    
    2) ip_tunnel_xmit for gre0 bumps gre0->needed_headroom based on the
       future output device, rt.dev->needed_headroom (ipip0).
    
    3) ip output / start_xmit moves the skb on to ipip0, which runs the
       same code path again (xmit recursion).
    
    4) The routing step for the post-gre0-encap packet finds gre0 as the
       output device to use for the ipip0 encapsulated packet.
    
       tunl0->needed_headroom is then incremented based on the (already
       bumped) gre0 device headroom.
    
    This repeats for every future packet:
    
    gre0->needed_headroom gets inflated because previous packets' ipip0 step
    incremented rt->dev (gre0) headroom, and ipip0 incremented because gre0
    needed_headroom was increased.
    
    For each subsequent packet, gre/ipip0->needed_headroom grows until
    post-expand-head reallocations result in a skb->head/data distance of
    more than 64k.
    
    Once that happens, skb->network_header (u16) wraps around when
    pskb_expand_head tries to make sure that skb_network_offset() is unchanged
    after the headroom expansion/reallocation.
    
    After this skb_network_offset(skb) returns a different (and negative)
    result post headroom expansion.
    
    The next trip to neigh layer (or anything else that would __skb_pull the
    network header) makes skb->data point to a memory location outside
    skb->head area.
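    
    The wrap-around can be reproduced with plain integer arithmetic. This
    is a simplified userspace model: 'toy_skb' and its helpers are
    hypothetical and keep only the two offsets involved, and the example
    assumes a 32-bit unsigned int.

```c
#include <stdint.h>

/* Simplified model of the fields involved; not the kernel's sk_buff. */
struct toy_skb {
	uint32_t data_off;       /* distance of skb->data from skb->head */
	uint16_t network_header; /* offset from skb->head, kept in 16 bits */
};

/* Store the header offset; anything above 65535 is silently truncated. */
static void toy_set_network_header(struct toy_skb *skb, uint32_t off_from_head)
{
	skb->network_header = (uint16_t)off_from_head;
}

/* Models skb_network_offset(): network header position relative to data. */
static int toy_network_offset(const struct toy_skb *skb)
{
	return (int)skb->network_header - (int)skb->data_off;
}

/* Models passing that offset to a function taking an unsigned length. */
static unsigned int toy_pull_len(const struct toy_skb *skb)
{
	return (unsigned int)toy_network_offset(skb);
}
```

    With data_off = 70000 and the network header also at 70000 bytes from
    the head, the stored 16-bit value is 4464, the computed offset is
    -65536 instead of 0, and the unsigned pull length becomes 4294901760.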
    
    v2: Cap the needed_headroom update to an arbitrarily chosen upper limit to
    prevent perpetual increase instead of dropping the headroom increment
    completely.
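    
    A capped update of this kind can be sketched as follows. The limit of
    512 and the names 'TOY_MAX_HEADROOM' and 'toy_adjust_headroom' are
    assumptions made for illustration; the commit message only states that
    the limit is chosen arbitrarily.

```c
/* Hypothetical upper limit for illustration only. */
#define TOY_MAX_HEADROOM 512u

/* Return the new needed headroom: grow towards 'wanted', but never
 * let a single request push the value beyond the cap. */
static unsigned int toy_adjust_headroom(unsigned int cur, unsigned int wanted)
{
	if (wanted > TOY_MAX_HEADROOM)
		wanted = TOY_MAX_HEADROOM;
	return wanted > cur ? wanted : cur;
}
```

    Two devices that keep raising each other's headroom through this
    helper stop growing once both reach the cap, so the skb head/data
    distance stays well below the 64k limit of a u16 offset.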
    
    Reported-and-tested-by: syzbot+bfde3bef047a81b8fde6@syzkaller.appspotmail.com
    Closes: https://groups.google.com/g/syzkaller-bugs/c/fL9G6GtWskY/m/VKk_PR5FBAAJ
    Fixes: 243aad830e8a ("ip_gre: include route header_len in max_headroom calculation")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240220135606.4939-1-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: lan78xx: fix "softirq work is pending" error [+ + +]

Author: Oleksij Rempel <o.rempel@pengutronix.de>
Date:   Mon Feb 26 12:08:20 2024 +0100

    net: lan78xx: fix "softirq work is pending" error
    
    [ Upstream commit e3d5d70cb483df8296dd44e9ae3b6355ef86494c ]
    
    Disable BH around the call to napi_schedule() to avoid the following
    error:
    NOHZ tick-stop error: local softirq work is pending, handler #08!!!
    
    Fixes: ec4c7e12396b ("lan78xx: Introduce NAPI polling support")
    Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
    Link: https://lore.kernel.org/r/20240226110820.2113584-1-o.rempel@pengutronix.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mctp: take ownership of skb in mctp_local_output [+ + +]

Author: Jeremy Kerr <jk@codeconstruct.com.au>
Date:   Tue Feb 20 16:10:53 2024 +0800

    net: mctp: take ownership of skb in mctp_local_output
    
    [ Upstream commit 3773d65ae5154ed7df404b050fd7387a36ab5ef3 ]
    
    Currently, mctp_local_output only takes ownership of skb on success, and
    we may leak an skb if mctp_local_output fails in specific states; the
    skb ownership isn't transferred until the actual output routing occurs.
    
    Instead, make mctp_local_output free the skb on all error paths up to
    the route action, so it always consumes the passed skb.
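    
    The ownership rule can be illustrated with a small userspace model.
    All names here ('toy_buf', 'toy_local_output', 'toy_route_output') are
    hypothetical and the error conditions are invented; the sketch only
    shows the pattern of freeing the buffer on every error path before the
    hand-off.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_buf { int dest; }; /* hypothetical stand-in for an skb */

static int toy_freed; /* counts frees so the behaviour can be observed */

static struct toy_buf *toy_alloc(int dest)
{
	struct toy_buf *b = malloc(sizeof(*b));

	if (b)
		b->dest = dest;
	return b;
}

static void toy_free(struct toy_buf *b)
{
	free(b);
	toy_freed++;
}

/* The final routing step takes ownership and releases the buffer. */
static int toy_route_output(struct toy_buf *b)
{
	toy_free(b);
	return 0;
}

/* Always consumes 'b': freed here on error, by the route step otherwise. */
static int toy_local_output(struct toy_buf *b)
{
	int rc;

	if (!b)
		return -EINVAL;
	if (b->dest < 0) {
		rc = -EINVAL;
		goto out_free;
	}
	if (b->dest == 0) {
		rc = -ENETUNREACH;
		goto out_free;
	}
	return toy_route_output(b);

out_free:
	toy_free(b);
	return rc;
}
```

    Because every path releases the buffer exactly once, a caller never
    has to free it after calling toy_local_output(), whatever the return
    value.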
    
    Fixes: 833ef3b91de6 ("mctp: Populate socket implementation")
    Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240220081053.1439104-1-jk@codeconstruct.com.au
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: usb: dm9601: fix wrong return value in dm9601_mdio_read [+ + +]

Author: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Date:   Sun Feb 25 00:20:06 2024 +0100

    net: usb: dm9601: fix wrong return value in dm9601_mdio_read
    
    [ Upstream commit c68b2c9eba38ec3f60f4894b189090febf4d8d22 ]
    
    The MII code does not check the return value of mdio_read (among
    others), and therefore no error code should be sent. A previous fix to
    the use of an uninitialized variable propagates negative error codes,
    that might lead to wrong operations by the MII library.
    
    An example of such issues is the use of mii_nway_restart by the dm9601
    driver. The mii_nway_restart function does not check the value returned
    by mdio_read, which in this case might be a negative number which could
    contain the exact bit the function checks (BMCR_ANENABLE = 0x1000).
    
    Return zero in case of error, as it is common practice in users of
    mdio_read to avoid wrong uses of the return value.
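    
    The bit collision described above is easy to verify in userspace. In
    this sketch 'toy_autoneg_looks_enabled' is a hypothetical stand-in for
    a caller that tests BMCR_ANENABLE on the raw return value; -EIO is
    used only as an example of a negative error code.

```c
#include <errno.h>
#include <stdbool.h>

#define BMCR_ANENABLE 0x1000 /* value quoted in the commit message */

/* Hypothetical caller that tests a register bit on the value returned
 * by an mdio_read()-style function without checking for an error. */
static bool toy_autoneg_looks_enabled(int mdio_read_result)
{
	return (mdio_read_result & BMCR_ANENABLE) != 0;
}
```

    In two's complement, -EIO (-5) has all high bits set, including
    0x1000, so toy_autoneg_looks_enabled(-EIO) is true even though no
    register was read. Returning 0 on error makes the same test false.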
    
    Fixes: 8f8abb863fa5 ("net: usb: dm9601: fix uninitialized variable use in dm9601_mdio_read")
    Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Peter Korsgaard <peter@korsgaard.com>
    Link: https://lore.kernel.org/r/20240225-dm9601_ret_err-v1-1-02c1d959ea59@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: veth: clear GRO when clearing XDP even when down [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Wed Feb 21 15:12:10 2024 -0800

    net: veth: clear GRO when clearing XDP even when down
    
    [ Upstream commit fe9f801355f0b47668419f30f1fac1cf4539e736 ]
    
    veth sets NETIF_F_GRO automatically when XDP is enabled,
    because both features use the same NAPI machinery.
    
    The logic to clear NETIF_F_GRO sits in veth_disable_xdp() which
    is called both on ndo_stop and when XDP is turned off.
    To prevent the flag from being cleared when the device is brought
    down, the clearing is skipped when IFF_UP is not set.
    Bringing the device down should indeed not modify its features.
    
    Unfortunately, this means that clearing is also skipped when
    XDP is disabled _while_ the device is down. And there's nothing
    on the open path to bring the device features back into sync.
    IOW if user enables XDP, disables it and then brings the device
    up we'll end up with a stray GRO flag set but no NAPI instances.
    
    We don't depend on the GRO flag on the datapath, so the datapath
    won't crash. We will crash (or hang), however, next time features
    are sync'ed (either by user via ethtool or peer changing its config).
    The GRO flag will go away, and veth will try to disable the NAPIs.
    But the open path never created them since XDP was off, the GRO flag
    was a stray. If NAPI was initialized before we'll hang in napi_disable().
    If it never was we'll crash trying to stop uninitialized hrtimer.
    
    Move the GRO flag updates to the XDP enable / disable paths,
    instead of mixing them with the ndo_open / ndo_close paths.
    
    Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
    Reported-by: Thomas Gleixner <tglx@linutronix.de>
    Reported-by: syzbot+039399a9b96297ddedca@syzkaller.appspotmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: bridge: confirm multicast packets before passing them up the stack [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Tue Feb 27 16:17:51 2024 +0100

    netfilter: bridge: confirm multicast packets before passing them up the stack
    
    [ Upstream commit 62e7151ae3eb465e0ab52a20c941ff33bb6332e9 ]
    
    conntrack nf_confirm logic cannot handle cloned skbs referencing
    the same nf_conn entry, which will happen for multicast (broadcast)
    frames on bridges.
    
     Example:
        macvlan0
           |
          br0
         /  \
      ethX    ethY
    
     ethX (or Y) receives a L2 multicast or broadcast packet containing
     an IP packet, flow is not yet in conntrack table.
    
     1. skb passes through bridge and fake-ip (br_netfilter) Prerouting.
        -> skb->_nfct now references an unconfirmed entry
     2. skb is broad/mcast packet. bridge now passes clones out on each bridge
        interface.
     3. skb gets passed up the stack.
     4. In macvlan case, macvlan driver retains clone(s) of the mcast skb
        and schedules a work queue to send them out on the lower devices.
    
        The clone skb->_nfct is not a copy, it is the same entry as the
        original skb.  The macvlan rx handler then returns RX_HANDLER_PASS.
     5. Normal conntrack hooks (in NF_INET_LOCAL_IN) confirm the orig skb.
    
    The Macvlan broadcast worker and normal confirm path will race.
    
    This race will not happen if step 2 already confirmed a clone. In that
    case later steps perform skb_clone() with skb->_nfct already confirmed (in
    hash table).  This works fine.
    
    But such confirmation won't happen when eb/ip/nftables rules dropped the
    packets before they reached the nf_confirm step in postrouting.
    
    Pablo points out that nf_conntrack_bridge doesn't allow use of stateful
    nat, so we can safely discard the nf_conn entry and let inet call
    conntrack again.
    
    This doesn't work for bridge netfilter: skb could have a nat
    transformation. Also bridge nf prevents re-invocation of inet prerouting
    via 'sabotage_in' hook.
    
    Work around this problem by explicit confirmation of the entry at LOCAL_IN
    time, before upper layer has a chance to clone the unconfirmed entry.
    
    The downside is that this disables NAT and conntrack helpers.
    
    Alternative fix would be to add locking to all code parts that deal with
    unconfirmed packets, but even if that could be done in a sane way this
    opens up other problems, for example:
    
    -m physdev --physdev-out eth0 -j SNAT --snat-to 1.2.3.4
    -m physdev --physdev-out eth1 -j SNAT --snat-to 1.2.3.5
    
    For multicast case, only one of such conflicting mappings will be
    created, conntrack only handles 1:1 NAT mappings.
    
    Users should create a setup that explicitly marks such traffic
    NOTRACK (conntrack bypass) to avoid this, but we cannot auto-bypass
    them, ruleset might have accept rules for untracked traffic already,
    so user-visible behaviour would change.
    
    Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217777
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate() [+ + +]

Author: Ignat Korchagin <ignat@cloudflare.com>
Date:   Thu Feb 22 10:33:08 2024 +0000

    netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate()
    
    [ Upstream commit 7e0f122c65912740327e4c54472acaa5f85868cb ]
    
    Commit d0009effa886 ("netfilter: nf_tables: validate NFPROTO_* family") added
    some validation of NFPROTO_* families in the nft_compat module, but it broke
    the ability to use legacy iptables modules in dual-stack nftables.
    
    While with legacy iptables one had to independently manage IPv4 and IPv6
    tables, with nftables it is possible to have dual-stack tables sharing the
    rules. Moreover, it was possible to use rules based on legacy iptables
    match/target modules in dual-stack nftables.
    
    As an example, the program from [2] creates an INET dual-stack family table
    using an xt_bpf based rule, which looks like the following (the actual output
    was generated with a patched nft tool as the current nft tool does not parse
    dual stack tables with legacy match rules, so consider it for illustrative
    purposes only):
    
    table inet testfw {
      chain input {
        type filter hook prerouting priority filter; policy accept;
        bytecode counter packets 0 bytes 0 accept
      }
    }
    
    After d0009effa886 ("netfilter: nf_tables: validate NFPROTO_* family") we get
    EOPNOTSUPP for the above program.
    
    Fix this by allowing NFPROTO_INET for nft_(match/target)_validate(), but also
    restrict the functions to classic iptables hooks.
    
    Changes in v3:
      * clarify that upstream nft will not display such configuration properly and
        that the output was generated with a patched nft tool
      * remove example program from commit description and link to it instead
      * no code changes otherwise
    
    Changes in v2:
      * restrict nft_(match/target)_validate() to classic iptables hooks
      * rewrite example program to use unmodified libnftnl
    
    Fixes: d0009effa886 ("netfilter: nf_tables: validate NFPROTO_* family")
    Link: https://lore.kernel.org/all/Zc1PfoWN38UuFJRI@calendula/T/#mc947262582c90fec044c7a3398cc92fac7afea72 [1]
    Link: https://lore.kernel.org/all/20240220145509.53357-1-ignat@cloudflare.com/ [2]
    Reported-by: Jordan Griege <jgriege@cloudflare.com>
    Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netlink: add nla be16/32 types to minlen array [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Wed Feb 21 18:27:33 2024 +0100

    netlink: add nla be16/32 types to minlen array
    
    [ Upstream commit 9a0d18853c280f6a0ee99f91619f2442a17a323a ]
    
    BUG: KMSAN: uninit-value in nla_validate_range_unsigned lib/nlattr.c:222 [inline]
    BUG: KMSAN: uninit-value in nla_validate_int_range lib/nlattr.c:336 [inline]
    BUG: KMSAN: uninit-value in validate_nla lib/nlattr.c:575 [inline]
    BUG: KMSAN: uninit-value in __nla_validate_parse+0x2e20/0x45c0 lib/nlattr.c:631
     nla_validate_range_unsigned lib/nlattr.c:222 [inline]
     nla_validate_int_range lib/nlattr.c:336 [inline]
     validate_nla lib/nlattr.c:575 [inline]
    ...
    
    The message in question matches this policy:
    
     [NFTA_TARGET_REV]       = NLA_POLICY_MAX(NLA_BE32, 255),
    
    but because NLA_BE32 size in minlen array is 0, the validation
    code will read past the malformed (too small) attribute.
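    
    A toy model of the length check, assuming a per-type table of minimum
    payload sizes. The enum, tables and len_ok() are hypothetical names, not
    the lib/nlattr.c code; the point is only that a missing table entry
    defaults to 0 and therefore accepts any length:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy attribute types; the real enum lives in the kernel's netlink headers. */
enum toy_type { TOY_U32, TOY_BE32, TOY_TYPE_COUNT };

/* Before: no entry for TOY_BE32, so its minimum length defaults to 0. */
static const uint8_t minlen_before[TOY_TYPE_COUNT] = {
	[TOY_U32] = sizeof(uint32_t),
};

/* After: the big-endian type gets the same minimum as its native twin. */
static const uint8_t minlen_after[TOY_TYPE_COUNT] = {
	[TOY_U32]  = sizeof(uint32_t),
	[TOY_BE32] = sizeof(uint32_t),
};

/* Returns true when the payload is long enough to be read as `type`. */
static bool len_ok(const uint8_t *minlen, enum toy_type type, size_t attrlen)
{
	return attrlen >= minlen[type];
}

/* Exercises the tables; aborts if an expectation does not hold. */
static void demo(void)
{
	/* A 2-byte BE32 attribute slips through the old table. */
	assert(len_ok(minlen_before, TOY_BE32, 2));
	/* It is rejected once the entry exists. */
	assert(!len_ok(minlen_after, TOY_BE32, 2));
	/* A correctly sized attribute is still accepted. */
	assert(len_ok(minlen_after, TOY_BE32, 4));
}
```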
    
    Note: Other attributes, e.g. BITFIELD32, SINT, UINT.. are also missing:
    those likely should be added too.
    
    Reported-by: syzbot+3f497b07aa3baf2fb4d0@syzkaller.appspotmail.com
    Reported-by: xingwei lee <xrivendell7@gmail.com>
    Closes: https://lore.kernel.org/all/CABOYnLzFYHSnvTyS6zGa-udNX55+izqkOt2sB9WDqUcEGW6n8w@mail.gmail.com/raw
    Fixes: ecaf75ffd5f5 ("netlink: introduce bigendian integer types")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20240221172740.5092-1-fw@strlen.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netlink: Fix kernel-infoleak-after-free in __skb_datagram_iter [+ + +]

Author: Ryosuke Yasuoka <ryasuoka@redhat.com>
Date:   Wed Feb 21 16:40:48 2024 +0900

    netlink: Fix kernel-infoleak-after-free in __skb_datagram_iter
    
    [ Upstream commit 661779e1fcafe1b74b3f3fe8e980c1e207fea1fd ]
    
    syzbot reported the following uninit-value access issue [1]:
    
    netlink_to_full_skb() creates a new `skb` and puts the `skb->data`
    passed as a 1st arg of netlink_to_full_skb() onto new `skb`. The data
    size is specified as `len` and passed to skb_put_data(). This `len`
    is based on `skb->end` that is not data offset but buffer offset. The
    `skb->end` contains data and tailroom. Since the tailroom is not
    initialized when the new `skb` is created, KMSAN detects an uninitialized
    memory area when copying the data.
    
    This patch resolves this issue by correcting the len from `skb->end` to
    `skb->len`, which is the actual data offset.
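    
    A toy illustration of why the copy length matters. The struct and helper
    are hypothetical and only borrow the field names used above; the sizes
    are taken from the KMSAN report in this commit message (3904 bytes
    copied, bytes 3852-3903 uninitialized), and mapping them onto `len` and
    `end` is an assumption made for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Toy buffer descriptor; field names echo the commit message only. */
struct toy_skb {
	size_t len; /* bytes of payload actually written */
	size_t end; /* end of the buffer: payload plus tailroom */
};

/* Number of never-written bytes included when copy_len bytes are copied. */
static size_t uninit_bytes_copied(const struct toy_skb *skb, size_t copy_len)
{
	assert(copy_len <= skb->end);
	return copy_len > skb->len ? copy_len - skb->len : 0;
}

/* Exercises the helper; aborts if an expectation does not hold. */
static void demo(void)
{
	struct toy_skb skb = { .len = 3852, .end = 3904 };

	/* Copying up to `end` drags the tailroom along. */
	assert(uninit_bytes_copied(&skb, skb.end) == 52);
	/* Copying `len` bytes touches payload only. */
	assert(uninit_bytes_copied(&skb, skb.len) == 0);
}
```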
    
    BUG: KMSAN: kernel-infoleak-after-free in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
    BUG: KMSAN: kernel-infoleak-after-free in copy_to_user_iter lib/iov_iter.c:24 [inline]
    BUG: KMSAN: kernel-infoleak-after-free in iterate_ubuf include/linux/iov_iter.h:29 [inline]
    BUG: KMSAN: kernel-infoleak-after-free in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
    BUG: KMSAN: kernel-infoleak-after-free in iterate_and_advance include/linux/iov_iter.h:271 [inline]
    BUG: KMSAN: kernel-infoleak-after-free in _copy_to_iter+0x364/0x2520 lib/iov_iter.c:186
     instrument_copy_to_user include/linux/instrumented.h:114 [inline]
     copy_to_user_iter lib/iov_iter.c:24 [inline]
     iterate_ubuf include/linux/iov_iter.h:29 [inline]
     iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
     iterate_and_advance include/linux/iov_iter.h:271 [inline]
     _copy_to_iter+0x364/0x2520 lib/iov_iter.c:186
     copy_to_iter include/linux/uio.h:197 [inline]
     simple_copy_to_iter+0x68/0xa0 net/core/datagram.c:532
     __skb_datagram_iter+0x123/0xdc0 net/core/datagram.c:420
     skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546
     skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline]
     packet_recvmsg+0xd9c/0x2000 net/packet/af_packet.c:3482
     sock_recvmsg_nosec net/socket.c:1044 [inline]
     sock_recvmsg net/socket.c:1066 [inline]
     sock_read_iter+0x467/0x580 net/socket.c:1136
     call_read_iter include/linux/fs.h:2014 [inline]
     new_sync_read fs/read_write.c:389 [inline]
     vfs_read+0x8f6/0xe00 fs/read_write.c:470
     ksys_read+0x20f/0x4c0 fs/read_write.c:613
     __do_sys_read fs/read_write.c:623 [inline]
     __se_sys_read fs/read_write.c:621 [inline]
     __x64_sys_read+0x93/0xd0 fs/read_write.c:621
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x63/0x6b
    
    Uninit was stored to memory at:
     skb_put_data include/linux/skbuff.h:2622 [inline]
     netlink_to_full_skb net/netlink/af_netlink.c:181 [inline]
     __netlink_deliver_tap_skb net/netlink/af_netlink.c:298 [inline]
     __netlink_deliver_tap+0x5be/0xc90 net/netlink/af_netlink.c:325
     netlink_deliver_tap net/netlink/af_netlink.c:338 [inline]
     netlink_deliver_tap_kernel net/netlink/af_netlink.c:347 [inline]
     netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
     netlink_unicast+0x10f1/0x1250 net/netlink/af_netlink.c:1368
     netlink_sendmsg+0x1238/0x13d0 net/netlink/af_netlink.c:1910
     sock_sendmsg_nosec net/socket.c:730 [inline]
     __sock_sendmsg net/socket.c:745 [inline]
     ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584
     ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
     __sys_sendmsg net/socket.c:2667 [inline]
     __do_sys_sendmsg net/socket.c:2676 [inline]
     __se_sys_sendmsg net/socket.c:2674 [inline]
     __x64_sys_sendmsg+0x307/0x490 net/socket.c:2674
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x63/0x6b
    
    Uninit was created at:
     free_pages_prepare mm/page_alloc.c:1087 [inline]
     free_unref_page_prepare+0xb0/0xa40 mm/page_alloc.c:2347
     free_unref_page_list+0xeb/0x1100 mm/page_alloc.c:2533
     release_pages+0x23d3/0x2410 mm/swap.c:1042
     free_pages_and_swap_cache+0xd9/0xf0 mm/swap_state.c:316
     tlb_batch_pages_flush mm/mmu_gather.c:98 [inline]
     tlb_flush_mmu_free mm/mmu_gather.c:293 [inline]
     tlb_flush_mmu+0x6f5/0x980 mm/mmu_gather.c:300
     tlb_finish_mmu+0x101/0x260 mm/mmu_gather.c:392
     exit_mmap+0x49e/0xd30 mm/mmap.c:3321
     __mmput+0x13f/0x530 kernel/fork.c:1349
     mmput+0x8a/0xa0 kernel/fork.c:1371
     exit_mm+0x1b8/0x360 kernel/exit.c:567
     do_exit+0xd57/0x4080 kernel/exit.c:858
     do_group_exit+0x2fd/0x390 kernel/exit.c:1021
     __do_sys_exit_group kernel/exit.c:1032 [inline]
     __se_sys_exit_group kernel/exit.c:1030 [inline]
     __x64_sys_exit_group+0x3c/0x50 kernel/exit.c:1030
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x63/0x6b
    
    Bytes 3852-3903 of 3904 are uninitialized
    Memory access of size 3904 starts at ffff88812ea1e000
    Data copied to user address 0000000020003280
    
    CPU: 1 PID: 5043 Comm: syz-executor297 Not tainted 6.7.0-rc5-syzkaller-00047-g5bd7ef53ffe5 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
    
    Fixes: 1853c9496460 ("netlink, mmap: transform mmap skb into full skb on taps")
    Reported-and-tested-by: syzbot+34ad5fab48f7bf510349@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=34ad5fab48f7bf510349 [1]
    Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240221074053.1794118-1-ryasuoka@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFS: Fix data corruption caused by congestion. [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Wed Feb 28 10:23:31 2024 +1100

    NFS: Fix data corruption caused by congestion.
    
    When AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
    congestion) it is important that the folio is redirtied.
    nfs_writepage_locked() doesn't do this, so files can become corrupted as
    writes can be lost.
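    
    A toy model of the lost write, not the NFS code. All names are
    hypothetical, and the model assumes that the caller clears the dirty
    flag before asking for the write, so a folio that is neither written
    nor redirtied is silently dropped:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy return codes; the real AOP_WRITEPAGE_ACTIVATE is defined in the kernel. */
enum toy_ret { TOY_OK, TOY_WRITEPAGE_ACTIVATE };

struct toy_folio {
	bool dirty;   /* still scheduled for writeback */
	bool on_disk; /* contents reached the server */
};

/* Toy writepage; `redirty` selects the behaviour after the fix. */
static enum toy_ret toy_writepage(struct toy_folio *f, bool congested,
				  bool redirty)
{
	if (congested) {
		if (redirty)
			f->dirty = true;
		return TOY_WRITEPAGE_ACTIVATE;
	}
	f->on_disk = true;
	return TOY_OK;
}

/* Toy caller: clears the dirty flag first, then asks for the write. */
static enum toy_ret toy_writeback(struct toy_folio *f, bool congested,
				  bool redirty)
{
	f->dirty = false;
	return toy_writepage(f, congested, redirty);
}

/* Data is lost when the folio is neither written nor still dirty. */
static bool data_lost(const struct toy_folio *f)
{
	return !f->on_disk && !f->dirty;
}

/* Exercises both variants under congestion. */
static void demo(void)
{
	struct toy_folio before = { .dirty = true }, after = { .dirty = true };

	assert(toy_writeback(&before, true, false) == TOY_WRITEPAGE_ACTIVATE);
	assert(data_lost(&before));

	assert(toy_writeback(&after, true, true) == TOY_WRITEPAGE_ACTIVATE);
	assert(!data_lost(&after));
}
```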
    
    Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
    returned.  It is needed for kernels v5.18..v6.7.  Prior to 6.3 the patch
    is different as it needs to mention "page", not "folio".
    
    Reported-and-tested-by: Jacek Tomaka <Jacek.Tomaka@poczta.fm>
    Fixes: 6df25e58532b ("nfs: remove reliance on bdi congestion")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

of: property: fw_devlink: Fix stupid bug in remote-endpoint parsing [+ + +]

Author: Saravana Kannan <saravanak@google.com>
Date:   Fri Feb 23 21:24:35 2024 -0800

    of: property: fw_devlink: Fix stupid bug in remote-endpoint parsing
    
    [ Upstream commit 7cb50f6c9fbaa1c0b80100b8971bf13db5d75d06 ]
    
    Introduced a stupid bug in commit 782bfd03c3ae ("of: property: Improve
    finding the supplier of a remote-endpoint property") due to a last minute
    incorrect edit of "index !=0" into "!index". This patch fixes it to be
    "index > 0" to match the comment right next to it.
    
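    The three forms of the condition mentioned above can be compared
    directly. This is a stand-alone sketch with hypothetical function names,
    and it only claims that "index != 0" and "index > 0" agree for the
    non-negative indexes checked here:

```c
#include <assert.h>
#include <stdbool.h>

/* The condition as first written, per the commit message. */
static bool cond_original(int index) { return index != 0; }

/* The last-minute edit: the logical opposite of the original. */
static bool cond_buggy(int index) { return !index; }

/* The condition after the fix. */
static bool cond_fixed(int index) { return index > 0; }

/* Compares the three forms for a few non-negative indexes. */
static void demo(void)
{
	for (int index = 0; index < 4; index++) {
		assert(cond_fixed(index) == cond_original(index));
		assert(cond_buggy(index) != cond_original(index));
	}
}
```
    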
    Reported-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
    Link: https://lore.kernel.org/lkml/20240223171849.10f9901d@booty/
    Fixes: 782bfd03c3ae ("of: property: Improve finding the supplier of a remote-endpoint property")
    Signed-off-by: Saravana Kannan <saravanak@google.com>
    Reviewed-by: Herve Codina <herve.codina@bootlin.com>
    Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
    Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
    Link: https://lore.kernel.org/r/20240224052436.3552333-1-saravanak@google.com
    Signed-off-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

phy: freescale: phy-fsl-imx8-mipi-dphy: Fix alias name to use dashes [+ + +]

Author: Alexander Stein <alexander.stein@ew.tq-group.com>
Date:   Wed Jan 10 10:33:43 2024 +0100

    phy: freescale: phy-fsl-imx8-mipi-dphy: Fix alias name to use dashes
    
    [ Upstream commit 7936378cb6d87073163130e1e1fc1e5f76a597cf ]
    
    Devicetree spec lists only dashes as valid characters for alias names.
    Table 3.2: Valid characters for alias names, Devicetree Specification,
    Release v0.4
    
    Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
    Fixes: 3fbae284887de ("phy: freescale: phy-fsl-imx8-mipi-dphy: Add i.MX8qxp LVDS PHY mode support")
    Link: https://lore.kernel.org/r/20240110093343.468810-1-alexander.stein@ew.tq-group.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

phy: qcom-qmp-usb: fix v3 offsets data [+ + +]

Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Date:   Tue Feb 13 15:38:24 2024 +0200

    phy: qcom-qmp-usb: fix v3 offsets data
    
    [ Upstream commit d4c08d8b23b22807c712208cd05cb047e92e7672 ]
    
    The MSM8996 platform has a register layout different from the rest of the
    QMP v3 USB platforms. It has the PCS region at 0x600 and no PCS_MISC region,
    while other platforms have the PCS region at 0x800 and PCS_MISC at 0x600.
    This results in a malfunctioning USB host on some of the platforms.  The
    commit f74c35b630d4 ("phy: qcom-qmp-usb: fix register offsets for
    ipq8074/ipq6018") fixed the issue for IPQ platforms, but missed the
    SDM845 which has the same register layout.
    
    To simplify future platform addition and to make the driver more future
    proof, rename qmp_usb_offsets_v3 to qmp_usb_offsets_v3_msm8996 (to mark
    its peculiarity), rename qmp_usb_offsets_ipq8074 to qmp_usb_offsets_v3
    and use it for SDM845 platform.
    
    Fixes: 2be22aae6b18 ("phy: qcom-qmp-usb: populate offsets configuration")
    Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Link: https://lore.kernel.org/r/20240213133824.2218916-1-dmitry.baryshkov@linaro.org
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

phy: qcom: phy-qcom-m31: fix wrong pointer pass to PTR_ERR() [+ + +]

Author: Yang Yingliang <yangyingliang@huawei.com>
Date:   Thu Aug 24 17:13:45 2023 +0800

    phy: qcom: phy-qcom-m31: fix wrong pointer pass to PTR_ERR()
    
    [ Upstream commit 95055beb067cb30f626fb10f7019737ca7681df0 ]
    
    It should be 'qphy->vreg' passed to PTR_ERR() when devm_regulator_get() fails.
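    
    A user-space sketch of the pattern. The toy_* helpers only imitate the
    kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and the struct with its
    `other` field is hypothetical; the sketch shows that decoding a pointer
    other than the one that was checked yields a meaningless value instead
    of an errno:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* User-space imitations of the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *toy_err_ptr(long error) { return (void *)(intptr_t)error; }
static long toy_ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static bool toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Hypothetical driver state: `other` stands for any unrelated, valid pointer. */
struct toy_phy {
	void *other;
	void *vreg;
};

/* Buggy pattern: checks one pointer but decodes another. */
static long vreg_error_buggy(const struct toy_phy *phy)
{
	return toy_is_err(phy->vreg) ? toy_ptr_err(phy->other) : 0;
}

/* Fixed pattern: decodes the pointer that was checked. */
static long vreg_error_fixed(const struct toy_phy *phy)
{
	return toy_is_err(phy->vreg) ? toy_ptr_err(phy->vreg) : 0;
}

/* Exercises both patterns with a failed regulator lookup. */
static void demo(void)
{
	int dummy = 0;
	struct toy_phy phy = { .other = &dummy, .vreg = toy_err_ptr(-ENODEV) };

	assert(vreg_error_fixed(&phy) == -ENODEV);
	assert(vreg_error_buggy(&phy) != -ENODEV); /* an address, not an errno */
}
```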
    
    Fixes: 08e49af50701 ("phy: qcom: Introduce M31 USB PHY driver")
    Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    Reviewed-by: Varadarajan Narayanan <quic_varada@quicinc.com>
    Link: https://lore.kernel.org/r/20230824091345.1072650-1-yangyingliang@huawei.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

pmdomain: arm: Fix NULL dereference on scmi_perf_domain removal [+ + +]

Author: Cristian Marussi <cristian.marussi@arm.com>
Date:   Thu Jan 25 19:17:56 2024 +0000

    pmdomain: arm: Fix NULL dereference on scmi_perf_domain removal
    
    commit eb5555d422d0fc325e1574a7353d3c616f82d8b5 upstream.
    
    Unloading the scmi_perf_domain module produced the splat below when the
    '#power-domain-cells' property was missing from the DT provided to the
    system under test. Indeed, this particular setup causes the probe to bail
    out early without returning any error, so the ->remove() callback still
    runs, but without all the expected initialized structures in place.
    
    Add a check and bail out early on remove too.
    
     Call trace:
      scmi_perf_domain_remove+0x28/0x70 [scmi_perf_domain]
      scmi_dev_remove+0x28/0x40 [scmi_core]
      device_remove+0x54/0x90
      device_release_driver_internal+0x1dc/0x240
      driver_detach+0x58/0xa8
      bus_remove_driver+0x78/0x108
      driver_unregister+0x38/0x70
      scmi_driver_unregister+0x28/0x180 [scmi_core]
      scmi_perf_domain_driver_exit+0x18/0xb78 [scmi_perf_domain]
      __arm64_sys_delete_module+0x1a8/0x2c0
      invoke_syscall+0x50/0x128
      el0_svc_common.constprop.0+0x48/0xf0
      do_el0_svc+0x24/0x38
      el0_svc+0x34/0xb8
      el0t_64_sync_handler+0x100/0x130
      el0t_64_sync+0x190/0x198
     Code: a90153f3 f9403c14 f9414800 955f8a05 (b9400a80)
     ---[ end trace 0000000000000000 ]---
    
    Fixes: 2af23ceb8624 ("pmdomain: arm: Add the SCMI performance domain")
    Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
    Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240125191756.868860-1-cristian.marussi@arm.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pmdomain: qcom: rpmhpd: Fix enabled_corner aggregation [+ + +]

Author: Bjorn Andersson <quic_bjorande@quicinc.com>
Date:   Mon Feb 26 17:49:57 2024 -0800

    pmdomain: qcom: rpmhpd: Fix enabled_corner aggregation
    
    commit 2a93c6cbd5a703d44c414a3c3945a87ce11430ba upstream.
    
    Commit 'e3e56c050ab6 ("soc: qcom: rpmhpd: Make power_on actually enable
    the domain")' aimed to make sure that a power-domain that is being
    enabled without any particular performance-state requested will at least
    turn the rail on, to avoid filling DeviceTree with otherwise unnecessary
    required-opps properties.
    
    But in the event that aggregation happens on a disabled power-domain, with
    an enabled peer without performance-state, both the local and peer
    corner are 0. The peer's enabled_corner is not considered, with the
    result that the underlying (shared) resource is disabled.
    
    One case where this can be observed is when the display stack keeps mmcx
    enabled (but without a particular performance-state vote) in order to
    access registers and sync_state happens in the rpmhpd driver. As mmcx_ao
    is flushed the state of the peer (mmcx) is not considered and mmcx_ao
    ends up turning off "mmcx.lvl" underneath mmcx. This has been observed
    several times, but has been painted over in DeviceTree by adding an
    explicit vote for the lowest non-disabled performance-state.
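    
    A simplified model of the aggregation described above. The struct, its
    field names and both functions are hypothetical, and the model ignores
    the active/sleep split of the real driver; it only shows how an enabled
    peer without a performance-state vote contributes nothing unless its
    enabled corner is taken into account:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy power-domain state; field names are illustrative, not the driver's. */
struct toy_pd {
	bool enabled;
	unsigned int corner;         /* requested performance state, 0 = none */
	unsigned int enabled_corner; /* lowest level that keeps the rail on */
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Before: an enabled peer with no performance vote contributes 0. */
static unsigned int aggregate_before(unsigned int this_corner,
				     const struct toy_pd *peer)
{
	unsigned int peer_corner = 0;

	if (peer != NULL && peer->enabled)
		peer_corner = peer->corner;
	return max_u(this_corner, peer_corner);
}

/* After: an enabled peer contributes at least its enabled_corner. */
static unsigned int aggregate_after(unsigned int this_corner,
				    const struct toy_pd *peer)
{
	unsigned int peer_corner = 0;

	if (peer != NULL && peer->enabled)
		peer_corner = max_u(peer->corner, peer->enabled_corner);
	return max_u(this_corner, peer_corner);
}

/* Peer is on without a vote; the local domain is disabled (corner 0). */
static void demo(void)
{
	struct toy_pd peer = { .enabled = true, .corner = 0, .enabled_corner = 1 };

	assert(aggregate_before(0, &peer) == 0); /* shared resource switched off */
	assert(aggregate_after(0, &peer) == 1);  /* shared resource stays on */
}
```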
    
    Fixes: e3e56c050ab6 ("soc: qcom: rpmhpd: Make power_on actually enable the domain")
    Reported-by: Johan Hovold <johan@kernel.org>
    Closes: https://lore.kernel.org/linux-arm-msm/ZdMwZa98L23mu3u6@hovoldconsulting.com/
    Cc:  <stable@vger.kernel.org>
    Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
    Reviewed-by: Stephen Boyd <swboyd@chromium.org>
    Tested-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20240226-rpmhpd-enable-corner-fix-v1-1-68c004cec48c@quicinc.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

power: supply: bq27xxx-i2c: Do not free non existing IRQ [+ + +]

Author: Hans de Goede <hdegoede@redhat.com>
Date:   Thu Feb 15 16:51:33 2024 +0100

    power: supply: bq27xxx-i2c: Do not free non existing IRQ
    
    [ Upstream commit 2df70149e73e79783bcbc7db4fa51ecef0e2022c ]
    
    The bq27xxx i2c-client may not have an IRQ, in which case
    client->irq will be 0. bq27xxx_battery_i2c_probe() already has
    an if (client->irq) check wrapping the request_threaded_irq().
    
    But bq27xxx_battery_i2c_remove() unconditionally calls
    free_irq(client->irq) leading to:
    
    [  190.310742] ------------[ cut here ]------------
    [  190.310843] Trying to free already-free IRQ 0
    [  190.310861] WARNING: CPU: 2 PID: 1304 at kernel/irq/manage.c:1893 free_irq+0x1b8/0x310
    
    Followed by a backtrace when unbinding the driver. Add
    an if (client->irq) to bq27xxx_battery_i2c_remove() mirroring
    probe() to fix this.
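    
    A toy model of the probe/remove symmetry, not the driver code. The
    struct and toy_* functions are hypothetical; the `warnings` counter
    stands in for the "Trying to free already-free IRQ" warning quoted
    above:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy client; irq == 0 means "no interrupt line", as in the text above. */
struct toy_client {
	int irq;
	bool irq_requested;
	int warnings; /* counts attempts to free an IRQ that was never requested */
};

/* Toy free_irq(): freeing an IRQ that was never requested is a warning. */
static void toy_free_irq(struct toy_client *c)
{
	if (!c->irq_requested)
		c->warnings++;
	c->irq_requested = false;
}

/* Probe only requests the IRQ when one exists. */
static void toy_probe(struct toy_client *c)
{
	if (c->irq)
		c->irq_requested = true;
}

/* Before: remove frees unconditionally. */
static void toy_remove_before(struct toy_client *c)
{
	toy_free_irq(c);
}

/* After: remove mirrors the check made in probe. */
static void toy_remove_after(struct toy_client *c)
{
	if (c->irq)
		toy_free_irq(c);
}

/* Exercises both remove variants, with and without an IRQ. */
static void demo(void)
{
	struct toy_client a = { .irq = 0 }, b = { .irq = 0 }, c = { .irq = 7 };

	toy_probe(&a);
	toy_remove_before(&a);
	assert(a.warnings == 1);

	toy_probe(&b);
	toy_remove_after(&b);
	assert(b.warnings == 0);

	toy_probe(&c);
	toy_remove_after(&c);
	assert(c.warnings == 0 && !c.irq_requested);
}
```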
    
    Fixes: 444ff00734f3 ("power: supply: bq27xxx: Fix I2C IRQ race on remove")
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Link: https://lore.kernel.org/r/20240215155133.70537-1-hdegoede@redhat.com
    Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

power: supply: mm8013: select REGMAP_I2C [+ + +]

Author: Thomas Weißschuh <linux@weissschuh.net>
Date:   Sun Feb 4 18:30:43 2024 +0100

    power: supply: mm8013: select REGMAP_I2C
    
    commit 30d5297862410418bb8f8b4c0a87fa55c3063dd7 upstream.
    
    The driver uses regmap APIs so it should make sure they are available.
    
    Fixes: c75f4bf6800b ("power: supply: Introduce MM8013 fuel gauge driver")
    Cc:  <stable@vger.kernel.org>
    Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20240204-mm8013-regmap-v1-1-7cc6b619b7d3@weissschuh.net
    Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/pseries/iommu: IOMMU table is not initialized for kdump over SR-IOV [+ + +]

Author: Gaurav Batra <gbatra@linux.vnet.ibm.com>
Date:   Thu Jan 25 14:30:17 2024 -0600

    powerpc/pseries/iommu: IOMMU table is not initialized for kdump over SR-IOV
    
    [ Upstream commit 09a3c1e46142199adcee372a420b024b4fc61051 ]
    
    When kdump kernel tries to copy dump data over SR-IOV, LPAR panics due
    to NULL pointer exception:
    
      Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
      BUG: Kernel NULL pointer dereference on read at 0x00000000
      Faulting instruction address: 0xc000000020847ad4
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
      Modules linked in: mlx5_core(+) vmx_crypto pseries_wdt papr_scm libnvdimm mlxfw tls psample sunrpc fuse overlay squashfs loop
      CPU: 12 PID: 315 Comm: systemd-udevd Not tainted 6.4.0-Test102+ #12
      Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_008) hv:phyp pSeries
      NIP:  c000000020847ad4 LR: c00000002083b2dc CTR: 00000000006cd18c
      REGS: c000000029162ca0 TRAP: 0300   Not tainted  (6.4.0-Test102+)
      MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 48288244  XER: 00000008
      CFAR: c00000002083b2d8 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1
      ...
      NIP _find_next_zero_bit+0x24/0x110
      LR  bitmap_find_next_zero_area_off+0x5c/0xe0
      Call Trace:
        dev_printk_emit+0x38/0x48 (unreliable)
        iommu_area_alloc+0xc4/0x180
        iommu_range_alloc+0x1e8/0x580
        iommu_alloc+0x60/0x130
        iommu_alloc_coherent+0x158/0x2b0
        dma_iommu_alloc_coherent+0x3c/0x50
        dma_alloc_attrs+0x170/0x1f0
        mlx5_cmd_init+0xc0/0x760 [mlx5_core]
        mlx5_function_setup+0xf0/0x510 [mlx5_core]
        mlx5_init_one+0x84/0x210 [mlx5_core]
        probe_one+0x118/0x2c0 [mlx5_core]
        local_pci_probe+0x68/0x110
        pci_call_probe+0x68/0x200
        pci_device_probe+0xbc/0x1a0
        really_probe+0x104/0x540
        __driver_probe_device+0xb4/0x230
        driver_probe_device+0x54/0x130
        __driver_attach+0x158/0x2b0
        bus_for_each_dev+0xa8/0x130
        driver_attach+0x34/0x50
        bus_add_driver+0x16c/0x300
        driver_register+0xa4/0x1b0
        __pci_register_driver+0x68/0x80
        mlx5_init+0xb8/0x100 [mlx5_core]
        do_one_initcall+0x60/0x300
        do_init_module+0x7c/0x2b0
    
    At the time of LPAR dump, before kexec hands over control to kdump
    kernel, DDWs (Dynamic DMA Windows) are scanned and added to the FDT.
    For the SR-IOV case, default DMA window "ibm,dma-window" is removed from
    the FDT and DDW added, for the device.
    
    Now, kexec hands over control to the kdump kernel.
    
    When the kdump kernel initializes, PCI busses are scanned and IOMMU
    group/tables created, in pci_dma_bus_setup_pSeriesLP(). For the SR-IOV
    case, there is no "ibm,dma-window". The original commit: b1fc44eaa9ba,
    fixes the path where memory is pre-mapped (direct mapped) to the DDW.
    When TCEs are direct mapped, there is no need to initialize IOMMU
    tables.
    
    iommu_table_setparms_lpar() only considers the "ibm,dma-window" property
    when initializing the IOMMU table. In the scenario where TCEs are
    dynamically allocated for SR-IOV, the newly created IOMMU table is not
    initialized. Later, when the device driver tries to enter TCEs for the
    SR-IOV device, a NULL pointer exception is thrown from iommu_area_alloc().
    
    The fix is to initialize the IOMMU table with DDW property stored in the
    FDT. There are 2 points to remember:
    
            1. For the dedicated adapter, kdump kernel would encounter both
               default and DDW in FDT. In this case, DDW property is used to
               initialize the IOMMU table.
    
            2. A DDW could be direct or dynamic mapped. kdump kernel would
               initialize IOMMU table and mark the existing DDW as
               "dynamic". This works fine since, at the time of table
               initialization, iommu_table_clear() makes some space in the
               DDW, for some predefined number of TCEs which are needed for
               kdump to succeed.
    
    Fixes: b1fc44eaa9ba ("pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window")
    Signed-off-by: Gaurav Batra <gbatra@linux.vnet.ibm.com>
    Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://msgid.link/20240125203017.61014-1-gbatra@linux.ibm.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/rtas: use correct function name for resetting TCE tables [+ + +]

Author: Nathan Lynch <nathanl@linux.ibm.com>
Date:   Thu Feb 22 16:19:14 2024 -0600

    powerpc/rtas: use correct function name for resetting TCE tables
    
    [ Upstream commit fad87dbd48156ab940538f052f1820f4b6ed2819 ]
    
    The PAPR spec spells the function name as
    
      "ibm,reset-pe-dma-windows"
    
    but in practice firmware uses the singular form:
    
      "ibm,reset-pe-dma-window"
    
    in the device tree. Since we have the wrong spelling in the RTAS
    function table, reverse lookups (token -> name) fail and warn:
    
      unexpected failed lookup for token 86
      WARNING: CPU: 1 PID: 545 at arch/powerpc/kernel/rtas.c:659 __do_enter_rtas_trace+0x2a4/0x2b4
      CPU: 1 PID: 545 Comm: systemd-udevd Not tainted 6.8.0-rc4 #30
      Hardware name: IBM,9105-22A POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NL1060_028) hv:phyp pSeries
      NIP [c0000000000417f0] __do_enter_rtas_trace+0x2a4/0x2b4
      LR [c0000000000417ec] __do_enter_rtas_trace+0x2a0/0x2b4
      Call Trace:
       __do_enter_rtas_trace+0x2a0/0x2b4 (unreliable)
       rtas_call+0x1f8/0x3e0
       enable_ddw.constprop.0+0x4d0/0xc84
       dma_iommu_dma_supported+0xe8/0x24c
       dma_set_mask+0x5c/0xd8
       mlx5_pci_init.constprop.0+0xf0/0x46c [mlx5_core]
       probe_one+0xfc/0x32c [mlx5_core]
       local_pci_probe+0x68/0x12c
       pci_call_probe+0x68/0x1ec
       pci_device_probe+0xbc/0x1a8
       really_probe+0x104/0x570
       __driver_probe_device+0xb8/0x224
       driver_probe_device+0x54/0x130
       __driver_attach+0x158/0x2b0
       bus_for_each_dev+0xa8/0x120
       driver_attach+0x34/0x48
       bus_add_driver+0x174/0x304
       driver_register+0x8c/0x1c4
       __pci_register_driver+0x68/0x7c
       mlx5_init+0xb8/0x118 [mlx5_core]
       do_one_initcall+0x60/0x388
       do_init_module+0x7c/0x2a4
       init_module_from_file+0xb4/0x108
       idempotent_init_module+0x184/0x34c
       sys_finit_module+0x90/0x114
    
    And oopses are possible when lockdep is enabled or the RTAS
    tracepoints are active, since those paths dereference the result of
    the lookup.
    
    Use the correct spelling to match firmware's behavior, adjusting the
    related constants to match.
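
The failure mode described above can be sketched as follows. This is a minimal model, not the kernel's RTAS code: `struct rtas_fn`, `rtas_fn_set_token()` and `rtas_fn_name()` are hypothetical names. It only illustrates why a misspelled table entry never receives its token, so a later token -> name lookup comes back empty.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an RTAS function table entry. */
struct rtas_fn {
	const char *name; /* property name expected in the device tree */
	int token;        /* token read from that property, or -1 if absent */
};

/* Record the token that firmware advertises under property `prop`. */
int rtas_fn_set_token(struct rtas_fn *tbl, size_t n, const char *prop, int token)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tbl[i].name, prop) == 0) {
			tbl[i].token = token;
			return 0;
		}
	}
	return -1; /* name not in the table: the token is never recorded */
}

/* Reverse lookup: token -> name, or NULL when no entry carries the token. */
const char *rtas_fn_name(const struct rtas_fn *tbl, size_t n, int token)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].token == token)
			return tbl[i].name;
	}
	return NULL;
}
```

With the plural spelling in the table, the singular property name used by firmware never matches, and the reverse lookup for token 86 returns NULL.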
    
    Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
    Fixes: 8252b88294d2 ("powerpc/rtas: improve function information lookups")
    Reported-by: Gaurav Batra <gbatra@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://msgid.link/20240222-rtas-fix-ibm-reset-pe-dma-window-v1-1-7aaf235ac63c@linux.ibm.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "drm/amd/pm: resolve reboot exception for si oland" [+ + +]

Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Aug 9 15:06:00 2023 -0400

    Revert "drm/amd/pm: resolve reboot exception for si oland"
    
    commit 955558030954b9637b41c97b730f9b38c92ac488 upstream.
    
    This reverts commit e490d60a2f76bff636c68ce4fe34c1b6c34bbd86.
    
    This causes hangs on SI when DC is enabled and errors on driver
    reboot and power off cycles.
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3216
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2755
    Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "riscv: mm: support Svnapot in huge vmap" [+ + +]

Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Tue Feb 27 21:50:15 2024 +0100

    Revert "riscv: mm: support Svnapot in huge vmap"
    
    [ Upstream commit 16ab4646c9057e0528b985ad772e3cb88c613db2 ]
    
    This reverts commit ce173474cf19fe7fbe8f0fc74e3c81ec9c3d9807.
    
    We cannot correctly deal with NAPOT mappings in vmalloc/vmap because if
    some part of a NAPOT mapping is unmapped, the remaining mapping is not
    updated accordingly. For example:
    
    ptr = vmalloc_huge(64 * 1024, GFP_KERNEL);
    vunmap_range((unsigned long)(ptr + PAGE_SIZE),
                 (unsigned long)(ptr + 64 * 1024));
    
    leads to the following kernel page table dump:
    
    0xffff8f8000ef0000-0xffff8f8000ef1000    0x00000001033c0000         4K PTE N   ..     ..   D A G . . W R V
    
    Meaning the first entry which was not unmapped still has the N bit set,
    which, if accessed first and cached in the TLB, could allow access to the
    unmapped range.
    
    That's because the logic to break the NAPOT mapping does not exist and
    likely won't. Indeed, to break a NAPOT mapping, we first have to clear
    the whole mapping, flush the TLB and then set the new mapping ("break-
    before-make" equivalent). That works fine in userspace since we can handle
    any pagefault occurring on the remaining mapping but we can't handle a kernel
    pagefault on such mapping.
    
    So fix this by reverting the commit that introduced the vmap/vmalloc
    support.
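
The stale N bit can be illustrated with a small model. This is a sketch under stated assumptions: page-table entries are plain 64-bit integers, only the V bit (bit 0) and the Svnapot N bit (bit 63) are modeled, and `unmap_ptes()` and `stale_napot_ptes()` are hypothetical helpers, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

#define PTE_V (1ULL << 0)   /* valid bit */
#define PTE_N (1ULL << 63)  /* Svnapot "N" bit */
#define NAPOT_PTES 16       /* a 64K NAPOT region spans 16 4K PTEs */

/* Naive unmap: clears each PTE in [first, last) and touches nothing else. */
void unmap_ptes(uint64_t *ptes, size_t first, size_t last)
{
	for (size_t i = first; i < last; i++)
		ptes[i] = 0;
}

/* Count PTEs that are still valid and still claim to be part of a NAPOT region. */
size_t stale_napot_ptes(const uint64_t *ptes, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if ((ptes[i] & PTE_V) && (ptes[i] & PTE_N))
			count++;
	}
	return count;
}
```

Unmapping entries 1..15 of a 16-entry NAPOT region leaves entry 0 valid with N still set, which matches the page table dump shown in the commit message.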
    
    Fixes: ce173474cf19 ("riscv: mm: support Svnapot in huge vmap")
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240227205016.121901-2-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH [+ + +]

Author: Nathan Chancellor <nathan@kernel.org>
Date:   Thu Jan 25 10:32:12 2024 -0700

    RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH
    
    commit 3aff0c459e77ac0fb1c4d6884433467f797f7357 upstream.
    
    Commit e4bb020f3dbb ("riscv: detect assembler support for .option arch")
    added two tests, one for a valid value to '.option arch' that should
    succeed and one for an invalid value that is expected to fail to make
    sure that support for '.option arch' is properly detected because Clang
    does not error when '.option arch' is not supported:
    
      $ clang --target=riscv64-linux-gnu -Werror -x assembler -c -o /dev/null <(echo '.option arch, +m')
      /dev/fd/63:1:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'relax' or 'norelax'
      .option arch, +m
              ^
      $ echo $?
      0
    
    Unfortunately, the invalid test started being accepted by Clang after
    the linked llvm-project change, which causes CONFIG_AS_HAS_OPTION_ARCH
    and configurations that depend on it to be silently disabled, even
    though those versions do support '.option arch'.
    
    The invalid test can be avoided altogether by using
    '-Wa,--fatal-warnings', which will turn all assembler warnings into
    errors, like '-Werror' does for the compiler:
    
      $ clang --target=riscv64-linux-gnu -Werror -Wa,--fatal-warnings -x assembler -c -o /dev/null <(echo '.option arch, +m')
      /dev/fd/63:1:9: error: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'relax' or 'norelax'
      .option arch, +m
              ^
      $ echo $?
      1
    
    The as-instr macros have been updated to make use of this flag, so
    remove the invalid test, which allows CONFIG_AS_HAS_OPTION_ARCH to work
    for all compiler versions.
    
    Cc: stable@vger.kernel.org
    Fixes: e4bb020f3dbb ("riscv: detect assembler support for .option arch")
    Link: https://github.com/llvm/llvm-project/commit/3ac9fe69f70a2b3541266daedbaaa7dc9c007a2a
    Reported-by: Eric Biggers <ebiggers@kernel.org>
    Closes: https://lore.kernel.org/r/20240121011341.GA97368@sol.localdomain/
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Tested-by: Eric Biggers <ebiggers@google.com>
    Tested-by: Andy Chiu <andybnac@gmail.com>
    Reviewed-by: Andy Chiu <andybnac@gmail.com>
    Tested-by: Conor Dooley <conor.dooley@microchip.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Acked-by: Masahiro Yamada <masahiroy@kernel.org>
    Link: https://lore.kernel.org/r/20240125-fix-riscv-option-arch-llvm-18-v1-2-390ac9cc3cd0@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs [+ + +]

Author: Conor Dooley <conor@kernel.org>
Date:   Fri Feb 23 11:31:31 2024 +0000

    RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs
    
    [ Upstream commit d82f32202e0df7bf40d4b67c8a4ff9cea32df4d9 ]
    
    Before attempting to support the pre-ratification version of vector
    found on older T-Head CPUs, disallow "v" in riscv,isa on these
    platforms. The deprecated property has no clear way to communicate
    the specific version of vector that is supported and much of the vendor
    provided software puts "v" in the isa string. riscv,isa-extensions
    should be used instead. This should not be too much of a burden for
    these systems, as the vendor shipped devicetrees and firmware do not
    work with a mainline kernel and will require updating.
    
    We can limit this restriction to only ignore v in riscv,isa on CPUs
    that report T-Head's vendor ID and a zero marchid. Newer T-Head CPUs
    that support the ratified version of vector should report non-zero
    marchid, according to Guo Ren [1].
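
The restriction above boils down to a two-field check. The sketch below is illustrative only: `ignore_v_from_riscv_isa()` is a hypothetical helper, and the vendor ID value is an assumption stated here rather than something taken from this commit message.

```c
#include <stdbool.h>

#define THEAD_VENDOR_ID 0x5b7UL /* assumed value of T-Head's mvendorid */

/* Should "v" parsed from the deprecated riscv,isa string be ignored? */
bool ignore_v_from_riscv_isa(unsigned long mvendorid, unsigned long marchid)
{
	/* Only older T-Head parts: T-Head vendor ID and a zero marchid. */
	return mvendorid == THEAD_VENDOR_ID && marchid == 0;
}
```

Newer T-Head CPUs that report a non-zero marchid, and CPUs from any other vendor, are unaffected by this check.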
    
    Link: https://lore.kernel.org/linux-riscv/CAJF2gTRy5eK73=d6s7CVy9m9pB8p4rAoMHM3cZFwzg=AuF7TDA@mail.gmail.com/ [1]
    Fixes: dc6667a4e7e3 ("riscv: Extending cpufeature.c to detect V-extension")
    Co-developed-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
    Acked-by: Guo Ren <guoren@kernel.org>
    Link: https://lore.kernel.org/r/20240223-tidings-shabby-607f086cb4d7@spud
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: add CALLER_ADDRx support [+ + +]

Author: Zong Li <zong.li@sifive.com>
Date:   Fri Feb 2 01:51:02 2024 +0000

    riscv: add CALLER_ADDRx support
    
    commit 680341382da56bd192ebfa4e58eaf4fec2e5bca7 upstream.
    
    CALLER_ADDRx returns the caller's address at the specified level; these
    macros are used by several tracers. They eventually use
    __builtin_return_address(n) to get the caller's address if the arch does
    not define its own implementation.
    
    On RISC-V, __builtin_return_address(n) only works when n == 0, so we need
    to walk the stack frames to get the caller's address at the specified level.
    
    data.level starts from 'level + 3' because of the call flow used to get
    the caller's address in the RISC-V implementation. Without the three
    additional iterations, the levels would correspond to the following:
    
    callsite -> return_address -> arch_stack_walk -> walk_stackframe
    |           |                 |                  |
    level 3     level 2           level 1            level 0
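
The 'level + 3' skip can be sketched with a simulated stack. The names below are hypothetical stand-ins: the stack is modeled as an array of program counters with the innermost frame first, and `walk_frames()` replaces a real stack walker.

```c
#include <stdbool.h>
#include <stddef.h>

struct return_address_data {
	unsigned int level; /* frames still to skip */
	unsigned long addr; /* result, 0 if not found */
};

/* Walker callback: skip `level` frames, then record the next pc and stop. */
bool save_return_addr(void *d, unsigned long pc)
{
	struct return_address_data *data = d;

	if (data->level == 0) {
		data->addr = pc;
		return false; /* stop walking */
	}
	data->level--;
	return true; /* keep walking */
}

/* Hypothetical stand-in for a stack walker: pcs[0] is the innermost frame. */
void walk_frames(const unsigned long *pcs, size_t n,
		 bool (*fn)(void *, unsigned long), void *arg)
{
	for (size_t i = 0; i < n; i++) {
		if (!fn(arg, pcs[i]))
			break;
	}
}

/* Return the caller's address at `level`, skipping the 3 helper frames. */
unsigned long return_address_sketch(const unsigned long *pcs, size_t n,
				    unsigned int level)
{
	struct return_address_data data = { .level = level + 3, .addr = 0 };

	walk_frames(pcs, n, save_return_addr, &data);
	return data.addr;
}
```

With frames ordered walk_stackframe, arch_stack_walk, return_address, callsite, level 0 resolves to the callsite entry, which is the behaviour the diagram above describes.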
    
    Fixes: 10626c32e382 ("riscv/ftrace: Add basic support")
    Cc: stable@vger.kernel.org
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Signed-off-by: Zong Li <zong.li@sifive.com>
    Link: https://lore.kernel.org/r/20240202015102.26251-1-zong.li@sifive.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION [+ + +]

Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Sun Feb 11 09:36:40 2024 +0100

    riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
    
    [ Upstream commit fc325b1a915f6d0c821bfcea21fb3f1354c4323b ]
    
    The new riscv specific arch_hugetlb_migration_supported() must be
    guarded with a #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION to avoid
    the following build error:
    
    In file included from include/linux/hugetlb.h:851,
                        from kernel/fork.c:52:
    >> arch/riscv/include/asm/hugetlb.h:15:42: error: static declaration of 'arch_hugetlb_migration_supported' follows non-static declaration
          15 | #define arch_hugetlb_migration_supported arch_hugetlb_migration_supported
             |                                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       include/linux/hugetlb.h:916:20: note: in expansion of macro 'arch_hugetlb_migration_supported'
         916 | static inline bool arch_hugetlb_migration_supported(struct hstate *h)
             |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       arch/riscv/include/asm/hugetlb.h:14:6: note: previous declaration of 'arch_hugetlb_migration_supported' with type 'bool(struct hstate *)' {aka '_Bool(struct hstate *)'}
          14 | bool arch_hugetlb_migration_supported(struct hstate *h);
             |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
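
The guard pattern can be sketched in a single file. This is a loose model of the two headers involved, with a simplified `struct hstate` and a generic fallback layout that is an assumption of this sketch; the config option is deliberately left undefined to model the failing configuration.

```c
#include <stdbool.h>

struct hstate { unsigned int order; }; /* simplified stand-in */

/* Models the Kconfig option; intentionally left undefined here. */
/* #define CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION 1 */

/* --- arch header (sketch): the override is only visible with the option --- */
#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
bool arch_hugetlb_migration_supported(struct hstate *h);
#define arch_hugetlb_migration_supported arch_hugetlb_migration_supported
#endif

/* --- generic header (sketch): fallback when no arch override exists --- */
#ifndef arch_hugetlb_migration_supported
static inline bool arch_hugetlb_migration_supported(struct hstate *h)
{
	(void)h;
	return false;
}
#endif
```

Without the `#ifdef` around the arch declaration, the non-static prototype would precede the static inline fallback, which is the conflict the compiler reports above.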
    
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202402110258.CV51JlEI-lkp@intel.com/
    Fixes: ce68c035457b ("riscv: Fix arch_hugetlb_migration_supported() for NAPOT")
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240211083640.756583-1-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: Fix enabling cbo.zero when running in M-mode [+ + +]

Author: Samuel Holland <samuel.holland@sifive.com>
Date:   Tue Feb 27 22:55:33 2024 -0800

    riscv: Fix enabling cbo.zero when running in M-mode
    
    commit 3fb3f7164edc467450e650dca51dbe4823315a56 upstream.
    
    When the kernel is running in M-mode, the CBZE bit must be set in the
    menvcfg CSR, not in senvcfg.
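
A minimal sketch of the CSR selection, assuming the CSR numbers and the CBZE bit position from the RISC-V privileged specification. `envcfg_csr_for_kernel()` and `envcfg_enable_cbo_zero()` are hypothetical helpers that only compute values; they do not touch any CSR.

```c
#include <stdbool.h>

#define CSR_SENVCFG 0x10a        /* supervisor environment configuration */
#define CSR_MENVCFG 0x30a        /* machine environment configuration */
#define ENVCFG_CBZE (1UL << 7)   /* enables cbo.zero for the next lower mode */

/* Pick the envcfg CSR that gates cbo.zero for user mode. */
unsigned int envcfg_csr_for_kernel(bool kernel_in_m_mode)
{
	return kernel_in_m_mode ? CSR_MENVCFG : CSR_SENVCFG;
}

/* Value to write back to the selected CSR with cbo.zero enabled. */
unsigned long envcfg_enable_cbo_zero(unsigned long envcfg)
{
	return envcfg | ENVCFG_CBZE;
}
```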
    
    Cc: <stable@vger.kernel.org>
    Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode")
    Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
    Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Link: https://lore.kernel.org/r/20240228065559.3434837-2-samuel.holland@sifive.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

riscv: Fix pte_leaf_size() for NAPOT [+ + +]

Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Tue Feb 27 21:50:16 2024 +0100

    riscv: Fix pte_leaf_size() for NAPOT
    
    [ Upstream commit e0fe5ab4192c171c111976dbe90bbd37d3976be0 ]
    
    pte_leaf_size() must be reimplemented to add support for NAPOT mappings.
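
A NAPOT-aware leaf size can be sketched as below. This is an illustration, not the kernel macro: it assumes the Sv39/48/57 PTE layout (PPN in bits 53:10, N in bit 63), 4K base pages, and the GCC/Clang builtin `__builtin_ctzll`; `pte_leaf_size_sketch()` is a hypothetical name.

```c
#include <stdint.h>

#define PAGE_SHIFT     12
#define PAGE_SIZE      (1ULL << PAGE_SHIFT)
#define PTE_N          (1ULL << 63)          /* Svnapot bit */
#define PTE_PFN_SHIFT  10
#define PTE_PFN_MASK   ((1ULL << 44) - 1)    /* PPN field is bits 53:10 */

/* Size of the region a leaf PTE maps: NAPOT-aware instead of always PAGE_SIZE. */
uint64_t pte_leaf_size_sketch(uint64_t pte)
{
	uint64_t pfn = (pte >> PTE_PFN_SHIFT) & PTE_PFN_MASK;

	if (!(pte & PTE_N) || pfn == 0)
		return PAGE_SIZE;

	/* The lowest set PPN bit encodes the order: ppn[3:0] = 0b1000 means 16 pages. */
	unsigned int order = (unsigned int)__builtin_ctzll(pfn << 1);

	return PAGE_SIZE << order;
}
```

A PTE with N set and ppn[3:0] = 0b1000 is reported as a 64K leaf, while an ordinary PTE stays at 4K.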
    
    Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240227205016.121901-3-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly [+ + +]

Author: Yangyu Chen <cyy@cyyself.name>
Date:   Wed Feb 21 11:02:31 2024 +0800

    riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly
    
    [ Upstream commit c21f014818600ae017f97ee087e7c136b1916aa7 ]
    
    The previous commit dbfbda3bd6bf ("riscv: mm: update T-Head memory type
    definitions") from patch [1] missed a `<` in a bit shift, with the result
    that bit(61) is not set in _PAGE_NOCACHE_THEAD and bit(0) is set instead.
    This patch fixes that.
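
The effect of the missing `<` is easy to reproduce in isolation. The macro names below are hypothetical and the bit pattern (bits 61 and 60) is an assumption of this sketch; the point is only that `1 < 61` is a comparison that evaluates to 1.

```c
#include <stdint.h>

/* Typo: `<` is a comparison, so (1 < 61) is simply 1 and bit 0 gets set. */
#define NOCACHE_THEAD_BROKEN (((uint64_t)1 < 61) | ((uint64_t)1 << 60))

/* Intended: a shift, so bit 61 and bit 60 are set. */
#define NOCACHE_THEAD_FIXED  (((uint64_t)1 << 61) | ((uint64_t)1 << 60))

/* Return 1 if `bit` is set in `value`, 0 otherwise. */
int bit_is_set(uint64_t value, unsigned int bit)
{
	return (int)((value >> bit) & 1);
}
```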
    
    Link: https://lore.kernel.org/linux-riscv/20230912072510.2510-1-jszhang@kernel.org/ [1]
    Fixes: dbfbda3bd6bf ("riscv: mm: update T-Head memory type definitions")
    Signed-off-by: Yangyu Chen <cyy@cyyself.name>
    Reviewed-by: Guo Ren <guoren@kernel.org>
    Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/tencent_E19FA1A095768063102E654C6FC858A32F06@qq.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: Sparse-Memory/vmemmap out-of-bounds fix [+ + +]

Author: Dimitris Vlachos <dvlachos@ics.forth.gr>
Date:   Thu Feb 29 21:17:23 2024 +0200

    riscv: Sparse-Memory/vmemmap out-of-bounds fix
    
    [ Upstream commit a11dd49dcb9376776193e15641f84fcc1e5980c9 ]
    
    Offset vmemmap so that the first page of vmemmap will be mapped
    to the first page of physical memory in order to ensure that
    vmemmap's bounds will be respected during
    pfn_to_page()/page_to_pfn() operations.
    The conversion macros will produce correct SV39/48/57 addresses
    for every possible/valid DRAM_BASE inside the physical memory limits.
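
The offset can be sketched with plain integer arithmetic. The base address and the size of `struct page` below are assumptions chosen for illustration, and both helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT        12
#define STRUCT_PAGE_SIZE  64ULL                  /* assumed sizeof(struct page) */
#define VMEMMAP_START     0xffff8d8000000000ULL  /* hypothetical base address */

/* Address of the struct page for `pfn`, with vmemmap offset by the first RAM pfn. */
uint64_t pfn_to_page_addr(uint64_t pfn, uint64_t phys_ram_base)
{
	uint64_t first_pfn = phys_ram_base >> PAGE_SHIFT;
	uint64_t vmemmap = VMEMMAP_START - first_pfn * STRUCT_PAGE_SIZE;

	return vmemmap + pfn * STRUCT_PAGE_SIZE;
}

/* Inverse conversion: struct page address back to its pfn. */
uint64_t page_addr_to_pfn(uint64_t page_addr, uint64_t phys_ram_base)
{
	uint64_t first_pfn = phys_ram_base >> PAGE_SHIFT;
	uint64_t vmemmap = VMEMMAP_START - first_pfn * STRUCT_PAGE_SIZE;

	return (page_addr - vmemmap) / STRUCT_PAGE_SIZE;
}
```

For RAM starting at 0x80000000, the first pfn (0x80000) lands exactly on VMEMMAP_START, so lookups for valid pfns stay inside the vmemmap region.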
    
    v2:Address Alex's comments
    
    Suggested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Signed-off-by: Dimitris Vlachos <dvlachos@ics.forth.gr>
    Reported-by: Dimitris Vlachos <dvlachos@ics.forth.gr>
    Closes: https://lore.kernel.org/linux-riscv/20240202135030.42265-1-csd4492@csd.uoc.gr
    Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240229191723.32779-1-dvlachos@ics.forth.gr
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: tlb: fix __p*d_free_tlb() [+ + +]

Author: Jisheng Zhang <jszhang@kernel.org>
Date:   Wed Dec 20 01:50:43 2023 +0800

    riscv: tlb: fix __p*d_free_tlb()
    
    [ Upstream commit 8246601a7d391ce8207408149d65732f28af81a1 ]
    
    If a non-leaf PTE, i.e. a pmd, pud or p4d, is modified, an sfence.vma is
    required for safety: an implementation may cache the non-leaf
    translation in the TLB. I have not met such hardware so far, but it is
    possible in theory.
    
    Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
    Fixes: c5e9b2c2ae82 ("riscv: Improve tlb_flush()")
    Link: https://lore.kernel.org/r/20231219175046.2496-2-jszhang@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back [+ + +]

Author: Lin Ma <linma@zju.edu.cn>
Date:   Tue Feb 27 20:11:28 2024 +0800

    rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
    
    [ Upstream commit 743ad091fb46e622f1b690385bb15e3cd3daf874 ]
    
    In the commit d73ef2d69c0d ("rtnetlink: let rtnl_bridge_setlink checks
    IFLA_BRIDGE_MODE length"), an adjustment was made to the old loop logic
    in the function `rtnl_bridge_setlink` to enable the loop to also check
    the length of the IFLA_BRIDGE_MODE attribute. However, this adjustment
    removed the `break` statement and broke the logic that writes the flags
    back at the end of this function.
    
    if (have_flags)
        memcpy(nla_data(attr), &flags, sizeof(flags));
        // attr should point to IFLA_BRIDGE_FLAGS NLA !!!
    
    Before the mentioned commit, `attr` was guaranteed to be IFLA_BRIDGE_FLAGS.
    However, this is no longer necessarily true, as the updated loop leaves
    attr pointing to the last NLA, possibly an invalid NLA, which could cause
    overflowing writes.
    
    This patch introduces a new variable `br_flag` to save the NLA pointer
    that points to IFLA_BRIDGE_FLAGS and uses it to resolve the mentioned
    error logic.
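
The saved-pointer approach can be sketched as follows. This is a simplified model, not the rtnetlink code: `struct attr`, the attribute type values and both functions are hypothetical stand-ins for netlink attributes and their accessors.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ATTR_BRIDGE_FLAGS = 0, ATTR_BRIDGE_MODE = 1 }; /* hypothetical types */

struct attr {          /* simplified stand-in for a netlink attribute */
	int type;
	uint16_t payload;
};

/*
 * Walk every attribute (no early break, so later attributes can still be
 * validated) but remember which one carried the flags.
 */
struct attr *find_flags_attr(struct attr *attrs, size_t n)
{
	struct attr *br_flag = NULL;

	for (size_t i = 0; i < n; i++) {
		if (attrs[i].type == ATTR_BRIDGE_FLAGS && !br_flag)
			br_flag = &attrs[i];
		/* other attribute types would be length-checked here */
	}
	return br_flag;
}

/* Write the updated flags into the saved attribute, not the last one seen. */
int write_back_flags(struct attr *attrs, size_t n, uint16_t flags)
{
	struct attr *br_flag = find_flags_attr(attrs, n);

	if (!br_flag)
		return -1;
	memcpy(&br_flag->payload, &flags, sizeof(flags));
	return 0;
}
```

Because the write goes through the saved pointer, an attribute that follows the flags attribute is left untouched.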
    
    Fixes: d73ef2d69c0d ("rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length")
    Signed-off-by: Lin Ma <linma@zju.edu.cn>
    Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
    Link: https://lore.kernel.org/r/20240227121128.608110-1-linma@zju.edu.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: mptcp: add chk_subflows_total helper [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Mar 4 14:38:30 2024 +0100

    selftests: mptcp: add chk_subflows_total helper
    
    commit 80775412882e273b8ef62124fae861cde8e6fb3d upstream.
    
    This patch adds a new helper chk_subflows_total(), which uses the newly
    added counter mptcpi_subflows_total to get the "correct" number of
    subflows, including the initial one.
    
    To be compatible with old 'ss' or kernel versions not supporting this
    counter, get the total subflows by listing TCP connections that are
    MPTCP subflows:
    
        ss -ti state established state syn-sent state syn-recv |
            grep -c tcp-ulp-mptcp.
    
    Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
    Signed-off-by: Geliang Tang <geliang.tang@suse.com>
    Signed-off-by: Mat Martineau <martineau@kernel.org>
    Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-3-8d6b94150f6b@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: mptcp: add evts_get_info helper [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Mar 4 14:38:29 2024 +0100

    selftests: mptcp: add evts_get_info helper
    
    commit 06848c0f341ee3f9226ed01e519c72e4d2b6f001 upstream.
    
    This patch adds a new helper get_info_value(), using 'sed' command to
    parse the value of the given item name in the line with the given keyword,
    to make chk_mptcp_info() and pedit_action_pkts() more readable.
    
    Also add another helper evts_get_info() to use get_info_value() to parse
    the output of 'pm_nl_ctl events' command, to make all the userspace pm
    selftests more readable, both in mptcp_join.sh and userspace_pm.sh.
    
    Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
    Signed-off-by: Geliang Tang <geliang.tang@suse.com>
    Signed-off-by: Mat Martineau <martineau@kernel.org>
    Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-2-8d6b94150f6b@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: mptcp: add mptcp_lib_is_v6 [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Mar 4 14:38:32 2024 +0100

    selftests: mptcp: add mptcp_lib_is_v6
    
    commit b850f2c7dd85ecd14a333685c4ffd23f12665e94 upstream.
    
    To avoid duplicated code in different MPTCP selftests, we can add
    and use helpers defined in mptcp_lib.sh.
    
    is_v6() helper is defined in mptcp_connect.sh, mptcp_join.sh and
    mptcp_sockopt.sh, so export it into mptcp_lib.sh and rename it as
    mptcp_lib_is_v6(). Use this new helper in all scripts.
    
    Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
    Signed-off-by: Geliang Tang <geliang.tang@suse.com>
    Signed-off-by: Mat Martineau <martineau@kernel.org>
    Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-10-8d6b94150f6b@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: mptcp: join: add ss mptcp support check [+ + +]

Author: Geliang Tang <tanggeliang@kylinos.cn>
Date:   Fri Feb 23 17:14:17 2024 +0100

    selftests: mptcp: join: add ss mptcp support check
    
    commit 9480f388a2ef54fba911d9325372abd69a328601 upstream.
    
    The 'ss -M' command is used in the mptcp_join.sh script to display only
    MPTCP sockets, so the script must check whether the ss tool supports MPTCP.
    
    Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases")
    Cc: stable@vger.kernel.org
    Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-7-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: mptcp: rm subflow with v4/v4mapped addr [+ + +]

Author: Geliang Tang <tanggeliang@kylinos.cn>
Date:   Mon Mar 4 14:38:33 2024 +0100

    selftests: mptcp: rm subflow with v4/v4mapped addr
    
    commit 7092dbee23282b6fcf1313fc64e2b92649ee16e8 upstream.
    
    Now that both a v4 address and a v4-mapped address are supported when
    destroying a userspace pm subflow, this patch adds a second subflow
    to the "userspace pm add & remove address" test, so that the two subflows
    can be removed in two different ways, one with the v4-mapped address and
    one with the v4 address.
    
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/387
    Fixes: 48d73f609dcc ("selftests: mptcp: update userspace pm addr tests")
    Cc: stable@vger.kernel.org
    Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-2-162e87e48497@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: mptcp: update userspace pm test helpers [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Mar 4 14:38:31 2024 +0100

    selftests: mptcp: update userspace pm test helpers
    
    commit 757c828ce94905a2975873d5e90a376c701b2b90 upstream.
    
    This patch adds a new 'namespace' argument to userspace_pm_add_addr() and
    userspace_pm_add_sf() to make these two helpers more versatile.
    
    Add two more versatile helpers to remove a userspace pm subflow or address:
    userspace_pm_rm_addr() and userspace_pm_rm_sf(). The original test helpers
    userspace_pm_rm_sf_addr_ns1() and userspace_pm_rm_sf_addr_ns2() can be
    replaced by these new helpers.
    
    Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
    Signed-off-by: Geliang Tang <geliang.tang@suse.com>
    Signed-off-by: Mat Martineau <martineau@kernel.org>
    Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-4-8d6b94150f6b@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

soc: qcom: pmic_glink: Fix boot when QRTR=m [+ + +]

Author: Rob Clark <robdclark@chromium.org>
Date:   Sat Feb 17 16:02:26 2024 +0100

    soc: qcom: pmic_glink: Fix boot when QRTR=m
    
    commit f79ee78767ca60e7a2c89eacd2dbdf237d97e838 upstream.
    
    We need to bail out before adding/removing devices if we are going to
    -EPROBE_DEFER. Otherwise boot can get stuck in a probe deferral loop due
    to a long-standing issue in driver core (see commit fbc35b45f9f6 ("Add
    documentation on meaning of -EPROBE_DEFER")).
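
The ordering rule can be sketched in a few lines. This is not the pmic_glink probe function: `struct probe_state`, `probe_sketch()` and the `dependency_ready` flag are hypothetical, and the error value is defined locally to keep the example self-contained.

```c
#define EPROBE_DEFER_SKETCH 517 /* assumed to mirror the kernel's EPROBE_DEFER */

struct probe_state {
	int children_registered; /* side effects performed by probe */
};

/* Check for deferral first; only register child devices once it is ruled out. */
int probe_sketch(struct probe_state *st, int dependency_ready)
{
	if (!dependency_ready)
		return -EPROBE_DEFER_SKETCH; /* bail out before any side effects */

	st->children_registered++;
	return 0;
}
```

A deferred probe leaves no child devices behind, so retrying it does not add and remove devices on every attempt.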
    
    Deregistering the altmode child device can potentially also trigger bugs
    in the DRM bridge implementation, which does not expect bridges to go
    away.
    
    [DB: slightly fixed commit message by adding the word 'commit']
    Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Signed-off-by: Rob Clark <robdclark@chromium.org>
    Link: https://lore.kernel.org/r/20231213210644.8702-1-robdclark@gmail.com
    [ johan: rebase on 6.8-rc4, amend commit message and mention DRM ]
    Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK driver")
    Cc: <stable@vger.kernel.org>      # 6.3
    Cc: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Bjorn Andersson <andersson@kernel.org>
    Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
    Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240217150228.5788-5-johan+linaro@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

spi: cadence-qspi: fix pointer reference in runtime PM hooks [+ + +]

Author: Théo Lebrun <theo.lebrun@bootlin.com>
Date:   Thu Feb 22 11:12:29 2024 +0100

    spi: cadence-qspi: fix pointer reference in runtime PM hooks
    
    [ Upstream commit 32ce3bb57b6b402de2aec1012511e7ac4e7449dc ]
    
    dev_get_drvdata() is used to acquire both the pointer to cqspi and the
    pointer to the SPI controller. Neither embeds the other; this leads to
    memory corruption.
    
    On a given platform (Mobileye EyeQ5) the memory corruption is hidden
    inside cqspi->f_pdata. Also, this uninitialised memory is used as a
    mutex (ctlr->bus_lock_mutex) by spi_controller_suspend().
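
The pointer mix-up can be sketched with simplified types. Everything below is a hypothetical model: the `_sketch` structures stand in for the driver state, the SPI controller and the device, and the `host` back-pointer is an assumption of this example.

```c
#include <stddef.h>

struct spi_controller_sketch { int bus_lock; };

struct cqspi_sketch {
	struct spi_controller_sketch *host; /* back-pointer to the controller */
	int f_pdata;
};

struct device_sketch { void *drvdata; };

/* Stand-in for dev_get_drvdata(): returns whatever probe stored. */
void *get_drvdata(const struct device_sketch *dev)
{
	return dev->drvdata;
}

/* drvdata holds the driver state; reach the controller through it. */
struct spi_controller_sketch *runtime_pm_controller(const struct device_sketch *dev)
{
	struct cqspi_sketch *cqspi = get_drvdata(dev);

	return cqspi->host;
}
```

Casting the same drvdata pointer to both types would make one of the two views read unrelated memory, which is the corruption described above.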
    
    Fixes: 2087e85bb66e ("spi: cadence-quadspi: fix suspend-resume implementations")
    Reviewed-by: Dhruva Gole <d-gole@ti.com>
    Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
    Link: https://msgid.link/r/20240222-cdns-qspi-pm-fix-v4-1-6b6af8bcbf59@bootlin.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks [+ + +]

Author: Théo Lebrun <theo.lebrun@bootlin.com>
Date:   Thu Feb 22 11:12:30 2024 +0100

    spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooks
    
    [ Upstream commit 959043afe53ae80633e810416cee6076da6e91c6 ]
    
    The ->runtime_suspend() and ->runtime_resume() callbacks are not
    expected to call spi_controller_suspend() and spi_controller_resume().
    Remove calls to those in the cadence-qspi driver.
    
    Those helpers have two roles currently:
     - They stop/start the queue, including dealing with the kworker.
     - They toggle the SPI controller SPI_CONTROLLER_SUSPENDED flag. It
       requires acquiring ctlr->bus_lock_mutex.
    
    Step one is irrelevant because cadence-qspi is not queued. Step two
    however has two implications:
     - A deadlock occurs, because ->runtime_resume() is called in a context
       where the lock is already taken (in the ->exec_op() callback, where
       the usage count is incremented).
     - It would disallow all operations once the device is auto-suspended.
    
    Here is a brief call tree highlighting the mutex deadlock:
    
    spi_mem_exec_op()
            ...
            spi_mem_access_start()
                    mutex_lock(&ctlr->bus_lock_mutex)
    
            cqspi_exec_mem_op()
                    pm_runtime_resume_and_get()
                            cqspi_resume()
                                    spi_controller_resume()
                                            mutex_lock(&ctlr->bus_lock_mutex)
                    ...
    
            spi_mem_access_end()
                    mutex_unlock(&ctlr->bus_lock_mutex)
            ...
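
The call tree above can be modeled with a non-recursive lock. This is a sketch only: `struct bus_lock` is a hypothetical flag-based model in which a failed acquire marks the point where a real mutex_lock() would block forever, and the `calls_system_resume` parameter models the removed spi_controller_resume() call.

```c
#include <stdbool.h>

struct bus_lock { bool held; };

/* Non-recursive lock model: false means a real mutex would block here. */
bool bus_lock_acquire(struct bus_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

void bus_lock_release(struct bus_lock *l)
{
	l->held = false;
}

/* Runtime-resume hook; optionally takes the bus lock like the system-wide helper. */
bool runtime_resume_sketch(struct bus_lock *l, bool calls_system_resume)
{
	if (calls_system_resume) {
		if (!bus_lock_acquire(l))
			return false; /* would deadlock */
		bus_lock_release(l);
	}
	return true;
}

/* Memory operation: takes the bus lock, then resumes the device while holding it. */
bool exec_op_sketch(struct bus_lock *l, bool calls_system_resume)
{
	bool ok;

	if (!bus_lock_acquire(l))
		return false;
	ok = runtime_resume_sketch(l, calls_system_resume);
	bus_lock_release(l);
	return ok;
}
```

With the system-wide helper call in the runtime hook the nested acquire fails; without it the operation completes and the lock is released.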
    
    Fixes: 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support")
    Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
    Link: https://msgid.link/r/20240222-cdns-qspi-pm-fix-v4-2-6b6af8bcbf59@bootlin.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

stmmac: Clear variable when destroying workqueue [+ + +]

Author: Jakub Raczynski <j.raczynski@samsung.com>
Date:   Mon Feb 26 17:42:32 2024 +0100

    stmmac: Clear variable when destroying workqueue
    
    [ Upstream commit 8af411bbba1f457c33734795f024d0ef26d0963f ]
    
    Currently, when suspending the driver and stopping the workqueue, the code
    checks whether the workqueue is not NULL and, if so, destroys it.
    destroy_workqueue() drains the queue and releases it, but it does not set
    the workqueue variable to NULL. This can cause a kernel/module panic if
    the code attempts to destroy a workqueue that was not initialized.
    
    This scenario is possible when resuming a suspended driver in
    stmmac_resume(), because there is no handling for a failed
    stmmac_hw_setup(), which can fail and return if the DMA engine has failed
    to initialize, and the workqueue is initialized after the DMA engine.
    Should the DMA engine fail to initialize, resume will proceed normally,
    but the interface won't work and the TX queue will eventually time out,
    causing a 'Reset adapter' error.
    The reset process then destroys the workqueue.
    Since the workqueue is initialized after the DMA engine and its
    initialization can be skipped, this causes a kernel/module panic.
    
    To guard against this possible crash, set the workqueue variable to NULL
    when destroying the workqueue.
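
The destroy-then-clear pattern can be sketched in user space. This is an illustration, not the stmmac code: `struct fpe_priv` and `fpe_stop_wq_sketch()` are hypothetical, and `free()` stands in for destroy_workqueue().

```c
#include <stdlib.h>

struct fpe_priv {
	void *fpe_wq; /* NULL when no workqueue exists */
};

/* Stop the workqueue and clear the pointer so a later call is a harmless no-op. */
int fpe_stop_wq_sketch(struct fpe_priv *priv)
{
	if (!priv->fpe_wq)
		return 0; /* nothing to destroy */

	free(priv->fpe_wq);  /* stands in for destroy_workqueue() */
	priv->fpe_wq = NULL; /* the step the fix adds */
	return 1;
}
```

Without the assignment to NULL, a second stop after a failed re-initialization would operate on a dangling pointer.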
    
    Log/backtrace from crash goes as follows:
    [88.031977]------------[ cut here ]------------
    [88.031985]NETDEV WATCHDOG: eth0 (sxgmac): transmit queue 1 timed out
    [88.032017]WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x390/0x398
               <Skipping backtrace for watchdog timeout>
    [88.032251]---[ end trace e70de432e4d5c2c0 ]---
    [88.032282]sxgmac 16d88000.ethernet eth0: Reset adapter.
    [88.036359]------------[ cut here ]------------
    [88.036519]Call trace:
    [88.036523] flush_workqueue+0x3e4/0x430
    [88.036528] drain_workqueue+0xc4/0x160
    [88.036533] destroy_workqueue+0x40/0x270
    [88.036537] stmmac_fpe_stop_wq+0x4c/0x70
    [88.036541] stmmac_release+0x278/0x280
    [88.036546] __dev_close_many+0xcc/0x158
    [88.036551] dev_close_many+0xbc/0x190
    [88.036555] dev_close.part.0+0x70/0xc0
    [88.036560] dev_close+0x24/0x30
    [88.036564] stmmac_service_task+0x110/0x140
    [88.036569] process_one_work+0x1d8/0x4a0
    [88.036573] worker_thread+0x54/0x408
    [88.036578] kthread+0x164/0x170
    [88.036583] ret_from_fork+0x10/0x20
    [88.036588]---[ end trace e70de432e4d5c2c1 ]---
    [88.036597]Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004
    
    Fixes: 5a5586112b929 ("net: stmmac: support FPE link partner hand-shaking procedure")
    Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
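
Illustration for the stmmac entry above, not taken from the patch: a minimal user-space sketch of the "clear the pointer after destroying" pattern the commit message describes. The names fake_wq, fake_priv, fake_destroy_workqueue() and fake_stop_wq() are hypothetical stand-ins, and free() merely plays the role of destroy_workqueue().

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's workqueue and private data. */
struct fake_wq { int pending; };
struct fake_priv { struct fake_wq *wq; };

static int destroy_calls; /* counts how often the queue was really destroyed */

/* Plays the role of destroy_workqueue(): it releases the object but has no
 * way to clear the pointer that the caller still holds. */
static void fake_destroy_workqueue(struct fake_wq *wq)
{
	destroy_calls++;
	free(wq);
}

/* Stop helper: destroy the queue once, then forget the stale pointer so a
 * later call (for example from a reset path) becomes a harmless no-op. */
static void fake_stop_wq(struct fake_priv *priv)
{
	if (!priv->wq)
		return;
	fake_destroy_workqueue(priv->wq);
	priv->wq = NULL; /* the step the fix adds, per the commit message */
}
```

Calling fake_stop_wq() twice on the same fake_priv destroys the queue only once; without the NULL assignment the second call would pass a dangling pointer to the destroy routine.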

tls: decrement decrypt_pending if no async completion will be called [+ + +]

Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Wed Feb 28 23:43:57 2024 +0100

    tls: decrement decrypt_pending if no async completion will be called
    
    [ Upstream commit f7fa16d49837f947ee59492958f9e6f0e51d9a78 ]
    
    With mixed sync/async decryption, or failures of crypto_aead_decrypt,
    we increment decrypt_pending but we never do the corresponding
    decrement since tls_decrypt_done will not be called. In this case, we
    should decrement decrypt_pending immediately to avoid getting stuck.
    
    For example, the prequeue test gets stuck with mixed
    modes (one async decrypt + one sync decrypt).
    
    Fixes: 94524d8fc965 ("net/tls: Add support for async decryption of tls records")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/c56d5fc35543891d5319f834f25622360e1bfbec.1709132643.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tls: fix peeking with sync+async decryption [+ + +]

Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Wed Feb 28 23:43:58 2024 +0100

    tls: fix peeking with sync+async decryption
    
    [ Upstream commit 6caaf104423d809b49a67ee6500191d063b40dc6 ]
    
    If we peek from 2 records with a currently empty rx_list, and the
    first record is decrypted synchronously but the second record is
    decrypted async, the following happens:
      1. decrypt record 1 (sync)
      2. copy from record 1 to the userspace's msg
      3. queue the decrypted record to rx_list for future read(!PEEK)
      4. decrypt record 2 (async)
      5. queue record 2 to rx_list
      6. call process_rx_list to copy data from the 2nd record
    
    We currently pass copied=0 as skip offset to process_rx_list, so we
    end up copying once again from the first record. We should skip over
    the data we've already copied.
    
    Seen with selftest tls.12_aes_gcm.recv_peek_large_buf_mult_recs
    
    Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/1b132d2b2b99296bfde54e8a67672d90d6d16e71.1709132643.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
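
Illustration for the peeking entry above, not the kernel code: a small user-space sketch of why the skip offset matters when copying from a list of queued records. struct rec and copy_from_list() are hypothetical names; the real process_rx_list() works on skbs and has a different signature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical queued record: a pointer to decrypted bytes and their length. */
struct rec { const char *data; size_t len; };

/* Copy up to 'want' bytes from the record list into dst, ignoring the first
 * 'skip' bytes of the list. Returns the number of bytes copied. */
static size_t copy_from_list(const struct rec *recs, size_t nrecs,
			     size_t skip, char *dst, size_t want)
{
	size_t copied = 0;

	for (size_t i = 0; i < nrecs && copied < want; i++) {
		size_t len = recs[i].len;
		size_t chunk;

		if (skip >= len) {	/* record was already delivered */
			skip -= len;
			continue;
		}
		chunk = len - skip;
		if (chunk > want - copied)
			chunk = want - copied;
		memcpy(dst + copied, recs[i].data + skip, chunk);
		copied += chunk;
		skip = 0;
	}
	return copied;
}
```

With two 4-byte records "AAAA" and "BBBB" and the first already copied to the user buffer, a skip of 0 copies "AAAA" a second time, while a skip of 4 (the bytes already copied) yields "BBBB" as intended.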

tls: fix use-after-free on failed backlog decryption [+ + +]

Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Wed Feb 28 23:44:00 2024 +0100

    tls: fix use-after-free on failed backlog decryption
    
    [ Upstream commit 13114dc5543069f7b97991e3b79937b6da05f5b0 ]
    
    When the decrypt request goes to the backlog and crypto_aead_decrypt
    returns -EBUSY, tls_do_decryption will wait until all async
    decryptions have completed. If one of them fails, tls_do_decryption
    will return -EBADMSG and tls_decrypt_sg jumps to the error path,
    releasing all the pages. But the pages have been passed to the async
    callback, and have already been released by tls_decrypt_done.
    
    The only true async case is when crypto_aead_decrypt returns
    -EINPROGRESS. With -EBUSY, we already waited so we can tell
    tls_sw_recvmsg that the data is available for immediate copy, but we
    need to notify tls_decrypt_sg (via the new ->async_done flag) that the
    memory has already been released.
    
    Fixes: 859054147318 ("net: tls: handle backlogging of crypto requests")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/4755dd8d9bebdefaa19ce1439b833d6199d4364c.1709132643.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tls: separate no-async decryption request handling from async [+ + +]

Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Wed Feb 28 23:43:59 2024 +0100

    tls: separate no-async decryption request handling from async
    
    [ Upstream commit 41532b785e9d79636b3815a64ddf6a096647d011 ]
    
    If we're not doing async, the handling is much simpler. There's no
    reference counting, we just need to wait for the completion to wake us
    up and return its result.
    
    We should preferably also use a separate crypto_wait. I'm not seeing a
    UAF as I did in the past, I think aec7961916f3 ("tls: fix race between
    async notify and socket close") took care of it.
    
    This will make the next fix easier.
    
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/47bde5f649707610eaef9f0d679519966fc31061.1709132643.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 13114dc55430 ("tls: fix use-after-free on failed backlog decryption")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tomoyo: fix UAF write bug in tomoyo_write_control() [+ + +]

Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date:   Fri Mar 1 22:04:06 2024 +0900

    tomoyo: fix UAF write bug in tomoyo_write_control()
    
    commit 2f03fc340cac9ea1dc63cbf8c93dd2eb0f227815 upstream.
    
    Since tomoyo_write_control() updates head->write_buf when write()
    of long lines is requested, we need to fetch head->write_buf after
    head->io_sem is held.  Otherwise, concurrent write() requests can
    cause use-after-free-write and double-free problems.
    
    Reported-by: Sam Sun <samsun1006219@gmail.com>
    Closes: https://lkml.kernel.org/r/CAEkJfYNDspuGxYx5kym8Lvp--D36CMDUErg4rxfWFJuPbbji8g@mail.gmail.com
    Fixes: bd03a3e4c9a9 ("TOMOYO: Add policy namespace support.")
    Cc:  <stable@vger.kernel.org> # Linux 3.1+
    Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tools: ynl: fix handling of multiple mcast groups [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Feb 26 13:40:18 2024 -0800

    tools: ynl: fix handling of multiple mcast groups
    
    [ Upstream commit b6c65eb20ffa8e3bd89f551427dbeee2876d72ca ]
    
    We never increment the group number iterator, so all groups
    get recorded into index 0 of the mcast_groups[] array.
    
    As a result, YNL can only use the last group.
    For example, using the "netdev" sample on a kernel with
    page pool commands results in:
    
      $ ./samples/netdev
      YNL: Multicast group 'mgmt' not found
    
    Most families have only one multicast group, so this hasn't
    been noticed. Plus perhaps developers usually test the last
    group which would have worked.
    
    Fixes: 86878f14d71a ("tools: ynl: user space helpers")
    Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
    Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Link: https://lore.kernel.org/r/20240226214019.1255242-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
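
Illustration for the ynl entry above, not the YNL source: a user-space sketch of what happens when the array iterator is never advanced. The types, the record_groups() and find_group() helpers, the 'advance' switch and the group ids are all hypothetical; only the group names come from the example in the commit message.

```c
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 8

struct mcast_group { unsigned int id; char name[16]; };
struct family { struct mcast_group groups[MAX_GROUPS]; size_t n_groups; };

/* Record every (name, id) pair. With advance == 0 the write index stays at 0,
 * which reproduces the bug: each group overwrites the previous one. */
static void record_groups(struct family *f, const char *const *names,
			  const unsigned int *ids, size_t n, int advance)
{
	size_t i = 0;

	memset(f, 0, sizeof(*f));
	for (size_t k = 0; k < n && k < MAX_GROUPS; k++) {
		strncpy(f->groups[i].name, names[k],
			sizeof(f->groups[i].name) - 1);
		f->groups[i].id = ids[k];
		if (advance)
			i++;	/* the increment that was missing */
	}
	f->n_groups = n < MAX_GROUPS ? n : MAX_GROUPS;
}

/* Look a group up by name; returns NULL when it was not recorded. */
static const struct mcast_group *find_group(const struct family *f,
					    const char *name)
{
	for (size_t i = 0; i < f->n_groups; i++)
		if (strcmp(f->groups[i].name, name) == 0)
			return &f->groups[i];
	return NULL;
}
```

Recording "mgmt" followed by "page-pool" without advancing leaves only "page-pool" findable, which matches the "Multicast group 'mgmt' not found" symptom quoted above.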

tun: Fix xdp_rxq_info's queue_index when detaching [+ + +]

Author: Yunjian Wang <wangyunjian@huawei.com>
Date:   Tue Feb 20 11:12:07 2024 +0800

    tun: Fix xdp_rxq_info's queue_index when detaching
    
    [ Upstream commit 2a770cdc4382b457ca3d43d03f0f0064f905a0d0 ]
    
    When a queue (tfile) is detached, we only update tfile's queue_index,
    but do not update xdp_rxq_info's queue_index. This patch fixes it.
    
    Fixes: 8bf5c4ee1889 ("tun: setup xdp_rxq_info")
    Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
    Link: https://lore.kernel.org/r/1708398727-46308-1-git-send-email-wangyunjian@huawei.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

uapi: in6: replace temporary label with rfc9486 [+ + +]

Author: Justin Iurman <justin.iurman@uliege.be>
Date:   Mon Feb 26 13:49:21 2024 +0100

    uapi: in6: replace temporary label with rfc9486
    
    [ Upstream commit 6a2008641920a9c6fe1abbeb9acbec463215d505 ]
    
    Not really a fix per se, but IPV6_TLV_IOAM is still tagged as "TEMPORARY
    IANA allocation for IOAM", while RFC 9486 has been available for some
    time now. Just update the reference.
    
    Fixes: 9ee11f0fff20 ("ipv6: ioam: Data plane support for Pre-allocated Trace")
    Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240226124921.9097-1-justin.iurman@uliege.be
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

veth: try harder when allocating queue memory [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri Feb 23 15:59:08 2024 -0800

    veth: try harder when allocating queue memory
    
    [ Upstream commit 1ce7d306ea63f3e379557c79abd88052e0483813 ]
    
    struct veth_rq is pretty large, 832B total without debug
    options enabled. Since the commit under Fixes, we try to pre-allocate
    enough queues for every possible CPU. Miao Wang reports that
    this may lead to order-5 allocations which will fail in production.
    
    Let the allocation fall back to vmalloc() and try harder.
    These are the same flags we pass to netdev queue allocation.
    
    Reported-and-tested-by: Miao Wang <shankerwangmiao@gmail.com>
    Fixes: 9d3684c24a52 ("veth: create by default nr_possible_cpus queues")
    Link: https://lore.kernel.org/all/5F52CAE2-2FB7-4712-95F1-3312FBBFA8DD@gmail.com/
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240223235908.693010-1-kuba@kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
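
Worked arithmetic for the veth entry above, as a rough sketch: it assumes 4 KiB pages and that a physically contiguous allocation is rounded up to a power-of-two number of pages. The 832-byte size is the figure from the commit message; alloc_order() and veth_rq_order() are hypothetical helper names.

```c
#include <stddef.h>

#define PAGE_SIZE_ASSUMED 4096u	/* assumption: 4 KiB pages */
#define VETH_RQ_SIZE 832u	/* struct veth_rq size quoted in the commit */

/* Smallest order such that (PAGE_SIZE_ASSUMED << order) >= bytes. */
static unsigned int alloc_order(size_t bytes)
{
	unsigned int order = 0;
	size_t span = PAGE_SIZE_ASSUMED;

	while (span < bytes) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Order needed for one queue structure per possible CPU. */
static unsigned int veth_rq_order(unsigned int nr_cpus)
{
	return alloc_order((size_t)nr_cpus * VETH_RQ_SIZE);
}
```

Under these assumptions 78 possible CPUs still fit in an order-4 block (64896 bytes), while 79 or more need order 5, so a 128-CPU machine already lands in the range the report describes.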

wifi: nl80211: reject iftype change with mesh ID change [+ + +]

Author: Johannes Berg <johannes.berg@intel.com>
Date:   Wed Feb 14 20:08:35 2024 +0100

    wifi: nl80211: reject iftype change with mesh ID change
    
    commit f78c1375339a291cba492a70eaf12ec501d28a8e upstream.
    
    It's currently possible to change the mesh ID when the
    interface isn't yet in mesh mode, at the same time as
    changing it into mesh mode. This leads to an overwrite
    of data in the wdev->u union for the interface type it
    currently has, causing cfg80211_change_iface() to do
    wrong things when switching.
    
    We could probably allow setting an interface to mesh
    while setting the mesh ID at the same time by doing a
    different order of operations here, but realistically
    there's no userspace that's going to do this, so just
    disallow changes in iftype when setting mesh ID.
    
    Cc: stable@vger.kernel.org
    Fixes: 29cbe68c516a ("cfg80211/mac80211: add mesh join/leave commands")
    Reported-by: syzbot+dd4779978217b1973180@syzkaller.appspotmail.com
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Sun Mar 3 20:24:19 2024 -0800

    x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
    
    commit 6613d82e617dd7eb8b0c40b2fe3acea655b1d611 upstream.
    
    The VERW mitigation at exit-to-user is enabled via a static branch
    mds_user_clear. This static branch is never toggled after boot, and can
    be safely replaced with an ALTERNATIVE() which is convenient to use in
    asm.
    
    Switch to ALTERNATIVE() to use the VERW mitigation late in exit-to-user
    path. Also remove the now redundant VERW in exc_nmi() and
    arch_exit_to_user_mode().
    
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Link: https://lore.kernel.org/all/20240213-delay-verw-v8-4-a6216d83edb7%40linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers [+ + +]

Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Feb 1 00:09:02 2024 +0100

    x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers
    
    commit 6890cb1ace350b4386c8aee1343dc3b3ddd214da upstream.
    
    MKTME repurposes the high bits of the physical address as the key ID for
    the encryption key and, even though MAXPHYADDR in CPUID[0x80000008]
    remains the same, the valid bits in the MTRR mask register are based on
    the reduced number of physical address bits.
    
    detect_tme() in arch/x86/kernel/cpu/intel.c detects TME and subtracts
    it from the total usable physical bits, but it is called too late.
    Move the call to early_init_intel() so that it is called in setup_arch(),
    before MTRRs are set up.
    
    This fixes boot on TDX-enabled systems, which until now only worked with
    "disable_mtrr_cleanup".  Without the patch, the values written to the
    MTRR mask registers were 52-bit wide (e.g. 0x000fffff_80000800) and
    the writes failed; with the patch, the values are 46-bit wide, which
    matches the reduced MAXPHYADDR that is shown in /proc/cpuinfo.
    
    Reported-by: Zixi Chen <zixchen@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20240131230902.1867092-3-pbonzini%40redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
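
Worked example for the TME entry above, as a sketch rather than kernel code: how the width of a variable-range MTRR mask follows from the number of physical address bits. mtrr_mask() is a hypothetical helper, and the 2 GiB range size is an assumption chosen because it reproduces the 0x000fffff_80000800 value quoted in the commit message. Bit 11 is the valid bit of the mask register.

```c
#include <stdint.h>

#define MTRR_MASK_VALID (1ULL << 11)	/* valid bit of a PHYSMASK register */

/* Mask for a naturally aligned power-of-two 'size', limited to 'phys_bits'
 * physical address bits (phys_bits must be below 64). */
static uint64_t mtrr_mask(unsigned int phys_bits, uint64_t size)
{
	uint64_t addr_mask = (1ULL << phys_bits) - 1;

	return (addr_mask & ~(size - 1)) | MTRR_MASK_VALID;
}
```

With 52 physical bits and a 2 GiB range this gives 0x000fffff80000800; with 46 bits it gives 0x00003fff80000800, leaving the bits taken over by key IDs clear.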

x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu() [+ + +]

Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Feb 1 00:09:01 2024 +0100

    x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()
    
    commit 9a458198eba98b7207669a166e64d04b04cb651b upstream.
    
    In commit fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct
    value straight away, instead of a two-phase approach"), the initialization
    of c->x86_phys_bits was moved after this_cpu->c_early_init(c).  This is
    incorrect because early_init_amd() expected to be able to reduce the
    value according to the contents of CPUID leaf 0x8000001f.
    
    Fortunately, the bug was negated by init_amd()'s call to early_init_amd(),
    which does reduce x86_phys_bits in the end.  However, this is very
    late in the boot process and, most notably, the wrong value is used for
    x86_phys_bits when setting up MTRRs.
    
    To fix this, call get_cpu_address_sizes() as soon as X86_FEATURE_CPUID is
    set/cleared, and c->extended_cpuid_level is retrieved.
    
    Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/20240131230902.1867092-2-pbonzini%40redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/e820: Don't reserve SETUP_RNG_SEED in e820 [+ + +]

Author: Jiri Bohac <jbohac@suse.cz>
Date:   Wed Jan 31 01:04:28 2024 +0100

    x86/e820: Don't reserve SETUP_RNG_SEED in e820
    
    commit 7fd817c906503b6813ea3b41f5fdf4192449a707 upstream.
    
    SETUP_RNG_SEED in setup_data is supplied by kexec and should
    not be reserved in the e820 map.
    
    Doing so reserves 16 bytes of RAM when booting with kexec.
    (16 bytes because data->len is zeroed by parse_setup_data so only
    sizeof(setup_data) is reserved.)
    
    When kexec is used repeatedly, each boot adds two entries in the
    kexec-provided e820 map as the 16-byte range splits a larger
    range of usable memory. Eventually all of the 128 available entries
    get used up. The next split will result in losing usable memory
    as the new entries cannot be added to the e820 map.
    
    Fixes: 68b8e9713c8e ("x86/setup: Use rng seeds from setup_data")
    Signed-off-by: Jiri Bohac <jbohac@suse.cz>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: <stable@kernel.org>
    Link: https://lore.kernel.org/r/ZbmOjKnARGiaYBd5@dwarf.suse.cz
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
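
Illustration for the e820 entry above, not the kernel's e820 code: a user-space sketch of how reserving a small range inside a usable range costs two extra table entries, and how a 128-entry table fills up after repeated splits. struct range, struct map and reserve() are hypothetical; the addresses used with them are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 128		/* table capacity quoted in the commit */

enum { USABLE = 1, RESERVED = 2 };

struct range { uint64_t start, len; int type; };
struct map { struct range e[MAX_ENTRIES]; size_t n; };

/* Carve [start, start + len) out of the usable range that strictly contains
 * it, turning one entry into usable / reserved / usable. Returns 0 on
 * success, -1 if no such range exists or the table has no room left. */
static int reserve(struct map *m, uint64_t start, uint64_t len)
{
	for (size_t i = 0; i < m->n; i++) {
		struct range *r = &m->e[i];
		uint64_t end = r->start + r->len;

		if (r->type != USABLE || start <= r->start ||
		    start + len >= end)
			continue;
		if (m->n + 2 > MAX_ENTRIES)
			return -1;	/* the split cannot be recorded */

		/* Shift the tail up by two slots to make room. */
		for (size_t j = m->n; j > i + 1; j--)
			m->e[j + 1] = m->e[j - 1];

		m->e[i + 1] = (struct range){ start, len, RESERVED };
		m->e[i + 2] = (struct range){ start + len,
					      end - (start + len), USABLE };
		r->len = start - r->start;
		m->n += 2;
		return 0;
	}
	return -1;
}
```

Starting from a single usable range, 63 such 16-byte reservations succeed and bring the table to 127 entries; the next one fails because two more entries no longer fit.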

x86/entry_32: Add VERW just before userspace transition [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Sun Mar 3 20:24:13 2024 -0800

    x86/entry_32: Add VERW just before userspace transition
    
    commit a0e2dab44d22b913b4c228c8b52b2a104434b0b3 upstream.
    
    As done for entry_64, add support for executing VERW late in exit to
    user path for 32-bit mode.
    
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Link: https://lore.kernel.org/all/20240213-delay-verw-v8-3-a6216d83edb7%40linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/entry_64: Add VERW just before userspace transition [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Sun Mar 3 20:24:07 2024 -0800

    x86/entry_64: Add VERW just before userspace transition
    
    commit 3c7501722e6b31a6e56edd23cea5e77dbb9ffd1a upstream.
    
    The mitigation for MDS is to use the VERW instruction to clear any
    secrets in CPU buffers. Data from any memory accesses after VERW
    execution can still remain in CPU buffers. It is safer to execute VERW
    late in the return-to-user path to minimize the window in which kernel
    data can end up in CPU buffers. There are not many kernel secrets to be
    had after SWITCH_TO_USER_CR3.
    
    Add support for deploying the VERW mitigation after user register state
    is restored. This helps minimize the chances of kernel data ending up in
    CPU buffers after executing VERW.
    
    Note that the mitigation at the new location is not yet enabled.
    
      Corner case not handled
      =======================
      Interrupts returning to kernel don't clear CPU buffers since the
      exit-to-user path is expected to do that anyway. But, there could be
      a case when an NMI is generated in kernel after the exit-to-user path
      has cleared the buffers. This case is not handled and NMIs returning to
      kernel don't clear CPU buffers because:
    
      1. It is rare to get an NMI after VERW, but before returning to userspace.
      2. For an unprivileged user, there is no known way to make that NMI
         less rare or target it.
      3. It would take a large number of these precisely-timed NMIs to mount
         an actual attack.  There's presumably not enough bandwidth.
      4. The NMI in question occurs after a VERW, i.e. when user state is
         restored and most interesting data is already scrubbed. What's left
         is only the data that the NMI touches, and that may or may not be of
         any interest.
    
      [ pawan: resolved conflict for hunk swapgs_restore_regs_and_return_to_usermode in backport ]
    
    Suggested-by: Dave Hansen <dave.hansen@intel.com>
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Link: https://lore.kernel.org/all/20240213-delay-verw-v8-2-a6216d83edb7%40linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

List of changes in Linux 6.7.9