Changelog for kernel 6.6.15

afs: Hide silly-rename files from userspace [+ + +]

Author: David Howells <dhowells@redhat.com>
Date:   Mon Jan 8 17:22:36 2024 +0000

    afs: Hide silly-rename files from userspace
    
    [ Upstream commit 57e9d49c54528c49b8bffe6d99d782ea051ea534 ]
    
    There appears to be a race between silly-rename files being created/removed
    and various userspace tools iterating over the contents of a directory,
    leading to such errors as:
    
            find: './kernel/.tmp_cpio_dir/include/dt-bindings/reset/.__afs2080': No such file or directory
            tar: ./include/linux/greybus/.__afs3C95: File removed before we read it
    
    when building a kernel.
    
    Fix afs_readdir() so that it doesn't return .__afsXXXX silly-rename files
    to userspace.  This doesn't stop them being looked up directly by name as
    we need to be able to look them up from within the kernel as part of the
    silly-rename algorithm.
    
    Fixes: 79ddbfa500b3 ("afs: Implement sillyrename for unlink and rename")
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: linux-afs@lists.infradead.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>
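
The behaviour described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: the function names and the prefix check are assumptions made for illustration, based on the `.__afsXXXX` names quoted in the commit message.

```python
# Assumption: silly-rename entries are recognised by the ".__afs" name prefix,
# as suggested by the examples in the commit message (.__afs2080, .__afs3C95).
SILLY_PREFIX = ".__afs"


def readdir_visible(entries):
    """Model of a directory listing that hides silly-rename entries."""
    return [name for name in entries if not name.startswith(SILLY_PREFIX)]


def lookup(entries, name):
    """Model of a direct lookup by name; the listing filter does not apply."""
    return name in entries


if __name__ == "__main__":
    directory = ["Makefile", ".__afs2080", "reset.h", ".__afs3C95"]
    print(readdir_visible(directory))
    print(lookup(directory, ".__afs2080"))
```

The point of the split is that iteration and lookup are filtered differently: tools walking the directory never see the temporary names, while code that already knows a name can still resolve it.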

arm64/sme: Always exit sme_alloc() early with existing storage [+ + +]

Author: Mark Brown <broonie@kernel.org>
Date:   Mon Jan 15 20:15:46 2024 +0000

    arm64/sme: Always exit sme_alloc() early with existing storage
    
    commit dc7eb8755797ed41a0d1b5c0c39df3c8f401b3d9 upstream.
    
    When sme_alloc() is called with existing storage and we are not flushing we
    will always allocate new storage, both leaking the existing storage and
    corrupting the state. Fix this by separating the checks for flushing and
    for existing storage as we do for SVE.
    
    Callers that reallocate (e.g., due to changing the vector length) should
    call sme_free() themselves.
    
    Fixes: 5d0a8d2fba50 ("arm64/ptrace: Ensure that SME is set up for target when writing SSVE state")
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20240115-arm64-sme-flush-v1-1-7472bd3459b7@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
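
The separated checks described above can be modelled roughly as follows. This is a simplified sketch under stated assumptions, not the kernel implementation: the `Task` class, the `size` parameter and the use of a `bytearray` as storage are hypothetical stand-ins.

```python
class Task:
    """Hypothetical stand-in for a task that may own SME storage."""

    def __init__(self):
        self.sme_state = None


def sme_alloc(task, flush, size=4):
    """Model of the fixed logic: existing storage always causes an early exit."""
    if task.sme_state is not None:
        # Existing storage is reused; it is zeroed only when flushing.
        if flush:
            task.sme_state[:] = bytes(len(task.sme_state))
        return
    # No storage yet: allocate zero-initialised storage.
    task.sme_state = bytearray(size)


if __name__ == "__main__":
    t = Task()
    sme_alloc(t, flush=False)
    t.sme_state[0] = 7
    sme_alloc(t, flush=False)   # must neither reallocate nor clear
    print(t.sme_state[0])
```

With a single combined check (existing storage and flush), the non-flushing call would fall through to the allocation and replace the buffer, which is the leak and state corruption the commit describes.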

arm64: dts: qcom: Add missing vio-supply for AW2013 [+ + +]

Author: Stephan Gerhold <stephan@gerhold.net>
Date:   Mon Dec 4 10:46:11 2023 +0100

    arm64: dts: qcom: Add missing vio-supply for AW2013
    
    commit cc1ec484f2d0f464ad11b56fe3de2589c23f73ec upstream.
    
    Add the missing vio-supply to all usages of the AW2013 LED controller
    to ensure that the regulator needed for pull-up of the interrupt and
    I2C lines is really turned on. While this seems to have worked fine so
    far some of these regulators are not guaranteed to be always-on. For
    example, pm8916_l6 is typically turned off together with the display
    if there aren't any other devices (e.g. sensors) keeping it always-on.
    
    Cc: stable@vger.kernel.org # 6.6
    Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
    Link: https://lore.kernel.org/r/20231204-qcom-aw2013-vio-v1-1-5d264bb5c0b2@gerhold.net
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: msm8916: Make blsp_dma controlled-remotely [+ + +]

Author: Stephan Gerhold <stephan@gerhold.net>
Date:   Mon Dec 4 11:21:20 2023 +0100

    arm64: dts: qcom: msm8916: Make blsp_dma controlled-remotely
    
    commit 7c45b6ddbcff01f9934d11802010cfeb0879e693 upstream.
    
    The blsp_dma controller is shared between the different subsystems,
    which is why it is already initialized by the firmware. We should not
    reinitialize it from Linux, so that other potential users of the DMA
    engine do not misbehave.
    
    In mainline this can be described using the "qcom,controlled-remotely"
    property. In the downstream/vendor kernel from Qualcomm there is an
    opposite "qcom,managed-locally" property. This property is *not* set
    for the qcom,sps-dma@7884000 [1] so adding "qcom,controlled-remotely"
    upstream matches the behavior of the downstream/vendor kernel.
    
    Adding this seems to fix some weird issues with UART where both
    input and output become garbled with certain obscure firmware versions
    on some devices.
    
    [1]: https://git.codelinaro.org/clo/la/kernel/msm-3.10/-/blob/LA.BR.1.2.9.1-02310-8x16.0/arch/arm/boot/dts/qcom/msm8916.dtsi#L1466-1472
    
    Cc: stable@vger.kernel.org # 6.5
    Fixes: a0e5fb103150 ("arm64: dts: qcom: Add msm8916 BLSP device nodes")
    Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
    Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
    Link: https://lore.kernel.org/r/20231204-msm8916-blsp-dma-remote-v1-1-3e49c8838c8d@gerhold.net
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: msm8939: Make blsp_dma controlled-remotely [+ + +]

Author: Stephan Gerhold <stephan@gerhold.net>
Date:   Mon Dec 4 11:21:21 2023 +0100

    arm64: dts: qcom: msm8939: Make blsp_dma controlled-remotely
    
    commit 4bbda9421f316efdaef5dbf642e24925ef7de130 upstream.
    
    The blsp_dma controller is shared between the different subsystems,
    which is why it is already initialized by the firmware. We should not
    reinitialize it from Linux, so that other potential users of the DMA
    engine do not misbehave.
    
    In mainline this can be described using the "qcom,controlled-remotely"
    property. In the downstream/vendor kernel from Qualcomm there is an
    opposite "qcom,managed-locally" property. This property is *not* set
    for the qcom,sps-dma@7884000 [1] so adding "qcom,controlled-remotely"
    upstream matches the behavior of the downstream/vendor kernel.
    
    Adding this seems to fix some weird issues with UART where both
    input and output become garbled with certain obscure firmware versions
    on some devices.
    
    [1]: https://git.codelinaro.org/clo/la/kernel/msm-3.10/-/blob/LA.BR.1.2.9.1-02310-8x16.0/arch/arm/boot/dts/qcom/msm8939-common.dtsi#L866-872
    
    Cc: stable@vger.kernel.org # 6.5
    Fixes: 61550c6c156c ("arm64: dts: qcom: Add msm8939 SoC")
    Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
    Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
    Link: https://lore.kernel.org/r/20231204-msm8916-blsp-dma-remote-v1-2-3e49c8838c8d@gerhold.net
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sc7180: fix USB wakeup interrupt types [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Mon Nov 20 17:43:23 2023 +0100

    arm64: dts: qcom: sc7180: fix USB wakeup interrupt types
    
    commit 9b956999bf725fd62613f719c3178fdbee6e5f47 upstream.
    
    The DP/DM wakeup interrupts are edge triggered and which edge to trigger
    on depends on use-case and whether a Low speed or Full/High speed device
    is connected.
    
    Fixes: 0b766e7fe5a2 ("arm64: dts: qcom: sc7180: Add USB related nodes")
    Cc: stable@vger.kernel.org      # 5.10
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20231120164331.8116-4-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sc7280: fix usb_1 wakeup interrupt types [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Mon Nov 20 17:43:24 2023 +0100

    arm64: dts: qcom: sc7280: fix usb_1 wakeup interrupt types
    
    commit c34199d967a946e55381404fa949382691737521 upstream.
    
    A recent cleanup reordering the usb_1 wakeup interrupts inadvertently
    switched the DP and SuperSpeed interrupt trigger types.
    
    Fixes: 4a7ffc10d195 ("arm64: dts: qcom: align DWC3 USB interrupts with DT schema")
    Cc: stable@vger.kernel.org      # 5.19
    Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Link: https://lore.kernel.org/r/20231120164331.8116-5-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sc8180x: fix USB DP/DM HS PHY interrupts [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 13 18:33:59 2023 +0100

    arm64: dts: qcom: sc8180x: fix USB DP/DM HS PHY interrupts
    
    commit 687d402bb350b392fa330e9d9d1b917777ee9ed1 upstream.
    
    The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states and to be able to detect disconnect events, which requires
    triggering on falling edges.
    
    A recent commit updated the trigger type but failed to change the
    interrupt provider as required. This leads to the current Linux driver
    failing to probe instead of printing an error during suspend and USB
    wakeup not working as intended.
    
    Fixes: 0dc0f6da3d43 ("arm64: dts: qcom: sc8180x: fix USB wakeup interrupt types")
    Fixes: b080f53a8f44 ("arm64: dts: qcom: sc8180x: Add remoteprocs, wifi and usb nodes")
    Cc: stable@vger.kernel.org      # 6.5
    Cc: Vinod Koul <vkoul@kernel.org>
    Reported-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Tested-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20231213173403.29544-2-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sc8180x: fix USB SS wakeup [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Thu Dec 14 08:43:19 2023 +0100

    arm64: dts: qcom: sc8180x: fix USB SS wakeup
    
    commit 0afa885d42d05d30161ab8eab1ebacd993edb82b upstream.
    
    The USB SS PHY interrupt needs to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states.
    
    Fixes: b080f53a8f44 ("arm64: dts: qcom: sc8180x: Add remoteprocs, wifi and usb nodes")
    Cc: stable@vger.kernel.org      # 6.5
    Cc: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20231214074319.11023-4-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sc8180x: fix USB wakeup interrupt types [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Mon Nov 20 17:43:26 2023 +0100

    arm64: dts: qcom: sc8180x: fix USB wakeup interrupt types
    
    commit 0dc0f6da3d43da8d2297105663e51ecb01b6f790 upstream.
    
    The DP/DM wakeup interrupts are edge triggered and which edge to trigger
    on depends on use-case and whether a Low speed or Full/High speed device
    is connected.
    
    Fixes: b080f53a8f44 ("arm64: dts: qcom: sc8180x: Add remoteprocs, wifi and usb nodes")
    Cc: stable@vger.kernel.org      # 6.5
    Cc: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20231120164331.8116-7-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sc8280xp-crd: fix eDP phy compatible [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Mon Oct 16 10:06:58 2023 +0200

    arm64: dts: qcom: sc8280xp-crd: fix eDP phy compatible
    
    commit 663affdb12b3e26c77d103327cf27de720c8117e upstream.
    
    The sc8280xp Display Port PHYs can be used in either DP or eDP mode and
    this is configured using the devicetree compatible string which defaults
    to DP mode in the SoC dtsi.
    
    Override the default compatible string for the CRD eDP PHY node so that
    the eDP settings are used.
    
    Fixes: 4a883a8d80b5 ("arm64: dts: qcom: sc8280xp-crd: Enable EDP")
    Cc: stable@vger.kernel.org      # 6.3
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20231016080658.6667-1-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sdm670: fix USB DP/DM HS PHY interrupts [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Thu Dec 14 08:43:17 2023 +0100

    arm64: dts: qcom: sdm670: fix USB DP/DM HS PHY interrupts
    
    commit c42d12ea105f67b0f137f1e52d5c59d13fe12b1f upstream.
    
    The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states and to be able to detect disconnect events, which requires
    triggering on falling edges.
    
    A recent commit updated the trigger type but failed to change the
    interrupt provider as required. This leads to the current Linux driver
    failing to probe instead of printing an error during suspend and USB
    wakeup not working as intended.
    
    Fixes: de3b3de30999 ("arm64: dts: qcom: sdm670: fix USB wakeup interrupt types")
    Fixes: 07c8ded6e373 ("arm64: dts: qcom: add sdm670 and pixel 3a device trees")
    Cc: stable@vger.kernel.org      # 6.2
    Cc: Richard Acayan <mailingradian@gmail.com>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Tested-by: Richard Acayan <mailingradian@gmail.com>
    Link: https://lore.kernel.org/r/20231214074319.11023-2-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sdm670: fix USB SS wakeup [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Thu Dec 14 08:43:18 2023 +0100

    arm64: dts: qcom: sdm670: fix USB SS wakeup
    
    commit 047b2edc35b8db22354b4fba37818b548fc18896 upstream.
    
    The USB SS PHY interrupt needs to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states.
    
    Fixes: 07c8ded6e373 ("arm64: dts: qcom: add sdm670 and pixel 3a device trees")
    Cc: stable@vger.kernel.org      # 6.2
    Cc: Richard Acayan <mailingradian@gmail.com>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Tested-by: Richard Acayan <mailingradian@gmail.com>
    Link: https://lore.kernel.org/r/20231214074319.11023-3-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sdm670: fix USB wakeup interrupt types [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Mon Nov 20 17:43:27 2023 +0100

    arm64: dts: qcom: sdm670: fix USB wakeup interrupt types
    
    commit de3b3de30999106549da4df88a7963d0ac02b91e upstream.
    
    The DP/DM wakeup interrupts are edge triggered and which edge to trigger
    on depends on use-case and whether a Low speed or Full/High speed device
    is connected.
    
    Fixes: 07c8ded6e373 ("arm64: dts: qcom: add sdm670 and pixel 3a device trees")
    Cc: stable@vger.kernel.org      # 6.2
    Cc: Richard Acayan <mailingradian@gmail.com>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Acked-by: Richard Acayan <mailingradian@gmail.com>
    Link: https://lore.kernel.org/r/20231120164331.8116-8-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sdm845: fix USB DP/DM HS PHY interrupts [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 13 18:34:00 2023 +0100

    arm64: dts: qcom: sdm845: fix USB DP/DM HS PHY interrupts
    
    commit 204f9ed4bad6293933179517624143b8f412347c upstream.
    
    The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states and to be able to detect disconnect events, which requires
    triggering on falling edges.
    
    A recent commit updated the trigger type but failed to change the
    interrupt provider as required. This leads to the current Linux driver
    failing to probe instead of printing an error during suspend and USB
    wakeup not working as intended.
    
    Fixes: 84ad9ac8d9ca ("arm64: dts: qcom: sdm845: fix USB wakeup interrupt types")
    Fixes: ca4db2b538a1 ("arm64: dts: qcom: sdm845: Add USB-related nodes")
    Cc: stable@vger.kernel.org      # 4.20
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20231213173403.29544-3-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sdm845: fix USB SS wakeup [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 13 18:34:01 2023 +0100

    arm64: dts: qcom: sdm845: fix USB SS wakeup
    
    commit 971f5d8b0618d09db75184ddd8cca0767514db5d upstream.
    
    The USB SS PHY interrupts need to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states.
    
    Fixes: ca4db2b538a1 ("arm64: dts: qcom: sdm845: Add USB-related nodes")
    Cc: stable@vger.kernel.org      # 4.20
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20231213173403.29544-4-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sdm845: fix USB wakeup interrupt types [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Mon Nov 20 17:43:28 2023 +0100

    arm64: dts: qcom: sdm845: fix USB wakeup interrupt types
    
    commit 84ad9ac8d9ca29033d589e79a991866b38e23b85 upstream.
    
    The DP/DM wakeup interrupts are edge triggered and which edge to trigger
    on depends on use-case and whether a Low speed or Full/High speed device
    is connected.
    
    Fixes: ca4db2b538a1 ("arm64: dts: qcom: sdm845: Add USB-related nodes")
    Cc: stable@vger.kernel.org      # 4.20
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20231120164331.8116-9-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sm8150: fix USB DP/DM HS PHY interrupts [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 13 18:34:02 2023 +0100

    arm64: dts: qcom: sm8150: fix USB DP/DM HS PHY interrupts
    
    commit 134de5e831775e8b178db9b131c1d3769a766982 upstream.
    
    The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states and to be able to detect disconnect events, which requires
    triggering on falling edges.
    
    A recent commit updated the trigger type but failed to change the
    interrupt provider as required. This leads to the current Linux driver
    failing to probe instead of printing an error during suspend and USB
    wakeup not working as intended.
    
    Fixes: 54524b6987d1 ("arm64: dts: qcom: sm8150: fix USB wakeup interrupt types")
    Fixes: 0c9dde0d2015 ("arm64: dts: qcom: sm8150: Add secondary USB and PHY nodes")
    Fixes: b33d2868e8d3 ("arm64: dts: qcom: sm8150: Add USB and PHY device nodes")
    Cc: stable@vger.kernel.org      # 5.10
    Cc: Jack Pham <quic_jackp@quicinc.com>
    Cc: Jonathan Marek <jonathan@marek.ca>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20231213173403.29544-5-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sm8150: fix USB SS wakeup [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 13 18:34:03 2023 +0100

    arm64: dts: qcom: sm8150: fix USB SS wakeup
    
    commit cc4e1da491b84ca05339a19893884cda78f74aef upstream.
    
    The USB SS PHY interrupts need to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states.
    
    Fixes: 0c9dde0d2015 ("arm64: dts: qcom: sm8150: Add secondary USB and PHY nodes")
    Fixes: b33d2868e8d3 ("arm64: dts: qcom: sm8150: Add USB and PHY device nodes")
    Cc: stable@vger.kernel.org      # 5.10
    Cc: Jack Pham <quic_jackp@quicinc.com>
    Cc: Jonathan Marek <jonathan@marek.ca>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20231213173403.29544-6-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: qcom: sm8150: fix USB wakeup interrupt types [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Mon Nov 20 17:43:30 2023 +0100

    arm64: dts: qcom: sm8150: fix USB wakeup interrupt types
    
    commit 54524b6987d1fffe64cbf3dded1b2fa6b903edf9 upstream.
    
    The DP/DM wakeup interrupts are edge triggered and which edge to trigger
    on depends on use-case and whether a Low speed or Full/High speed device
    is connected.
    
    Fixes: 0c9dde0d2015 ("arm64: dts: qcom: sm8150: Add secondary USB and PHY nodes")
    Fixes: b33d2868e8d3 ("arm64: dts: qcom: sm8150: Add USB and PHY device nodes")
    Cc: stable@vger.kernel.org      # 5.10
    Cc: Jonathan Marek <jonathan@marek.ca>
    Cc: Jack Pham <quic_jackp@quicinc.com>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Jack Pham <quic_jackp@quicinc.com>
    Link: https://lore.kernel.org/r/20231120164331.8116-11-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: rockchip: configure eth pad driver strength for orangepi r1 plus lts [+ + +]

Author: Tianling Shen <cnsztl@gmail.com>
Date:   Sat Dec 16 12:07:23 2023 +0800

    arm64: dts: rockchip: configure eth pad driver strength for orangepi r1 plus lts
    
    commit fc5a80a432607d05e85bba37971712405f75c546 upstream.
    
    The default strength is not enough to provide a stable connection
    under 3.3 V LDO voltage.
    
    Fixes: 387b3bbac5ea ("arm64: dts: rockchip: Add Xunlong OrangePi R1 Plus LTS")
    Cc: stable@vger.kernel.org # 6.6+
    Signed-off-by: Tianling Shen <cnsztl@gmail.com>
    Link: https://lore.kernel.org/r/20231216040723.17864-1-cnsztl@gmail.com
    Signed-off-by: Heiko Stuebner <heiko@sntech.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: rockchip: Fix rk3588 USB power-domain clocks [+ + +]

Author: Sam Edwards <cfsworks@gmail.com>
Date:   Fri Dec 15 19:10:19 2023 -0700

    arm64: dts: rockchip: Fix rk3588 USB power-domain clocks
    
    commit 44de8996ed5a10f08f2fe947182da6535edcfae5 upstream.
    
    The QoS blocks saved/restored when toggling the PD_USB power domain are
    clocked by ACLK_USB. Attempting to access these memory regions without
    that clock running will result in an indefinite CPU stall.
    
    The PD_USB node wasn't specifying this clock dependency, resulting in
    hangs when trying to toggle the power domain (either on or off), unless
    we get "lucky" and have ACLK_USB running for another reason at the time.
    This "luck" can result from the bootloader leaving USB powered/clocked,
    and if no built-in driver wants USB, Linux will disable the unused
    PD+CLK on boot when {pd,clk}_ignore_unused aren't given. This can also
    be unlucky because the two cleanup tasks run in parallel and race: if
    the CLK is disabled first, the PD deactivation stalls the boot. In any
    case, the PD cannot then be reenabled (if e.g. the driver loads later)
    once the clock has been stopped.
    
    Fix this by specifying a dependency on ACLK_USB, instead of only
    ACLK_USB_ROOT. The child-parent relationship means the former implies
    the latter anyway.
    
    Fixes: c9211fa2602b8 ("arm64: dts: rockchip: Add base DT for rk3588 SoC")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sam Edwards <CFSworks@gmail.com>
    Link: https://lore.kernel.org/r/20231216021019.1543811-1-CFSworks@gmail.com
    [changed to only include the missing clock, not dropping the root-clocks]
    Signed-off-by: Heiko Stuebner <heiko@sntech.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: sprd: fix the cpu node for UMS512 [+ + +]

Author: Cixi Geng <cixi.geng1@unisoc.com>
Date:   Wed Jul 12 00:23:46 2023 +0800

    arm64: dts: sprd: fix the cpu node for UMS512
    
    commit 2da4f4a7b003441b80f0f12d8a216590f652a40f upstream.
    
    The UMS512 SoCs have 8 cores, consisting of 6 A55 and 2 A75.
    Modify the cpu nodes to reflect the correct information.
    
    Fixes: 2b4881839a39 ("arm64: dts: sprd: Add support for Unisoc's UMS512")
    Cc: stable@vger.kernel.org
    Signed-off-by: Cixi Geng <cixi.geng1@unisoc.com>
    Link: https://lore.kernel.org/r/20230711162346.5978-1-cixi.geng@linux.dev
    Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD [+ + +]

Author: Mark Rutland <mark.rutland@arm.com>
Date:   Tue Jan 16 11:02:20 2024 +0000

    arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
    
    commit 832dd634bd1b4e3bbe9f10b9c9ba5db6f6f2b97f upstream.
    
    Currently the ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround isn't
    quite right, as it is supposed to be applied after the last explicit
    memory access, but is immediately followed by an LDR.
    
    The ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround is used to
    handle Cortex-A520 erratum 2966298 and Cortex-A510 erratum 3117295,
    which are described in:
    
    * https://developer.arm.com/documentation/SDEN2444153/0600/?lang=en
    * https://developer.arm.com/documentation/SDEN1873361/1600/?lang=en
    
    In both cases the workaround is described as:
    
    | If pagetable isolation is disabled, the context switch logic in the
    | kernel can be updated to execute the following sequence on affected
    | cores before exiting to EL0, and after all explicit memory accesses:
    |
    | 1. A non-shareable TLBI to any context and/or address, including
    |    unused contexts or addresses, such as a `TLBI VALE1 Xzr`.
    |
    | 2. A DSB NSH to guarantee completion of the TLBI.
    
    The important part being that the TLBI+DSB must be placed "after all
    explicit memory accesses".
    
    Unfortunately, as-implemented, the TLBI+DSB is immediately followed by
    an LDR, as we have:
    
    | alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
    |       tlbi    vale1, xzr
    |       dsb     nsh
    | alternative_else_nop_endif
    | alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
    |       ldr     lr, [sp, #S_LR]
    |       add     sp, sp, #PT_REGS_SIZE           // restore sp
    |       eret
    | alternative_else_nop_endif
    |
    | [ ... KPTI exception return path ... ]
    
    This patch fixes this by reworking the logic to place the TLBI+DSB
    immediately before the ERET, after all explicit memory accesses.
    
    The ERET is currently in a separate alternative block, and alternatives
    cannot be nested. To account for this, the alternative block for
    ARM64_UNMAP_KERNEL_AT_EL0 is replaced with a single alternative branch
    to skip the KPTI logic, with the new shape of the logic being:
    
    | alternative_insn "b .L_skip_tramp_exit_\@", nop, ARM64_UNMAP_KERNEL_AT_EL0
    |       [ ... KPTI exception return path ... ]
    | .L_skip_tramp_exit_\@:
    |
    |       ldr     lr, [sp, #S_LR]
    |       add     sp, sp, #PT_REGS_SIZE           // restore sp
    |
    | alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
    |       tlbi    vale1, xzr
    |       dsb     nsh
    | alternative_else_nop_endif
    |       eret
    
    The new structure means that the workaround is only applied when KPTI is
    not in use; this is fine as noted in the documented implications of the
    erratum:
    
    | Pagetable isolation between EL0 and higher level ELs prevents the
    | issue from occurring.
    
    ... and as per the workaround description quoted above, the workaround
    is only necessary "If pagetable isolation is disabled".
    
    Fixes: 471470bc7052 ("arm64: errata: Add Cortex-A520 speculative unprivileged load workaround")
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: James Morse <james.morse@arm.com>
    Cc: Rob Herring <robh@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240116110221.420467-2-mark.rutland@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
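
The ordering requirement quoted above ("after all explicit memory accesses") can be expressed as a small check over an instruction sequence. This is an illustrative sketch only: the helper name and the set of mnemonics treated as memory accesses are assumptions, and the sequences are abbreviated from the listings in the commit message.

```python
# Assumption: a small, non-exhaustive set of mnemonics stands in for
# "explicit memory accesses" in this model.
MEMORY_OPS = {"ldr", "str", "ldp", "stp"}


def workaround_correctly_placed(sequence):
    """Return True if no memory access sits between the TLBI and the ERET."""
    mnemonics = [line.split()[0] for line in sequence]
    tlbi = mnemonics.index("tlbi")
    eret = mnemonics.index("eret")
    between = mnemonics[tlbi:eret]
    return tlbi < eret and not any(m in MEMORY_OPS for m in between)


# Non-KPTI return path before the fix: the LDR follows the TLBI+DSB.
OLD_PATH = [
    "tlbi vale1, xzr",
    "dsb nsh",
    "ldr lr, [sp, #S_LR]",
    "add sp, sp, #PT_REGS_SIZE",
    "eret",
]

# Non-KPTI return path after the fix: TLBI+DSB sit immediately before ERET.
NEW_PATH = [
    "ldr lr, [sp, #S_LR]",
    "add sp, sp, #PT_REGS_SIZE",
    "tlbi vale1, xzr",
    "dsb nsh",
    "eret",
]

if __name__ == "__main__":
    print(workaround_correctly_placed(OLD_PATH))
    print(workaround_correctly_placed(NEW_PATH))
```

The check models only the non-KPTI path, which matches the commit's observation that the workaround is needed only when pagetable isolation is disabled.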

arm64: errata: Add Cortex-A510 speculative unprivileged load workaround [+ + +]

Author: Rob Herring <robh@kernel.org>
Date:   Wed Jan 10 11:29:21 2024 -0600

    arm64: errata: Add Cortex-A510 speculative unprivileged load workaround
    
    commit f827bcdafa2a2ac21c91e47f587e8d0c76195409 upstream.
    
    Implement the workaround for ARM Cortex-A510 erratum 3117295. On an
    affected Cortex-A510 core, a speculatively executed unprivileged load
    might leak data from a privileged load via a cache side channel. The
    issue only exists for loads within a translation regime with the same
    translation (e.g. same ASID and VMID). Therefore, the issue only affects
    the return to EL0.
    
    The erratum and workaround are the same as ARM Cortex-A520 erratum
    2966298, so reuse the existing workaround.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Herring <robh@kernel.org>
    Reviewed-by: Mark Rutland <mark.rutland@arm.com>
    Link: https://lore.kernel.org/r/20240110-arm-errata-a510-v1-2-d02bc51aeeee@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: properly install vmlinuz.efi [+ + +]

Author: Josef Bacik <josef@toxicpanda.com>
Date:   Thu Dec 14 11:18:50 2023 -0500

    arm64: properly install vmlinuz.efi
    
    commit 7b21ed7d119dc06b0ed2ba3e406a02cafe3a8d03 upstream.
    
    If you select CONFIG_EFI_ZBOOT, we will generate vmlinuz.efi, and then
    when we go to install the kernel we'll install the vmlinux instead
    because install.sh only recognizes Image.gz as wanting the compressed
    install image.  With CONFIG_EFI_ZBOOT we don't get the proper kernel
    installed, which means it doesn't boot, which makes for a very confused
    and subsequently angry kernel developer.
    
    Fix this by properly installing our compressed kernel if we've enabled
    CONFIG_EFI_ZBOOT.
    
    Signed-off-by: Josef Bacik <josef@toxicpanda.com>
    Cc: <stable@vger.kernel.org> # 6.1.x
    Fixes: c37b830fef13 ("arm64: efi: enable generic EFI compressed boot")
    Reviewed-by: Simon Glass <sjg@chromium.org>
    Link: https://lore.kernel.org/r/6edb1402769c2c14c4fbef8f7eaedb3167558789.1702570674.git.josef@toxicpanda.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: Rename ARM64_WORKAROUND_2966298 [+ + +]

Author: Rob Herring <robh@kernel.org>
Date:   Wed Jan 10 11:29:20 2024 -0600

    arm64: Rename ARM64_WORKAROUND_2966298
    
    commit 546b7cde9b1dd36089649101b75266564600ffe5 upstream.
    
    In preparation to apply ARM64_WORKAROUND_2966298 for multiple errata,
    rename the kconfig and capability. No functional change.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Herring <robh@kernel.org>
    Reviewed-by: Mark Rutland <mark.rutland@arm.com>
    Link: https://lore.kernel.org/r/20240110-arm-errata-a510-v1-1-d02bc51aeeee@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: exynos4212-tab3: add samsung,invert-vclk flag to fimd [+ + +]

Author: Artur Weber <aweber.kernel@gmail.com>
Date:   Fri Jan 5 07:53:01 2024 +0100

    ARM: dts: exynos4212-tab3: add samsung,invert-vclk flag to fimd
    
    [ Upstream commit eab4f56d3e75dad697acf8dc2c8be3c341d6c63e ]
    
    After more investigation, I've found that it's not the panel driver
    config that needs to be modified to invert the data polarity, but
    the FIMD config.
    
    Add the missing invert-vclk option that is required to get the display
    to work correctly.
    
    Fixes: ee37a457af1d ("ARM: dts: exynos: Add Samsung Galaxy Tab 3 8.0 boards")
    Signed-off-by: Artur Weber <aweber.kernel@gmail.com>
    Link: https://lore.kernel.org/r/20240105-tab3-display-fixes-v2-1-904d1207bf6f@gmail.com
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ARM: dts: imx6q-apalis: add can power-up delay on ixora board [+ + +]

Author: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
Date:   Fri Oct 20 17:30:22 2023 +0200

    ARM: dts: imx6q-apalis: add can power-up delay on ixora board
    
    commit b76bbf835d8945080b22b52fc1e6f41cde06865d upstream.
    
    Newer variants of Ixora boards require a power-up delay of up to 1ms
    when powering up the CAN transceiver.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
    Signed-off-by: Shawn Guo <shawnguo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: qcom: sdx55: fix pdc '#interrupt-cells' [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 13 18:31:29 2023 +0100

    ARM: dts: qcom: sdx55: fix pdc '#interrupt-cells'
    
    commit cc25bd06c16aa582596a058d375b2e3133f79b93 upstream.
    
    The Qualcomm PDC interrupt controller binding expects two cells in
    interrupt specifiers.
    
    Fixes: 9d038b2e62de ("ARM: dts: qcom: Add SDX55 platform and MTP board support")
    Cc: stable@vger.kernel.org      # 5.12
    Cc: Manivannan Sadhasivam <mani@kernel.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20231213173131.29436-2-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: qcom: sdx55: fix USB DP/DM HS PHY interrupts [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 13 18:31:30 2023 +0100

    ARM: dts: qcom: sdx55: fix USB DP/DM HS PHY interrupts
    
    commit de95f139394a5ed82270f005bc441d2e7c1e51b7 upstream.
    
    The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states and to be able to detect disconnect events, which requires
    triggering on falling edges.
    
    A recent commit updated the trigger type but failed to change the
    interrupt provider as required. As a result, the current Linux driver
    now fails to probe, whereas it previously only printed an error during
    suspend and left USB wakeup not working as intended.
    
    Fixes: d0ec3c4c11c3 ("ARM: dts: qcom: sdx55: fix USB wakeup interrupt types")
    Fixes: fea4b41022f3 ("ARM: dts: qcom: sdx55: Add USB3 and PHY support")
    Cc: stable@vger.kernel.org      # 5.12
    Cc: Manivannan Sadhasivam <mani@kernel.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20231213173131.29436-3-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: qcom: sdx55: fix USB SS wakeup [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Dec 13 18:31:31 2023 +0100

    ARM: dts: qcom: sdx55: fix USB SS wakeup
    
    commit 710dd03464e4ab5b3d329768388b165d61958577 upstream.
    
    The USB SS PHY interrupt needs to be provided by the PDC interrupt
    controller in order to be able to wake the system up from low-power
    states.
    
    Fixes: fea4b41022f3 ("ARM: dts: qcom: sdx55: Add USB3 and PHY support")
    Cc: stable@vger.kernel.org      # 5.12
    Cc: Manivannan Sadhasivam <mani@kernel.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20231213173131.29436-4-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: qcom: sdx55: fix USB wakeup interrupt types [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Mon Nov 20 17:43:21 2023 +0100

    ARM: dts: qcom: sdx55: fix USB wakeup interrupt types
    
    commit d0ec3c4c11c3b30e1f2d344973b2a7bf0f986734 upstream.
    
    The DP/DM wakeup interrupts are edge triggered and which edge to trigger
    on depends on use-case and whether a Low speed or Full/High speed device
    is connected.
    
    Fixes: fea4b41022f3 ("ARM: dts: qcom: sdx55: Add USB3 and PHY support")
    Cc: stable@vger.kernel.org      # 5.12
    Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20231120164331.8116-2-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: samsung: exynos4210-i9100: Unconditionally enable LDO12 [+ + +]

Author: Paul Cercueil <paul@crapouillou.net>
Date:   Wed Dec 6 23:15:54 2023 +0100

    ARM: dts: samsung: exynos4210-i9100: Unconditionally enable LDO12
    
    commit 84228d5e29dbc7a6be51e221000e1d122125826c upstream.
    
    The kernel hangs for a good 12 seconds without any info being printed to
    dmesg, very early in the boot process, if this regulator is not enabled.
    
    Force-enable it to work around this issue, until we know more about the
    underlying problem.
    
    Signed-off-by: Paul Cercueil <paul@crapouillou.net>
    Fixes: 8620cc2f99b7 ("ARM: dts: exynos: Add devicetree file for the Galaxy S2")
    Cc: stable@vger.kernel.org # v5.8+
    Link: https://lore.kernel.org/r/20231206221556.15348-2-paul@crapouillou.net
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

async: Introduce async_schedule_dev_nocall() [+ + +]

Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Wed Dec 27 21:38:23 2023 +0100

    async: Introduce async_schedule_dev_nocall()
    
    commit 7d4b5d7a37bdd63a5a3371b988744b060d5bb86f upstream.
    
    In preparation for subsequent changes, introduce a specialized variant
    of async_schedule_dev() that will not invoke the argument function
    synchronously when it cannot be scheduled for asynchronous execution.
    
    The new function, async_schedule_dev_nocall(), will be used for fixing
    possible deadlocks in the system-wide power management core code.
    
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> for the series.
    Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
    Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
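
Illustrative sketch (not part of the patch): the difference between a
scheduling helper that falls back to calling the function synchronously
and a "nocall" variant that only reports failure can be modelled in
plain userspace C. The one-slot queue and the names try_queue(),
schedule_with_fallback() and schedule_nocall() are hypothetical and are
not the kernel's async API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*work_fn)(void *data);

/* Hypothetical one-slot queue: holds at most one pending callback. */
struct slot_queue {
	work_fn fn;
	void *data;
};

/* Try to queue the work item; fails when the slot is already taken. */
static bool try_queue(struct slot_queue *q, work_fn fn, void *data)
{
	assert(q != NULL && fn != NULL);
	if (q->fn)
		return false;
	q->fn = fn;
	q->data = data;
	return true;
}

/* Fallback variant: runs fn in the caller's context if queuing fails. */
static void schedule_with_fallback(struct slot_queue *q, work_fn fn, void *data)
{
	if (!try_queue(q, fn, data))
		fn(data);
}

/* "nocall" variant: never runs fn itself, only reports the outcome. */
static bool schedule_nocall(struct slot_queue *q, work_fn fn, void *data)
{
	return try_queue(q, fn, data);
}

/* Example callback: increments the integer it is given. */
static void bump(void *data)
{
	++*(int *)data;
}
```

With the "nocall" variant the caller decides what to do on failure,
which matters when running the function synchronously could deadlock
on a lock the caller already holds.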

async: Split async_schedule_node_domain() [+ + +]

Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Wed Dec 27 21:37:02 2023 +0100

    async: Split async_schedule_node_domain()
    
    commit 6aa09a5bccd8e224d917afdb4c278fc66aacde4d upstream.
    
    In preparation for subsequent changes, split async_schedule_node_domain()
    in two pieces so as to allow the bottom part of it to be called from a
    somewhat different code path.
    
    No functional impact.
    
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
    Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
    Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

block: Move checking GENHD_FL_NO_PART to bdev_add_partition() [+ + +]

Author: Li Lingfeng <lilingfeng3@huawei.com>
Date:   Thu Jan 18 21:04:01 2024 +0800

    block: Move checking GENHD_FL_NO_PART to bdev_add_partition()
    
    [ Upstream commit 7777f47f2ea64efd1016262e7b59fab34adfb869 ]
    
    Commit 1a721de8489f ("block: don't add or resize partition on the disk
    with GENHD_FL_NO_PART") prevented all partition operations on disks
    with GENHD_FL_NO_PART in blkpg_do_ioctl() since they are meaningless.
    However, it changed the error code in some scenarios. So move the
    GENHD_FL_NO_PART check to bdev_add_partition() to eliminate that impact.
    
    Fixes: 1a721de8489f ("block: don't add or resize partition on the disk with GENHD_FL_NO_PART")
    Reported-by: Allison Karlitskaya <allison.karlitskaya@redhat.com>
    Closes: https://lore.kernel.org/all/CAOYeF9VsmqKMcQjo1k6YkGNujwN-nzfxY17N3F-CMikE1tYp+w@mail.gmail.com/
    Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
    Reviewed-by: Yu Kuai <yukuai3@huawei.com>
    Link: https://lore.kernel.org/r/20240118130401.792757-1-lilingfeng@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bnxt_en: Prevent kernel warning when running offline self test [+ + +]

Author: Michael Chan <michael.chan@broadcom.com>
Date:   Wed Jan 17 15:45:14 2024 -0800

    bnxt_en: Prevent kernel warning when running offline self test
    
    [ Upstream commit c20f482129a582455f02eb9a6dcb2a4215274599 ]
    
    We call bnxt_half_open_nic() to set up the chip partially to run
    loopback tests.  The rings and buffers are initialized normally
    so that we can transmit and receive packets in loopback mode.
    That means page pool buffers are allocated for the aggregation ring
    just like the normal case.  NAPI is not needed because we are just
    polling for the loopback packets.
    
    When we're done with the loopback tests, we call bnxt_half_close_nic()
    to clean up.  When freeing the page pools, we hit a WARN_ON()
    in page_pool_unlink_napi() because the NAPI state linked to the
    page pool is uninitialized.
    
    The simplest way to avoid this warning is just to initialize the
    NAPIs during half open and delete the NAPIs during half close.
    Trying to skip the page pool initialization or skip linking of
    NAPI during half open will be more complicated.
    
    This fix avoids this warning:
    
    WARNING: CPU: 4 PID: 46967 at net/core/page_pool.c:946 page_pool_unlink_napi+0x1f/0x30
    CPU: 4 PID: 46967 Comm: ethtool Tainted: G S      W          6.7.0-rc5+ #22
    Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.3.8 08/31/2021
    RIP: 0010:page_pool_unlink_napi+0x1f/0x30
    Code: 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8b 47 18 48 85 c0 74 1b 48 8b 50 10 83 e2 01 74 08 8b 40 34 83 f8 ff 74 02 <0f> 0b 48 c7 47 18 00 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90
    RSP: 0018:ffa000003d0dfbe8 EFLAGS: 00010246
    RAX: ff110003607ce640 RBX: ff110010baf5d000 RCX: 0000000000000008
    RDX: 0000000000000000 RSI: ff110001e5e522c0 RDI: ff110010baf5d000
    RBP: ff11000145539b40 R08: 0000000000000001 R09: ffffffffc063f641
    R10: ff110001361eddb8 R11: 000000000040000f R12: 0000000000000001
    R13: 000000000000001c R14: ff1100014553a080 R15: 0000000000003fc0
    FS:  00007f9301c4f740(0000) GS:ff1100103fd00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f91344fa8f0 CR3: 00000003527cc005 CR4: 0000000000771ef0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     <TASK>
     ? __warn+0x81/0x140
     ? page_pool_unlink_napi+0x1f/0x30
     ? report_bug+0x102/0x200
     ? handle_bug+0x44/0x70
     ? exc_invalid_op+0x13/0x60
     ? asm_exc_invalid_op+0x16/0x20
     ? bnxt_free_ring.isra.123+0xb1/0xd0 [bnxt_en]
     ? page_pool_unlink_napi+0x1f/0x30
     page_pool_destroy+0x3e/0x150
     bnxt_free_mem+0x441/0x5e0 [bnxt_en]
     bnxt_half_close_nic+0x2a/0x40 [bnxt_en]
     bnxt_self_test+0x21d/0x450 [bnxt_en]
     __dev_ethtool+0xeda/0x2e30
     ? native_queued_spin_lock_slowpath+0x17f/0x2b0
     ? __link_object+0xa1/0x160
     ? _raw_spin_unlock_irqrestore+0x23/0x40
     ? __create_object+0x5f/0x90
     ? __kmem_cache_alloc_node+0x317/0x3c0
     ? dev_ethtool+0x59/0x170
     dev_ethtool+0xa7/0x170
     dev_ioctl+0xc3/0x530
     sock_do_ioctl+0xa8/0xf0
     sock_ioctl+0x270/0x310
     __x64_sys_ioctl+0x8c/0xc0
     do_syscall_64+0x3e/0xf0
     entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
    Fixes: 294e39e0d034 ("bnxt: hook NAPIs to page pools")
    Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
    Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
    Signed-off-by: Michael Chan <michael.chan@broadcom.com>
    Link: https://lore.kernel.org/r/20240117234515.226944-5-michael.chan@broadcom.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bnxt_en: Wait for FLR to complete during probe [+ + +]

Author: Michael Chan <michael.chan@broadcom.com>
Date:   Wed Jan 17 15:45:11 2024 -0800

    bnxt_en: Wait for FLR to complete during probe
    
    [ Upstream commit 3c1069fa42872f95cf3c6fedf80723d391e12d57 ]
    
    The first message to firmware may fail if the device is undergoing FLR.
    The driver has some recovery logic for this failure scenario but we must
    wait 100 msec for FLR to complete before proceeding.  Otherwise the
    recovery will always fail.
    
    Fixes: ba02629ff6cb ("bnxt_en: log firmware status on firmware init failure")
    Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
    Signed-off-by: Michael Chan <michael.chan@broadcom.com>
    Link: https://lore.kernel.org/r/20240117234515.226944-2-michael.chan@broadcom.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf [+ + +]

Author: Daan De Meyer <daan.j.demeyer@gmail.com>
Date:   Wed Oct 11 20:51:05 2023 +0200

    bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf
    
    [ Upstream commit 53e380d21441909b12b6e0782b77187ae4b971c4 ]
    
    As prep for adding unix socket support to the cgroup sockaddr hooks,
    let's add a kfunc bpf_sock_addr_set_sun_path() that allows modifying a unix
    sockaddr from bpf. While this is already possible for AF_INET and AF_INET6,
    we'll need this kfunc when we add unix socket support since modifying the
    address for those requires modifying both the address and the sockaddr
    length.
    
    Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
    Link: https://lore.kernel.org/r/20231011185113.140426-4-daan.j.demeyer@gmail.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Stable-dep-of: c5114710c8ce ("xsk: fix usage of multi-buffer BPF helpers for ZC XDP")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: correct loop detection for iterators convergence [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Oct 24 03:09:15 2023 +0300

    bpf: correct loop detection for iterators convergence
    
    commit 2a0992829ea3864939d917a5c7b48be6629c6217 upstream.
    
    It turns out that .branches > 0 in is_state_visited() is not a
    sufficient condition to identify if two verifier states form a loop
    when iterators convergence is computed. This commit adds logic to
    distinguish situations like below:
    
     (I)            initial       (II)            initial
                      |                             |
                      V                             V
         .---------> hdr                           ..
         |            |                             |
         |            V                             V
         |    .------...                    .------..
         |    |       |                     |       |
         |    V       V                     V       V
         |   ...     ...               .-> hdr     ..
         |    |       |                |    |       |
         |    V       V                |    V       V
         |   succ <- cur               |   succ <- cur
         |    |                        |    |
         |    V                        |    V
         |   ...                       |   ...
         |    |                        |    |
         '----'                        '----'
    
    For both (I) and (II) successor 'succ' of the current state 'cur' was
    previously explored and has branches count at 0. However, loop entry
    'hdr' corresponding to 'succ' might be a part of current DFS path.
    If that is the case 'succ' and 'cur' are members of the same loop
    and have to be compared exactly.
    
    Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
    Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
    Reviewed-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231024000917.12153-6-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
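
Illustrative sketch (not part of the patch): the rule described above,
"if the loop entry of 'succ' is still on the current DFS path, compare
exactly", can be modelled as follows. The struct state_model type and
both helper names are hypothetical simplifications; a non-zero
'branches' count stands in for "still being explored".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a verifier state. */
struct state_model {
	int branches;                   /* > 0: still on the DFS path */
	struct state_model *loop_entry; /* loop header, or NULL */
};

/* Follow loop_entry links up to the outermost known loop header. */
static struct state_model *topmost_loop_entry(struct state_model *st)
{
	struct state_model *top;

	assert(st != NULL);
	top = st->loop_entry;
	while (top && top->loop_entry && top->loop_entry != top)
		top = top->loop_entry;
	return top;
}

/*
 * 'succ' was fully explored (branches == 0), but if its loop header is
 * still being explored, 'succ' and the current state belong to the same
 * loop and must be compared exactly.
 */
static bool needs_exact_compare(struct state_model *succ)
{
	struct state_model *hdr = topmost_loop_entry(succ);

	return hdr && hdr->branches > 0;
}
```

In case (I) of the diagram the header is on the DFS path, so the check
returns true; a state with no loop header never forces exact comparison.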

bpf: exact states comparison for iterator convergence checks [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Oct 24 03:09:13 2023 +0300

    bpf: exact states comparison for iterator convergence checks
    
    commit 2793a8b015f7f1caadb9bce9c63dc659f7522676 upstream.
    
    Convergence for open coded iterators is computed in is_state_visited()
    by examining states with branches count > 1 and using states_equal().
    states_equal() computes sub-state relation using read and precision marks.
    Read and precision marks are propagated from children states,
    thus are not guaranteed to be complete inside a loop when branches
    count > 1. This could be demonstrated using the following unsafe program:
    
         1. r7 = -16
         2. r6 = bpf_get_prandom_u32()
         3. while (bpf_iter_num_next(&fp[-8])) {
         4.   if (r6 != 42) {
         5.     r7 = -32
         6.     r6 = bpf_get_prandom_u32()
         7.     continue
         8.   }
         9.   r0 = r10
        10.   r0 += r7
        11.   r8 = *(u64 *)(r0 + 0)
        12.   r6 = bpf_get_prandom_u32()
        13. }
    
    Here verifier would first visit path 1-3, create a checkpoint at 3
    with r7=-16, continue to 4-7,3 with r7=-32.
    
    Because instructions at 9-12 had not been visited yet, the existing
    checkpoint at 3 does not have a read or precision mark for r7.
    Thus states_equal() would return true and the verifier would discard
    the current state, so the unsafe memory access at 11 would not be caught.
    
    This commit fixes this loophole by introducing exact state comparisons
    for iterator convergence logic:
    - registers are compared using regs_exact() regardless of read or
      precision marks;
    - stack slots have to have identical type.
    
    Unfortunately, this is too strict even for simple programs like below:
    
        i = 0;
        while(iter_next(&it))
          i++;
    
    At each iteration step i++ would produce a new distinct state and
    eventually instruction processing limit would be reached.
    
    To avoid such behavior speculatively forget (widen) range for
    imprecise scalar registers, if those registers were not precise at the
    end of the previous iteration and do not match exactly.
    
    This is a conservative heuristic that allows verifying a wide range of
    programs; however, it precludes verification of programs that conjure
    an imprecise value on the first loop iteration and use it as precise
    on the second.
    
    Test case iter_task_vma_for_each() presents one of such cases:
    
            unsigned int seen = 0;
            ...
            bpf_for_each(task_vma, vma, task, 0) {
                    if (seen >= 1000)
                            break;
                    ...
                    seen++;
            }
    
    Here clang generates the following code:
    
    <LBB0_4>:
          24:       r8 = r6                          ; stash current value of
                    ... body ...                       'seen'
          29:       r1 = r10
          30:       r1 += -0x8
          31:       call bpf_iter_task_vma_next
          32:       r6 += 0x1                        ; seen++;
          33:       if r0 == 0x0 goto +0x2 <LBB0_6>  ; exit on next() == NULL
          34:       r7 += 0x10
          35:       if r8 < 0x3e7 goto -0xc <LBB0_4> ; loop on seen < 1000
    
    <LBB0_6>:
          ... exit ...
    
    Note that the counter in r6 is copied to r8 and then incremented, and
    the conditional jump is done using r8. Because of this, the precision
    mark for r6 lags one state behind the precision mark on r8 and the
    widening logic kicks in.
    
    Adding barrier_var(seen) after the conditional is sufficient to force
    clang to use the same register for both counting and the conditional jump.
    
    This issue was discussed in the thread [1] which was started by
    Andrew Werner <awerner32@gmail.com> demonstrating a similar bug
    in callback functions handling. The callbacks would be addressed
    in a followup patch.
    
    [1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com/
    
    Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
    Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231024000917.12153-4-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
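
Illustrative sketch (not part of the patch): the widening heuristic
described above can be modelled on a single scalar register. The
struct scalar_model type and the helper names are hypothetical; the
real verifier tracks much more state per register.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a scalar register: a value range plus a mark. */
struct scalar_model {
	long long min;
	long long max;
	bool precise;
};

/* Exact comparison: ranges must be identical, marks are ignored. */
static bool exact_match(const struct scalar_model *a,
			const struct scalar_model *b)
{
	return a->min == b->min && a->max == b->max;
}

/*
 * Forget (widen) the range of 'cur' when neither the previous-iteration
 * register 'old' nor 'cur' is precise and the two do not match exactly.
 */
static void maybe_widen(const struct scalar_model *old,
			struct scalar_model *cur)
{
	assert(old != NULL && cur != NULL);
	if (old->precise || cur->precise || exact_match(old, cur))
		return;
	cur->min = LLONG_MIN;
	cur->max = LLONG_MAX;
}
```

For the `i++` loop above, i=0 and i=1 are both imprecise and differ, so
the second iteration widens i to "unknown" and later iterations compare
equal, letting the loop converge.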

bpf: extract __check_reg_arg() utility function [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:06:54 2023 +0200

    bpf: extract __check_reg_arg() utility function
    
    commit 683b96f9606ab7308ffb23c46ab43cecdef8a241 upstream.
    
    Split check_reg_arg() into two utility functions:
    - check_reg_arg() operating on registers from current verifier state;
    - __check_reg_arg() operating on a specific set of registers passed as
      a parameter;
    
    The __check_reg_arg() function would be used by a follow-up change for
    callbacks handling.
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-5-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: extract same_callsites() as utility function [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Oct 24 03:09:12 2023 +0300

    bpf: extract same_callsites() as utility function
    
    commit 4c97259abc9bc8df7712f76f58ce385581876857 upstream.
    
    Extract same_callsites() from clean_live_states() as a utility function.
    This function would be used by the next patch in the set.
    
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231024000917.12153-3-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: extract setup_func_entry() utility function [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:06:55 2023 +0200

    bpf: extract setup_func_entry() utility function
    
    commit 58124a98cb8eda69d248d7f1de954c8b2767c945 upstream.
    
    Move code for simulated stack frame creation to a separate utility
    function. This function would be used in the follow-up change for
    callbacks handling.
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-6-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: keep track of max number of bpf_loop callback iterations [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:07:00 2023 +0200

    bpf: keep track of max number of bpf_loop callback iterations
    
    commit bb124da69c47dd98d69361ec13244ece50bec63e upstream.
    
    In some cases verifier can't infer convergence of the bpf_loop()
    iteration. E.g. for the following program:
    
        static int cb(__u32 idx, struct num_context* ctx)
        {
            ctx->i++;
            return 0;
        }
    
        SEC("?raw_tp")
        int prog(void *_)
        {
            struct num_context ctx = { .i = 0 };
            __u8 choice_arr[2] = { 0, 1 };
    
            bpf_loop(2, cb, &ctx, 0);
            return choice_arr[ctx.i];
        }
    
    Each 'cb' simulation would eventually return to 'prog' and reach the
    'return choice_arr[ctx.i]' statement, at which point ctx.i would be
    marked precise, thus forcing the verifier to track a multitude of
    separate states with {.i=0}, {.i=1}, ... at bpf_loop() callback entry.
    
    This commit allows "brute force" handling for such cases by limiting
    number of callback body simulations using 'umax' value of the first
    bpf_loop() parameter.
    
    For this, extend bpf_func_state with 'callback_depth' field.
    Increment this field when callback visiting state is pushed to states
    traversal stack. For frame #N its 'callback_depth' field counts how
    many times callback with frame depth N+1 had been executed.
    Use bpf_func_state specifically to allow independent tracking of
    callback depths when multiple nested bpf_loop() calls are present.
    
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-11-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
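
Illustrative sketch (not part of the patch): the "brute force" bound
described above amounts to a per-frame counter compared against the
upper bound of the iteration count. struct frame_model and both
function names are hypothetical; only the 'callback_depth' field name
comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a function frame in the verifier. */
struct frame_model {
	unsigned int callback_depth; /* callback bodies simulated so far */
};

/* Another callback simulation is scheduled only while below umax. */
static int should_simulate_again(const struct frame_model *f,
				 unsigned long long umax)
{
	assert(f != NULL);
	return f->callback_depth < umax;
}

/* Count how many callback body simulations would be performed. */
static unsigned int simulate_loop(struct frame_model *f,
				  unsigned long long umax)
{
	unsigned int runs = 0;

	while (should_simulate_again(f, umax)) {
		f->callback_depth++; /* one more body pushed for traversal */
		runs++;
	}
	return runs;
}
```

Because the counter lives in the frame rather than in a global, nested
loops in this model each keep an independent count.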

bpf: move explored_state() closer to the beginning of verifier.c [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Oct 24 03:09:11 2023 +0300

    bpf: move explored_state() closer to the beginning of verifier.c
    
    commit 3c4e420cb6536026ddd50eaaff5f30e4f144200d upstream.
    
    Subsequent patches would make use of explored_state() function.
    Move it up to avoid adding unnecessary prototype.
    
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231024000917.12153-2-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: print full verifier states on infinite loop detection [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Oct 24 03:09:17 2023 +0300

    bpf: print full verifier states on infinite loop detection
    
    commit b4d8239534fddc036abe4a0fdbf474d9894d4641 upstream.
    
    Additional logging in is_state_visited(): if infinite loop is detected
    print full verifier state for both current and equivalent states.
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231024000917.12153-8-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: Propagate modified uaddrlen from cgroup sockaddr programs [+ + +]

Author: Daan De Meyer <daan.j.demeyer@gmail.com>
Date:   Wed Oct 11 20:51:04 2023 +0200

    bpf: Propagate modified uaddrlen from cgroup sockaddr programs
    
    [ Upstream commit fefba7d1ae198dcbf8b3b432de46a4e29f8dbd8c ]
    
    As prep for adding unix socket support to the cgroup sockaddr hooks,
    let's propagate the sockaddr length back to the caller after running
    a bpf cgroup sockaddr hook program. While not important for AF_INET or
    AF_INET6, the sockaddr length is important when working with AF_UNIX
    sockaddrs as the size of the sockaddr cannot be determined just from the
    address family or the sockaddr's contents.
    
    __cgroup_bpf_run_filter_sock_addr() is modified to take the uaddrlen as
    an input/output argument. After running the program, the modified sockaddr
    length is stored in the uaddrlen pointer.
    
    Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
    Link: https://lore.kernel.org/r/20231011185113.140426-3-daan.j.demeyer@gmail.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Stable-dep-of: c5114710c8ce ("xsk: fix usage of multi-buffer BPF helpers for ZC XDP")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
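
Illustrative sketch (not part of the patch): why AF_UNIX needs the
length as an input/output argument can be shown in userspace C. The
function rewrite_unix_addr() is hypothetical and is not the kernel
hook; it counts the terminating NUL in the reported length, which is
an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical hook: replace the path in a unix sockaddr and report the
 * new address length through *addrlen. Returns 0 on success, -1 if the
 * family is not AF_UNIX or the path is empty or does not fit.
 */
static int rewrite_unix_addr(struct sockaddr_un *addr, int *addrlen,
			     const char *path)
{
	size_t n;

	assert(addr != NULL && addrlen != NULL && path != NULL);
	n = strlen(path);
	if (addr->sun_family != AF_UNIX || n == 0 ||
	    n >= sizeof(addr->sun_path))
		return -1;

	memset(addr->sun_path, 0, sizeof(addr->sun_path));
	memcpy(addr->sun_path, path, n);
	/* The length depends on the path, not only on the family. */
	*addrlen = (int)(offsetof(struct sockaddr_un, sun_path) + n + 1);
	return 0;
}
```

For AF_INET or AF_INET6 the length is fixed by the family, so a
rewritten address never changes it; for AF_UNIX the caller has to read
the updated value back after the hook runs.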

bpf: verify callbacks as if they are called unknown number of times [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:06:56 2023 +0200

    bpf: verify callbacks as if they are called unknown number of times
    
    commit ab5cfac139ab8576fb54630d4cca23c3e690ee90 upstream.
    
    Prior to this patch callbacks were handled as regular function calls,
    execution of callback body was modeled exactly once.
    This patch updates callbacks handling logic as follows:
    - introduces a function push_callback_call() that schedules callback
      body verification in env->head stack;
    - updates prepare_func_exit() to reschedule callback body verification
      upon BPF_EXIT;
    - like calls to bpf_*_iter_next(), calls to callback invoking functions
      are marked as checkpoints;
    - is_state_visited() is updated to stop callback based iteration when
      some identical parent state is found.
    
    Paths with callback function invoked zero times are now verified first,
    which makes it necessary to modify some selftests:
    - the following negative tests required adding release/unlock/drop
      calls to avoid previously masked unrelated error reports:
      - cb_refs.c:underflow_prog
      - exceptions_fail.c:reject_rbtree_add_throw
      - exceptions_fail.c:reject_with_cp_reference
    - the following precision tracking selftests needed change in expected
      log trace:
      - verifier_subprog_precision.c:callback_result_precise
        (note: r0 precision is no longer propagated inside callback and
               I think this is a correct behavior)
      - verifier_subprog_precision.c:parent_callee_saved_reg_precise_with_callback
      - verifier_subprog_precision.c:parent_stack_slot_precise_with_callback
    
    Reported-by: Andrew Werner <awerner32@gmail.com>
    Closes: https://lore.kernel.org/bpf/CA+vRuzPChFNXmouzGG+wsy=6eMcfr1mFG0F3g7rbg-sedGKW3w@mail.gmail.com/
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-7-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: widening for callback iterators [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:06:58 2023 +0200

    bpf: widening for callback iterators
    
    commit cafe2c21508a38cdb3ed22708842e957b2572c3e upstream.
    
    Callbacks are similar to open coded iterators, so add imprecise
    widening logic for callback body processing. This makes callback based
    loops behave identically to open coded iterators, e.g. allowing
    programs like the one below to be verified:
    
      struct ctx { u32 i; };
      int cb(u32 idx, struct ctx* ctx)
      {
              ++ctx->i;
              return 0;
      }
      ...
      struct ctx ctx = { .i = 0 };
      bpf_loop(100, cb, &ctx, 0);
      ...
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-9-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: avoid copying BTRFS_ROOT_SUBVOL_DEAD flag to snapshot of subvolume being deleted [+ + +]

Author: Omar Sandoval <osandov@fb.com>
Date:   Thu Jan 4 11:48:47 2024 -0800

    btrfs: avoid copying BTRFS_ROOT_SUBVOL_DEAD flag to snapshot of subvolume being deleted
    
    commit 3324d0547861b16cf436d54abba7052e0c8aa9de upstream.
    
    Sweet Tea spotted a race between subvolume deletion and snapshotting
    that can result in the root item for the snapshot having the
    BTRFS_ROOT_SUBVOL_DEAD flag set. The race is:
    
    Thread 1                                      | Thread 2
    ----------------------------------------------|----------
    btrfs_delete_subvolume                        |
      btrfs_set_root_flags(BTRFS_ROOT_SUBVOL_DEAD)|
                                                  |btrfs_mksubvol
                                                  |  down_read(subvol_sem)
                                                  |  create_snapshot
                                                  |    ...
                                                  |    create_pending_snapshot
                                                  |      copy root item from source
      down_write(subvol_sem)                      |
    
    This flag is only checked in send and swap activate, which this would
    cause to fail mysteriously.
    
    create_snapshot() now checks the root refs to reject a deleted
    subvolume, so we can fix this by locking subvol_sem earlier so that the
    BTRFS_ROOT_SUBVOL_DEAD flag and the root refs are updated atomically.
    
    CC: stable@vger.kernel.org # 4.14+
    Reported-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Reviewed-by: Anand Jain <anand.jain@oracle.com>
    Signed-off-by: Omar Sandoval <osandov@fb.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: defrag: reject unknown flags of btrfs_ioctl_defrag_range_args [+ + +]

Author: Qu Wenruo <wqu@suse.com>
Date:   Wed Jan 10 08:58:26 2024 +1030

    btrfs: defrag: reject unknown flags of btrfs_ioctl_defrag_range_args
    
    commit 173431b274a9a54fc10b273b46e67f46bcf62d2e upstream.
    
    Add extra sanity check for btrfs_ioctl_defrag_range_args::flags.
    
    This is not really to enhance fuzzing tests, but as a preparation for
    future expansion on btrfs_ioctl_defrag_range_args.
    
    In the future we're going to add new members, allowing more fine tuning
    for btrfs defrag.  Without the -EOPNOTSUPP error, there would be no way
    to detect if the kernel supports those new defrag features.
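
    The check described above amounts to a mask test. A minimal userspace
    sketch follows; the flag values and the names DEFRAG_FLAGS_SUPPORTED
    and check_defrag_flags() are illustrative assumptions, not the
    kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative flag bits (assumed values, not copied from the uapi header). */
#define DEFRAG_RANGE_COMPRESS	(1ULL << 0)
#define DEFRAG_RANGE_START_IO	(1ULL << 1)

/* Hypothetical mask of every flag this sketch understands. */
#define DEFRAG_FLAGS_SUPPORTED	(DEFRAG_RANGE_COMPRESS | DEFRAG_RANGE_START_IO)

/* Return 0 if only known flag bits are set, -EOPNOTSUPP otherwise. */
int check_defrag_flags(uint64_t flags)
{
	if (flags & ~DEFRAG_FLAGS_SUPPORTED)
		return -EOPNOTSUPP;
	return 0;
}
```

    With such a check, userspace can probe for a new flag by setting it
    and looking for the error, instead of having it silently ignored.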
    
    CC: stable@vger.kernel.org # 4.14+
    Reviewed-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: don't abort filesystem when attempting to snapshot deleted subvolume [+ + +]

Author: Omar Sandoval <osandov@fb.com>
Date:   Thu Jan 4 11:48:46 2024 -0800

    btrfs: don't abort filesystem when attempting to snapshot deleted subvolume
    
    commit 7081929ab2572920e94d70be3d332e5c9f97095a upstream.
    
    If the source file descriptor to the snapshot ioctl refers to a deleted
    subvolume, we get the following abort:
    
      BTRFS: Transaction aborted (error -2)
      WARNING: CPU: 0 PID: 833 at fs/btrfs/transaction.c:1875 create_pending_snapshot+0x1040/0x1190 [btrfs]
      Modules linked in: pata_acpi btrfs ata_piix libata scsi_mod virtio_net blake2b_generic xor net_failover virtio_rng failover scsi_common rng_core raid6_pq libcrc32c
      CPU: 0 PID: 833 Comm: t_snapshot_dele Not tainted 6.7.0-rc6 #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
      RIP: 0010:create_pending_snapshot+0x1040/0x1190 [btrfs]
      RSP: 0018:ffffa09c01337af8 EFLAGS: 00010282
      RAX: 0000000000000000 RBX: ffff9982053e7c78 RCX: 0000000000000027
      RDX: ffff99827dc20848 RSI: 0000000000000001 RDI: ffff99827dc20840
      RBP: ffffa09c01337c00 R08: 0000000000000000 R09: ffffa09c01337998
      R10: 0000000000000003 R11: ffffffffb96da248 R12: fffffffffffffffe
      R13: ffff99820535bb28 R14: ffff99820b7bd000 R15: ffff99820381ea80
      FS:  00007fe20aadabc0(0000) GS:ffff99827dc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000559a120b502f CR3: 00000000055b6000 CR4: 00000000000006f0
      Call Trace:
       <TASK>
       ? create_pending_snapshot+0x1040/0x1190 [btrfs]
       ? __warn+0x81/0x130
       ? create_pending_snapshot+0x1040/0x1190 [btrfs]
       ? report_bug+0x171/0x1a0
       ? handle_bug+0x3a/0x70
       ? exc_invalid_op+0x17/0x70
       ? asm_exc_invalid_op+0x1a/0x20
       ? create_pending_snapshot+0x1040/0x1190 [btrfs]
       ? create_pending_snapshot+0x1040/0x1190 [btrfs]
       create_pending_snapshots+0x92/0xc0 [btrfs]
       btrfs_commit_transaction+0x66b/0xf40 [btrfs]
       btrfs_mksubvol+0x301/0x4d0 [btrfs]
       btrfs_mksnapshot+0x80/0xb0 [btrfs]
       __btrfs_ioctl_snap_create+0x1c2/0x1d0 [btrfs]
       btrfs_ioctl_snap_create_v2+0xc4/0x150 [btrfs]
       btrfs_ioctl+0x8a6/0x2650 [btrfs]
       ? kmem_cache_free+0x22/0x340
       ? do_sys_openat2+0x97/0xe0
       __x64_sys_ioctl+0x97/0xd0
       do_syscall_64+0x46/0xf0
       entry_SYSCALL_64_after_hwframe+0x6e/0x76
      RIP: 0033:0x7fe20abe83af
      RSP: 002b:00007ffe6eff1360 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fe20abe83af
      RDX: 00007ffe6eff23c0 RSI: 0000000050009417 RDI: 0000000000000003
      RBP: 0000000000000003 R08: 0000000000000000 R09: 00007fe20ad16cd0
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007ffe6eff13c0 R14: 00007fe20ad45000 R15: 0000559a120b6d58
       </TASK>
      ---[ end trace 0000000000000000 ]---
      BTRFS: error (device vdc: state A) in create_pending_snapshot:1875: errno=-2 No such entry
      BTRFS info (device vdc: state EA): forced readonly
      BTRFS warning (device vdc: state EA): Skipping commit of aborted transaction.
      BTRFS: error (device vdc: state EA) in cleanup_transaction:2055: errno=-2 No such entry
    
    This happens because create_pending_snapshot() initializes the new root
    item as a copy of the source root item. This includes the refs field,
    which is 0 for a deleted subvolume. The call to btrfs_insert_root()
    therefore inserts a root with refs == 0. btrfs_get_new_fs_root() then
    finds the root and returns -ENOENT if refs == 0, which causes
    create_pending_snapshot() to abort.
    
    Fix it by checking the source root's refs before attempting the
    snapshot, but after locking subvol_sem to avoid racing with deletion.
    
    CC: stable@vger.kernel.org # 4.14+
    Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Reviewed-by: Anand Jain <anand.jain@oracle.com>
    Signed-off-by: Omar Sandoval <osandov@fb.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: don't warn if discard range is not aligned to sector [+ + +]

Author: David Sterba <dsterba@suse.com>
Date:   Mon Jan 15 20:30:26 2024 +0100

    btrfs: don't warn if discard range is not aligned to sector
    
    commit a208b3f132b48e1f94f620024e66fea635925877 upstream.
    
    There's a warning in btrfs_issue_discard() when the range is not aligned
    to 512 bytes, originally added in 4d89d377bbb0 ("btrfs:
    btrfs_issue_discard ensure offset/length are aligned to sector
    boundaries"). We can't do sub-sector writes anyway so the adjustment is
    the only thing that we can do and the warning is unnecessary.
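
    The adjustment itself can be sketched as rounding the start up and
    the end down to 512-byte boundaries. This is a standalone
    illustration with a hypothetical helper name, not the code in
    btrfs_issue_discard().

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512ULL /* discard granularity assumed by this sketch */

/*
 * Shrink [*start, *start + *len) to the largest 512-byte-aligned range
 * inside it. An empty result is reported as *len == 0.
 */
void align_discard_range(uint64_t *start, uint64_t *len)
{
	uint64_t end, aligned_start, aligned_end;

	/* The sketch does not handle ranges that wrap around 2^64. */
	assert(*len <= UINT64_MAX - *start);

	end = *start + *len;
	/* Round the start up and the end down to sector boundaries. */
	aligned_start = (*start + SECTOR_SIZE - 1) & ~(SECTOR_SIZE - 1);
	aligned_end = end & ~(SECTOR_SIZE - 1);

	*start = aligned_start;
	*len = aligned_end > aligned_start ? aligned_end - aligned_start : 0;
}
```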
    
    CC: stable@vger.kernel.org # 4.19+
    Reported-by: syzbot+4a4f1eba14eb5c3417d1@syzkaller.appspotmail.com
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Reviewed-by: Anand Jain <anand.jain@oracle.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: ref-verify: free ref cache before clearing mount opt [+ + +]

Author: Fedor Pchelkin <pchelkin@ispras.ru>
Date:   Wed Jan 3 13:31:27 2024 +0300

    btrfs: ref-verify: free ref cache before clearing mount opt
    
    commit f03e274a8b29d1d1c1bbd7f764766cb5ca537ab7 upstream.
    
    As clearing REF_VERIFY mount option indicates there were some errors in a
    ref-verify process, a ref cache is not relevant anymore and should be
    freed.
    
    btrfs_free_ref_cache() requires REF_VERIFY option being set so call
    it just before clearing the mount option.
    
    Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
    
    Reported-by: syzbot+be14ed7728594dc8bd42@syzkaller.appspotmail.com
    Fixes: fd708b81d972 ("Btrfs: add a extent ref verify tool")
    CC: stable@vger.kernel.org # 5.4+
    Closes: https://lore.kernel.org/lkml/000000000000e5a65c05ee832054@google.com/
    Reported-by: syzbot+c563a3c79927971f950f@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/lkml/0000000000007fe09705fdc6086c@google.com/
    Reviewed-by: Anand Jain <anand.jain@oracle.com>
    Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: scrub: avoid use-after-free when chunk length is not 64K aligned [+ + +]

Author: Qu Wenruo <wqu@suse.com>
Date:   Wed Jan 17 11:02:25 2024 +1030

    btrfs: scrub: avoid use-after-free when chunk length is not 64K aligned
    
    commit f546c4282673497a06ecb6190b50ae7f6c85b02f upstream.
    
    [BUG]
    There is a bug report that, on a ext4-converted btrfs, scrub leads to
    various problems, including:
    
    - "unable to find chunk map" errors
      BTRFS info (device vdb): scrub: started on devid 1
      BTRFS critical (device vdb): unable to find chunk map for logical 2214744064 length 4096
      BTRFS critical (device vdb): unable to find chunk map for logical 2214744064 length 45056
    
      This would lead to unrepairable errors.
    
    - Use-after-free KASAN reports:
      ==================================================================
      BUG: KASAN: slab-use-after-free in __blk_rq_map_sg+0x18f/0x7c0
      Read of size 8 at addr ffff8881013c9040 by task btrfs/909
      CPU: 0 PID: 909 Comm: btrfs Not tainted 6.7.0-x64v3-dbg #11 c50636e9419a8354555555245df535e380563b2b
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2023.11-2 12/24/2023
      Call Trace:
       <TASK>
       dump_stack_lvl+0x43/0x60
       print_report+0xcf/0x640
       kasan_report+0xa6/0xd0
       __blk_rq_map_sg+0x18f/0x7c0
       virtblk_prep_rq.isra.0+0x215/0x6a0 [virtio_blk 19a65eeee9ae6fcf02edfad39bb9ddee07dcdaff]
       virtio_queue_rqs+0xc4/0x310 [virtio_blk 19a65eeee9ae6fcf02edfad39bb9ddee07dcdaff]
       blk_mq_flush_plug_list.part.0+0x780/0x860
       __blk_flush_plug+0x1ba/0x220
       blk_finish_plug+0x3b/0x60
       submit_initial_group_read+0x10a/0x290 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965]
       flush_scrub_stripes+0x38e/0x430 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965]
       scrub_stripe+0x82a/0xae0 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965]
       scrub_chunk+0x178/0x200 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965]
       scrub_enumerate_chunks+0x4bc/0xa30 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965]
       btrfs_scrub_dev+0x398/0x810 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965]
       btrfs_ioctl+0x4b9/0x3020 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965]
       __x64_sys_ioctl+0xbd/0x100
       do_syscall_64+0x5d/0xe0
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      RIP: 0033:0x7f47e5e0952b
    
    - Crash, mostly due to above use-after-free
    
    [CAUSE]
    The converted fs has the following data chunk layout:
    
        item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 2214658048) itemoff 16025 itemsize 80
            length 86016 owner 2 stripe_len 65536 type DATA|single
    
    The logical bytenr 2214744064 above is exactly at the chunk end
    (2214658048 + 86016 = 2214744064).
    
    This means btrfs_submit_bio() would split the bio, and trigger endio
    function for both of the two halves.
    
    However scrub_submit_initial_read() would only expect the endio function
    to be called once, not any more.
    This means the first endio function would already free the bbio::bio,
    leaving the bvec freed, thus the 2nd endio call would lead to
    use-after-free.
    
    [FIX]
    - Make sure scrub_read_endio() only updates bits in its range
      Since we may read less than 64K at the end of the chunk, we should not
      touch the bits beyond chunk boundary.
    
    - Make sure scrub_submit_initial_read() only reads the chunk range
      This is done by calculating the real number of sectors we need to
      read, and add sector-by-sector to the bio.
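
    The sector count in the second item can be sketched as clamping the
    stripe to the chunk end. The helper name is hypothetical and 4K
    sectors are assumed. For the chunk in the report (86016 bytes, one
    full 64K stripe plus 20480 bytes) the second stripe covers only 5
    sectors.

```c
#include <assert.h>
#include <stdint.h>

#define STRIPE_LEN	65536ULL /* 64K scrub stripe, as in the report */
#define SECTOR_SIZE	4096ULL  /* assumed sector size */

/*
 * Number of sectors of the stripe starting at stripe_start that lie
 * inside the chunk [chunk_start, chunk_start + chunk_len).
 */
unsigned int stripe_sectors_in_chunk(uint64_t chunk_start, uint64_t chunk_len,
				     uint64_t stripe_start)
{
	uint64_t chunk_end = chunk_start + chunk_len;
	uint64_t bytes;

	/* The stripe must begin inside the chunk. */
	assert(stripe_start >= chunk_start && stripe_start < chunk_end);

	bytes = chunk_end - stripe_start;
	if (bytes > STRIPE_LEN)
		bytes = STRIPE_LEN;
	return (unsigned int)(bytes / SECTOR_SIZE);
}
```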
    
    Thankfully the scrub read repair path won't need extra fixes:
    
    - scrub_stripe_submit_repair_read()
      With above fixes, we won't update error bit for range beyond chunk,
      thus scrub_stripe_submit_repair_read() should never submit any read
      beyond the chunk.
    
    Reported-by: Rongrong <i@rong.moe>
    Fixes: e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure")
    Tested-by: Rongrong <i@rong.moe>
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    [ Use min_t() to fix a compile error due to different types ]
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: sysfs: validate scrub_speed_max value [+ + +]

Author: David Disseldorp <ddiss@suse.de>
Date:   Fri Dec 8 11:41:56 2023 +1100

    btrfs: sysfs: validate scrub_speed_max value
    
    commit 2b0122aaa800b021e36027d7f29e206f87c761d6 upstream.
    
    The value set as scrub_speed_max accepts size with suffixes
    (k/m/g/t/p/e) but we should still validate it for trailing characters,
    similar to what we do with chunk_size_store.
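
    A userspace sketch of this kind of validation is shown below. It uses
    strtoull() in place of the kernel's parsing helper, and the function
    name parse_size_strict() is hypothetical. Rejecting an empty string
    and ignoring overflow are simplifications of this sketch only.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse "<number>[k|m|g|t|p|e]" followed only by optional whitespace.
 * Returns 0 and stores the value in *out, or -EINVAL on bad input.
 */
int parse_size_strict(const char *buf, unsigned long long *out)
{
	char *end;
	unsigned long long val;
	int shift = 0;

	assert(buf && out);

	val = strtoull(buf, &end, 0);
	if (end == buf)
		return -EINVAL; /* no digits at all */

	switch (tolower((unsigned char)*end)) {
	case 'e': shift = 60; break;
	case 'p': shift = 50; break;
	case 't': shift = 40; break;
	case 'g': shift = 30; break;
	case 'm': shift = 20; break;
	case 'k': shift = 10; break;
	}
	if (shift) {
		end++; /* consume the suffix */
		val <<= shift;
	}

	/* Allow a trailing newline or spaces, reject anything else. */
	while (isspace((unsigned char)*end))
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = val;
	return 0;
}
```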
    
    CC: stable@vger.kernel.org # 5.15+
    Signed-off-by: David Disseldorp <ddiss@suse.de>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: tree-checker: fix inline ref size in error messages [+ + +]

Author: Chung-Chiang Cheng <cccheng@synology.com>
Date:   Fri Jan 12 15:41:05 2024 +0800

    btrfs: tree-checker: fix inline ref size in error messages
    
    commit f398e70dd69e6ceea71463a5380e6118f219197e upstream.
    
    The error message should accurately reflect the size rather than the
    type.
    
    Fixes: f82d1c7ca8ae ("btrfs: tree-checker: Add EXTENT_ITEM and METADATA_ITEM check")
    CC: stable@vger.kernel.org # 5.4+
    Reviewed-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: zoned: factor out prepare_allocation_zoned() [+ + +]

Author: Naohiro Aota <naohiro.aota@wdc.com>
Date:   Tue Dec 19 01:02:28 2023 +0900

    btrfs: zoned: factor out prepare_allocation_zoned()
    
    [ Upstream commit b271fee9a41ca1474d30639fd6cc912c9901d0f8 ]
    
    Factor out prepare_allocation_zoned() for further extension. While at
    it, optimize the if-branch a bit.
    
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Stable-dep-of: 02444f2ac26e ("btrfs: zoned: optimize hint byte for zoned allocator")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

btrfs: zoned: fix lock ordering in btrfs_zone_activate() [+ + +]

Author: Naohiro Aota <naohiro.aota@wdc.com>
Date:   Fri Dec 22 13:56:34 2023 +0900

    btrfs: zoned: fix lock ordering in btrfs_zone_activate()
    
    commit b18f3b60b35a8c01c9a2a0f0d6424c6d73971dc3 upstream.
    
    The btrfs CI reported the following lockdep warning when running
    generic/129.
    
       WARNING: possible circular locking dependency detected
       6.7.0-rc5+ #1 Not tainted
       ------------------------------------------------------
       kworker/u5:5/793427 is trying to acquire lock:
       ffff88813256d028 (&cache->lock){+.+.}-{2:2}, at: btrfs_zone_finish_one_bg+0x5e/0x130
       but task is already holding lock:
       ffff88810a23a318 (&fs_info->zone_active_bgs_lock){+.+.}-{2:2}, at: btrfs_zone_finish_one_bg+0x34/0x130
       which lock already depends on the new lock.
    
       the existing dependency chain (in reverse order) is:
       -> #1 (&fs_info->zone_active_bgs_lock){+.+.}-{2:2}:
       ...
       -> #0 (&cache->lock){+.+.}-{2:2}:
       ...
    
    This is because we take fs_info->zone_active_bgs_lock after a block_group's
    lock in btrfs_zone_activate() while doing the opposite in other places.
    
    Fix the issue by expanding the fs_info->zone_active_bgs_lock's critical
    section and taking it before a block_group's lock.
    
    Fixes: a7e1ac7bdc5a ("btrfs: zoned: reserve zones for an active metadata/system block group")
    CC: stable@vger.kernel.org # 6.6
    Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: zoned: optimize hint byte for zoned allocator [+ + +]

Author: Naohiro Aota <naohiro.aota@wdc.com>
Date:   Tue Dec 19 01:02:29 2023 +0900

    btrfs: zoned: optimize hint byte for zoned allocator
    
    [ Upstream commit 02444f2ac26eae6385a65fcd66915084d15dffba ]
    
    Writing sequentially to a huge file on btrfs on an SMR HDD revealed a
    decline in performance (220 MiB/s to 30 MiB/s after 500 minutes).
    
    The performance goes down because of increased extent allocation
    latency, which is caused by traversing a lot of full block groups.
    
    So, this patch optimizes the ffe_ctl->hint_byte by choosing a block group
    with sufficient size from the active block group list, which does not
    contain full block groups.
    
    After applying the patch, the performance is maintained well.
    
    Fixes: 2eda57089ea3 ("btrfs: zoned: implement sequential extent allocation")
    CC: stable@vger.kernel.org # 5.15+
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bus: mhi: host: Add alignment check for event ring read pointer [+ + +]

Author: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Date:   Tue Oct 31 15:21:05 2023 +0530

    bus: mhi: host: Add alignment check for event ring read pointer
    
    commit eff9704f5332a13b08fbdbe0f84059c9e7051d5f upstream.
    
    Although we check the event ring read pointer with "is_valid_ring_ptr"
    to make sure it is in the buffer range, there is another risk: the
    pointer may not be aligned.  Since event ring elements are expected to
    be 128-bit (struct mhi_ring_element) aligned, an unaligned read pointer
    could lead to multiple issues like DoS or ring buffer memory corruption.
    
    So add an alignment check for the event ring read pointer.
    
    Fixes: ec32332df764 ("bus: mhi: core: Sanity check values from remote device before use")
    cc: stable@vger.kernel.org
    Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
    Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/20231031-alignment_check-v2-1-1441db7c5efd@quicinc.com
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: host: Add spinlock to protect WP access when queueing TREs [+ + +]

Author: Bhaumik Bhatt <bbhatt@codeaurora.org>
Date:   Mon Dec 11 14:42:51 2023 +0800

    bus: mhi: host: Add spinlock to protect WP access when queueing TREs
    
    commit b89b6a863dd53bc70d8e52d50f9cfaef8ef5e9c9 upstream.
    
    Protect WP accesses such that multiple threads queueing buffers for
    incoming data do not race.
    
    Meanwhile, if CONFIG_TRACE_IRQFLAGS is enabled, irq will be enabled once
    __local_bh_enable_ip is called as part of write_unlock_bh. Hence, let's
    take irqsave lock after TRE is generated to avoid running write_unlock_bh
    when irqsave lock is held.
    
    Cc: stable@vger.kernel.org
    Fixes: 189ff97cca53 ("bus: mhi: core: Add support for data transfer")
    Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
    Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
    Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
    Tested-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/1702276972-41296-2-git-send-email-quic_qianyu@quicinc.com
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: host: Drop chan lock before queuing buffers [+ + +]

Author: Qiang Yu <quic_qianyu@quicinc.com>
Date:   Mon Dec 11 14:42:52 2023 +0800

    bus: mhi: host: Drop chan lock before queuing buffers
    
    commit 01bd694ac2f682fb8017e16148b928482bc8fa4b upstream.
    
    Ensure read and write locks for the channel are not taken in succession by
    dropping the read lock from parse_xfer_event() such that a callback given
    to client can potentially queue buffers and acquire the write lock in that
    process. Any queueing of buffers should be done without channel read lock
    acquired as it can result in multiple locks and a soft lockup.
    
    Cc: <stable@vger.kernel.org> # 5.7
    Fixes: 1d3173a3bae7 ("bus: mhi: core: Add support for processing events from client device")
    Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
    Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
    Tested-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Link: https://lore.kernel.org/r/1702276972-41296-3-git-send-email-quic_qianyu@quicinc.com
    [mani: added fixes tag and cc'ed stable]
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: after disabling multichannel, mark tcon for reconnect [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Fri Dec 29 11:30:07 2023 +0000

    cifs: after disabling multichannel, mark tcon for reconnect
    
    commit 27e1fd343f80168ff456785c2443136b6b7ca3cc upstream.
    
    Once the server disables multichannel for an active multichannel
    session, on the following reconnect, the client would reduce
    the number of channels to 1. However, it could be the case that
    the tree connect was active on one of these disabled channels.
    This results in an unrecoverable state.
    
    This change fixes that by making sure that whenever a channel
    is being terminated, the session and tcon are marked for
    reconnect too. This could mean a few redundant tree connect
    calls to the server, but considering that this is not a frequent
    event, we should be okay.
    
    Fixes: ee1d21794e55 ("cifs: handle when server stops supporting multichannel")
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: fix a pending undercount of srv_count [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Fri Dec 15 17:16:55 2023 +0000

    cifs: fix a pending undercount of srv_count
    
    commit f30bbc38704e279c06d073ecb18fea376791ecab upstream.
    
    The following commit reverted the changes to ref count
    the server struct while scheduling a reconnect work:
    823342524868 Revert "cifs: reconnect work should have reference on server struct"
    
    However, a following change also introduced scheduling
    of reconnect work, and assumed ref counting. This change
    fixes that as well.
    
    Fixes umount problems like:
    
    [73496.157838] CPU: 5 PID: 1321389 Comm: umount Tainted: G        W  OE      6.7.0-060700rc6-generic #202312172332
    [73496.157841] Hardware name: LENOVO 20MAS08500/20MAS08500, BIOS N2CET67W (1.50 ) 12/15/2022
    [73496.157843] RIP: 0010:cifs_put_tcp_session+0x17d/0x190 [cifs]
    [73496.157906] Code: 5d 31 c0 31 d2 31 f6 31 ff c3 cc cc cc cc e8 4a 6e 14 e6 e9 f6 fe ff ff be 03 00 00 00 48 89 d7 e8 78 26 b3 e5 e9 e4 fe ff ff <0f> 0b e9 b1 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90
    [73496.157908] RSP: 0018:ffffc90003bcbcb8 EFLAGS: 00010286
    [73496.157911] RAX: 00000000ffffffff RBX: ffff8885830fa800 RCX: 0000000000000000
    [73496.157913] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    [73496.157915] RBP: ffffc90003bcbcc8 R08: 0000000000000000 R09: 0000000000000000
    [73496.157917] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    [73496.157918] R13: ffff8887d56ba800 R14: 00000000ffffffff R15: ffff8885830fa800
    [73496.157920] FS:  00007f1ff0e33800(0000) GS:ffff88887ba80000(0000) knlGS:0000000000000000
    [73496.157922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [73496.157924] CR2: 0000115f002e2010 CR3: 00000003d1e24005 CR4: 00000000003706f0
    [73496.157926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [73496.157928] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [73496.157929] Call Trace:
    [73496.157931]  <TASK>
    [73496.157933]  ? show_regs+0x6d/0x80
    [73496.157936]  ? __warn+0x89/0x160
    [73496.157939]  ? cifs_put_tcp_session+0x17d/0x190 [cifs]
    [73496.157976]  ? report_bug+0x17e/0x1b0
    [73496.157980]  ? handle_bug+0x51/0xa0
    [73496.157983]  ? exc_invalid_op+0x18/0x80
    [73496.157985]  ? asm_exc_invalid_op+0x1b/0x20
    [73496.157989]  ? cifs_put_tcp_session+0x17d/0x190 [cifs]
    [73496.158023]  ? cifs_put_tcp_session+0x1e/0x190 [cifs]
    [73496.158057]  __cifs_put_smb_ses+0x2b5/0x540 [cifs]
    [73496.158090]  ? tconInfoFree+0xc2/0x120 [cifs]
    [73496.158130]  cifs_put_tcon.part.0+0x108/0x2b0 [cifs]
    [73496.158173]  cifs_put_tlink+0x49/0x90 [cifs]
    [73496.158220]  cifs_umount+0x56/0xb0 [cifs]
    [73496.158258]  cifs_kill_sb+0x52/0x60 [cifs]
    [73496.158306]  deactivate_locked_super+0x32/0xc0
    [73496.158309]  deactivate_super+0x46/0x60
    [73496.158311]  cleanup_mnt+0xc3/0x170
    [73496.158314]  __cleanup_mnt+0x12/0x20
    [73496.158330]  task_work_run+0x5e/0xa0
    [73496.158333]  exit_to_user_mode_loop+0x105/0x130
    [73496.158336]  exit_to_user_mode_prepare+0xa5/0xb0
    [73496.158338]  syscall_exit_to_user_mode+0x29/0x60
    [73496.158341]  do_syscall_64+0x6c/0xf0
    [73496.158344]  ? syscall_exit_to_user_mode+0x37/0x60
    [73496.158346]  ? do_syscall_64+0x6c/0xf0
    [73496.158349]  ? exit_to_user_mode_prepare+0x30/0xb0
    [73496.158353]  ? syscall_exit_to_user_mode+0x37/0x60
    [73496.158355]  ? do_syscall_64+0x6c/0xf0
    
    Reported-by: Robert Morris <rtm@csail.mit.edu>
    Fixes: 705fc522fe9d ("cifs: handle when server starts supporting multichannel")
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: fix lock ordering while disabling multichannel [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Tue Nov 14 04:58:23 2023 +0000

    cifs: fix lock ordering while disabling multichannel
    
    commit 5eef12c4e3230f2025dc46ad8c4a3bc19978e5d7 upstream.
    
    The code to handle the case of server disabling multichannel
    was picking iface_lock with chan_lock held. This goes against
    the lock ordering rules, as iface_lock is a higher order lock
    (even if it isn't so obvious).
    
    This change fixes the lock ordering by doing the following in
    that order for each secondary channel:
    1. store iface and server pointers in local variable
    2. remove references to iface and server in channels
    3. unlock chan_lock
    4. lock iface_lock
    5. dec ref count for iface
    6. unlock iface_lock
    7. dec ref count for server
    8. lock chan_lock again
    
    Since this function can only be called in smb2_reconnect, and
    that cannot be called by two parallel processes, we should not
    have races due to dropping chan_lock between steps 3 and 8.
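
    The eight steps can be sketched in userspace with pthread mutexes.
    All structures and names below are heavily simplified, hypothetical
    stand-ins for the cifs ones. In particular, the plain decrement of
    the server reference count ignores the locking that the real code
    needs for it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CHANS 4

/* Hypothetical stand-ins for the cifs structures. */
struct iface   { int refcount; };
struct server  { int refcount; };
struct channel { struct iface *iface; struct server *server; };

struct session {
	pthread_mutex_t iface_lock; /* higher order: never taken under chan_lock */
	pthread_mutex_t chan_lock;
	struct channel chans[MAX_CHANS];
	size_t chan_count;
};

/* Drop every secondary channel, following steps 1-8 from the text. */
void disable_secondary_channels(struct session *ses)
{
	size_t i;

	assert(ses->chan_count <= MAX_CHANS);

	pthread_mutex_lock(&ses->chan_lock);
	for (i = 1; i < ses->chan_count; i++) {
		struct iface *iface = ses->chans[i].iface;	/* step 1 */
		struct server *server = ses->chans[i].server;	/* step 1 */

		ses->chans[i].iface = NULL;			/* step 2 */
		ses->chans[i].server = NULL;			/* step 2 */

		pthread_mutex_unlock(&ses->chan_lock);		/* step 3 */

		if (iface) {
			pthread_mutex_lock(&ses->iface_lock);	/* step 4 */
			iface->refcount--;			/* step 5 */
			pthread_mutex_unlock(&ses->iface_lock);	/* step 6 */
		}
		if (server)
			server->refcount--;			/* step 7 */

		pthread_mutex_lock(&ses->chan_lock);		/* step 8 */
	}
	if (ses->chan_count > 1)
		ses->chan_count = 1;
	pthread_mutex_unlock(&ses->chan_lock);
}
```

    Because chan_lock is released before iface_lock is taken, the two
    locks are never nested in the wrong order.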
    
    Fixes: ee1d21794e55 ("cifs: handle when server stops supporting multichannel")
    Reported-by: Paulo Alcantara <pc@manguebit.com>
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: fix stray unlock in cifs_chan_skip_or_disable [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Tue Jan 23 05:07:57 2024 +0000

    cifs: fix stray unlock in cifs_chan_skip_or_disable
    
    [ Upstream commit 993d1c346b1a51ac41b2193609a0d4e51e9748f4 ]
    
    A recent change moved the code that decides to skip
    a channel or disable multichannel entirely, into a
    helper function.
    
    During this, a mutex_unlock of the session_mutex
    should have been removed. Doing that here.
    
    Fixes: f591062bdbf4 ("cifs: handle servers that still advertise multichannel after disabling")
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: handle cases where a channel is closed [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Fri Oct 13 09:25:30 2023 +0000

    cifs: handle cases where a channel is closed
    
    [ Upstream commit 0c51cc6f2cb0108e7d49805f6e089cd85caab279 ]
    
    So far, SMB multichannel could only scale up, but not
    scale down the number of channels. In this patch series,
    we now allow the client to deal with the case
    of multichannel disabled on the server when the share
    is mounted. With that change, we now need the ability
    to scale down the channels.
    
    This change allows the client to deal with cases of
    missing channels more gracefully.
    
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Stable-dep-of: 78e727e58e54 ("cifs: update iface_last_update on each query-and-update")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: handle servers that still advertise multichannel after disabling [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Tue Jan 2 13:14:46 2024 +0000

    cifs: handle servers that still advertise multichannel after disabling
    
    [ Upstream commit f591062bdbf4742b7f1622173017f19e927057b0 ]
    
    Some servers like Azure SMB servers always advertise multichannel
    capability in server capabilities list. Such servers return error
    STATUS_NOT_IMPLEMENTED for ioctl calls to query server interfaces,
    and expect clients to consider that as a sign that they do not support
    multichannel.
    
    We already handled this at mount time. Soon after the tree connect,
    we query server interfaces. And when server returned STATUS_NOT_IMPLEMENTED,
    we kept interface list as empty. When cifs_try_adding_channels gets
    called, it would not find any interfaces, so will not add channels.
    
    For the case where an active multichannel mount exists, and multichannel
    is disabled by such a server, this change will now allow the client
    to disable secondary channels on the mount. It will check the return
    status of query server interfaces call soon after a tree reconnect.
    If the return status is EOPNOTSUPP, then instead of the check to add
    more channels, we'll disable the secondary channels instead.
    
    For better code reuse, this change also moves the common code for
    disabling multichannel to a helper function.
    
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Stable-dep-of: 78e727e58e54 ("cifs: update iface_last_update on each query-and-update")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: handle when server starts supporting multichannel [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Fri Oct 13 11:33:21 2023 +0000

    cifs: handle when server starts supporting multichannel
    
    [ Upstream commit 705fc522fe9d58848c253ee0948567060f36e2a7 ]
    
    When the user mounts with the multichannel option but the
    server does not support it, the server may start supporting
    it at some point in the future.
    
    With this change, such a case is handled.
    
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Stable-dep-of: 78e727e58e54 ("cifs: update iface_last_update on each query-and-update")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: handle when server stops supporting multichannel [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Fri Oct 13 11:40:09 2023 +0000

    cifs: handle when server stops supporting multichannel
    
    [ Upstream commit ee1d21794e55ab76505745d24101331552182002 ]
    
    When a server stops supporting multichannel, we will
    keep attempting reconnects to the secondary channels today.
    Avoid this by freeing extra channels when negotiate
    returns no multichannel support.
    
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Stable-dep-of: 78e727e58e54 ("cifs: update iface_last_update on each query-and-update")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: reconnect work should have reference on server struct [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Fri Oct 13 11:43:09 2023 +0000

    cifs: reconnect work should have reference on server struct
    
    [ Upstream commit 19a4b9d6c372cab6a3b2c9a061a236136fe95274 ]
    
    The delayed work for reconnect takes server struct
    as a parameter. But it does so without holding a ref
    to it. Normally, this may not show a problem as
    the reconnect work is only cancelled on umount.
    
    However, since we now plan to support scaling down of
    channels, and the scale down can happen from reconnect
    work itself, we need to fix it.
    
    This change takes a reference on the server struct
    before it is passed to the delayed work. And drops
    the reference in the delayed work itself. Or if
    the delayed work is successfully cancelled, by the
    process that cancels it.
    
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Stable-dep-of: 78e727e58e54 ("cifs: update iface_last_update on each query-and-update")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: reconnect worker should take reference on server struct unconditionally [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Wed Dec 6 16:37:38 2023 +0000

    cifs: reconnect worker should take reference on server struct unconditionally
    
    [ Upstream commit 04909192ada3285070f8ced0af7f07735478b364 ]
    
    Reconnect worker currently assumes that the server struct
    is alive and only takes reference on the server if it needs
    to call smb2_reconnect.
    
    With the new ability to disable channels based on whether the
    server has multichannel disabled, this becomes a problem when
    we need to disable established channels. While disabling the
    channels and deallocating the server, there could be reconnect
    work that could not be cancelled (because it started).
    
    This change forces the reconnect worker to unconditionally
    take a reference on the server when it runs.
    
    Also, this change now allows smb2_reconnect to know if it was
    called by the reconnect worker. Based on this, the cifs_put_tcp_session
    can decide whether it can cancel the reconnect work synchronously or not.
    
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Stable-dep-of: 78e727e58e54 ("cifs: update iface_last_update on each query-and-update")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: update iface_last_update on each query-and-update [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Wed Jan 3 12:51:49 2024 +0000

    cifs: update iface_last_update on each query-and-update
    
    [ Upstream commit 78e727e58e54efca4c23863fbd9e16e9d2d83f81 ]
    
    iface_last_update was an unused field when it was introduced.
    Later, when we had periodic update of server interface list,
    this field was used regularly to decide when to update next.
    
    However, with the new logic of updating the interfaces, it
    becomes crucial that this field be updated whenever
    parse_server_interfaces runs successfully.
    
    This change updates this field when the server does not
    support query of interfaces, so that we do not query
    the interfaces repeatedly. It also updates the field when
    the function reaches the end.
    
    Fixes: aa45dadd34e4 ("cifs: change iface_list from array to sorted linked list")
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

clocksource: Skip watchdog check for large watchdog intervals [+ + +]

Author: Jiri Wiesner <jwiesner@suse.de>
Date:   Mon Jan 22 18:23:50 2024 +0100

    clocksource: Skip watchdog check for large watchdog intervals
    
    commit 644649553508b9bacf0fc7a5bdc4f9e0165576a5 upstream.
    
    There have been reports of the watchdog marking clocksources unstable on
    machines with 8 NUMA nodes:
    
      clocksource: timekeeping watchdog on CPU373:
      Marking clocksource 'tsc' as unstable because the skew is too large:
      clocksource:   'hpet' wd_nsec: 14523447520
      clocksource:   'tsc'  cs_nsec: 14524115132
    
    The measured clocksource skew - the absolute difference between cs_nsec
    and wd_nsec - was 668 microseconds:
    
      cs_nsec - wd_nsec = 14524115132 - 14523447520 = 667612
    
    The kernel used 200 microseconds for the uncertainty_margin of both the
    clocksource and watchdog, resulting in a threshold of 400 microseconds (the
    md variable). Both the cs_nsec and the wd_nsec value indicate that the
    readout interval was circa 14.5 seconds.  The observed behaviour is that
    watchdog checks failed for large readout intervals on 8 NUMA node
    machines. This indicates that the size of the skew was directly proportional
    to the length of the readout interval on those machines. The measured
    clocksource skew, 668 microseconds, was evaluated against a threshold (the
    md variable) that is suited for readout intervals of roughly
    WATCHDOG_INTERVAL, i.e. HZ >> 1, which is 0.5 second.
    
    The intention of 2e27e793e280 ("clocksource: Reduce clocksource-skew
    threshold") was to tighten the threshold for evaluating skew and set the
    lower bound for the uncertainty_margin of clocksources to twice
    WATCHDOG_MAX_SKEW. Later in c37e85c135ce ("clocksource: Loosen clocksource
    watchdog constraints"), the WATCHDOG_MAX_SKEW constant was increased to
    125 microseconds to fit the limit of NTP, which is able to use a
    clocksource that suffers from up to 500 microseconds of skew per second.
    Both the TSC and the HPET use default uncertainty_margin. When the
    readout interval gets stretched the default uncertainty_margin is no
    longer a suitable lower bound for evaluating skew - it imposes a limit
    that is far stricter than the skew with which NTP can deal.
    
    The root causes of the skew being directly proportional to the length of
    the readout interval are:
    
      * the inaccuracy of the shift/mult pairs of clocksources and the watchdog
      * the conversion to nanoseconds is imprecise for large readout intervals
    
    Prevent this by skipping the current watchdog check if the readout
    interval exceeds 2 * WATCHDOG_INTERVAL. Considering the maximum readout
    interval of 2 * WATCHDOG_INTERVAL, the current default uncertainty margin
    (of the TSC and HPET) corresponds to a limit on clocksource skew of 250
    ppm (microseconds of skew per second).  To keep the limit imposed by NTP
    (500 microseconds of skew per second) for all possible readout intervals,
    the margins would have to be scaled so that the threshold value is
    proportional to the length of the actual readout interval.
    
    As for why the readout interval may get stretched: Since the watchdog is
    executed in softirq context the expiration of the watchdog timer can get
    severely delayed on account of a ksoftirqd thread not getting to run in a
    timely manner. Surely, a system with such belated softirq execution is not
    working well and the scheduling issue should be looked into but the
    clocksource watchdog should be able to deal with it accordingly.
    
    Fixes: 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold")
    Suggested-by: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Jiri Wiesner <jwiesner@suse.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Feng Tang <feng.tang@intel.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240122172350.GA740@incl
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
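
The arithmetic in the clocksource commit above can be checked with a short sketch. The function names are hypothetical, the 200 microsecond margin is the value quoted in the report, and taking the readout interval as the larger of the two readings is an assumption made for the example rather than a statement about the kernel code.

```python
NSEC_PER_USEC = 1_000
NSEC_PER_SEC = 1_000_000_000

WATCHDOG_INTERVAL_NS = NSEC_PER_SEC // 2     # HZ >> 1, i.e. 0.5 second
UNCERTAINTY_MARGIN_NS = 200 * NSEC_PER_USEC  # value quoted in the report


def watchdog_verdict(cs_nsec, wd_nsec, skip_long_intervals=True):
    """Classify one watchdog readout as 'skipped', 'unstable' or 'ok'."""
    # Assumption: the readout interval is the larger of the two readings.
    interval = max(cs_nsec, wd_nsec)
    if skip_long_intervals and interval > 2 * WATCHDOG_INTERVAL_NS:
        return "skipped"
    # Threshold: clocksource margin plus watchdog margin (the md variable).
    md = 2 * UNCERTAINTY_MARGIN_NS
    if abs(cs_nsec - wd_nsec) > md:
        return "unstable"
    return "ok"


def skew_per_second_usec(cs_nsec, wd_nsec):
    """Skew in microseconds per second of readout interval."""
    skew_usec = abs(cs_nsec - wd_nsec) / NSEC_PER_USEC
    return skew_usec / (wd_nsec / NSEC_PER_SEC)


wd_nsec = 14523447520  # 'hpet' reading from the report
cs_nsec = 14524115132  # 'tsc' reading from the report
skew_nsec = abs(cs_nsec - wd_nsec)  # 667612 ns, about 668 microseconds
```

With the reported readings the skew is 667612 ns against a 400000 ns threshold, so a check without the interval test marks the clocksource unstable. Spread over the 14.5 second interval, that is roughly 46 microseconds of skew per second, well below the 500 that NTP tolerates, which is why skipping the check for such long intervals is reasonable.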

cpufreq/amd-pstate: Fix setting scaling max/min freq values [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Fri Jan 19 05:33:19 2024 -0600

    cpufreq/amd-pstate: Fix setting scaling max/min freq values
    
    [ Upstream commit 22fb4f041999f5f16ecbda15a2859b4ef4cbf47e ]
    
    Scaling min/max freq values were being cached and lagging a setting
    each time.  Fix the ordering of the clamp call to ensure they work.
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
    Fixes: febab20caeba ("cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update")
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Reviewed-by: Wyes Karny <wkarny@gmail.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cpufreq: intel_pstate: Refine computation of P-state for given frequency [+ + +]

Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Mon Jan 22 15:18:11 2024 +0100

    cpufreq: intel_pstate: Refine computation of P-state for given frequency
    
    commit 192cdb1c907fd8df2d764c5bb17496e415e59391 upstream.
    
    On systems using HWP, if a given frequency is equal to the maximum turbo
    frequency or the maximum non-turbo frequency, the HWP performance level
    corresponding to it is already known and can be used directly without
    any computation.
    
    Accordingly, adjust the code to use the known HWP performance levels in
    the cases mentioned above.
    
    This also helps to avoid limiting CPU capacity artificially in some
    cases when the BIOS produces the HWP_CAP numbers using a different
    E-core-to-P-core performance scaling factor than expected by the kernel.
    
    Fixes: f5c8cf2a4992 ("cpufreq: intel_pstate: hybrid: Use known scaling factor for P-cores")
    Cc: 6.1+ <stable@vger.kernel.org> # 6.1+
    Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: api - Disallow identical driver names [+ + +]

Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Thu Dec 7 18:36:57 2023 +0800

    crypto: api - Disallow identical driver names
    
    commit 27016f75f5ed47e2d8e0ca75a8ff1f40bc1a5e27 upstream.
    
    Disallow registration of two algorithms with identical driver names.
    
    Cc: <stable@vger.kernel.org>
    Reported-by: Ovidiu Panait <ovidiu.panait@windriver.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init [+ + +]

Author: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Date:   Thu Dec 14 11:08:34 2023 +0800

    crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init
    
    commit ba3c5574203034781ac4231acf117da917efcd2a upstream.
    
    When the mpi_ec_ctx structure is initialized, some fields are not
    cleared, causing a crash when referencing the field when the
    structure was released. Initially, this issue was ignored because
    memory for mpi_ec_ctx is allocated with the __GFP_ZERO flag.
    For example, this error will be triggered when calculating the
    Za value for SM2 separately.
    
    Fixes: d58bb7e55a8a ("lib/mpi: Introduce ec implementation to MPI library")
    Cc: stable@vger.kernel.org # v6.5
    Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: s390/aes - Fix buffer overread in CTR mode [+ + +]

Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue Nov 28 14:22:13 2023 +0800

    crypto: s390/aes - Fix buffer overread in CTR mode
    
    commit d07f951903fa9922c375b8ab1ce81b18a0034e3b upstream.
    
    When processing the last block, the s390 ctr code will always read
    a whole block, even if there isn't a whole block of data left.  Fix
    this by using the actual length left and copy it into a buffer first
    for processing.
    
    Fixes: 0200f3ecc196 ("crypto: s390 - add System z hardware support for CTR mode")
    Cc: <stable@vger.kernel.org>
    Reported-by: Guangwu Zhang <guazhang@redhat.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Reviewed-by: Harald Freudenberger <freude@de.ibm.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cxl/region: Fix overflow issue in alloc_hpa() [+ + +]

Author: Quanquan Cao <caoqq@fujitsu.com>
Date:   Wed Jan 24 17:15:26 2024 +0800

    cxl/region: Fix overflow issue in alloc_hpa()
    
    commit d76779dd3681c01a4c6c3cae4d0627c9083e0ee6 upstream.
    
    Creating a region with 16 memory devices caused a problem. The div_u64_rem
    function, which divides an unsigned 64-bit number by a 32-bit one,
    misbehaved when the divisor was SZ_256M * p->interleave_ways. The product
    exceeded the maximum value of the 32-bit divisor (4G), leading to an
    overflow and a remainder of 0.
    Note: at this point, p->interleave_ways is 16, meaning 16 * 256M = 4G.
    
    To fix this issue, I replaced the div_u64_rem function with div64_u64_rem
    and adjusted the type of the remainder.
    
    Signed-off-by: Quanquan Cao <caoqq@fujitsu.com>
    Reviewed-by: Dave Jiang <dave.jiang@intel.com>
    Fixes: 23a22cd1c98b ("cxl/region: Allocate HPA capacity to regions")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
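
The overflow described in the cxl/region commit above comes down to a product that no longer fits in 32 bits. The sketch below models the truncation in Python with an explicit mask. The helper names are hypothetical, and the example only shows what happens to the divisor, not how the kernel division helper behaves afterwards.

```python
SZ_256M = 0x10000000
U32_MASK = 0xFFFFFFFF


def as_u32(value):
    """Truncate a value to 32 bits, as passing it in a u32 would."""
    return value & U32_MASK


def remainder_64(size, divisor):
    """Remainder computed with a full 64-bit divisor."""
    return size % divisor


interleave_ways = 16
divisor = SZ_256M * interleave_ways   # 0x1_0000_0000, i.e. 4G
truncated = as_u32(divisor)           # 0: the high bit is lost

# A size that is not a multiple of 4G must be detected as misaligned.
misaligned_size = divisor + SZ_256M
aligned_size = 2 * divisor
```

With 8 ways the product still fits in 32 bits, which is consistent with the problem only showing up at 16 memory devices. Keeping the divisor at 64 bits, as the commit does by switching to div64_u64_rem, preserves the alignment check.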

dlm: use kernel_connect() and kernel_bind() [+ + +]

Author: Jordan Rife <jrife@google.com>
Date:   Mon Nov 6 15:24:38 2023 -0600

    dlm: use kernel_connect() and kernel_bind()
    
    commit e9cdebbe23f1aa9a1caea169862f479ab3fa2773 upstream.
    
    Recent changes to kernel_connect() and kernel_bind() ensure that
    callers are insulated from changes to the address parameter made by BPF
    SOCK_ADDR hooks. This patch wraps direct calls to ops->connect() and
    ops->bind() with kernel_connect() and kernel_bind() to protect callers
    in such cases.
    
    Link: https://lore.kernel.org/netdev/9944248dba1bce861375fcce9de663934d933ba9.camel@redhat.com/
    Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect")
    Fixes: 4fbac77d2d09 ("bpf: Hooks for sys_bind")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jordan Rife <jrife@google.com>
    Signed-off-by: David Teigland <teigland@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: fix NULL pointer in channel unregistration function [+ + +]

Author: Amelie Delaunay <amelie.delaunay@foss.st.com>
Date:   Wed Dec 13 17:04:52 2023 +0100

    dmaengine: fix NULL pointer in channel unregistration function
    
    [ Upstream commit f5c24d94512f1b288262beda4d3dcb9629222fc7 ]
    
    __dma_async_device_channel_register() can fail. In case of failure,
    chan->local is freed (with free_percpu()), and chan->local is nullified.
    When dma_async_device_unregister() is called (because of managed API or
    intentionally by DMA controller driver), channels are unconditionally
    unregistered, leading to this NULL pointer:
    [    1.318693] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0
    [...]
    [    1.484499] Call trace:
    [    1.486930]  device_del+0x40/0x394
    [    1.490314]  device_unregister+0x20/0x7c
    [    1.494220]  __dma_async_device_channel_unregister+0x68/0xc0
    
    In the dma_async_device_register() error path, channel device
    unregistration is done only if chan->local is not NULL.
    
    Add the same condition at the beginning of
    __dma_async_device_channel_unregister(), to avoid the NULL pointer
    dereference whatever API is used to reach this function.
    
    Fixes: d2fb0a043838 ("dmaengine: break out channel registration")
    Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
    Reviewed-by: Dave Jiang <dave.jiang@intel.com>
    Link: https://lore.kernel.org/r/20231213160452.2598073-1-amelie.delaunay@foss.st.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
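
The guard described in the dmaengine commit above can be illustrated with a small model. This is a sketch in Python, not the dmaengine code: the class, the `fail` parameter, and the dictionary standing in for the channel device are all hypothetical. It only shows why an early return on a missing `local` makes unregistration safe after a failed registration.

```python
class Channel:
    def __init__(self):
        self.local = None  # stands in for the per-CPU chan->local allocation
        self.dev = None    # stands in for the channel device


def channel_register(chan, fail=False):
    """Register a channel; on failure, local is freed and set to None."""
    chan.local = object()
    if fail:
        chan.local = None  # error path: freed and nullified
        return False
    chan.dev = {"registered": True}
    return True


def channel_unregister(chan):
    """Unregister a channel, tolerating one that never registered."""
    if chan.local is None:
        return  # nothing was registered, so there is nothing to tear down
    chan.dev["registered"] = False
    chan.dev = None
    chan.local = None
```

Without the early return, unregistering a channel whose registration failed would touch `chan.dev` while it is still None, which is the Python analogue of the NULL pointer dereference in the trace.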

dmaengine: fsl-edma: fix eDMAv4 channel allocation issue [+ + +]

Author: Frank Li <Frank.Li@nxp.com>
Date:   Tue Nov 14 10:48:21 2023 -0500

    dmaengine: fsl-edma: fix eDMAv4 channel allocation issue
    
    [ Upstream commit dc51b4442dd94ab12c146c1897bbdb40e16d5636 ]
    
    The eDMAv4 channel mux has a limitation where certain requests must use
    even channels, while others must use odd numbers.
    
    Add two flags (ARGS_EVEN_CH and ARGS_ODD_CH) to reflect this limitation.
    The device tree source (dts) files need to be updated accordingly.
    
    This issue was identified by the following commit:
    commit a725990557e7 ("arm64: dts: imx93: Fix the dmas entries order")
    
    Reverting channel orders triggered this problem.
    
    Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support")
    Signed-off-by: Frank Li <Frank.Li@nxp.com>
    Link: https://lore.kernel.org/r/20231114154824.3617255-2-Frank.Li@nxp.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
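
The even/odd restriction from the fsl-edma commit above can be sketched as a simple filter. The flag names come from the commit message, but the bit values, the function names, and the idea of picking the first free matching channel are assumptions made for the example.

```python
ARGS_EVEN_CH = 0x1  # hypothetical bit value
ARGS_ODD_CH = 0x2   # hypothetical bit value


def channel_allowed(chan_id, flags):
    """Return True if chan_id satisfies the parity required by flags."""
    if flags & ARGS_EVEN_CH and chan_id % 2 != 0:
        return False
    if flags & ARGS_ODD_CH and chan_id % 2 == 0:
        return False
    return True


def pick_channel(free_channels, flags):
    """Return the first free channel that matches flags, or None."""
    for chan_id in free_channels:
        if channel_allowed(chan_id, flags):
            return chan_id
    return None
```

A request without either flag accepts any channel, while a request carrying one of the flags skips channels of the wrong parity instead of taking the first free one.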

dmaengine: idxd: Move dma_free_coherent() out of spinlocked context [+ + +]

Author: Rex Zhang <rex.zhang@intel.com>
Date:   Tue Dec 12 10:21:58 2023 +0800

    dmaengine: idxd: Move dma_free_coherent() out of spinlocked context
    
    [ Upstream commit e271c0ba3f919c48e90c64b703538fbb7865cb63 ]
    
    A task may be rescheduled within dma_free_coherent(), so
    dma_free_coherent() must not be called between spin_lock() and
    spin_unlock(). Doing so produces the following trace:
        Call Trace:
        <TASK>
        dump_stack_lvl+0x37/0x50
        __might_resched+0x16a/0x1c0
        vunmap+0x2c/0x70
        __iommu_dma_free+0x96/0x100
        idxd_device_evl_free+0xd5/0x100 [idxd]
        device_release_driver_internal+0x197/0x200
        unbind_store+0xa1/0xb0
        kernfs_fop_write_iter+0x120/0x1c0
        vfs_write+0x2d3/0x400
        ksys_write+0x63/0xe0
        do_syscall_64+0x44/0xa0
        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    Move it out of the context.
    
    Fixes: 244da66cda35 ("dmaengine: idxd: setup event log configuration")
    Signed-off-by: Rex Zhang <rex.zhang@intel.com>
    Reviewed-by: Dave Jiang <dave.jiang@intel.com>
    Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
    Link: https://lore.kernel.org/r/20231212022158.358619-2-rex.zhang@intel.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

docs: kernel_abi.py: fix command injection [+ + +]

Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Mon Jan 1 00:59:59 2024 +0100

    docs: kernel_abi.py: fix command injection
    
    commit 3231dd5862779c2e15633c96133a53205ad660ce upstream.
    
    The kernel-abi directive passes its argument straight to the shell.
    This is unfortunate and unnecessary.
    
    Let's always use paths relative to $srctree/Documentation/ and use
    subprocess.check_call() instead of subprocess.Popen(shell=True).
    
    This also makes the code shorter.
    
    Link: https://fosstodon.org/@jani/111676532203641247
    Reported-by: Jani Nikula <jani.nikula@intel.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Signed-off-by: Jonathan Corbet <corbet@lwn.net>
    Link: https://lore.kernel.org/r/20231231235959.3342928-2-vegard.nossum@oracle.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
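
The difference between passing a string to the shell and passing an argument list can be sketched as follows. This is not the kernel_abi.py code: the function name is hypothetical, the child process is a throwaway Python one-liner that echoes its argument, and subprocess.check_output() is used in place of check_call() only so that the example can capture what the child received.

```python
import subprocess
import sys


def echo_argument(user_arg):
    """Run a child process with user_arg as one argv element.

    The command is a list, so no shell parses user_arg and
    characters such as ';' or '$(...)' have no special meaning.
    """
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_arg]
    return subprocess.check_output(cmd, text=True).rstrip("\n")
```

Calling `echo_argument("abi; echo injected")` hands the whole string to the child as a single argument. Had the same string been interpolated into a command line run with `shell=True`, the shell would have treated the part after the semicolon as a second command.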

docs: kernel_feat.py: fix potential command injection [+ + +]

Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Wed Jan 10 18:47:58 2024 +0100

    docs: kernel_feat.py: fix potential command injection
    
    [ Upstream commit c48a7c44a1d02516309015b6134c9bb982e17008 ]
    
    The kernel-feat directive passes its argument straight to the shell.
    This is unfortunate and unnecessary.
    
    Let's always use paths relative to $srctree/Documentation/ and use
    subprocess.check_call() instead of subprocess.Popen(shell=True).
    
    This also makes the code shorter.
    
    This is analogous to commit 3231dd586277 ("docs: kernel_abi.py: fix
    command injection") where we did exactly the same thing for
    kernel_abi.py, somehow I completely missed this one.
    
    Link: https://fosstodon.org/@jani/111676532203641247
    Reported-by: Jani Nikula <jani.nikula@intel.com>
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Jonathan Corbet <corbet@lwn.net>
    Link: https://lore.kernel.org/r/20240110174758.3680506-1-vegard.nossum@oracle.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

docs: sparse: add sparse.rst to toctree [+ + +]

Author: Min-Hua Chen <minhuadotchen@gmail.com>
Date:   Sat Sep 2 13:25:12 2023 +0800

    docs: sparse: add sparse.rst to toctree
    
    [ Upstream commit c9ad95adc096f25004d4192258863806a68a9bc8 ]
    
    Add sparse.rst to toctree, so it can be part of the docs build.
    
    Cc: Randy Dunlap <rdunlap@infradead.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Suggested-by: Jonathan Corbet <corbet@lwn.net>
    Signed-off-by: Min-Hua Chen <minhuadotchen@gmail.com>
    Signed-off-by: Jonathan Corbet <corbet@lwn.net>
    Link: https://lore.kernel.org/r/20230902052512.12184-4-minhuadotchen@gmail.com
    Stable-dep-of: c48a7c44a1d0 ("docs: kernel_feat.py: fix potential command injection")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

docs: sparse: move TW sparse.txt to TW dev-tools [+ + +]

Author: Min-Hua Chen <minhuadotchen@gmail.com>
Date:   Sat Sep 2 13:25:10 2023 +0800

    docs: sparse: move TW sparse.txt to TW dev-tools
    
    [ Upstream commit 253f68f413a87a4e2bd93e61b00410e5e1b7b774 ]
    
    Follow Randy's advice [1] to move
    Documentation/translations/zh_TW/sparse.txt
    to
    Documentation/translations/zh_TW/dev-tools/sparse.txt
    
    [1] https://lore.kernel.org/lkml/bfab7c5b-e4d3-d8d9-afab-f43c0cdf26cf@infradead.org/
    
    Cc: Randy Dunlap <rdunlap@infradead.org>
    Suggested-by: Randy Dunlap <rdunlap@infradead.org>
    Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
    Signed-off-by: Min-Hua Chen <minhuadotchen@gmail.com>
    Signed-off-by: Jonathan Corbet <corbet@lwn.net>
    Link: https://lore.kernel.org/r/20230902052512.12184-2-minhuadotchen@gmail.com
    Stable-dep-of: c48a7c44a1d0 ("docs: kernel_feat.py: fix potential command injection")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Align the returned error code with legacy DP [+ + +]

Author: Wayne Lin <Wayne.Lin@amd.com>
Date:   Tue Jan 2 14:20:37 2024 +0800

    drm/amd/display: Align the returned error code with legacy DP
    
    commit bfe79f5fff1300d96203383582b078c7b0aec80a upstream.
    
    [Why]
    For usb4 connector, AUX transaction is handled by dmub utilizing a different
    code path compared to legacy DP connector. If the usb4 DP connector is
    disconnected, AUX access will report EBUSY and cause igt@kms_dp_aux_dev
    to fail.
    
    [How]
    Align the error code with the one reported by legacy DP as EIO.
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Acked-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Disable PSR-SU on Parade 0803 TCON again [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Mon Jun 19 15:04:24 2023 -0500

    drm/amd/display: Disable PSR-SU on Parade 0803 TCON again
    
    commit 571c2fa26aa654946447c282a09d40a56c7ff128 upstream.
    
    When screen brightness is rapidly changed and PSR-SU is enabled the
    display hangs on panels with this TCON even on the latest DCN 3.1.4
    microcode (0x8002a81 at this time).
    
    This was disabled previously as commit 072030b17830 ("drm/amd: Disable
    PSR-SU on Parade 0803 TCON") but reverted as commit 1e66a17ce546 ("Revert
    "drm/amd: Disable PSR-SU on Parade 0803 TCON"") in favor of testing for
    a new enough microcode (commit cd2e31a9ab93 ("drm/amd/display: Set minimum
    requirement for using PSR-SU on Phoenix")).
    
    As hangs are still happening specifically with this TCON, disable PSR-SU
    again for it until it can be root caused.
    
    Cc: stable@vger.kernel.org
    Cc: aaron.ma@canonical.com
    Cc: binli@gnome.org
    Cc: Marc Rossi <Marc.Rossi@amd.com>
    Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2046131
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Harry Wentland <harry.wentland@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: fix bandwidth validation failure on DCN 2.1 [+ + +]

Author: Melissa Wen <mwen@igalia.com>
Date:   Fri Dec 29 15:25:00 2023 -0100

    drm/amd/display: fix bandwidth validation failure on DCN 2.1
    
    commit 3a0fa3bc245ef92838a8296e0055569b8dff94c4 upstream.
    
    IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of
    the test execution, during atomic check, because DC rejects the
    bandwidth state for a fb sizing 64x64. The test was previously working
    with the deprecated dc_commit_state(). Now using
    dc_validate_with_context() approach, the atomic check needs to perform a
    full state validation. Therefore, set fast_validation to false in the
    dc_validate_global_state call for atomic check.
    
    Cc: stable@vger.kernel.org
    Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams")
    Signed-off-by: Melissa Wen <mwen@igalia.com>
    Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()' [+ + +]

Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Wed Jan 10 20:58:35 2024 +0530

    drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'
    
    commit 3bb9b1f958c3d986ed90a3ff009f1e77e9553207 upstream.
    
    In link_set_dsc_pps_packet(), 'struct display_stream_compressor *dsc'
    was dereferenced in a DC_LOGGER_INIT(dsc->ctx->logger); before the 'dsc'
    NULL pointer check.
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_dpms.c:905 link_set_dsc_pps_packet() warn: variable dereferenced before check 'dsc' (see line 903)
    
    Cc: stable@vger.kernel.org
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Cc: Wenjing Liu <wenjing.liu@amd.com>
    Cc: Qingqing Zhuo <qingqing.zhuo@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
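
    The ordering problem described above can be sketched in plain C. This is
    only an illustration: the struct and function names below are
    hypothetical stand-ins, not the real DC types or the actual patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (assumption, not DC code). */
struct logger { int id; };
struct context { struct logger *logger; };
struct compressor { struct context *ctx; };

/*
 * Return the logger that belongs to a compressor, or NULL if there is none.
 * The NULL test runs first; dsc->ctx is only touched once dsc is known valid.
 */
struct logger *compressor_logger(const struct compressor *dsc)
{
	if (!dsc || !dsc->ctx)
		return NULL;

	return dsc->ctx->logger;
}
```

    Placing the check ahead of the first use is what silences the
    "dereferenced before check" class of static-checker warning.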

drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_dpcd()' functions [+ + +]

Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Wed Jan 17 08:41:52 2024 +0530

    drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_dpcd()' functions
    
    commit a58371d632ebab9ea63f10893a6b6731196b6f8d upstream.
    
    The 'status' variable in 'core_link_read_dpcd()' &
    'core_link_write_dpcd()' was uninitialized.
    
    Thus, initializing 'status' variable to 'DC_ERROR_UNEXPECTED' by default.
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:226 core_link_read_dpcd() error: uninitialized symbol 'status'.
    drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:248 core_link_write_dpcd() error: uninitialized symbol 'status'.
    
    Cc: stable@vger.kernel.org
    Cc: Jerry Zuo <jerry.zuo@amd.com>
    Cc: Jun Lei <Jun.Lei@amd.com>
    Cc: Wayne Lin <Wayne.Lin@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
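
    A minimal sketch of why a default value matters, assuming a chunked
    transfer loop that may run zero times. The enum and function names are
    hypothetical and only mirror the idea of defaulting to an "unexpected
    error" status.

```c
#include <stddef.h>

/* Hypothetical status codes (assumption; not the driver's enum). */
enum xfer_status { XFER_OK = 1, XFER_ERROR_UNEXPECTED = -1 };

/* Copy `size` bytes from src to dst in chunks of at most 16 bytes. */
enum xfer_status read_chunks(const unsigned char *src, unsigned char *dst,
			     size_t size)
{
	/* Default covers the path where the loop body never executes. */
	enum xfer_status status = XFER_ERROR_UNEXPECTED;
	size_t done = 0;

	while (done < size) {
		size_t n = size - done;

		if (n > 16)
			n = 16;
		for (size_t i = 0; i < n; i++)
			dst[done + i] = src[done + i];
		done += n;
		status = XFER_OK;
	}

	return status;
}
```

    With size == 0 the loop is skipped, so without the initializer the
    function would return an indeterminate value.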

drm/amd/display: Fix variable deferencing before NULL check in edp_setup_replay() [+ + +]

Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date:   Mon Jan 8 21:20:28 2024 +0530

    drm/amd/display: Fix variable deferencing before NULL check in edp_setup_replay()
    
    commit 7073934f5d73f8b53308963cee36f0d389ea857c upstream.
    
    In edp_setup_replay(), 'struct dc *dc' & 'struct dmub_replay *replay'
    were dereferenced before the NULL checks on the pointers 'link' & 'replay'.
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_edp_panel_control.c:947 edp_setup_replay() warn: variable dereferenced before check 'link' (see line 933)
    
    Cc: stable@vger.kernel.org
    Cc: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
    Cc: Harry Wentland <harry.wentland@amd.com>
    Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
    Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A [+ + +]

Author: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Date:   Fri Dec 15 11:01:42 2023 -0500

    drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A
    
    commit 4b56f7d47be87cde5f368b67bc7fac53a2c3e8d2 upstream.
    
    [Why]
    We can experience DENTIST hangs during optimize_bandwidth or TDRs if
    FIFO is toggled and hangs.
    
    [How]
    Port the DCN35 fixes to DCN314.
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Charlene Liu <charlene.liu@amd.com>
    Acked-by: Alex Hung <alex.hung@amd.com>
    Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs [+ + +]

Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Fri Jan 19 12:23:55 2024 -0500

    drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs
    
    [ Upstream commit 03ff6d7238b77e5fb2b85dc5fe01d2db9eb893bd ]
    
    This needs to be set to 1 to avoid a potential deadlock in
    the GC 10.x and newer.  On GC 9.x and older, this needs
    to be set to 0.  This can lead to hangs in some mixed
    graphics and compute workloads.  Updated firmware is also
    required for AQL.
    
    Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs [+ + +]

Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Fri Jan 19 12:32:59 2024 -0500

    drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs
    
    [ Upstream commit 3380fcad2c906872110d31ddf7aa1fdea57f9df6 ]
    
    This needs to be set to 1 to avoid a potential deadlock in
    the GC 10.x and newer.  On GC 9.x and older, this needs
    to be set to 0. This can lead to hangs in some mixed
    graphics and compute workloads. Updated firmware is also
    required for AQL.
    
    Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amdgpu/pm: Fix the power source flag error [+ + +]

Author: Ma Jun <Jun.Ma2@amd.com>
Date:   Wed Jan 17 14:35:29 2024 +0800

    drm/amdgpu/pm: Fix the power source flag error
    
    commit ca1ffb174f16b699c536734fc12a4162097c49f4 upstream.
    
    The power source flag should be updated when
    [1] System receives an interrupt indicating that the power source
    has changed.
    [2] System resumes from suspend or runtime suspend
    
    Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
    Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: correct the cu count for gfx v11 [+ + +]

Author: Likun Gao <Likun.Gao@amd.com>
Date:   Fri Jan 5 17:33:34 2024 +0800

    drm/amdgpu: correct the cu count for gfx v11
    
    commit f4a94dbb6dc0bed10a5fc63718d00f1de45b12c0 upstream.
    
    Correct the active CU counting algorithm to skip disabled
    SAs on gfx v11.
    
    Signed-off-by: Likun Gao <Likun.Gao@amd.com>
    Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: Fix the null pointer when load rlc firmware [+ + +]

Author: Ma Jun <Jun.Ma2@amd.com>
Date:   Fri Jan 12 13:33:24 2024 +0800

    drm/amdgpu: Fix the null pointer when load rlc firmware
    
    commit bc03c02cc1991a066b23e69bbcc0f66e8f1f7453 upstream.
    
    If the RLC firmware is invalid because of wrong header size,
    the pointer to the rlc firmware is released in function
    amdgpu_ucode_request. There will be a null pointer error
    in subsequent use. So skip validation to fix it.
    
    Fixes: 3da9b71563cb ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX10")
    Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/bridge: anx7625: Ensure bridge is suspended in disable() [+ + +]

Author: Hsin-Yi Wang <hsinyi@chromium.org>
Date:   Wed Jan 17 17:58:14 2024 -0800

    drm/bridge: anx7625: Ensure bridge is suspended in disable()
    
    [ Upstream commit 4d5b7daa3c610af3f322ad1e91fc0c752ff32f0e ]
    
    Similar to commit 26db46bc9c67 ("drm/bridge: parade-ps8640: Ensure bridge
    is suspended in .post_disable()"). Add a mutex so that an AUX transfer
    cannot race with atomic_disable by holding the PM reference and
    preventing the bridge from suspending.
    
    Also we need to use pm_runtime_put_sync_suspend() to suspend the bridge
    instead of idle with pm_runtime_put_sync().
    
    Fixes: 3203e497eb76 ("drm/bridge: anx7625: Synchronously run runtime suspend.")
    Fixes: adca62ec370c ("drm/bridge: anx7625: Support reading edid through aux channel")
    Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
    Tested-by: Xuxin Xiong <xuxinxiong@huaqin.corp-partner.google.com>
    Reviewed-by: Pin-yen Lin <treapking@chromium.org>
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240118015916.2296741-1-hsinyi@chromium.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/bridge: nxp-ptn3460: fix i2c_master_send() error checking [+ + +]

Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Mon Dec 4 15:29:00 2023 +0300

    drm/bridge: nxp-ptn3460: fix i2c_master_send() error checking
    
    commit 914437992876838662c968cb416f832110fb1093 upstream.
    
    The i2c_master_send/recv() functions return negative error codes or the
    number of bytes that were able to be sent/received.  This code has
    two problems.  1)  Instead of checking if all the bytes were sent or
    received, it checks that at least one byte was sent or received.
    2) If there was a partial send/receive then we should return a negative
    error code but this code returns success.
    
    Fixes: a9fe713d7d45 ("drm/bridge: Add PTN3460 bridge driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Reviewed-by: Robert Foss <rfoss@kernel.org>
    Signed-off-by: Robert Foss <rfoss@kernel.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/0cdc2dce-ca89-451a-9774-1482ab2f4762@moroto.mountain
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
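
    The two problems listed above can be sketched with a fake transport.
    fake_send() and struct fake_bus are hypothetical stand-ins for an I2C
    send that returns a negative error code or the number of bytes
    transferred; this is not the driver's code.

```c
#include <errno.h>

/* Hypothetical bus: transfers at most `limit` bytes, or fails with `err`. */
struct fake_bus {
	int limit;
	int err;	/* negative errno to simulate a failure, else 0 */
};

/* Returns a negative error code or the number of bytes "sent". */
int fake_send(const struct fake_bus *bus, const char *buf, int len)
{
	(void)buf;
	if (bus->err < 0)
		return bus->err;
	return len < bus->limit ? len : bus->limit;
}

/* Returns 0 only when every byte went out. */
int send_all(const struct fake_bus *bus, const char *buf, int len)
{
	int ret = fake_send(bus, buf, len);

	if (ret < 0)
		return ret;	/* propagate the transport error */
	if (ret != len)
		return -EIO;	/* a partial transfer is a failure, not success */
	return 0;
}
```

    Checking only for "at least one byte" would treat the partial case as
    success, which is the behaviour the commit message calls out.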

drm/bridge: nxp-ptn3460: simplify some error checking [+ + +]

Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Wed Dec 6 18:05:15 2023 +0300

    drm/bridge: nxp-ptn3460: simplify some error checking
    
    commit 28d3d0696688154cc04983f343011d07bf0508e4 upstream.
    
    The i2c_master_send/recv() functions return negative error codes or
    they return "len" on success.  So the error handling here can be written
    as just normal checks for "if (ret < 0) return ret;".  No need to
    complicate things.
    
    Btw, in this code the "len" parameter can never be zero, but even if
    it were, then I feel like this would still be the best way to write it.
    
    Fixes: 914437992876 ("drm/bridge: nxp-ptn3460: fix i2c_master_send() error checking")
    Suggested-by: Neil Armstrong <neil.armstrong@linaro.org>
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Reviewed-by: Robert Foss <rfoss@kernel.org>
    Signed-off-by: Robert Foss <rfoss@kernel.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/04242630-42d8-4920-8c67-24ac9db6b3c9@moroto.mountain
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/bridge: parade-ps8640: Ensure bridge is suspended in .post_disable() [+ + +]

Author: Pin-yen Lin <treapking@chromium.org>
Date:   Tue Jan 9 20:04:57 2024 +0800

    drm/bridge: parade-ps8640: Ensure bridge is suspended in .post_disable()
    
    [ Upstream commit 26db46bc9c675e43230cc6accd110110a7654299 ]
    
    The ps8640 bridge seems to expect everything to be power cycled at the
    disable process, but sometimes ps8640_aux_transfer() holds the runtime
    PM reference and prevents the bridge from suspending.
    
    Prevent that by introducing a mutex lock between ps8640_aux_transfer()
    and .post_disable() to make sure the bridge is really powered off.
    
    Fixes: 826cff3f7ebb ("drm/bridge: parade-ps8640: Enable runtime power management")
    Signed-off-by: Pin-yen Lin <treapking@chromium.org>
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240109120528.1292601-1-treapking@chromium.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/bridge: parade-ps8640: Make sure we drop the AUX mutex in the error case [+ + +]

Author: Douglas Anderson <dianders@chromium.org>
Date:   Wed Jan 17 10:35:03 2024 -0800

    drm/bridge: parade-ps8640: Make sure we drop the AUX mutex in the error case
    
    [ Upstream commit a20f1b02bafcbf5a32d96a1d4185d6981cf7d016 ]
    
    After commit 26db46bc9c67 ("drm/bridge: parade-ps8640: Ensure bridge
    is suspended in .post_disable()"), if we hit the error case in
    ps8640_aux_transfer() then we return without dropping the mutex. Fix
    this oversight.
    
    Fixes: 26db46bc9c67 ("drm/bridge: parade-ps8640: Ensure bridge is suspended in .post_disable()")
    Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240117103502.1.Ib726a0184913925efc7e99c4d4fc801982e1bc24@changeid
    Signed-off-by: Sasha Levin <sashal@kernel.org>
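
    A minimal sketch of the single-exit unlock pattern that avoids this kind
    of oversight. The toy lock below is a hypothetical stand-in for a kernel
    mutex, and aux_transfer() here is illustrative only, not the driver
    function.

```c
#include <errno.h>

/* Toy lock standing in for a mutex (assumption: single-threaded demo). */
struct toy_lock { int held; };

void toy_lock_acquire(struct toy_lock *l) { l->held = 1; }
void toy_lock_release(struct toy_lock *l) { l->held = 0; }

struct toy_lock aux_lock;

/*
 * hpd_ok models a readiness wait that can fail; nbytes is the result of a
 * successful transfer. Every path leaves through out_unlock.
 */
int aux_transfer(int hpd_ok, int nbytes)
{
	int ret;

	toy_lock_acquire(&aux_lock);

	if (!hpd_ok) {
		ret = -ETIMEDOUT;
		goto out_unlock;	/* error path still drops the lock */
	}

	ret = nbytes;

out_unlock:
	toy_lock_release(&aux_lock);
	return ret;
}
```

    An early "return ret;" in the error branch would leave the lock held and
    block the next caller.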

drm/bridge: parade-ps8640: Wait for HPD when doing an AUX transfer [+ + +]

Author: Douglas Anderson <dianders@chromium.org>
Date:   Thu Dec 21 13:55:48 2023 -0800

    drm/bridge: parade-ps8640: Wait for HPD when doing an AUX transfer
    
    [ Upstream commit 024b32db43a359e0ded3fcc6cd86247cbbed4224 ]
    
    Unlike what is claimed in commit f5aa7d46b0ee ("drm/bridge:
    parade-ps8640: Provide wait_hpd_asserted() in struct drm_dp_aux"), if
    someone manually tries to do an AUX transfer (like via `i2cdump ${bus}
    0x50 i`) while the panel is off we don't just get a simple transfer
    error. Instead, the whole ps8640 gets thrown for a loop and goes into
    a bad state.
    
    Let's put the function to wait for the HPD (and the magical 50 ms
    after first reset) back in when we're doing an AUX transfer. This
    shouldn't actually make things much slower (assuming the panel is on)
    because we should immediately poll and see the HPD high. Mostly this
    is just an extra i2c transfer to the bridge.
    
    Fixes: f5aa7d46b0ee ("drm/bridge: parade-ps8640: Provide wait_hpd_asserted() in struct drm_dp_aux")
    Tested-by: Pin-yen Lin <treapking@chromium.org>
    Reviewed-by: Pin-yen Lin <treapking@chromium.org>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231221135548.1.I10f326a9305d57ad32cee7f8d9c60518c8be20fb@changeid
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/bridge: sii902x: Fix audio codec unregistration [+ + +]

Author: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Date:   Wed Jan 3 15:31:08 2024 +0200

    drm/bridge: sii902x: Fix audio codec unregistration
    
    [ Upstream commit 3fc6c76a8d208d3955c9e64b382d0ff370bc61fc ]
    
    The driver never unregisters the audio codec platform device, which can
    lead to a crash on module reloading, nor does it handle the return value
    from sii902x_audio_codec_init().
    
    Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
    Fixes: ff5781634c41 ("drm/bridge: sii902x: Implement HDMI audio support")
    Cc: Jyri Sarha <jsarha@ti.com>
    Acked-by: Linus Walleij <linus.walleij@linaro.org>
    Link: https://lore.kernel.org/r/20240103-si902x-fixes-v1-2-b9fd3e448411@ideasonboard.com
    Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240103-si902x-fixes-v1-2-b9fd3e448411@ideasonboard.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/bridge: sii902x: Fix probing race issue [+ + +]

Author: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Date:   Wed Jan 3 15:31:07 2024 +0200

    drm/bridge: sii902x: Fix probing race issue
    
    [ Upstream commit 08ac6f132dd77e40f786d8af51140c96c6d739c9 ]
    
    A null pointer dereference crash has been observed rarely on TI
    platforms using sii9022 bridge:
    
    [   53.271356]  sii902x_get_edid+0x34/0x70 [sii902x]
    [   53.276066]  sii902x_bridge_get_edid+0x14/0x20 [sii902x]
    [   53.281381]  drm_bridge_get_edid+0x20/0x34 [drm]
    [   53.286305]  drm_bridge_connector_get_modes+0x8c/0xcc [drm_kms_helper]
    [   53.292955]  drm_helper_probe_single_connector_modes+0x190/0x538 [drm_kms_helper]
    [   53.300510]  drm_client_modeset_probe+0x1f0/0xbd4 [drm]
    [   53.305958]  __drm_fb_helper_initial_config_and_unlock+0x50/0x510 [drm_kms_helper]
    [   53.313611]  drm_fb_helper_initial_config+0x48/0x58 [drm_kms_helper]
    [   53.320039]  drm_fbdev_dma_client_hotplug+0x84/0xd4 [drm_dma_helper]
    [   53.326401]  drm_client_register+0x5c/0xa0 [drm]
    [   53.331216]  drm_fbdev_dma_setup+0xc8/0x13c [drm_dma_helper]
    [   53.336881]  tidss_probe+0x128/0x264 [tidss]
    [   53.341174]  platform_probe+0x68/0xc4
    [   53.344841]  really_probe+0x188/0x3c4
    [   53.348501]  __driver_probe_device+0x7c/0x16c
    [   53.352854]  driver_probe_device+0x3c/0x10c
    [   53.357033]  __device_attach_driver+0xbc/0x158
    [   53.361472]  bus_for_each_drv+0x88/0xe8
    [   53.365303]  __device_attach+0xa0/0x1b4
    [   53.369135]  device_initial_probe+0x14/0x20
    [   53.373314]  bus_probe_device+0xb0/0xb4
    [   53.377145]  deferred_probe_work_func+0xcc/0x124
    [   53.381757]  process_one_work+0x1f0/0x518
    [   53.385770]  worker_thread+0x1e8/0x3dc
    [   53.389519]  kthread+0x11c/0x120
    [   53.392750]  ret_from_fork+0x10/0x20
    
    The issue here is as follows:
    
    - tidss probes, but is deferred as sii902x is still missing.
    - sii902x starts probing and enters sii902x_init().
    - sii902x calls drm_bridge_add(). Now the sii902x bridge is ready from
      DRM's perspective.
    - sii902x calls sii902x_audio_codec_init() and
      platform_device_register_data()
    - The registration of the audio platform device causes probing of the
      deferred devices.
    - tidss probes, which eventually causes sii902x_bridge_get_edid() to be
      called.
    - sii902x_bridge_get_edid() tries to use the i2c to read the edid.
      However, the sii902x driver has not set up the i2c part yet, leading
      to the crash.
    
    Fix this by moving the drm_bridge_add() to the end of the
    sii902x_init(), which is also at the very end of sii902x_probe().
    
    Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
    Fixes: 21d808405fe4 ("drm/bridge/sii902x: Fix EDID readback")
    Acked-by: Linus Walleij <linus.walleij@linaro.org>
    Link: https://lore.kernel.org/r/20240103-si902x-fixes-v1-1-b9fd3e448411@ideasonboard.com
    Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240103-si902x-fixes-v1-1-b9fd3e448411@ideasonboard.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
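
    The sequence above comes down to publishing an object before it is fully
    set up. A rough sketch of the fixed ordering, with hypothetical names
    (bridge_add(), get_edid(), consumer_probe()) standing in for the DRM and
    driver functions:

```c
#include <stddef.h>

/* Hypothetical bridge: `i2c` stands for the channel set up late in probe. */
struct bridge { const char *i2c; };

struct bridge *registered;	/* what consumers can see */

void bridge_add(struct bridge *b) { registered = b; }

/* Consumer side: returns NULL while no bridge is visible (caller defers). */
const char *get_edid(void)
{
	if (!registered)
		return NULL;
	return registered->i2c;
}

const char *seen_by_consumer = "unset";

/* Models a deferred consumer that gets probed in the middle of our init. */
void consumer_probe(void) { seen_by_consumer = get_edid(); }

void bridge_init(struct bridge *b, void (*mid_probe_hook)(void))
{
	/* Registering other devices here may trigger deferred probes. */
	if (mid_probe_hook)
		mid_probe_hook();

	b->i2c = "edid-channel";

	/* Last step: only a fully initialised bridge becomes visible. */
	bridge_add(b);
}
```

    Because bridge_add() runs last, a consumer probed mid-init simply finds
    no bridge and defers, instead of using a half-initialised one.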

drm/exynos: fix accidental on-stack copy of exynos_drm_plane [+ + +]

Author: Arnd Bergmann <arnd@arndb.de>
Date:   Thu Dec 14 13:32:15 2023 +0100

    drm/exynos: fix accidental on-stack copy of exynos_drm_plane
    
    [ Upstream commit 960b537e91725bcb17dd1b19e48950e62d134078 ]
    
    gcc rightfully complains about excessive stack usage in the fimd_win_set_pixfmt()
    function:
    
    drivers/gpu/drm/exynos/exynos_drm_fimd.c: In function 'fimd_win_set_pixfmt':
    drivers/gpu/drm/exynos/exynos_drm_fimd.c:750:1: error: the frame size of 1032 bytes is larger than 1024 byte
    drivers/gpu/drm/exynos/exynos5433_drm_decon.c: In function 'decon_win_set_pixfmt':
    drivers/gpu/drm/exynos/exynos5433_drm_decon.c:381:1: error: the frame size of 1032 bytes is larger than 1024 bytes
    
    There is really no reason to copy the large exynos_drm_plane
    structure to the stack before using one of its members, so just
    use a pointer instead.
    
    Fixes: 6f8ee5c21722 ("drm/exynos: fimd: Make plane alpha configurable")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
    Signed-off-by: Inki Dae <inki.dae@samsung.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
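
    A small sketch of the difference between copying a large structure and
    pointing at it. struct big_plane and struct plane_ctx are hypothetical;
    their sizes are chosen only to make the stack cost visible.

```c
/* Hypothetical large per-plane structure (about 1 KiB). */
struct big_plane {
	unsigned char config[1000];
	unsigned int index;
};

struct plane_ctx {
	struct big_plane planes[5];
};

unsigned int plane_index(const struct plane_ctx *ctx, unsigned int win)
{
	/*
	 * Pointer into the array: costs one pointer of stack space.
	 * `struct big_plane plane = ctx->planes[win];` would instead copy
	 * the whole structure into this function's stack frame.
	 */
	const struct big_plane *plane = &ctx->planes[win];

	return plane->index;
}
```

    The result is the same either way; only the frame size changes, which is
    what the compiler warning is about.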

drm/exynos: gsc: minor fix for loop iteration in gsc_runtime_resume [+ + +]

Author: Fedor Pchelkin <pchelkin@ispras.ru>
Date:   Wed Dec 20 12:53:15 2023 +0300

    drm/exynos: gsc: minor fix for loop iteration in gsc_runtime_resume
    
    [ Upstream commit 4050957c7c2c14aa795dbf423b4180d5ac04e113 ]
    
    Do not forget to call clk_disable_unprepare() on the first element of
    ctx->clocks array.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    Fixes: 8b7d3ec83aba ("drm/exynos: gsc: Convert driver to IPP v2 core API")
    Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
    Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
    Signed-off-by: Inki Dae <inki.dae@samsung.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
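
    The off-by-one in an error unwind loop can be sketched as follows.
    struct fake_clk, clk_on() and clk_off() are hypothetical stand-ins for
    the clock API; the point is the loop condition.

```c
/* Hypothetical clock: `fail` makes enabling it return an error. */
struct fake_clk {
	int enabled;
	int fail;
};

int clk_on(struct fake_clk *c)
{
	if (c->fail)
		return -1;
	c->enabled = 1;
	return 0;
}

void clk_off(struct fake_clk *c) { c->enabled = 0; }

/* Enable n clocks; on failure, disable everything enabled so far. */
int enable_all(struct fake_clk *clocks, int n)
{
	for (int i = 0; i < n; i++) {
		int ret = clk_on(&clocks[i]);

		if (ret) {
			/* `--i >= 0` reaches index 0; `--i > 0` would skip it. */
			while (--i >= 0)
				clk_off(&clocks[i]);
			return ret;
		}
	}

	return 0;
}
```

    With the stricter condition the first element of the array stays enabled
    after a failed resume.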

drm/i915/lnl: Remove watchdog timers for PSR [+ + +]

Author: Mika Kahola <mika.kahola@intel.com>
Date:   Tue Oct 10 12:52:33 2023 +0300

    drm/i915/lnl: Remove watchdog timers for PSR
    
    [ Upstream commit a2cd15c2411624a7a97bad60d98d7e0a1e5002a6 ]
    
    Watchdog timers for PSR/PSR2 were removed from Lunar Lake hardware.
    This patch removes the use of these timers from the driver code.
    
    BSpec: 69895
    
    v2: Reword commit message (Ville)
        Drop HPD mask from LNL (Ville)
        Revise masking logic (Jouni)
    v3: Revise commit message (Ville)
        Revert HPD mask removal as irrelevant for this patch (Ville)
    
    Signed-off-by: Mika Kahola <mika.kahola@intel.com>
    Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
    Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231010095233.590613-1-mika.kahola@intel.com
    Stable-dep-of: f9f031dd21a7 ("drm/i915/psr: Only allow PSR in LPSP mode on HSW non-ULT")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/psr: Only allow PSR in LPSP mode on HSW non-ULT [+ + +]

Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Thu Jan 18 23:21:31 2024 +0200

    drm/i915/psr: Only allow PSR in LPSP mode on HSW non-ULT
    
    [ Upstream commit f9f031dd21a7ce13a13862fa5281d32e1029c70f ]
    
    On HSW non-ULT (or at least on Dell Latitude E6540) external displays
    start to flicker when we enable PSR on the eDP. We observe a much higher
    SR and PC6 residency than should be possible with an external display,
    and indeed much higher than what we observe with eDP disabled and
    only the external display enabled. Looks like the hardware is somehow
    ignoring the fact that the external display is active during PSR.
    
    I wasn't able to reproduce this on my HSW ULT machine, or BDW.
    So either there's something specific about this particular laptop
    (eg. some unknown firmware thing) or the issue is limited to just
    non-ULT HSW systems. All known registers that could affect this
    look perfectly reasonable on the affected machine.
    
    As a workaround let's unmask the LPSP event to prevent PSR entry
    except while in LPSP mode (only pipe A + eDP active). This
    will prevent PSR entry entirely when multiple pipes are active.
    The one slight downside is that we now also prevent PSR entry
    when driving eDP with pipe B or C, but I think that's a reasonable
    tradeoff to avoid having to implement a more complex workaround.
    
    Cc: stable@vger.kernel.org
    Fixes: 783d8b80871f ("drm/i915/psr: Re-enable PSR1 on hsw/bdw")
    Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10092
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240118212131.31868-1-ville.syrjala@linux.intel.com
    Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
    (cherry picked from commit 94501c3ca6400e463ff6cc0c9cf4a2feb6a9205d)
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/panel-edp: Add AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49 V8.0 [+ + +]

Author: Sheng-Liang Pan <sheng-liang.pan@quanta.corp-partner.google.com>
Date:   Fri Oct 27 11:04:56 2023 +0800

    drm/panel-edp: Add AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49 V8.0
    
    [ Upstream commit 3db2420422a5912d97966e0176050bb0fc9aa63e ]
    
    Add panel identification entry for
    - AUO B116XTN02 family (product ID:0x235c)
    - BOE NT116WHM-N21,836X2 (product ID:0x09c3)
    - BOE NV116WHM-N49 V8.0 (product ID:0x0979)
    
    Signed-off-by: Sheng-Liang Pan <sheng-liang.pan@quanta.corp-partner.google.com>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231027110435.1.Ia01fe9ec1c0953e0050a232eaa782fef2c037516@changeid
    Stable-dep-of: fc6e76792965 ("drm/panel-edp: drm/panel-edp: Fix AUO B116XAK01 name and timing")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/panel-edp: drm/panel-edp: Fix AUO B116XAK01 name and timing [+ + +]

Author: Hsin-Yi Wang <hsinyi@chromium.org>
Date:   Tue Nov 7 12:41:51 2023 -0800

    drm/panel-edp: drm/panel-edp: Fix AUO B116XAK01 name and timing
    
    [ Upstream commit fc6e7679296530106ee0954e8ddef1aa58b2e0b5 ]
    
    Rename AUO 0x405c B116XAK01 to B116XAK01.0 and adjust the timing of
    auo_b116xak01: T3=200, T12=500, T7_max = 50 according to decoding edid
    and datasheet.
    
    Fixes: da458286a5e2 ("drm/panel: Add support for AUO B116XAK01 panel")
    Cc: stable@vger.kernel.org
    Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Acked-by: Maxime Ripard <mripard@kernel.org>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231107204611.3082200-2-hsinyi@chromium.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/panel-edp: drm/panel-edp: Fix AUO B116XTN02 name [+ + +]

Author: Hsin-Yi Wang <hsinyi@chromium.org>
Date:   Tue Nov 7 12:41:52 2023 -0800

    drm/panel-edp: drm/panel-edp: Fix AUO B116XTN02 name
    
    [ Upstream commit 962845c090c4f85fa4f6872a5b6c89ee61f53cc0 ]
    
    Rename AUO 0x235c B116XTN02 to B116XTN02.3 according to decoding edid.
    
    Fixes: 3db2420422a5 ("drm/panel-edp: Add AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49 V8.0")
    Cc: stable@vger.kernel.org
    Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Acked-by: Maxime Ripard <mripard@kernel.org>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231107204611.3082200-3-hsinyi@chromium.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/panel: samsung-s6d7aa0: drop DRM_BUS_FLAG_DE_HIGH for lsl080al02 [+ + +]

Author: Artur Weber <aweber.kernel@gmail.com>
Date:   Fri Jan 5 07:53:02 2024 +0100

    drm/panel: samsung-s6d7aa0: drop DRM_BUS_FLAG_DE_HIGH for lsl080al02
    
    [ Upstream commit 62b143b5ec4a14e1ae0dede5aabaf1832e3b0073 ]
    
    It turns out that I had misconfigured the device I was using the panel
    with; the bus data polarity is not high for this panel, I had to change
    the config on the display controller's side.
    
    Fix the panel config to properly reflect its accurate settings.
    
    Fixes: 6810bb390282 ("drm/panel: Add Samsung S6D7AA0 panel controller driver")
    Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com>
    Signed-off-by: Artur Weber <aweber.kernel@gmail.com>
    Link: https://lore.kernel.org/r/20240105-tab3-display-fixes-v2-2-904d1207bf6f@gmail.com
    Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240105-tab3-display-fixes-v2-2-904d1207bf6f@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/tidss: Fix atomic_flush check [+ + +]

Author: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Date:   Thu Nov 9 09:38:03 2023 +0200

    drm/tidss: Fix atomic_flush check
    
    commit 95d4b471953411854f9c80b568da7fcf753f3801 upstream.
    
    tidss_crtc_atomic_flush() checks if the crtc is enabled, and if not,
    returns immediately as there's no reason to do any register changes.
    
    However, the code checks for 'crtc->state->enable', which does not
    reflect the actual HW state. We should instead look at the
    'crtc->state->active' flag.
    
    This causes the tidss_crtc_atomic_flush() to proceed with the flush even
    if the active state is false, which then causes us to hit the
    WARN_ON(!crtc->state->event) check.
    
    Fix this by checking the active flag, and while at it, fix the related
    debug print which had "active" and "needs modeset" the wrong way around.
    
    Cc:  <stable@vger.kernel.org>
    Fixes: 32a1795f57ee ("drm/tidss: New driver for TI Keystone platform Display SubSystem")
    Reviewed-by: Aradhya Bhatia <a-bhatia1@ti.com>
    Link: https://lore.kernel.org/r/20231109-tidss-probe-v2-10-ac91b5ea35c0@ideasonboard.com
    Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/virtio: Disable damage clipping if FB changed since last page-flip [+ + +]

Author: Javier Martinez Canillas <javierm@redhat.com>
Date:   Thu Nov 23 23:13:01 2023 +0100

    drm/virtio: Disable damage clipping if FB changed since last page-flip
    
    commit 0240db231dfe5ee5b7a3a03cba96f0844b7a673d upstream.
    
    The driver does per-buffer uploads and needs to force a full plane update
    if the plane's attached framebuffer has changed since the last page-flip.
    
    Fixes: 01f05940a9a7 ("drm/virtio: Enable fb damage clips property for the primary plane")
    Cc: <stable@vger.kernel.org> # v6.4+
    Reported-by: nerdopolis <bluescreen_avenger@verizon.net>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218115
    Suggested-by: Sima Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
    Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Zack Rusin <zackr@vmware.com>
    Acked-by: Sima Vetter <daniel.vetter@ffwll.ch>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231123221315.3579454-3-javierm@redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm: Allow drivers to indicate the damage helpers to ignore damage clips [+ + +]

Author: Javier Martinez Canillas <javierm@redhat.com>
Date:   Thu Nov 23 23:13:00 2023 +0100

    drm: Allow drivers to indicate the damage helpers to ignore damage clips
    
    commit 35ed38d58257336c1df26b14fd5110b026e2adde upstream.
    
    It allows drivers to set a struct drm_plane_state .ignore_damage_clips in
    their plane's .atomic_check callback, as an indication to damage helpers
    such as drm_atomic_helper_damage_iter_init() that the damage clips should
    be ignored.
    
    To be used by drivers that do per-buffer (e.g: virtio-gpu) uploads (rather
    than per-plane uploads), since this type of driver needs to handle buffer
    damage instead of frame damage.
    
    That way, these drivers could force a full plane update if the framebuffer
    attached to a plane's state has changed since the last update (page-flip).
    
    Fixes: 01f05940a9a7 ("drm/virtio: Enable fb damage clips property for the primary plane")
    Cc: <stable@vger.kernel.org> # v6.4+
    Reported-by: nerdopolis <bluescreen_avenger@verizon.net>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218115
    Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
    Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
    Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Zack Rusin <zackr@vmware.com>
    Acked-by: Sima Vetter <daniel.vetter@ffwll.ch>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231123221315.3579454-2-javierm@redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
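
    The mechanism can be sketched roughly as below. The structures are
    heavily reduced, hypothetical versions of plane state (a single optional
    clip rectangle instead of a clip list), and damage_to_upload() is an
    illustrative helper rather than the real damage iterator.

```c
#include <stdbool.h>
#include <stddef.h>

struct framebuffer { int id; };
struct rect { int x1, y1, x2, y2; };

/* Hypothetical, reduced plane state. */
struct plane_state {
	const struct framebuffer *fb;
	int width, height;
	const struct rect *clip;	/* optional damage clip from userspace */
	bool ignore_damage_clips;
};

/* Driver side: request a full update when the attached buffer changed. */
void plane_atomic_check(const struct plane_state *old_state,
			struct plane_state *new_state)
{
	if (old_state->fb != new_state->fb)
		new_state->ignore_damage_clips = true;
}

/* Helper side: honour the flag when deciding what to upload. */
struct rect damage_to_upload(const struct plane_state *state)
{
	struct rect full = { 0, 0, state->width, state->height };

	if (state->ignore_damage_clips || !state->clip)
		return full;

	return *state->clip;
}
```

    A clip describes what changed relative to the previous frame, so it says
    nothing useful about a different buffer whose contents may be older.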

drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE [+ + +]

Author: Michael Walle <mwalle@kernel.org>
Date:   Mon Nov 13 17:43:44 2023 +0100

    drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
    
    [ Upstream commit ff3d5d04db07e5374758baa7e877fde8d683ebab ]
    
    The FORCE_STOP_STATE bit is unsuitable to force the DSI link into LP-11
    mode. It seems the bridge internally queues DSI packets and when the
    FORCE_STOP_STATE bit is cleared, they are sent in close succession
    without any useful timing (this also means that the DSI lanes won't go
    into LP-11 mode). The length of this gibberish varies between 1ms and
    5ms. This sometimes breaks an attached bridge (TI SN65DSI84 in this
    case). In our case, the bridge will fail in about 1 per 500 reboots.
    
    The FORCE_STOP_STATE handling was introduced to have the DSI lanes in
    LP-11 state during the .pre_enable phase. But as it turns out, none of
    this is needed at all. Between samsung_dsim_init() and
    samsung_dsim_set_display_enable() the lanes are already in LP-11 mode.
    The code as it was before commit 20c827683de0 ("drm: bridge:
    samsung-dsim: Fix init during host transfer") and 0c14d3130654 ("drm:
    bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec") was correct
    in this regard.
    
    This patch basically reverts both commits. It was tested on an i.MX8M
    SoC with an SN65DSI84 bridge. The signals were probed and the DSI
    packets were decoded during initialization and link start-up. After this
    patch the first DSI packet on the link is a VSYNC packet and the timing
    is correct.
    
    Command mode between .pre_enable and .enable was also briefly tested by
    a quick hack. There was no DSI link partner which would have responded,
    but it was made sure the DSI packet was sent on the link. As a side
    note, the command mode seems to just work in HS mode. I couldn't find
    that the bridge will handle commands in LP mode.
    
    Fixes: 20c827683de0 ("drm: bridge: samsung-dsim: Fix init during host transfer")
    Fixes: 0c14d3130654 ("drm: bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec")
    Signed-off-by: Michael Walle <mwalle@kernel.org>
    Signed-off-by: Inki Dae <inki.dae@samsung.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231113164344.1612602-1-mwalle@kernel.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm: Disable the cursor plane on atomic contexts with virtualized drivers [+ + +]

Author: Zack Rusin <zack.rusin@broadcom.com>
Date:   Mon Oct 23 09:46:05 2023 +0200

    drm: Disable the cursor plane on atomic contexts with virtualized drivers
    
    commit 4e3b70da64a53784683cfcbac2deda5d6e540407 upstream.
    
    Cursor planes on virtualized drivers have special meaning and require
    that the clients handle them in specific ways, e.g. the cursor plane
    should react to the mouse movement the way a mouse cursor would be
    expected to and the client is required to set hotspot properties on it
    in order for the mouse events to be routed correctly.
    
    This breaks the contract as specified by the "universal planes". Fix it
    by disabling the cursor planes on virtualized drivers while adding
    a foundation on top of which it's possible to special case mouse cursor
    planes for clients that want it.
    
    Disabling the cursor planes makes some kms compositors which were broken,
    e.g. Weston, fall back to software cursor which works fine or at least
    better than currently while having no effect on others, e.g. gnome-shell
    or kwin, which put virtualized drivers on a deny-list when running in
    atomic context to make them fall back to legacy kms and avoid this issue.
    
    Signed-off-by: Zack Rusin <zackr@vmware.com>
    Fixes: 681e7ec73044 ("drm: Allow userspace to ask for universal plane list (v2)")
    Cc: <stable@vger.kernel.org> # v5.4+
    Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Cc: Maxime Ripard <mripard@kernel.org>
    Cc: Thomas Zimmermann <tzimmermann@suse.de>
    Cc: David Airlie <airlied@linux.ie>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: Dave Airlie <airlied@redhat.com>
    Cc: Gerd Hoffmann <kraxel@redhat.com>
    Cc: Hans de Goede <hdegoede@redhat.com>
    Cc: Gurchetan Singh <gurchetansingh@chromium.org>
    Cc: Chia-I Wu <olvaffe@gmail.com>
    Cc: dri-devel@lists.freedesktop.org
    Cc: virtualization@lists.linux-foundation.org
    Cc: spice-devel@lists.freedesktop.org
    Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
    Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
    Acked-by: Simon Ser <contact@emersion.fr>
    Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231023074613.41327-2-aesteve@redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm: Don't unref the same fb many times by mistake due to deadlock handling [+ + +]

Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Mon Dec 11 10:16:24 2023 +0200

    drm: Don't unref the same fb many times by mistake due to deadlock handling
    
    commit cb4daf271302d71a6b9a7c01bd0b6d76febd8f0c upstream.
    
    If we get a deadlock after the fb lookup in drm_mode_page_flip_ioctl()
    we proceed to unref the fb and then retry the whole thing from the top.
    But we forget to reset the fb pointer back to NULL, and so if we then
    get another error during the retry, before the fb lookup, we proceed
    to unref the same fb again without having gotten another reference.
    The end result is that the fb will (eventually) end up being freed
    while it's still in use.
    
    Reset fb to NULL once we've unreffed it to avoid doing it again
    until we've done another fb lookup.
    
    This turned out to be pretty easy to hit on a DG2 when doing async
    flips (and CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y). The first symptom I
    saw was that drm_closefb() simply got stuck in a busy loop while walking
    the framebuffer list. Fortunately I was able to convince it to oops
    instead, and from there it was easier to track down the culprit.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231211081625.25704-1-ville.syrjala@linux.intel.com
    Acked-by: Javier Martinez Canillas <javierm@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
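
    The double unref described in the commit above can be reproduced in a
    small userspace sketch. Everything here (struct toy_fb, toy_page_flip(),
    the TOY_* error codes and the reset_fb switch) is hypothetical and only
    models the retry flow from the description; it is not the real
    drm_mode_page_flip_ioctl().

```c
#include <stddef.h>

/* Toy stand-in for a refcounted framebuffer (not struct drm_framebuffer). */
struct toy_fb {
	int refcount;
};

#define TOY_EDEADLK (-2)	/* hypothetical "deadlock, retry" code */
#define TOY_EINVAL  (-1)	/* hypothetical early failure code */

/*
 * Model of the retry loop: early_error[i] makes attempt i fail before the
 * fb lookup, deadlock[i] makes it hit a deadlock after the lookup.
 * reset_fb selects the fixed behaviour (1) or the buggy one (0).
 */
static int toy_page_flip(struct toy_fb *obj, const int *early_error,
			 const int *deadlock, int attempts, int reset_fb)
{
	struct toy_fb *fb = NULL;
	int i;

	for (i = 0; i < attempts; i++) {
		int ret = 0;

		if (early_error[i]) {
			ret = TOY_EINVAL;
			goto out;
		}

		/* The lookup takes a reference. */
		obj->refcount++;
		fb = obj;

		if (deadlock[i])
			ret = TOY_EDEADLK;
out:
		if (fb) {
			fb->refcount--;		/* drop the lookup reference */
			if (reset_fb)
				fb = NULL;	/* the fix: forget the pointer */
		}
		if (ret != TOY_EDEADLK)
			return ret;
		/* Deadlock: back off and retry from the top. */
	}
	return TOY_EDEADLK;
}
```

    Starting from a refcount of 1, a deadlock on the first attempt followed
    by an early error on the retry leaves the count at 1 with reset_fb set,
    and drops it to 0 (a premature free in the real kernel) without it.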

drm: Fix TODO list mentioning non-KMS drivers [+ + +]

Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Wed Nov 22 13:09:31 2023 +0100

    drm: Fix TODO list mentioning non-KMS drivers
    
    commit 9cf5ca1f485cae406968947a92bf304603999fa1 upstream.
    
    Non-KMS drivers have been removed from DRM. Update the TODO list
    accordingly.
    
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Fixes: a276afc19eec ("drm: Remove some obsolete drm pciids(tdfx, mga, i810, savage, r128, sis, via)")
    Cc: Cai Huoqing <cai.huoqing@linux.dev>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Dave Airlie <airlied@redhat.com>
    Cc: Thomas Zimmermann <tzimmermann@suse.de>
    Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Cc: Maxime Ripard <mripard@kernel.org>
    Cc: David Airlie <airlied@gmail.com>
    Cc: Daniel Vetter <daniel@ffwll.ch>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: dri-devel@lists.freedesktop.org
    Cc: <stable@vger.kernel.org> # v6.3+
    Cc: linux-doc@vger.kernel.org
    Reviewed-by: David Airlie <airlied@gmail.com>
    Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231122122449.11588-3-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm: panel-simple: add missing bus flags for Tianma tm070jvhg[30/33] [+ + +]

Author: Markus Niebel <Markus.Niebel@ew.tq-group.com>
Date:   Thu Oct 12 10:42:08 2023 +0200

    drm: panel-simple: add missing bus flags for Tianma tm070jvhg[30/33]
    
    [ Upstream commit 45dd7df26cee741b31c25ffdd44fb8794eb45ccd ]
    
    The DE signal is active high on this display, fill in the missing
    bus_flags. This aligns panel_desc with its display_timing.
    
    Fixes: 9a2654c0f62a ("drm/panel: Add and fill drm_panel type field")
    Fixes: b3bfcdf8a3b6 ("drm/panel: simple: add Tianma TM070JVHG33")
    
    Signed-off-by: Markus Niebel <Markus.Niebel@ew.tq-group.com>
    Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
    Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
    Link: https://lore.kernel.org/r/20231012084208.2731650-1-alexander.stein@ew.tq-group.com
    Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231012084208.2731650-1-alexander.stein@ew.tq-group.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dt-bindings: net: snps,dwmac: Tx coe unsupported [+ + +]

Author: Rohan G Thomas <rohan.g.thomas@intel.com>
Date:   Sat Sep 16 14:33:11 2023 +0800

    dt-bindings: net: snps,dwmac: Tx coe unsupported
    
    commit 6fb8c20a04be234cf1cfd4bdd8cfb8860c9d2d3b upstream.
    
    Add dt-bindings for coe-unsupported property per tx queue. Some DWMAC
    IPs support tx checksum offloading(coe) only for a few tx queues.
    
    DW xGMAC IP can be synthesized such that it can support tx coe only
    for a few initial tx queues. Also as Serge pointed out, for the DW
    QoS IP tx coe can be individually configured for each tx queue. This
    property is added to have sw fallback for checksum calculation if a
    tx queue doesn't support tx coe.
    
    Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com>
    Acked-by: Conor Dooley <conor.dooley@microchip.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

efi: disable mirror feature during crashkernel [+ + +]

Author: Ma Wupeng <mawupeng1@huawei.com>
Date:   Tue Jan 9 12:15:36 2024 +0800

    efi: disable mirror feature during crashkernel
    
    commit 7ea6ec4c25294e8bc8788148ef854df92ee8dc5e upstream.
    
    If the system has no mirrored memory or uses crashkernel.high while
    kernelcore=mirror is enabled on the command line then during crashkernel,
    there will be limited mirrored memory and this usually leads to OOM.
    
    To solve this problem, disable the mirror feature during crashkernel.
    
    Link: https://lkml.kernel.org/r/20240109041536.3903042-1-mawupeng1@huawei.com
    Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
    Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

erofs: fix lz4 inplace decompression [+ + +]

Author: Gao Xiang <xiang@kernel.org>
Date:   Wed Dec 6 12:55:34 2023 +0800

    erofs: fix lz4 inplace decompression
    
    commit 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de upstream.
    
    Currently EROFS can map another compressed buffer for inplace
    decompression, that was used to handle the cases that some pages of
    compressed data are actually not in-place I/O.
    
    However, like most simple LZ77 algorithms, LZ4 expects the compressed
    data to be arranged at the end of the decompressed buffer and it
    explicitly uses memmove() to handle overlapping:
      __________________________________________________________
     |_ direction of decompression --> ____ |_ compressed data _|
    
    Although EROFS arranges compressed data like this, it typically maps two
    individual virtual buffers so the relative order is uncertain.
    Previously, it was hardly observed since LZ4 only uses memmove() for
    short overlapped literals and x86/arm64 memmove implementations seem to
    completely cover it up and they don't have this issue.  Juhyung reported
    that EROFS data corruption can be found on a new Intel x86 processor.
    After some analysis, it seems that recent x86 processors with the new
    FSRM feature expose this issue with "rep movsb".
    
    Let's strictly use the decompressed buffer for lz4 inplace
    decompression for now.  Later, as a useful improvement, we could try
    to tie up these two buffers together in the correct order.
    
    Reported-and-tested-by: Juhyung Park <qkrwngud825@gmail.com>
    Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR2Q@mail.gmail.com
    Fixes: 0ffd71bcc3a0 ("staging: erofs: introduce LZ4 decompression inplace")
    Fixes: 598162d05080 ("erofs: support decompress big pcluster for lz4 backend")
    Cc: stable <stable@vger.kernel.org> # 5.4+
    Tested-by: Yifan Zhao <zhaoyifan@sjtu.edu.cn>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

exec: Fix error handling in begin_new_exec() [+ + +]

Author: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date:   Mon Jan 22 19:34:21 2024 +0100

    exec: Fix error handling in begin_new_exec()
    
    commit 84c39ec57d409e803a9bb6e4e85daf1243e0e80b upstream.
    
    If get_unused_fd_flags() fails, the error handling is incomplete because
    bprm->cred is already set to NULL, and therefore free_bprm will not
    unlock the cred_guard_mutex. Note there are two error conditions which
    end up here, one before and one after bprm->cred is cleared.
    
    Fixes: b8a61c9e7b4a ("exec: Generic execfd support")
    Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    Acked-by: Eric W. Biederman <ebiederm@xmission.com>
    Link: https://lore.kernel.org/r/AS8P193MB128517ADB5EFF29E04389EDAE4752@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM
    Cc: stable@vger.kernel.org
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: allow for the last group to be marked as trimmed [+ + +]

Author: Suraj Jitindar Singh <surajjs@amazon.com>
Date:   Wed Dec 13 16:16:35 2023 +1100

    ext4: allow for the last group to be marked as trimmed
    
    commit 7c784d624819acbeefb0018bac89e632467cca5a upstream.
    
    The ext4 filesystem tracks the trim status of blocks at the group
    level.  When an entire group has been trimmed then it is marked as
    such and subsequent trim invocations with the same minimum trim size
    will not be attempted on that group unless it is marked as able to be
    trimmed again such as when a block is freed.
    
    Currently the last group can't be marked as trimmed due to incorrect
    logic in ext4_last_grp_cluster(). ext4_last_grp_cluster() is supposed
    to return the zero based index of the last cluster in a group. This is
    then used by ext4_try_to_trim_range() to determine if the trim
    operation spans the entire group and as such if the trim status of the
    group should be recorded.
    
    ext4_last_grp_cluster() takes a 0 based group index, thus the valid
    values for grp are 0..(ext4_get_groups_count - 1). Any group index
    less than (ext4_get_groups_count - 1) is not the last group and must
    have EXT4_CLUSTERS_PER_GROUP(sb) clusters. For the last group we need
    to calculate the number of clusters based on the number of blocks in
    the group. Finally subtract 1 from the number of clusters as zero
    based indexing is expected.  Rearrange the function slightly to make
    it clear what we are calculating and returning.
    
    Reproducer:
    // Create file system where the last group has fewer blocks than
    // blocks per group
    $ mkfs.ext4 -b 4096 -g 8192 /dev/nvme0n1 8191
    $ mount /dev/nvme0n1 /mnt
    
    Before Patch:
    $ fstrim -v /mnt
    /mnt: 25.9 MiB (27156480 bytes) trimmed
    // Group not marked as trimmed so second invocation still discards blocks
    $ fstrim -v /mnt
    /mnt: 25.9 MiB (27156480 bytes) trimmed
    
    After Patch:
    $ fstrim -v /mnt
    /mnt: 25.9 MiB (27156480 bytes) trimmed
    // Group marked as trimmed so second invocation DOESN'T discard any blocks
    $ fstrim -v /mnt
    /mnt: 0 B (0 bytes) trimmed
    
    Fixes: 45e4ab320c9b ("ext4: move setting of trimmed bit into ext4_try_to_trim_range()")
    Cc:  <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20231213051635.37731-1-surajjs@amazon.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
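
    The calculation described in the commit above can be sketched as a
    standalone function. This is a simplified model, not the kernel's
    ext4_last_grp_cluster(): it assumes one block per cluster and a first
    data block of 0, and toy_last_grp_cluster() and its parameters are
    hypothetical names.

```c
/*
 * Zero-based index of the last cluster in group grp.
 *
 * grp                zero-based group index, 0..(groups_count - 1)
 * groups_count       number of groups in the filesystem
 * clusters_per_group clusters in every group except possibly the last
 * total_clusters     clusters in the whole filesystem
 */
static unsigned long toy_last_grp_cluster(unsigned long grp,
					  unsigned long groups_count,
					  unsigned long clusters_per_group,
					  unsigned long total_clusters)
{
	unsigned long nr_clusters;

	if (grp < groups_count - 1)
		/* Not the last group: it is always full. */
		nr_clusters = clusters_per_group;
	else
		/* Last group: whatever remains after the full groups. */
		nr_clusters = total_clusters - grp * clusters_per_group;

	/* Callers expect a zero-based index, hence the subtraction. */
	return nr_clusters - 1;
}
```

    For the reproducer's geometry (a single group of 8192 blocks per group
    holding 8191 blocks) the sketch yields 8190, so a trim that covers
    clusters 0..8190 spans the entire group and can be recorded.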

firmware: arm_scmi: Check mailbox/SMT channel for consistency [+ + +]

Author: Cristian Marussi <cristian.marussi@arm.com>
Date:   Wed Dec 20 17:21:12 2023 +0000

    firmware: arm_scmi: Check mailbox/SMT channel for consistency
    
    commit 437a310b22244d4e0b78665c3042e5d1c0f45306 upstream.
    
    On reception of a completion interrupt the shared memory area is accessed
    to retrieve the message header at first and then, if the message sequence
    number identifies a transaction which is still pending, the related
    payload is fetched too.
    
    When an SCMI command times out the channel ownership remains with the
    platform until eventually a late reply is received and, as a consequence,
    any further transmission attempt remains pending, waiting for the channel
    to be relinquished by the platform.
    
    Once that late reply is received the channel ownership is given back
    to the agent and any pending request is then allowed to proceed and
    overwrite the SMT area of the just delivered late reply; then the wait
    for the reply to the new request starts.
    
    It has been observed that the spurious IRQ related to the late reply can
    be wrongly associated with the freshly enqueued request: when that happens
    the SCMI stack in-flight lookup procedure is fooled by the fact that the
    message header now present in the SMT area is related to the new pending
    transaction, even though the real reply has still to arrive.
    
    This race-condition on the A2P channel can be detected by looking at the
    channel status bits: a genuine reply from the platform will have set the
    channel free bit before triggering the completion IRQ.
    
    Add a consistency check to validate such condition in the A2P ISR.
    
    Reported-by: Xinglong Yang <xinglong.yang@cixtech.com>
    Closes: https://lore.kernel.org/all/PUZPR06MB54981E6FA00D82BFDBB864FBF08DA@PUZPR06MB5498.apcprd06.prod.outlook.com/
    Fixes: 5c8a47a5a91d ("firmware: arm_scmi: Make scmi core independent of the transport type")
    Cc: stable@vger.kernel.org # 5.15+
    Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
    Tested-by: Xinglong Yang <xinglong.yang@cixtech.com>
    Link: https://lore.kernel.org/r/20231220172112.763539-1-cristian.marussi@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
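
    The consistency check described in the commit above might look roughly
    like the following userspace sketch. The layout of struct toy_shmem, the
    TOY_CHAN_FREE bit value and the function names are assumptions made for
    illustration; they do not reproduce the real SCMI shared-memory
    transport code.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_CHAN_FREE 0x1u	/* hypothetical "channel free" status bit */

/* Toy stand-in for the shared memory (SMT) area. */
struct toy_shmem {
	uint32_t channel_status;
	uint32_t msg_header;
};

/* A genuine reply has the channel free bit set by the platform. */
static bool toy_reply_is_genuine(const struct toy_shmem *shmem)
{
	return (shmem->channel_status & TOY_CHAN_FREE) != 0;
}

/*
 * Hypothetical A2P completion handler: returns 1 and copies the header
 * when the reply is genuine, returns 0 when the IRQ is treated as a
 * spurious late reply and the SMT area is left alone.
 */
static int toy_a2p_isr(const struct toy_shmem *shmem, uint32_t *hdr_out)
{
	if (!toy_reply_is_genuine(shmem))
		return 0;

	*hdr_out = shmem->msg_header;
	return 1;
}
```

    The point of the design is that the header alone cannot distinguish a
    late reply from a fresh one once a new request has overwritten the SMT
    area; the status bit can.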

firmware: arm_scmi: Use xa_insert() to store opps [+ + +]

Author: Cristian Marussi <cristian.marussi@arm.com>
Date:   Mon Jan 8 18:50:49 2024 +0000

    firmware: arm_scmi: Use xa_insert() to store opps
    
    [ Upstream commit e8ef4bbe39b9576a73f104f6af743fb9c7b624ba ]
    
    When storing opps by level or index use xa_insert() instead of xa_store()
    and add error-checking to spot duplicate indexes that may have been
    wrongly provided by the platform firmware.
    
    Fixes: 31c7c1397a33 ("firmware: arm_scmi: Add v3.2 perf level indexing mode support")
    Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
    Link: https://lore.kernel.org/r/20240108185050.1628687-1-cristian.marussi@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
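
    The difference the commit above relies on is that xa_store() overwrites
    an existing entry while xa_insert() refuses with -EBUSY. The sketch
    below models only that semantic on a fixed-size table; struct
    toy_xarray, toy_store() and toy_insert() are hypothetical and are not
    the kernel XArray API.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Toy stand-in for an XArray: a small table of pointers. */
struct toy_xarray {
	void *slot[TOY_SLOTS];
};

/* Store semantics: overwrite silently and hand back the old entry. */
static void *toy_store(struct toy_xarray *xa, size_t index, void *entry)
{
	void *old;

	if (index >= TOY_SLOTS)
		return NULL;
	old = xa->slot[index];
	xa->slot[index] = entry;
	return old;
}

/* Insert semantics: fail with -EBUSY if the index is already occupied. */
static int toy_insert(struct toy_xarray *xa, size_t index, void *entry)
{
	if (index >= TOY_SLOTS)
		return -EINVAL;
	if (xa->slot[index])
		return -EBUSY;
	xa->slot[index] = entry;
	return 0;
}
```

    With insert semantics a duplicate level or index reported by firmware
    surfaces as an error the caller can log, instead of silently replacing
    the earlier entry.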

firmware: arm_scmi: Use xa_insert() when saving raw queues [+ + +]

Author: Cristian Marussi <cristian.marussi@arm.com>
Date:   Mon Jan 8 18:50:50 2024 +0000

    firmware: arm_scmi: Use xa_insert() when saving raw queues
    
    [ Upstream commit b5dc0ffd36560dbadaed9a3d9fd7838055d62d74 ]
    
    Use xa_insert() when saving per-channel raw queues to better check for
    duplicates.
    
    Fixes: 7860701d1e6e ("firmware: arm_scmi: Add per-channel raw injection support")
    Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
    Link: https://lore.kernel.org/r/20240108185050.1628687-2-cristian.marussi@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fjes: fix memleaks in fjes_hw_setup [+ + +]

Author: Zhipeng Lu <alexious@zju.edu.cn>
Date:   Tue Jan 23 01:24:42 2024 +0800

    fjes: fix memleaks in fjes_hw_setup
    
    [ Upstream commit f6cc4b6a3ae53df425771000e9c9540cce9b7bb1 ]
    
    fjes_hw_setup allocates several memory regions and defers their
    deallocation to fjes_hw_exit in fjes_probe through the following call chain:
    
    fjes_probe
      |-> fjes_hw_init
            |-> fjes_hw_setup
      |-> fjes_hw_exit
    
    However, when fjes_hw_setup fails, fjes_hw_exit won't be called and thus
    all the resources allocated in fjes_hw_setup will be leaked. In this
    patch, we free those resources in fjes_hw_setup to prevent such leaks.
    
    Fixes: 2fcbca687702 ("fjes: platform_driver's .probe and .remove routine")
    Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240122172445.3841883-1-alexious@zju.edu.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
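
    The pattern behind the fix above (a setup routine that releases its own
    partial allocations on failure, because the matching exit routine will
    not run) can be sketched in userspace C. struct toy_hw, toy_hw_setup()
    and toy_hw_exit() are hypothetical names; the fail_second parameter
    exists only to simulate an allocation failure.

```c
#include <errno.h>
#include <stdlib.h>

/* Toy stand-in for the hardware state; two buffers instead of many. */
struct toy_hw {
	void *a;
	void *b;
};

/* Returns 0 on success. On failure nothing stays allocated. */
static int toy_hw_setup(struct toy_hw *hw, int fail_second)
{
	hw->a = malloc(16);
	if (!hw->a)
		return -ENOMEM;

	/* fail_second simulates the second allocation failing. */
	hw->b = fail_second ? NULL : malloc(16);
	if (!hw->b)
		goto free_a;

	return 0;

free_a:
	/* Unwind here: the caller will not call toy_hw_exit() on failure. */
	free(hw->a);
	hw->a = NULL;
	return -ENOMEM;
}

/* Only called after a successful setup. */
static void toy_hw_exit(struct toy_hw *hw)
{
	free(hw->b);
	free(hw->a);
	hw->a = NULL;
	hw->b = NULL;
}
```

    A caller pairs toy_hw_exit() only with a successful toy_hw_setup(),
    which mirrors the call chain shown in the commit message.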

fs/pipe: move check to pipe_has_watch_queue() [+ + +]

Author: Max Kellermann <max.kellermann@ionos.com>
Date:   Thu Sep 21 09:57:53 2023 +0200

    fs/pipe: move check to pipe_has_watch_queue()
    
    [ Upstream commit b4bd6b4bac8edd61eb8f7b836969d12c0c6af165 ]
    
    This declutters the code by reducing the number of #ifdefs and makes
    the watch_queue checks simpler.  This has no runtime effect; the
    machine code is identical.
    
    Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
    Message-Id: <20230921075755.1378787-2-max.kellermann@ionos.com>
    Reviewed-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: e95aada4cb93 ("pipe: wakeup wr_wait after setting max_usage")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

genirq: Initialize resend_node hlist for all interrupt descriptors [+ + +]

Author: Dawei Li <dawei.li@shingroup.cn>
Date:   Mon Jan 22 16:57:15 2024 +0800

    genirq: Initialize resend_node hlist for all interrupt descriptors
    
    commit b184c8c2889ceef0a137c7d0567ef9fe3d92276e upstream.
    
    For a CONFIG_SPARSE_IRQ=n kernel, early_irq_init() is supposed to
    initialize all interrupt descriptors.
    
    It does except for irq_desc::resend_node, which is only initialized for the
    first descriptor.
    
    Use the indexed descriptor and not the base pointer to address that.
    
    Fixes: bc06a9e08742 ("genirq: Use hlist for managing resend handlers")
    Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Marc Zyngier <maz@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240122085716.2999875-5-dawei.li@shingroup.cn
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
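
    The class of bug fixed above (using the base pointer inside a loop that
    is meant to touch every element) is easy to show in isolation. The
    sketch below uses a hypothetical struct toy_desc with a flag in place of
    the real hlist node; the use_index switch selects the fixed or the buggy
    form.

```c
#include <stddef.h>

/* Toy stand-in for an interrupt descriptor. */
struct toy_desc {
	int resend_node_initialized;
};

/*
 * Initialize count descriptors. With use_index == 0 the loop mimics the
 * bug: every iteration writes through the base pointer, so only the first
 * descriptor is ever initialized.
 */
static void toy_early_init(struct toy_desc *desc, size_t count, int use_index)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (use_index)
			desc[i].resend_node_initialized = 1;	/* fixed */
		else
			desc->resend_node_initialized = 1;	/* buggy */
	}
}

/* Count how many descriptors ended up initialized. */
static size_t toy_count_initialized(const struct toy_desc *desc, size_t count)
{
	size_t i, n = 0;

	for (i = 0; i < count; i++)
		n += desc[i].resend_node_initialized ? 1 : 0;
	return n;
}
```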

gpio: eic-sprd: Clear interrupt after set the interrupt type [+ + +]

Author: Wenhua Lin <Wenhua.Lin@unisoc.com>
Date:   Tue Jan 9 15:38:48 2024 +0800

    gpio: eic-sprd: Clear interrupt after set the interrupt type
    
    [ Upstream commit 84aef4ed59705585d629e81d633a83b7d416f5fb ]
    
    The raw interrupt status of eic may be set before the interrupt is enabled,
    since the eic interrupt has a latch function, which would trigger the
    interrupt event once it is enabled from the user side. To solve this problem,
    interrupts generated before setting the interrupt trigger type are ignored.
    
    Fixes: 25518e024e3a ("gpio: Add Spreadtrum EIC driver support")
    Acked-by: Chunyan Zhang <zhang.lyra@gmail.com>
    Signed-off-by: Wenhua Lin <Wenhua.Lin@unisoc.com>
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

gpiolib: acpi: Ignore touchpad wakeup on GPD G1619-04 [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Wed Jan 17 08:29:42 2024 -0600

    gpiolib: acpi: Ignore touchpad wakeup on GPD G1619-04
    
    commit 805c74eac8cb306dc69b87b6b066ab4da77ceaf1 upstream.
    
    Spurious wakeups are reported on the GPD G1619-04 which
    can be resolved by programming the GPIO to ignore wakeups.
    
    Cc: stable@vger.kernel.org
    Reported-and-tested-by: George Melikov <mail@gmelikov.ru>
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3073
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

hv_netvsc: Calculate correct ring size when PAGE_SIZE is not 4 Kbytes [+ + +]

Author: Michael Kelley <mhklinux@outlook.com>
Date:   Mon Jan 22 08:20:28 2024 -0800

    hv_netvsc: Calculate correct ring size when PAGE_SIZE is not 4 Kbytes
    
    commit 6941f67ad37d5465b75b9ffc498fcf6897a3c00e upstream.
    
    Current code in netvsc_drv_init() incorrectly assumes that PAGE_SIZE
    is 4 Kbytes, which is wrong on ARM64 with 16K or 64K page size. As a
    result, the default VMBus ring buffer size on ARM64 with 64K page size
    is 8 Mbytes instead of the expected 512 Kbytes. While this doesn't break
    anything, a typical VM with 8 vCPUs and 8 netvsc channels wastes 120
    Mbytes (8 channels * 2 ring buffers/channel * 7.5 Mbytes/ring buffer).
    
    Unfortunately, the module parameter specifying the ring buffer size
    is in units of 4 Kbyte pages. Ideally, it should be in units that
    are independent of PAGE_SIZE, but backwards compatibility prevents
    changing that now.
    
    Fix this by having netvsc_drv_init() hardcode 4096 instead of using
    PAGE_SIZE when calculating the ring buffer size in bytes. Also
    use the VMBUS_RING_SIZE macro to ensure proper alignment when running
    with page size larger than 4K.
    
    Cc: <stable@vger.kernel.org> # 5.15.x
    Fixes: 7aff79e297ee ("Drivers: hv: Enable Hyper-V code to be built on ARM64")
    Signed-off-by: Michael Kelley <mhklinux@outlook.com>
    Link: https://lore.kernel.org/r/20240122162028.348885-1-mhklinux@outlook.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
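
    The figures quoted in the commit above can be checked with a few lines
    of arithmetic. The sketch assumes a default of 128 ring-size units,
    which is inferred from the stated 512 Kbytes divided by 4 Kbytes; the
    function names are hypothetical, and the alignment done by
    VMBUS_RING_SIZE is not modelled.

```c
#define TOY_UNIT_BYTES 4096UL	/* the module parameter counts 4 Kbyte units */

/* Old behaviour: units were multiplied by the runtime page size. */
static unsigned long toy_ring_bytes_buggy(unsigned long units,
					  unsigned long page_size)
{
	return units * page_size;
}

/* Fixed behaviour: units are always 4096 bytes, whatever PAGE_SIZE is. */
static unsigned long toy_ring_bytes_fixed(unsigned long units)
{
	return units * TOY_UNIT_BYTES;
}

/* Bytes wasted across all channels (two ring buffers per channel). */
static unsigned long toy_wasted_bytes(unsigned long channels,
				      unsigned long units,
				      unsigned long page_size)
{
	return channels * 2UL * (toy_ring_bytes_buggy(units, page_size) -
				 toy_ring_bytes_fixed(units));
}
```

    With 128 units and a 64K page size the buggy form gives 8 Mbytes per
    ring against 512 Kbytes for the fixed form, and 8 channels waste
    8 * 2 * 7.5 = 120 Mbytes, matching the commit message.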

hwrng: core - Fix page fault dead lock on mmap-ed hwrng [+ + +]

Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Sat Dec 2 09:01:54 2023 +0800

    hwrng: core - Fix page fault dead lock on mmap-ed hwrng
    
    commit 78aafb3884f6bc6636efcc1760c891c8500b9922 upstream.
    
    There is a dead-lock in the hwrng device read path.  This triggers
    when the user reads from /dev/hwrng into memory also mmap-ed from
    /dev/hwrng.  The resulting page fault triggers a recursive read
    which then dead-locks.
    
    Fix this by using a stack buffer when calling copy_to_user.
    
    Reported-by: Edward Adam Davis <eadavis@qq.com>
    Reported-by: syzbot+c52ab18308964d248092@syzkaller.appspotmail.com
    Fixes: 9996508b3353 ("hwrng: core - Replace u32 in driver API with byte array")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i40e: handle multi-buffer packets that are shrunk by xdp prog [+ + +]

Author: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
Date:   Wed Jan 24 20:15:56 2024 +0100

    i40e: handle multi-buffer packets that are shrunk by xdp prog
    
    [ Upstream commit 83014323c642b8faa2d64a5f303b41c019322478 ]
    
    XDP programs can shrink packets by calling the bpf_xdp_adjust_tail()
    helper function. For multi-buffer packets this may lead to reduction of
    frag count stored in skb_shared_info area of the xdp_buff struct. This
    results in issues with the current handling of XDP_PASS and XDP_DROP
    cases.
    
    For XDP_PASS, currently skb is being built using frag count of
    xdp_buffer before it was processed by XDP prog and thus will result in
    an inconsistent skb when frag count gets reduced by XDP prog. To fix
    this, get correct frag count while building the skb instead of using
    pre-obtained frag count.
    
    For XDP_DROP, current page recycling logic will not reuse the page but
    instead will adjust the pagecnt_bias so that the page can be freed. This
    again results in inconsistent behavior as the page refcnt has already
    been changed by the helper while freeing the frag(s) as part of
    shrinking the packet. To fix this, only adjust pagecnt_bias for buffers
    that are still part of the packet post-xdp prog run.
    
    Fixes: e213ced19bef ("i40e: add support for XDP multi-buffer Rx")
    Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-6-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: set xdp_rxq_info::frag_size [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:16:01 2024 +0100

    i40e: set xdp_rxq_info::frag_size
    
    [ Upstream commit a045d2f2d03d23e7db6772dd83e0ba2705dfad93 ]
    
    i40e supports XDP multi-buffer so it is supposed to use
    __xdp_rxq_info_reg() instead of xdp_rxq_info_reg() and set the
    frag_size. It cannot be simply converted at the existing callsite because
    rx_buf_len could be uninitialized, so let us register xdp_rxq_info
    within i40e_configure_rx_ring(), which happens to be called with an already
    initialized rx_buf_len value.
    
    Commit 5180ff1364bc ("i40e: use int for i40e_status") converted 'err' to
    int, so two variables to deal with return codes are not needed within
    i40e_configure_rx_ring(). Remove 'ret' and use 'err' to handle status
    from xdp_rxq_info registration.
    
    Fixes: e213ced19bef ("i40e: add support for XDP multi-buffer Rx")
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-11-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:16:02 2024 +0100

    i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue
    
    [ Upstream commit 0cbb08707c932b3f004bc1a8ec6200ef572c1f5f ]
    
    Now that i40e driver correctly sets up frag_size in xdp_rxq_info, let us
    make it work for ZC multi-buffer as well. i40e_ring::rx_buf_len for ZC
    is being set via xsk_pool_get_rx_frame_size() and this needs to be
    propagated up to xdp_rxq_info.
    
    Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support")
    Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-12-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: remove redundant xdp_rxq_info registration [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:15:57 2024 +0100

    ice: remove redundant xdp_rxq_info registration
    
    [ Upstream commit 2ee788c06493d02ee85855414cca39825e768aaf ]
    
    xdp_rxq_info struct can be registered by drivers via two functions -
    xdp_rxq_info_reg() and __xdp_rxq_info_reg(). The latter one allows
    drivers that support XDP multi-buffer to set up xdp_rxq_info::frag_size
    which in turn will make it possible to grow the packet via
    bpf_xdp_adjust_tail() BPF helper.
    
    Currently, ice registers xdp_rxq_info in two spots:
    1) ice_setup_rx_ring() // via xdp_rxq_info_reg(), BUG
    2) ice_vsi_cfg_rxq()   // via __xdp_rxq_info_reg(), OK
    
    The commit cited under the Fixes tag took care of setting up frag_size
    and updated the registration scheme in 2), but it did not help because
    1) is called before 2) and, as shown above, uses the old registration
    function. This means that 2) sees that xdp_rxq_info is already
    registered and never calls __xdp_rxq_info_reg(), which leaves us with
    xdp_rxq_info::frag_size being set to 0.
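
The ordering problem described above can be illustrated with a small userspace model. This is only a sketch: every `model_*` name and `struct rxq_info_model` are hypothetical stand-ins and not the ice driver or the kernel XDP API. It assumes, as the text states, that spot 2) registers only when nothing is registered yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct xdp_rxq_info; not the kernel definition. */
struct rxq_info_model {
    bool registered;
    unsigned int frag_size;
};

/* Models xdp_rxq_info_reg(): registers without setting a frag_size. */
int model_reg(struct rxq_info_model *q)
{
    assert(q != NULL);
    q->registered = true;
    q->frag_size = 0;
    return 0;
}

/* Models __xdp_rxq_info_reg(): registers and records frag_size. */
int model_reg_frag(struct rxq_info_model *q, unsigned int frag_size)
{
    assert(q != NULL);
    q->registered = true;
    q->frag_size = frag_size;
    return 0;
}

/* Models spot 1), ice_setup_rx_ring(); with_bug keeps the old call. */
void model_setup_rx_ring(struct rxq_info_model *q, bool with_bug)
{
    if (with_bug)
        model_reg(q);
}

/* Models spot 2), ice_vsi_cfg_rxq(): registers only if not yet registered. */
void model_vsi_cfg_rxq(struct rxq_info_model *q, unsigned int frag_size)
{
    if (!q->registered)
        model_reg_frag(q, frag_size);
}

/* Runs spot 1) then spot 2) and returns the resulting frag_size. */
unsigned int model_bring_up(bool with_bug, unsigned int frag_size)
{
    struct rxq_info_model q = { false, 0 };

    model_setup_rx_ring(&q, with_bug);
    model_vsi_cfg_rxq(&q, frag_size);
    return q.frag_size;
}
```

In this model, `model_bring_up(true, 2048)` leaves frag_size at 0, while `model_bring_up(false, 2048)` ends with the requested value, which mirrors why removing the early registration is sufficient.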
    
    To fix this misbehavior, simply remove xdp_rxq_info_reg() call from
    ice_setup_rx_ring().
    
    Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
    Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-7-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:15:59 2024 +0100

    ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue
    
    [ Upstream commit 3de38c87174225487fc93befeea7d380db80aef6 ]
    
    Now that ice driver correctly sets up frag_size in xdp_rxq_info, let us
    make it work for ZC multi-buffer as well. ice_rx_ring::rx_buf_len for ZC
    is being set via xsk_pool_get_rx_frame_size() and this needs to be
    propagated up to xdp_rxq_info.
    
    Use a bigger hammer and instead of unregistering only xdp_rxq_info's
    memory model, unregister it altogether and register it again and have
    xdp_rxq_info with correct frag_size value.
    
    Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support")
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-9-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: work on pre-XDP prog frag count [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:15:55 2024 +0100

    ice: work on pre-XDP prog frag count
    
    [ Upstream commit ad2047cf5d9313200e308612aed516548873d124 ]
    
    Fix an OOM panic in XDP_DRV mode when an XDP program shrinks a
    multi-buffer packet by 4k bytes and then redirects it to an AF_XDP
    socket.
    
    Since support for handling multi-buffer frames was added to XDP, usage
    of bpf_xdp_adjust_tail() helper within XDP program can free the page
    that given fragment occupies and in turn decrease the fragment count
    within skb_shared_info that is embedded in xdp_buff struct. In current
    ice driver codebase, it can become problematic when page recycling logic
    decides not to reuse the page. In such case, __page_frag_cache_drain()
    is used with ice_rx_buf::pagecnt_bias that was not adjusted after
    refcount of page was changed by XDP prog which in turn does not drain
    the refcount to 0 and page is never freed.
    
    To address this, let us store the count of frags before the XDP program
    was executed on Rx ring struct. This will be used to compare with
    current frag count from skb_shared_info embedded in xdp_buff. A smaller
    value in the latter indicates that XDP prog freed frag(s). Then, for
    given delta decrement pagecnt_bias for XDP_DROP verdict.
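
A minimal userspace sketch of the bookkeeping described above follows. `struct rx_buf_model` and `model_adjust_bias()` are hypothetical names used for illustration and are not the ice driver code. The sketch assumes the freed fragments are the trailing ones and that the adjustment applies only when the verdict is a drop.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an Rx buffer; only the field discussed above. */
struct rx_buf_model {
    int pagecnt_bias;
};

/*
 * bufs[0..nr_before-1] back the fragments that existed before the XDP
 * program ran. If only nr_after fragments remain afterwards, the trailing
 * (nr_before - nr_after) buffers were freed by the program, so their
 * pagecnt_bias is decremented when the verdict is a drop.
 * Returns the number of buffers that were adjusted.
 */
int model_adjust_bias(struct rx_buf_model *bufs, int nr_before,
                      int nr_after, int verdict_is_drop)
{
    int delta, i;

    assert(bufs != NULL);
    assert(nr_after >= 0 && nr_after <= nr_before);

    delta = nr_before - nr_after;
    if (delta == 0 || !verdict_is_drop)
        return 0;

    for (i = nr_after; i < nr_before; i++)
        bufs[i].pagecnt_bias--;

    return delta;
}
```

With three buffers that each start at a bias of 1 and a program that leaves one fragment, the last two buffers end up with a bias of 0 and the first is untouched.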
    
    While at it, let us also handle the EOP frag within
    ice_set_rx_bufs_act() to make our life easier, so all of the adjustments
    needed to be applied against freed frags are performed in the single
    place.
    
    Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
    Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-5-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iio: adc: ad7091r: Allow users to configure device events [+ + +]

Author: Marcelo Schmitt <marcelo.schmitt@analog.com>
Date:   Tue Dec 19 17:26:01 2023 -0300

    iio: adc: ad7091r: Allow users to configure device events
    
    [ Upstream commit 020e71c7ffc25dfe29ed9be6c2d39af7bd7f661f ]
    
    AD7091R-5 devices are supported by the ad7091r-5 driver together with
    the ad7091r-base driver. Those drivers declared iio events for notifying
    user space when ADC readings fall below the thresholds of low limit
    registers or above the values set in high limit registers.
    However, to configure iio events and their thresholds, a set of callback
    functions must be implemented and those were not present until now.
    The consequence of trying to configure ad7091r-5 events without the
    proper callback functions was a null pointer dereference in the kernel
    because the pointers to the callback functions were not set.
    
    Implement event configuration callbacks allowing users to read/write
    event thresholds and enable/disable event generation.
    
    Since the event spec structs are generic to AD7091R devices, also move
    those from the ad7091r-5 driver to the base driver so they can be reused
    when support for ad7091r-2/-4/-8 is added.
    
    Fixes: ca69300173b6 ("iio: adc: Add support for AD7091R5 ADC")
    Suggested-by: David Lechner <dlechner@baylibre.com>
    Signed-off-by: Marcelo Schmitt <marcelo.schmitt@analog.com>
    Link: https://lore.kernel.org/r/59552d3548dabd56adc3107b7b4869afee2b0c3c.1703013352.git.marcelo.schmitt1@gmail.com
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iio: adc: ad7091r: Enable internal vref if external vref is not supplied [+ + +]

Author: Marcelo Schmitt <marcelo.schmitt@analog.com>
Date:   Tue Dec 19 17:26:27 2023 -0300

    iio: adc: ad7091r: Enable internal vref if external vref is not supplied
    
    [ Upstream commit e71c5c89bcb165a02df35325aa13d1ee40112401 ]
    
    The ADC needs a voltage reference to work correctly.
    Users can provide an external voltage reference or use the chip internal
    reference to operate the ADC.
    The availability of an in-chip reference for the ADC saves the user from
    having to supply an external voltage reference, which makes the external
    reference an optional property as described in the device tree
    documentation.
    However, to use the internal reference, it must be enabled by writing to
    the configuration register.
    Enable AD7091R internal voltage reference if no external vref is supplied.
    
    Fixes: 260442cc5be4 ("iio: adc: ad7091r5: Add scale and external VREF support")
    Signed-off-by: Marcelo Schmitt <marcelo.schmitt@analog.com>
    Link: https://lore.kernel.org/r/b865033fa6a4fc4bf2b4a98ec51a6144e0f64f77.1703013352.git.marcelo.schmitt1@gmail.com
    Cc: <Stable@vger.kernel.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iio: adc: ad7091r: Set alert bit in config register [+ + +]

Author: Marcelo Schmitt <marcelo.schmitt@analog.com>
Date:   Sat Dec 16 14:46:37 2023 -0300

    iio: adc: ad7091r: Set alert bit in config register
    
    [ Upstream commit 149694f5e79b0c7a36ceb76e7c0d590db8f151c1 ]
    
    The ad7091r-base driver sets up an interrupt handler for firing events
    when inputs are either above or below a certain threshold.
    However, for the interrupt signal to come from the device it must be
    configured to enable the ALERT/BUSY/GPO pin to be used as ALERT, which
    was not being done until now.
    Enable interrupt signals on the ALERT/BUSY/GPO pin by setting the proper
    bit in the configuration register.
    
    Signed-off-by: Marcelo Schmitt <marcelo.schmitt@analog.com>
    Link: https://lore.kernel.org/r/e8da2ee98d6df88318b14baf3dc9630e20218418.1702746240.git.marcelo.schmitt1@gmail.com
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Stable-dep-of: 020e71c7ffc2 ("iio: adc: ad7091r: Allow users to configure device events")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:15:58 2024 +0100

    intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers
    
    [ Upstream commit 290779905d09d5fdf6caa4f58ddefc3f4db0c0a9 ]
    
    Ice and i40e ZC drivers currently set offset of a frag within
    skb_shared_info to 0, which is incorrect. xdp_buffs that come from
    xsk_buff_pool always have 256 bytes of a headroom, so they need to be
    taken into account to retrieve xdp_buff::data via skb_frag_address().
    Otherwise, bpf_xdp_frags_increase_tail() would be starting its job from
    xdp_buff::data_hard_start which would result in overwriting existing
    payload.
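
The effect of the offset can be sketched in plain C. The names below (`struct frag_model`, `model_frag_address()`, `model_fill_frag()`) are hypothetical and only mimic the idea of skb_frag_address(). The 256-byte headroom is the value stated in the text above.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_XSK_HEADROOM 256 /* headroom stated in the commit message */

/* Simplified stand-in for a frag: a buffer start plus an offset into it. */
struct frag_model {
    unsigned char *buf_start; /* plays the role of data_hard_start */
    size_t offset;            /* plays the role of skb_frag_t::bv_offset */
};

/* Models skb_frag_address(): buffer start plus the recorded offset. */
unsigned char *model_frag_address(const struct frag_model *frag)
{
    assert(frag != NULL);
    return frag->buf_start + frag->offset;
}

/* Fills a frag as described above: offset 0 before the fix, headroom after. */
void model_fill_frag(struct frag_model *frag, unsigned char *buf_start,
                     int fixed)
{
    assert(frag != NULL);
    frag->buf_start = buf_start;
    frag->offset = fixed ? MODEL_XSK_HEADROOM : 0;
}
```

With the offset left at 0 the computed address equals the buffer start, so a helper that appends data from that address would write over the headroom and the payload behind it. With the offset set to the headroom, the address lands where the payload begins.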
    
    Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support")
    Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support")
    Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-8-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ipv6: init the accept_queue's spinlocks in inet6_create [+ + +]

Author: Zhengchao Shao <shaozhengchao@huawei.com>
Date:   Mon Jan 22 18:20:01 2024 +0800

    ipv6: init the accept_queue's spinlocks in inet6_create
    
    [ Upstream commit 435e202d645c197dcfd39d7372eb2a56529b6640 ]
    
    In commit 198bc90e0e73 ("tcp: make sure init the accept_queue's spinlocks
    once"), the spinlocks of accept_queue are initialized only when a socket
    is created in the inet4 scenario. The locks are not initialized when a
    socket is created in the inet6 scenario. The kernel reports the following error:
    INFO: trying to register non-static key.
    The code is fine but needs lockdep annotation, or maybe
    you didn't initialize this object before use?
    turning off the locking correctness validator.
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    Call Trace:
    <TASK>
            dump_stack_lvl (lib/dump_stack.c:107)
            register_lock_class (kernel/locking/lockdep.c:1289)
            __lock_acquire (kernel/locking/lockdep.c:5015)
            lock_acquire.part.0 (kernel/locking/lockdep.c:5756)
            _raw_spin_lock_bh (kernel/locking/spinlock.c:178)
            inet_csk_listen_stop (net/ipv4/inet_connection_sock.c:1386)
            tcp_disconnect (net/ipv4/tcp.c:2981)
            inet_shutdown (net/ipv4/af_inet.c:935)
            __sys_shutdown (./include/linux/file.h:32 net/socket.c:2438)
            __x64_sys_shutdown (net/socket.c:2445)
            do_syscall_64 (arch/x86/entry/common.c:52)
            entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
    RIP: 0033:0x7f52ecd05a3d
    Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7
    48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
    ff 73 01 c3 48 8b 0d ab a3 0e 00 f7 d8 64 89 01 48
    RSP: 002b:00007f52ecf5dde8 EFLAGS: 00000293 ORIG_RAX: 0000000000000030
    RAX: ffffffffffffffda RBX: 00007f52ecf5e640 RCX: 00007f52ecd05a3d
    RDX: 00007f52ecc8b188 RSI: 0000000000000000 RDI: 0000000000000004
    RBP: 00007f52ecf5de20 R08: 00007ffdae45c69f R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000293 R12: 00007f52ecf5e640
    R13: 0000000000000000 R14: 00007f52ecc8b060 R15: 00007ffdae45c6e0
    
    Fixes: 198bc90e0e73 ("tcp: make sure init the accept_queue's spinlocks once")
    Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240122102001.2851701-1-shaozhengchao@huawei.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kexec: do syscore_shutdown() in kernel_kexec [+ + +]

Author: James Gowans <jgowans@amazon.com>
Date:   Wed Dec 13 08:40:04 2023 +0200

    kexec: do syscore_shutdown() in kernel_kexec
    
    commit 7bb943806ff61e83ae4cceef8906b7fe52453e8a upstream.
    
    syscore_shutdown() runs driver and module callbacks to get the system into
    a state where it can be correctly shut down.  In commit 6f389a8f1dd2 ("PM
    / reboot: call syscore_shutdown() after disable_nonboot_cpus()")
    syscore_shutdown() was removed from kernel_restart_prepare() and hence got
    (incorrectly?) removed from the kexec flow.  This was innocuous until
    commit 6735150b6997 ("KVM: Use syscore_ops instead of reboot_notifier to
    hook restart/shutdown") changed the way that KVM registered its shutdown
    callbacks, switching from reboot notifiers to syscore_ops.shutdown.  As
    syscore_shutdown() is missing from kexec, KVM's shutdown hook is not run
    and virtualisation is left enabled on the boot CPU which results in triple
    faults when switching to the new kernel on Intel x86 VT-x with VMXE
    enabled.
    
    Fix this by adding syscore_shutdown() to the kexec sequence.  In terms of
    where to add it, it is being added after migrating the kexec task to the
    boot CPU, but before APs are shut down.  It is not totally clear if this
    is the best place: in commit 6f389a8f1dd2 ("PM / reboot: call
    syscore_shutdown() after disable_nonboot_cpus()") it is stated that
    "syscore_ops operations should be carried with one CPU on-line and
    interrupts disabled." APs are only offlined later in machine_shutdown(),
    so this syscore_shutdown() is being run while APs are still online.  This
    seems to be the correct place as it matches where syscore_shutdown() is
    run in the reboot and halt flows - they also run it before APs are shut
    down.  The assumption is that the commit message in commit 6f389a8f1dd2
    ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()") is
    no longer valid.
    
    KVM has been discussed here as it is what broke loudly by not having
    syscore_shutdown() in kexec, but this change impacts more than just KVM;
    all drivers/modules which register a syscore_ops.shutdown callback will
    now be invoked in the kexec flow.  Looking at some of them like x86 MCE it
    is probably more correct to also shut these down during kexec.
    Maintainers of all drivers which use syscore_ops.shutdown are added on CC
    for visibility.  They are:
    
    arch/powerpc/platforms/cell/spu_base.c  .shutdown = spu_shutdown,
    arch/x86/kernel/cpu/mce/core.c          .shutdown = mce_syscore_shutdown,
    arch/x86/kernel/i8259.c                 .shutdown = i8259A_shutdown,
    drivers/irqchip/irq-i8259.c             .shutdown = i8259A_shutdown,
    drivers/irqchip/irq-sun6i-r.c           .shutdown = sun6i_r_intc_shutdown,
    drivers/leds/trigger/ledtrig-cpu.c      .shutdown = ledtrig_cpu_syscore_shutdown,
    drivers/power/reset/sc27xx-poweroff.c   .shutdown = sc27xx_poweroff_shutdown,
    kernel/irq/generic-chip.c               .shutdown = irq_gc_shutdown,
    virt/kvm/kvm_main.c                     .shutdown = kvm_shutdown,
    
    This has been tested by doing a kexec on x86_64 and aarch64.
    
    Link: https://lkml.kernel.org/r/20231213064004.2419447-1-jgowans@amazon.com
    Fixes: 6735150b6997 ("KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown")
    Signed-off-by: James Gowans <jgowans@amazon.com>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Eric Biederman <ebiederm@xmission.com>
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Cc: Sean Christopherson <seanjc@google.com>
    Cc: Marc Zyngier <maz@kernel.org>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Chen-Yu Tsai <wens@csie.org>
    Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
    Cc: Samuel Holland <samuel@sholland.org>
    Cc: Pavel Machek <pavel@ucw.cz>
    Cc: Sebastian Reichel <sre@kernel.org>
    Cc: Orson Zhai <orsonzhai@gmail.com>
    Cc: Alexander Graf <graf@amazon.de>
    Cc: Jan H. Schoenherr <jschoenh@amazon.de>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: Add missing set_freezable() for freezable kthread [+ + +]

Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Tue Jan 23 20:40:31 2024 +0900

    ksmbd: Add missing set_freezable() for freezable kthread
    
    From: Kevin Hao <haokexin@gmail.com>
    
    [ Upstream commit 8fb7b723924cc9306bc161f45496497aec733904 ]
    
    The kernel thread function ksmbd_conn_handler_loop() invokes
    try_to_freeze() in its loop. But all kernel threads are
    non-freezable by default. So if we want to make a kernel thread
    freezable, we have to invoke set_freezable() explicitly.
    
    Signed-off-by: Kevin Hao <haokexin@gmail.com>
    Acked-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: don't increment epoch if current state and request state are same [+ + +]

Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Tue Jan 23 20:40:29 2024 +0900

    ksmbd: don't increment epoch if current state and request state are same
    
    [ Upstream commit b6e9a44e99603fe10e1d78901fdd97681a539612 ]
    
    If the existing lease state and the requested state are the same, don't
    increment the epoch in the create context.
    
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: fix global oob in ksmbd_nl_policy [+ + +]

Author: Lin Ma <linma@zju.edu.cn>
Date:   Sun Jan 21 15:35:06 2024 +0800

    ksmbd: fix global oob in ksmbd_nl_policy
    
    commit ebeae8adf89d9a82359f6659b1663d09beec2faa upstream.
    
    Similar to a reported issue (see commit b33fb5b801c6 ("net:
    qualcomm: rmnet: fix global oob in rmnet_policy")), my local fuzzer found
    another global out-of-bounds read for policy ksmbd_nl_policy. See the bug
    trace below:
    
    ==================================================================
    BUG: KASAN: global-out-of-bounds in validate_nla lib/nlattr.c:386 [inline]
    BUG: KASAN: global-out-of-bounds in __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600
    Read of size 1 at addr ffffffff8f24b100 by task syz-executor.1/62810
    
    CPU: 0 PID: 62810 Comm: syz-executor.1 Tainted: G                 N 6.1.0 #3
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:88 [inline]
     dump_stack_lvl+0x8b/0xb3 lib/dump_stack.c:106
     print_address_description mm/kasan/report.c:284 [inline]
     print_report+0x172/0x475 mm/kasan/report.c:395
     kasan_report+0xbb/0x1c0 mm/kasan/report.c:495
     validate_nla lib/nlattr.c:386 [inline]
     __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600
     __nla_parse+0x3e/0x50 lib/nlattr.c:697
     __nlmsg_parse include/net/netlink.h:748 [inline]
     genl_family_rcv_msg_attrs_parse.constprop.0+0x1b0/0x290 net/netlink/genetlink.c:565
     genl_family_rcv_msg_doit+0xda/0x330 net/netlink/genetlink.c:734
     genl_family_rcv_msg net/netlink/genetlink.c:833 [inline]
     genl_rcv_msg+0x441/0x780 net/netlink/genetlink.c:850
     netlink_rcv_skb+0x14f/0x410 net/netlink/af_netlink.c:2540
     genl_rcv+0x24/0x40 net/netlink/genetlink.c:861
     netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
     netlink_unicast+0x54e/0x800 net/netlink/af_netlink.c:1345
     netlink_sendmsg+0x930/0xe50 net/netlink/af_netlink.c:1921
     sock_sendmsg_nosec net/socket.c:714 [inline]
     sock_sendmsg+0x154/0x190 net/socket.c:734
     ____sys_sendmsg+0x6df/0x840 net/socket.c:2482
     ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536
     __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x63/0xcd
    RIP: 0033:0x7fdd66a8f359
    Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007fdd65e00168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007fdd66bbcf80 RCX: 00007fdd66a8f359
    RDX: 0000000000000000 RSI: 0000000020000500 RDI: 0000000000000003
    RBP: 00007fdd66ada493 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007ffc84b81aff R14: 00007fdd65e00300 R15: 0000000000022000
     </TASK>
    
    The buggy address belongs to the variable:
     ksmbd_nl_policy+0x100/0xa80
    
    The buggy address belongs to the physical page:
    page:0000000034f47940 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1ccc4b
    flags: 0x200000000001000(reserved|node=0|zone=2)
    raw: 0200000000001000 ffffea00073312c8 ffffea00073312c8 0000000000000000
    raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffffffff8f24b000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     ffffffff8f24b080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    >ffffffff8f24b100: f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9 00 00 07 f9
                       ^
     ffffffff8f24b180: f9 f9 f9 f9 00 05 f9 f9 f9 f9 f9 f9 00 00 00 05
     ffffffff8f24b200: f9 f9 f9 f9 00 00 03 f9 f9 f9 f9 f9 00 00 04 f9
    ==================================================================
    
    To fix it, add a placeholder named __KSMBD_EVENT_MAX and let
    KSMBD_EVENT_MAX be its original value - 1, following what other
    netlink families do. Also change the two sites that refer to
    KSMBD_EVENT_MAX to use the correct value.
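
The enum pattern can be sketched as follows. All `MODEL_*` and `model_*` names are hypothetical; the real ksmbd event list and policy entries differ. The sketch only shows the convention of a trailing placeholder with MAX defined as placeholder - 1, and a table with MAX + 1 slots so that index MAX is in bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event list; the real ksmbd enum has different members. */
enum model_event {
    MODEL_EVENT_UNSPEC = 0,
    MODEL_EVENT_A,
    MODEL_EVENT_B,

    __MODEL_EVENT_MAX,                       /* placeholder, one past the last */
    MODEL_EVENT_MAX = __MODEL_EVENT_MAX - 1  /* highest valid event */
};

/* Hypothetical policy entry; stands in for a netlink attribute policy. */
struct model_policy {
    int len;
};

/* One slot per event 0..MODEL_EVENT_MAX inclusive. */
const struct model_policy model_policy_table[MODEL_EVENT_MAX + 1] = {
    [MODEL_EVENT_A] = { 4 },
    [MODEL_EVENT_B] = { 8 },
};

/* Number of slots in the table. */
size_t model_policy_slots(void)
{
    return sizeof(model_policy_table) / sizeof(model_policy_table[0]);
}

/* Returns the policy for type, or NULL when type is out of range. */
const struct model_policy *model_policy_lookup(int type)
{
    assert(model_policy_slots() == (size_t)MODEL_EVENT_MAX + 1);
    if (type < 0 || type > MODEL_EVENT_MAX)
        return NULL;
    return &model_policy_table[type];
}
```

Because the table has MAX + 1 entries, a validator that is told the maximum attribute type is MODEL_EVENT_MAX can index the table at that value without reading past its end.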
    
    Cc: stable@vger.kernel.org
    Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers")
    Signed-off-by: Lin Ma <linma@zju.edu.cn>
    Acked-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: fix potential circular locking issue in smb2_set_ea() [+ + +]

Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Tue Jan 23 20:40:28 2024 +0900

    ksmbd: fix potential circular locking issue in smb2_set_ea()
    
    [ Upstream commit 6fc0a265e1b932e5e97a038f99e29400a93baad0 ]
    
    smb2_set_ea() can be called while the parent inode lock is held.
    Add a get_write argument to smb2_set_ea() so that it does not make a
    nested mnt_want_write() call in that case.
    
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: send lease break notification on FILE_RENAME_INFORMATION [+ + +]

Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Tue Jan 23 20:40:30 2024 +0900

    ksmbd: send lease break notification on FILE_RENAME_INFORMATION
    
    [ Upstream commit 3fc74c65b367476874da5fe6f633398674b78e5a ]
    
    Send lease break notification on FILE_RENAME_INFORMATION request.
    This patch fixes the smb2.lease.v2_epoch2 test failure.
    
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: set v2 lease version on lease upgrade [+ + +]

Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Tue Jan 23 20:40:27 2024 +0900

    ksmbd: set v2 lease version on lease upgrade
    
    [ Upstream commit bb05367a66a9990d2c561282f5620bb1dbe40c28 ]
    
    If a file opened with a v2 lease is upgraded with a v1 lease, the smb
    server should respond to the client with a v2 lease create context.
    This patch fixes the smb2.lease.v2_epoch2 test failure.
    
    This test case assumes the following scenario:
     1. smb2 create with v2 lease(R, LEASE1 key)
     2. smb server returns smb2 create response with v2 lease context(R,
        LEASE1 key, epoch + 1)
     3. smb2 create with v1 lease(RH, LEASE1 key)
     4. smb server returns smb2 create response with v2 lease context(RH,
        LEASE1 key, epoch + 2)
    
    i.e. if the same client (same lease key) tries to open, with a v1 lease,
    a file that is already open with a v2 lease, the smb server should
    return a v2 lease.
    
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Acked-by: Tom Talpey <tom@talpey.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux: Linux 6.6.15 [+ + +]

Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed Jan 31 16:19:14 2024 -0800

    Linux 6.6.15
    
    Link: https://lore.kernel.org/r/20240129170014.969142961@linuxfoundation.org
    Tested-by: SeongJae Park <sj@kernel.org>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Allen Pais <apais@linux.microsoft.com>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Tested-by: kernelci.org bot <bot@kernelci.org>
    Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

llc: Drop support for ETH_P_TR_802_2. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Thu Jan 18 17:55:15 2024 -0800

    llc: Drop support for ETH_P_TR_802_2.
    
    [ Upstream commit e3f9bed9bee261e3347131764e42aeedf1ffea61 ]
    
    syzbot reported an uninit-value bug below. [0]
    
    llc supports ETH_P_802_2 (0x0004) and used to support ETH_P_TR_802_2
    (0x0011), and syzbot abused the latter to trigger the bug.
    
      write$tun(r0, &(0x7f0000000040)={@val={0x0, 0x11}, @val, @mpls={[], @llc={@snap={0xaa, 0x1, ')', "90e5dd"}}}}, 0x16)
    
    llc_conn_handler() initialises local variables {saddr,daddr}.mac
    based on skb in llc_pdu_decode_sa()/llc_pdu_decode_da() and passes
    them to __llc_lookup().
    
    However, the initialisation is done only when skb->protocol is
    htons(ETH_P_802_2), otherwise, __llc_lookup_established() and
    __llc_lookup_listener() will read garbage.
    
    The missing initialisation existed prior to commit 211ed865108e
    ("net: delete all instances of special processing for token ring").
    
    It removed the part to kick out the token ring stuff but forgot to
    close the door allowing ETH_P_TR_802_2 packets to sneak into llc_rcv().
    
    Let's remove llc_tr_packet_type and complete the deprecation.
    
    [0]:
    BUG: KMSAN: uninit-value in __llc_lookup_established+0xe9d/0xf90
     __llc_lookup_established+0xe9d/0xf90
     __llc_lookup net/llc/llc_conn.c:611 [inline]
     llc_conn_handler+0x4bd/0x1360 net/llc/llc_conn.c:791
     llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206
     __netif_receive_skb_one_core net/core/dev.c:5527 [inline]
     __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5641
     netif_receive_skb_internal net/core/dev.c:5727 [inline]
     netif_receive_skb+0x58/0x660 net/core/dev.c:5786
     tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1555
     tun_get_user+0x53af/0x66d0 drivers/net/tun.c:2002
     tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
     call_write_iter include/linux/fs.h:2020 [inline]
     new_sync_write fs/read_write.c:491 [inline]
     vfs_write+0x8ef/0x1490 fs/read_write.c:584
     ksys_write+0x20f/0x4c0 fs/read_write.c:637
     __do_sys_write fs/read_write.c:649 [inline]
     __se_sys_write fs/read_write.c:646 [inline]
     __x64_sys_write+0x93/0xd0 fs/read_write.c:646
     do_syscall_x64 arch/x86/entry/common.c:51 [inline]
     do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
     entry_SYSCALL_64_after_hwframe+0x63/0x6b
    
    Local variable daddr created at:
     llc_conn_handler+0x53/0x1360 net/llc/llc_conn.c:783
     llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206
    
    CPU: 1 PID: 5004 Comm: syz-executor994 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
    
    Fixes: 211ed865108e ("net: delete all instances of special processing for token ring")
    Reported-by: syzbot+b5ad66046b913bc04c6f@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=b5ad66046b913bc04c6f
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240119015515.61898-1-kuniyu@amazon.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

llc: make llc_ui_sendmsg() more robust against bonding changes [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Jan 18 18:36:25 2024 +0000

    llc: make llc_ui_sendmsg() more robust against bonding changes
    
    [ Upstream commit dad555c816a50c6a6a8a86be1f9177673918c647 ]
    
    syzbot was able to trick llc_ui_sendmsg(), allocating an skb with no
    headroom, but subsequently trying to push 14 bytes of Ethernet header [1]
    
    Like some others, llc_ui_sendmsg() releases the socket lock before
    calling sock_alloc_send_skb().
    Then it acquires it again, but does not redo all the sanity checks
    that were performed.
    
    This fix:
    
    - Uses LL_RESERVED_SPACE() to reserve space.
    - Checks all conditions again after the socket lock is held again.
    - Does not count the Ethernet header against the mtu limit.
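
Why the missing headroom matters can be shown with a small model of a buffer with reserved space in front of its data. `struct skb_model`, `model_alloc()` and `model_push()` are hypothetical names, not the kernel skb API. The 14-byte header length matches the `put:14` in the trace below.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ETH_HLEN 14 /* Ethernet header length pushed in the trace */

/* Simplified stand-in for an skb: only headroom and data length. */
struct skb_model {
    size_t headroom; /* bytes available in front of the data */
    size_t len;      /* bytes of data */
};

/* Models allocating a buffer with reserved headroom and a given payload. */
struct skb_model model_alloc(size_t reserve, size_t payload)
{
    struct skb_model skb = { reserve, payload };

    return skb;
}

/*
 * Models pushing a header in front of the data. Returns 0 on success and
 * -1 when the headroom is too small, which is the situation where the
 * kernel reports skb_under_panic.
 */
int model_push(struct skb_model *skb, size_t hdr_len)
{
    assert(skb != NULL);
    if (skb->headroom < hdr_len)
        return -1;
    skb->headroom -= hdr_len;
    skb->len += hdr_len;
    return 0;
}
```

A buffer allocated with no reserve fails the push of `MODEL_ETH_HLEN` bytes, while one allocated with a 14-byte reserve and a 1500-byte payload ends up with a length of 1514.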
    
    [1]
    
    skbuff: skb_under_panic: text:ffff800088baa334 len:1514 put:14 head:ffff0000c9c37000 data:ffff0000c9c36ff2 tail:0x5dc end:0x6c0 dev:bond0
    
     kernel BUG at net/core/skbuff.c:193 !
    Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 0 PID: 6875 Comm: syz-executor.0 Not tainted 6.7.0-rc8-syzkaller-00101-g0802e17d9aca-dirty #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
    pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : skb_panic net/core/skbuff.c:189 [inline]
     pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
     lr : skb_panic net/core/skbuff.c:189 [inline]
     lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
    sp : ffff800096f97000
    x29: ffff800096f97010 x28: ffff80008cc8d668 x27: dfff800000000000
    x26: ffff0000cb970c90 x25: 00000000000005dc x24: ffff0000c9c36ff2
    x23: ffff0000c9c37000 x22: 00000000000005ea x21: 00000000000006c0
    x20: 000000000000000e x19: ffff800088baa334 x18: 1fffe000368261ce
    x17: ffff80008e4ed000 x16: ffff80008a8310f8 x15: 0000000000000001
    x14: 1ffff00012df2d58 x13: 0000000000000000 x12: 0000000000000000
    x11: 0000000000000001 x10: 0000000000ff0100 x9 : e28a51f1087e8400
    x8 : e28a51f1087e8400 x7 : ffff80008028f8d0 x6 : 0000000000000000
    x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff800082b78714
    x2 : 0000000000000001 x1 : 0000000100000000 x0 : 0000000000000089
    Call trace:
      skb_panic net/core/skbuff.c:189 [inline]
      skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
      skb_push+0xf0/0x108 net/core/skbuff.c:2451
      eth_header+0x44/0x1f8 net/ethernet/eth.c:83
      dev_hard_header include/linux/netdevice.h:3188 [inline]
      llc_mac_hdr_init+0x110/0x17c net/llc/llc_output.c:33
      llc_sap_action_send_xid_c+0x170/0x344 net/llc/llc_s_ac.c:85
      llc_exec_sap_trans_actions net/llc/llc_sap.c:153 [inline]
      llc_sap_next_state net/llc/llc_sap.c:182 [inline]
      llc_sap_state_process+0x1ec/0x774 net/llc/llc_sap.c:209
      llc_build_and_send_xid_pkt+0x12c/0x1c0 net/llc/llc_sap.c:270
      llc_ui_sendmsg+0x7bc/0xb1c net/llc/af_llc.c:997
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg net/socket.c:745 [inline]
      sock_sendmsg+0x194/0x274 net/socket.c:767
      splice_to_socket+0x7cc/0xd58 fs/splice.c:881
      do_splice_from fs/splice.c:933 [inline]
      direct_splice_actor+0xe4/0x1c0 fs/splice.c:1142
      splice_direct_to_actor+0x2a0/0x7e4 fs/splice.c:1088
      do_splice_direct+0x20c/0x348 fs/splice.c:1194
      do_sendfile+0x4bc/0xc70 fs/read_write.c:1254
      __do_sys_sendfile64 fs/read_write.c:1322 [inline]
      __se_sys_sendfile64 fs/read_write.c:1308 [inline]
      __arm64_sys_sendfile64+0x160/0x3b4 fs/read_write.c:1308
      __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
      invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
      el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
      do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
      el0_svc+0x54/0x158 arch/arm64/kernel/entry-common.c:678
      el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696
      el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:595
    Code: aa1803e6 aa1903e7 a90023f5 94792f6a (d4210000)
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-and-tested-by: syzbot+2a7024e9502df538e8ef@syzkaller.appspotmail.com
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://lore.kernel.org/r/20240118183625.4007013-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

LoongArch/smp: Call rcutree_report_cpu_starting() earlier [+ + +]

Author: Huacai Chen <chenhuacai@kernel.org>
Date:   Wed Nov 8 14:12:15 2023 +0800

    LoongArch/smp: Call rcutree_report_cpu_starting() earlier
    
    commit a2ccf46333d7b2cf9658f0d82ac74097c1542fae upstream.
    
    rcutree_report_cpu_starting() must be called before cpu_probe() to avoid
    the following lockdep splat, which is triggered by calling __alloc_pages()
    when CONFIG_PROVE_RCU_LIST=y:
    
     =============================
     WARNING: suspicious RCU usage
     6.6.0+ #980 Not tainted
     -----------------------------
     kernel/locking/lockdep.c:3761 RCU-list traversed in non-reader section!!
     other info that might help us debug this:
     RCU used illegally from offline CPU!
     rcu_scheduler_active = 1, debug_locks = 1
     1 lock held by swapper/1/0:
      #0: 900000000c82ef98 (&pcp->lock){+.+.}-{2:2}, at: get_page_from_freelist+0x894/0x1790
     CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.6.0+ #980
     Stack : 0000000000000001 9000000004f79508 9000000004893670 9000000100310000
             90000001003137d0 0000000000000000 90000001003137d8 9000000004f79508
             0000000000000000 0000000000000001 0000000000000000 90000000048a3384
             203a656d616e2065 ca43677b3687e616 90000001002c3480 0000000000000008
             000000000000009d 0000000000000000 0000000000000001 80000000ffffe0b8
             000000000000000d 0000000000000033 0000000007ec0000 13bbf50562dad831
             9000000005140748 0000000000000000 9000000004f79508 0000000000000004
             0000000000000000 9000000005140748 90000001002bad40 0000000000000000
             90000001002ba400 0000000000000000 9000000003573ec8 0000000000000000
             00000000000000b0 0000000000000004 0000000000000000 0000000000070000
             ...
     Call Trace:
     [<9000000003573ec8>] show_stack+0x38/0x150
     [<9000000004893670>] dump_stack_lvl+0x74/0xa8
     [<900000000360d2bc>] lockdep_rcu_suspicious+0x14c/0x190
     [<900000000361235c>] __lock_acquire+0xd0c/0x2740
     [<90000000036146f4>] lock_acquire+0x104/0x2c0
     [<90000000048a955c>] _raw_spin_lock_irqsave+0x5c/0x90
     [<900000000381cd5c>] rmqueue_bulk+0x6c/0x950
     [<900000000381fc0c>] get_page_from_freelist+0xd4c/0x1790
     [<9000000003821c6c>] __alloc_pages+0x1bc/0x3e0
     [<9000000003583b40>] tlb_init+0x150/0x2a0
     [<90000000035742a0>] per_cpu_trap_init+0xf0/0x110
     [<90000000035712fc>] cpu_probe+0x3dc/0x7a0
     [<900000000357ed20>] start_secondary+0x40/0xb0
     [<9000000004897138>] smpboot_entry+0x54/0x58
    
    raw_smp_processor_id() is required in order to avoid calling into lockdep
    before RCU has declared the CPU to be watched for readers.
    
    See also commit 29368e093921 ("x86/smpboot: Move rcu_cpu_starting() earlier"),
    commit de5d9dae150c ("s390/smp: move rcu_cpu_starting() earlier") and commit
    99f070b62322 ("powerpc/smp: Call rcu_cpu_starting() earlier").
    
    Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

lsm: new security_file_ioctl_compat() hook [+ + +]

Author: Alfred Piccioni <alpic@google.com>
Date:   Tue Dec 19 10:09:09 2023 +0100

    lsm: new security_file_ioctl_compat() hook
    
    commit f1bb47a31dff6d4b34fb14e99850860ee74bb003 upstream.
    
    Some ioctl commands do not require ioctl permission, but are routed to
    other permissions such as FILE_GETATTR or FILE_SETATTR. This routing is
    done by comparing the ioctl cmd to a set of 64-bit flags (FS_IOC_*).
    
    However, if a 32-bit process is running on a 64-bit kernel, it emits
    32-bit flags (FS_IOC32_*) for certain ioctl operations. These flags are
    being checked erroneously, which leads to these ioctl operations being
    routed to the ioctl permission, rather than the correct file
    permissions.
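    
    Why the two command values differ can be sketched as follows, assuming
    the generic ioctl number layout used by most architectures. All TOY_*
    and toy_* names are hypothetical; the real hook and the SELinux
    routing are more involved than this model.

```c
#include <assert.h>
#include <stdint.h>

/* Generic ioctl number layout: dir[31:30] size[29:16] type[15:8] nr[7:0]. */
#define TOY_IOC_READ 2u
#define TOY_IOR(type, nr, size) \
    ((TOY_IOC_READ << 30) | ((uint32_t)(size) << 16) | \
     ((uint32_t)(type) << 8) | (uint32_t)(nr))

/* 64-bit caller: the argument is an 8-byte long. 32-bit caller: a 4-byte int. */
#define TOY_GETFLAGS_64 TOY_IOR('f', 1, 8)
#define TOY_GETFLAGS_32 TOY_IOR('f', 1, 4)

/* The size field is part of the number, so the two values cannot match. */
static_assert(TOY_GETFLAGS_64 != TOY_GETFLAGS_32, "native and compat differ");

enum toy_perm { TOY_PERM_GETATTR, TOY_PERM_IOCTL };

/* Routing that only knows the 64-bit value (the behaviour being fixed). */
static enum toy_perm toy_route_native(uint32_t cmd)
{
    return cmd == TOY_GETFLAGS_64 ? TOY_PERM_GETATTR : TOY_PERM_IOCTL;
}

/* Routing for the compat path: also recognise the 32-bit value. */
static enum toy_perm toy_route_compat(uint32_t cmd)
{
    if (cmd == TOY_GETFLAGS_32)
        return TOY_PERM_GETATTR;
    return toy_route_native(cmd);
}
```

    In this model a 32-bit command falls through to the generic ioctl
    permission on the native path, and reaches the file permission only
    on the compat path.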
    
    This was also noted in a RED-PEN finding from a while back -
    "/* RED-PEN how should LSM module know it's handling 32bit? */".
    
    This patch introduces a new hook, security_file_ioctl_compat(), that is
    called from the compat ioctl syscall. All current LSMs have been changed
    to support this hook.
    
    Reviewing the three places where we are currently using
    security_file_ioctl(), it appears that only SELinux needs a dedicated
    compat change; TOMOYO and SMACK appear to be functional without any
    change.
    
    Cc: stable@vger.kernel.org
    Fixes: 0b24dcb7f2f7 ("Revert "selinux: simplify ioctl checking"")
    Signed-off-by: Alfred Piccioni <alpic@google.com>
    Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
    [PM: subject tweak, line length fixes, and alignment corrections]
    Signed-off-by: Paul Moore <paul@paul-moore.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: i2c: imx290: Properly encode registers as little-endian [+ + +]

Author: Alexander Stein <alexander.stein@ew.tq-group.com>
Date:   Thu Nov 2 10:50:48 2023 +0100

    media: i2c: imx290: Properly encode registers as little-endian
    
    [ Upstream commit 60fc87a69523c294eb23a1316af922f6665a6f8c ]
    
    The conversion to CCI also converted the multi-byte register access to
    big-endian. Correct the register definition by using the correct
    little-endian ones.
    
    Fixes: af73323b9770 ("media: imx290: Convert to new CCI register access helpers")
    Cc: stable@vger.kernel.org
    Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    [Sakari Ailus: Fixed the Fixes: tag.]
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

media: i2c: st-mipid02: correct format propagation [+ + +]

Author: Alain Volmat <alain.volmat@foss.st.com>
Date:   Mon Nov 13 15:57:30 2023 +0100

    media: i2c: st-mipid02: correct format propagation
    
    commit b33cb0cbe2893b96ecbfa16254407153f4b55d16 upstream.
    
    Use a copy of the struct v4l2_subdev_format when propagating
    format from the sink to source pad in order to avoid impacting the
    sink format returned to the application.
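    
    The idea of propagating through a copy can be sketched in plain C.
    struct toy_format and both helpers are hypothetical, and the way the
    source pad rewrites the code is invented purely for illustration.

```c
#include <assert.h>

struct toy_format { unsigned int width, height, code; };

/* Hypothetical source-pad adjustment: the source pad uses a different code. */
static void toy_set_source(struct toy_format *fmt,
                           struct toy_format *source_state)
{
    assert(fmt && source_state);
    fmt->code = 0x2000 | (fmt->code & 0xff);
    *source_state = *fmt;
}

/* Propagate through a local copy so the caller's sink format is untouched. */
static void toy_set_sink(struct toy_format *sink_fmt,
                         struct toy_format *sink_state,
                         struct toy_format *source_state)
{
    struct toy_format copy = *sink_fmt;

    *sink_state = *sink_fmt;
    toy_set_source(&copy, source_state);
}
```

    Passing sink_fmt itself to toy_set_source() would hand the modified
    code back to the caller, which is the effect the fix avoids.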
    
    Thanks to Jacopo Mondi for pointing out the issue.
    
    Fixes: 6c01e6f3f27b ("media: st-mipid02: Propagate format from sink to source pad")
    Signed-off-by: Alain Volmat <alain.volmat@foss.st.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
    Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com>
    Reviewed-by: Benjamin Mugnier <benjamin.mugnier@foss.st.com>
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: imx355: Enable runtime PM before registering async sub-device [+ + +]

Author: Bingbu Cao <bingbu.cao@intel.com>
Date:   Wed Nov 22 17:46:06 2023 +0800

    media: imx355: Enable runtime PM before registering async sub-device
    
    commit efa5fe19c0a9199f49e36e1f5242ed5c88da617d upstream.
    
    The sensor device may be accessed right after its async sub-device is
    registered. For example, ipu-bridge tries to power up the sensor through
    the runtime PM of the sensor's client device from the async notifier
    callback; if runtime PM is not enabled at that point, this fails.
    
    So runtime PM should be ready before its async sub-device is registered
    and accessible by others.
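    
    The ordering requirement can be modelled in a few lines of userspace
    C. Everything here is a hypothetical stand-in: toy_pm_get() plays the
    role of a runtime-PM get, and toy_register_async() models a
    registration whose notifier powers the sensor up immediately.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_sensor {
    bool pm_enabled;
    bool powered;
};

/* Stand-in for a runtime-PM "get": fails while runtime PM is disabled. */
static int toy_pm_get(struct toy_sensor *s)
{
    if (!s->pm_enabled)
        return -1;
    s->powered = true;
    return 0;
}

/* Registration may invoke a notifier that powers the sensor up right away. */
static int toy_register_async(struct toy_sensor *s)
{
    assert(s);
    return toy_pm_get(s);
}

/* Probe order matching the fix: enable runtime PM first, then register. */
static int toy_probe_fixed(struct toy_sensor *s)
{
    s->pm_enabled = true;
    return toy_register_async(s);
}

/* Probe order before the fix: register first, enable runtime PM afterwards. */
static int toy_probe_old(struct toy_sensor *s)
{
    int ret = toy_register_async(s);

    s->pm_enabled = true;
    return ret;
}
```

    In this model only the fixed order lets the notifier power the sensor
    up; the old order reports a failure even though runtime PM is enabled
    a moment later.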
    
    Fixes: df0b5c4a7ddd ("media: add imx355 camera sensor driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: mtk-jpeg: Fix timeout schedule error in mtk_jpegdec_worker. [+ + +]

Author: Zheng Wang <zyytlz.wz@163.com>
Date:   Mon Nov 6 15:48:11 2023 +0100

    media: mtk-jpeg: Fix timeout schedule error in mtk_jpegdec_worker.
    
    commit 38e1857933def4b3fafc28cc34ff3bbc84cad2c3 upstream.
    
    In mtk_jpegdec_worker, if an error occurs in mtk_jpeg_set_dec_dst, the
    worker starts the timeout worker and also invokes v4l2_m2m_job_finish.
    This breaks the design, in which only one function should call
    v4l2_m2m_job_finish for a given job; here both the timeout handler and
    mtk_jpegdec_worker invoke it.
    
    Fix it by starting the timeout worker only if mtk_jpeg_set_dec_dst
    finished successfully.
    
    Fixes: da4ede4b7fd6 ("media: mtk-jpeg: move data/code inside CONFIG_OF blocks")
    Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
    Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: mtk-jpeg: Fix use after free bug due to error path handling in mtk_jpeg_dec_device_run [+ + +]

Author: Zheng Wang <zyytlz.wz@163.com>
Date:   Mon Nov 6 15:48:10 2023 +0100

    media: mtk-jpeg: Fix use after free bug due to error path handling in mtk_jpeg_dec_device_run
    
    commit 206c857dd17d4d026de85866f1b5f0969f2a109e upstream.
    
    In mtk_jpeg_probe, &jpeg->job_timeout_work is bound with
    mtk_jpeg_job_timeout_work.
    
    In mtk_jpeg_dec_device_run, if an error happens in
    mtk_jpeg_set_dec_dst, the function ends up starting the worker while
    also marking the job as finished by invoking v4l2_m2m_job_finish.
    
    There are two ways to trigger the bug. If we remove the module,
    mtk_jpeg_remove is called to clean up. A possible sequence is shown
    below; it causes a use-after-free bug.
    
    CPU0                  CPU1
    mtk_jpeg_dec_...    |
      start worker      |
                        |mtk_jpeg_job_timeout_work
    mtk_jpeg_remove     |
      v4l2_m2m_release  |
        kfree(m2m_dev); |
                        |
                        | v4l2_m2m_get_curr_priv
                        |   m2m_dev->curr_ctx //use
    
    If we close the file descriptor, which calls mtk_jpeg_release,
    a similar sequence occurs.
    
    Fix this bug by starting the timeout worker only if the jpegdec worker
    was started successfully. Then v4l2_m2m_job_finish will be called in
    only one of mtk_jpeg_job_timeout_work and mtk_jpeg_dec_device_run.
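    
    The "finish exactly once" rule can be sketched with a small model.
    struct toy_jpeg and the toy_* functions are hypothetical and leave out
    the real driver's interrupt handling; the assert in toy_job_finish()
    encodes the invariant that a job must not be finished twice.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_jpeg {
    bool timeout_armed;   /* timeout work has been scheduled */
    int finish_calls;     /* how many times the job was finished */
};

/* Finishing a job twice is the bug, so treat it as an invariant. */
static void toy_job_finish(struct toy_jpeg *j)
{
    assert(j->finish_calls == 0);
    j->finish_calls++;
}

/* Runs later, but only if the timeout work was scheduled. */
static void toy_timeout_work(struct toy_jpeg *j)
{
    if (j->timeout_armed) {
        j->timeout_armed = false;
        toy_job_finish(j);
    }
}

/* device_run with the fix: arm the timeout only after setup succeeded. */
static void toy_device_run(struct toy_jpeg *j, bool setup_ok)
{
    if (!setup_ok) {
        toy_job_finish(j);     /* the error path finishes the job itself */
        return;
    }
    j->timeout_armed = true;   /* hardware started; the timeout may fire */
}
```

    If the timeout were armed before the setup check, the error path and
    toy_timeout_work() would both reach toy_job_finish() and trip the
    assertion.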
    
    Fixes: b2f0d2724ba4 ("[media] vcodec: mediatek: Add Mediatek JPEG Decoder Driver")
    Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
    Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: ov01a10: Enable runtime PM before registering async sub-device [+ + +]

Author: Bingbu Cao <bingbu.cao@intel.com>
Date:   Wed Nov 22 17:46:07 2023 +0800

    media: ov01a10: Enable runtime PM before registering async sub-device
    
    commit 47a78052db51b16e8045524fbf33373b58f1323b upstream.
    
    The sensor device may be accessed right after its async sub-device is
    registered. For example, ipu-bridge tries to power up the sensor through
    the runtime PM of the sensor's client device from the async notifier
    callback; if runtime PM is not enabled at that point, this fails.
    
    So runtime PM should be ready before its async sub-device is registered
    and accessible by others.
    
    It also sets the runtime PM status to active as the sensor was turned
    on by i2c-core.
    
    Fixes: 0827b58dabff ("media: i2c: add ov01a10 image sensor driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: ov13b10: Enable runtime PM before registering async sub-device [+ + +]

Author: Bingbu Cao <bingbu.cao@intel.com>
Date:   Wed Nov 22 17:46:08 2023 +0800

    media: ov13b10: Enable runtime PM before registering async sub-device
    
    commit 7b0454cfd8edb3509619407c3b9f78a6d0dee1a5 upstream.
    
    The sensor device may be accessed right after its async sub-device is
    registered. For example, ipu-bridge tries to power up the sensor through
    the runtime PM of the sensor's client device from the async notifier
    callback; if runtime PM is not enabled at that point, this fails.
    
    So runtime PM should be ready before its async sub-device is registered
    and accessible by others.
    
    Fixes: 7ee850546822 ("media: Add sensor driver support for the ov13b10 camera.")
    Cc: stable@vger.kernel.org
    Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: ov9734: Enable runtime PM before registering async sub-device [+ + +]

Author: Bingbu Cao <bingbu.cao@intel.com>
Date:   Wed Nov 22 17:46:09 2023 +0800

    media: ov9734: Enable runtime PM before registering async sub-device
    
    commit e242e9c144050ed120cf666642ba96b7c4462a4c upstream.
    
    The sensor device may be accessed right after its async sub-device is
    registered. For example, ipu-bridge tries to power up the sensor through
    the runtime PM of the sensor's client device from the async notifier
    callback; if runtime PM is not enabled at that point, this fails.
    
    So runtime PM should be ready before its async sub-device is registered
    and accessible by others.
    
    Fixes: d3f863a63fe4 ("media: i2c: Add ov9734 image sensor driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: v4l2-cci: Add support for little-endian encoded registers [+ + +]

Author: Alexander Stein <alexander.stein@ew.tq-group.com>
Date:   Thu Nov 2 10:50:47 2023 +0100

    media: v4l2-cci: Add support for little-endian encoded registers
    
    [ Upstream commit d92e7a013ff33f4e0b31bbf768d0c85a8acefebf ]
    
    Some sensors, e.g. Sony IMX290, are using little-endian registers. Add
    support for those by encoding the endianness into Bit 20 of the register
    address.
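    
    A flag bit in a register descriptor can select the byte order as in
    the sketch below. Only "bit 20 marks little-endian" comes from the
    text above; the address and width fields, the TOY_* names and the
    register address used in the test are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: only "bit 20 = little-endian" comes from the text. */
#define TOY_REG_ADDR(reg)   ((reg) & 0xffffu)        /* bits 15..0  */
#define TOY_REG_BYTES(reg)  (((reg) >> 16) & 0xfu)   /* bits 19..16 */
#define TOY_REG_LE          (1u << 20)               /* bit 20      */
#define TOY_REG24(addr)     ((3u << 16) | (addr))
#define TOY_REG24_LE(addr)  (TOY_REG24(addr) | TOY_REG_LE)

/* Assemble a register value from raw bus bytes according to the flag. */
static uint32_t toy_decode(uint32_t reg, const uint8_t *raw)
{
    uint32_t n = TOY_REG_BYTES(reg), val = 0;

    assert(n <= 4);
    for (uint32_t i = 0; i < n; i++) {
        /* Little-endian: the last byte on the bus is the most significant. */
        uint32_t idx = (reg & TOY_REG_LE) ? n - 1 - i : i;

        val = (val << 8) | raw[idx];
    }
    return val;
}
```

    The same three bus bytes decode to different values depending on the
    flag, while the address and width fields stay unaffected.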
    
    Fixes: af73323b9770 ("media: imx290: Convert to new CCI register access helpers")
    Cc: stable@vger.kernel.org
    Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    [Sakari Ailus: Fixed commit message.]
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

media: v4l: cci: Add macros to obtain register width and address [+ + +]

Author: Sakari Ailus <sakari.ailus@linux.intel.com>
Date:   Tue Nov 7 17:42:40 2023 +0200

    media: v4l: cci: Add macros to obtain register width and address
    
    [ Upstream commit cd93cc245dfe334c38da98c14b34f9597e1b4ea6 ]
    
    Add CCI_REG_WIDTH() macro to obtain register width in bits and similarly,
    CCI_REG_WIDTH_BYTES() to obtain it in bytes.
    
    Also add CCI_REG_ADDR() macro to obtain the address of a register.
    
    Use both macros in v4l2-cci.c, too.
    
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Stable-dep-of: d92e7a013ff3 ("media: v4l2-cci: Add support for little-endian encoded registers")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

media: v4l: cci: Include linux/bits.h [+ + +]

Author: Sakari Ailus <sakari.ailus@linux.intel.com>
Date:   Tue Nov 7 10:45:30 2023 +0200

    media: v4l: cci: Include linux/bits.h
    
    [ Upstream commit eba5058633b4d11e2a4d65eae9f1fce0b96365d9 ]
    
    linux/bits.h is needed for GENMASK(). Include it.
    
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Stable-dep-of: d92e7a013ff3 ("media: v4l2-cci: Add support for little-endian encoded registers")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

media: videobuf2-dma-sg: fix vmap callback [+ + +]

Author: Michael Grzeschik <m.grzeschik@pengutronix.de>
Date:   Thu Nov 23 23:32:05 2023 +0100

    media: videobuf2-dma-sg: fix vmap callback
    
    commit 608ca5a60ee47b48fec210aeb7a795a64eb5dcee upstream.
    
    For dmabuf import users to be able to use the vaddr from another
    videobuf2-dma-sg source, the exporter needs to set a proper vaddr on
    vb2_dma_sg_dmabuf_ops_vmap callback. This patch adds vmap on map if
    buf->vaddr was not set.
    
    Cc: stable@kernel.org
    Fixes: 7938f4218168 ("dma-buf-map: Rename to iosys-map")
    Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
    Acked-by: Tomasz Figa <tfiga@chromium.org>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

memblock: fix crash when reserved memory is not added to memory [+ + +]

Author: Yajun Deng <yajun.deng@linux.dev>
Date:   Thu Jan 18 14:18:53 2024 +0800

    memblock: fix crash when reserved memory is not added to memory
    
    [ Upstream commit 6a9531c3a88096a26cf3ac582f7ec44f94a7dcb2 ]
    
    After commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()"),
    the nid of a reserved region is used by init_reserved_page() (with
    CONFIG_DEFERRED_STRUCT_PAGE_INIT=y) to access the node structure.
    In many cases the nid of the reserved memory is not set, and this causes
    a crash.
    
    When the nid of a reserved region is not set, fall back to
    early_pfn_to_nid(), so that nid of the first_online_node will be passed
    to init_reserved_page().
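    
    The fallback amounts to the pattern sketched below. TOY_NO_NODE is a
    hypothetical "nid not set" sentinel and toy_pfn_to_nid() is an
    invented lookup standing in for early_pfn_to_nid(); the node boundary
    it uses is arbitrary.

```c
#include <assert.h>

#define TOY_NO_NODE (-1)   /* hypothetical "nid not set" sentinel */

/* Invented lookup standing in for early_pfn_to_nid(). */
static int toy_pfn_to_nid(unsigned long pfn)
{
    return pfn < 0x1000 ? 0 : 1;
}

/* Use the region's nid when it is set, otherwise derive one from the PFN. */
static int toy_reserved_nid(int region_nid, unsigned long pfn)
{
    assert(region_nid >= TOY_NO_NODE);
    return region_nid == TOY_NO_NODE ? toy_pfn_to_nid(pfn) : region_nid;
}
```

    A region with an explicit nid keeps it; only regions without one go
    through the PFN-based lookup.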
    
    Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
    Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
    Link: https://lore.kernel.org/r/20240118061853.2652295-1-yajun.deng@linux.dev
    [rppt: massaged the commit message]
    Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan [+ + +]

Author: Xi Ruoyao <xry111@xry111.site>
Date:   Sat Jan 27 05:05:57 2024 +0800

    mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan
    
    commit 59be5c35850171e307ca5d3d703ee9ff4096b948 upstream.
    
    If we still own the FPU after initializing fcr31, when we are preempted
    the dirty value in the FPU will be read out and stored into fcr31,
    clobbering our setting.  This can cause an improper floating-point
    environment after execve().  For example:
    
        zsh% cat measure.c
        #include <fenv.h>
        int main() { return fetestexcept(FE_INEXACT); }
        zsh% cc measure.c -o measure -lm
        zsh% echo $((1.0/3)) # raising FE_INEXACT
        0.33333333333333331
        zsh% while ./measure; do ; done
        (stopped in seconds)
    
    Call lose_fpu(0) before setting fcr31 to prevent this.
    
    Closes: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/
    Fixes: 9b26616c8d9d ("MIPS: Respect the ISA level in FCSR handling")
    Cc: stable@vger.kernel.org
    Signed-off-by: Xi Ruoyao <xry111@xry111.site>
    Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mips: Fix max_mapnr being uninitialized on early stages [+ + +]

Author: Serge Semin <fancer.lancer@gmail.com>
Date:   Sat Dec 2 14:14:20 2023 +0300

    mips: Fix max_mapnr being uninitialized on early stages
    
    commit e1a9ae45736989c972a8d1c151bc390678ae6205 upstream.
    
    The max_mapnr variable is used in the pfn_valid() method to
    determine the upper PFN space boundary. Leaving it uninitialized
    effectively makes any PFN passed to that method invalid. That in turn
    causes occasional malfunctions in the kernel mm subsystem, even after
    max_mapnr has been properly updated. For instance, pfn_valid() is
    called in the init_unavailable_range() method in the following
    call chain on MIPS:
    setup_arch()
    +-> paging_init()
        +-> free_area_init()
            +-> memmap_init()
                +-> memmap_init_zone_range()
                    +-> init_unavailable_range()
    
    Since pfn_valid() always returns "false" value before max_mapnr is
    initialized in the mem_init() method, any flatmem page-holes will be left
    in the poisoned/uninitialized state including the IO-memory pages. Thus
    any further attempts to map/remap the IO-memory by using MMU may fail.
    In particular it happened in my case on attempt to map the SRAM region.
    The kernel bootup procedure just crashed on the unhandled unaligned access
    bug raised in the __update_cache() method:
    
    > Unhandled kernel unaligned access[#1]:
    > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc1-XXX-dirty #2056
    > ...
    > Call Trace:
    > [<8011ef9c>] __update_cache+0x88/0x1bc
    > [<80385944>] ioremap_page_range+0x110/0x2a4
    > [<80126948>] ioremap_prot+0x17c/0x1f4
    > [<80711b80>] __devm_ioremap+0x8c/0x120
    > [<80711e0c>] __devm_ioremap_resource+0xf4/0x218
    > [<808bf244>] sram_probe+0x4f4/0x930
    > [<80889d20>] platform_probe+0x68/0xec
    > ...
    
    Let's fix the problem by initializing the max_mapnr variable as soon as
    the required data is available. In particular it can be done right in the
    paging_init() method before free_area_init() is called since all the PFN
    zone boundaries have already been calculated by that time.
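    
    The effect of the late initialization can be reproduced with a
    simplified model. toy_pfn_valid() is a hypothetical reduction of a
    flatmem-style check to a single upper bound, and toy_init_range()
    stands in for an early pass that skips PFNs reported as invalid.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned long toy_max_mapnr;   /* zero until initialised */

/* Simplified flatmem-style check: a PFN is valid if below the upper bound. */
static bool toy_pfn_valid(unsigned long pfn)
{
    return pfn < toy_max_mapnr;
}

/* Count the PFNs in [start, end) that an early init pass would handle. */
static unsigned long toy_init_range(unsigned long start, unsigned long end)
{
    unsigned long pfn, done = 0;

    assert(start <= end);
    for (pfn = start; pfn < end; pfn++)
        if (toy_pfn_valid(pfn))
            done++;
    return done;
}
```

    While toy_max_mapnr is still zero the pass handles no pages at all;
    once the bound is set before the pass runs, the whole range is
    covered.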
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
    Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

MIPS: lantiq: register smp_ops on non-smp platforms [+ + +]

Author: Aleksander Jan Bajkowski <olek2@wp.pl>
Date:   Mon Jan 22 19:47:09 2024 +0100

    MIPS: lantiq: register smp_ops on non-smp platforms
    
    [ Upstream commit 4bf2a626dc4bb46f0754d8ac02ec8584ff114ad5 ]
    
    Lantiq uses a common kernel config for devices with 24Kc and 34Kc cores.
    The changes made previously to add support for interrupts on all cores
    work on 24Kc platforms with SMP disabled and 34Kc platforms with SMP
    enabled. This patch fixes boot issues on Danube (single core 24Kc) with
    SMP enabled.
    
    Fixes: 730320fd770d ("MIPS: lantiq: enable all hardware interrupts on second VPE")
    Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
    Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/rmap: fix misplaced parenthesis of a likely() [+ + +]

Author: Steven Rostedt (Google) <rostedt@goodmis.org>
Date:   Fri Dec 1 14:59:36 2023 -0500

    mm/rmap: fix misplaced parenthesis of a likely()
    
    commit f67f8d4a8c1e1ebc85a6cbdb9a7266f14863461c upstream.
    
    Running my yearly branch profiler to see where likely/unlikely annotation
    may be added or removed, I discovered this:
    
    correct incorrect  %        Function                  File              Line
     ------- ---------  -        --------                  ----              ----
           0   457918 100 page_try_dup_anon_rmap         rmap.h               264
    [..]
      458021        0   0 page_try_dup_anon_rmap         rmap.h               265
    
    I thought it was interesting that line 264 of rmap.h had a 100% incorrect
    annotation, but the line directly below it was 100% correct. Looking at the
    code:
    
            if (likely(!is_device_private_page(page) &&
                unlikely(page_needs_cow_for_dma(vma, page))))
    
    It didn't make sense. The "likely()" was around the entire if statement
    (not just the "!is_device_private_page(page)"), which also included the
    "unlikely()" portion of that if condition.
    
    If the unlikely portion is unlikely to be true, that would make the entire
    if condition unlikely to be true, so it made no sense at all to mark the
    entire if condition as likely.
    
    What is more likely to be likely is just the first part of the if statement
    before the && operation. It's likely to be a misplaced parenthesis. And
    after making the if condition broken into a likely() && unlikely(), both
    now appear to be correct!
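    
    Both placements of the parenthesis compute the same truth value; only
    the branch hints differ. The sketch below assumes GCC or Clang for
    __builtin_expect(), and the toy_* names are hypothetical stand-ins for
    the kernel macros and the condition quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the kernel's hint macros; needs GCC or Clang. */
#define toy_likely(x)   __builtin_expect(!!(x), 1)
#define toy_unlikely(x) __builtin_expect(!!(x), 0)

/* Before: one likely() wraps the whole condition. */
static bool toy_cond_before(bool is_private, bool needs_cow)
{
    return toy_likely(!is_private && toy_unlikely(needs_cow));
}

/* After: likely() covers only the first operand. */
static bool toy_cond_after(bool is_private, bool needs_cow)
{
    return toy_likely(!is_private) && toy_unlikely(needs_cow);
}

/* Both forms agree on every input, so the change affects only the hints. */
static void toy_check_equivalent(void)
{
    for (int a = 0; a < 2; a++)
        for (int b = 0; b < 2; b++)
            assert(toy_cond_before(a, b) == toy_cond_after(a, b));
}
```

    Because the result is unchanged, the fix alters what the compiler and
    the branch profiler are told, not the program's behaviour.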
    
    Link: https://lkml.kernel.org/r/20231201145936.5ddfdb50@gandalf.local.home
    Fixes: fb3d824d1a46c ("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and page_try_dup_anon_rmap()")
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/sparsemem: fix race in accessing memory_section->usage [+ + +]

Author: Charan Teja Kalla <quic_charante@quicinc.com>
Date:   Fri Oct 13 18:34:27 2023 +0530

    mm/sparsemem: fix race in accessing memory_section->usage
    
    commit 5ec8e8ea8b7783fab150cf86404fc38cb4db8800 upstream.
    
    The race below is observed on a PFN that falls into the device memory
    region, in a system memory configuration where the PFNs are laid out as
    [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL].  Since the start and end PFNs of
    the normal zone also span the device memory PFNs, compaction will try
    the device memory PFNs too, though these attempts end up as a NOP
    (because pfn_to_online_page() returns NULL for ZONE_DEVICE memory
    sections).  When, on another core, the section mappings are being
    removed for the ZONE_DEVICE region that the PFN in question belongs to
    while compaction is operating on it, the kernel crashes with
    CONFIG_SPARSEMEM_VMEMAP enabled.  The crash logs can be seen at [1].
    
    compact_zone()                  memunmap_pages
    -------------                   ---------------
    __pageblock_pfn_to_page
       ......
     (a)pfn_valid():
         valid_section()//return true
                                  (b)__remove_pages()->
                                      sparse_remove_section()->
                                        section_deactivate():
                                        [Free the array ms->usage and set
                                         ms->usage = NULL]
         pfn_section_valid()
         [Access ms->usage which
         is NULL]
    
    NOTE: From the above, the race reduces to one between
    pfn_valid()/pfn_section_valid() and the section deactivation, with
    SPARSEMEM_VMEMAP enabled.
    
    Commit b943f045a9af ("mm/sparse: fix kernel crash with
    pfn_section_valid check") tried to address the same problem by clearing
    SECTION_HAS_MEM_MAP, expecting valid_section() to return false so that
    ms->usage is not accessed.
    
    Fix this issue by the below steps:
    
    a) Clear SECTION_HAS_MEM_MAP before freeing the ->usage.
    
    b) The RCU-protected read-side critical section will either return NULL
       when SECTION_HAS_MEM_MAP is cleared or successfully access ->usage.
    
    c) Free ->usage with kfree_rcu() and set ms->usage = NULL.  No attempt
       will be made to access ->usage after this, as SECTION_HAS_MEM_MAP is
       cleared and valid_section() therefore returns false.
    
    Thanks to David/Pavan for their inputs on this patch.
    
    [1] https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com/
    
    On a Snapdragon SoC with the mentioned memory configuration of PFNs,
    [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], we see a number of issues daily
    while testing on a device farm.
    
    The log for this particular issue is below.  Though the log does not
    point directly at pfn_section_valid(){ ms->usage;}, loading the dump
    into the Lauterbach T32 tool shows that it points there.
    
    [  540.578056] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    [  540.578068] Mem abort info:
    [  540.578070]   ESR = 0x0000000096000005
    [  540.578073]   EC = 0x25: DABT (current EL), IL = 32 bits
    [  540.578077]   SET = 0, FnV = 0
    [  540.578080]   EA = 0, S1PTW = 0
    [  540.578082]   FSC = 0x05: level 1 translation fault
    [  540.578085] Data abort info:
    [  540.578086]   ISV = 0, ISS = 0x00000005
    [  540.578088]   CM = 0, WnR = 0
    [  540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
    [  540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c
    [  540.579454] lr : compact_zone+0x994/0x1058
    [  540.579460] sp : ffffffc03579b510
    [  540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27:000000000000000c
    [  540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24:ffffffc03579b640
    [  540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21:0000000000000000
    [  540.579483] x20: 0000000000235bff x19: ffffffdebf7e3940 x18:ffffffdebf66d140
    [  540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15:00000000009f4bff
    [  540.579495] x14: 0000008000000000 x13: 0000000000000000 x12:0000000000000001
    [  540.579501] x11: 0000000000000000 x10: 0000000000000000 x9 :ffffff897d2cd440
    [  540.579507] x8 : 0000000000000000 x7 : 0000000000000000 x6 :ffffffc03579b5b4
    [  540.579512] x5 : 0000000000027f25 x4 : ffffffc03579b5b8 x3 :0000000000000001
    [  540.579518] x2 : ffffffdebf7e3940 x1 : 0000000000235c00 x0 :0000000000235800
    [  540.579524] Call trace:
    [  540.579527]  __pageblock_pfn_to_page+0x6c/0x14c
    [  540.579533]  compact_zone+0x994/0x1058
    [  540.579536]  try_to_compact_pages+0x128/0x378
    [  540.579540]  __alloc_pages_direct_compact+0x80/0x2b0
    [  540.579544]  __alloc_pages_slowpath+0x5c0/0xe10
    [  540.579547]  __alloc_pages+0x250/0x2d0
    [  540.579550]  __iommu_dma_alloc_noncontiguous+0x13c/0x3fc
    [  540.579561]  iommu_dma_alloc+0xa0/0x320
    [  540.579565]  dma_alloc_attrs+0xd4/0x108
    
    [quic_charante@quicinc.com: use kfree_rcu() in place of synchronize_rcu(), per David]
      Link: https://lkml.kernel.org/r/1698403778-20938-1-git-send-email-quic_charante@quicinc.com
    Link: https://lkml.kernel.org/r/1697202267-23600-1-git-send-email-quic_charante@quicinc.com
    Fixes: f46edbd1b151 ("mm/sparsemem: add helpers track active portions of a section at boot")
    Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: migrate: fix getting incorrect page mapping during page migration [+ + +]

Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Fri Dec 15 20:07:52 2023 +0800

    mm: migrate: fix getting incorrect page mapping during page migration
    
    [ Upstream commit d1adb25df7111de83b64655a80b5a135adbded61 ]
    
    When running stress-ng testing, we found below kernel crash after a few hours:
    
    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    pc : dentry_name+0xd8/0x224
    lr : pointer+0x22c/0x370
    sp : ffff800025f134c0
    ......
    Call trace:
      dentry_name+0xd8/0x224
      pointer+0x22c/0x370
      vsnprintf+0x1ec/0x730
      vscnprintf+0x2c/0x60
      vprintk_store+0x70/0x234
      vprintk_emit+0xe0/0x24c
      vprintk_default+0x3c/0x44
      vprintk_func+0x84/0x2d0
      printk+0x64/0x88
      __dump_page+0x52c/0x530
      dump_page+0x14/0x20
      set_migratetype_isolate+0x110/0x224
      start_isolate_page_range+0xc4/0x20c
      offline_pages+0x124/0x474
      memory_block_offline+0x44/0xf4
      memory_subsys_offline+0x3c/0x70
      device_offline+0xf0/0x120
      ......
    
    After analyzing the vmcore, I found this issue is caused by page migration.
    The scenario is that, one thread is doing page migration, and we will use the
    target page's ->mapping field to save 'anon_vma' pointer between page unmap and
    page move, and now the target page is locked and refcount is 1.
    
    Concurrently, another stress-ng thread is performing memory hotplug,
    attempting to offline the target page that is being migrated. It discovers that
    the refcount of this target page is 1, preventing the offline operation, thus
    proceeding to dump the page. However, page_mapping() of the target page may
    return an incorrect file mapping to crash the system in dump_mapping(), since
    the target page->mapping only saves 'anon_vma' pointer without setting
    PAGE_MAPPING_ANON flag.
    
    There are several ways to fix this issue:
    (1) Setting the PAGE_MAPPING_ANON flag for target page's ->mapping when saving
    'anon_vma', but this can confuse PageAnon() for PFN walkers, since the target
    page has not built mappings yet.
    (2) Getting the page lock to call page_mapping() in __dump_page() to avoid crashing
    the system, however, there are still some PFN walkers that call page_mapping()
    without holding the page lock, such as compaction.
    (3) Using target page->private field to save the 'anon_vma' pointer and 2 bits
    page state, just as page->mapping records an anonymous page, which can remove
    the page_mapping() impact for PFN walkers and also seems a simple way.
    
    So I choose option 3 to fix this issue, and this can also fix other potential
    issues for PFN walkers, such as compaction.
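
    Option 3 amounts to packing a pointer and two state bits into a single
    word. A minimal userspace sketch follows; the macro, struct and function
    names are hypothetical and are not claimed to match the kernel code. It
    assumes the saved pointer is at least 4-byte aligned so that its two low
    bits are free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names modelled on the description above. */
#define OLD_STATE_WAS_MAPPED   0x1UL
#define OLD_STATE_WAS_MLOCKED  0x2UL
#define OLD_STATE_MASK         (OLD_STATE_WAS_MAPPED | OLD_STATE_WAS_MLOCKED)

struct anon_vma_model { int dummy; };   /* stand-in, at least 4-byte aligned */

struct page_model {
    void *mapping;      /* left untouched between unmap and move */
    uintptr_t private;  /* carries pointer plus two state bits */
};

/* Save the pointer and the old page state in ->private only. */
static void migrate_record(struct page_model *dst,
                           struct anon_vma_model *av, unsigned long state)
{
    assert(((uintptr_t)av & OLD_STATE_MASK) == 0);  /* low bits must be free */
    assert((state & ~OLD_STATE_MASK) == 0);
    dst->private = (uintptr_t)av | state;
}

/* Recover both values and clear ->private again. */
static void migrate_get(struct page_model *dst,
                        struct anon_vma_model **av, unsigned long *state)
{
    *av = (struct anon_vma_model *)(dst->private & ~(uintptr_t)OLD_STATE_MASK);
    *state = dst->private & OLD_STATE_MASK;
    dst->private = 0;
}
```

    Because ->mapping is never written in this scheme, a walker that reads
    it concurrently cannot mistake the saved pointer for a file mapping.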
    
    Link: https://lkml.kernel.org/r/e60b17a88afc38cb32f84c3e30837ec70b343d2b.1702641709.git.baolin.wang@linux.alibaba.com
    Fixes: 64c8902ed441 ("migrate_pages: split unmap_and_move() to _unmap() and _move()")
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Xu Yu <xuyu@linux.alibaba.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm: migrate: record the mlocked page status to remove unnecessary lru drain [+ + +]

Author: Baolin Wang <baolin.wang@linux.alibaba.com>
Date:   Sat Oct 21 12:33:22 2023 +0800

    mm: migrate: record the mlocked page status to remove unnecessary lru drain
    
    [ Upstream commit eebb3dabbb5cc590afe32880b5d3726d0fbf88db ]
    
    When doing compaction, I found the lru_add_drain() is an obvious hotspot
    when migrating pages. The distribution of this hotspot is as follows:
       - 18.75% compact_zone
          - 17.39% migrate_pages
             - 13.79% migrate_pages_batch
                - 11.66% migrate_folio_move
                   - 7.02% lru_add_drain
                      + 7.02% lru_add_drain_cpu
                   + 3.00% move_to_new_folio
                     1.23% rmap_walk
                + 1.92% migrate_folio_unmap
             + 3.20% migrate_pages_sync
          + 0.90% isolate_migratepages
    
    The lru_add_drain() was added by commit c3096e6782b7 ("mm/migrate:
    __unmap_and_move() push good newpage to LRU") to drain the newpage to LRU
    immediately, to help to build up the correct newpage->mlock_count in
    remove_migration_ptes() for mlocked pages.  However, if no mlocked
    pages are being migrated, we can avoid this lru drain operation,
    especially in heavily concurrent scenarios.
    
    So we can record the source pages' mlocked status in
    migrate_folio_unmap(), and only drain the lru list when the mlocked status
    is set in migrate_folio_move().
    
    In addition, the page was already isolated from the lru when migrating,
    so checking the mlocked status with folio_test_mlocked() in
    migrate_folio_unmap() is stable.
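
    The idea can be illustrated with a small userspace model that counts
    how often the drain would run. All names ending in _model, plus
    unmap_record_mlocked(), move_step() and migrate_batch(), are
    hypothetical helpers for this sketch and not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct folio_model {
    bool mlocked;
};

static unsigned int drain_calls;    /* counts simulated lru_add_drain() calls */

static void lru_add_drain_model(void)
{
    drain_calls++;
}

/* Unmap step: record the source folio's mlocked status. */
static bool unmap_record_mlocked(const struct folio_model *src)
{
    return src->mlocked;
}

/* Move step: drain only when the recorded status says it is needed. */
static void move_step(bool was_mlocked)
{
    if (was_mlocked)
        lru_add_drain_model();
}

/* Migrate n folios and return how many drains were issued. */
static unsigned int migrate_batch(const struct folio_model *folios, size_t n)
{
    assert(folios != NULL || n == 0);
    drain_calls = 0;
    for (size_t i = 0; i < n; i++)
        move_step(unmap_record_mlocked(&folios[i]));
    return drain_calls;
}
```

    In this model a batch without mlocked folios issues no drain at all,
    while each mlocked folio still triggers one.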
    
    After this patch, I can see the hotspot of the lru_add_drain() is gone:
       - 9.41% migrate_pages_batch
          - 6.15% migrate_folio_move
             - 3.64% move_to_new_folio
                + 1.80% migrate_folio_extra
                + 1.70% buffer_migrate_folio
             + 1.41% rmap_walk
             + 0.62% folio_add_lru
          + 3.07% migrate_folio_unmap
    
    Meanwhile, the compaction latency shows some improvements when running
    thpscale:
                                base                   patched
    Amean     fault-both-1      1131.22 (   0.00%)     1112.55 *   1.65%*
    Amean     fault-both-3      2489.75 (   0.00%)     2324.15 *   6.65%*
    Amean     fault-both-5      3257.37 (   0.00%)     3183.18 *   2.28%*
    Amean     fault-both-7      4257.99 (   0.00%)     4079.04 *   4.20%*
    Amean     fault-both-12     6614.02 (   0.00%)     6075.60 *   8.14%*
    Amean     fault-both-18    10607.78 (   0.00%)     8978.86 *  15.36%*
    Amean     fault-both-24    14911.65 (   0.00%)    11619.55 *  22.08%*
    Amean     fault-both-30    14954.67 (   0.00%)    14925.66 *   0.19%*
    Amean     fault-both-32    16654.87 (   0.00%)    15580.31 *   6.45%*
    
    Link: https://lkml.kernel.org/r/06e9153a7a4850352ec36602df3a3a844de45698.1697859741.git.baolin.wang@linux.alibaba.com
    Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
    Reviewed-by: Zi Yan <ziy@nvidia.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Yin Fengwei <fengwei.yin@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: d1adb25df711 ("mm: migrate: fix getting incorrect page mapping during page migration")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm: page_alloc: unreserve highatomic page blocks before oom [+ + +]

Author: Charan Teja Kalla <quic_charante@quicinc.com>
Date:   Fri Nov 24 16:27:25 2023 +0530

    mm: page_alloc: unreserve highatomic page blocks before oom
    
    commit ac3f3b0a55518056bc80ed32a41931c99e1f7d81 upstream.
    
    __alloc_pages_direct_reclaim() is called from slowpath allocation, where
    high atomic reserves can be unreserved after reclaim has made progress
    and yet no suitable page is found.  Later, should_reclaim_retry() gets
    called from slow path allocation to decide whether the reclaim needs to
    be retried before the OOM kill path is taken.
    
    should_reclaim_retry() checks the available (reclaimable + free pages)
    memory against the min wmark levels of a zone and returns:
    
    a) true, if it is above the min wmark, so that slow path allocation will
       do the reclaim retries.
    
    b) false, so that slow path allocation takes the OOM kill path.
    
    should_reclaim_retry() can also unreserve the high atomic reserves **but
    only after all the reclaim retries are exhausted.**
    
    In a case where there is almost no reclaimable memory and the free pages
    consist mostly of high atomic reserves that the allocation context
    cannot use, the available memory falls below the min wmark levels.
    should_reclaim_retry() therefore returns false, leading the allocation
    request to take the OOM kill path.  This can turn into an early OOM kill
    if the high atomic reserves are holding a lot of free memory and no
    attempt is made to unreserve them.
    
    An (early) OOM is encountered on a VM with the below state:
    [  295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB
    high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB
    active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB
    present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB
    local_pcp:492kB free_cma:0kB
    [  295.998656] lowmem_reserve[]: 0 32
    [  295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH)
    33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
    0*4096kB = 7752kB
    
    Per the above log, the ~7MB of free memory held in the high atomic
    reserves is not freed up before falling back to the OOM kill path.
    
    Fix it by trying to unreserve the high atomic reserves in
    should_reclaim_retry() before __alloc_pages_direct_reclaim() can fall
    back to the OOM kill path.
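
    The resulting decision can be modelled in a few lines of userspace C.
    This is a sketch under assumptions: struct zone_model and the *_model
    functions are hypothetical, the watermark check is reduced to a single
    comparison, and page counts in any example are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct zone_model {
    unsigned long free_pages;           /* includes high atomic reserves */
    unsigned long reclaimable_pages;
    unsigned long highatomic_reserved;  /* not usable by this allocation */
    unsigned long min_wmark;
};

/* Release the reserve; returns true if anything was unreserved. */
static bool unreserve_highatomic_model(struct zone_model *z)
{
    if (z->highatomic_reserved == 0)
        return false;
    z->highatomic_reserved = 0;
    return true;
}

/*
 * Returns true if the allocation should retry, false if it should go
 * to the OOM kill path. The reserve is released before giving up.
 */
static bool should_reclaim_retry_model(struct zone_model *z)
{
    assert(z != NULL);
    unsigned long usable_free = z->free_pages > z->highatomic_reserved ?
                                z->free_pages - z->highatomic_reserved : 0;
    bool ret = usable_free + z->reclaimable_pages > z->min_wmark;

    if (!ret)
        ret = unreserve_highatomic_model(z);    /* last resort before OOM */
    return ret;
}
```

    With a zone whose free pages are almost entirely reserved, the first
    call in this model releases the reserve and asks for a retry instead of
    reporting that OOM should be taken.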
    
    Link: https://lkml.kernel.org/r/1700823445-27531-1-git-send-email-quic_charante@quicinc.com
    Fixes: 0aaa29a56e4f ("mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand")
    Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
    Reported-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
    Suggested-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Cc: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: core: Use mrq.sbc in close-ended ffu [+ + +]

Author: Avri Altman <avri.altman@wdc.com>
Date:   Wed Nov 29 11:25:35 2023 +0200

    mmc: core: Use mrq.sbc in close-ended ffu
    
    commit 4d0c8d0aef6355660b6775d57ccd5d4ea2e15802 upstream.
    
    Field Firmware Update (ffu) may use a close-ended or an open-ended
    sequence. Each such sequence consists of write commands enclosed between
    2 switch commands - to and from ffu mode. So for the close-ended case, it
    will be: cmd6->cmd23->cmd25->cmd6.
    
    Some host controllers, however, get confused when multi-block rw is sent
    without sbc, and may generate auto-cmd12, which breaks the ffu sequence.
    I encountered this issue while testing fwupd (github.com/fwupd/fwupd)
    on HP Chromebook x2, a Qualcomm-based QC-7c, code name - strongbad.
    
    Instead of a quirk, or hooking the request function of the msm ops,
    it would be better to fix the ioctl handling and make it use mrq.sbc
    instead of issuing SET_BLOCK_COUNT separately.
    
    Signed-off-by: Avri Altman <avri.altman@wdc.com>
    Acked-by: Adrian Hunter <adrian.hunter@intel.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20231129092535.3278-1-avri.altman@wdc.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: mmc_spi: remove custom DMA mapped buffers [+ + +]

Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Fri Dec 8 00:19:01 2023 +0200

    mmc: mmc_spi: remove custom DMA mapped buffers
    
    commit 84a6be7db9050dd2601c9870f65eab9a665d2d5d upstream.
    
    There is no need to duplicate what SPI core or individual controller
    drivers already do, i.e. mapping the buffers for DMA capable transfers.
    
    Note, that the code, besides its redundancy, was buggy: strictly speaking
    there is no guarantee, while it's true for those which can use this code
    (see below), that the SPI host controller _is_ the device which does DMA.
    
    Also see the Link tags below.
    
    Additional notes. Currently only two SPI host controller drivers may use
    premapped (by the user) DMA buffers:
    
      - drivers/spi/spi-au1550.c
    
      - drivers/spi/spi-fsl-spi.c
    
    Both of them have DMA mapping support code. I don't expect that SPI host
    controller code is worse than what has been done in mmc_spi. Hence I do
    not expect any regressions here. Otherwise, I'm pretty much sure these
    regressions have to be fixed in the respective drivers, and not here.
    
    That said, remove all related pieces of DMA mapping code from mmc_spi.
    
    Link: https://lore.kernel.org/linux-mmc/c73b9ba9-1699-2aff-e2fd-b4b4f292a3ca@raspberrypi.org/
    Link: https://stackoverflow.com/questions/67620728/mmc-spi-issue-not-able-to-setup-mmc-sd-card-in-linux
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20231207221901.3259962-1-andriy.shevchenko@linux.intel.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: maps: vmu-flash: Fix the (mtd core) switch to ref counters [+ + +]

Author: Miquel Raynal <miquel.raynal@bootlin.com>
Date:   Tue Dec 5 08:59:36 2023 +0100

    mtd: maps: vmu-flash: Fix the (mtd core) switch to ref counters
    
    commit a7d84a2e7663bbe12394cc771107e04668ea313a upstream.
    
    While switching to ref counters to track mtd device use, the vmu-flash
    driver was forgotten. The reason for reading the ref counter seems
    debatable, but let's just fix the build for now.
    
    Fixes: 19bfa9ebebb5 ("mtd: use refcount to prevent corruption")
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202312022315.79twVRZw-lkp@intel.com/
    Cc: stable@vger.kernel.org
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Link: https://lore.kernel.org/linux-mtd/20231205075936.13831-1-miquel.raynal@bootlin.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: rawnand: Clarify conditions to enable continuous reads [+ + +]

Author: Miquel Raynal <miquel.raynal@bootlin.com>
Date:   Fri Dec 15 13:32:08 2023 +0100

    mtd: rawnand: Clarify conditions to enable continuous reads
    
    commit 828f6df1bcba7f64729166efc7086ea657070445 upstream.
    
    The current logic is probably fine but is a bit convoluted. Plus, we
    don't want partial pages to be part of the sequential operation just in
    case the core would optimize the page read with a subpage read (which
    would break the sequence). This may happen on the first and last page
    only, so if the start offset or the end offset is not aligned with a
    page boundary, better avoid them to prevent any risk.
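
    The alignment rule can be sketched as a helper that trims partial
    pages off both ends of a read. This is an illustrative model only:
    cont_read_window() and its struct are hypothetical, and requiring at
    least two full pages before enabling the sequence is an assumption of
    this sketch rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>

struct cont_read_window {
    bool enabled;
    unsigned int first_page;    /* first fully read page */
    unsigned int last_page;     /* last fully read page */
};

/*
 * page:      index of the page where the read starts
 * col:       byte offset of the read inside that page
 * readlen:   number of bytes to read
 * page_size: page size in bytes
 */
static struct cont_read_window cont_read_window(unsigned int page,
                                                unsigned int col,
                                                unsigned int readlen,
                                                unsigned int page_size)
{
    struct cont_read_window w = { false, 0, 0 };

    assert(page_size > 0 && col < page_size);

    unsigned int end = col + readlen;               /* bytes from start of 'page' */
    unsigned int first = page + (col ? 1 : 0);      /* skip a partial first page */
    unsigned int full_end = page + end / page_size; /* one past last full page */

    if (full_end < first + 2)   /* fewer than two full pages: not worth it */
        return w;

    w.enabled = true;
    w.first_page = first;
    w.last_page = full_end - 1;
    return w;
}
```

    A read starting mid-page therefore begins its sequential window on the
    following page, and a read ending mid-page stops one page early.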
    
    Cc: stable@vger.kernel.org
    Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads")
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Tested-by: Martin Hundebøll <martin@geanix.com>
    Link: https://lore.kernel.org/linux-mtd/20231215123208.516590-5-miquel.raynal@bootlin.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: rawnand: Fix core interference with sequential reads [+ + +]

Author: Miquel Raynal <miquel.raynal@bootlin.com>
Date:   Fri Dec 15 13:32:06 2023 +0100

    mtd: rawnand: Fix core interference with sequential reads
    
    commit 7c9414c870c027737d0f2ed7b0ed10f26edb1c61 upstream.
    
    A couple of reports pointed at some strange failures happening a bit
    randomly since the introduction of sequential page reads support. After
    investigation it turned out the most likely reason for these issues was
    the fact that sometimes a (longer) read might happen, starting at the
    same page that was read previously. This is optimized by the raw NAND
    core, by not sending the READ_PAGE command to the NAND device and just
    reading out the data in a local cache. When this page is also flagged as
    being the starting point for a sequential read, it means the page right
    next will be accessed without the right instructions. The NAND chip will
    be confused and will not output correct data. To prevent this situation
    from happening anymore, we can handle this case with a bit of
    additional logic that postpones the initialization of the read
    sequence by one page.
    
    Reported-by: Alexander Shiyan <eagle.alexander923@gmail.com>
    Closes: https://lore.kernel.org/linux-mtd/CAP1tNvS=NVAm-vfvYWbc3k9Cx9YxMc2uZZkmXk8h1NhGX877Zg@mail.gmail.com/
    Reported-by: Måns Rullgård <mans@mansr.com>
    Closes: https://lore.kernel.org/linux-mtd/yw1xfs6j4k6q.fsf@mansr.com/
    Reported-by: Martin Hundebøll <martin@geanix.com>
    Closes: https://lore.kernel.org/linux-mtd/9d0c42fcde79bfedfe5b05d6a4e9fdef71d3dd52.camel@geanix.com/
    Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads")
    Cc: stable@vger.kernel.org
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Tested-by: Martin Hundebøll <martin@geanix.com>
    Link: https://lore.kernel.org/linux-mtd/20231215123208.516590-3-miquel.raynal@bootlin.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: rawnand: Prevent crossing LUN boundaries during sequential reads [+ + +]

Author: Miquel Raynal <miquel.raynal@bootlin.com>
Date:   Fri Dec 15 13:32:05 2023 +0100

    mtd: rawnand: Prevent crossing LUN boundaries during sequential reads
    
    commit bbcd80f53a5e8c27c2511f539fec8c373f500cf4 upstream.
    
    The ONFI specification states that devices do not need to support
    sequential reads across LUN boundaries. In order to prevent such an
    event from happening and possibly failing, let's introduce the concept
    of a "pause" in the sequential read to handle these cases. The first/last
    pages remain the same, but any time we cross a LUN boundary we will end
    and restart (if relevant) the sequential read operation.
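
    One way to picture the "pause" is as splitting a page range into
    segments that each stay inside one LUN. The sketch below is a
    hypothetical model: next_segment(), count_segments() and the
    pages_per_lun parameter are names chosen for illustration and are not
    taken from the raw NAND core.

```c
#include <assert.h>

struct seq_segment {
    unsigned int first_page;
    unsigned int last_page;
};

/* First segment of [first, last] that does not cross a LUN boundary. */
static struct seq_segment next_segment(unsigned int first, unsigned int last,
                                       unsigned int pages_per_lun)
{
    assert(pages_per_lun > 0 && first <= last);

    /* Last page of the LUN that contains 'first'. */
    unsigned int lun_last = (first / pages_per_lun) * pages_per_lun +
                            pages_per_lun - 1;
    struct seq_segment s = { first, last < lun_last ? last : lun_last };
    return s;
}

/* Number of sequential read operations needed to cover [first, last]. */
static unsigned int count_segments(unsigned int first, unsigned int last,
                                   unsigned int pages_per_lun)
{
    unsigned int n = 0;

    for (;;) {
        struct seq_segment s = next_segment(first, last, pages_per_lun);
        n++;
        if (s.last_page == last)
            break;
        first = s.last_page + 1;    /* restart right after the pause */
    }
    return n;
}
```

    A range that fits inside one LUN yields a single segment, so the common
    case is unaffected by the split.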
    
    Cc: stable@vger.kernel.org
    Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads")
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Tested-by: Martin Hundebøll <martin@geanix.com>
    Link: https://lore.kernel.org/linux-mtd/20231215123208.516590-2-miquel.raynal@bootlin.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: rawnand: Prevent sequential reads with on-die ECC engines [+ + +]

Author: Miquel Raynal <miquel.raynal@bootlin.com>
Date:   Fri Dec 15 13:32:07 2023 +0100

    mtd: rawnand: Prevent sequential reads with on-die ECC engines
    
    commit a62c4597953fe54c6af04166a5e2872efd0e1490 upstream.
    
    Some devices support sequential reads when using the on-die ECC engines,
    some others do not. It is a bit hard to know which ones will break other
    than experimentally, so in order to avoid such a difficult and painful
    task, let's just pretend all devices should avoid using this
    optimization when configured like this.
    
    Cc: stable@vger.kernel.org
    Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads")
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Tested-by: Martin Hundebøll <martin@geanix.com>
    Link: https://lore.kernel.org/linux-mtd/20231215123208.516590-4-miquel.raynal@bootlin.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nbd: always initialize struct msghdr completely [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Jan 12 13:26:57 2024 +0000

    nbd: always initialize struct msghdr completely
    
    commit 78fbb92af27d0982634116c7a31065f24d092826 upstream.
    
    syzbot complains that msg->msg_get_inq value can be uninitialized [1]
    
    struct msghdr got many new fields recently; we should always make
    sure their values are zero by default.
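
    The general pattern is to zero-initialize the whole structure and then
    set only the fields that are needed. The sketch below uses the
    userspace struct msghdr from <sys/socket.h> as a stand-in; the kernel
    structure has different fields (for example msg_get_inq), and
    make_msg() is a hypothetical helper, not the nbd code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Build a message header whose unused fields are guaranteed to be zero. */
static struct msghdr make_msg(struct iovec *iov, size_t iovlen)
{
    struct msghdr msg = { 0 };  /* every member starts out zeroed */

    assert(iov != NULL || iovlen == 0);
    msg.msg_iov = iov;
    msg.msg_iovlen = iovlen;
    return msg;
}
```

    Fields added to the structure later are then zero by default, without
    every caller having to be updated.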
    
    [1]
     BUG: KMSAN: uninit-value in tcp_recvmsg+0x686/0xac0 net/ipv4/tcp.c:2571
      tcp_recvmsg+0x686/0xac0 net/ipv4/tcp.c:2571
      inet_recvmsg+0x131/0x580 net/ipv4/af_inet.c:879
      sock_recvmsg_nosec net/socket.c:1044 [inline]
      sock_recvmsg+0x12b/0x1e0 net/socket.c:1066
      __sock_xmit+0x236/0x5c0 drivers/block/nbd.c:538
      nbd_read_reply drivers/block/nbd.c:732 [inline]
      recv_work+0x262/0x3100 drivers/block/nbd.c:863
      process_one_work kernel/workqueue.c:2627 [inline]
      process_scheduled_works+0x104e/0x1e70 kernel/workqueue.c:2700
      worker_thread+0xf45/0x1490 kernel/workqueue.c:2781
      kthread+0x3ed/0x540 kernel/kthread.c:388
      ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
      ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
    
    Local variable msg created at:
      __sock_xmit+0x4c/0x5c0 drivers/block/nbd.c:513
      nbd_read_reply drivers/block/nbd.c:732 [inline]
      recv_work+0x262/0x3100 drivers/block/nbd.c:863
    
    CPU: 1 PID: 7465 Comm: kworker/u5:1 Not tainted 6.7.0-rc7-syzkaller-00041-gf016f7547aee #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
    Workqueue: nbd5-recv recv_work
    
    Fixes: f94fd25cb0aa ("tcp: pass back data left in socket after receive")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: stable@vger.kernel.org
    Cc: Josef Bacik <josef@toxicpanda.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: linux-block@vger.kernel.org
    Cc: nbd@other.debian.org
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240112132657.647112-1-edumazet@google.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set [+ + +]

Author: Martin KaFai Lau <martin.lau@kernel.org>
Date:   Fri Oct 13 11:57:02 2023 -0700

    net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set
    
    commit 9c1292eca243821249fa99f40175b0660d9329e3 upstream.
    
    It was reported that there is a compiler warning on the unused variable
    "sin_addr_len" in af_inet.c when CONFIG_CGROUP_BPF is not set.
    This patch addresses it in the same way as the ipv6 counterpart
    in inet6_getname(): it uses "return sin_addr_len;"
    instead of "return sizeof(*sin);".
    
    Fixes: fefba7d1ae19 ("bpf: Propagate modified uaddrlen from cgroup sockaddr programs")
    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://lore.kernel.org/bpf/20231013185702.3993710-1-martin.lau@linux.dev
    Closes: https://lore.kernel.org/bpf/20231013114007.2fb09691@canb.auug.org.au/
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/mlx5: Bridge, Enable mcast in smfs steering mode [+ + +]

Author: Erez Shitrit <erezsh@nvidia.com>
Date:   Mon Aug 28 14:20:00 2023 +0300

    net/mlx5: Bridge, Enable mcast in smfs steering mode
    
    [ Upstream commit 653b7eb9d74426397c95061fd57da3063625af65 ]
    
    In order to have mcast offloads, the driver needs the following:
    it should know whether the mcast comes from the wire port, and the flow
    should not be marked with any specific source. That way the driver has
    the flexibility not to depend on the way the iterator is implemented in
    the FW.
    
    Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
    Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
    Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Stable-dep-of: ec7cc38ef9f8 ("net/mlx5: Bridge, fix multicast packets sent to uplink")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5: Bridge, fix multicast packets sent to uplink [+ + +]

Author: Moshe Shemesh <moshe@nvidia.com>
Date:   Sat Dec 30 22:40:37 2023 +0200

    net/mlx5: Bridge, fix multicast packets sent to uplink
    
    [ Upstream commit ec7cc38ef9f83553102e84c82536971a81630739 ]
    
    To enable multicast packets which are offloaded in bridge multicast
    offload mode to be sent also to uplink, FTE bit uplink_hairpin_en should
    be set. Add this bit to FTE for the bridge multicast offload rules.
    
    Fixes: 18c2916cee12 ("net/mlx5: Bridge, snoop igmp/mld packets")
    Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
    Reviewed-by: Gal Pressman <gal@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5: DR, Can't go to uplink vport on RX rule [+ + +]

Author: Yevgeny Kliteynik <kliteyn@nvidia.com>
Date:   Sun Dec 17 13:20:36 2023 +0200

    net/mlx5: DR, Can't go to uplink vport on RX rule
    
    [ Upstream commit 5b2a2523eeea5f03d39a9d1ff1bad2e9f8eb98d2 ]
    
    Go-To-Vport action on RX is not allowed when the vport is uplink.
    In such case, the packet should be dropped.
    
    Fixes: 9db810ed2d37 ("net/mlx5: DR, Expose steering action functionality")
    Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
    Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5: DR, Use the right GVMI number for drop action [+ + +]

Author: Yevgeny Kliteynik <kliteyn@nvidia.com>
Date:   Sun Dec 17 11:24:08 2023 +0200

    net/mlx5: DR, Use the right GVMI number for drop action
    
    [ Upstream commit 5665954293f13642f9c052ead83c1e9d8cff186f ]
    
    When FW provides ICM addresses for drop RX/TX, the provided capability
    is 64 bits that contain its GVMI as well as the ICM address itself.
    In case of TX DROP this GVMI is different from the GVMI that the
    domain is operating on.
    
    This patch fixes the action to use these GVMI IDs, as provided by FW.
    
    Fixes: 9db810ed2d37 ("net/mlx5: DR, Expose steering action functionality")
    Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5: Fix a WARN upon a callback command failure [+ + +]

Author: Yishai Hadas <yishaih@nvidia.com>
Date:   Sun Dec 31 15:19:50 2023 +0200

    net/mlx5: Fix a WARN upon a callback command failure
    
    [ Upstream commit cc8091587779cfaddb6b29c9e9edb9079a282cad ]
    
    The WARN below [1] is reported once a callback command has failed.
    
    As a callback runs in interrupt context, the locking needs to use the
    IRQ save/restore variant.
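
    Why the variant matters can be shown with a toy userspace model of the
    CPU interrupt-enable state. Everything here is hypothetical: the
    lock_*() and unlock_*() helpers only mimic how the two kernel lock
    flavours treat the IRQ state and do not perform any locking.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;    /* simulated CPU interrupt-enable state */

/* Plain _irq flavour: disable on lock, unconditionally enable on unlock. */
static void lock_irq(void)   { irqs_enabled = false; }
static void unlock_irq(void) { irqs_enabled = true; }

/* irqsave flavour: remember the previous state and restore it on unlock. */
static bool lock_irqsave(void)
{
    bool flags = irqs_enabled;
    irqs_enabled = false;
    return flags;
}

static void unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}

/*
 * Run a critical section as if called from a hardirq handler, where
 * interrupts are already disabled. Returns the IRQ state right after
 * the unlock; true means interrupts were wrongly re-enabled.
 */
static bool run_in_hardirq(bool use_irqsave)
{
    irqs_enabled = false;           /* handler entry */

    if (use_irqsave) {
        bool flags = lock_irqsave();
        assert(!irqs_enabled);      /* inside the critical section */
        unlock_irqrestore(flags);
    } else {
        lock_irq();
        assert(!irqs_enabled);
        unlock_irq();
    }

    bool after = irqs_enabled;
    irqs_enabled = true;            /* handler exit */
    return after;
}
```

    In this model the plain variant leaves interrupts enabled inside the
    handler, which corresponds to the condition lockdep warns about, while
    the save/restore variant leaves the state as it found it.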
    
    [1]
    DEBUG_LOCKS_WARN_ON(lockdep_hardirq_context())
    WARNING: CPU: 15 PID: 0 at kernel/locking/lockdep.c:4353
                  lockdep_hardirqs_on_prepare+0x11b/0x180
    Modules linked in: vhost_net vhost tap mlx5_vfio_pci
    vfio_pci vfio_pci_core vfio_iommu_type1 vfio mlx5_vdpa vringh
    vhost_iotlb vdpa nfnetlink_cttimeout openvswitch nsh ip6table_mangle
    ip6table_nat ip6table_filter ip6_tables iptable_mangle
    xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink
    xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5
    auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi
    scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm
    mlx5_ib ib_uverbs ib_core fuse mlx5_core
    CPU: 15 PID: 0 Comm: swapper/15 Tainted: G        W 6.7.0-rc4+ #1587
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
    rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
    RIP: 0010:lockdep_hardirqs_on_prepare+0x11b/0x180
    Code: 00 5b c3 c3 e8 e6 0d 58 00 85 c0 74 d6 8b 15 f0 c3
          76 01 85 d2 75 cc 48 c7 c6 04 a5 3b 82 48 c7 c7 f1
          e9 39 82 e8 95 12 f9 ff <0f> 0b 5b c3 e8 bc 0d 58 00
          85 c0 74 ac 8b 3d c6 c3 76 01 85 ff 75
    RSP: 0018:ffffc900003ecd18 EFLAGS: 00010086
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
    RDX: 0000000000000000 RSI: ffff88885fbdb880 RDI: ffff88885fbdb888
    RBP: 00000000ffffff87 R08: 0000000000000000 R09: 0000000000000001
    R10: 0000000000000000 R11: 284e4f5f4e524157 R12: 00000000002c9aa1
    R13: ffff88810aace980 R14: ffff88810aace9b8 R15: 0000000000000003
    FS:  0000000000000000(0000) GS:ffff88885fbc0000(0000)
    knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f731436f4c8 CR3: 000000010aae6001 CR4: 0000000000372eb0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <IRQ>
    ? __warn+0x81/0x170
    ? lockdep_hardirqs_on_prepare+0x11b/0x180
    ? report_bug+0xf8/0x1c0
    ? handle_bug+0x3f/0x70
    ? exc_invalid_op+0x13/0x60
    ? asm_exc_invalid_op+0x16/0x20
    ? lockdep_hardirqs_on_prepare+0x11b/0x180
    ? lockdep_hardirqs_on_prepare+0x11b/0x180
    trace_hardirqs_on+0x4a/0xa0
    raw_spin_unlock_irq+0x24/0x30
    cmd_status_err+0xc0/0x1a0 [mlx5_core]
    cmd_status_err+0x1a0/0x1a0 [mlx5_core]
    mlx5_cmd_exec_cb_handler+0x24/0x40 [mlx5_core]
    mlx5_cmd_comp_handler+0x129/0x4b0 [mlx5_core]
    cmd_comp_notifier+0x1a/0x20 [mlx5_core]
    notifier_call_chain+0x3e/0xe0
    atomic_notifier_call_chain+0x5f/0x130
    mlx5_eq_async_int+0xe7/0x200 [mlx5_core]
    notifier_call_chain+0x3e/0xe0
    atomic_notifier_call_chain+0x5f/0x130
    irq_int_handler+0x11/0x20 [mlx5_core]
    __handle_irq_event_percpu+0x99/0x220
    ? tick_irq_enter+0x5d/0x80
    handle_irq_event_percpu+0xf/0x40
    handle_irq_event+0x3a/0x60
    handle_edge_irq+0xa2/0x1c0
    __common_interrupt+0x55/0x140
    common_interrupt+0x7d/0xa0
    </IRQ>
    <TASK>
    asm_common_interrupt+0x22/0x40
    RIP: 0010:default_idle+0x13/0x20
    Code: c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 72 ff
    ff ff cc cc cc cc 8b 05 ea 08 25 01 85 c0 7e 07 0f 00 2d 7f b0 26 00 fb
    f4 <fa> c3 90 66 2e 0f 1f 84 00 00 00 00 00 65 48 8b 04 25 80 d0 02 00
    RSP: 0018:ffffc9000010fec8 EFLAGS: 00000242
    RAX: 0000000000000001 RBX: 000000000000000f RCX: 4000000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff811c410c
    RBP: ffffffff829478c0 R08: 0000000000000001 R09: 0000000000000001
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    ? do_idle+0x1ec/0x210
    default_idle_call+0x6c/0x90
    do_idle+0x1ec/0x210
    cpu_startup_entry+0x26/0x30
    start_secondary+0x11b/0x150
    secondary_startup_64_no_verify+0x165/0x16b
    </TASK>
    irq event stamp: 833284
    hardirqs last  enabled at (833283): [<ffffffff811c410c>]
    do_idle+0x1ec/0x210
    hardirqs last disabled at (833284): [<ffffffff81daf9ef>]
    common_interrupt+0xf/0xa0
    softirqs last  enabled at (833224): [<ffffffff81dc199f>]
    __do_softirq+0x2bf/0x40e
    softirqs last disabled at (833177): [<ffffffff81178ddf>]
    irq_exit_rcu+0x7f/0xa0
    
    Fixes: 34f46ae0d4b3 ("net/mlx5: Add command failures data to debugfs")
    Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
    Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5: Use mlx5 device constant for selecting CQ period mode for ASO [+ + +]

Author: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Date:   Tue Nov 28 14:01:54 2023 -0800

    net/mlx5: Use mlx5 device constant for selecting CQ period mode for ASO
    
    [ Upstream commit 20cbf8cbb827094197f3b17db60d71449415db1e ]
    
    mlx5 devices have specific constants for choosing the CQ period mode. These
    constants do not have to match the constants used by the kernel software
    API for DIM period mode selection.
    
    Fixes: cdd04f4d4d71 ("net/mlx5: Add support to create SQ and CQ for ASO")
    Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
    Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: Allow software parsing when IPsec crypto is enabled [+ + +]

Author: Leon Romanovsky <leon@kernel.org>
Date:   Tue Dec 12 13:52:55 2023 +0200

    net/mlx5e: Allow software parsing when IPsec crypto is enabled
    
    [ Upstream commit 20f5468a7988dedd94a57ba8acd65ebda6a59723 ]
    
    All ConnectX devices have software parsing capability enabled, but it is
    more correct to set allow_swp only if capability exists, which for IPsec
    means that crypto offload is supported.
    
    Fixes: 2451da081a34 ("net/mlx5: Unify device IPsec capabilities check")
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: fix a double-free in arfs_create_groups [+ + +]

Author: Zhipeng Lu <alexious@zju.edu.cn>
Date:   Wed Jan 17 15:17:36 2024 +0800

    net/mlx5e: fix a double-free in arfs_create_groups
    
    [ Upstream commit 3c6d5189246f590e4e1f167991558bdb72a4738b ]
    
    When the kvzalloc() allocation of `in` fails, arfs_create_groups() frees
    ft->g and returns an error. However, arfs_create_table(), the only caller of
    arfs_create_groups(), handles this error by calling
    mlx5e_destroy_flow_table(), in which ft->g is freed again.
    
    Fixes: 1cabe6b0965e ("net/mlx5e: Create aRFS flow tables")
    Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: fix a potential double-free in fs_any_create_groups [+ + +]

Author: Dinghao Liu <dinghao.liu@zju.edu.cn>
Date:   Tue Nov 28 17:29:01 2023 +0800

    net/mlx5e: fix a potential double-free in fs_any_create_groups
    
    [ Upstream commit aef855df7e1bbd5aa4484851561211500b22707e ]
    
    When kcalloc() for ft->g succeeds but kvzalloc() for in fails,
    fs_any_create_groups() will free ft->g. However, its caller
    fs_any_create_table() will free ft->g again through calling
    mlx5e_destroy_flow_table(), which will lead to a double-free.
    Fix this by setting ft->g to NULL in fs_any_create_groups().
    
    Fixes: 0f575c20bf06 ("net/mlx5e: Introduce Flow Steering ANY API")
    Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
    Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
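    
    The double-free pattern described in this entry can be sketched in
    userspace C. This is a minimal illustration under stated assumptions, not
    the driver code: struct flow_table, create_groups() and
    destroy_flow_table() are hypothetical stand-ins, and calloc()/malloc()/
    free() stand in for kcalloc()/kvzalloc()/kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the flow table that owns the group array. */
struct flow_table {
	int *g;
};

/* Returns 0 on success, -1 on failure. fail_second_alloc simulates the
 * second allocation (the `in` buffer in the commit text) failing. */
int create_groups(struct flow_table *ft, int fail_second_alloc)
{
	void *in;

	assert(ft != NULL);
	ft->g = calloc(4, sizeof(*ft->g));
	if (!ft->g)
		return -1;

	in = fail_second_alloc ? NULL : malloc(64);
	if (!in) {
		free(ft->g);
		ft->g = NULL; /* the fix: the caller's cleanup now frees NULL */
		return -1;
	}

	free(in);
	return 0;
}

/* Caller-side cleanup, run on both the success and the error path. */
void destroy_flow_table(struct flow_table *ft)
{
	assert(ft != NULL);
	free(ft->g); /* free(NULL) is a no-op */
	ft->g = NULL;
}
```

    Without the assignment to NULL in the error path, destroy_flow_table()
    would pass the already freed pointer to free() a second time.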

net/mlx5e: Fix operation precedence bug in port timestamping napi_poll context [+ + +]

Author: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Date:   Wed Nov 22 18:32:11 2023 -0800

    net/mlx5e: Fix operation precedence bug in port timestamping napi_poll context
    
    [ Upstream commit 3876638b2c7ebb2c9d181de1191db0de8cac143a ]
    
    Indirection (*) is of lower precedence than postfix increment (++). Logic
    in napi_poll context would cause an out-of-bounds read by first incrementing
    the pointer address by byte address space and then dereferencing the value.
    Rather, the intended logic was to dereference first and then increment the
    underlying value.
    
    Fixes: 92214be5979c ("net/mlx5e: Update doorbell for port timestamping CQ before the software counter")
    Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
    Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
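    
    The precedence rule behind this fix can be shown with a small C sketch.
    The function names bump_buggy() and bump_fixed() are hypothetical and are
    not taken from the driver; the sketch only contrasts `*p++` with `(*p)++`.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy form: parsed as *(counter++). The value at the old address is read
 * and discarded, and only the local copy of the pointer is advanced, so the
 * counter itself never changes. */
void bump_buggy(unsigned char *counter)
{
	assert(counter != NULL);
	(void)*counter++;
}

/* Intended form: dereference first, then increment the pointed-to value. */
void bump_fixed(unsigned char *counter)
{
	assert(counter != NULL);
	(*counter)++;
}
```

    In the buggy form any later access through the advanced pointer refers to
    memory past the intended object, which is how an out-of-bounds read can
    arise.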

net/mlx5e: Fix peer flow lists handling [+ + +]

Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Fri Nov 10 11:10:22 2023 +0100

    net/mlx5e: Fix peer flow lists handling
    
    [ Upstream commit d76fdd31f953ac5046555171620f2562715e9b71 ]
    
    The cited change refactored mlx5e_tc_del_fdb_peer_flow() to only clear DUP
    flag when list of peer flows has become empty. However, if any concurrent
    user holds a reference to a peer flow (for example, the neighbor update
    workqueue task is updating peer flow's parent encap entry concurrently),
    then the flow will not be removed from the peer list and, consequently,
    DUP flag will remain set. Since mlx5e_tc_del_fdb_peers_flow() calls
    mlx5e_tc_del_fdb_peer_flow() for every possible peer index, the algorithm
    will try to remove the flow from eswitch instances that it has never peered
    with, causing either a NULL pointer dereference when trying to remove the
    flow from the peer list head of a peer_index that was never initialized, or
    a warning if the list debug config is enabled[0].
    
    Fix the issue by always removing the peer flow from the list even when not
    releasing the last reference to it.
    
    [0]:
    
    [ 3102.985806] ------------[ cut here ]------------
    [ 3102.986223] list_del corruption, ffff888139110698->next is NULL
    [ 3102.986757] WARNING: CPU: 2 PID: 22109 at lib/list_debug.c:53 __list_del_entry_valid_or_report+0x4f/0xc0
    [ 3102.987561] Modules linked in: act_ct nf_flow_table bonding act_tunnel_key act_mirred act_skbedit vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa openvswitch nsh xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcg
    ss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core [last unloaded: bonding]
    [ 3102.991113] CPU: 2 PID: 22109 Comm: revalidator28 Not tainted 6.6.0-rc6+ #3
    [ 3102.991695] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
    [ 3102.992605] RIP: 0010:__list_del_entry_valid_or_report+0x4f/0xc0
    [ 3102.993122] Code: 39 c2 74 56 48 8b 32 48 39 fe 75 62 48 8b 51 08 48 39 f2 75 73 b8 01 00 00 00 c3 48 89 fe 48 c7 c7 48 fd 0a 82 e8 41 0b ad ff <0f> 0b 31 c0 c3 48 89 fe 48 c7 c7 70 fd 0a 82 e8 2d 0b ad ff 0f 0b
    [ 3102.994615] RSP: 0018:ffff8881383e7710 EFLAGS: 00010286
    [ 3102.995078] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
    [ 3102.995670] RDX: 0000000000000001 RSI: ffff88885f89b640 RDI: ffff88885f89b640
    [ 3102.997188] DEL flow 00000000be367878 on port 0
    [ 3102.998594] RBP: dead000000000122 R08: 0000000000000000 R09: c0000000ffffdfff
    [ 3102.999604] R10: 0000000000000008 R11: ffff8881383e7598 R12: dead000000000100
    [ 3103.000198] R13: 0000000000000002 R14: ffff888139110000 R15: ffff888101901240
    [ 3103.000790] FS:  00007f424cde4700(0000) GS:ffff88885f880000(0000) knlGS:0000000000000000
    [ 3103.001486] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 3103.001986] CR2: 00007fd42e8dcb70 CR3: 000000011e68a003 CR4: 0000000000370ea0
    [ 3103.002596] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 3103.003190] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 3103.003787] Call Trace:
    [ 3103.004055]  <TASK>
    [ 3103.004297]  ? __warn+0x7d/0x130
    [ 3103.004623]  ? __list_del_entry_valid_or_report+0x4f/0xc0
    [ 3103.005094]  ? report_bug+0xf1/0x1c0
    [ 3103.005439]  ? console_unlock+0x4a/0xd0
    [ 3103.005806]  ? handle_bug+0x3f/0x70
    [ 3103.006149]  ? exc_invalid_op+0x13/0x60
    [ 3103.006531]  ? asm_exc_invalid_op+0x16/0x20
    [ 3103.007430]  ? __list_del_entry_valid_or_report+0x4f/0xc0
    [ 3103.007910]  mlx5e_tc_del_fdb_peers_flow+0xcf/0x240 [mlx5_core]
    [ 3103.008463]  mlx5e_tc_del_flow+0x46/0x270 [mlx5_core]
    [ 3103.008944]  mlx5e_flow_put+0x26/0x50 [mlx5_core]
    [ 3103.009401]  mlx5e_delete_flower+0x25f/0x380 [mlx5_core]
    [ 3103.009901]  tc_setup_cb_destroy+0xab/0x180
    [ 3103.010292]  fl_hw_destroy_filter+0x99/0xc0 [cls_flower]
    [ 3103.010779]  __fl_delete+0x2d4/0x2f0 [cls_flower]
    [ 3103.011207]  fl_delete+0x36/0x80 [cls_flower]
    [ 3103.011614]  tc_del_tfilter+0x56f/0x750
    [ 3103.011982]  rtnetlink_rcv_msg+0xff/0x3a0
    [ 3103.012362]  ? netlink_ack+0x1c7/0x4e0
    [ 3103.012719]  ? rtnl_calcit.isra.44+0x130/0x130
    [ 3103.013134]  netlink_rcv_skb+0x54/0x100
    [ 3103.013533]  netlink_unicast+0x1ca/0x2b0
    [ 3103.013902]  netlink_sendmsg+0x361/0x4d0
    [ 3103.014269]  __sock_sendmsg+0x38/0x60
    [ 3103.014643]  ____sys_sendmsg+0x1f2/0x200
    [ 3103.015018]  ? copy_msghdr_from_user+0x72/0xa0
    [ 3103.015265]  ___sys_sendmsg+0x87/0xd0
    [ 3103.016608]  ? copy_msghdr_from_user+0x72/0xa0
    [ 3103.017014]  ? ___sys_recvmsg+0x9b/0xd0
    [ 3103.017381]  ? ttwu_do_activate.isra.137+0x58/0x180
    [ 3103.017821]  ? wake_up_q+0x49/0x90
    [ 3103.018157]  ? futex_wake+0x137/0x160
    [ 3103.018521]  ? __sys_sendmsg+0x51/0x90
    [ 3103.018882]  __sys_sendmsg+0x51/0x90
    [ 3103.019230]  ? exit_to_user_mode_prepare+0x56/0x130
    [ 3103.019670]  do_syscall_64+0x3c/0x80
    [ 3103.020017]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
    [ 3103.020469] RIP: 0033:0x7f4254811ef4
    [ 3103.020816] Code: 89 f3 48 83 ec 10 48 89 7c 24 08 48 89 14 24 e8 42 eb ff ff 48 8b 14 24 41 89 c0 48 89 de 48 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 c7 48 89 04 24 e8 78 eb ff ff 48 8b
    [ 3103.022290] RSP: 002b:00007f424cdd9480 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
    [ 3103.022970] RAX: ffffffffffffffda RBX: 00007f424cdd9510 RCX: 00007f4254811ef4
    [ 3103.023564] RDX: 0000000000000000 RSI: 00007f424cdd9510 RDI: 0000000000000012
    [ 3103.024158] RBP: 00007f424cdda238 R08: 0000000000000000 R09: 00007f41d801a4b0
    [ 3103.024748] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000001
    [ 3103.025341] R13: 00007f424cdd9510 R14: 00007f424cdda240 R15: 00007f424cdd99a0
    [ 3103.025931]  </TASK>
    [ 3103.026182] ---[ end trace 0000000000000000 ]---
    [ 3103.027033] ------------[ cut here ]------------
    
    Fixes: 9be6c21fdcf8 ("net/mlx5e: Handle offloads flows per peer")
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Reviewed-by: Mark Bloch <mbloch@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: Ignore IPsec replay window values on sender side [+ + +]

Author: Leon Romanovsky <leon@kernel.org>
Date:   Sun Nov 26 11:08:10 2023 +0200

    net/mlx5e: Ignore IPsec replay window values on sender side
    
    [ Upstream commit 315a597f9bcfe7fe9980985031413457bee95510 ]
    
    The XFRM stack doesn't prevent users from configuring a replay window
    on the TX side, and strongswan sets replay_window to 1. This causes
    failures in the validation logic when trying to offload the SA.
    
    The replay window is not relevant on the TX side and should be ignored.
    
    Fixes: cded6d80129b ("net/mlx5e: Store replay window in XFRM attributes")
    Signed-off-by: Aya Levin <ayal@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/rds: Fix UBSAN: array-index-out-of-bounds in rds_cmsg_recv [+ + +]

Author: Sharath Srinivasan <sharath.srinivasan@oracle.com>
Date:   Fri Jan 19 17:48:39 2024 -0800

    net/rds: Fix UBSAN: array-index-out-of-bounds in rds_cmsg_recv
    
    [ Upstream commit 13e788deb7348cc88df34bed736c3b3b9927ea52 ]
    
    A syzkaller UBSAN crash occurs in rds_cmsg_recv(),
    which reads inc->i_rx_lat_trace[j + 1] with index 4 (3 + 1),
    but with array size of 4 (RDS_RX_MAX_TRACES).
    Here 'j' is assigned from rs->rs_rx_trace[i] and in turn from
    trace.rx_trace_pos[i] in rds_recv_track_latency(),
    with both arrays sized 3 (RDS_MSG_RX_DGRAM_TRACE_MAX). So fix the
    off-by-one bounds check in rds_recv_track_latency() to prevent
    a potential crash in rds_cmsg_recv().
    
    Found by syzkaller:
    =================================================================
    UBSAN: array-index-out-of-bounds in net/rds/recv.c:585:39
    index 4 is out of range for type 'u64 [4]'
    CPU: 1 PID: 8058 Comm: syz-executor228 Not tainted 6.6.0-gd2f51b3516da #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
    BIOS 1.15.0-1 04/01/2014
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:88 [inline]
     dump_stack_lvl+0x136/0x150 lib/dump_stack.c:106
     ubsan_epilogue lib/ubsan.c:217 [inline]
     __ubsan_handle_out_of_bounds+0xd5/0x130 lib/ubsan.c:348
     rds_cmsg_recv+0x60d/0x700 net/rds/recv.c:585
     rds_recvmsg+0x3fb/0x1610 net/rds/recv.c:716
     sock_recvmsg_nosec net/socket.c:1044 [inline]
     sock_recvmsg+0xe2/0x160 net/socket.c:1066
     __sys_recvfrom+0x1b6/0x2f0 net/socket.c:2246
     __do_sys_recvfrom net/socket.c:2264 [inline]
     __se_sys_recvfrom net/socket.c:2260 [inline]
     __x64_sys_recvfrom+0xe0/0x1b0 net/socket.c:2260
     do_syscall_x64 arch/x86/entry/common.c:51 [inline]
     do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
     entry_SYSCALL_64_after_hwframe+0x63/0x6b
    ==================================================================
    
    Fixes: 3289025aedc0 ("RDS: add receive message trace used by application")
    Reported-by: Chenyuan Yang <chenyuan0y@gmail.com>
    Closes: https://lore.kernel.org/linux-rdma/CALGdzuoVdq-wtQ4Az9iottBqC5cv9ZhcE5q8N7LfYFvkRsOVcw@mail.gmail.com/
    Signed-off-by: Sharath Srinivasan <sharath.srinivasan@oracle.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
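    
    The off-by-one described in this entry can be sketched as follows. The
    constants 3 and 4 come from the commit text; the macro and function names
    are hypothetical and simplified, and the sketch is not the RDS code.

```c
#include <assert.h>

#define TRACE_MAX      3               /* size of the position arrays */
#define RX_MAX_TRACES  (TRACE_MAX + 1) /* size of the latency array */

/* Buggy check: accepts pos == TRACE_MAX, so pos + 1 can reach index 4. */
int trace_pos_valid_buggy(unsigned int pos)
{
	return pos <= TRACE_MAX;
}

/* Fixed check: pos + 1 stays within a RX_MAX_TRACES-sized array. */
int trace_pos_valid(unsigned int pos)
{
	return pos < TRACE_MAX;
}

/* Reader that indexes with pos + 1, as the commit describes for 'j'. */
unsigned long long read_latency(const unsigned long long *lat,
				unsigned int pos)
{
	assert(trace_pos_valid(pos));
	return lat[pos + 1];
}
```

    Validating the position once, at the point where it is stored, keeps the
    later reader free of out-of-range indices.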

net/sched: flower: Fix chain template offload [+ + +]

Author: Ido Schimmel <idosch@nvidia.com>
Date:   Mon Jan 22 15:28:43 2024 +0200

    net/sched: flower: Fix chain template offload
    
    [ Upstream commit 32f2a0afa95fae0d1ceec2ff06e0e816939964b8 ]
    
    When a qdisc is deleted from a net device the stack instructs the
    underlying driver to remove its flow offload callback from the
    associated filter block using the 'FLOW_BLOCK_UNBIND' command. The stack
    then continues to replay the removal of the filters in the block for
    this driver by iterating over the chains in the block and invoking the
    'reoffload' operation of the classifier being used. In turn, the
    classifier in its 'reoffload' operation prepares and emits a
    'FLOW_CLS_DESTROY' command for each filter.
    
    However, the stack does not do the same for chain templates and the
    underlying driver never receives a 'FLOW_CLS_TMPLT_DESTROY' command when
    a qdisc is deleted. This results in a memory leak [1] which can be
    reproduced using [2].
    
    Fix by introducing a 'tmplt_reoffload' operation and have the stack
    invoke it with the appropriate arguments as part of the replay.
    Implement the operation in the sole classifier that supports chain
    templates (flower) by emitting the 'FLOW_CLS_TMPLT_{CREATE,DESTROY}'
    command based on whether a flow offload callback is being bound to a
    filter block or being unbound from one.
    
    As far as I can tell, the issue happens since cited commit which
    reordered tcf_block_offload_unbind() before tcf_block_flush_all_chains()
    in __tcf_block_put(). The order cannot be reversed as the filter block
    is expected to be freed after flushing all the chains.
    
    [1]
    unreferenced object 0xffff888107e28800 (size 2048):
      comm "tc", pid 1079, jiffies 4294958525 (age 3074.287s)
      hex dump (first 32 bytes):
        b1 a6 7c 11 81 88 ff ff e0 5b b3 10 81 88 ff ff  ..|......[......
        01 00 00 00 00 00 00 00 e0 aa b0 84 ff ff ff ff  ................
      backtrace:
        [<ffffffff81c06a68>] __kmem_cache_alloc_node+0x1e8/0x320
        [<ffffffff81ab374e>] __kmalloc+0x4e/0x90
        [<ffffffff832aec6d>] mlxsw_sp_acl_ruleset_get+0x34d/0x7a0
        [<ffffffff832bc195>] mlxsw_sp_flower_tmplt_create+0x145/0x180
        [<ffffffff832b2e1a>] mlxsw_sp_flow_block_cb+0x1ea/0x280
        [<ffffffff83a10613>] tc_setup_cb_call+0x183/0x340
        [<ffffffff83a9f85a>] fl_tmplt_create+0x3da/0x4c0
        [<ffffffff83a22435>] tc_ctl_chain+0xa15/0x1170
        [<ffffffff838a863c>] rtnetlink_rcv_msg+0x3cc/0xed0
        [<ffffffff83ac87f0>] netlink_rcv_skb+0x170/0x440
        [<ffffffff83ac6270>] netlink_unicast+0x540/0x820
        [<ffffffff83ac6e28>] netlink_sendmsg+0x8d8/0xda0
        [<ffffffff83793def>] ____sys_sendmsg+0x30f/0xa80
        [<ffffffff8379d29a>] ___sys_sendmsg+0x13a/0x1e0
        [<ffffffff8379d50c>] __sys_sendmsg+0x11c/0x1f0
        [<ffffffff843b9ce0>] do_syscall_64+0x40/0xe0
    unreferenced object 0xffff88816d2c0400 (size 1024):
      comm "tc", pid 1079, jiffies 4294958525 (age 3074.287s)
      hex dump (first 32 bytes):
        40 00 00 00 00 00 00 00 57 f6 38 be 00 00 00 00  @.......W.8.....
        10 04 2c 6d 81 88 ff ff 10 04 2c 6d 81 88 ff ff  ..,m......,m....
      backtrace:
        [<ffffffff81c06a68>] __kmem_cache_alloc_node+0x1e8/0x320
        [<ffffffff81ab36c1>] __kmalloc_node+0x51/0x90
        [<ffffffff81a8ed96>] kvmalloc_node+0xa6/0x1f0
        [<ffffffff82827d03>] bucket_table_alloc.isra.0+0x83/0x460
        [<ffffffff82828d2b>] rhashtable_init+0x43b/0x7c0
        [<ffffffff832aed48>] mlxsw_sp_acl_ruleset_get+0x428/0x7a0
        [<ffffffff832bc195>] mlxsw_sp_flower_tmplt_create+0x145/0x180
        [<ffffffff832b2e1a>] mlxsw_sp_flow_block_cb+0x1ea/0x280
        [<ffffffff83a10613>] tc_setup_cb_call+0x183/0x340
        [<ffffffff83a9f85a>] fl_tmplt_create+0x3da/0x4c0
        [<ffffffff83a22435>] tc_ctl_chain+0xa15/0x1170
        [<ffffffff838a863c>] rtnetlink_rcv_msg+0x3cc/0xed0
        [<ffffffff83ac87f0>] netlink_rcv_skb+0x170/0x440
        [<ffffffff83ac6270>] netlink_unicast+0x540/0x820
        [<ffffffff83ac6e28>] netlink_sendmsg+0x8d8/0xda0
        [<ffffffff83793def>] ____sys_sendmsg+0x30f/0xa80
    
    [2]
     # tc qdisc add dev swp1 clsact
     # tc chain add dev swp1 ingress proto ip chain 1 flower dst_ip 0.0.0.0/32
     # tc qdisc del dev swp1 clsact
     # devlink dev reload pci/0000:06:00.0
    
    Fixes: bbf73830cd48 ("net: sched: traverse chains in block with tcf_get_next_chain()")
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/smc: fix illegal rmb_desc access in SMC-D connection dump [+ + +]

Author: Wen Gu <guwen@linux.alibaba.com>
Date:   Thu Jan 18 12:32:10 2024 +0800

    net/smc: fix illegal rmb_desc access in SMC-D connection dump
    
    [ Upstream commit dbc153fd3c142909e564bb256da087e13fbf239c ]
    
    A crash was found when dumping SMC-D connections. It can be reproduced
    by the following steps:
    
    - run nginx/wrk test:
      smc_run nginx
      smc_run wrk -t 16 -c 1000 -d <duration> -H 'Connection: Close' <URL>
    
    - continuously dump SMC-D connections in parallel:
      watch -n 1 'smcss -D'
    
     BUG: kernel NULL pointer dereference, address: 0000000000000030
     CPU: 2 PID: 7204 Comm: smcss Kdump: loaded Tainted: G  E      6.7.0+ #55
     RIP: 0010:__smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag]
     Call Trace:
      <TASK>
      ? __die+0x24/0x70
      ? page_fault_oops+0x66/0x150
      ? exc_page_fault+0x69/0x140
      ? asm_exc_page_fault+0x26/0x30
      ? __smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag]
      ? __kmalloc_node_track_caller+0x35d/0x430
      ? __alloc_skb+0x77/0x170
      smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
      smc_diag_dump+0x26/0x60 [smc_diag]
      netlink_dump+0x19f/0x320
      __netlink_dump_start+0x1dc/0x300
      smc_diag_handler_dump+0x6a/0x80 [smc_diag]
      ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
      sock_diag_rcv_msg+0x121/0x140
      ? __pfx_sock_diag_rcv_msg+0x10/0x10
      netlink_rcv_skb+0x5a/0x110
      sock_diag_rcv+0x28/0x40
      netlink_unicast+0x22a/0x330
      netlink_sendmsg+0x1f8/0x420
      __sock_sendmsg+0xb0/0xc0
      ____sys_sendmsg+0x24e/0x300
      ? copy_msghdr_from_user+0x62/0x80
      ___sys_sendmsg+0x7c/0xd0
      ? __do_fault+0x34/0x160
      ? do_read_fault+0x5f/0x100
      ? do_fault+0xb0/0x110
      ? __handle_mm_fault+0x2b0/0x6c0
      __sys_sendmsg+0x4d/0x80
      do_syscall_64+0x69/0x180
      entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
    It is possible that the connection is in the process of being established
    when we dump it. Assume that the connection has been registered in a
    link group by smc_conn_create() but the rmb_desc has not yet been
    initialized by smc_buf_create(); dumping it then causes the illegal
    access to conn->rmb_desc. So fix it by checking before dump.
    
    Fixes: 4b1b7d3b30a6 ("net/smc: add SMC-D diag support")
    Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
    Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
    Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
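    
    The "check before dump" idea can be sketched in plain C. The structures
    and the function dump_rmb_len() are hypothetical simplifications, not the
    smc_diag code; only the field name rmb_desc is taken from the commit text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the buffer descriptor and connection. */
struct buf_desc {
	int len;
};

struct connection {
	struct buf_desc *rmb_desc; /* NULL until the buffer has been created */
};

/* Returns 1 and stores the buffer length in *out when the descriptor
 * exists; returns 0 and leaves *out untouched when the connection is
 * still being set up. */
int dump_rmb_len(const struct connection *conn, int *out)
{
	assert(conn != NULL && out != NULL);
	if (!conn->rmb_desc)
		return 0; /* not initialized yet: skip instead of dereferencing */
	*out = conn->rmb_desc->len;
	return 1;
}
```

    A dump that races with connection setup then simply omits the entry
    rather than dereferencing a NULL pointer.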

net: fec: fix the unhandled context fault from smmu [+ + +]

Author: Shenwei Wang <shenwei.wang@nxp.com>
Date:   Tue Jan 23 10:51:41 2024 -0600

    net: fec: fix the unhandled context fault from smmu
    
    [ Upstream commit 5e344807735023cd3a67c37a1852b849caa42620 ]
    
    When repeatedly changing the interface link speed using the commands below:
    
    ethtool -s eth0 speed 100 duplex full
    ethtool -s eth0 speed 1000 duplex full
    
    The following errors may sometimes be reported by the ARM SMMU driver:
    
    [ 5395.035364] fec 5b040000.ethernet eth0: Link is Down
    [ 5395.039255] arm-smmu 51400000.iommu: Unhandled context fault:
    fsr=0x402, iova=0x00000000, fsynr=0x100001, cbfrsynra=0x852, cb=2
    [ 5398.108460] fec 5b040000.ethernet eth0: Link is Up - 100Mbps/Full -
    flow control off
    
    The FEC driver was found not to stop the TX queue properly during link
    speed transitions, which results in invalid I/O virtual address
    translations from the SMMU and causes the context faults.
    
    Fixes: dbc64a8ea231 ("net: fec: move calls to quiesce/resume packet processing out of fec_restart()")
    Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
    Link: https://lore.kernel.org/r/20240123165141.2008104-1-shenwei.wang@nxp.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: fix removing a namespace with conflicting altnames [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Thu Jan 18 16:58:59 2024 -0800

    net: fix removing a namespace with conflicting altnames
    
    [ Upstream commit d09486a04f5da0a812c26217213b89a3b1acf836 ]
    
    Mark reports a BUG() when a net namespace is removed.
    
        kernel BUG at net/core/dev.c:11520!
    
    Physical interfaces moved outside of init_net get "refunded"
    to init_net when that namespace disappears. The main interface
    name may get overwritten in the process if it would have
    conflicted. We need to also discard all conflicting altnames.
    Recent fixes addressed ensuring that altnames get moved
    with the main interface, which surfaced this problem.
    
    Reported-by: Марк Коренберг <socketpair@gmail.com>
    Link: https://lore.kernel.org/all/CAEmTpZFZ4Sv3KwqFOY2WKDHeZYdi0O7N5H1nTvcGp=SAEavtDg@mail.gmail.com/
    Fixes: 7663d522099e ("net: check for altname conflicts when changing netdev's netns")
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: micrel: Fix PTP frame parsing for lan8814 [+ + +]

Author: Horatiu Vultur <horatiu.vultur@microchip.com>
Date:   Fri Jan 19 11:47:50 2024 +0100

    net: micrel: Fix PTP frame parsing for lan8814
    
    [ Upstream commit aaf632f7ab6dec57bc9329a438f94504fe8034b9 ]
    
    The HW has the capability to check, for each frame, whether it is a PTP
    frame, which domain it belongs to, which PTP frame type it is, and
    different IP addresses in the frame. If one of these checks fails, the
    frame is not timestamped. Most of these checks were disabled except the
    check of the field minorVersionPTP inside the PTP header. This means that
    once a partner sends a frame compliant with 802.1AS, which has
    minorVersionPTP set to 1, the frame was not timestamped because the HW
    expected by default a value of 0 in minorVersionPTP. This is exactly the
    same issue as on lan8841.
    Fix this issue by removing this check so the userspace can decide on this.
    
    Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy")
    Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
    Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
    Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mvpp2: clear BM pool before initialization [+ + +]

Author: Jenishkumar Maheshbhai Patel <jpatel2@marvell.com>
Date:   Thu Jan 18 19:59:14 2024 -0800

    net: mvpp2: clear BM pool before initialization
    
    [ Upstream commit 9f538b415db862e74b8c5d3abbccfc1b2b6caa38 ]
    
    Register values persist after booting the kernel using
    kexec, which results in a kernel panic. Thus clear the
    BM pool registers before initialisation to fix the issue.
    
    Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
    Signed-off-by: Jenishkumar Maheshbhai Patel <jpatel2@marvell.com>
    Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
    Link: https://lore.kernel.org/r/20240119035914.2595665-1-jpatel2@marvell.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: Prevent DSA tags from breaking COE [+ + +]

Author: Romain Gantois <romain.gantois@bootlin.com>
Date:   Tue Jan 16 13:19:17 2024 +0100

    net: stmmac: Prevent DSA tags from breaking COE
    
    [ Upstream commit c2945c435c999c63e47f337bc7c13c98c21d0bcc ]
    
    Some DSA tagging protocols change the EtherType field in the MAC header
    e.g.  DSA_TAG_PROTO_(DSA/EDSA/BRCM/MTK/RTL4C_A/SJA1105). On TX these tagged
    frames are ignored by the checksum offload engine and IP header checker of
    some stmmac cores.
    
    On RX, the stmmac driver wrongly assumes that checksums have been computed
    for these tagged packets, and sets CHECKSUM_UNNECESSARY.
    
    Add an additional check in the stmmac TX and RX hotpaths so that COE is
    deactivated for packets with ethertypes that will not trigger the COE and
    IP header checks.
    
    Fixes: 6b2c6e4a938f ("net: stmmac: propagate feature flags to vlan")
    Cc:  <stable@vger.kernel.org>
    Reported-by: Richard Tresidder <rtresidd@electromag.com.au>
    Link: https://lore.kernel.org/netdev/e5c6c75f-2dfa-4e50-a1fb-6bf4cdb617c2@electromag.com.au/
    Reported-by: Romain Gantois <romain.gantois@bootlin.com>
    Link: https://lore.kernel.org/netdev/c57283ed-6b9b-b0e6-ee12-5655c1c54495@bootlin.com/
    Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Romain Gantois <romain.gantois@bootlin.com>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
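    
    The EtherType check described in this entry can be sketched in userspace
    C. This is a simplified illustration: frame_has_ip_ethertype() is a
    hypothetical helper, VLAN headers are deliberately not handled, and the
    driver itself works on socket buffers rather than raw byte arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN     14     /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define ETHERTYPE_IPV4  0x0800
#define ETHERTYPE_IPV6  0x86DD

/* Returns true only when the EtherType field directly announces IPv4 or
 * IPv6. A frame whose EtherType was rewritten by a tagging protocol
 * returns false, so the caller should not rely on checksum offload. */
bool frame_has_ip_ethertype(const uint8_t *frame, size_t len)
{
	uint16_t ethertype;

	assert(frame != NULL);
	if (len < ETH_HDR_LEN)
		return false;

	/* The EtherType is stored big-endian at byte offsets 12 and 13. */
	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
	return ethertype == ETHERTYPE_IPV4 || ethertype == ETHERTYPE_IPV6;
}
```

    On TX a false result would mean computing the checksum in software; on RX
    it would mean not marking the packet as already verified.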

net: stmmac: Tx coe sw fallback [+ + +]

Author: Rohan G Thomas <rohan.g.thomas@intel.com>
Date:   Sat Sep 16 14:33:12 2023 +0800

    net: stmmac: Tx coe sw fallback
    
    [ Upstream commit 8452a05b2c633b708dbe3e742f71b24bf21fe42d ]
    
    Add sw fallback of tx checksum calculation for those tx queues that
    don't support tx checksum offloading. DW xGMAC IP can be synthesized
    such that it can support tx checksum offloading only for a few
    initial tx queues. Also as Serge pointed out, for the DW QoS IP, tx
    coe can be individually configured for each tx queue.
    
    So when tx coe is enabled, any tx queue that doesn't support tx coe
    (i.e. has the 'coe-unsupported' flag set) will have a sw fallback
    happen in the driver for tx checksum calculation when packets are to
    be transmitted on that tx queue.
    
    Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: c2945c435c99 ("net: stmmac: Prevent DSA tags from breaking COE")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
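    
    A per-queue software fallback of this kind can be sketched as below. The
    struct tx_queue_cfg and tx_prepare_checksum() are hypothetical names for
    illustration; inet_checksum() is a plain RFC 1071 Internet checksum and
    stands in for the kernel's own checksum helpers, which the sketch does not
    reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over a byte buffer (big-endian 16-bit words).
 * Suitable for frame-sized inputs; the 32-bit accumulator is folded at the
 * end. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(data != NULL);
	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((data[i] << 8) | data[i + 1]);
	if (len & 1)
		sum += (uint32_t)(data[len - 1] << 8); /* pad odd byte with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);    /* fold carries */
	return (uint16_t)~sum;
}

/* Hypothetical per-queue configuration flag. */
struct tx_queue_cfg {
	bool coe_unsupported;
};

/* Returns true when the hardware should insert the checksum. Otherwise the
 * checksum is computed in software, stored in *csum_out, and false is
 * returned. */
bool tx_prepare_checksum(const struct tx_queue_cfg *q, const uint8_t *payload,
			 size_t len, uint16_t *csum_out)
{
	assert(q != NULL && csum_out != NULL);
	if (q->coe_unsupported) {
		*csum_out = inet_checksum(payload, len);
		return false;
	}
	return true;
}
```

    The decision is made per packet from the queue's flag, so queues with
    hardware support keep using offload while the others fall back to
    software.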

net: stmmac: Wait a bit for the reset to take effect [+ + +]

Author: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date:   Mon Jan 22 19:19:09 2024 +0100

    net: stmmac: Wait a bit for the reset to take effect
    
    [ Upstream commit a5f5eee282a0aae80227697e1d9c811b1726d31d ]
    
    otherwise the synopsys_id value may be read out wrong,
    because the GMAC_VERSION register might still be in reset
    state, for at least 1 us after the reset is de-asserted.
    
    Add a wait for 10 us before continuing to be on the safe side.
    
    > From what have you got that delay value?
    
    Just trial and error, with very old linux versions and old gcc versions
    the synopsys_id was read out correctly most of the time (but not always),
    with recent linux versions and recent gcc versions it was read out
    wrongly most of the time, but again not always.
    I don't have access to the VHDL code in question, so I cannot
    tell why it takes so long to get the correct values, I also do not
    have more than a few hardware samples, so I cannot tell how long
    this timeout must be in worst case.
    Experimentally I can tell that the register is read several times
    as zero immediately after the reset is de-asserted, also adding several
    no-ops is not enough, adding a printk is enough, also udelay(1) seems to
    be enough, but I did not try that very often, and I do not have access to many
    hardware samples to be 100% sure about the necessary delay.
    And since the udelay here is only executed once per device instance,
    it seems acceptable to delay the boot for 10 us.
    
    BTW: my hardware's synopsys id is 0x37.
    
    Fixes: c5e4ddbdfa11 ("net: stmmac: Add support for optional reset control")
    Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
    Link: https://lore.kernel.org/r/AS8P193MB1285A810BD78C111E7F6AA34E4752@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: reject QUEUE/DROP verdict parameters [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Sat Jan 20 22:50:04 2024 +0100

    netfilter: nf_tables: reject QUEUE/DROP verdict parameters
    
    commit f342de4e2f33e0e39165d8639387aa6c19dff660 upstream.
    
    This reverts commit e0abdadcc6e1.
    
    core.c:nf_hook_slow assumes that the upper 16 bits of NF_DROP
    verdicts contain a valid errno, i.e. -EPERM, -EHOSTUNREACH or similar,
    or 0.
    
    Due to the reverted commit, it's possible to provide a positive
    value, e.g. NF_ACCEPT (1), which results in use-after-free.
    
    It's not clear to me why this commit was made.
    
    NF_QUEUE is not used by nftables; "queue" rules in nftables
    will result in use of "nft_queue" expression.
    
    If we later need to allow specifying errno values from userspace
    (do not know why), this has to call NF_DROP_GETERR and check that
    "err <= 0" holds true.
    
    Fixes: e0abdadcc6e1 ("netfilter: nf_tables: accept QUEUE/DROP verdict parameters")
    Cc: stable@vger.kernel.org
    Reported-by: Notselwyn <notselwyn@pwning.tech>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
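
The "err <= 0" rule mentioned above can be illustrated with a small user-space sketch. The `SK_*` constants and both helpers are local stand-ins that mirror the verdict layout the message describes (verdict code in the low bits, parameter in the upper 16 bits); they are assumptions of this example, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins mirroring the verdict layout described above:
 * low bits carry the verdict code, upper 16 bits carry a parameter. */
#define SK_NF_DROP	0u
#define SK_NF_ACCEPT	1u
#define SK_VERDICT_MASK	0x000000ffu

/* Decode the upper 16 bits as a signed value and negate it, which is
 * how a DROP verdict is turned back into an errno-style return. */
static int drop_geterr(uint32_t verdict)
{
	uint32_t hi = verdict >> 16;
	int32_t param = hi >= 0x8000u ? (int32_t)hi - 0x10000 : (int32_t)hi;

	return -param;
}

/* The "err <= 0" rule: a DROP verdict may only decode to 0 or to a
 * negative errno, never to a positive value such as NF_ACCEPT. */
static bool drop_verdict_ok(uint32_t verdict)
{
	if ((verdict & SK_VERDICT_MASK) != SK_NF_DROP)
		return false;
	return drop_geterr(verdict) <= 0;
}
```

With this encoding, a parameter of 0xffff decodes to +1, which a caller would read as NF_ACCEPT for a packet that has already been freed. Rejecting verdict parameters outright, as the commit does, removes that path.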

netfilter: nf_tables: restrict anonymous set and map names to 16 bytes [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Fri Jan 19 13:34:32 2024 +0100

    netfilter: nf_tables: restrict anonymous set and map names to 16 bytes
    
    [ Upstream commit b462579b2b86a8f5230543cadd3a4836be27baf7 ]
    
    nftables has two types of sets/maps, one where userspace defines the
    name, and anonymous sets/maps, where userspace defines a template name.
    
    For the latter, kernel requires presence of exactly one "%d".
    nftables uses "__set%d" and "__map%d" for this.  The kernel will
    expand the format specifier and replace it with the smallest unused
    number.
    
    As-is, userspace could define a template name that lets the set
    name grow past the 256-byte upper limit (post-expansion).
    
    I don't see how this could be a problem, but I would prefer if userspace
    cannot do this, so add a limit of 16 bytes for the '%d' template name.
    
    16 bytes is the old total upper limit for set names that existed when
    nf_tables was merged initially.
    
    Fixes: 387454901bd6 ("netfilter: nf_tables: Allow set names of up to 255 chars")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
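
A minimal sketch of the template rules described above, written as a user-space validator. `anon_template_ok()` and `ANON_NAME_MAX` are hypothetical names, and counting the terminating NUL inside the 16-byte limit is an assumption of this example.

```c
#include <stdbool.h>
#include <string.h>

#define ANON_NAME_MAX 16	/* limit named in the commit message */

/* Hypothetical validator: the template must fit in ANON_NAME_MAX
 * bytes (terminator included, an assumption of this sketch) and
 * contain exactly one '%', which must introduce "%d". */
static bool anon_template_ok(const char *name)
{
	const char *p;

	if (strlen(name) >= ANON_NAME_MAX)
		return false;
	p = strchr(name, '%');
	if (p == NULL || p[1] != 'd')
		return false;
	return strchr(p + 2, '%') == NULL;	/* no second specifier */
}
```

The templates nftables uses, "__set%d" and "__map%d", pass this check with room to spare, so the limit only affects hand-crafted names.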

netfilter: nf_tables: validate NFPROTO_* family [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Tue Jan 23 16:38:25 2024 +0100

    netfilter: nf_tables: validate NFPROTO_* family
    
    [ Upstream commit d0009effa8862c20a13af4cb7475d9771b905693 ]
    
    Several expressions explicitly refer to NF_INET_* hook definitions
    from expr->ops->validate, however, family is not validated.
    
    Bail out with EOPNOTSUPP in case they are used from unsupported
    families.
    
    Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
    Fixes: a3c90f7a2323 ("netfilter: nf_tables: flow offload expression")
    Fixes: 2fa841938c64 ("netfilter: nf_tables: introduce routing expression")
    Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching")
    Fixes: ad49d86e07a4 ("netfilter: nf_tables: Add synproxy support")
    Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support")
    Fixes: 6c47260250fc ("netfilter: nf_tables: add xfrm expression")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Thu Jan 18 10:56:26 2024 +0100

    netfilter: nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain
    
    commit 01acb2e8666a6529697141a6017edbf206921913 upstream.
    
    Remove netdevice from inet/ingress basechain in case NETDEV_UNREGISTER
    event is reported, otherwise a stale reference to netdevice remains in
    the hook list.
    
    Fixes: 60a3815da702 ("netfilter: add inet ingress support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nft_limit: reject configurations that cause integer overflow [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Fri Jan 19 13:11:32 2024 +0100

    netfilter: nft_limit: reject configurations that cause integer overflow
    
    [ Upstream commit c9d9eb9c53d37cdebbad56b91e40baf42d5a97aa ]
    
    Reject bogus configs where internal token counter wraps around.
    This only occurs with very very large requests, such as 17gbyte/s.
    
    It's better to reject this rather than having an incorrect ratelimit.
    
    Fixes: d2168e849ebf ("netfilter: nft_limit: add per-byte limiting")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
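
The kind of wraparound check described above can be sketched as follows. The formula `nsecs * (rate + burst)` and the name `limit_tokens()` are assumptions made for illustration; the commit message only states that the internal token counter must not wrap.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: compute nsecs * (rate + burst) and refuse the
 * configuration if either step would wrap a 64-bit counter. */
static int limit_tokens(uint64_t nsecs, uint64_t rate, uint64_t burst,
			uint64_t *tokens)
{
	uint64_t total;

	if (burst > UINT64_MAX - rate)
		return -ERANGE;		/* rate + burst would wrap */
	total = rate + burst;
	if (total != 0 && nsecs > UINT64_MAX / total)
		return -ERANGE;		/* nsecs * total would wrap */
	*tokens = nsecs * total;
	return 0;
}
```

With a one-second unit expressed in nanoseconds, a rate of a few tens of gigabytes per second is already enough to exceed 64 bits, which matches the order of magnitude quoted in the message.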

netfs, fscache: Prevent Oops in fscache_put_cache() [+ + +]

Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Fri Jan 12 09:59:41 2024 +0300

    netfs, fscache: Prevent Oops in fscache_put_cache()
    
    [ Upstream commit 3be0b3ed1d76c6703b9ee482b55f7e01c369cc68 ]
    
    This function dereferences "cache" and then checks if it's
    IS_ERR_OR_NULL().  Check first, then dereference.
    
    Fixes: 9549332df4ed ("fscache: Implement cache registration")
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/e84bc740-3502-4f16-982a-a40d5676615c@moroto.mountain/ # v2
    Signed-off-by: Sasha Levin <sashal@kernel.org>
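
The "check first, then dereference" ordering can be shown with user-space stand-ins. `sk_is_err_or_null()`, `struct sk_cache` and `sk_put_cache()` are hypothetical; they imitate the kernel's error-pointer convention and are not the fscache code itself.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's error-pointer convention:
 * the top 4095 addresses encode negative errno values. */
#define SK_MAX_ERRNO 4095

static int sk_is_err_or_null(const void *p)
{
	return p == NULL || (uintptr_t)p >= (uintptr_t)-SK_MAX_ERRNO;
}

struct sk_cache {		/* hypothetical, reduced to one field */
	int ref;
};

/* Validate the pointer before reading through it. Returns the new
 * reference count, or -1 when there is nothing to put. */
static int sk_put_cache(struct sk_cache *cache)
{
	if (sk_is_err_or_null(cache))
		return -1;
	return --cache->ref;
}
```

Reading any field of `cache` above the check would fault for both NULL and encoded-errno pointers, which is the Oops the commit prevents.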

netlink: fix potential sleeping issue in mqueue_flush_file [+ + +]

Author: Zhengchao Shao <shaozhengchao@huawei.com>
Date:   Mon Jan 22 09:18:07 2024 +0800

    netlink: fix potential sleeping issue in mqueue_flush_file
    
    [ Upstream commit 234ec0b6034b16869d45128b8cd2dc6ffe596f04 ]
    
    I analyze the potential sleeping issue of the following processes:
    Thread A                                Thread B
    ...                                     netlink_create  //ref = 1
    do_mq_notify                            ...
      sock = netlink_getsockbyfilp          ...     //ref = 2
      info->notify_sock = sock;             ...
    ...                                     netlink_sendmsg
    ...                                       skb = netlink_alloc_large_skb  //skb->head is vmalloced
    ...                                       netlink_unicast
    ...                                         sk = netlink_getsockbyportid //ref = 3
    ...                                         netlink_sendskb
    ...                                           __netlink_sendskb
    ...                                             skb_queue_tail //put skb to sk_receive_queue
    ...                                         sock_put //ref = 2
    ...                                     ...
    ...                                     netlink_release
    ...                                       deferred_put_nlk_sk //ref = 1
    mqueue_flush_file
      spin_lock
      remove_notification
        netlink_sendskb
          sock_put  //ref = 0
            sk_free
              ...
              __sk_destruct
                netlink_sock_destruct
                  skb_queue_purge  //get skb from sk_receive_queue
                    ...
                    __skb_queue_purge_reason
                      kfree_skb_reason
                        __kfree_skb
                        ...
                        skb_release_all
                          skb_release_head_state
                            netlink_skb_destructor
                              vfree(skb->head)  //sleeping while holding spinlock
    
    In netlink_sendmsg, if the memory pointed to by skb->head is allocated by
    vmalloc and the skb is put on the sk_receive_queue queue without being
    freed, the sleeping bug will occur when the mqueue executes flush. Use
    vfree_atomic instead of vfree in netlink_skb_destructor to solve the issue.
    
    Fixes: c05cdb1b864f ("netlink: allow large data transfers from user-space")
    Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
    Link: https://lore.kernel.org/r/20240122011807.2110357-1-shaozhengchao@huawei.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: fix RELEASE_LOCKOWNER [+ + +]

Author: NeilBrown <neilb@suse.de>
Date:   Mon Jan 22 14:58:16 2024 +1100

    nfsd: fix RELEASE_LOCKOWNER
    
    commit edcf9725150e42beeca42d085149f4c88fa97afd upstream.
    
    The test on so_count in nfsd4_release_lockowner() is nonsense and
    harmful.  Revert to using check_for_locks(), changing that to not sleep.
    
    First: harmful.
    As is documented in the kdoc comment for nfsd4_release_lockowner(), the
    test on so_count can transiently return a false positive resulting in a
    return of NFS4ERR_LOCKS_HELD when in fact no locks are held.  This is
    clearly a protocol violation and with the Linux NFS client it can cause
    incorrect behaviour.
    
    If RELEASE_LOCKOWNER is sent while some other thread is still
    processing a LOCK request which failed because, at the time that request
    was received, the given owner held a conflicting lock, then the nfsd
    thread processing that LOCK request can hold a reference (conflock) to
    the lock owner that causes nfsd4_release_lockowner() to return an
    incorrect error.
    
    The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it
    never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so
    it knows that the error is impossible.  It assumes the lock owner was in
    fact released so it feels free to use the same lock owner identifier in
    some later locking request.
    
    When it does reuse a lock owner identifier for which a previous RELEASE
    failed, it will naturally use a lock_seqid of zero.  However the server,
    which didn't release the lock owner, will expect a larger lock_seqid and
    so will respond with NFS4ERR_BAD_SEQID.
    
    So clearly it is harmful to allow a false positive, which testing
    so_count allows.
    
    The test is nonsense because ... well... it doesn't mean anything.
    
    so_count is the sum of three different counts.
    1/ the set of states listed on so_stateids
    2/ the set of active vfs locks owned by any of those states
    3/ various transient counts such as for conflicting locks.
    
    When it is tested against '2' it is clear that one of these is the
    transient reference obtained by find_lockowner_str_locked().  It is not
    clear what the other one is expected to be.
    
    In practice, the count is often 2 because there is precisely one state
    on so_stateids.  If there were more, this would fail.
    
    In my testing I see two circumstances when RELEASE_LOCKOWNER is called.
    In one case, CLOSE is called before RELEASE_LOCKOWNER.  That results in
    all the lock states being removed, and so the lockowner being discarded
    (it is removed when there are no more references which usually happens
    when the lock state is discarded).  When nfsd4_release_lockowner() finds
    that the lock owner doesn't exist, it returns success.
    
    The other case shows an so_count of '2' and precisely one state listed
    in so_stateid.  It appears that the Linux client uses a separate lock
    owner for each file resulting in one lock state per lock owner, so this
    test on '2' is safe.  For another client it might not be safe.
    
    So this patch changes check_for_locks() to use the (newish)
    find_any_file_locked() so that it doesn't take a reference on the
    nfs4_file and so never calls nfsd_file_put(), and so never sleeps.  With
    this check it is safe to restore the use of check_for_locks() rather
    than testing so_count against the mysterious '2'.
    
    Fixes: ce3c4ad7f4ce ("NFSD: Fix possible sleep during nfsd4_release_lockowner()")
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Cc: stable@vger.kernel.org # v6.2+
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nouveau/vmm: don't set addr on the fail path to avoid warning [+ + +]

Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Jan 18 06:19:57 2024 +1000

    nouveau/vmm: don't set addr on the fail path to avoid warning
    
    commit cacea81390fd8c8c85404e5eb2adeb83d87a912e upstream.
    
    nvif_vmm_put gets called if addr is set, but if the allocation
    fails we don't need to call put, otherwise we get a warning like
    
    [523232.435671] ------------[ cut here ]------------
    [523232.435674] WARNING: CPU: 8 PID: 1505697 at drivers/gpu/drm/nouveau/nvif/vmm.c:68 nvif_vmm_put+0x72/0x80 [nouveau]
    [523232.435795] Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep sunrpc binfmt_misc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common iwlmvm nfit libnvdimm vfat fat x86_pkg_temp_thermal intel_powerclamp mac80211 snd_soc_avs snd_soc_hda_codec coretemp snd_hda_ext_core snd_soc_core snd_hda_codec_realtek kvm_intel snd_hda_codec_hdmi snd_compress snd_hda_codec_generic ac97_bus snd_pcm_dmaengine snd_hda_intel libarc4 snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm iwlwifi snd_hda_core btusb snd_hwdep btrtl snd_seq btintel irqbypass btbcm rapl snd_seq_device eeepc_wmi btmtk intel_cstate iTCO_wdt cfg80211 snd_pcm asus_wmi bluetooth intel_pmc_bxt iTCO_vendor_support snd_timer ledtrig_audio pktcdvd snd mei_me
    [523232.435828]  sparse_keymap intel_uncore i2c_i801 platform_profile wmi_bmof mei pcspkr ioatdma soundcore i2c_smbus rfkill idma64 dca joydev acpi_tad loop zram nouveau drm_ttm_helper ttm video drm_exec drm_gpuvm gpu_sched crct10dif_pclmul i2c_algo_bit nvme crc32_pclmul crc32c_intel drm_display_helper polyval_clmulni nvme_core polyval_generic e1000e mxm_wmi cec ghash_clmulni_intel r8169 sha512_ssse3 nvme_common wmi pinctrl_sunrisepoint uas usb_storage ip6_tables ip_tables fuse
    [523232.435849] CPU: 8 PID: 1505697 Comm: gnome-shell Tainted: G        W          6.6.0-rc7-nvk-uapi+ #12
    [523232.435851] Hardware name: System manufacturer System Product Name/ROG STRIX X299-E GAMING II, BIOS 1301 09/24/2021
    [523232.435852] RIP: 0010:nvif_vmm_put+0x72/0x80 [nouveau]
    [523232.435934] Code: 00 00 48 89 e2 be 02 00 00 00 48 c7 04 24 00 00 00 00 48 89 44 24 08 e8 fc bf ff ff 85 c0 75 0a 48 c7 43 08 00 00 00 00 eb b3 <0f> 0b eb f2 e8 f5 c9 b2 e6 0f 1f 44 00 00 90 90 90 90 90 90 90 90
    [523232.435936] RSP: 0018:ffffc900077ffbd8 EFLAGS: 00010282
    [523232.435937] RAX: 00000000fffffffe RBX: ffffc900077ffc00 RCX: 0000000000000010
    [523232.435938] RDX: 0000000000000010 RSI: ffffc900077ffb38 RDI: ffffc900077ffbd8
    [523232.435940] RBP: ffff888e1c4f2140 R08: 0000000000000000 R09: 0000000000000000
    [523232.435940] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888503811800
    [523232.435941] R13: ffffc900077ffca0 R14: ffff888e1c4f2140 R15: ffff88810317e1e0
    [523232.435942] FS:  00007f933a769640(0000) GS:ffff88905fa00000(0000) knlGS:0000000000000000
    [523232.435943] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [523232.435944] CR2: 00007f930bef7000 CR3: 00000005d0322001 CR4: 00000000003706e0
    [523232.435945] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [523232.435946] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [523232.435964] Call Trace:
    [523232.435965]  <TASK>
    [523232.435966]  ? nvif_vmm_put+0x72/0x80 [nouveau]
    [523232.436051]  ? __warn+0x81/0x130
    [523232.436055]  ? nvif_vmm_put+0x72/0x80 [nouveau]
    [523232.436138]  ? report_bug+0x171/0x1a0
    [523232.436142]  ? handle_bug+0x3c/0x80
    [523232.436144]  ? exc_invalid_op+0x17/0x70
    [523232.436145]  ? asm_exc_invalid_op+0x1a/0x20
    [523232.436149]  ? nvif_vmm_put+0x72/0x80 [nouveau]
    [523232.436230]  ? nvif_vmm_put+0x64/0x80 [nouveau]
    [523232.436342]  nouveau_vma_del+0x80/0xd0 [nouveau]
    [523232.436506]  nouveau_vma_new+0x1a0/0x210 [nouveau]
    [523232.436671]  nouveau_gem_object_open+0x1d0/0x1f0 [nouveau]
    [523232.436835]  drm_gem_handle_create_tail+0xd1/0x180
    [523232.436840]  drm_prime_fd_to_handle_ioctl+0x12e/0x200
    [523232.436844]  ? __pfx_drm_prime_fd_to_handle_ioctl+0x10/0x10
    [523232.436847]  drm_ioctl_kernel+0xd3/0x180
    [523232.436849]  drm_ioctl+0x26d/0x4b0
    [523232.436851]  ? __pfx_drm_prime_fd_to_handle_ioctl+0x10/0x10
    [523232.436855]  nouveau_drm_ioctl+0x5a/0xb0 [nouveau]
    [523232.437032]  __x64_sys_ioctl+0x94/0xd0
    [523232.437036]  do_syscall_64+0x5d/0x90
    [523232.437040]  ? syscall_exit_to_user_mode+0x2b/0x40
    [523232.437044]  ? do_syscall_64+0x6c/0x90
    [523232.437046]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    
    Reported-by: Faith Ekstrand <faith.ekstrand@collabora.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240117213852.295565-1-airlied@gmail.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

OPP: Pass rounded rate to _set_opp() [+ + +]

Author: Viresh Kumar <viresh.kumar@linaro.org>
Date:   Fri Jan 5 13:55:37 2024 +0530

    OPP: Pass rounded rate to _set_opp()
    
    commit 7269c250db1b89cda72ca419b7bd5e37997309d6 upstream.
    
    The OPP core finds the eventual frequency to set with the help of
    clk_round_rate() and the same was earlier getting passed to _set_opp()
    and that's what would get configured.
    
    The commit 1efae8d2e777 ("OPP: Make dev_pm_opp_set_opp() independent of
    frequency") mistakenly changed that. Fix it.
    
    Fixes: 1efae8d2e777 ("OPP: Make dev_pm_opp_set_opp() independent of frequency")
    Cc: v5.18+ <stable@vger.kernel.org> # v6.0+
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

parisc/firmware: Fix F-extend for PDC addresses [+ + +]

Author: Helge Deller <deller@gmx.de>
Date:   Wed Jan 3 21:02:16 2024 +0100

    parisc/firmware: Fix F-extend for PDC addresses
    
    commit 735ae74f73e55c191d48689bd11ff4a06ea0508f upstream.
    
    When running with narrow firmware (64-bit kernel using a 32-bit
    firmware), extend PDC addresses into the 0xfffffff0.00000000
    region instead of the 0xf0f0f0f0.00000000 region.
    
    This fixes the power button on the C3700 machine in qemu (64-bit CPU
    with 32-bit firmware), and my assumption is that the previous code was
    really never used (because most 64-bit machines have a 64-bit firmware),
    or it just worked on very old machines because they may only decode
    40-bit of virtual addresses.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Helge Deller <deller@gmx.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
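
As a rough illustration of the address extension described above, the sketch below fills the upper word of a 32-bit firmware address so that it lands in the 0xfffffff0.00000000 region. `f_extend_narrow()` is a hypothetical name, and the conditions under which an address qualifies for extension are deliberately not modelled.

```c
#include <stdint.h>

/* Hypothetical helper: place a 32-bit firmware address in the
 * 0xfffffff0.00000000 region by filling the upper 32 bits. Which
 * addresses qualify for extension is not modelled here. */
static uint64_t f_extend_narrow(uint32_t addr)
{
	return ((uint64_t)0xfffffff0u << 32) | addr;
}
```

For example, a 32-bit address of 0xf0400000 becomes 0xfffffff0f0400000 under this scheme.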

parisc/power: Fix power soft-off button emulation on qemu [+ + +]

Author: Helge Deller <deller@gmx.de>
Date:   Wed Jan 3 21:17:23 2024 +0100

    parisc/power: Fix power soft-off button emulation on qemu
    
    commit 6472036581f947109b20664121db1d143e916f0b upstream.
    
    Make sure to start the kthread to check the power button on qemu as
    well if the power button address was provided.
    This fixes the qemu built-in system_powerdown runtime command.
    
    Fixes: d0c219472980 ("parisc/power: Add power soft-off when running on qemu")
    Signed-off-by: Helge Deller <deller@gmx.de>
    Cc: stable@vger.kernel.org # v6.0+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pipe: wakeup wr_wait after setting max_usage [+ + +]

Author: Lukas Schauer <lukas@schauer.dev>
Date:   Fri Dec 1 11:11:28 2023 +0100

    pipe: wakeup wr_wait after setting max_usage
    
    [ Upstream commit e95aada4cb93d42e25c30a0ef9eb2923d9711d4a ]
    
    In commit c73be61cede5 ("pipe: Add general notification queue support"),
    a regression was introduced that would lock up resized pipes under
    certain conditions. See the reproducer in [1].
    
    The code resizing the pipe ring was moved to a different function;
    doing that moved the wakeup for pipe->wr_wait before actually
    raising pipe->max_usage. If a pipe was full before the resize occurred,
    the wakeup would never actually trigger pipe_write.
    
    Set @max_usage and @nr_accounted before waking writers if this isn't a
    watch queue.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=212295 [1]
    Link: https://lore.kernel.org/r/20231201-orchideen-modewelt-e009de4562c6@brauner
    Fixes: c73be61cede5 ("pipe: Add general notification queue support")
    Reviewed-by: David Howells <dhowells@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Lukas Schauer <lukas@schauer.dev>
    [Christian Brauner <brauner@kernel.org>: rewrite to account for watch queues]
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/x86: intel-uncore-freq: Fix types in sysfs callbacks [+ + +]

Author: Nathan Chancellor <nathan@kernel.org>
Date:   Thu Jan 4 15:59:03 2024 -0700

    platform/x86: intel-uncore-freq: Fix types in sysfs callbacks
    
    commit 416de0246f35f43d871a57939671fe814f4455ee upstream.
    
    When booting a kernel with CONFIG_CFI_CLANG, there is a CFI failure when
    accessing any of the values under
    /sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00:
    
      $ cat /sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/max_freq_khz
      fish: Job 1, 'cat /sys/devices/system/cpu/int…' terminated by signal SIGSEGV (Address boundary error)
    
      $ sudo dmesg &| grep 'CFI failure'
      [  170.953925] CFI failure at kobj_attr_show+0x19/0x30 (target: show_max_freq_khz+0x0/0xc0 [intel_uncore_frequency_common]; expected type: 0xd34078c5
    
    The sysfs callback functions such as show_domain_id() are written as if
    they are going to be called by dev_attr_show() but as the above message
    shows, they are instead called by kobj_attr_show(). kCFI checks that the
    destination of an indirect jump has the exact same type as the prototype
    of the function pointer it is called through and fails when they do not.
    
    These callbacks are called through kobj_attr_show() because
    uncore_root_kobj was initialized with kobject_create_and_add(), which
    means uncore_root_kobj has a ->sysfs_ops of kobj_sysfs_ops from
    kobject_create(), which uses kobj_attr_show() as its ->show() value.
    
    The only reason there has not been a more noticeable problem until this
    point is that 'struct kobj_attribute' and 'struct device_attribute' have
    the same layout, so getting the callback from container_of() works the
    same with either value.
    
    Change all the callbacks and their uses to be compatible with
    kobj_attr_show() and kobj_attr_store(), which resolves the kCFI failure
    and allows the sysfs files to work properly.
    
    Closes: https://github.com/ClangBuiltLinux/linux/issues/1974
    Fixes: ae7b2ce57851 ("platform/x86/intel/uncore-freq: Use sysfs API to create attributes")
    Cc: stable@vger.kernel.org
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
    Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Link: https://lore.kernel.org/r/20240104-intel-uncore-freq-kcfi-fix-v1-1-bf1e8939af40@kernel.org
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
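
The prototype matching that kCFI enforces can be sketched with user-space stand-ins for the types named above. Everything here is a simplified assumption: `struct uncore_sketch`, `sk_container_of()` and `sk_kobj_attr_show()` are hypothetical, `long` stands in for the kernel's return type, and the 32-byte buffer bound is chosen only for the example.

```c
#include <stddef.h>
#include <stdio.h>

/* Minimal user-space stand-ins for the kernel types named above. */
struct kobject { const char *name; };
struct kobj_attribute {
	const char *name;
	long (*show)(struct kobject *kobj, struct kobj_attribute *attr,
		     char *buf);
};

#define sk_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct uncore_sketch {			/* hypothetical driver data */
	unsigned int max_freq_khz;
	struct kobj_attribute max_freq_attr;
};

/* The callback's prototype matches the ->show pointer type exactly,
 * which is the property kCFI checks at the indirect call. */
static long show_max_freq_khz(struct kobject *kobj,
			      struct kobj_attribute *attr, char *buf)
{
	struct uncore_sketch *data =
		sk_container_of(attr, struct uncore_sketch, max_freq_attr);

	(void)kobj;
	/* 32 is an assumed buffer size for this sketch only. */
	return snprintf(buf, 32, "%u\n", data->max_freq_khz);
}

/* Stand-in for the dispatcher that calls through the pointer. */
static long sk_kobj_attr_show(struct kobject *kobj,
			      struct kobj_attribute *attr, char *buf)
{
	return attr->show ? attr->show(kobj, attr, buf) : -1;
}
```

A callback written against a device-attribute prototype would still find the right data through `container_of()` because the layouts coincide, but its type would differ from the pointer it is called through, and that mismatch is what kCFI rejects.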

platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe [+ + +]

Author: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Date:   Mon Jan 8 15:20:58 2024 +0900

    platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe
    
    commit 5913320eb0b3ec88158cfcb0fa5e996bf4ef681b upstream.
    
    p2sb_bar() unhides P2SB device to get resources from the device. It
    guards the operation by locking pci_rescan_remove_lock so that parallel
    rescans do not find the P2SB device. However, this lock causes deadlock
    when PCI bus rescan is triggered by /sys/bus/pci/rescan. The rescan
    locks pci_rescan_remove_lock and probes PCI devices. When PCI devices
    call p2sb_bar() during probe, it locks pci_rescan_remove_lock again.
    Hence the deadlock.
    
    To avoid the deadlock, do not lock pci_rescan_remove_lock in p2sb_bar().
    Instead, do the lock at fs_initcall. Introduce p2sb_cache_resources()
    for fs_initcall which gets and caches the P2SB resources. At p2sb_bar(),
    refer to the cache and return to the caller.
    
    Before operating the device at P2SB DEVFN for resource cache, check
    that its device class is PCI_CLASS_MEMORY_OTHER 0x0580 that PCH
    specifications define. This avoids unexpected operation to other devices
    at the same DEVFN.
    
    Link: https://lore.kernel.org/linux-pci/6xb24fjmptxxn5js2fjrrddjae6twex5bjaftwqsuawuqqqydx@7cl3uik5ef6j/
    Fixes: 9745fb07474f ("platform/x86/intel: Add Primary to Sideband (P2SB) bridge support")
    Cc: stable@vger.kernel.org
    Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
    Link: https://lore.kernel.org/r/20240108062059.3583028-2-shinichiro.kawasaki@wdc.com
    Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Tested-by: Klara Modin <klarasmodin@gmail.com>
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

PM / devfreq: Fix buffer overflow in trans_stat_show [+ + +]

Author: Christian Marangi <ansuelsmth@gmail.com>
Date:   Tue Oct 24 20:30:15 2023 +0200

    PM / devfreq: Fix buffer overflow in trans_stat_show
    
    commit 08e23d05fa6dc4fc13da0ccf09defdd4bbc92ff4 upstream.
    
    Fix buffer overflow in trans_stat_show().
    
    Convert simple snprintf to the more secure scnprintf with size of
    PAGE_SIZE.
    
    Add condition checking if we are exceeding PAGE_SIZE and exit early from
    loop. Also add at the end a warning that we exceeded PAGE_SIZE and that
    stats is disabled.
    
    Return -EFBIG in the case where we don't have enough space to write the
    full transition table.
    
    Also document in the ABI that this function can return -EFBIG error.
    
    Link: https://lore.kernel.org/all/20231024183016.14648-2-ansuelsmth@gmail.com/
    Cc: stable@vger.kernel.org
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218041
    Fixes: e552bbaf5b98 ("PM / devfreq: Add sysfs node for representing frequency transition information.")
    Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
    Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
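
The bounded-output pattern described above can be sketched in user space. `sk_scnprintf()` is a stand-in that imitates the return convention of the kernel's scnprintf() (characters actually stored), and `show_table()` is a hypothetical reduction of a sysfs show routine; the buffer size is a parameter here instead of PAGE_SIZE.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the kernel's scnprintf(): returns the number of
 * characters actually stored, never more than size - 1. */
static int sk_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0)
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Hypothetical table dump: leave the loop as soon as the buffer is
 * full and report -EFBIG instead of writing past the end. */
static int show_table(char *buf, size_t size, const unsigned int *v,
		      size_t count)
{
	size_t len = 0, i;

	if (size == 0)
		return -EFBIG;
	for (i = 0; i < count; i++) {
		if (len >= size - 1)
			break;
		len += sk_scnprintf(buf + len, size - len, "%10u\n", v[i]);
	}
	if (len >= size - 1)
		return -EFBIG;	/* table did not fit */
	return (int)len;
}
```

Because the helper never reports more than it stored, `len` cannot run past `size - 1`, which is the property a plain snprintf() return value does not give.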

PM: hibernate: Enforce ordering during image compression/decompression [+ + +]

Author: Hongchen Zhang <zhanghongchen@loongson.cn>
Date:   Thu Nov 16 08:56:09 2023 +0800

    PM: hibernate: Enforce ordering during image compression/decompression
    
    commit 71cd7e80cfde548959952eac7063aeaea1f2e1c6 upstream.
    
    An S4 (suspend to disk) test on the LoongArch 3A6000 platform sometimes
    fails with the following error message in the dmesg log:
    
            Invalid LZO compressed length
    
    That happens because when compressing/decompressing the image, the
    synchronization between the control thread and the compress/decompress/crc
    thread is based on a relaxed ordering interface, which is unreliable, and the
    following situation may occur:
    
    CPU 0                                   CPU 1
    save_image_lzo                          lzo_compress_threadfn
                                              atomic_set(&d->stop, 1);
      atomic_read(&data[thr].stop)
      data[thr].cmp = data[thr].cmp_len;
                                              WRITE data[thr].cmp_len
    
    Then CPU0 gets a stale cmp_len and writes it to disk. During resume from S4,
    the wrong cmp_len is loaded.
    
    To maintain data consistency between the two threads, use the acquire/release
    variants of atomic set and read operations.
    
    Fixes: 081a9d043c98 ("PM / Hibernate: Improve performance of LZO/plain hibernation, checksum image")
    Cc: All applicable <stable@vger.kernel.org>
    Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
    Co-developed-by: Weihao Li <liweihao@loongson.cn>
    Signed-off-by: Weihao Li <liweihao@loongson.cn>
    [ rjw: Subject rewrite and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
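
The acquire/release pairing described above can be sketched with C11 atomics in place of the kernel's own primitives. `struct cmp_slot`, `worker_finish()` and `control_collect()` are hypothetical reductions of the worker and control threads; the example shows the ordering contract only and starts no threads.

```c
#include <stdatomic.h>
#include <stddef.h>

struct cmp_slot {		/* hypothetical, reduced worker state */
	size_t cmp_len;		/* payload written by the worker */
	atomic_int stop;	/* completion flag */
};

/* Worker side: write the payload, then set the flag with release
 * semantics so the payload write cannot be reordered after it. */
static void worker_finish(struct cmp_slot *s, size_t len)
{
	s->cmp_len = len;
	atomic_store_explicit(&s->stop, 1, memory_order_release);
}

/* Control side: an acquire load that observes the flag also makes
 * the earlier payload write visible. Returns 1 and fills *len when
 * the worker has finished, 0 otherwise. */
static int control_collect(struct cmp_slot *s, size_t *len)
{
	if (!atomic_load_explicit(&s->stop, memory_order_acquire))
		return 0;
	*len = s->cmp_len;
	return 1;
}
```

With relaxed operations on the flag, nothing prevents the control side from seeing the flag set while still reading an old `cmp_len`, which is the stale-length case shown in the diagram above.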

PM: sleep: Fix possible deadlocks in core system-wide PM code [+ + +]

Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Wed Dec 27 21:41:06 2023 +0100

    PM: sleep: Fix possible deadlocks in core system-wide PM code
    
    commit 7839d0078e0d5e6cc2fa0b0dfbee71de74f1e557 upstream.
    
    It is reported that in low-memory situations the system-wide resume core
    code deadlocks, because async_schedule_dev() executes its argument
    function synchronously if it cannot allocate memory (and not only in
    that case) and that function attempts to acquire a mutex that is already
    held.  Executing the argument function synchronously from within
    dpm_async_fn() may also be problematic for ordering reasons (it may
    cause a consumer device's resume callback to be invoked before a
    requisite supplier device's one, for example).
    
    Address this by changing the code in question to use
    async_schedule_dev_nocall() for scheduling the asynchronous
    execution of device suspend and resume functions and to directly
    run them synchronously if async_schedule_dev_nocall() returns false.
    
    Link: https://lore.kernel.org/linux-pm/ZYvjiqX6EsL15moe@perf/
    Reported-by: Youngmin Nam <youngmin.nam@samsung.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
    Tested-by: Youngmin Nam <youngmin.nam@samsung.com>
    Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
    Cc: 5.7+ <stable@vger.kernel.org> # 5.7+: 6aa09a5bccd8 async: Split async_schedule_node_domain()
    Cc: 5.7+ <stable@vger.kernel.org> # 5.7+: 7d4b5d7a37bd async: Introduce async_schedule_dev_nocall()
    Cc: 5.7+ <stable@vger.kernel.org> # 5.7+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2 [+ + +]

Author: Geoff Levand <geoff@infradead.org>
Date:   Sun Dec 24 09:52:46 2023 +0900

    powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2
    
    commit 482b718a84f08b6fc84879c3e90cc57dba11c115 upstream.
    
    Commit 8c5fa3b5c4df ("powerpc/64: Make ELFv2 the default for big-endian
    builds"), merged in Linux-6.5-rc1 changes the calling ABI in a way
    that is incompatible with the current code for the PS3's LV1 hypervisor
    calls.
    
    This change just adds the line '# CONFIG_PPC64_BIG_ENDIAN_ELF_ABI_V2 is not set'
    to the ps3_defconfig file so that the PPC64_ELF_ABI_V1 is used.
    
    Fixes run time errors like these:
    
      BUG: Kernel NULL pointer dereference at 0x00000000
      Faulting instruction address: 0xc000000000047cf0
      Oops: Kernel access of bad area, sig: 11 [#1]
      Call Trace:
      [c0000000023039e0] [c00000000100ebfc] ps3_create_spu+0xc4/0x2b0 (unreliable)
      [c000000002303ab0] [c00000000100d4c4] create_spu+0xcc/0x3c4
      [c000000002303b40] [c00000000100eae4] ps3_enumerate_spus+0xa4/0xf8
    
    Fixes: 8c5fa3b5c4df ("powerpc/64: Make ELFv2 the default for big-endian builds")
    Cc: stable@vger.kernel.org # v6.5+
    Signed-off-by: Geoff Levand <geoff@infradead.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://msgid.link/df906ac1-5f17-44b9-b0bb-7cd292a0df65@infradead.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rbd: don't move requests to the running list on errors [+ + +]

Author: Ilya Dryomov <idryomov@gmail.com>
Date:   Wed Jan 17 18:59:44 2024 +0100

    rbd: don't move requests to the running list on errors
    
    commit ded080c86b3f99683774af0441a58fc2e3d60cae upstream.
    
    The running list is supposed to contain requests that are pinning the
    exclusive lock, i.e. those that must be flushed before exclusive lock
    is released.  When wake_lock_waiters() is called to handle an error,
    requests on the acquiring list are failed with that error and no
    flushing takes place.  Briefly moving them to the running list is not
    only pointless but also harmful: if exclusive lock gets acquired
    before all of their state machines are scheduled and go through
    rbd_lock_del_request(), we trigger
    
        rbd_assert(list_empty(&rbd_dev->running_list));
    
    in rbd_try_acquire_lock().
    
    Cc: stable@vger.kernel.org
    Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code")
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rcu: Defer RCU kthreads wakeup when CPU is dying [+ + +]

Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Tue Dec 19 00:19:15 2023 +0100

    rcu: Defer RCU kthreads wakeup when CPU is dying
    
    [ Upstream commit e787644caf7628ad3269c1fbd321c3255cf51710 ]
    
    When the CPU goes idle for the last time during the CPU down hotplug
    process, RCU reports a final quiescent state for the current CPU. If
    this quiescent state propagates up to the top, some tasks may then be
    woken up to complete the grace period: the main grace period kthread
    and/or the expedited main workqueue (or kworker).
    
    If those kthreads have a SCHED_FIFO policy, the wake up can indirectly
    arm the RT bandwidth timer to the local offline CPU. Since this happens
    after hrtimers have been migrated at CPUHP_AP_HRTIMERS_DYING stage, the
    timer gets ignored. Therefore if the RCU kthreads are waiting for RT
    bandwidth to be available, they may never be actually scheduled.
    
    This triggers TREE03 rcutorture hangs:
    
             rcu: INFO: rcu_preempt self-detected stall on CPU
             rcu:     4-...!: (1 GPs behind) idle=9874/1/0x4000000000000000 softirq=0/0 fqs=20 rcuc=21071 jiffies(starved)
             rcu:     (t=21035 jiffies g=938281 q=40787 ncpus=6)
             rcu: rcu_preempt kthread starved for 20964 jiffies! g938281 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
             rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
             rcu: RCU grace-period kthread stack dump:
             task:rcu_preempt     state:R  running task     stack:14896 pid:14    tgid:14    ppid:2      flags:0x00004000
             Call Trace:
              <TASK>
              __schedule+0x2eb/0xa80
              schedule+0x1f/0x90
              schedule_timeout+0x163/0x270
              ? __pfx_process_timeout+0x10/0x10
              rcu_gp_fqs_loop+0x37c/0x5b0
              ? __pfx_rcu_gp_kthread+0x10/0x10
              rcu_gp_kthread+0x17c/0x200
              kthread+0xde/0x110
              ? __pfx_kthread+0x10/0x10
              ret_from_fork+0x2b/0x40
              ? __pfx_kthread+0x10/0x10
              ret_from_fork_asm+0x1b/0x30
              </TASK>
    
    The situation can't be solved with just unpinning the timer. The hrtimer
    infrastructure and the nohz heuristics involved in finding the best
    remote target for an unpinned timer would then also need to handle
    enqueues from an offline CPU in the most horrendous way.
    
    So fix this on the RCU side instead and defer the wake up to an online
    CPU if it's too late for the local one.
    
    Reported-by: Paul E. McKenney <paulmck@kernel.org>
    Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

rename(): fix the locking of subdirectories [+ + +]

Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sun Nov 19 20:25:58 2023 -0500

    rename(): fix the locking of subdirectories
    
    commit 22e111ed6c83dcde3037fc81176012721bc34c0b upstream.
    
            We should never lock two subdirectories without having taken
    ->s_vfs_rename_mutex; inode pointer order or not, the "order" proposed
    in 28eceeda130f "fs: Lock moved directories" is not transitive, with
    the usual consequences.
    
            The rationale for locking renamed subdirectory in all cases was
    the possibility of race between rename modifying .. in a subdirectory to
    reflect the new parent and another thread modifying the same subdirectory.
    For a lot of filesystems that's not a problem, but for some it can lead
    to trouble (e.g. the case when short directory contents is kept in the
    inode, but creating a file in it might push it across the size limit
    and copy its contents into separate data block(s)).
    
            However, we need that only in case when the parent does change -
    otherwise ->rename() doesn't need to do anything with .. entry in the
    first place.  Some instances are lazy and do a tautological update anyway,
    but it's really not hard to avoid.
    
    Amended locking rules for rename():
            find the parent(s) of source and target
            if source and target have the same parent
                    lock the common parent
            else
                    lock ->s_vfs_rename_mutex
                    lock both parents, in ancestor-first order; if neither
                    is an ancestor of another, lock the parent of source
                    first.
            find the source and target.
            if source and target have the same parent
                    if operation is an overwriting rename of a subdirectory
                            lock the target subdirectory
            else
                    if source is a subdirectory
                            lock the source
                    if target is a subdirectory
                            lock the target
            lock non-directories involved, in inode pointer order if both
            source and target are such.
    
    That way we are guaranteed that parents are locked (for obvious reasons),
    that any renamed non-directory is locked (nfsd relies upon that),
    that any victim is locked (emptiness check needs that, among other things)
    and subdirectory that changes parent is locked (needed to protect the update
    of .. entries).  We are also guaranteed that any operation locking more
    than one directory either takes ->s_vfs_rename_mutex or locks a parent
    followed by its child.
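
    The subdirectory part of the rules above can be written down as a small
    decision function. This is a hedged sketch, not the VFS implementation:
    struct subdir_locks and pick_subdir_locks() are hypothetical names, and
    the boolean parameters are assumptions made for illustration.

```c
#include <stdbool.h>

/* Which of the renamed objects need a directory lock besides the parents. */
struct subdir_locks {
    bool source;
    bool target;
};

/*
 * Hypothetical helper mirroring the subdirectory part of the rules.
 * 'target_is_dir' means an existing target that is a directory;
 * 'overwriting' means the target is replaced by the rename.
 */
static struct subdir_locks pick_subdir_locks(bool same_parent,
                                             bool source_is_dir,
                                             bool target_is_dir,
                                             bool overwriting)
{
    struct subdir_locks l = { false, false };

    if (same_parent) {
        /* ".." keeps its parent, so only an overwritten victim is locked. */
        l.target = overwriting && target_is_dir;
    } else {
        /* The parent changes: lock every subdirectory involved. */
        l.source = source_is_dir;
        l.target = target_is_dir;
    }
    return l;
}
```

    For a same-parent rename of a directory onto nothing, the sketch locks
    no subdirectory at all, which matches the observation that ->rename()
    has no reason to touch the .. entry in that case.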
    
    Cc: stable@vger.kernel.org
    Fixes: 28eceeda130f "fs: Lock moved directories"
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "cifs: reconnect work should have reference on server struct" [+ + +]

Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Wed Dec 6 16:37:37 2023 +0000

    Revert "cifs: reconnect work should have reference on server struct"
    
    [ Upstream commit 823342524868168bf681f135d01b4ae10f5863ec ]
    
    This reverts commit 19a4b9d6c372cab6a3b2c9a061a236136fe95274.
    
    This earlier commit was making an assumption that each mod_delayed_work
    called for the reconnect work would result in smb2_reconnect_server
    being called twice. This assumption turns out to be untrue. So reverting
    this change for now.
    
    I will submit a follow-up patch to fix the actual problem in a different
    way.
    
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Stable-dep-of: 78e727e58e54 ("cifs: update iface_last_update on each query-and-update")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync" [+ + +]

Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Jan 23 13:09:26 2024 +0100

    Revert "drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync"
    
    commit d1b163aa0749706379055e40a52cf7a851abf9dc upstream.
    
    This reverts commit 60aebc9559492cea6a9625f514a8041717e3a2e4.
    
    Commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from
    device_initcall to subsys_initcall_sync") messes up initialization order
    of the graphics drivers and leads to blank displays on some systems. So
    revert the commit.
    
    Making the display drivers fully independent of initialization order
    would require tracking framebuffer memory by device, independently of
    the loaded drivers. The kernel currently lacks the infrastructure to
    do so.
    
    Reported-by: Jaak Ristioja <jaak@ristioja.ee>
    Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t
    Reported-by: Huacai Chen <chenhuacai@loongson.cn>
    Closes: https://lore.kernel.org/dri-devel/20231108024613.2898921-1-chenhuacai@loongson.cn/
    Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10133
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Cc: Javier Martinez Canillas <javierm@redhat.com>
    Cc: Thorsten Leemhuis <regressions@leemhuis.info>
    Cc: Jani Nikula <jani.nikula@linux.intel.com>
    Cc: stable@vger.kernel.org # v6.5+
    Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
    Acked-by: Jani Nikula <jani.nikula@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240123120937.27736-1-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "drm/amd/display: fix bandwidth validation failure on DCN 2.1" [+ + +]

Author: Ivan Lipski <ivlipski@amd.com>
Date:   Fri Jan 5 19:40:50 2024 -0500

    Revert "drm/amd/display: fix bandwidth validation failure on DCN 2.1"
    
    commit c2ab9ce0ee7225fc05f58a6671c43b8a3684f530 upstream.
    
    This commit causes dmesg-warn on several IGT tests on DCN 3.1.6: *ERROR*
    link_enc_cfg_validate: Invalid link encoder assignments - 0x1c
    
    Affected IGT tests include:
    - amdgpu/[amd_assr|amd_plane|amd_hotplug]
    - kms_atomic
    - kms_color
    - kms_flip
    - kms_properties
    - kms_universal_plane
    
    and some other tests
    
    This reverts commit 3a0fa3bc245ef92838a8296e0055569b8dff94c4.
    
    Cc: Melissa Wen <mwen@igalia.com>
    Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Signed-off-by: Ivan Lipski <ivlipski@amd.com>
    Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "drm/amd: Enable PCIe PME from D3" [+ + +]

Author: Jonathan Gray <jsg@jsg.id.au>
Date:   Sat Jan 27 12:01:50 2024 +1100

    Revert "drm/amd: Enable PCIe PME from D3"
    
    This reverts commit 847e6947afd3c46623172d2eabcfc2481ee8668e.
    
    duplicated a change made in 6.6.5
    49227bea27ebcd260f0c94a3055b14bbd8605c5e
    
    Cc: stable@vger.kernel.org # 6.6
    Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "drm/i915/dsi: Do display on sequence later on icl+" [+ + +]

Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Jan 16 23:08:21 2024 +0200

    Revert "drm/i915/dsi: Do display on sequence later on icl+"
    
    commit 6992eb815d087858f8d7e4020529c2fe800456b3 upstream.
    
    This reverts commit 88b065943cb583e890324d618e8d4b23460d51a3.
    
    Lenovo 82TQ is unhappy if we do the display on sequence this
    late. The display output shows severe corruption.
    
    It's unclear if this is a failure on our part (perhaps
    something to do with sending commands in LP mode after HS
    /video mode transmission has been started? Though the backlight
    on command at least seems to work) or simply that there are
    some commands in the sequence that are needed to be done
    earlier (eg. could be some DSC init stuff?). If the latter
    then I don't think the current Windows code would work
    either, but maybe this was originally tested with an older
    driver, who knows.
    
    Root causing this fully would likely require a lot of
    experimentation which isn't really feasible without direct
    access to the machine, so let's just accept failure and
    go back to the original sequence.
    
    Cc: stable@vger.kernel.org
    Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10071
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240116210821.30194-1-ville.syrjala@linux.intel.com
    Acked-by: Jani Nikula <jani.nikula@intel.com>
    (cherry picked from commit dc524d05974f615b145404191fcf91b478950499)
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

riscv: Fix an off-by-one in get_early_cmdline() [+ + +]

Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sun Oct 29 08:20:40 2023 +0100

    riscv: Fix an off-by-one in get_early_cmdline()
    
    [ Upstream commit adb1f95d388a43c4c564ef3e436f18900dde978e ]
    
    strncat() does not account for the terminating NUL in its size
    argument, so switch to strlcat() to correctly compute the size of the
    available memory when appending CONFIG_CMDLINE to 'early_cmdline'.
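
    The difference lies in what the size argument means. The count passed
    to strncat() limits the bytes copied from the source, and a NUL is
    written after them, so passing the remaining space can write one byte
    past the buffer. A strlcat()-style interface takes the full size of the
    destination instead. The sketch below shows those semantics with a
    hypothetical helper, bounded_append(); it is not the kernel's strlcat()
    and is written out only because strlcat() is not available in every
    userspace C library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper with strlcat()-like semantics: 'size' is the full
 * size of 'dst', and the result is always NUL-terminated within it.
 * Returns the length the combined string would have had without truncation.
 */
static size_t bounded_append(char *dst, const char *src, size_t size)
{
    size_t dlen = 0;
    size_t slen = strlen(src);
    size_t room;

    /* Length of the existing string, never scanning past 'size'. */
    while (dlen < size && dst[dlen] != '\0')
        dlen++;
    if (dlen == size)           /* dst is not terminated within size */
        return size + slen;

    room = size - dlen - 1;     /* bytes left, excluding the final NUL */
    if (slen < room)
        room = slen;
    memcpy(dst + dlen, src, room);
    dst[dlen + room] = '\0';
    return dlen + slen;
}
```

    A return value greater than or equal to 'size' tells the caller that
    the appended string was truncated.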
    
    Fixes: 26e7aacb83df ("riscv: Allow to downgrade paging mode from the command line")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/9f66d2b58c8052d4055e90b8477ee55d9a0914f9.1698564026.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: mm: Fixup compat arch_get_mmap_end [+ + +]

Author: Guo Ren <guoren@kernel.org>
Date:   Fri Dec 22 06:57:01 2023 -0500

    riscv: mm: Fixup compat arch_get_mmap_end
    
    commit 97b7ac69be2e5a683e898f5267f659fde52efdd5 upstream.
    
    When the task is in COMPAT mode, arch_get_mmap_end should be 2GB,
    not TASK_SIZE_64. TASK_SIZE already includes the is_compat_mode()
    check, so change the definition of STACK_TOP_MAX to TASK_SIZE
    directly.
    
    Cc: stable@vger.kernel.org
    Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57")
    Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
    Signed-off-by: Guo Ren <guoren@kernel.org>
    Reviewed-by: Leonardo Bras <leobras@redhat.com>
    Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
    Link: https://lore.kernel.org/r/20231222115703.2404036-3-guoren@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

riscv: mm: Fixup compat mode boot failure [+ + +]

Author: Guo Ren <guoren@kernel.org>
Date:   Fri Dec 22 06:57:00 2023 -0500

    riscv: mm: Fixup compat mode boot failure
    
    commit 5f449e245e5b0d9d63eef6c8968fbdc3a8594407 upstream.
    
    In COMPAT mode, STACK_TOP is DEFAULT_MAP_WINDOW (0x80000000), but
    TASK_SIZE is 0x7fff000. When the user stack lies above 0x7fff000, it
    causes a user segmentation fault. Sometimes this leads to a boot
    failure when the whole rootfs is rv32.
    
    Freeing unused kernel image (initmem) memory: 2236K
    Run /sbin/init as init process
    Starting init: /sbin/init exists but couldn't execute it (error -14)
    Run /etc/init as init process
    ...
    
    Increase the TASK_SIZE to cover STACK_TOP.
    
    Cc: stable@vger.kernel.org
    Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57")
    Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
    Signed-off-by: Guo Ren <guoren@kernel.org>
    Reviewed-by: Leonardo Bras <leobras@redhat.com>
    Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
    Link: https://lore.kernel.org/r/20231222115703.2404036-2-guoren@kernel.org
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rpmsg: virtio: Free driver_override when rpmsg_remove() [+ + +]

Author: Xiaolei Wang <xiaolei.wang@windriver.com>
Date:   Fri Dec 15 10:00:49 2023 +0800

    rpmsg: virtio: Free driver_override when rpmsg_remove()
    
    commit d5362c37e1f8a40096452fc201c30e705750e687 upstream.
    
    Free driver_override in rpmsg_remove(); otherwise
    the following memory leak will occur:
    
    unreferenced object 0xffff0000d55d7080 (size 128):
      comm "kworker/u8:2", pid 56, jiffies 4294893188 (age 214.272s)
      hex dump (first 32 bytes):
        72 70 6d 73 67 5f 6e 73 00 00 00 00 00 00 00 00  rpmsg_ns........
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
        [<000000009c94c9c1>] __kmem_cache_alloc_node+0x1f8/0x320
        [<000000002300d89b>] __kmalloc_node_track_caller+0x44/0x70
        [<00000000228a60c3>] kstrndup+0x4c/0x90
        [<0000000077158695>] driver_set_override+0xd0/0x164
        [<000000003e9c4ea5>] rpmsg_register_device_override+0x98/0x170
        [<000000001c0c89a8>] rpmsg_ns_register_device+0x24/0x30
        [<000000008bbf8fa2>] rpmsg_probe+0x2e0/0x3ec
        [<00000000e65a68df>] virtio_dev_probe+0x1c0/0x280
        [<00000000443331cc>] really_probe+0xbc/0x2dc
        [<00000000391064b1>] __driver_probe_device+0x78/0xe0
        [<00000000a41c9a5b>] driver_probe_device+0xd8/0x160
        [<000000009c3bd5df>] __device_attach_driver+0xb8/0x140
        [<0000000043cd7614>] bus_for_each_drv+0x7c/0xd4
        [<000000003b929a36>] __device_attach+0x9c/0x19c
        [<00000000a94e0ba8>] device_initial_probe+0x14/0x20
        [<000000003c999637>] bus_probe_device+0xa0/0xac
    
    Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
    Fixes: b0b03b811963 ("rpmsg: Release rpmsg devices in backends")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20231215020049.78750-1-xiaolei.wang@windriver.com
    Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtc: Add support for configuring the UIP timeout for RTC reads [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Mon Nov 27 23:36:52 2023 -0600

    rtc: Add support for configuring the UIP timeout for RTC reads
    
    commit 120931db07b49252aba2073096b595482d71857c upstream.
    
    The UIP timeout is hardcoded to 10ms for all RTC reads, but in some
    contexts this might not be enough time. Add a timeout parameter to
    mc146818_get_time() and mc146818_get_time_callback().
    
    If UIP timeout is configured by caller to be >=100 ms and a call
    takes this long, log a warning.
    
    Make all callers use 10ms to ensure no functional changes.
    
    Cc:  <stable@vger.kernel.org> # 6.1.y
    Fixes: ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Reviewed-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Acked-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Link: https://lore.kernel.org/r/20231128053653.101798-4-mario.limonciello@amd.com
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtc: Adjust failure return code for cmos_set_alarm() [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Mon Nov 27 23:36:51 2023 -0600

    rtc: Adjust failure return code for cmos_set_alarm()
    
    commit 1311a8f0d4b23f58bbababa13623aa40b8ad4e0c upstream.
    
    When mc146818_avoid_UIP() fails to return a valid value, this is because
    UIP didn't clear in the timeout period. Adjust the return code in this
    case to -ETIMEDOUT.
    
    Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Reviewed-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Acked-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Cc:  <stable@vger.kernel.org>
    Fixes: cdedc45c579f ("rtc: cmos: avoid UIP when reading alarm time")
    Fixes: cd17420ebea5 ("rtc: cmos: avoid UIP when writing alarm time")
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://lore.kernel.org/r/20231128053653.101798-3-mario.limonciello@amd.com
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtc: cmos: Use ACPI alarm for non-Intel x86 systems too [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Mon Nov 6 10:23:10 2023 -0600

    rtc: cmos: Use ACPI alarm for non-Intel x86 systems too
    
    commit 3d762e21d56370a43478b55e604b4a83dd85aafc upstream.
    
    Intel systems > 2015 have been configured to use ACPI alarm instead
    of HPET to avoid s2idle issues.
    
    Having HPET programmed for wakeup causes problems on AMD systems with
    s2idle as well.
    
    One particular case is that the systemd "SuspendThenHibernate" feature
    doesn't work properly on the Framework 13" AMD model. Switching to
    using ACPI alarm fixes the issue.
    
    Adjust the quirk to apply to AMD/Hygon systems from 2021 onwards.
    This matches what has been tested and is specifically to avoid potential
    risk to older systems.
    
    Cc:  <stable@vger.kernel.org> # 6.1+
    Reported-by:  <alvin.zhuge@gmail.com>
    Reported-by:  <renzhamin@gmail.com>
    Closes: https://github.com/systemd/systemd/issues/24279
    Reported-by: Kelvie Wong <kelvie@kelvie.ca>
    Closes: https://community.frame.work/t/systemd-suspend-then-hibernate-wakes-up-after-5-minutes/39392
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://lore.kernel.org/r/20231106162310.85711-1-mario.limonciello@amd.com
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtc: Extend timeout for waiting for UIP to clear to 1s [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Mon Nov 27 23:36:53 2023 -0600

    rtc: Extend timeout for waiting for UIP to clear to 1s
    
    commit cef9ecc8e938dd48a560f7dd9be1246359248d20 upstream.
    
    Specs don't say anything about UIP being cleared within 10ms. They
    only say that UIP won't occur for another 244uS. If a long NMI occurs
    while UIP is still updating it might not be possible to get valid
    data in 10ms.
    
    It has been observed in the wild that around s2idle some calls can
    take up to 480ms before UIP is clear.
    
    Adjust callers from outside an interrupt context to wait for up to
    1s instead of 10ms.
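
    The effect of the longer timeout can be shown with a small simulation.
    This is a sketch under stated assumptions rather than the mc146818
    library code: wait_uip_clear(), uip_probe_fn and busy_for() are
    hypothetical names, each loop iteration stands for one millisecond, and
    the -ETIMEDOUT result follows the return-code convention described for
    mc146818_get_time() elsewhere in this changelog.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical probe: returns true while an update is still in progress. */
typedef bool (*uip_probe_fn)(void *ctx);

/*
 * Poll until the update-in-progress flag clears or 'timeout_ms' polls
 * (one per simulated millisecond) have been made.
 * Returns 0 on success and -ETIMEDOUT if the flag never cleared.
 */
static int wait_uip_clear(uip_probe_fn probe, void *ctx, int timeout_ms)
{
    for (int i = 0; i < timeout_ms; i++) {
        if (!probe(ctx))
            return 0;
        /* a real implementation would delay for about 1 ms here */
    }
    return -ETIMEDOUT;
}

/* Test double: reports "busy" for the first *remaining calls. */
static bool busy_for(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return true;
    }
    return false;
}
```

    With a simulated 480 ms busy period, a 10 ms budget ends in -ETIMEDOUT,
    while a 1000 ms budget succeeds.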
    
    Cc:  <stable@vger.kernel.org> # 6.1.y
    Fixes: ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
    Reported-by: Carsten Hatger <xmb8dsv4@gmail.com>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217626
    Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Reviewed-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Acked-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://lore.kernel.org/r/20231128053653.101798-5-mario.limonciello@amd.com
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtc: mc146818-lib: Adjust failure return code for mc146818_get_time() [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Mon Nov 27 23:36:50 2023 -0600

    rtc: mc146818-lib: Adjust failure return code for mc146818_get_time()
    
    commit af838635a3eb9b1bc0d98599c101ebca98f31311 upstream.
    
    mc146818_get_time() calls mc146818_avoid_UIP() to avoid fetching the
    time while RTC update is in progress (UIP). When this fails, the return
    code is -EIO, but actually there was no IO failure.
    
    The reason for the return from mc146818_avoid_UIP() is that the UIP
    wasn't cleared in the time period. Adjust the return code to -ETIMEDOUT
    to match the behavior.
    
    Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Reviewed-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Acked-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
    Cc:  <stable@vger.kernel.org>
    Fixes: 2a61b0ac5493 ("rtc: mc146818-lib: refactor mc146818_get_time")
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://lore.kernel.org/r/20231128053653.101798-2-mario.limonciello@amd.com
    Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/vfio-ap: always filter entire AP matrix [+ + +]

Author: Tony Krowiak <akrowiak@linux.ibm.com>
Date:   Mon Jan 15 13:54:31 2024 -0500

    s390/vfio-ap: always filter entire AP matrix
    
    commit 850fb7fa8c684a4c6bf0e4b6978f4ddcc5d43d11 upstream.
    
    The vfio_ap_mdev_filter_matrix function is called whenever a new adapter or
    domain is assigned to the mdev. The purpose of the function is to update
    the guest's AP configuration by filtering the matrix of adapters and
    domains assigned to the mdev. When an adapter or domain is assigned, only
    the APQNs associated with the APID of the new adapter or APQI of the new
    domain are inspected. If an APQN does not reference a queue device bound to
    the vfio_ap device driver, then its APID will be filtered from the mdev's
    matrix when updating the guest's AP configuration.
    
    Inspecting only the APID of the new adapter or APQI of the new domain will
    result in passing AP queues through to a guest that are not bound to the
    vfio_ap device driver under certain circumstances. Consider the following:
    
    guest's AP configuration (all also assigned to the mdev's matrix):
    14.0004
    14.0005
    14.0006
    16.0004
    16.0005
    16.0006
    
    unassign domain 4
    unbind queue 16.0005
    assign domain 4
    
    When domain 4 is re-assigned, since only domain 4 will be inspected, the
    APQNs that will be examined will be:
    14.0004
    16.0004
    
    Since both of those APQNs reference queue devices that are bound to the
    vfio_ap device driver, nothing will get filtered from the mdev's matrix
    when updating the guest's AP configuration. Consequently, queue 16.0005
    will get passed through despite not being bound to the driver. This
    violates the Linux device model requirement that a guest shall only be
    given access to devices bound to the device driver facilitating their
    pass-through.
    
    To resolve this problem, every adapter and domain assigned to the mdev will
    be inspected when filtering the mdev's matrix.
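
    The scenario above can be reproduced with a toy model. This is a
    simplified sketch and not the vfio_ap driver code: struct ap_model and
    filter_matrix() are hypothetical, adapter indexes 0 and 1 stand for
    adapters 14 and 16, and domain indexes 0 to 2 stand for domains 4 to 6.
    An adapter is kept for the guest only if every inspected queue on it is
    bound to the driver.

```c
#include <stdbool.h>
#include <stddef.h>

#define NADAPTERS 2   /* adapters 14 and 16 from the example */
#define NDOMAINS  3   /* domains 4, 5 and 6 from the example */

/* Hypothetical model: bound[a][d] is true if queue (a, d) is bound. */
struct ap_model {
    bool bound[NADAPTERS][NDOMAINS];
};

/*
 * Inspect only the listed domain indexes. An adapter is kept for the
 * guest only if every inspected queue on it is bound to the driver.
 */
static void filter_matrix(const struct ap_model *m,
                          const size_t *domains, size_t ndomains,
                          bool keep[NADAPTERS])
{
    for (size_t a = 0; a < NADAPTERS; a++) {
        keep[a] = true;
        for (size_t i = 0; i < ndomains; i++) {
            if (!m->bound[a][domains[i]])
                keep[a] = false;
        }
    }
}
```

    Inspecting only the re-assigned domain keeps both adapters even though
    queue 16.0005 is unbound; inspecting all three domains filters out
    adapter 16.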
    
    Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
    Acked-by: Halil Pasic <pasic@linux.ibm.com>
    Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240115185441.31526-2-akrowiak@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/vfio-ap: do not reset queue removed from host config [+ + +]

Author: Tony Krowiak <akrowiak@linux.ibm.com>
Date:   Mon Jan 15 13:54:36 2024 -0500

    s390/vfio-ap: do not reset queue removed from host config
    
    commit b9bd10c43456d16abd97b717446f51afb3b88411 upstream.
    
    When a queue is unbound from the vfio_ap device driver, it is reset to
    ensure its crypto data is not leaked when it is bound to another device
    driver. If the queue is unbound due to the fact that the adapter or domain
    was removed from the host's AP configuration, then attempting to reset it
    will fail with response code 01 (APID not valid) getting returned from the
    reset command. Let's ensure that the queue is assigned to the host's
    configuration before resetting it.
    
    Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
    Reviewed-by: "Jason J. Herne" <jjherne@linux.ibm.com>
    Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
    Fixes: eeb386aeb5b7 ("s390/vfio-ap: handle config changed and scan complete notification")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240115185441.31526-7-akrowiak@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
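
The guard described in this commit can be pictured with a minimal sketch. It is a Python model, not the driver's code: the function name `should_reset_queue` and the `host_apm`/`host_aqm` parameters are assumptions made for illustration, the host's AP configuration is modeled as two plain sets instead of kernel bitmaps, and the APQN is assumed to be packed as (APID << 8) | APQI.

```python
def should_reset_queue(apqn: int, host_apm: set, host_aqm: set) -> bool:
    """Return True only if the queue is still in the host's AP configuration.

    Hypothetical model: host_apm holds the APIDs and host_aqm the APQIs that
    the host currently has configured.
    """
    apid = (apqn >> 8) & 0xFF   # adapter number (assumed upper byte)
    apqi = apqn & 0xFF          # domain number (assumed lower byte)
    # A reset of a queue whose adapter or domain is gone would fail,
    # so it is skipped unless both are still present.
    return apid in host_apm and apqi in host_aqm
```

With this model, a queue whose adapter has been removed from the host configuration is simply skipped instead of producing a failed reset.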

s390/vfio-ap: let on_scan_complete() callback filter matrix and update guest's APCB [+ + +]

Author: Tony Krowiak <akrowiak@linux.ibm.com>
Date:   Mon Jan 15 13:54:33 2024 -0500

    s390/vfio-ap: let on_scan_complete() callback filter matrix and update guest's APCB
    
    commit 774d10196e648e2c0b78da817f631edfb3dfa557 upstream.
    
    When adapters and/or domains are added to the host's AP configuration, this
    may result in multiple queue devices getting created and probed by the
    vfio_ap device driver. For each queue device probed, the matrix of adapters
    and domains assigned to a matrix mdev will be filtered to update the
    guest's APCB. If any adapters or domains get added to or removed from the
    APCB, the guest's AP configuration will be dynamically updated (i.e., hot
    plug/unplug). To dynamically update the guest's configuration, its VCPUs
    must be taken out of SIE for the period of time it takes to make the
    update. This is disruptive to the guest's operation and if there are many
    queues probed due to a change in the host's AP configuration, this could be
    troublesome. The problem is exacerbated by the fact that the
    'on_scan_complete' callback also filters the mdev's matrix and updates
    the guest's AP configuration.
    
    In order to reduce the potential amount of disruption to the guest that may
    result from a change to the host's AP configuration, let's bypass the
    filtering of the matrix and updating of the guest's AP configuration in the
    probe callback - if due to a host config change - and defer it until the
    'on_scan_complete' callback is invoked after the AP bus finishes its device
    scan operation. This way the filtering and updating will be performed only
    once regardless of the number of queues added.
    
    Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
    Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
    Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240115185441.31526-4-akrowiak@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/vfio-ap: loop over the shadow APCB when filtering guest's AP configuration [+ + +]

Author: Tony Krowiak <akrowiak@linux.ibm.com>
Date:   Mon Jan 15 13:54:32 2024 -0500

    s390/vfio-ap: loop over the shadow APCB when filtering guest's AP configuration
    
    commit 16fb78cbf56e42b8efb2682a4444ab59e32e7959 upstream.
    
    While filtering the mdev matrix, it doesn't make sense - and will have
    unexpected results - to filter an APID from the matrix if the APID or one
    of the associated APQIs is not in the host's AP configuration. There are
    two reasons for this:
    
    1. An adapter or domain that is not in the host's AP configuration can be
       assigned to the matrix; this is known as over-provisioning. Queue
       devices, however, are only created for adapters and domains in the
       host's AP configuration, so there will be no queues associated with an
       over-provisioned adapter or domain to filter.
    
    2. The adapter or domain may have been externally removed from the host's
       configuration via an SE or HMC attached to a DPM enabled LPAR. In this
       case, the vfio_ap device driver would have been notified by the AP bus
       via the on_config_changed callback and the adapter or domain would
       have already been filtered.
    
    Since the matrix_mdev->shadow_apcb.apm and matrix_mdev->shadow_apcb.aqm are
    copied from the mdev matrix sans the APIDs and APQIs not in the host's AP
    configuration, let's loop over those bitmaps instead of those assigned to
    the matrix.
    
    Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
    Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
    Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240115185441.31526-3-akrowiak@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
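
The effect of looping over the shadow bitmaps can be illustrated with a small Python model. This is a sketch under stated assumptions rather than the driver's implementation: `filter_matrix`, `bound_queues`, and the use of sets of integers in place of the `apm`/`aqm` bitmaps are all hypothetical.

```python
def filter_matrix(matrix_apm, matrix_aqm, host_apm, host_aqm, bound_queues):
    """Model of filtering an mdev matrix into a shadow APCB.

    bound_queues is a set of (apid, apqi) pairs bound to the driver
    (a hypothetical stand-in for the driver's queue lookup).
    """
    # The shadow copies start as the matrix minus anything that is not in
    # the host's AP configuration, so over-provisioned or externally
    # removed adapters and domains never enter the loop below.
    shadow_apm = set(matrix_apm) & set(host_apm)
    shadow_aqm = set(matrix_aqm) & set(host_aqm)

    # Loop over the shadow sets, not the full matrix. An adapter is
    # filtered if any of its queues is not bound to the driver.
    for apid in sorted(shadow_apm):
        if any((apid, apqi) not in bound_queues for apqi in shadow_aqm):
            shadow_apm.discard(apid)
    return shadow_apm, shadow_aqm
```

In this model an over-provisioned adapter is dropped by the intersection step, so the loop only ever looks at queues that can actually exist.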

s390/vfio-ap: reset queues associated with adapter for queue unbound from driver [+ + +]

Author: Tony Krowiak <akrowiak@linux.ibm.com>
Date:   Mon Jan 15 13:54:35 2024 -0500

    s390/vfio-ap: reset queues associated with adapter for queue unbound from driver
    
    commit f009cfa466558b7dfe97f167ba1875d6f9ea4c07 upstream.
    
    When a queue is unbound from the vfio_ap device driver, if that queue is
    assigned to a guest's AP configuration, its associated adapter is removed
    because queues are defined to a guest via a matrix of adapters and
    domains; so, it is not possible to remove a single queue.
    
    If an adapter is removed from the guest's AP configuration, all associated
    queues must be reset to prevent leaking crypto data should any of them be
    assigned to a different guest or device driver. The one caveat is that if
    the queue is being removed because the adapter or domain has been removed
    from the host's AP configuration, then an attempt to reset the queue will
    fail with response code 01, AP-queue number not valid; so resetting these
    queues should be skipped.
    
    Acked-by: Halil Pasic <pasic@linux.ibm.com>
    Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
    Fixes: 09d31ff78793 ("s390/vfio-ap: hot plug/unplug of AP devices when probed/removed")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240115185441.31526-6-akrowiak@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/vfio-ap: reset queues filtered from the guest's AP config [+ + +]

Author: Tony Krowiak <akrowiak@linux.ibm.com>
Date:   Mon Jan 15 13:54:34 2024 -0500

    s390/vfio-ap: reset queues filtered from the guest's AP config
    
    commit f848cba767e59f8d5c54984b1d45451aae040d50 upstream.
    
    When filtering the adapters from the configuration profile for a guest to
    create or update a guest's AP configuration, if the APID of an adapter and
    the APQI of a domain identify a queue device that is not bound to the
    vfio_ap device driver, the APID of the adapter will be filtered because an
    individual APQN cannot be filtered, given that APQNs are assigned
    to an AP configuration as a matrix of APIDs and APQIs. Consequently, a
    guest will not have access to all of the queues associated with the
    filtered adapter. If the queues are subsequently made available again to
    the guest, they should re-appear in a reset state; so, let's make sure all
    queues associated with an adapter unplugged from the guest are reset.
    
    In order to identify the set of queues that need to be reset, let's allow a
    vfio_ap_queue object to be stored in a hashtable and a list at the same
    time: the hashtable holds all of the queues assigned to a matrix mdev, and
    the list holds the subset of queues that need to be reset. For example,
    when an adapter is hot unplugged from a
    guest, all guest queues associated with that adapter must be reset. Since
    that may be a subset of those assigned to the matrix mdev, they can be
    stored in a list that can be passed to the vfio_ap_mdev_reset_queues
    function.
    
    Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
    Acked-by: Halil Pasic <pasic@linux.ibm.com>
    Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240115185441.31526-5-akrowiak@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
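
The idea of keeping one queue object in two containers at once can be sketched in Python. The names `Queue`, `queues_for_adapter`, and `reset_queues` are hypothetical, a dict stands in for the hashtable, and the APQN is assumed to be packed as (APID << 8) | APQI; this is an illustration, not the driver's data structure.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    apqn: int
    reset_count: int = 0  # how many times this queue has been reset


def queues_for_adapter(qtable: dict, apid: int) -> list:
    """Collect the queues of one adapter into a list.

    The list holds references to the same objects that live in qtable,
    so a queue is a member of both containers at the same time.
    """
    return [q for apqn, q in qtable.items() if (apqn >> 8) & 0xFF == apid]


def reset_queues(queues) -> None:
    """Reset every queue in the given list (modeled as a counter bump)."""
    for q in queues:
        q.reset_count += 1
```

When an adapter is unplugged in this model, only the list built for that adapter is passed to the reset routine, while the table of all assigned queues is left untouched.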

s390/vfio-ap: unpin pages on gisc registration failure [+ + +]

Author: Anthony Krowiak <akrowiak@linux.ibm.com>
Date:   Thu Nov 9 11:44:20 2023 -0500

    s390/vfio-ap: unpin pages on gisc registration failure
    
    commit 7b2d039da622daa9ba259ac6f38701d542b237c3 upstream.
    
    In the vfio_ap_irq_enable function, after the page containing the
    notification indicator byte (NIB) is pinned, the function attempts
    to register the guest ISC. If registration fails, the function sets the
    status response code and returns without unpinning the page containing
    the NIB. In order to avoid a memory leak, the NIB should be unpinned before
    returning from the vfio_ap_irq_enable function.
    
    Co-developed-by: Janosch Frank <frankja@linux.ibm.com>
    Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
    Signed-off-by: Anthony Krowiak <akrowiak@linux.ibm.com>
    Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
    Fixes: 783f0a3ccd79 ("s390/vfio-ap: add s390dbf logging to the vfio_ap_irq_enable function")
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20231109164427.460493-2-akrowiak@linux.ibm.com
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scripts/get_abi: fix source path leak [+ + +]

Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Mon Jan 1 00:59:58 2024 +0100

    scripts/get_abi: fix source path leak
    
    commit 5889d6ede53bc17252f79c142387e007224aa554 upstream.
    
    The code currently leaks the absolute path of the ABI files into the
    rendered documentation.
    
    There exists code to prevent this, but it is not effective when an
    absolute path is passed, which it is when $srctree is used.
    
    I consider this to be a minimal, stop-gap fix; a better fix would strip
    off the actual prefix instead of hacking it off with a regex.
    
    Link: https://mastodon.social/@vegard/111677490643495163
    Cc: Jani Nikula <jani.nikula@intel.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Signed-off-by: Jonathan Corbet <corbet@lwn.net>
    Link: https://lore.kernel.org/r/20231231235959.3342928-1-vegard.nossum@oracle.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
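
The commit message mentions that a better fix would strip off the actual prefix. As a rough illustration of that idea only (get_abi itself is a Perl script, and this is not its code), here is a Python sketch. The function name `strip_source_prefix` is hypothetical, and both arguments are assumed to be absolute POSIX paths.

```python
import os.path


def strip_source_prefix(path: str, srctree: str) -> str:
    """Return path relative to srctree, or path unchanged if it is outside."""
    if not (os.path.isabs(path) and os.path.isabs(srctree)):
        return path  # assumption: only absolute paths are handled here
    root = os.path.normpath(srctree)
    full = os.path.normpath(path)
    # Only strip when the file really lives under the source tree.
    if os.path.commonpath([root, full]) != root:
        return path
    return os.path.relpath(full, start=root)
```

Comparing against the real source root avoids depending on what the absolute path happens to look like, which is the weakness a pattern-based approach has.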

scsi: core: Kick the requeue list after inserting when flushing [+ + +]

Author: Niklas Cassel <cassel@kernel.org>
Date:   Thu Jan 11 13:05:32 2024 +0100

    scsi: core: Kick the requeue list after inserting when flushing
    
    [ Upstream commit 6df0e077d76bd144c533b61d6182676aae6b0a85 ]
    
    When libata calls ata_link_abort() to abort all ata queued commands, it
    calls blk_abort_request() on the SCSI command representing each QC.
    
    This causes scsi_timeout() to be called, which calls scsi_eh_scmd_add() for
    each SCSI command.
    
    scsi_eh_scmd_add() sets the SCSI host to state recovery, and then adds the
    command to shost->eh_cmd_q.
    
    This will wake up the SCSI EH, and eventually the libata EH strategy
    handler will be called, which calls scsi_eh_flush_done_q() to either flush
    retry or flush finish each failed command.
    
    The commands that are flush retried by scsi_eh_flush_done_q() are done so
    using scsi_queue_insert().
    
    Before commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if
    necessary"), __scsi_queue_insert() called blk_mq_requeue_request() with the
    second argument set to true, indicating that it should always kick/run the
    requeue list after inserting.
    
    After commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if
    necessary"), __scsi_queue_insert() does not kick/run the requeue list after
    inserting, if the current SCSI host state is recovery (which is the case in
    the libata example above).
    
    This optimization is probably fine in most cases, as I can only assume that
    most often someone will eventually kick/run the queues.
    
    However, that is not the case for scsi_eh_flush_done_q(), where we can see
    that the request gets inserted to the requeue list, but the queue is never
    started after the request has been inserted, leading to the block layer
    waiting for the completion of a command that never gets to run.
    
    Since scsi_eh_flush_done_q() is called by SCSI EH context, the SCSI host
    state is most likely always in recovery when this function is called.
    
    Thus, let scsi_eh_flush_done_q() explicitly kick the requeue list after
    inserting a flush retry command, so that scsi_eh_flush_done_q() keeps the
    same behavior as before commit 8b566edbdbfb ("scsi: core: Only kick the
    requeue list if necessary").
    
    Simple reproducer for the libata example above:
    $ hdparm -Y /dev/sda
    $ echo 1 > /sys/class/scsi_device/0\:0\:0\:0/device/delete
    
    Fixes: 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary")
    Reported-by: Kevin Locke <kevin@kevinlocke.name>
    Closes: https://lore.kernel.org/linux-scsi/ZZw3Th70wUUvCiCY@kevinlocke.name/
    Signed-off-by: Niklas Cassel <cassel@kernel.org>
    Link: https://lore.kernel.org/r/20240111120533.3612509-1-cassel@kernel.org
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
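
The stall described above can be reproduced in a toy model. The Python sketch below is not block-layer code: `RequeueList` and `flush_done_q` are hypothetical names, and the model only captures the distinction between inserting a command and kicking the list.

```python
class RequeueList:
    """Toy requeue list: inserted commands run only after a kick."""

    def __init__(self):
        self.pending = []      # inserted but not yet dispatched
        self.dispatched = []   # commands that actually got to run

    def insert(self, cmd, kick: bool) -> None:
        self.pending.append(cmd)
        if kick:
            self.kick()

    def kick(self) -> None:
        self.dispatched.extend(self.pending)
        self.pending.clear()


def flush_done_q(done_q, rq: RequeueList, kick_after_insert: bool) -> None:
    """Retry each failed command while the host is in recovery."""
    for cmd in done_q:
        # In recovery state the insert itself does not kick the list.
        rq.insert(cmd, kick=False)
        if kick_after_insert:
            rq.kick()  # the explicit kick this commit describes
```

Without the explicit kick, every retried command stays in `pending` forever in this model, which mirrors the block layer waiting on a command that never runs.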

scsi: ufs: core: Remove the ufshcd_hba_exit() call from ufshcd_async_scan() [+ + +]

Author: Bart Van Assche <bvanassche@acm.org>
Date:   Mon Dec 18 14:52:15 2023 -0800

    scsi: ufs: core: Remove the ufshcd_hba_exit() call from ufshcd_async_scan()
    
    [ Upstream commit ee36710912b2075c417100a8acc642c9c6496501 ]
    
    Calling ufshcd_hba_exit() from a function that is called asynchronously
    from ufshcd_init() is wrong because this triggers multiple race
    conditions. Instead of calling ufshcd_hba_exit(), log an error message.
    
    Reported-by: Daniel Mentz <danielmentz@google.com>
    Fixes: 1d337ec2f35e ("ufs: improve init sequence")
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Link: https://lore.kernel.org/r/20231218225229.2542156-3-bvanassche@acm.org
    Reviewed-by: Can Guo <quic_cang@quicinc.com>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftest: Don't reuse port for SO_INCOMING_CPU test. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Jan 19 19:16:42 2024 -0800

    selftest: Don't reuse port for SO_INCOMING_CPU test.
    
    [ Upstream commit 97de5a15edf2d22184f5ff588656030bbb7fa358 ]
    
    Jakub reported that ASSERT_EQ(cpu, i) in so_incoming_cpu.c seems to
    fire somewhat randomly.
    
      # #  RUN           so_incoming_cpu.before_reuseport.test3 ...
      # # so_incoming_cpu.c:191:test3:Expected cpu (32) == i (0)
      # # test3: Test terminated by assertion
      # #          FAIL  so_incoming_cpu.before_reuseport.test3
      # not ok 3 so_incoming_cpu.before_reuseport.test3
    
    When the test failed, not-yet-accepted CLOSE_WAIT sockets received
    SYN with a "challenging" SEQ number, which was sent from an unexpected
    CPU that did not create the receiver.
    
    The test basically does:
    
      1. for each cpu:
        1-1. create a server
        1-2. set SO_INCOMING_CPU
    
      2. for each cpu:
        2-1. set cpu affinity
        2-2. create some clients
        2-3. let clients connect() to the server on the same cpu
        2-4. close() clients
    
      3. for each server:
        3-1. accept() all child sockets
        3-2. check if all children have the same SO_INCOMING_CPU with the server
    
    The root cause was the close() in 2-4. and net.ipv4.tcp_tw_reuse.
    
    In a loop of 2., close() changed the client state to FIN_WAIT_2, and
    the peer transitioned to CLOSE_WAIT.
    
    In another loop of 2., connect() happened to select the same port of
    the FIN_WAIT_2 socket, and it was reused as the default value of
    net.ipv4.tcp_tw_reuse is 2.
    
    As a result, the new client sent SYN to the CLOSE_WAIT socket from
    a different CPU, and the receiver's sk_incoming_cpu was overwritten
    with an unexpected CPU ID.
    
    Also, the SYN had a different SEQ number, so the CLOSE_WAIT socket
    responded with Challenge ACK.  The new client properly returned RST
    and effectively killed the CLOSE_WAIT socket.
    
    This way, all clients were created successfully, but the error was
    detected later by 3-2., ASSERT_EQ(cpu, i).
    
    To avoid the failure, let's make sure that (i) the number of clients
    is less than the number of available ports and (ii) such reuse never
    happens.
    
    Fixes: 6df96146b202 ("selftest: Add test for SO_INCOMING_CPU.")
    Reported-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Tested-by: Jakub Kicinski <kuba@kernel.org>
    Link: https://lore.kernel.org/r/20240120031642.67014-1-kuniyu@amazon.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
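
Condition (i) above is a simple counting argument, sketched here in Python. The helper names `ports_available` and `clients_fit` are hypothetical, the input is assumed to use the "low high" format of net.ipv4.ip_local_port_range, and the check conservatively treats all clients as drawing from one shared pool of ports. It does not show how the selftest itself enforces the limit.

```python
def ports_available(port_range: str) -> int:
    """Count the ports in an inclusive 'low high' range string."""
    low, high = (int(field) for field in port_range.split())
    return high - low + 1


def clients_fit(nr_cpus: int, clients_per_server: int, port_range: str) -> bool:
    """True if every client can get its own local port without reuse."""
    return nr_cpus * clients_per_server < ports_available(port_range)
```

Condition (ii) is separate: even with enough ports, reuse has to be prevented, which is what the commit ties to net.ipv4.tcp_tw_reuse.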

selftests/bpf: check if max number of bpf_loop iterations is tracked [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:07:01 2023 +0200

    selftests/bpf: check if max number of bpf_loop iterations is tracked
    
    commit 57e2a52deeb12ab84c15c6d0fb93638b5b94001b upstream.
    
    Check that even if bpf_loop() callback simulation does not converge to
    a specific state, verification could proceed via "brute force"
    simulation of the maximal number of callback calls.
    
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-12-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests/bpf: test if state loops are detected in a tricky case [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Oct 24 03:09:16 2023 +0300

    selftests/bpf: test if state loops are detected in a tricky case
    
    commit 64870feebecb7130291a55caf0ce839a87405a70 upstream.
    
    A convoluted test case for the iterator convergence logic. It
    demonstrates that states with a branch count equal to 0 might still be
    part of a loop that has not been completely explored.
    
    E.g. consider the following state diagram:
    
                   initial     Here state 'succ' was processed first,
                     |         it was eventually tracked to produce a
                     V         state identical to 'hdr'.
        .---------> hdr        All branches from 'succ' had been explored
        |            |         and thus 'succ' has its .branches == 0.
        |            V
        |    .------...        Suppose states 'cur' and 'succ' correspond
        |    |       |         to the same instruction + callsites.
        |    V       V         In such case it is necessary to check
        |   ...     ...        whether 'succ' and 'cur' are identical.
        |    |       |         If 'succ' and 'cur' are a part of the same loop
        |    V       V         they have to be compared exactly.
        |   succ <- cur
        |    |
        |    V
        |   ...
        |    |
        '----'
    
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231024000917.12153-7-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests/bpf: test widening for iterating callbacks [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:06:59 2023 +0200

    selftests/bpf: test widening for iterating callbacks
    
    commit 9f3330aa644d6d979eb064c46e85c62d4b4eac75 upstream.
    
    A test case to verify that widening of imprecise scalars is applied to
    the callback entry state when the callback call is simulated repeatedly.
    
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-10-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests/bpf: tests for iterating callbacks [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:06:57 2023 +0200

    selftests/bpf: tests for iterating callbacks
    
    commit 958465e217dbf5fc6677d42d8827fb3073d86afd upstream.
    
    A set of test cases to check the behavior of the callback handling logic;
    they check whether the verifier catches the following situations:
    - program not safe on second callback iteration;
    - program not safe on zero callback iterations;
    - infinite loop inside a callback.
    
    Verify that callback logic works for bpf_loop, bpf_for_each_map_elem,
    bpf_user_ringbuf_drain, bpf_find_vma.
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-8-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests/bpf: tests with delayed read/precision marks in loop body [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Oct 24 03:09:14 2023 +0300

    selftests/bpf: tests with delayed read/precision marks in loop body
    
    commit 389ede06c2974b2f878a7ebff6b0f4f707f9db74 upstream.
    
    These test cases try to hide read and precision marks from loop
    convergence logic: marks would only be assigned on subsequent loop
    iterations or after exploring states pushed to env->head stack first.
    Without the verifier fix to use exact state comparison logic for
    iterator convergence, these tests (except 'triple_continue') would be
    erroneously marked as safe.
    
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231024000917.12153-5-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests/bpf: track string payload offset as scalar in strobemeta [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:06:52 2023 +0200

    selftests/bpf: track string payload offset as scalar in strobemeta
    
    commit 87eb0152bcc102ecbda866978f4e54db5a3be1ef upstream.
    
    This change prepares strobemeta for an update to the callback verification
    logic. To allow bpf_loop() verification to converge when multiple
    callback iterations are considered:
    - track offset inside strobemeta_payload->payload directly as scalar
      value;
    - at each iteration make sure that remaining
      strobemeta_payload->payload capacity is sufficient for execution of
      read_{map,str}_var functions;
    - make sure that offset is tracked as an unbound scalar between
      iterations, otherwise the verifier won't be able to infer that the
      bpf_loop callback reaches identical states.
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-3-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests/bpf: track tcp payload offset as scalar in xdp_synproxy [+ + +]

Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Tue Nov 21 04:06:51 2023 +0200

    selftests/bpf: track tcp payload offset as scalar in xdp_synproxy
    
    commit 977bc146d4eb7070118d8a974919b33bb52732b4 upstream.
    
    This change prepares syncookie_{tc,xdp} for an update to the callback
    verification logic. To allow bpf_loop() verification to converge when
    multiple callback iterations are considered:
    - track offset inside TCP payload explicitly, not as a part of the
      pointer;
    - make sure that offset does not exceed MAX_PACKET_OFF enforced by
      verifier;
    - make sure that offset is tracked as an unbound scalar between
      iterations, otherwise the verifier won't be able to infer that the
      bpf_loop callback reaches identical states.
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231121020701.26440-2-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: bonding: do not test arp/ns target with mode balance-alb/tlb [+ + +]

Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Tue Jan 23 15:59:17 2024 +0800

    selftests: bonding: do not test arp/ns target with mode balance-alb/tlb
    
    [ Upstream commit a2933a8759a62269754e54733d993b19de870e84 ]
    
    The prio_arp/ns tests hard code the mode to active-backup. At the same
    time, the balance-alb/tlb modes do not support arp/ns target. So remove
    the prio_arp/ns tests from the loop and only test active-backup mode.
    
    Fixes: 481b56e0391e ("selftests: bonding: re-format bond option tests")
    Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
    Closes: https://lore.kernel.org/netdev/17415.1705965957@famine/
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
    Link: https://lore.kernel.org/r/20240123075917.1576360-1-liuhangbin@gmail.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: bonding: Increase timeout to 1200s [+ + +]

Author: Benjamin Poirier <bpoirier@nvidia.com>
Date:   Wed Jan 17 19:12:32 2024 -0500

    selftests: bonding: Increase timeout to 1200s
    
    [ Upstream commit b01f15a7571b7aa222458bc9bf26ab59bd84e384 ]
    
    When tests are run by runner.sh, bond_options.sh gets killed before
    it can complete:
    
    make -C tools/testing/selftests run_tests TARGETS="drivers/net/bonding"
            [...]
            # timeout set to 120
            # selftests: drivers/net/bonding: bond_options.sh
            # TEST: prio (active-backup miimon primary_reselect 0)                [ OK ]
            # TEST: prio (active-backup miimon primary_reselect 1)                [ OK ]
            # TEST: prio (active-backup miimon primary_reselect 2)                [ OK ]
            # TEST: prio (active-backup arp_ip_target primary_reselect 0)         [ OK ]
            # TEST: prio (active-backup arp_ip_target primary_reselect 1)         [ OK ]
            # TEST: prio (active-backup arp_ip_target primary_reselect 2)         [ OK ]
            #
            not ok 7 selftests: drivers/net/bonding: bond_options.sh # TIMEOUT 120 seconds
    
    This test includes many sleep statements, at least some of which are
    related to timers in the operation of the bonding driver itself. Increase
    the test timeout to allow the test to complete.
    
    I ran the test in slightly different VMs (including one without HW
    virtualization support) and got runtimes of 13m39.760s, 13m31.238s, and
    13m2.956s. Use a ~1.5x "safety factor" and set the timeout to 1200s.
    
    Fixes: 42a8d4aaea84 ("selftests: bonding: add bonding prio option test")
    Reported-by: Jakub Kicinski <kuba@kernel.org>
    Closes: https://lore.kernel.org/netdev/20240116104402.1203850a@kernel.org/#t
    Suggested-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
    Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
    Link: https://lore.kernel.org/r/20240118001233.304759-1-bpoirier@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
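
The safety-factor arithmetic can be checked with a few lines of Python. The helper names `to_seconds` and `safety_factor` are made up for this sketch; the runtimes and the 1200s timeout are the ones quoted in the commit message.

```python
def to_seconds(runtime: str) -> float:
    """Convert a 'time'-style string such as '13m39.760s' to seconds."""
    minutes, seconds = runtime.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def safety_factor(timeout_s: float, runtimes) -> float:
    """Ratio between the timeout and the slowest observed run."""
    slowest = max(to_seconds(r) for r in runtimes)
    return timeout_s / slowest


observed = ["13m39.760s", "13m31.238s", "13m2.956s"]
factor = safety_factor(1200, observed)
```

The slowest run is 819.76 seconds, so a 1200s timeout works out to a factor of roughly 1.46, in line with the "~1.5x" stated above.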

selftests: fill in some missing configs for net [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Jan 22 12:35:28 2024 -0800

    selftests: fill in some missing configs for net
    
    [ Upstream commit 04fe7c5029cbdbcdb28917f09a958d939a8f19f7 ]
    
    We are missing a lot of config options from net selftests,
    it seems:
    
    tun/tap:     CONFIG_TUN, CONFIG_MACVLAN, CONFIG_MACVTAP
    fib_tests:   CONFIG_NET_SCH_FQ_CODEL
    l2tp:        CONFIG_L2TP, CONFIG_L2TP_V3, CONFIG_L2TP_IP, CONFIG_L2TP_ETH
    sctp-vrf:    CONFIG_INET_DIAG
    txtimestamp: CONFIG_NET_CLS_U32
    vxlan_mdb:   CONFIG_BRIDGE_VLAN_FILTERING
    gre_gso:     CONFIG_NET_IPGRE_DEMUX, CONFIG_IP_GRE, CONFIG_IPV6_GRE
    srv6_end_dt*_l3vpn:   CONFIG_IPV6_SEG6_LWTUNNEL
    ip_local_port_range:  CONFIG_MPTCP
    fib_test:    CONFIG_NET_CLS_BASIC
    rtnetlink:   CONFIG_MACSEC, CONFIG_NET_SCH_HTB, CONFIG_XFRM_INTERFACE
                 CONFIG_NET_IPGRE, CONFIG_BONDING
    fib_nexthops: CONFIG_MPLS, CONFIG_MPLS_ROUTING
    vxlan_mdb:   CONFIG_NET_ACT_GACT
    tls:         CONFIG_TLS, CONFIG_CRYPTO_CHACHA20POLY1305
    psample:     CONFIG_PSAMPLE
    fcnal:       CONFIG_TCP_MD5SIG
    
    Try to add them in a semi-alphabetical order.
    
    Fixes: 62199e3f1658 ("selftests: net: Add VXLAN MDB test")
    Fixes: c12e0d5f267d ("self-tests: introduce self-tests for RPS default mask")
    Fixes: 122db5e3634b ("selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGE")
    Link: https://lore.kernel.org/r/20240122203528.672004-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: mm: hugepage-vmemmap fails on 64K page size systems [+ + +]

Author: Donet Tom <donettom@linux.vnet.ibm.com>
Date:   Wed Jan 10 14:03:35 2024 +0530

    selftests: mm: hugepage-vmemmap fails on 64K page size systems
    
    commit 00bcfcd47a52f50f07a2e88d730d7931384cb073 upstream.
    
    The kernel selftest mm/hugepage-vmemmap fails on architectures whose
    page size is not 4K.  The test hard-codes a 4K page size, so the pfn
    calculation goes wrong on systems with a different page size.  The
    length of a MAP_HUGETLB mapping must be hugepage aligned, but the test
    hard-codes a map length of 2M, which is not aligned if the system has
    a different hugepage size.
    
    Add psize() to get the page size and default_huge_page_size() to get
    the default hugepage size at run time.  With this change, the
    hugepage-vmemmap test passes on powerpc with 64K page size and on x86
    with 4K page size.
    
    Result on powerpc without patch (page size 64K)
    *# ./hugepage-vmemmap
    Returned address is 0x7effff000000 whose pfn is 0
    Head page flags (100000000) is invalid
    check_page_flags: Invalid argument
    *#
    
    Result on powerpc with patch (page size 64K)
    *# ./hugepage-vmemmap
    Returned address is 0x7effff000000 whose pfn is 600
    *#
    
    Result on x86 with patch (page size 4K)
    *# ./hugepage-vmemmap
    Returned address is 0x7fc7c2c00000 whose pfn is 1dac00
    *#
    
    Link: https://lkml.kernel.org/r/3b3a3ae37ba21218481c482a872bbf7526031600.1704865754.git.donettom@linux.vnet.ibm.com
    Fixes: b147c89cd429 ("selftests: vm: add a hugetlb test case")
    Signed-off-by: Donet Tom <donettom@linux.vnet.ibm.com>
    Reported-by: Geetika Moolchandani <geetika@linux.ibm.com>
    Tested-by: Geetika Moolchandani <geetika@linux.ibm.com>
    Acked-by: Muchun Song <muchun.song@linux.dev>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
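
The run-time lookups described above can be illustrated in Python. This is a sketch, not the selftest's C code: the Python helpers below merely borrow the names `psize` and `default_huge_page_size` from the commit message, `align_up` and `pagemap_offset` are hypothetical, and the meminfo text is passed in as a string so the parsing can be shown without reading /proc.

```python
import mmap
import re


def psize() -> int:
    """Page size of the running system instead of a hard-coded 4K."""
    return mmap.PAGESIZE


def default_huge_page_size(meminfo_text: str) -> int:
    """Parse the 'Hugepagesize:' line of /proc/meminfo content, in bytes."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) * 1024 if match else 0


def align_up(length: int, alignment: int) -> int:
    """Round length up to a multiple of alignment."""
    return -(-length // alignment) * alignment


def pagemap_offset(vaddr: int, page_size: int) -> int:
    """Byte offset of vaddr's entry in /proc/self/pagemap (8 bytes per page)."""
    return (vaddr // page_size) * 8
```

The pagemap offset depends on the page size, which is why a hard-coded 4K value leads to reading the wrong entry on a 64K system.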

selftests: net: fix rps_default_mask with >32 CPUs [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Jan 22 11:58:15 2024 -0800

    selftests: net: fix rps_default_mask with >32 CPUs
    
    [ Upstream commit 0719b5338a0cbe80d1637a5fb03d8141b5bfc7a1 ]
    
    If there are more than 32 CPUs the bitmask will start to contain
    commas, leading to:
    
    ./rps_default_mask.sh: line 36: [: 00000000,00000000: integer expression expected
    
    Remove the commas; bash doesn't interpret leading zeroes as octal,
    so that should be good enough. Switch to bash, as Simon reports that
    not all shells support this type of substitution.
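
    The script does this with a shell substitution. Purely as an
    illustration of the same idea in C, a hypothetical strip_commas()
    helper could drop the group separators before the mask is compared:

```c
/*
 * Copy src into dst without the ',' separators that appear in CPU masks
 * wider than 32 bits. dst must hold at least strlen(src) + 1 bytes.
 * (Hypothetical helper, not part of the selftest.)
 */
static void strip_commas(const char *src, char *dst)
{
    for (; *src; src++)
        if (*src != ',')
            *dst++ = *src;
    *dst = '\0';
}
```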
    
    Fixes: c12e0d5f267d ("self-tests: introduce self-tests for RPS default mask")
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240122195815.638997-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: netdevsim: fix the udp_tunnel_nic test [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Mon Jan 22 22:05:29 2024 -0800

    selftests: netdevsim: fix the udp_tunnel_nic test
    
    [ Upstream commit 0879020a7817e7ce636372c016b4528f541c9f4d ]
    
    This test is missing a whole bunch of checks for interface
    renaming and one ifup. Presumably it was only used on a system
    with renaming disabled and NetworkManager running.
    
    Fixes: 91f430b2c49d ("selftests: net: add a test for UDP tunnel info infra")
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240123060529.1033912-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

serial: core: fix kernel-doc for uart_port_unlock_irqrestore() [+ + +]

Author: Randy Dunlap <rdunlap@infradead.org>
Date:   Tue Sep 26 21:41:28 2023 -0700

    serial: core: fix kernel-doc for uart_port_unlock_irqrestore()
    
    commit 29bff582b74ed0bdb7e6986482ad9e6799ea4d2f upstream.
    
    Fix the function name to avoid a kernel-doc warning:
    
    include/linux/serial_core.h:666: warning: expecting prototype for uart_port_lock_irqrestore(). Prototype was for uart_port_unlock_irqrestore() instead
    
    Fixes: b0af4bcb4946 ("serial: core: Provide port lock wrappers")
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: John Ogness <john.ogness@linutronix.de>
    Cc: linux-serial@vger.kernel.org
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Jiri Slaby <jirislaby@kernel.org>
    Reviewed-by: John Ogness <john.ogness@linutronix.de>
    Link: https://lore.kernel.org/r/20230927044128.4748-1-rdunlap@infradead.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: core: Provide port lock wrappers [+ + +]

Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu Sep 14 20:43:18 2023 +0206

    serial: core: Provide port lock wrappers
    
    [ Upstream commit b0af4bcb49464c221ad5f95d40f2b1b252ceedcc ]
    
    When a serial port is used for kernel console output, then all
    modifications to the UART registers which are done from other contexts,
    e.g. getty, termios, are interference points for the kernel console.
    
    So far this has been ignored and the printk output is based on the
    principle of hope. The rework of the console infrastructure, which aims to
    support threaded and atomic consoles, requires marking sections which
    modify the UART registers as unsafe. This allows the atomic write function
    to make informed decisions and eventually to restore operational state. It
    also makes it possible to prevent the regular UART code from modifying
    UART registers while printk output is in progress.
    
    All modifications of UART registers are guarded by the UART port lock,
    which provides an obvious synchronization point with the console
    infrastructure.
    
    Provide wrapper functions for spin_[un]lock*(port->lock) invocations so
    that the console mechanics can be applied later on at a single place and
    the same logic does not have to be copied all over the drivers.
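
    A userspace sketch of the wrapper idea, under loud assumptions: the
    demo_* names are hypothetical, and a C11 atomic_flag stands in for the
    kernel spinlock. The point is only that callers go through one pair of
    helpers, so extra synchronization can later be added in a single place.

```c
#include <stdatomic.h>

struct demo_port {
    atomic_flag lock;   /* stands in for the port spinlock */
    unsigned int reg;   /* stands in for a UART register */
};

static inline void demo_port_lock(struct demo_port *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ; /* spin until the lock is free */
    /* Console synchronization could be added here later, once. */
}

static inline void demo_port_unlock(struct demo_port *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* A caller never touches p->lock directly, only the wrappers. */
static void demo_write_reg(struct demo_port *p, unsigned int val)
{
    demo_port_lock(p);
    p->reg = val;
    demo_port_unlock(p);
}
```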
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: John Ogness <john.ogness@linutronix.de>
    Link: https://lore.kernel.org/r/20230914183831.587273-2-john.ogness@linutronix.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Stable-dep-of: 9915753037eb ("serial: sc16is7xx: fix unconditional activation of THRI interrupt")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

serial: core: set missing supported flag for RX during TX GPIO [+ + +]

Author: Lino Sanfilippo <l.sanfilippo@kunbus.com>
Date:   Wed Jan 3 07:18:13 2024 +0100

    serial: core: set missing supported flag for RX during TX GPIO
    
    [ Upstream commit 1a33e33ca0e80d485458410f149265cdc0178cfa ]
    
    If the RS485 feature RX-during-TX is supported by means of a GPIO, set the
    corresponding supported flag. Otherwise setting this feature from userspace may
    not be possible, since in uart_sanitize_serial_rs485() the passed RS485
    configuration is matched against the supported features and unsupported
    settings are thereby removed and thus take no effect.
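
    The masking described above can be sketched as follows. The flag
    values and the demo_sanitize() name are hypothetical; only the
    "requested AND supported" idea is taken from the description.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_RS485_ENABLED       (1u << 0)
#define DEMO_RS485_RX_DURING_TX  (1u << 1)

/* Keep only the requested features that the port advertises. */
static uint32_t demo_sanitize(uint32_t requested, uint32_t supported)
{
    return requested & supported;
}
```

    If the supported mask lacks the RX-during-TX bit, a request for it is
    silently dropped, which is why the flag has to be advertised.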
    
    Cc:  <stable@vger.kernel.org>
    Fixes: 163f080eb717 ("serial: core: Add option to output RS485 RX_DURING_TX state via GPIO")
    Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
    Signed-off-by: Lino Sanfilippo <l.sanfilippo@kunbus.com>
    Link: https://lore.kernel.org/r/20240103061818.564-3-l.sanfilippo@kunbus.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

serial: core: Simplify uart_get_rs485_mode() [+ + +]

Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Tue Oct 3 17:23:46 2023 +0300

    serial: core: Simplify uart_get_rs485_mode()
    
    [ Upstream commit 7cda0b9eb6eb9e761f452e2ef4e81eca20b19938 ]
    
    Simplify uart_get_rs485_mode() by using temporary variable for
    the GPIO descriptor. With that, use proper type for the flags
    of the GPIO descriptor.
    
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Link: https://lore.kernel.org/r/20231003142346.3072929-1-andriy.shevchenko@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Stable-dep-of: 1a33e33ca0e8 ("serial: core: set missing supported flag for RX during TX GPIO")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

serial: Do not hold the port lock when setting rx-during-tx GPIO [+ + +]

Author: Lino Sanfilippo <l.sanfilippo@kunbus.com>
Date:   Wed Jan 3 07:18:12 2024 +0100

    serial: Do not hold the port lock when setting rx-during-tx GPIO
    
    commit 07c30ea5861fb26a77dade8cdc787252f6122fb1 upstream.
    
    Both the imx and stm32 driver set the rx-during-tx GPIO in rs485_config().
    Since this function is called with the port lock held, this can be a
    problem in case that setting the GPIO line can sleep (e.g. if a GPIO
    expander is used which is connected via SPI or I2C).
    
    Avoid this issue by moving the GPIO setting outside of the port lock into
    the serial core and thus making it a generic feature.
    
    Also with commit c54d48543689 ("serial: stm32: Add support for rs485
    RX_DURING_TX output GPIO") the SER_RS485_RX_DURING_TX flag is only set if a
    rx-during-tx GPIO is _not_ available, which is wrong. Fix this, too.
    
    Furthermore reset old GPIO settings in case that changing the RS485
    configuration failed.
    
    Fixes: c54d48543689 ("serial: stm32: Add support for rs485 RX_DURING_TX output GPIO")
    Fixes: ca530cfa968c ("serial: imx: Add support for RS485 RX_DURING_TX output GPIO")
    Cc: Shawn Guo <shawnguo@kernel.org>
    Cc: Sascha Hauer <s.hauer@pengutronix.de>
    Cc:  <stable@vger.kernel.org>
    Signed-off-by: Lino Sanfilippo <l.sanfilippo@kunbus.com>
    Link: https://lore.kernel.org/r/20240103061818.564-2-l.sanfilippo@kunbus.com
    Signed-off-by: Lino Sanfilippo <l.sanfilippo@kunbus.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: change EFR lock to operate on each channels [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Mon Dec 11 12:13:51 2023 -0500

    serial: sc16is7xx: change EFR lock to operate on each channels
    
    commit 4409df5866b7ff7686ba27e449ca97a92ee063c9 upstream.
    
    Now that the driver has been converted to use one regmap per port, change
    efr locking to operate on a channel basis instead of on the whole IC.
    
    Fixes: 3837a0379533 ("serial: sc16is7xx: improve regmap debugfs by using one regmap per port")
    Cc:  <stable@vger.kernel.org> # 6.1.x: 3837a03 serial: sc16is7xx: improve regmap debugfs by using one regmap per port
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231211171353.2901416-5-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: convert from _raw_ to _noinc_ regmap functions for FIFO [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Mon Dec 11 12:13:52 2023 -0500

    serial: sc16is7xx: convert from _raw_ to _noinc_ regmap functions for FIFO
    
    commit dbf4ab821804df071c8b566d9813083125e6d97b upstream.
    
    The SC16IS7XX IC supports a burst mode to access the FIFOs where the
    initial register address is sent ($00), followed by all the FIFO data
    without having to resend the register address each time. In this mode, the
    IC doesn't increment the register address for each R/W byte.
    
    The regmap_raw_read() and regmap_raw_write() are functions which can
    perform IO over multiple registers. They are currently used to read/write
    from/to the FIFO, and although they operate correctly in this burst mode on
    the SPI bus, they would corrupt the regmap cache if it was not disabled
    manually. The reason is that when the R/W size is more than 1 byte, these
    functions assume that the register address is incremented and handle the
    cache accordingly.
    
    Convert FIFO R/W functions to use the regmap _noinc_ versions in order to
    remove the manual cache control which was a workaround when using the
    _raw_ versions. FIFO registers are properly declared as volatile so
    cache will not be used/updated for FIFO accesses.
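
    A toy model of the difference, not the regmap implementation: all
    demo_* names are hypothetical. An incrementing multi-byte write is
    assumed to touch consecutive registers (and so their cached values),
    while a non-incrementing write sends every byte to the same FIFO
    register and leaves the cache alone.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NREGS 8

struct demo_chip {
    uint8_t cache[DEMO_NREGS]; /* cached register values */
    uint8_t fifo[64];          /* data pushed through the FIFO register */
    size_t fifo_len;
};

/* Incrementing model: byte i is assumed to land in register reg + i. */
static void demo_write_inc(struct demo_chip *c, unsigned int reg,
                           const uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n && reg + i < DEMO_NREGS; i++)
        c->cache[reg + i] = buf[i];
}

/* Non-incrementing model: every byte goes to the one FIFO register. */
static void demo_write_noinc(struct demo_chip *c, const uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n && c->fifo_len < sizeof(c->fifo); i++)
        c->fifo[c->fifo_len++] = buf[i];
}
```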
    
    Fixes: dfeae619d781 ("serial: sc16is7xx")
    Cc:  <stable@vger.kernel.org>
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231211171353.2901416-6-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: fix invalid sc16is7xx_lines bitfield in case of probe error [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Thu Dec 21 18:18:08 2023 -0500

    serial: sc16is7xx: fix invalid sc16is7xx_lines bitfield in case of probe error
    
    commit 8a1060ce974919f2a79807527ad82ac39336eda2 upstream.
    
    If an error occurs during probing, the sc16is7xx_lines bitfield may be left
    in a state that doesn't represent the correct state of lines allocation.
    
    For example, in a system with two SC16 devices, if an error occurs only
    during probing of channel (port) B of the second device, sc16is7xx_lines
    final state will be 00001011b instead of the expected 00000011b.
    
    This is caused in part because of the "i--" in the for/loop located in
    the out_ports: error path.
    
    Fix this by checking the return value of uart_add_one_port() and set line
    allocation bit only if this was successful. This allows the refactor of
    the obfuscated for(i--...) loop in the error path, and properly call
    uart_remove_one_port() only when needed, and properly unset line allocation
    bits.
    
    Also use same mechanism in remove() when calling uart_remove_one_port().
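
    The bookkeeping can be sketched in userspace as below. This is an
    illustration under assumptions, not the driver code: the demo_* names
    and the per-port "added" flag are hypothetical. A line bit is set only
    after registration succeeds, and the error path clears only the bits
    that were actually set.

```c
#include <stdbool.h>

#define DEMO_PORTS_PER_DEV 2
#define DEMO_MAX_LINES     8

static unsigned long demo_lines; /* one bit per allocated line */

struct demo_dev {
    int line[DEMO_PORTS_PER_DEV];
    bool added[DEMO_PORTS_PER_DEV];
};

static int demo_first_zero_bit(unsigned long map)
{
    int bit = 0;

    while (bit < DEMO_MAX_LINES && (map & (1ul << bit)))
        bit++;
    return bit;
}

/* fail_at: index of the port whose registration fails, or -1 for none. */
static int demo_probe(struct demo_dev *d, int fail_at)
{
    int i;

    for (i = 0; i < DEMO_PORTS_PER_DEV; i++)
        d->added[i] = false;

    for (i = 0; i < DEMO_PORTS_PER_DEV; i++) {
        d->line[i] = demo_first_zero_bit(demo_lines);
        if (i == fail_at)
            goto out_ports;
        /* Registration succeeded: only now mark the line as in use. */
        demo_lines |= 1ul << d->line[i];
        d->added[i] = true;
    }
    return 0;

out_ports:
    /* Undo only what was really registered. */
    for (i = 0; i < DEMO_PORTS_PER_DEV; i++) {
        if (d->added[i]) {
            demo_lines &= ~(1ul << d->line[i]);
            d->added[i] = false;
        }
    }
    return -1;
}
```

    Probing one device successfully and then failing on port B of a second
    one leaves the bitmap at 00000011b in this model.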
    
    Fixes: c64349722d14 ("sc16is7xx: support multiple devices")
    Cc:  <stable@vger.kernel.org>
    Cc: Yury Norov <yury.norov@gmail.com>
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231221231823.2327894-2-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: fix unconditional activation of THRI interrupt [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Mon Dec 11 12:13:53 2023 -0500

    serial: sc16is7xx: fix unconditional activation of THRI interrupt
    
    [ Upstream commit 9915753037eba7135b209fef4f2afeca841af816 ]
    
    Commit cc4c1d05eb10 ("sc16is7xx: Properly resume TX after stop") changed
    behavior to unconditionally set the THRI interrupt in sc16is7xx_tx_proc().
    
    For example when sending a 65 bytes message, and assuming the Tx FIFO is
    initially empty, sc16is7xx_handle_tx() will write the first 64 bytes of the
    message to the FIFO and sc16is7xx_tx_proc() will then activate THRI. When
    the THRI IRQ is fired, the driver will write the remaining byte of the
    message to the FIFO, and disable THRI by calling sc16is7xx_stop_tx().
    
    When sending a 2 bytes message, sc16is7xx_handle_tx() will write the 2
    bytes of the message to the FIFO and call sc16is7xx_stop_tx(), disabling
    THRI. After sc16is7xx_handle_tx() exits, control returns to
    sc16is7xx_tx_proc() which will unconditionally set THRI. When the THRI IRQ
    is fired, the driver simply acknowledges the interrupt and does nothing
    more, since all the data has already been written to the FIFO. This results
    in 2 register writes and 4 register reads all for nothing and taking
    precious cycles from the I2C/SPI bus.
    
    Fix this by enabling the THRI interrupt only when we fill the Tx FIFO to
    its maximum capacity and there are remaining bytes to send in the message.
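
    The condition can be sketched as a small helper. The demo_* names are
    hypothetical and the 64-byte FIFO size is taken from the example in
    this message; the real driver logic may be structured differently.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_FIFO_SIZE 64 /* Tx FIFO depth used in the example */

/*
 * Returns how many bytes go to the FIFO now and reports whether THRI is
 * worth enabling: only when the FIFO was filled and data remains.
 */
static size_t demo_handle_tx(size_t pending, size_t fifo_free, bool *enable_thri)
{
    size_t n = pending < fifo_free ? pending : fifo_free;

    *enable_thri = (n == fifo_free) && (pending > n);
    return n;
}
```

    For the 65-byte message this writes 64 bytes and asks for THRI; for the
    2-byte message it writes both bytes and leaves THRI off.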
    
    Fixes: cc4c1d05eb10 ("sc16is7xx: Properly resume TX after stop")
    Cc:  <stable@vger.kernel.org>
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231211171353.2901416-7-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

serial: sc16is7xx: improve do/while loop in sc16is7xx_irq() [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Thu Dec 21 18:18:12 2023 -0500

    serial: sc16is7xx: improve do/while loop in sc16is7xx_irq()
    
    commit d5078509c8b06c5c472a60232815e41af81c6446 upstream.
    
    Simplify and improve readability by replacing while(1) loop with
    do {} while, and by using the keep_polling variable as the exit
    condition, making it more explicit.
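
    The loop shape described above might look like this in a userspace
    model. Everything named demo_* is hypothetical; each port simply counts
    down its pending events.

```c
#include <stdbool.h>

#define DEMO_NPORTS 2

static int demo_pending[DEMO_NPORTS]; /* events still to serve, per port */

/* Serve one event on a port; returns true if there was something to do. */
static bool demo_port_irq(int port)
{
    if (demo_pending[port] == 0)
        return false;
    demo_pending[port]--;
    return true;
}

/* Poll all ports until a full pass finds nothing; returns the pass count. */
static int demo_irq(void)
{
    int passes = 0;
    bool keep_polling;

    do {
        keep_polling = false;
        for (int i = 0; i < DEMO_NPORTS; i++)
            keep_polling |= demo_port_irq(i);
        passes++;
    } while (keep_polling);

    return passes;
}
```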
    
    Fixes: 834449872105 ("sc16is7xx: Fix for multi-channel stall")
    Cc:  <stable@vger.kernel.org>
    Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231221231823.2327894-6-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: improve regmap debugfs by using one regmap per port [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Mon Oct 30 17:14:47 2023 -0400

    serial: sc16is7xx: improve regmap debugfs by using one regmap per port
    
    commit 3837a0379533aabb9e4483677077479f7c6aa910 upstream.
    
    With this current driver regmap implementation, it is hard to make sense
    of the register addresses displayed using the regmap debugfs interface,
    because they do not correspond to the actual register addresses documented
    in the datasheet. For example, register 1 is displayed as registers 04 thru
    07:
    
    $ cat /sys/kernel/debug/regmap/spi0.0/registers
      04: 10 -> Port 0, register offset 1
      05: 10 -> Port 1, register offset 1
      06: 00 -> Port 2, register offset 1 -> invalid
      07: 00 -> port 3, register offset 1 -> invalid
      ...
    
    The reason is that bits 0 and 1 of the register address correspond to the
    channel (port) bits, so the register address itself starts at bit 2, and we
    must 'mentally' shift each register address by 2 bits to get its real
    address/offset.
    
    Also, only channels 0 and 1 are supported by the chip, so channel mask
    combinations of 10b and 11b are invalid, and the display of these
    registers is useless.
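
    The "mental" shift can be written down as follows. The helper names are
    hypothetical; the bit layout (channel in bits 0 and 1, register offset
    from bit 2 up) is the one described above.

```c
#include <stdbool.h>

#define DEMO_CHANNEL_BITS 2
#define DEMO_NCHANNELS    2 /* only channels 0 and 1 exist on the chip */

/* Address as shown by the old, single regmap. */
static unsigned int demo_flat_addr(unsigned int reg, unsigned int channel)
{
    return (reg << DEMO_CHANNEL_BITS) | channel;
}

/* Datasheet register offset hidden in a flat address. */
static unsigned int demo_reg(unsigned int flat)
{
    return flat >> DEMO_CHANNEL_BITS;
}

/* Channel (port) bits of a flat address. */
static unsigned int demo_channel(unsigned int flat)
{
    return flat & ((1u << DEMO_CHANNEL_BITS) - 1);
}

/* Channel combinations 10b and 11b do not correspond to a real port. */
static bool demo_valid(unsigned int flat)
{
    return demo_channel(flat) < DEMO_NCHANNELS;
}
```

    Register 1 thus shows up as 04 and 05 for ports 0 and 1, while 06 and 07
    decode to channels that do not exist.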
    
    This patch adds a separate regmap configuration for each port, similar to
    what is done in the max310x driver, so that register addresses displayed
    match the register addresses in the chip datasheet. Also, each port now has
    its own debugfs entry.
    
    Example with new regmap implementation:
    
    $ cat /sys/kernel/debug/regmap/spi0.0-port0/registers
    1: 10
    2: 01
    3: 00
    ...
    
    $ cat /sys/kernel/debug/regmap/spi0.0-port1/registers
    1: 10
    2: 01
    3: 00
    
    As an added bonus, this also simplifies some operations (read/write/modify)
    because it is no longer necessary to manually shift register addresses.
    
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231030211447.974779-1-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: remove global regmap from struct sc16is7xx_port [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Mon Dec 11 12:13:49 2023 -0500

    serial: sc16is7xx: remove global regmap from struct sc16is7xx_port
    
    commit f6959c5217bd799bcb770b95d3c09b3244e175c6 upstream.
    
    Remove global struct regmap so that it is more obvious that this
    regmap is to be used only in the probe function.
    
    Also add a comment to that effect in probe function.
    
    Fixes: 3837a0379533 ("serial: sc16is7xx: improve regmap debugfs by using one regmap per port")
    Cc:  <stable@vger.kernel.org>
    Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231211171353.2901416-3-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: remove obsolete loop in sc16is7xx_port_irq() [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Thu Dec 21 18:18:11 2023 -0500

    serial: sc16is7xx: remove obsolete loop in sc16is7xx_port_irq()
    
    commit ed647256e8f226241ecff7baaecdb8632ffc7ec1 upstream.
    
    Commit 834449872105 ("sc16is7xx: Fix for multi-channel stall") changed
    sc16is7xx_port_irq() from looping multiple times when there were still
    interrupts to serve. It simply changed the do {} while(1) loop to a
    do {} while(0) loop, which makes the loop itself now obsolete.
    
    Clean the code by removing this obsolete do {} while(0) loop.
    
    Fixes: 834449872105 ("sc16is7xx: Fix for multi-channel stall")
    Cc:  <stable@vger.kernel.org>
    Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231221231823.2327894-5-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: remove unused line structure member [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Mon Dec 11 12:13:50 2023 -0500

    serial: sc16is7xx: remove unused line structure member
    
    commit 41a308cbedb2a68a6831f0f2e992e296c4b8aff0 upstream.
    
    Now that the driver has been converted to use one regmap per port, the line
    structure member is no longer used, so remove it.
    
    Fixes: 3837a0379533 ("serial: sc16is7xx: improve regmap debugfs by using one regmap per port")
    Cc:  <stable@vger.kernel.org>
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231211171353.2901416-4-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: remove wasteful static buffer in sc16is7xx_regmap_name() [+ + +]

Author: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Date:   Mon Dec 11 12:13:48 2023 -0500

    serial: sc16is7xx: remove wasteful static buffer in sc16is7xx_regmap_name()
    
    commit 6bcab3c8acc88e265c570dea969fd04f137c8a4c upstream.
    
    Using a static buffer inside sc16is7xx_regmap_name() was a convenient and
    simple way to set the regmap name without having to allocate and free a
    buffer each time it is called. The drawback is that the static buffer
    wastes memory for nothing once regmap is fully initialized.
    
    Remove static buffer and use constant strings instead.
    
    This also avoids a truncation warning when using "%d" or "%u" in snprintf
    which was flagged by kernel test robot.
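
    A sketch of the constant-string approach. The function name and the
    "port0"/"port1" strings are assumptions based on the debugfs names
    shown for the per-port regmaps; no buffer or snprintf() is involved.

```c
#include <stddef.h>

/* Return a constant name for a port, or NULL for an unknown port id. */
static const char *demo_regmap_name(unsigned int port_id)
{
    switch (port_id) {
    case 0:
        return "port0";
    case 1:
        return "port1";
    default:
        return NULL;
    }
}
```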
    
    Fixes: 3837a0379533 ("serial: sc16is7xx: improve regmap debugfs by using one regmap per port")
    Cc:  <stable@vger.kernel.org> # 6.1.x: 3837a03 serial: sc16is7xx: improve regmap debugfs by using one regmap per port
    Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
    Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
    Link: https://lore.kernel.org/r/20231211171353.2901416-2-hugo@hugovil.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: sc16is7xx: Use port lock wrappers [+ + +]

Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu Sep 14 20:44:12 2023 +0206

    serial: sc16is7xx: Use port lock wrappers
    
    [ Upstream commit b465848be8a652e2c5fefe102661fb660cff8497 ]
    
    When a serial port is used for kernel console output, then all
    modifications to the UART registers which are done from other contexts,
    e.g. getty, termios, are interference points for the kernel console.
    
    So far this has been ignored and the printk output is based on the
    principle of hope. The rework of the console infrastructure, which aims to
    support threaded and atomic consoles, requires marking sections which
    modify the UART registers as unsafe. This allows the atomic write function
    to make informed decisions and eventually to restore operational state. It
    also makes it possible to prevent the regular UART code from modifying
    UART registers while printk output is in progress.
    
    All modifications of UART registers are guarded by the UART port lock,
    which provides an obvious synchronization point with the console
    infrastructure.
    
    To avoid adding this functionality to all UART drivers, wrap the
    spin_[un]lock*() invocations for uart_port::lock into helper functions
    which just contain the spin_[un]lock*() invocations for now. In a
    subsequent step these helpers will gain the console synchronization
    mechanisms.
    
    Converted with coccinelle. No functional change.
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: John Ogness <john.ogness@linutronix.de>
    Link: https://lore.kernel.org/r/20230914183831.587273-56-john.ogness@linutronix.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Stable-dep-of: 9915753037eb ("serial: sc16is7xx: fix unconditional activation of THRI interrupt")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

sh: ecovec24: Rename missed backlight field from fbdev to dev [+ + +]

Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date:   Mon Sep 25 13:10:22 2023 +0200

    sh: ecovec24: Rename missed backlight field from fbdev to dev
    
    [ Upstream commit d87123aa9a7920e88633ffc5c5a0a22ab08bdc06 ]
    
    One instance of gpio_backlight_platform_data.fbdev was renamed, but the
    second instance was forgotten, causing a build failure:
    
        arch/sh/boards/mach-ecovec24/setup.c: In function ‘arch_setup’:
        arch/sh/boards/mach-ecovec24/setup.c:1223:37: error: ‘struct gpio_backlight_platform_data’ has no member named ‘fbdev’; did you mean ‘dev’?
         1223 |                 gpio_backlight_data.fbdev = NULL;
              |                                     ^~~~~
              |                                     dev
    
    Fix this by updating the second instance.
    
    Fixes: ed369def91c1579a ("backlight/gpio_backlight: Rename field 'fbdev' to 'dev'")
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202309231601.Uu6qcRnU-lkp@intel.com/
    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
    Link: https://lore.kernel.org/r/20230925111022.3626362-1-geert+renesas@glider.be
    Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

smb: client: fix parsing of SMB3.1.1 POSIX create context [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Fri Jan 19 01:08:26 2024 -0300

    smb: client: fix parsing of SMB3.1.1 POSIX create context
    
    [ Upstream commit 76025cc2285d9ede3d717fe4305d66f8be2d9346 ]
    
    The data offset for the SMB3.1.1 POSIX create context will always be
    8-byte aligned so having the check 'noff + nlen >= doff' in
    smb2_parse_contexts() is wrong as it will lead to -EINVAL because noff
    + nlen == doff.
    
    Fix the sanity check to correctly handle aligned create context data.
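
    The off-by-one can be illustrated with two predicates. The function
    names are hypothetical, and the corrected comparison (strictly greater
    than) is an assumption drawn from the description, not a copy of the
    patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old check: also rejects a name that ends exactly where the data starts. */
static bool demo_reject_old(uint32_t noff, uint32_t nlen, uint32_t doff)
{
    return noff + nlen >= doff;
}

/* Assumed fix: reject only when the name really overlaps the data. */
static bool demo_reject_new(uint32_t noff, uint32_t nlen, uint32_t doff)
{
    return noff + nlen > doff;
}
```

    With noff = 16, nlen = 16 and doff = 32 (example values), the old check
    rejects the context although nothing overlaps; the assumed fix accepts it.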
    
    Fixes: af1689a9b770 ("smb: client: fix potential OOBs in smb2_parse_contexts()")
    Signed-off-by: Paulo Alcantara <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

soc: fsl: cpm1: qmc: Fix __iomem addresses declaration [+ + +]

Author: Herve Codina <herve.codina@bootlin.com>
Date:   Tue Dec 5 16:20:59 2023 +0100

    soc: fsl: cpm1: qmc: Fix __iomem addresses declaration
    
    commit a5ec3a21220da06bdda2e686012ca64fdb6c513d upstream.
    
    Running sparse (make C=1) on qmc.c raises a lot of warnings such as:
      ...
      warning: incorrect type in assignment (different address spaces)
         expected struct cpm_buf_desc [usertype] *[noderef] __iomem bd
         got struct cpm_buf_desc [noderef] [usertype] __iomem *txbd_free
      ...
    
    Indeed, some variables were declared 'type *__iomem var' instead of
    'type __iomem *var'.
    
    Use the correct declaration to remove these warnings.
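
    The position of the annotation matters in the same way as for ordinary
    C qualifiers. As an analogy only (volatile stands in for __iomem, and
    the demo_* names are hypothetical), the two spellings below declare
    different types:

```c
#include <stdint.h>

struct demo_ctx {
    /* Qualifier before '*': the pointed-to data is qualified.
     * This is the 'type __iomem *var' shape. */
    volatile uint32_t *regs;

    /* Qualifier after '*': the pointer variable itself is qualified.
     * This is the 'type *__iomem var' shape. */
    uint32_t *volatile cursor;
};

/* Accessors take a pointer to qualified data, like the first member. */
static uint32_t demo_read(const volatile uint32_t *reg)
{
    return *reg;
}
```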
    
    Fixes: 3178d58e0b97 ("soc: fsl: cpm1: Add support for QMC")
    Cc: stable@vger.kernel.org
    Signed-off-by: Herve Codina <herve.codina@bootlin.com>
    Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Link: https://lore.kernel.org/r/20231205152116.122512-3-herve.codina@bootlin.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

soc: fsl: cpm1: qmc: Fix rx channel reset [+ + +]

Author: Herve Codina <herve.codina@bootlin.com>
Date:   Tue Dec 5 16:21:00 2023 +0100

    soc: fsl: cpm1: qmc: Fix rx channel reset
    
    commit dfe66d012af2ddfa566cf9c860b8472b412fb7e4 upstream.
    
    The qmc_chan_reset_rx() sets the is_rx_stopped flag. This leads to an
    inconsistent state in the following sequence.
        qmc_chan_stop()
        qmc_chan_reset()
    Indeed, after the qmc_chan_reset() call, the channel must still be
    stopped. Only a qmc_chan_start() call can move the channel from stopped
    state to started state.
    
    Fix the issue by removing the is_rx_stopped flag setting from
    qmc_chan_reset().
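
    The intended state handling can be sketched as a tiny model. All
    demo_* names and the rx_pending field are hypothetical; the only point
    is that a reset leaves the started/stopped state alone.

```c
#include <stdbool.h>

struct demo_chan {
    bool is_rx_stopped;
    unsigned int rx_pending; /* stands in for queued receive work */
};

static void demo_chan_stop(struct demo_chan *c)
{
    c->is_rx_stopped = true;
}

static void demo_chan_start(struct demo_chan *c)
{
    c->is_rx_stopped = false;
}

/* Reset drops queued work but does not touch is_rx_stopped. */
static void demo_chan_reset(struct demo_chan *c)
{
    c->rx_pending = 0;
}
```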
    
    Fixes: 3178d58e0b97 ("soc: fsl: cpm1: Add support for QMC")
    Cc: stable@vger.kernel.org
    Signed-off-by: Herve Codina <herve.codina@bootlin.com>
    Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Link: https://lore.kernel.org/r/20231205152116.122512-4-herve.codina@bootlin.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

soc: fsl: cpm1: tsa: Fix __iomem addresses declaration [+ + +]

Author: Herve Codina <herve.codina@bootlin.com>
Date:   Tue Dec 5 16:20:58 2023 +0100

    soc: fsl: cpm1: tsa: Fix __iomem addresses declaration
    
    commit fc0c64154e5ddeb6f63c954735bd646ce5b8d9a4 upstream.
    
    Running sparse (make C=1) on tsa.c raises a lot of warnings such as:
      --- 8< ---
      warning: incorrect type in assignment (different address spaces)
         expected void *[noderef] si_regs
         got void [noderef] __iomem *
      --- 8< ---
    
    Indeed, some variables were declared 'type *__iomem var' instead of
    'type __iomem *var'.
    
    Use the correct declaration to remove these warnings.
    
    Fixes: 1d4ba0b81c1c ("soc: fsl: cpm1: Add support for TSA")
    Cc: stable@vger.kernel.org
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202312051959.9YdRIYbg-lkp@intel.com/
    Signed-off-by: Herve Codina <herve.codina@bootlin.com>
    Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Link: https://lore.kernel.org/r/20231205152116.122512-2-herve.codina@bootlin.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

soc: qcom: pmic_glink_altmode: fix port sanity check [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Thu Nov 9 10:31:00 2023 +0100

    soc: qcom: pmic_glink_altmode: fix port sanity check
    
    commit c4fb7d2eac9ff9bfc35a2e4d40c7169a332416e0 upstream.
    
    The PMIC GLINK altmode driver currently supports at most two ports.
    
    Fix the incomplete port sanity check on notifications to avoid
    accessing and corrupting memory beyond the port array if we ever get a
    notification for an unsupported port.
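
    A minimal sketch of a complete bounds check, with hypothetical demo_*
    names and a two-entry port array as stated above:

```c
#include <stddef.h>

#define DEMO_MAX_PORTS 2
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_port {
    int mode;
};

struct demo_altmode {
    struct demo_port ports[DEMO_MAX_PORTS];
};

/* Reject any port index outside the array before touching it. */
static int demo_notify(struct demo_altmode *am, size_t port, int mode)
{
    if (port >= DEMO_ARRAY_SIZE(am->ports))
        return -1;

    am->ports[port].mode = mode;
    return 0;
}
```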
    
    Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support")
    Cc: stable@vger.kernel.org      # 6.3
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20231109093100.19971-1-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

soundwire: bus: introduce controller_id [+ + +]

Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date:   Tue Oct 17 11:09:32 2023 -0500

    soundwire: bus: introduce controller_id
    
    [ Upstream commit 6543ac13c623f906200dfd3f1c407d8d333b6995 ]
    
    The existing SoundWire support misses a clear Controller/Manager
    hierarchical definition to deal with all variants across SOC vendors.
    
    a) Intel platforms have one controller with 4 or more Managers.
    b) AMD platforms have two controllers with one Manager each, but due
    to BIOS issues use two different link_id values within the scope of a
    single controller.
    c) QCOM platforms have one or more controllers with one Manager each.
    
    This patch adds a 'controller_id' which can be set by higher
    levels. If assigned to -1, the controller_id will be set to the
    system-unique IDA-assigned bus->id.
    
    The main change is that the bus->id is no longer used for any device
    name, which makes the definition completely predictable and not
    dependent on any enumeration order. The bus->id is only used to insert
    the Managers in the stream rt context.
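
    The fallback rule can be sketched as follows. The struct and function
    names are hypothetical; only the "-1 means use bus->id" rule comes from
    the description.

```c
struct demo_bus {
    int id;            /* system-unique, allocator-assigned */
    int controller_id; /* set by higher levels, or derived from id */
};

/* A requested value of -1 falls back to the unique bus id. */
static void demo_bus_set_controller_id(struct demo_bus *bus, int requested)
{
    bus->controller_id = (requested == -1) ? bus->id : requested;
}
```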
    
    Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Reviewed-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
    Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Tested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
    Link: https://lore.kernel.org/stable/20231017160933.12624-2-pierre-louis.bossart%40linux.intel.com
    Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
    Link: https://lore.kernel.org/r/20231017160933.12624-2-pierre-louis.bossart@linux.intel.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Stable-dep-of: 8a8a9ac8a497 ("soundwire: fix initializing sysfs for same devices on different buses")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

soundwire: fix initializing sysfs for same devices on different buses [+ + +]

Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Tue Oct 17 11:09:33 2023 -0500

    soundwire: fix initializing sysfs for same devices on different buses
    
    [ Upstream commit 8a8a9ac8a4972ee69d3dd3d1ae43963ae39cee18 ]
    
    If identical devices with the same device IDs are present on different
    SoundWire buses, the probe fails due to conflicting device names and
    sysfs entries:
    
      sysfs: cannot create duplicate filename '/bus/soundwire/devices/sdw:0:0217:0204:00:0'
    
    The link ID is 0 for both devices, so they should be differentiated by
    the controller ID. Add the controller ID so that the device names and
    sysfs entries look like:
    
      sdw:1:0:0217:0204:00:0 -> ../../../devices/platform/soc@0/6ab0000.soundwire-controller/sdw-master-1-0/sdw:1:0:0217:0204:00:0
      sdw:3:0:0217:0204:00:0 -> ../../../devices/platform/soc@0/6b10000.soundwire-controller/sdw-master-3-0/sdw:3:0:0217:0204:00:0
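
    As a rough illustration of why the extra field removes the collision,
    the naming scheme can be sketched in userspace C. The struct, the
    function names, and the format widths below are assumptions inferred
    from the example names in this message; they are not copied from the
    driver.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fields that make up a device name. */
struct sdw_name_parts {
    int controller_id;      /* new field that tells the buses apart */
    int link_id;
    unsigned int mfg_id;
    unsigned int part_id;
    unsigned int class_id;
    unsigned int unique_id;
};

/*
 * Format "sdw:<controller>:<link>:<mfg>:<part>:<class>:<unique>".
 * Returns the snprintf() result. The field widths are guessed from the
 * example names, so treat them as an assumption.
 */
int sdw_format_name(char *buf, size_t len, const struct sdw_name_parts *p)
{
    assert(buf != NULL && p != NULL);
    return snprintf(buf, len, "sdw:%x:%x:%04x:%04x:%02x:%x",
                    (unsigned int)p->controller_id,
                    (unsigned int)p->link_id,
                    p->mfg_id, p->part_id, p->class_id, p->unique_id);
}

/* Two devices collide in sysfs when their names are identical strings. */
int sdw_names_collide(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

    With the controller ID in the name, two devices that share the link
    ID and device IDs no longer produce the same string.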
    
    [PLB changes: use bus->controller_id instead of bus->id]
    
    Fixes: 7c3cd189b86d ("soundwire: Add Master registration")
    Cc: stable@vger.kernel.org
    Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Reviewed-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
    Co-developed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Tested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Acked-by: Mark Brown <broonie@kernel.org>
    Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
    Link: https://lore.kernel.org/r/20231017160933.12624-3-pierre-louis.bossart@linux.intel.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: bcm-qspi: fix SFDP BFPT read by using mspi read [+ + +]

Author: Kamal Dasu <kamal.dasu@broadcom.com>
Date:   Tue Jan 9 16:00:32 2024 -0500

    spi: bcm-qspi: fix SFDP BFPT read by using mspi read
    
    [ Upstream commit 574bf7bbe83794a902679846770f75a9b7f28176 ]
    
    SFDP reads shall use MSPI reads when going through the
    bcm_qspi_exec_mem_op() call. This fixes SFDP parameter page read failures
    seen with parts that now use the SFDP protocol to read the basic flash
    parameter table (BFPT).
    
    Fixes: 5f195ee7d830 ("spi: bcm-qspi: Implement the spi_mem interface")
    Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Link: https://msgid.link/r/20240109210033.43249-1-kamal.dasu@broadcom.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: fix finalize message on error return [+ + +]

Author: David Lechner <dlechner@baylibre.com>
Date:   Thu Jan 25 14:53:09 2024 -0600

    spi: fix finalize message on error return
    
    [ Upstream commit 8c2ae772fe08e33f3d7a83849e85539320701abd ]
    
    In __spi_pump_transfer_message(), the message was not finalized in the
    first error return as it is in the other error return paths. Not
    finalizing the message could cause anything waiting on the message to
    complete to hang forever.
    
    This adds the missing call to spi_finalize_current_message().
    
    Fixes: ae7d2346dc89 ("spi: Don't use the message queue if possible in spi_sync")
    Signed-off-by: David Lechner <dlechner@baylibre.com>
    Link: https://msgid.link/r/20240125205312.3458541-2-dlechner@baylibre.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: intel-pci: Remove Meteor Lake-S SoC PCI ID from the list [+ + +]

Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date:   Mon Jan 22 14:00:33 2024 +0200

    spi: intel-pci: Remove Meteor Lake-S SoC PCI ID from the list
    
    [ Upstream commit 6c314425b9ef6b247cefd0903e287eb072580c3b ]
    
    Turns out this "SoC" side controller does not support certain commands,
    such as reading chip JEDEC ID, so the controller is pretty much unusable
    in Linux. We should be using the "PCH" side controller instead. For this
    reason remove this PCI ID from the list.
    
    Fixes: c2912d42e86e ("spi: intel-pci: Add support for Meteor Lake-S SPI serial flash")
    Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Link: https://msgid.link/r/20240122120034.2664812-2-mika.westerberg@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: spi-cadence: Reverse the order of interleaved write and read operations [+ + +]

Author: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com>
Date:   Mon Dec 18 14:36:52 2023 +0530

    spi: spi-cadence: Reverse the order of interleaved write and read operations
    
    [ Upstream commit 633cd6fe6e1993ba80e0954c2db127a0b1a3e66f ]
    
    In the existing implementation, when executing interleaved write and read
    operations in the ISR for a transfer length greater than the FIFO size,
    the TXFIFO write precedes the RXFIFO read. Consequently, the initially
    received data in the RXFIFO is pushed out and lost, leading to a failure
    in data integrity. To address this issue, reverse the order of interleaved
    operations and conduct the RXFIFO read followed by the TXFIFO write.
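
    The effect of the ordering can be shown with a deliberately
    pessimistic toy model, in which every byte written to the TX FIFO
    immediately produces one byte for the RX FIFO. This is an assumption
    made for illustration only; FIFO_DEPTH and all names are hypothetical
    and the real hardware timing is more forgiving.

```c
#include <assert.h>

#define FIFO_DEPTH 8    /* hypothetical FIFO depth */

struct toy_fifo {
    int rx_level;   /* bytes currently held in the RX FIFO */
    int rx_lost;    /* bytes dropped because the RX FIFO was full */
};

/* Each TX byte yields one RX byte; it is lost if the RX FIFO is full. */
static void toy_write_tx(struct toy_fifo *f, int n)
{
    for (int i = 0; i < n; i++) {
        if (f->rx_level == FIFO_DEPTH)
            f->rx_lost++;
        else
            f->rx_level++;
    }
}

/* Drain n bytes from the RX FIFO. */
static void toy_read_rx(struct toy_fifo *f, int n)
{
    assert(n <= f->rx_level);
    f->rx_level -= n;
}

/*
 * One ISR pass that moves n bytes, starting from a full RX FIFO.
 * Returns the number of received bytes that were lost.
 */
int toy_isr_lost(int n, int read_first)
{
    struct toy_fifo f = { .rx_level = FIFO_DEPTH, .rx_lost = 0 };

    assert(n >= 0 && n <= FIFO_DEPTH);
    if (read_first) {
        toy_read_rx(&f, n);     /* make room first ... */
        toy_write_tx(&f, n);    /* ... then refill TX */
    } else {
        toy_write_tx(&f, n);    /* RX is still full: bytes overflow */
        toy_read_rx(&f, n);
    }
    return f.rx_lost;
}
```

    In this model, writing first loses every byte of the pass, while
    reading first loses none.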
    
    Fixes: 6afe2ae8dc48 ("spi: spi-cadence: Interleave write of TX and read of RX FIFO")
    Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com>
    Link: https://msgid.link/r/20231218090652.18403-1-amit.kumar-mahapatra@amd.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: use request size to initialize bio_vec in svc_udp_sendto() [+ + +]

Author: Lucas Stach <l.stach@pengutronix.de>
Date:   Wed Jan 17 22:06:28 2024 +0100

    SUNRPC: use request size to initialize bio_vec in svc_udp_sendto()
    
    [ Upstream commit 1d9cabe2817edd215779dc9c2fe5e7ab9aac0704 ]
    
    Use the proper size when setting up the bio_vec, as otherwise only
    zero-length UDP packets will be sent.
    
    Fixes: baabf59c2414 ("SUNRPC: Convert svc_udp_sendto() to use the per-socket bio_vec array")
    Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tcp: Add memory barrier to tcp_push() [+ + +]

Author: Salvatore Dipietro <dipiets@amazon.com>
Date:   Fri Jan 19 11:01:33 2024 -0800

    tcp: Add memory barrier to tcp_push()
    
    [ Upstream commit 7267e8dcad6b2f9fce05a6a06335d7040acbc2b6 ]
    
    On CPUs with weak memory models, reads and updates performed by tcp_push
    to the sk variables can get reordered, leaving the socket throttled when
    it should not be. The tasklet running tcp_wfree() may also not observe the
    memory updates in time and will skip flushing any packets throttled by
    tcp_push(), delaying the sending. This can pathologically cause 40ms
    extra latency due to bad interactions with delayed acks.
    
    Adding a memory barrier in tcp_push removes the bug, similarly to the
    previous commit bf06200e732d ("tcp: tsq: fix nonagle handling").
    smp_mb__after_atomic() is used to avoid incurring unnecessary overhead
    on x86, which is not affected.
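
    A userspace analogy of the ordering requirement, written with C11
    atomics, might look like the sketch below. The struct, the flag bit,
    and the function name are hypothetical, and the sequentially
    consistent fence merely stands in for the kernel barrier; this is not
    the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_THROTTLED 0x1u  /* hypothetical flag bit */

struct toy_sock {
    atomic_uint flags;       /* stands in for the TSQ flags word */
    atomic_int  wmem_alloc;  /* bytes still owned by in-flight packets */
};

/*
 * Returns true if the caller may defer the send because other packets
 * are still in flight and their completion will flush the queue later.
 */
bool toy_try_cork(struct toy_sock *sk, int skb_truesize)
{
    assert(sk != NULL);

    /* Publish the throttled flag ... */
    atomic_fetch_or_explicit(&sk->flags, TOY_THROTTLED,
                             memory_order_relaxed);

    /* ... and order that store before the load below (full barrier). */
    atomic_thread_fence(memory_order_seq_cst);

    /* Only defer if something other than this buffer is in flight. */
    return atomic_load_explicit(&sk->wmem_alloc,
                                memory_order_relaxed) > skb_truesize;
}
```

    The completion side has to mirror this: update the in-flight counter,
    issue a barrier, then read the flag. Without both barriers, each side
    can miss the other's update on a weakly ordered CPU.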
    
    Patch has been tested using an AWS c7g.2xlarge instance with Ubuntu
    22.04 and Apache Tomcat 9.0.83 running the basic servlet below:
    
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    
    public class HelloWorldServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
          throws ServletException, IOException {
            response.setContentType("text/html;charset=utf-8");
            OutputStreamWriter osw = new OutputStreamWriter(response.getOutputStream(),"UTF-8");
            String s = "a".repeat(3096);
            osw.write(s,0,s.length());
            osw.flush();
        }
    }
    
    Load was applied using wrk2 (https://github.com/kinvolk/wrk2) from an AWS
    c6i.8xlarge instance. Before the patch, an additional 40ms of latency is
    observed at P99.99 and above; with the patch, the extra latency disappears.
    
    No patch and tcp_autocorking=1
    ./wrk -t32 -c128 -d40s --latency -R10000  http://172.31.60.173:8080/hello/hello
      ...
     50.000%    0.91ms
     75.000%    1.13ms
     90.000%    1.46ms
     99.000%    1.74ms
     99.900%    1.89ms
     99.990%   41.95ms  <<< 40+ ms extra latency
     99.999%   48.32ms
    100.000%   48.96ms
    
    With patch and tcp_autocorking=1
    ./wrk -t32 -c128 -d40s --latency -R10000  http://172.31.60.173:8080/hello/hello
      ...
     50.000%    0.90ms
     75.000%    1.13ms
     90.000%    1.45ms
     99.000%    1.72ms
     99.900%    1.83ms
     99.990%    2.11ms  <<< no 40+ ms extra latency
     99.999%    2.53ms
    100.000%    2.62ms
    
    The patch has also been tested on x86 (m7i.2xlarge instance), which is not
    affected by this issue, and the patch doesn't introduce any additional
    delay there.
    
    Fixes: 7aa5470c2c09 ("tcp: tsq: move tsq_flags close to sk_wmem_alloc")
    Signed-off-by: Salvatore Dipietro <dipiets@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240119190133.43698-1-dipiets@amazon.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tcp: make sure init the accept_queue's spinlocks once [+ + +]

Author: Zhengchao Shao <shaozhengchao@huawei.com>
Date:   Thu Jan 18 09:20:19 2024 +0800

    tcp: make sure init the accept_queue's spinlocks once
    
    [ Upstream commit 198bc90e0e734e5f98c3d2833e8390cac3df61b2 ]
    
    When I run syz's reproduction C program locally, it causes the following
    issue:
    pvqspinlock: lock 0xffff9d181cd5c660 has corrupted value 0x0!
    WARNING: CPU: 19 PID: 21160 at __pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508)
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:__pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508)
    Code: 73 56 3a ff 90 c3 cc cc cc cc 8b 05 bb 1f 48 01 85 c0 74 05 c3 cc cc cc cc 8b 17 48 89 fe 48 c7 c7
    30 20 ce 8f e8 ad 56 42 ff <0f> 0b c3 cc cc cc cc 0f 0b 0f 1f 40 00 90 90 90 90 90 90 90 90 90
    RSP: 0018:ffffa8d200604cb8 EFLAGS: 00010282
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9d1ef60e0908
    RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9d1ef60e0900
    RBP: ffff9d181cd5c280 R08: 0000000000000000 R09: 00000000ffff7fff
    R10: ffffa8d200604b68 R11: ffffffff907dcdc8 R12: 0000000000000000
    R13: ffff9d181cd5c660 R14: ffff9d1813a3f330 R15: 0000000000001000
    FS:  00007fa110184640(0000) GS:ffff9d1ef60c0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020000000 CR3: 000000011f65e000 CR4: 00000000000006f0
    Call Trace:
    <IRQ>
      _raw_spin_unlock (kernel/locking/spinlock.c:186)
      inet_csk_reqsk_queue_add (net/ipv4/inet_connection_sock.c:1321)
      inet_csk_complete_hashdance (net/ipv4/inet_connection_sock.c:1358)
      tcp_check_req (net/ipv4/tcp_minisocks.c:868)
      tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2260)
      ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205)
      ip_local_deliver_finish (net/ipv4/ip_input.c:234)
      __netif_receive_skb_one_core (net/core/dev.c:5529)
      process_backlog (./include/linux/rcupdate.h:779)
      __napi_poll (net/core/dev.c:6533)
      net_rx_action (net/core/dev.c:6604)
      __do_softirq (./arch/x86/include/asm/jump_label.h:27)
      do_softirq (kernel/softirq.c:454 kernel/softirq.c:441)
    </IRQ>
    <TASK>
      __local_bh_enable_ip (kernel/softirq.c:381)
      __dev_queue_xmit (net/core/dev.c:4374)
      ip_finish_output2 (./include/net/neighbour.h:540 net/ipv4/ip_output.c:235)
      __ip_queue_xmit (net/ipv4/ip_output.c:535)
      __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
      tcp_rcv_synsent_state_process (net/ipv4/tcp_input.c:6469)
      tcp_rcv_state_process (net/ipv4/tcp_input.c:6657)
      tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1929)
      __release_sock (./include/net/sock.h:1121 net/core/sock.c:2968)
      release_sock (net/core/sock.c:3536)
      inet_wait_for_connect (net/ipv4/af_inet.c:609)
      __inet_stream_connect (net/ipv4/af_inet.c:702)
      inet_stream_connect (net/ipv4/af_inet.c:748)
      __sys_connect (./include/linux/file.h:45 net/socket.c:2064)
      __x64_sys_connect (net/socket.c:2073 net/socket.c:2070 net/socket.c:2070)
      do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82)
      entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
      RIP: 0033:0x7fa10ff05a3d
      Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89
      c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab a3 0e 00 f7 d8 64 89 01 48
      RSP: 002b:00007fa110183de8 EFLAGS: 00000202 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000020000054 RCX: 00007fa10ff05a3d
      RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000003
      RBP: 00007fa110183e20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000202 R12: 00007fa110184640
      R13: 0000000000000000 R14: 00007fa10fe8b060 R15: 00007fff73e23b20
    </TASK>
    
    The sequence that triggers the issue is as follows:
    Thread A                                       Thread B
    tcp_v4_rcv      //receive ack TCP packet       inet_shutdown
      tcp_check_req                                  tcp_disconnect //disconnect sock
      ...                                              tcp_set_state(sk, TCP_CLOSE)
        inet_csk_complete_hashdance                ...
          inet_csk_reqsk_queue_add                 inet_listen  //start listen
            spin_lock(&queue->rskq_lock)             inet_csk_listen_start
            ...                                        reqsk_queue_alloc
            ...                                          spin_lock_init
            spin_unlock(&queue->rskq_lock)  //warning
    
    When the socket receives the ACK packet during the three-way handshake,
    it takes the spinlock. If the user then shuts down the socket and
    immediately listens on it again, the spinlock is reinitialized while it
    is still held. When the first thread goes on to release the spinlock, a
    warning is generated. The same issue applies to fastopenq.lock.
    
    Move the spinlock initialization to inet_create and inet_accept to make
    sure the accept_queue's spinlocks are initialized only once.
    
    Fixes: fff1f3001cc5 ("tcp: add a spinlock to protect struct request_sock_queue")
    Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path")
    Reported-by: Ming Shu <sming56@aliyun.com>
    Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240118012019.1751966-1-shaozhengchao@huawei.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: core: Store trip pointer in struct thermal_instance [+ + +]

Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Thu Sep 21 19:52:44 2023 +0200

    thermal: core: Store trip pointer in struct thermal_instance
    
    [ Upstream commit 2c7b4bfadef08cc0995c24a7b9eb120fe897165f ]
    
    Replace the integer trip number stored in struct thermal_instance with
    a pointer to the relevant trip and adjust the code using the structure
    in question accordingly.
    
    The main reason for making this change is to make the trip point to
    cooling device binding code more straightforward, as illustrated by
    subsequent modifications of the ACPI thermal driver, but it also helps
    to clarify the overall design and allows the governor code overhead to
    be reduced (through subsequent modifications).
    
    The only case in which it adds complexity is trip_point_show() that
    needs to walk the trips[] table to find the index of the given trip
    point, but this is not a critical path and the interface that
    trip_point_show() belongs to is problematic anyway (for instance, it
    doesn't cover the case when the same cooling device is associated
    with multiple trip points).
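
    The table walk mentioned above can be sketched as follows. The
    structures and the function name are simplified, hypothetical
    stand-ins, and the -1 return value is an assumption made for this
    example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the thermal structures. */
struct toy_trip {
    int temperature;
};

struct toy_zone {
    struct toy_trip *trips;
    int num_trips;
};

/*
 * Recover the index of a trip from its pointer by walking trips[].
 * Returns -1 if the pointer does not belong to this zone.
 */
int toy_trip_id(const struct toy_zone *tz, const struct toy_trip *trip)
{
    assert(tz != NULL);
    for (int i = 0; i < tz->num_trips; i++) {
        if (&tz->trips[i] == trip)
            return i;
    }
    return -1;
}
```

    The walk is linear in the number of trips, which is acceptable
    because it is not on a critical path.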
    
    This is a preliminary change and the affected code will be refined by
    a series of subsequent modifications of thermal governors, the core and
    the ACPI thermal driver.
    
    The general functionality is not expected to be affected by this change.
    
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
    Stable-dep-of: e95fa7404716 ("thermal: gov_power_allocator: avoid inability to reset a cdev")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: gov_power_allocator: avoid inability to reset a cdev [+ + +]

Author: Di Shen <di.shen@unisoc.com>
Date:   Wed Jan 10 19:55:26 2024 +0800

    thermal: gov_power_allocator: avoid inability to reset a cdev
    
    [ Upstream commit e95fa7404716f6e25021e66067271a4ad8eb1486 ]
    
    Commit 0952177f2a1f ("thermal/core/power_allocator: Update once
    cooling devices when temp is low") adds an update flag to avoid
    triggering a thermal event when there is no need, and the thermal
    cdev is updated once when the temperature is low.
    
    But when the trips are writable, and switch_on_temp is set to be a
    higher value, the cooling device state may not be reset to 0,
    because last_temperature is smaller than switch_on_temp.
    
    For example:
    First:
    switch_on_temp=70 control_temp=85;
    Then userspace changes the trip_temp:
    switch_on_temp=45 control_temp=55 cur_temp=54
    
    Then userspace resets the trip_temp:
    switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54
    
    At this time, the cooling device state should be reset to 0.
    However, because
    cur_temp(57) < switch_on_temp(70) and
    last_temp(54) < switch_on_temp(70)  ---->  update = false,
    the cooling device state cannot be reset.
    
    Using the observation that tz->passive can also be regarded as the
    temperature status, set the update flag to the tz->passive value.
    
    When the temperature drops below switch_on for the first time, the
    states of cooling devices can be reset once, and tz->passive is updated
    to 0. In the next round, because tz->passive is 0, cdev->state will not
    be updated.
    
    By using the tz->passive value as the "update" flag, the issue above
    can be solved, and the cooling devices can be updated only once when the
    temperature is low.
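
    The difference between the two rules can be checked against the
    numbers from the example above. The sketch below is a simplified
    model with hypothetical names. It assumes that tz->passive was set
    to 1 while the zone was being throttled at cur_temp=54 with
    switch_on_temp=45.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-in for the thermal zone state. */
struct toy_tz {
    int temperature;       /* cur_temp */
    int last_temperature;  /* last_temp */
    int passive;           /* nonzero while the governor is throttling */
};

/* Previous rule: update only if the last reading was at or above switch_on. */
bool toy_update_old(const struct toy_tz *tz, int switch_on_temp)
{
    assert(tz->temperature < switch_on_temp);
    return tz->last_temperature >= switch_on_temp;
}

/* Rule described above: use tz->passive as the flag, then clear it. */
bool toy_update_new(struct toy_tz *tz, int switch_on_temp)
{
    assert(tz->temperature < switch_on_temp);
    bool update = tz->passive != 0;

    tz->passive = 0;    /* the next round sees 0 and skips the update */
    return update;
}
```

    With cur_temp=57, last_temp=54 and switch_on_temp=70, the old rule
    never resets the cooling devices, while the new rule resets them
    exactly once.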
    
    Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low")
    Cc: 5.13+ <stable@vger.kernel.org> # 5.13+
    Suggested-by: Wei Wang <wvw@google.com>
    Signed-off-by: Di Shen <di.shen@unisoc.com>
    Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: intel: hfi: Add syscore callbacks for system-wide PM [+ + +]

Author: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Date:   Tue Jan 9 19:07:04 2024 -0800

    thermal: intel: hfi: Add syscore callbacks for system-wide PM
    
    [ Upstream commit 97566d09fd02d2ab329774bb89a2cdf2267e86d9 ]
    
    The kernel allocates a memory buffer and provides its location to the
    hardware, which uses it to update the HFI table. This allocation occurs
    during boot and remains constant throughout runtime.
    
    When resuming from hibernation, the restore kernel allocates a second
    memory buffer and reprograms the HFI hardware with the new location as
    part of a normal boot. The location of the second memory buffer may
    differ from the one allocated by the image kernel.
    
    When the restore kernel transfers control to the image kernel, its HFI
    buffer becomes invalid, potentially leading to memory corruption if the
    hardware writes to it (the hardware continues to use the buffer from the
    restore kernel).
    
    It is also possible that the hardware "forgets" the address of the memory
    buffer when resuming from "deep" suspend. Memory corruption may also occur
    in such a scenario.
    
    To prevent the described memory corruption, disable HFI when preparing to
    suspend or hibernate. Enable it when resuming.
    
    Add syscore callbacks to handle the package of the boot CPU (packages of
    non-boot CPUs are handled via CPU offline). Syscore ops always run on the
    boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend
    and hibernation. Syscore ops only run in these cases.
    
    Cc: 6.1+ <stable@vger.kernel.org> # 6.1+
    Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
    [ rjw: Comment adjustment, subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline [+ + +]

Author: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Date:   Tue Jan 2 20:14:58 2024 -0800

    thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
    
    [ Upstream commit 1c53081d773c2cb4461636559b0d55b46559ceec ]
    
    In preparation to support hibernation, add functionality to disable an HFI
    instance during CPU offline. The last CPU of an instance that goes offline
    disables that instance.
    
    The Intel Software Development Manual states that the operating system must
    wait for the hardware to set MSR_IA32_PACKAGE_THERM_STATUS[26] after
    disabling an HFI instance to ensure that it will no longer write to the HFI
    memory. Some processors, however, never set this bit. Wait a minimum
    of 2ms to give the hardware time to complete any pending memory writes.
    
    Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Stable-dep-of: 97566d09fd02 ("thermal: intel: hfi: Add syscore callbacks for system-wide PM")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: intel: hfi: Refactor enabling code into helper functions [+ + +]

Author: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Date:   Tue Jan 2 20:14:56 2024 -0800

    thermal: intel: hfi: Refactor enabling code into helper functions
    
    [ Upstream commit 8a8b6bb93c704776c4b05cb517c3fa8baffb72f5 ]
    
    In preparation for the addition of a suspend notifier, wrap the logic to
    enable HFI and program its memory buffer into helper functions. Both the
    CPU hotplug callback and the suspend notifier will use them.
    
    This refactoring does not introduce functional changes.
    
    Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Stable-dep-of: 97566d09fd02 ("thermal: intel: hfi: Add syscore callbacks for system-wide PM")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

thermal: trip: Drop lockdep assertion from thermal_zone_trip_id() [+ + +]

Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Wed Oct 11 17:45:42 2023 +0200

    thermal: trip: Drop lockdep assertion from thermal_zone_trip_id()
    
    commit 108ffd12be24ba1d74b3314df8db32a0a6d55ba5 upstream.
    
    The lockdep assertion in thermal_zone_trip_id() triggers when the
    trip point sysfs attribute of a thermal instance is read, because
    there is no thermal zone locking in that code path.
    
    This is not very useful, though, because there is no mechanism by which
    the location of the trips[] table in a thermal zone or its size can
    change after binding cooling devices to the trips in that thermal
    zone and before those cooling devices are unbound from them.  Thus
    it is not in fact necessary to hold the thermal zone lock when
    thermal_zone_trip_id() is called from trip_point_show() and so the
    lockdep assertion in the former is invalid.
    
    Accordingly, drop that lockdep assertion.
    
    Fixes: 2c7b4bfadef0 ("thermal: core: Store trip pointer in struct thermal_instance")
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

thermal: trip: Drop redundant trips check from for_each_thermal_trip() [+ + +]

Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Tue Sep 19 20:59:53 2023 +0200

    thermal: trip: Drop redundant trips check from for_each_thermal_trip()
    
    [ Upstream commit a15ffa783ea4210877886c59566a0d20f6b2bc09 ]
    
    It is invalid to call for_each_thermal_trip() on an unregistered thermal
    zone anyway, and as per thermal_zone_device_register_with_trips(), the
    trips[] table must be present if num_trips is greater than zero for the
    given thermal zone.
    
    Hence, the trips check in for_each_thermal_trip() is redundant and so it
    can be dropped.
    
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
    Stable-dep-of: e95fa7404716 ("thermal: gov_power_allocator: avoid inability to reset a cdev")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tick/sched: Preserve number of idle sleeps across CPU hotplug events [+ + +]

Author: Tim Chen <tim.c.chen@linux.intel.com>
Date:   Mon Jan 22 15:35:34 2024 -0800

    tick/sched: Preserve number of idle sleeps across CPU hotplug events
    
    commit 9a574ea9069be30b835a3da772c039993c43369b upstream.
    
    Commit 71fee48f ("tick-sched: Fix idle and iowait sleeptime accounting vs
    CPU hotplug") preserved total idle sleep time and iowait sleeptime across
    CPU hotplug events.
    
    Similar reasoning applies to the number of idle calls and idle sleeps to
    get the proper average of sleep time per idle invocation.
    
    Preserve those fields too.
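
    A minimal sketch of the save, clear and restore pattern this implies
    is shown below. The structure is a hypothetical, reduced stand-in:
    only the four statistics named in this message are modelled, and
    other_state represents everything that is meant to be cleared.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the per-CPU tick state. */
struct toy_tick_sched {
    unsigned long idle_calls;
    unsigned long idle_sleeps;
    long long idle_sleeptime;
    long long iowait_sleeptime;
    int other_state;    /* stands in for everything that must be cleared */
};

/* Clear the state on hotplug but carry the four statistics across. */
void toy_tick_reset(struct toy_tick_sched *ts)
{
    assert(ts != NULL);

    unsigned long idle_calls = ts->idle_calls;
    unsigned long idle_sleeps = ts->idle_sleeps;
    long long idle_sleeptime = ts->idle_sleeptime;
    long long iowait_sleeptime = ts->iowait_sleeptime;

    memset(ts, 0, sizeof(*ts));

    ts->idle_calls = idle_calls;
    ts->idle_sleeps = idle_sleeps;
    ts->idle_sleeptime = idle_sleeptime;
    ts->iowait_sleeptime = iowait_sleeptime;
}
```

    If the sleep times survived but the counters were zeroed, the average
    sleep time per idle invocation computed from them would be inflated.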
    
    Fixes: 71fee48f ("tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug")
    Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240122233534.3094238-1-tim.c.chen@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tracing: Ensure visibility when inserting an element into tracing_map [+ + +]

Author: Petr Pavlu <petr.pavlu@suse.com>
Date:   Mon Jan 22 16:09:28 2024 +0100

    tracing: Ensure visibility when inserting an element into tracing_map
    
    [ Upstream commit 2b44760609e9eaafc9d234a6883d042fc21132a7 ]
    
    Running the following two commands in parallel on a multi-processor
    AArch64 machine can sporadically produce an unexpected warning about
    duplicate histogram entries:
    
     $ while true; do
         echo hist:key=id.syscall:val=hitcount > \
           /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
         cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
         sleep 0.001
       done
     $ stress-ng --sysbadaddr $(nproc)
    
    The warning looks as follows:
    
    [ 2911.172474] ------------[ cut here ]------------
    [ 2911.173111] Duplicates detected: 1
    [ 2911.173574] WARNING: CPU: 2 PID: 12247 at kernel/trace/tracing_map.c:983 tracing_map_sort_entries+0x3e0/0x408
    [ 2911.174702] Modules linked in: iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) af_packet(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) ena(E) tiny_power_button(E) qemu_fw_cfg(E) button(E) fuse(E) efi_pstore(E) ip_tables(E) x_tables(E) xfs(E) libcrc32c(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) polyval_ce(E) polyval_generic(E) ghash_ce(E) gf128mul(E) sm4_ce_gcm(E) sm4_ce_ccm(E) sm4_ce(E) sm4_ce_cipher(E) sm4(E) sm3_ce(E) sm3(E) sha3_ce(E) sha512_ce(E) sha512_arm64(E) sha2_ce(E) sha256_arm64(E) nvme(E) sha1_ce(E) nvme_core(E) nvme_auth(E) t10_pi(E) sg(E) scsi_mod(E) scsi_common(E) efivarfs(E)
    [ 2911.174738] Unloaded tainted modules: cppc_cpufreq(E):1
    [ 2911.180985] CPU: 2 PID: 12247 Comm: cat Kdump: loaded Tainted: G            E      6.7.0-default #2 1b58bbb22c97e4399dc09f92d309344f69c44a01
    [ 2911.182398] Hardware name: Amazon EC2 c7g.8xlarge/, BIOS 1.0 11/1/2018
    [ 2911.183208] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
    [ 2911.184038] pc : tracing_map_sort_entries+0x3e0/0x408
    [ 2911.184667] lr : tracing_map_sort_entries+0x3e0/0x408
    [ 2911.185310] sp : ffff8000a1513900
    [ 2911.185750] x29: ffff8000a1513900 x28: ffff0003f272fe80 x27: 0000000000000001
    [ 2911.186600] x26: ffff0003f272fe80 x25: 0000000000000030 x24: 0000000000000008
    [ 2911.187458] x23: ffff0003c5788000 x22: ffff0003c16710c8 x21: ffff80008017f180
    [ 2911.188310] x20: ffff80008017f000 x19: ffff80008017f180 x18: ffffffffffffffff
    [ 2911.189160] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000a15134b8
    [ 2911.190015] x14: 0000000000000000 x13: 205d373432323154 x12: 5b5d313131333731
    [ 2911.190844] x11: 00000000fffeffff x10: 00000000fffeffff x9 : ffffd1b78274a13c
    [ 2911.191716] x8 : 000000000017ffe8 x7 : c0000000fffeffff x6 : 000000000057ffa8
    [ 2911.192554] x5 : ffff0012f6c24ec0 x4 : 0000000000000000 x3 : ffff2e5b72b5d000
    [ 2911.193404] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0003ff254480
    [ 2911.194259] Call trace:
    [ 2911.194626]  tracing_map_sort_entries+0x3e0/0x408
    [ 2911.195220]  hist_show+0x124/0x800
    [ 2911.195692]  seq_read_iter+0x1d4/0x4e8
    [ 2911.196193]  seq_read+0xe8/0x138
    [ 2911.196638]  vfs_read+0xc8/0x300
    [ 2911.197078]  ksys_read+0x70/0x108
    [ 2911.197534]  __arm64_sys_read+0x24/0x38
    [ 2911.198046]  invoke_syscall+0x78/0x108
    [ 2911.198553]  el0_svc_common.constprop.0+0xd0/0xf8
    [ 2911.199157]  do_el0_svc+0x28/0x40
    [ 2911.199613]  el0_svc+0x40/0x178
    [ 2911.200048]  el0t_64_sync_handler+0x13c/0x158
    [ 2911.200621]  el0t_64_sync+0x1a8/0x1b0
    [ 2911.201115] ---[ end trace 0000000000000000 ]---
    
    The problem appears to be caused by CPU reordering of writes issued from
    __tracing_map_insert().
    
    The check for the presence of an element with a given key in this
    function is:
    
     val = READ_ONCE(entry->val);
     if (val && keys_match(key, val->key, map->key_size)) ...
    
    The write of a new entry is:
    
     elt = get_free_elt(map);
     memcpy(elt->key, key, map->key_size);
     entry->val = elt;
    
    The "memcpy(elt->key, key, map->key_size);" and "entry->val = elt;"
    stores may become visible in the reversed order on another CPU. This
    second CPU might then incorrectly determine that a new key doesn't match
    an already present val->key and subsequently insert a new element,
    resulting in a duplicate.
    
    Fix the problem by adding a write barrier between
    "memcpy(elt->key, key, map->key_size);" and "entry->val = elt;", and for
    good measure, also use WRITE_ONCE(entry->val, elt) for publishing the
    element. The sequence pairs with the mentioned "READ_ONCE(entry->val);"
    and the "val->key" check which has an address dependency.
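
    The publish pattern can be approximated in portable C11 as sketched
    below. This is an analogy, not the kernel code: a release store
    stands in for the write barrier plus WRITE_ONCE(), an acquire load
    stands in for READ_ONCE() with its address dependency (and is
    stronger than what the kernel reader needs), and TOY_KEY_SIZE and
    all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define TOY_KEY_SIZE 8  /* hypothetical fixed key size */

struct toy_elt {
    unsigned char key[TOY_KEY_SIZE];
};

struct toy_entry {
    _Atomic(struct toy_elt *) val;  /* NULL until an element is published */
};

/* Writer: fill in the key first, then publish with release ordering. */
void toy_publish(struct toy_entry *entry, struct toy_elt *elt,
                 const void *key)
{
    assert(entry != NULL && elt != NULL && key != NULL);
    memcpy(elt->key, key, TOY_KEY_SIZE);
    atomic_store_explicit(&entry->val, elt, memory_order_release);
}

/* Reader: nonzero only if an element is published and its key matches. */
int toy_key_present(struct toy_entry *entry, const void *key)
{
    struct toy_elt *val =
        atomic_load_explicit(&entry->val, memory_order_acquire);

    return val != NULL && memcmp(val->key, key, TOY_KEY_SIZE) == 0;
}
```

    Because the key is written before the pointer becomes visible, a
    reader that sees the pointer also sees the complete key and cannot
    mistake an existing element for a mismatch.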
    
    The barrier is placed on a path executed when adding an element for
    a new key. Subsequent updates targeting the same key remain unaffected.
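    
    As a rough user-space illustration of the ordering described above (not
    the kernel code itself), the sketch below uses C11 atomics. A release
    store stands in for the write barrier plus WRITE_ONCE(), and an acquire
    load is a conservative stand-in for READ_ONCE() plus the address
    dependency. The names KEY_SIZE, struct elt, struct entry, publish() and
    matches() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define KEY_SIZE 8  /* hypothetical fixed key size */

struct elt {
    unsigned char key[KEY_SIZE];
};

struct entry {
    _Atomic(struct elt *) val;  /* NULL until an element is published */
};

/* Writer: fill in the key first, then publish the pointer.
 * The release store keeps the memcpy() from becoming visible after it. */
void publish(struct entry *e, struct elt *elt, const void *key)
{
    assert(e && elt && key);
    memcpy(elt->key, key, KEY_SIZE);
    atomic_store_explicit(&e->val, elt, memory_order_release);
}

/* Reader: load the pointer, then compare the key through it.
 * Returns nonzero only if an element is published and its key matches. */
int matches(struct entry *e, const void *key)
{
    struct elt *val = atomic_load_explicit(&e->val, memory_order_acquire);

    return val != NULL && memcmp(val->key, key, KEY_SIZE) == 0;
}
```

    With this pairing, a reader that observes a non-NULL pointer also
    observes the key that was copied before the pointer was published.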
    
    From the user's perspective, the issue was introduced by commit
    c193707dde77 ("tracing: Remove code which merges duplicates"), which
    followed commit cbf4100efb8f ("tracing: Add support to detect and avoid
    duplicates"). The previous code operated differently; it inherently
    expected potential races which result in duplicates but merged them
    later when they occurred.
    
    Link: https://lore.kernel.org/linux-trace-kernel/20240122150928.27725-1-petr.pavlu@suse.com
    
    Fixes: c193707dde77 ("tracing: Remove code which merges duplicates")
    Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
    Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring [+ + +]

Author: Gerhard Engleder <gerhard@engleder-embedded.com>
Date:   Tue Jan 23 21:09:18 2024 +0100

    tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring
    
    [ Upstream commit 9a91c05f4bd6f6bdd6b8f90445e0da92e3ac956c ]
    
    The fill ring of the XDP socket may not contain enough buffers to
    completely fill the RX queue during socket creation. In this case the
    XDP_RING_NEED_WAKEUP flag is not set, because it is only set if the RX
    queue is not completely filled during polling.
    
    Also set the XDP_RING_NEED_WAKEUP flag if the RX queue is not completely
    filled during XDP socket creation.
    
    Fixes: 3fc2333933fd ("tsnep: Add XDP socket zero-copy RX support")
    Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tsnep: Remove FCS for XDP data path [+ + +]

Author: Gerhard Engleder <gerhard@engleder-embedded.com>
Date:   Tue Jan 23 21:09:17 2024 +0100

    tsnep: Remove FCS for XDP data path
    
    [ Upstream commit 50bad6f797d4d501c5ef416a6f92e1912ab5aa8b ]
    
    The RX data buffer includes the FCS. The FCS is already stripped for the
    normal data path. But for the XDP data path the FCS is included and
    acts like additional/useless data.
    
    Remove the FCS from the RX data buffer also for XDP.
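    
    A minimal sketch of the length adjustment, assuming the standard 4-byte
    Ethernet FCS. FCS_LEN and payload_len() are hypothetical names and are
    not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define FCS_LEN 4  /* Ethernet frame check sequence length in bytes */

/* Length handed to the XDP program: received length minus the trailing FCS. */
size_t payload_len(size_t rx_len)
{
    assert(rx_len >= FCS_LEN);
    return rx_len - FCS_LEN;
}
```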
    
    Fixes: 65b28c810035 ("tsnep: Add XDP RX support")
    Fixes: 3fc2333933fd ("tsnep: Add XDP socket zero-copy RX support")
    Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tun: add missing rx stats accounting in tun_xdp_act [+ + +]

Author: Yunjian Wang <wangyunjian@huawei.com>
Date:   Fri Jan 19 18:22:56 2024 +0800

    tun: add missing rx stats accounting in tun_xdp_act
    
    [ Upstream commit f1084c427f55d573fcd5688d9ba7b31b78019716 ]
    
    The TUN can be used as vhost-net backend, and it is necessary to
    count the packets transmitted from TUN to vhost-net/virtio-net.
    However, there are some places in the receive path that were not
    taken into account when using XDP. It would be beneficial to also
    include new accounting for successfully received bytes using
    dev_sw_netstats_rx_add.
    
    Fixes: 761876c857cb ("tap: XDP support")
    Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tun: fix missing dropped counter in tun_xdp_act [+ + +]

Author: Yunjian Wang <wangyunjian@huawei.com>
Date:   Fri Jan 19 18:22:35 2024 +0800

    tun: fix missing dropped counter in tun_xdp_act
    
    [ Upstream commit 5744ba05e7c4bff8fec133dd0f9e51ddffba92f5 ]
    
    Commit 8ae1aff0b331 ("tuntap: split out XDP logic") includes a
    dropped counter for XDP_DROP, XDP_ABORTED, and invalid XDP actions.
    Unfortunately, that commit missed the dropped counter when an error
    occurs during the XDP_TX and XDP_REDIRECT actions. This patch fixes
    the issue.
    
    Fixes: 8ae1aff0b331 ("tuntap: split out XDP logic")
    Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path [+ + +]

Author: Zhihao Cheng <chengzhihao1@huawei.com>
Date:   Fri Dec 22 16:54:46 2023 +0800

    ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path
    
    commit 1e022216dcd248326a5bb95609d12a6815bca4e2 upstream.
    
    In the error handling path of ubifs_symlink(), the inode is marked as
    bad first, then iput() is invoked. If inode->i_link was initialized by
    fscrypt_encrypt_symlink() in the encryption scenario, inode->i_link
    won't be freed by the call chain ubifs_free_inode -> fscrypt_free_inode
    in the error handling path, because make_bad_inode() has changed
    'inode->i_mode' to 'S_IFREG'.
    The following kmemleak report is easy to reproduce by injecting an
    error in ubifs_jnl_update() when creating a symlink in the encryption
    scenario:
     unreferenced object 0xffff888103da3d98 (size 8):
      comm "ln", pid 1692, jiffies 4294914701 (age 12.045s)
      backtrace:
       kmemdup+0x32/0x70
       __fscrypt_encrypt_symlink+0xed/0x1c0
       ubifs_symlink+0x210/0x300 [ubifs]
       vfs_symlink+0x216/0x360
       do_symlinkat+0x11a/0x190
       do_syscall_64+0x3b/0xe0
    There are two ways to fix it:
     1. Remove make_bad_inode() from the error handling path. We can do
        that because ubifs_evict_inode() performs the same steps for good
        and bad symlink inodes, since the inode->i_nlink check comes
        before is_bad_inode().
     2. Free inode->i_link before marking the inode bad.
    Method 2 is picked because, in my view, it has less impact.
    
    Cc: stable@vger.kernel.org
    Fixes: 2c58d548f570 ("fscrypt: cache decrypted symlink target in ->i_link")
    Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
    Suggested-by: Eric Biggers <ebiggers@kernel.org>
    Reviewed-by: Eric Biggers <ebiggers@google.com>
    Signed-off-by: Richard Weinberger <richard@nod.at>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

udp: fix busy polling [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Jan 18 20:17:49 2024 +0000

    udp: fix busy polling
    
    [ Upstream commit a54d51fb2dfb846aedf3751af501e9688db447f5 ]
    
    Generic sk_busy_loop_end() only looks at sk->sk_receive_queue
    for presence of packets.
    
    The problem is that for UDP sockets, after the blamed commit, some
    packets could be present in another queue: udp_sk(sk)->reader_queue
    
    In some cases, a busy poller could spin until timeout expiration,
    even if some packets are available in udp_sk(sk)->reader_queue.
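    
    The condition described above can be modelled roughly as follows. This
    is a simplified user-space sketch with hypothetical names (struct
    sock_model, busy_loop_should_end()), where queue lengths stand in for
    the real queues. It is not the kernel's sk_busy_loop_end().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a socket with the two receive-side queues. */
struct sock_model {
    bool is_udp;
    size_t receive_queue_len;   /* models sk->sk_receive_queue */
    size_t reader_queue_len;    /* models udp_sk(sk)->reader_queue */
};

/* Returns true when a busy poller should stop spinning because data is ready. */
bool busy_loop_should_end(const struct sock_model *sk)
{
    assert(sk);
    if (sk->receive_queue_len > 0)
        return true;
    /* For UDP, packets may already sit in the reader queue. */
    return sk->is_udp && sk->reader_queue_len > 0;
}
```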
    
    v3: - make sk_busy_loop_end() nicer (Willem)
    
    v2: - add a READ_ONCE(sk->sk_family) in sk_is_inet() to avoid KCSAN splats.
        - add a sk_is_inet() check in sk_is_udp() (Willem feedback)
        - add a sk_is_inet() check in sk_is_tcp().
    
    Fixes: 2276f58ac589 ("udp: use a separate rx queue for packet reception")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING [+ + +]

Author: Lin Ma <linma@zju.edu.cn>
Date:   Thu Jan 18 21:03:06 2024 +0800

    vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING
    
    [ Upstream commit 6c21660fe221a15c789dee2bc2fd95516bc5aeaf ]
    
    In the vlan_changelink function, a loop is used to parse the nested
    attributes IFLA_VLAN_EGRESS_QOS and IFLA_VLAN_INGRESS_QOS in order to
    obtain the struct ifla_vlan_qos_mapping. These two nested attributes are
    checked in the vlan_validate_qos_map function, which calls
    nla_validate_nested_deprecated with the vlan_map_policy.
    
    However, this deprecated validator applies a LIBERAL strictness, allowing
    the presence of an attribute with the type IFLA_VLAN_QOS_UNSPEC.
    Consequently, the loop in vlan_changelink may parse an attribute of type
    IFLA_VLAN_QOS_UNSPEC and believe it carries a payload of
    struct ifla_vlan_qos_mapping, which is not necessarily true.
    
    To address this issue and ensure compatibility, this patch introduces two
    type checks that skip attributes whose type is not IFLA_VLAN_QOS_MAPPING.
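    
    The type check can be sketched as below. This is a simplified model in
    which attributes are plain structs rather than netlink TLVs. The names
    QOS_UNSPEC, QOS_MAPPING, struct attr and apply_mappings() are
    hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified attribute types. */
enum { QOS_UNSPEC = 0, QOS_MAPPING = 1 };

struct attr {
    int type;
    unsigned int from, to;   /* only meaningful when type == QOS_MAPPING */
};

/* Apply every mapping attribute to map[] and return how many were applied.
 * Attributes of any other type are skipped instead of being interpreted. */
size_t apply_mappings(const struct attr *attrs, size_t n,
                      unsigned int *map, size_t map_len)
{
    size_t applied = 0;

    assert(attrs && map);
    for (size_t i = 0; i < n; i++) {
        if (attrs[i].type != QOS_MAPPING)
            continue;               /* payload is not a mapping: skip it */
        if (attrs[i].from >= map_len)
            continue;               /* out of range for this model */
        map[attrs[i].from] = attrs[i].to;
        applied++;
    }
    return applied;
}
```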
    
    Fixes: 07b5b17e157b ("[VLAN]: Use rtnl_link API")
    Signed-off-by: Lin Ma <linma@zju.edu.cn>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240118130306.1644001-1-linma@zju.edu.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: fix a memory corruption [+ + +]

Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date:   Thu Jan 11 15:07:25 2024 +0200

    wifi: iwlwifi: fix a memory corruption
    
    commit cf4a0d840ecc72fcf16198d5e9c505ab7d5a5e4d upstream.
    
    iwl_fw_ini_trigger_tlv::data is a pointer to a __le32, which means that
    if we copy to iwl_fw_ini_trigger_tlv::data + offset while offset is in
    bytes, we'll write past the buffer.
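    
    To illustrate the pointer arithmetic involved, here is a small sketch
    with a hypothetical struct trigger_tlv. It is not the driver's
    structure or the actual patch. Adding a byte offset to a pointer to
    32-bit words scales the offset by four, whereas casting to a byte
    pointer first keeps the offset in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a TLV whose payload is declared as 32-bit words. */
struct trigger_tlv {
    uint32_t data[4];   /* 16 bytes of payload */
};

/* Distance in bytes that "data + offset" moves when a byte offset is
 * mistakenly used as an element count on a uint32_t pointer. */
size_t scaled_offset(size_t offset_bytes)
{
    return offset_bytes * sizeof(uint32_t);
}

/* Copy len bytes to a byte offset inside data.
 * Returns 0 on success, -1 if the copy would not fit. */
int copy_at_byte_offset(struct trigger_tlv *tlv, size_t offset_bytes,
                        const void *src, size_t len)
{
    assert(tlv && src);
    if (offset_bytes > sizeof(tlv->data) ||
        len > sizeof(tlv->data) - offset_bytes)
        return -1;
    memcpy((uint8_t *)tlv->data + offset_bytes, src, len);
    return 0;
}
```

    In this model, a byte offset of 8 used directly on the word pointer
    would land 32 bytes in, past the 16-byte payload.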
    
    Cc: stable@vger.kernel.org
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218233
    Fixes: cf29c5b66b9f ("iwlwifi: dbg_ini: implement time point handling")
    Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://msgid.link/20240111150610.2d2b8b870194.I14ed76505a5cf87304e0c9cc05cc0ae85ed3bf91@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: mac80211: fix potential sta-link leak [+ + +]

Author: Johannes Berg <johannes.berg@intel.com>
Date:   Thu Jan 11 18:17:44 2024 +0200

    wifi: mac80211: fix potential sta-link leak
    
    [ Upstream commit b01a74b3ca6fd51b62c67733ba7c3280fa6c5d26 ]
    
    When a station is allocated, links are added but not yet
    marked valid (e.g. during connection to an AP MLD). We
    might then remove the station without ever marking the
    links valid, and leak them. Fix that.
    
    Fixes: cb71f1d136a6 ("wifi: mac80211: add sta link addition/removal")
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Reviewed-by: Ilan Peer <ilan.peer@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://msgid.link/20240111181514.6573998beaf8.I09ac2e1d41c80f82a5a616b8bd1d9d8dd709a6a6@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/entry/ia32: Ensure s32 is sign extended to s64 [+ + +]

Author: Richard Palethorpe <rpalethorpe@suse.com>
Date:   Wed Jan 10 15:01:22 2024 +0200

    x86/entry/ia32: Ensure s32 is sign extended to s64
    
    commit 56062d60f117dccfb5281869e0ab61e090baf864 upstream.
    
    Presently ia32 registers stored in ptregs are unconditionally cast to
    unsigned int by the ia32 stub. They are then cast to long when passed to
    __se_sys*, but will not be sign extended.
    
    This takes the sign of the syscall argument into account in the ia32
    stub. It still casts to unsigned int to avoid implementation-specific
    behavior, but then casts to int or unsigned int as necessary, so that
    the following cast to long sign extends the value.
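    
    The effect of the casts described above can be shown with fixed-width
    types. This is an illustrative sketch, not the ia32 stub itself, and
    widen_unsigned() and widen_signed() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension: the low 32 bits are widened as an unsigned value. */
int64_t widen_unsigned(uint64_t reg)
{
    return (int64_t)(uint32_t)reg;
}

/* Sign extension: truncate to 32 bits, reinterpret as signed, then widen.
 * Converting an out-of-range value to a signed type is
 * implementation-defined in ISO C; GCC and Clang wrap modulo 2^32. */
int64_t widen_signed(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}
```

    A 32-bit register holding 0xffffffff therefore becomes 4294967295 in
    the first case and -1 in the second.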
    
    This fixes the io_pgetevents02 LTP test when compiled with -m32. Presently
    the system call io_pgetevents_time64() unexpectedly accepts -1 for the
    maximum number of events.
    
    It doesn't appear that other system calls with signed arguments are
    affected, because they all have compat variants defined and wired up.
    
    Fixes: ebeb8c82ffaf ("syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32")
    Suggested-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
    Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Arnd Bergmann <arnd@arndb.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240110130122.3836513-1-nik.borisov@suse.com
    Link: https://lore.kernel.org/ltp/20210921130127.24131-1-rpalethorpe@suse.com/
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:16:00 2024 +0100

    xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL
    
    [ Upstream commit fbadd83a612c3b7aad2987893faca6bd24aaebb3 ]
    
    The XSK ZC Rx path calculates the size of the data that will be posted to
    the XSK Rx queue by subtracting xdp_buff::data from xdp_buff::data_end.
    
    In bpf_xdp_frags_increase_tail(), when underlying memory type of
    xdp_rxq_info is MEM_TYPE_XSK_BUFF_POOL, add offset to data_end in tail
    fragment, so that later on user space will be able to take into account
    the amount of bytes added by XDP program.
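    
    A minimal sketch of why data_end has to move, using a hypothetical
    struct buf in place of xdp_buff. The length is derived from the two
    pointers, so growing the tail is only visible once data_end is
    advanced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the data/data_end pair of a buffer. */
struct buf {
    unsigned char *data;
    unsigned char *data_end;
};

/* Length as the Rx path would compute it: data_end - data. */
size_t buf_len(const struct buf *b)
{
    assert(b && b->data_end >= b->data);
    return (size_t)(b->data_end - b->data);
}

/* Growing the tail by n bytes means advancing data_end by n.
 * The caller must ensure the underlying storage has room. */
void grow_tail(struct buf *b, size_t n)
{
    assert(b);
    b->data_end += n;
}
```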
    
    Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-10-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

xfs: read only mounts with fsopen mount API are busted [+ + +]

Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Jan 16 15:33:07 2024 +1100

    xfs: read only mounts with fsopen mount API are busted
    
    commit d8d222e09dab84a17bb65dda4b94d01c565f5327 upstream.
    
    Recently xfs/513 started failing on my test machines testing "-o
    ro,norecovery" mount options. This was being emitted in dmesg:
    
    [ 9906.932724] XFS (pmem0): no-recovery mounts must be read-only.
    
    Turns out, readonly mounts with the fsopen()/fsconfig() mount API
    have been busted since day zero. It's only taken 5 years for debian
    unstable to start using this "new" mount API, and shortly after this
    I noticed xfs/513 had started to fail as per above.
    
    The syscall trace is:
    
    fsopen("xfs", FSOPEN_CLOEXEC)           = 3
    mount_setattr(-1, NULL, 0, NULL, 0)     = -1 EINVAL (Invalid argument)
    .....
    fsconfig(3, FSCONFIG_SET_STRING, "source", "/dev/pmem0", 0) = 0
    fsconfig(3, FSCONFIG_SET_FLAG, "ro", NULL, 0) = 0
    fsconfig(3, FSCONFIG_SET_FLAG, "norecovery", NULL, 0) = 0
    fsconfig(3, FSCONFIG_CMD_CREATE, NULL, NULL, 0) = -1 EINVAL (Invalid argument)
    close(3)                                = 0
    
    Showing that the actual mount instantiation (FSCONFIG_CMD_CREATE) is
    what threw out the error.
    
    During mount instantiation, we call xfs_fs_validate_params() which
    does:
    
            /* No recovery flag requires a read-only mount */
            if (xfs_has_norecovery(mp) && !xfs_is_readonly(mp)) {
                    xfs_warn(mp, "no-recovery mounts must be read-only.");
                    return -EINVAL;
            }
    
    and xfs_is_readonly() checks internal mount flags for read only
    state. This state is set in xfs_init_fs_context() from the
    context superblock flag state:
    
            /*
             * Copy binary VFS mount flags we are interested in.
             */
            if (fc->sb_flags & SB_RDONLY)
                    set_bit(XFS_OPSTATE_READONLY, &mp->m_opstate);
    
    With the old mount API, all of the VFS specific superblock flags
    had already been parsed and set before xfs_init_fs_context() is
    called, so this all works fine.
    
    However, in the brave new fsopen/fsconfig world,
    xfs_init_fs_context() is called from fsopen() context, before any
    VFS superblock flags have been set or parsed. Hence if we use
    fsopen(), the internal XFS readonly state is *never set*, and anything
    that depends on xfs_is_readonly() actually returning true for read
    only mounts is broken if fsopen() has been used to mount the filesystem.
    
    Fix this by moving this internal state initialisation to
    xfs_fs_fill_super() before we attempt to validate the parameters
    that have been set prior to the FSCONFIG_CMD_CREATE call being made.
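    
    The ordering problem can be modelled in a few lines. This is a
    simplified sketch with hypothetical names (SB_RDONLY_FLAG, struct
    mount_state, copy_vfs_flags(), validate_params()), not the XFS code.
    Copying the flag before "ro" has been configured leaves the internal
    state unset, while copying it at mount creation time picks it up.

```c
#include <assert.h>
#include <stdbool.h>

#define SB_RDONLY_FLAG 0x1u   /* hypothetical stand-in for SB_RDONLY */

struct fs_context {
    unsigned int sb_flags;
};

struct mount_state {
    bool readonly;
    bool norecovery;
};

/* Copy the VFS read-only flag into the filesystem's own state. */
void copy_vfs_flags(struct mount_state *mp, const struct fs_context *fc)
{
    assert(mp && fc);
    if (fc->sb_flags & SB_RDONLY_FLAG)
        mp->readonly = true;
}

/* Mirrors the quoted check: norecovery requires a read-only mount.
 * Returns 0 if the parameters are acceptable, -1 otherwise. */
int validate_params(const struct mount_state *mp)
{
    assert(mp);
    return (mp->norecovery && !mp->readonly) ? -1 : 0;
}
```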
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Fixes: 73e5fff98b64 ("xfs: switch to use the new mount-api")
    cc: stable@vger.kernel.org
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xsk: fix usage of multi-buffer BPF helpers for ZC XDP [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:15:54 2024 +0100

    xsk: fix usage of multi-buffer BPF helpers for ZC XDP
    
    [ Upstream commit c5114710c8ce86b8317e9b448f4fd15c711c2a82 ]
    
    Currently, when a packet is shrunk via bpf_xdp_adjust_tail() and the
    memory type is set to MEM_TYPE_XSK_BUFF_POOL, a NULL pointer
    dereference happens:
    
    [1136314.192256] BUG: kernel NULL pointer dereference, address: 0000000000000034
    [1136314.203943] #PF: supervisor read access in kernel mode
    [1136314.213768] #PF: error_code(0x0000) - not-present page
    [1136314.223550] PGD 0 P4D 0
    [1136314.230684] Oops: 0000 [#1] PREEMPT SMP NOPTI
    [1136314.239621] CPU: 8 PID: 54203 Comm: xdpsock Not tainted 6.6.0+ #257
    [1136314.250469] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
    [1136314.265615] RIP: 0010:__xdp_return+0x6c/0x210
    [1136314.274653] Code: ad 00 48 8b 47 08 49 89 f8 a8 01 0f 85 9b 01 00 00 0f 1f 44 00 00 f0 41 ff 48 34 75 32 4c 89 c7 e9 79 cd 80 ff 83 fe 03 75 17 <f6> 41 34 01 0f 85 02 01 00 00 48 89 cf e9 22 cc 1e 00 e9 3d d2 86
    [1136314.302907] RSP: 0018:ffffc900089f8db0 EFLAGS: 00010246
    [1136314.312967] RAX: ffffc9003168aed0 RBX: ffff8881c3300000 RCX: 0000000000000000
    [1136314.324953] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffc9003168c000
    [1136314.336929] RBP: 0000000000000ae0 R08: 0000000000000002 R09: 0000000000010000
    [1136314.348844] R10: ffffc9000e495000 R11: 0000000000000040 R12: 0000000000000001
    [1136314.360706] R13: 0000000000000524 R14: ffffc9003168aec0 R15: 0000000000000001
    [1136314.373298] FS:  00007f8df8bbcb80(0000) GS:ffff8897e0e00000(0000) knlGS:0000000000000000
    [1136314.386105] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [1136314.396532] CR2: 0000000000000034 CR3: 00000001aa912002 CR4: 00000000007706f0
    [1136314.408377] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [1136314.420173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [1136314.431890] PKRU: 55555554
    [1136314.439143] Call Trace:
    [1136314.446058]  <IRQ>
    [1136314.452465]  ? __die+0x20/0x70
    [1136314.459881]  ? page_fault_oops+0x15b/0x440
    [1136314.468305]  ? exc_page_fault+0x6a/0x150
    [1136314.476491]  ? asm_exc_page_fault+0x22/0x30
    [1136314.484927]  ? __xdp_return+0x6c/0x210
    [1136314.492863]  bpf_xdp_adjust_tail+0x155/0x1d0
    [1136314.501269]  bpf_prog_ccc47ae29d3b6570_xdp_sock_prog+0x15/0x60
    [1136314.511263]  ice_clean_rx_irq_zc+0x206/0xc60 [ice]
    [1136314.520222]  ? ice_xmit_zc+0x6e/0x150 [ice]
    [1136314.528506]  ice_napi_poll+0x467/0x670 [ice]
    [1136314.536858]  ? ttwu_do_activate.constprop.0+0x8f/0x1a0
    [1136314.546010]  __napi_poll+0x29/0x1b0
    [1136314.553462]  net_rx_action+0x133/0x270
    [1136314.561619]  __do_softirq+0xbe/0x28e
    [1136314.569303]  do_softirq+0x3f/0x60
    
    This comes from __xdp_return() call with xdp_buff argument passed as
    NULL which is supposed to be consumed by xsk_buff_free() call.
    
    To address this properly, in ZC case, a node that represents the frag
    being removed has to be pulled out of xskb_list. Introduce
    appropriate xsk helpers to do such node operation and use them
    accordingly within bpf_xdp_adjust_tail().
    
    Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
    Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> # For the xsk header part
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-4-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:15:53 2024 +0100

    xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
    
    [ Upstream commit f7f6aa8e24383fbb11ac55942e66da9660110f80 ]
    
    XDP multi-buffer support introduced XDP_FLAGS_HAS_FRAGS flag that is
    used by drivers to notify data path whether xdp_buff contains fragments
    or not. Data path looks up mentioned flag on first buffer that occupies
    the linear part of xdp_buff, so drivers only modify it there. This is
    sufficient for SKB and XDP_DRV modes as usually xdp_buff is allocated on
    stack or it resides within struct representing driver's queue and
    fragments are carried via skb_frag_t structs. IOW, we are dealing with
    only one xdp_buff.
    
    ZC mode though relies on list of xdp_buff structs that is carried via
    xsk_buff_pool::xskb_list, so ZC data path has to make sure that
    fragments do *not* have XDP_FLAGS_HAS_FRAGS set. Otherwise,
    xsk_buff_free() could misbehave if it would be executed against xdp_buff
    that carries a frag with XDP_FLAGS_HAS_FRAGS flag set. Such scenario can
    take place when within supplied XDP program bpf_xdp_adjust_tail() is
    used with negative offset that would in turn release the tail fragment
    from multi-buffer frame.
    
    Calling xsk_buff_free() on tail fragment with XDP_FLAGS_HAS_FRAGS would
    result in releasing all the nodes from xskb_list that were produced by
    driver before XDP program execution, which is not what is intended -
    only tail fragment should be deleted from xskb_list and then it should
    be put onto xsk_buff_pool::free_list. Such multi-buffer frame will never
    make it up to user space, so from AF_XDP application POV there would be
    no traffic running, however due to free_list getting constantly new
    nodes, driver will be able to feed HW Rx queue with recycled buffers.
    Bottom line is that instead of traffic being redirected to user space,
    it would be continuously dropped.
    
    To fix this, let us clear the mentioned flag on xsk_buff_pool side
    during xdp_buff initialization, which is what should have been done
    right from the start of XSK multi-buffer support.
    
    Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support")
    Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support")
    Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-3-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

xsk: recycle buffer in case Rx queue was full [+ + +]

Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Jan 24 20:15:52 2024 +0100

    xsk: recycle buffer in case Rx queue was full
    
    [ Upstream commit 269009893146c495f41e9572dd9319e787c2eba9 ]
    
    Add missing xsk_buff_free() call when __xsk_rcv_zc() failed to produce
    descriptor to XSK Rx queue.
    
    Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
    Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/r/20240124191602.566724-2-maciej.fijalkowski@intel.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Changelog for Linux 6.6.15