[Deepin-Kernel-SIG] [linux 6.12-y] [Upstream] Update kernel base to 6.12.68#1480
Open
opsiff wants to merge 170 commits intodeepin-community:linux-6.12.yfrom
Open
[Deepin-Kernel-SIG] [linux 6.12-y] [Upstream] Update kernel base to 6.12.68#1480opsiff wants to merge 170 commits intodeepin-community:linux-6.12.yfrom
opsiff wants to merge 170 commits intodeepin-community:linux-6.12.yfrom
Conversation
[ Upstream commit e859d37 ] File descriptor based pc_clock_*() operations of dynamic posix clocks have access to the file pointer and implement permission checks in the generic code before invoking the relevant dynamic clock callback. Character device operations (open, read, poll, ioctl) do not implement a generic permission control and the dynamic clock callbacks have no access to the file pointer to implement them. Extend struct posix_clock_context with a struct file pointer and initialize it in posix_clock_open(), so that all dynamic clock callbacks can access it. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 9fe52e15e1f1d9b0a65f077267478e5add8e0c39) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit b4e53b1 ] Many devices implement highly accurate clocks, which the kernel manages as PTP Hardware Clocks (PHCs). Userspace applications rely on these clocks to timestamp events, trace workload execution, correlate timescales across devices, and keep various clocks in sync. The kernel’s current implementation of PTP clocks does not enforce file permissions checks for most device operations except for POSIX clock operations, where file mode is verified in the POSIX layer before forwarding the call to the PTP subsystem. Consequently, it is common practice to not give unprivileged userspace applications any access to PTP clocks whatsoever by giving the PTP chardevs 600 permissions. An example of users running into this limitation is documented in [1]. Additionally, POSIX layer requires WRITE permission even for readonly adjtime() calls which are used in PTP layer to return current frequency offset applied to the PHC. Add permission checks for functions that modify the state of a PTP device. Continue enforcing permission checks for POSIX clock operations (settime, adjtime) in the POSIX layer. Only require WRITE access for dynamic clocks adjtime() if any flags are set in the modes field. [1] https://lists.nwtime.org/sympa/arc/linuxptp-users/2024-01/msg00036.html Changes in v4: - Require FMODE_WRITE in ajtime() only for calls modifying the clock in any way. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 97529630d85f910f4e6dfc615f1d111ada9448e3) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 3d07b69 ] With the inclusion of commit c259aca ("ptp/ioctl: support MONOTONIC{,_RAW} timestamps for PTP_SYS_OFFSET_EXTENDED") clock_gettime() now allows retrieval of pre/post timestamps for CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW timebases along with the previously supported CLOCK_REALTIME. This patch adds a command line option 'y' to the testptp program to choose one of the allowed timebases [realtime aka system, monotonic, and monotonic-raw). Signed-off-by: Mahesh Bandewar <maheshb@google.com> Cc: Shuah Khan <shuah@kernel.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://patch.msgid.link/20241003101506.769418-1-maheshb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 7686864 ("testptp: Add option to open PHC in readonly mode") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit fa361565a7275cc43c6ca1abec9ec4fcc9ec51f1) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 7686864 ] PTP Hardware Clocks no longer require WRITE permission to perform readonly operations, such as listing device capabilities or listening to EXTTS events once they have been enabled by a process with WRITE permissions. Add '-r' option to testptp to open the PHC in readonly mode instead of the default read-write mode. Skip enabling EXTTS if readonly mode is requested. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 2e6908dadc36fe78ca42ce615d1e4c770d946c19) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 134e9d0 ] Document the RPMh Power Domains on the SM8750 Platform. Signed-off-by: Taniya Das <quic_tdas@quicinc.com> Signed-off-by: Jishnu Prakash <quic_jprakash@quicinc.com> Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Message-ID: <20241112002444.2802092-2-quic_molvera@quicinc.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Stable-dep-of: 5bc3e720e725 ("pmdomain: qcom: rpmhpd: Add MXC to SC8280XP") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 2411e8d383e89dea15fceddb5d27e2ae1e46dfab) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 1c40229 ] Update the RPMH level definitions to include TURBO_L5 corner. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/661840/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Stable-dep-of: 5bc3e720e725 ("pmdomain: qcom: rpmhpd: Add MXC to SC8280XP") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 584b63dc1bb483f4bb8ee44dec3ef5ec848f78a1) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit dcb8d01 ] Historically both RPM and RPMh domain definitions were a part of the same, qcom-rpmpd.h header. Now as we have a separate header for RPMh definitions, qcom,rpmhpd.h, move all RPMh power domain definitions to that header. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20250718-rework-rpmhpd-rpmpd-v1-1-eedca108e540@oss.qualcomm.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Stable-dep-of: 5bc3e720e725 ("pmdomain: qcom: rpmhpd: Add MXC to SC8280XP") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 5d79846b7596cbb032d9840efe817e4aafc87378) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 45e1be5ddec98db71e7481fa7a3005673200d85c ] Not sure how useful it's gonna be in practice, but the definition is missing (unlike the previously-unused SC8280XP_MXC-non-_AO), so add it to allow the driver to create the corresponding pmdomain. Fixes: dbfb5f9 ("dt-bindings: power: rpmpd: Add sc8280xp RPMh power-domains") Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20251202-topic-8280_mxc-v2-1-46cdf47a829e@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Stable-dep-of: 5bc3e720e725 ("pmdomain: qcom: rpmhpd: Add MXC to SC8280XP") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 1ce854343236ffa53301e0e4125bf20b47beed52) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 5bc3e720e725cd5fa34875fa1e5434d565858067 ] This was apparently accounted for in dt-bindings, but never made its way into the driver. Fix it for SC8280XP and its VDD_GFX-less cousin, SA8540P. Fixes: f68f1cb ("soc: qcom: rpmhpd: add sc8280xp & sa8540p rpmh power-domains") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20251202-topic-8280_mxc-v2-2-46cdf47a829e@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 27f558bed5302a6ecedb14a91c87bc2181a34893) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 868b979c5328b867c95a6d5a93ba13ad0d3cd2f1 ] To make sure that power rail is voted for, wire it up to its consumers. Fixes: 152d1fa ("arm64: dts: qcom: add SC8280XP platform") Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20251202-topic-8280_mxc-v2-3-46cdf47a829e@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit f878f9d0af0ae48eab9cbd6e4be4dbd490f34729) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 49f49d47af67f8a7b221db1d758fc634242dc91a ] hv_kmsg_dump() currently skips the panic notification entirely if it doesn't get any message bytes to pass to Hyper-V due to an error from kmsg_dump_get_buffer(). Skipping the notification is undesirable because it leaves the Hyper-V host uncertain about the state of a panic'ed guest. Fix this by always doing the panic notification, even if bytes_written is zero. Also ensure that bytes_written is initialized, which fixes a kernel test robot warning. The warning is actually bogus because kmsg_dump_get_buffer() happens to set bytes_written even if it fails, and in the kernel test robot's CONFIG_PRINTK not set case, hv_kmsg_dump() is never called. But do the initialization for robustness and to quiet the static checker. Fixes: 9c318a1 ("Drivers: hv: move panic report code from vmbus to hv early init code") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/202512172103.OcUspn1Z-lkp@intel.com/ Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Roman Kisel <vdso@mailbox.org> Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 65130319ce0d15e48be834db0d6643634328873b) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 1d8f69f453c2e8a2d99b158e58e02ed65031fa6d ] When the BLOCK_GROUP_TREE compat_ro flag is set, the extent root and csum root fields are getting missed. This is because EXTENT_TREE_V2 treated these differently, and when they were split off this special-casing was mistakenly assigned to BGT rather than the rump EXTENT_TREE_V2. There's no reason why the existence of the block group tree should mean that we don't record the details of the last commit's extent root and csum root. Fix the code in backup_super_roots() so that the correct check gets made. Fixes: 1c56ab9 ("btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 7be5516fbf45b1d94d0e5bebca2e1c565f4ee3f1) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit ea4d4ea6d10a561043922d285f1765c7e4bfd32a ] An AHCI HBA specifies the number of ports it supports using CAP.NP. The HBA is free to only make a subset of the number of ports available using the PI (Ports Implemented) register. libata currently creates dummy ports for HBA ports that are provided by the HBA, but which are marked as "unavailable" using the PI register. Each port will have a per port area of registers in the HBA, regardless if the port is marked as "unavailable" or not. ahci_mark_external_port() currently reads this per port area of registers using readl() to see if the port is marked as external/hotplug-capable. However, AHCI 1.3.1, section "3.1.4 Offset 0Ch: PI – Ports Implemented" states: "Software must not read or write to registers within unavailable ports." Thus, make sure that we only call ahci_mark_external_port() and ahci_update_initial_lpm_policy() for ports that are implemented. From a libata perspective, this should not change anything related to LPM, as dummy ports do not provide any ap->ops (they do not have a .set_lpm() callback), so even if EH were to call .set_lpm() on a dummy port, it was already a no-op. Fixes: f713193 ("ata: ahci: move marking of external port earlier") Signed-off-by: Niklas Cassel <cassel@kernel.org> Tested-by: Wolf <wolf@yoxt.cc> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit f7e722acdb51178f7ae3248e0ab8438873f35da5) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
…bute [ Upstream commit ce83767ea323baf8509a75eb0c783cd203e14789 ] The link_power_management_supported sysfs attribute is currently set as true even for ata ports that lack a .set_lpm() callback, e.g. dummy ports. This is a bit silly, because while writing to the link_power_management_policy sysfs attribute will make ata_scsi_lpm_store() update ap->target_lpm_policy (thus sysfs will reflect the new value) and call ata_port_schedule_eh() for the port, it is essentially a no-op. This is because for a port without a .set_lpm() callback, once EH gets to run, the ata_eh_link_set_lpm() will simply return, since the port does not provide a .set_lpm() callback. Thus, make sure that the link_power_management_supported sysfs attribute is set to false for ports that lack a .set_lpm() callback. This way the link_power_management_policy sysfs attribute will no longer be writable, so we will no longer be misleading users to think that their sysfs write actually does something. Fixes: 0060bee ("ata: libata-sata: Add link_power_management_supported sysfs attribute") Signed-off-by: Niklas Cassel <cassel@kernel.org> Tested-by: Wolf <wolf@yoxt.cc> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 0dfe9069d1ce872e6fd547e2032c18354afe07fc) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit a6bee5e5243ad02cae575becc4c83df66fc29573 ] ata_dev_print_features() is supposed to return early and not print anything if there are no features supported. However, commit fe22e1c ("libata: support concurrent positioning ranges log") added another feature to ata_dev_print_features() without updating the early return conditional. Add the missing feature to the early return conditional. Fixes: fe22e1c ("libata: support concurrent positioning ranges log") Signed-off-by: Niklas Cassel <cassel@kernel.org> Tested-by: Wolf <wolf@yoxt.cc> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 97593e2724c1a463eaf8162ad0f3c6f7d626a30c) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit d360121 ] If the port of a device does not support Device Initiated Power Management (DIPM), that is, the port is flagged with ATA_FLAG_NO_DIPM, the DIPM feature of a device should not be used. Though DIPM is disabled by default on a device, the "Software Settings Preservation feature" may keep DIPM enabled or DIPM may have been enabled by the system firmware. Introduce the function ata_dev_config_lpm() to always disable DIPM on a device that supports this feature if the port of the device is flagged with ATA_FLAG_NO_DIPM. ata_dev_config_lpm() is called from ata_dev_configure(), ensuring that a device DIPM feature is disabled when it cannot be used. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20250701125321.69496-2-dlemoal@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Stable-dep-of: c8c6fb886f57 ("ata: libata: Print features also for ATAPI devices") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 7ed741e68fc37c671fe6a12707cfcac98e6facef) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 8f3fb33f8f3f825c708ece800c921977c157f9b6 ] Commit d360121 ("ata: libata-core: Introduce ata_dev_config_lpm()") introduced ata_dev_config_lpm(). However, it only called this function for ATA_DEV_ATA and ATA_DEV_ZAC devices, not for ATA_DEV_ATAPI devices. Additionally, commit d99a914 ("ata: libata-core: Move device LPM quirk settings to ata_dev_config_lpm()") moved the LPM quirk application from ata_dev_configure() to ata_dev_config_lpm(), causing LPM quirks for ATAPI devices to no longer be applied. Call ata_dev_config_lpm() also for ATAPI devices, such that LPM quirks are applied for ATAPI devices with an entry in __ata_dev_quirks once again. Fixes: d360121 ("ata: libata-core: Introduce ata_dev_config_lpm()") Fixes: d99a914 ("ata: libata-core: Move device LPM quirk settings to ata_dev_config_lpm()") Signed-off-by: Niklas Cassel <cassel@kernel.org> Tested-by: Wolf <wolf@yoxt.cc> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Stable-dep-of: c8c6fb886f57 ("ata: libata: Print features also for ATAPI devices") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit ff50853f848c2a2c70c74290930450d77847b462) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit c8c6fb886f57d5bf71fb6de6334a143608d35707 ] Commit d633b8a ("libata: print feature list on device scan") added a print of the features supported by the device for ATA_DEV_ATA and ATA_DEV_ZAC devices, but not for ATA_DEV_ATAPI devices. Fix this by printing the features also for ATAPI devices. Before changes: ata1.00: ATAPI: Slimtype DVD A DU8AESH, 6C2M, max UDMA/133 After changes: ata1.00: ATAPI: Slimtype DVD A DU8AESH, 6C2M, max UDMA/133 ata1.00: Features: Dev-Attention HIPM DIPM Fixes: d633b8a ("libata: print feature list on device scan") Signed-off-by: Niklas Cassel <cassel@kernel.org> Tested-by: Wolf <wolf@yoxt.cc> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 37080e96acd967e693c1b928dff1251f27dd94b3) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 8439016c3b8b5ab687c2420317b1691585106611 ] The u64_stats_sync structure is empty on 64-bit systems. However, on 32-bit systems it contains a seqcount_t which needs to be initialized. While the memory is zero-initialized, a lack of u64_stats_init means that lockdep won't get initialized properly. Fix this by adding u64_stats_init() calls to the rings just after allocation. Fixes: 2b245cb ("ice: Implement transmit and NAPI support") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 60bd8ff17b398c6421379bd00417583848dfe637) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit a9d45c22ed120cdd15ff56d0a6e4700c46451901 ] When the user issues an administrative down to an interface that is the primary for an aggregate bond, the prune lists are being purged. This breaks communication to the secondary interface, which shares a prune list on the main switch block while bonded together. For the primary interface of an aggregate, avoid deleting these prune lists during stop, and since they are hardcoded to specific values for the default vlan and QinQ vlans, the attempt to re-add them during the up phase will quietly fail without any additional problem. Fixes: 1e0f988 ("ice: Flesh out implementation of support for SRIOV on bonded interface") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 77407dc2d7eac8693deabd241086cb932f39bdac) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 01139a2ce532d77379e1593230127caa261a8036 ] The commit 5f6df17 ("ice: implement and use rd32_poll_timeout for ice_sq_done timeout") converted ICE_CTL_Q_SQ_CMD_TIMEOUT from jiffies to microseconds. But the ice_release_res() function was missed, and its logic still treats ICE_CTL_Q_SQ_CMD_TIMEOUT as a jiffies value. So correct the issue by usecs_to_jiffies(). Found by inspection of the DDP downloading process. Compile and modprobe tested only. Fixes: 5f6df17 ("ice: implement and use rd32_poll_timeout for ice_sq_done timeout") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit a5cb7456d59ef1628d3027d83ec8bca37625c6d6) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 41a9a6826f20a524242a6c984845c4855f629841 ]
The Multi-queue Priority (MQPRIO) and Earliest TxTime First (ETF) offloads
utilize the Time Sensitive Networking (TSN) Tx mode. This mode is always
coupled to IEEE 802.1Qbv time aware shaper (Qbv). Therefore, the driver
sets a default Qbv schedule of all gates opened and a cycle time of
1s. This schedule is set during probe.
However, the following sequence of events lead to Tx issues:
- Boot a dual core system
igc_probe():
igc_tsn_clear_schedule():
-> Default Schedule is set
Note: At this point the driver has allocated two Tx/Rx queues, because
there are only two CPUs.
- ethtool -L enp3s0 combined 4
igc_ethtool_set_channels():
igc_reinit_queues()
-> Default schedule is gone, per Tx ring start and end time are zero
- tc qdisc replace dev enp3s0 handle 100 parent root mqprio \
num_tc 4 map 3 3 2 2 0 1 1 1 3 3 3 3 3 3 3 3 \
queues 1@0 1@1 1@2 1@3 hw 1
igc_tsn_offload_apply():
igc_tsn_enable_offload():
-> Writes zeros to IGC_STQT(i) and IGC_ENDQT(i), causing Tx to stall/fail
Therefore, restore the default Qbv schedule after changing the number of
channels.
Furthermore, add a restriction to not allow queue reconfiguration when
TSN/Qbv is enabled, because it may lead to inconsistent states.
Fixes: c814a2d ("igc: Use default cycle 'start' and 'end' values for queues")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 8fdfa207afdd9274efa8143861622867ea8fec22)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 6990dc392a9ab10e52af37e0bee8c7b753756dc4 ] The current HW bug workaround checks the TXTT_0 ready bit first, then reads TXSTMPL_0 twice (before and after reading TXSTMPH_0) to detect whether a new timestamp was captured by timestamp register 0 during the workaround. This sequence has a race: if a new timestamp is captured after checking the TXTT_0 bit but before the first TXSTMPL_0 read, the detection fails because both the "old" and "new" values come from the same timestamp. Fix by reading TXSTMPL_0 first to establish a baseline, then checking the TXTT_0 bit. This ensures any timestamp captured during the race window will be detected. Old sequence: 1. Check TXTT_0 ready bit 2. Read TXSTMPL_0 (baseline) 3. Read TXSTMPH_0 (interrupt workaround) 4. Read TXSTMPL_0 (detect changes vs baseline) New sequence: 1. Read TXSTMPL_0 (baseline) 2. Check TXTT_0 ready bit 3. Read TXSTMPH_0 (interrupt workaround) 4. Read TXSTMPL_0 (detect changes vs baseline) Fixes: c789ad7 ("igc: Work around HW bug causing missing timestamps") Suggested-by: Avi Shalev <avi.shalev@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Co-developed-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Chwee-Lin Choong <chwee.lin.choong@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 71b9f40b5be5c5247a5b859a862c54d5019ea5b7) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 0386bd321d0f95d041a7b3d7b07643411b044a96 ] vsock/virtio common tries to coalesce buffers in rx queue: if a linear skb (with a spare tail room) is followed by a small skb (length limited by GOOD_COPY_LEN = 128), an attempt is made to join them. Since the introduction of MSG_ZEROCOPY support, assumption that a small skb will always be linear is incorrect. In the zerocopy case, data is lost and the linear skb is appended with uninitialized kernel memory. Of all 3 supported virtio-based transports, only loopback-transport is affected. G2H virtio-transport rx queue operates on explicitly linear skbs; see virtio_vsock_alloc_linear_skb() in virtio_vsock_rx_fill(). H2G vhost-transport may allocate non-linear skbs, but only for sizes that are not considered for coalescence; see PAGE_ALLOC_COSTLY_ORDER in virtio_vsock_alloc_skb(). Ensure only linear skbs are coalesced. Note that skb_tailroom(last_skb) > 0 guarantees last_skb is linear. Fixes: 581512a ("vsock/virtio: MSG_ZEROCOPY flag support") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20260113-vsock-recv-coalescence-v2-1-552b17837cf4@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 568e9cd8ed7ca9bf748c7687ba6501f29d30e59f) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 7d7dbafefbe74f5a25efc4807af093b857a7612e ] The SR9700 chip sends more than one packet in a USB transaction, like the DM962x chips can optionally do, but the dm9601 driver does not support this mode, and the hardware does not have the DM962x MODE_CTL register to disable it, so this driver drops packets on SR9700 devices. The sr9700 driver correctly handles receiving more than one packet per transaction. While the dm9601 driver could be improved to handle this, the easiest way to fix this issue in the short term is to remove the SR9700 device ID from the dm9601 driver so the sr9700 driver is always used. This device ID should not have been in more than one driver to begin with. The "Fixes" commit was chosen so that the patch is automatically included in all kernels that have the sr9700 driver, even though the issue affects dm9601. Fixes: c9b3745 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support") Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Acked-by: Peter Korsgaard <peter@korsgaard.com> Link: https://patch.msgid.link/20260113063924.74464-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 61173010fdfe3c2e44e6235918ed7747b404d261) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit c84fcb79e5dbde0b8d5aeeaf04282d2149aebcf6 ] BOND_MODE_8023AD makes sense for ARPHRD_ETHER only. syzbot reported: BUG: KASAN: global-out-of-bounds in __hw_addr_create net/core/dev_addr_lists.c:63 [inline] BUG: KASAN: global-out-of-bounds in __hw_addr_add_ex+0x25d/0x760 net/core/dev_addr_lists.c:118 Read of size 16 at addr ffffffff8bf94040 by task syz.1.3580/19497 CPU: 1 UID: 0 PID: 19497 Comm: syz.1.3580 Tainted: G L syzkaller #0 PREEMPT(full) Tainted: [L]=SOFTLOCKUP Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Call Trace: <TASK> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 check_region_inline mm/kasan/generic.c:-1 [inline] kasan_check_range+0x2b0/0x2c0 mm/kasan/generic.c:200 __asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105 __hw_addr_create net/core/dev_addr_lists.c:63 [inline] __hw_addr_add_ex+0x25d/0x760 net/core/dev_addr_lists.c:118 __dev_mc_add net/core/dev_addr_lists.c:868 [inline] dev_mc_add+0xa1/0x120 net/core/dev_addr_lists.c:886 bond_enslave+0x2b8b/0x3ac0 drivers/net/bonding/bond_main.c:2180 do_set_master+0x533/0x6d0 net/core/rtnetlink.c:2963 do_setlink+0xcf0/0x41c0 net/core/rtnetlink.c:3165 rtnl_changelink net/core/rtnetlink.c:3776 [inline] __rtnl_newlink net/core/rtnetlink.c:3935 [inline] rtnl_newlink+0x161c/0x1c90 net/core/rtnetlink.c:4072 rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6958 netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2550 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline] netlink_unicast+0x82f/0x9e0 net/netlink/af_netlink.c:1344 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0x21c/0x270 net/socket.c:742 ____sys_sendmsg+0x505/0x820 net/socket.c:2592 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2646 __sys_sendmsg+0x164/0x220 net/socket.c:2678 do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline] __do_fast_syscall_32+0x1dc/0x560 arch/x86/entry/syscall_32.c:307 do_fast_syscall_32+0x34/0x80 arch/x86/entry/syscall_32.c:332 entry_SYSENTER_compat_after_hwframe+0x84/0x8e </TASK> The buggy address belongs to the variable: lacpdu_mcast_addr+0x0/0x40 Fixes: 872254d ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER") Reported-by: syzbot+9c081b17773615f24672@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6966946b.a70a0220.245e30.0002.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20260113191201.3970737-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit ef68afb1bee8d35a18896c27d7358079353d8d8a) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 4d10edfd1475b69dbd4c47f34b61a3772ece83ca ]
syzbot reported memleak of struct l2tp_session, l2tp_tunnel,
sock, etc. [0]
The cited commit moved down the validation of the protocol
version in l2tp_udp_encap_recv().
The new place requires an extra error handling to avoid the
memleak.
Let's call l2tp_session_put() there.
[0]:
BUG: memory leak
unreferenced object 0xffff88810a290200 (size 512):
comm "syz.0.17", pid 6086, jiffies 4294944299
hex dump (first 32 bytes):
7d eb 04 0c 00 00 00 00 01 00 00 00 00 00 00 00 }...............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc babb6a4f):
kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline]
slab_post_alloc_hook mm/slub.c:4958 [inline]
slab_alloc_node mm/slub.c:5263 [inline]
__do_kmalloc_node mm/slub.c:5656 [inline]
__kmalloc_noprof+0x3e0/0x660 mm/slub.c:5669
kmalloc_noprof include/linux/slab.h:961 [inline]
kzalloc_noprof include/linux/slab.h:1094 [inline]
l2tp_session_create+0x3a/0x3b0 net/l2tp/l2tp_core.c:1778
pppol2tp_connect+0x48b/0x920 net/l2tp/l2tp_ppp.c:755
__sys_connect_file+0x7a/0xb0 net/socket.c:2089
__sys_connect+0xde/0x110 net/socket.c:2108
__do_sys_connect net/socket.c:2114 [inline]
__se_sys_connect net/socket.c:2111 [inline]
__x64_sys_connect+0x1c/0x30 net/socket.c:2111
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xa4/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 3647980 ("l2tp: Support different protocol versions with same IP/port quadruple")
Reported-by: syzbot+2c42ea4485b29beb0643@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/696693f2.a70a0220.245e30.0001.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20260113185446.2533333-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 5cd158a88eef34e7b100cd9b963873d3b4e41b35)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 4f5f148dd7c0459229d2ab9a769b2e820f9ee6a2 ] Currently, the test breaks if the SUT already has a default route configured for IPv6. Fix by avoiding the use of the default namespace. Fixes: 4ed591c ("net/ipv6: Allow onlink routes to have a device mismatch if it is the default route") Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20260113-selftests-net-fib-onlink-v2-1-89de2b931389@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit f7d28b8b6784f26e5b9b9b9ea63841ea38f898a3) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
…it_urb() error
[ Upstream commit 79a6d1bfe1148bc921b8d7f3371a7fbce44e30f7 ]
In commit 7352e1d5932a ("can: gs_usb: gs_usb_receive_bulk_callback(): fix
URB memory leak"), the URB was re-anchored before usb_submit_urb() in
gs_usb_receive_bulk_callback() to prevent a leak of this URB during
cleanup.
However, this patch did not take into account that usb_submit_urb() could
fail. The URB remains anchored and
usb_kill_anchored_urbs(&parent->rx_submitted) in gs_can_close() loops
infinitely since the anchor list never becomes empty.
To fix the bug, unanchor the URB when an usb_submit_urb() error occurs,
also print an info message.
Fixes: 7352e1d5932a ("can: gs_usb: gs_usb_receive_bulk_callback(): fix URB memory leak")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/all/20260110223836.3890248-1-kuba@kernel.org/
Link: https://patch.msgid.link/20260116-can_usb-fix-reanchor-v1-1-9d74e7289225@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit c610b550ccc0438d456dfe1df9f4f36254ccaae3)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit a80c9d945aef55b23b54838334345f20251dad83 ] A null-ptr-deref was reported in the SCTP transmit path when SCTP-AUTH key initialization fails: ================================================================== KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] CPU: 0 PID: 16 Comm: ksoftirqd/0 Tainted: G W 6.6.0 #2 RIP: 0010:sctp_packet_bundle_auth net/sctp/output.c:264 [inline] RIP: 0010:sctp_packet_append_chunk+0xb36/0x1260 net/sctp/output.c:401 Call Trace: sctp_packet_transmit_chunk+0x31/0x250 net/sctp/output.c:189 sctp_outq_flush_data+0xa29/0x26d0 net/sctp/outqueue.c:1111 sctp_outq_flush+0xc80/0x1240 net/sctp/outqueue.c:1217 sctp_cmd_interpreter.isra.0+0x19a5/0x62c0 net/sctp/sm_sideeffect.c:1787 sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline] sctp_do_sm+0x1a3/0x670 net/sctp/sm_sideeffect.c:1169 sctp_assoc_bh_rcv+0x33e/0x640 net/sctp/associola.c:1052 sctp_inq_push+0x1dd/0x280 net/sctp/inqueue.c:88 sctp_rcv+0x11ae/0x3100 net/sctp/input.c:243 sctp6_rcv+0x3d/0x60 net/sctp/ipv6.c:1127 The issue is triggered when sctp_auth_asoc_init_active_key() fails in sctp_sf_do_5_1C_ack() while processing an INIT_ACK. In this case, the command sequence is currently: - SCTP_CMD_PEER_INIT - SCTP_CMD_TIMER_STOP (T1_INIT) - SCTP_CMD_TIMER_START (T1_COOKIE) - SCTP_CMD_NEW_STATE (COOKIE_ECHOED) - SCTP_CMD_ASSOC_SHKEY - SCTP_CMD_GEN_COOKIE_ECHO If SCTP_CMD_ASSOC_SHKEY fails, asoc->shkey remains NULL, while asoc->peer.auth_capable and asoc->peer.peer_chunks have already been set by SCTP_CMD_PEER_INIT. This allows a DATA chunk with auth = 1 and shkey = NULL to be queued by sctp_datamsg_from_user(). Since command interpretation stops on failure, no COOKIE_ECHO should been sent via SCTP_CMD_GEN_COOKIE_ECHO. However, the T1_COOKIE timer has already been started, and it may enqueue a COOKIE_ECHO into the outqueue later. As a result, the DATA chunk can be transmitted together with the COOKIE_ECHO in sctp_outq_flush_data(), leading to the observed issue. Similar to the other places where it calls sctp_auth_asoc_init_active_key() right after sctp_process_init(), this patch moves the SCTP_CMD_ASSOC_SHKEY immediately after SCTP_CMD_PEER_INIT, before stopping T1_INIT and starting T1_COOKIE. This ensures that if shared key generation fails, authenticated DATA cannot be sent. It also allows the T1_INIT timer to retransmit INIT, giving the client another chance to process INIT_ACK and retry key setup. Fixes: 730fc3d ("[SCTP]: Implete SCTP-AUTH parameter processing") Reported-by: Zhen Chen <chenzhen126@huawei.com> Tested-by: Zhen Chen <chenzhen126@huawei.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/44881224b375aa8853f5e19b4055a1a56d895813.1768324226.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit bf2b543b3cc4ebb4ab5bca4f8dfa5612035d45b8) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
commit b7880cb166ab62c2409046b2347261abf701530e upstream.
Syzbot has found a deadlock (analyzed by Lance Yang):
1) Task (5749): Holds folio_lock, then tries to acquire i_mmap_rwsem(read lock).
2) Task (5754): Holds i_mmap_rwsem(write lock), then tries to acquire
folio_lock.
migrate_pages()
-> migrate_hugetlbs()
-> unmap_and_move_huge_page() <- Takes folio_lock!
-> remove_migration_ptes()
-> __rmap_walk_file()
-> i_mmap_lock_read() <- Waits for i_mmap_rwsem(read lock)!
hugetlbfs_fallocate()
-> hugetlbfs_punch_hole() <- Takes i_mmap_rwsem(write lock)!
-> hugetlbfs_zero_partial_page()
-> filemap_lock_hugetlb_folio()
-> filemap_lock_folio()
-> __filemap_get_folio <- Waits for folio_lock!
The migration path is the one taking locks in the wrong order according to
the documentation at the top of mm/rmap.c. So expand the scope of the
existing i_mmap_lock to cover the calls to remove_migration_ptes() too.
This is (mostly) how it used to be after commit c0d0381. That was
removed by 336bf30 for both file & anon hugetlb pages when it should
only have been removed for anon hugetlb pages.
Link: https://lkml.kernel.org/r/20260109041345.3863089-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Fixes: 336bf30 ("hugetlbfs: fix anon huge page migration race")
Reported-by: syzbot+2d9c96466c978346b55f@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/68e9715a.050a0220.1186a4.000d.GAE@google.com
Debugged-by: Lance Yang <lance.yang@linux.dev>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Acked-by: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Jann Horn <jannh@google.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b75070823b89009f5123fd0e05a8e0c3d39937c1)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
…TDMA commit 566beb3 upstream. The SoC DMA resources for UDMA, BCDMA and PKTDMA can be described via a combination of up to two resource ranges. The first resource range handles the default partitioning wherein all resources belonging to that range are allocated to a single entity and form a continuous range. For use-cases where the resources are shared across multiple entities and require to be described via discontinuous ranges, a second resource range is required. Currently, udma_setup_resources() supports handling resources that belong to the second range. Extend bcdma_setup_resources() and pktdma_setup_resources() to support the same. Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Link: https://lore.kernel.org/r/20250205121805.316792-1-s-vadapalli@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Tested-by: Sai Sree Kartheek Adivi <s-adivi@ti.com> Signed-off-by: Sai Sree Kartheek Adivi <s-adivi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 4d733efd584b54a07eac2efab05450ca1a611dd1) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
commit 5a4391bdc6c8357242f62f22069c865b792406b3 upstream.
Fix similar memory leak as in commit 7352e1d5932a ("can: gs_usb:
gs_usb_receive_bulk_callback(): fix URB memory leak").
In esd_usb_open(), the URBs for USB-in transfers are allocated, added to
the dev->rx_submitted anchor and submitted. In the complete callback
esd_usb_read_bulk_callback(), the URBs are processed and resubmitted. In
esd_usb_close() the URBs are freed by calling
usb_kill_anchored_urbs(&dev->rx_submitted).
However, this does not take into account that the USB framework unanchors
the URB before the complete function is called. This means that once an
in-URB has been completed, it is no longer anchored and is ultimately not
released in esd_usb_close().
Fix the memory leak by anchoring the URB in the
esd_usb_read_bulk_callback() to the dev->rx_submitted anchor.
Fixes: 96d8e90 ("can: Add driver for esd CAN-USB/2 device")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260116-can_usb-fix-memory-leak-v2-2-4b8cb2915571@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9d1807b442fc3286b204f8e59981b10e743533ce)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
commit e6c209d upstream. Recently perf_link test started unreliably failing on libbpf CI: * https://github.com/libbpf/libbpf/actions/runs/11260672407/job/31312405473 * https://github.com/libbpf/libbpf/actions/runs/11260992334/job/31315514626 * https://github.com/libbpf/libbpf/actions/runs/11263162459/job/31320458251 Part of the test is running a dummy loop for a while and then checking for a counter incremented by the test program. Instead of waiting for an arbitrary number of loop iterations once, check for the test counter in a loop and use get_time_ns() helper to enforce a 100ms timeout. v1: https://lore.kernel.org/bpf/zuRd072x9tumn2iN4wDNs5av0nu5nekMNV4PkR-YwCT10eFFTrUtZBRkLWFbrcCe7guvLStGQlhibo8qWojCO7i2-NGajes5GYIyynexD-w=@pm.me/ Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241011153104.249800-1-ihor.solodrai@pm.me Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit eca8bb92f9948a8cbf19a99750c8b1764b0f994f) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
commit 04a899573fb87273a656f178b5f920c505f68875 upstream. Yinhao et al. reported that their fuzzer tool was able to trigger a skb_warn_bad_offload() from netif_skb_features() -> gso_features_check(). When a BPF program - triggered via BPF test infra - pushes the packet to the loopback device via bpf_clone_redirect() then mentioned offload warning can be seen. GSO-related features are then rightfully disabled. We get into this situation due to convert___skb_to_skb() setting gso_segs and gso_size but not gso_type. Technically, it makes sense that this warning triggers since the GSO properties are malformed due to the gso_type. Potentially, the gso_type could be marked non-trustworthy through setting it at least to SKB_GSO_DODGY without any other specific assumptions, but that also feels wrong given we should not go further into the GSO engine in the first place. The checks were added in 121d57a ("gso: validate gso_type in GSO handlers") because there were malicious (syzbot) senders that combine a protocol with a non-matching gso_type. If we would want to drop such packets, gso_features_check() currently only returns feature flags via netif_skb_features(), so one location for potentially dropping such skbs could be validate_xmit_unreadable_skb(), but then otoh it would be an additional check in the fast-path for a very corner case. Given bpf_clone_redirect() is the only place where BPF test infra could emit such packets, lets reject them right there. Fixes: 850a88c ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN") Fixes: cf62089 ("bpf: Add gso_size to __sk_buff") Reported-by: Yinhao Hu <dddddd@hust.edu.cn> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn> Reported-by: Dongliang Mu <dzm91@hust.edu.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251020075441.127980-1-daniel@iogearbox.net Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 0f3a60869ca22024dfb9c6fce412b0c70cb4ea36) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit ce652c98a7bfa0b7c675ef5cd85c44c186db96af ] This is already the default in rk3399-base.dtsi, remove redundant declaration from rk3399-nanopi-r4s.dtsi. Fixes: db792e9 ("rockchip: rk3399: Add support for FriendlyARM NanoPi R4S") Cc: stable@vger.kernel.org Reported-by: Dragan Simic <dsimic@manjaro.org> Reviewed-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Geraldo Nascimento <geraldogabriel@gmail.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Link: https://patch.msgid.link/6694456a735844177c897581f785cc00c064c7d1.1763415706.git.geraldogabriel@gmail.com Signed-off-by: Heiko Stuebner <heiko@sntech.de> [ adapted file path from rk3399-nanopi-r4s.dtsi to rk3399-nanopi-r4s.dts ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 5c9737bba560f0f0f14ec950848e0ee2ce74a144) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit f5d203467a31798191365efeb16cd619d2c8f23a ] Add missing mutex_destroy() call in iio_dev_release() to properly clean up the mutex initialized in iio_device_alloc(). Ensure proper resource cleanup and follows kernel practices. Found by code review. While at it, create a lockdep key before mutex initialisation. This will help with converting it to the better API in the future. Fixes: 847ec80 ("Staging: IIO: core support for device registration and management") Fixes: ac917a8 ("staging:iio:core set the iio_dev.info pointer to null on unregister under lock.") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Stable-dep-of: 9910159f0659 ("iio: core: add separate lockdep class for info_exist_lock") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 5210af7ca689658a200a68396166c7d98d887baf) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit c76ba4b2644424b8dbacee80bb40991eac29d39e ]
Replace lockdep_set_class() + mutex_init() by combined call
mutex_init_with_key().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Stable-dep-of: 9910159f0659 ("iio: core: add separate lockdep class for info_exist_lock")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1e8e13cc7d8bffffc7924b40436264e2a5692314)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 9910159f06590c17df4fbddedaabb4c0201cc4cb ]
When one iio device is a consumer of another, it is possible that
the ->info_exist_lock of both ends up being taken when reading the
value of the consumer device.
Since they currently belong to the same lockdep class (being
initialized in a single location with mutex_init()), that results in a
lockdep warning
CPU0
----
lock(&iio_dev_opaque->info_exist_lock);
lock(&iio_dev_opaque->info_exist_lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by sensors/414:
#0: c31fd6dc (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0x44/0x4e4
#1: c4f5a1c4 (&of->mutex){+.+.}-{3:3}, at: kernfs_seq_start+0x1c/0xac
#2: c2827548 (kn->active#34){.+.+}-{0:0}, at: kernfs_seq_start+0x30/0xac
#3: c1dd2b6 (&iio_dev_opaque->info_exist_lock){+.+.}-{3:3}, at: iio_read_channel_processed_scale+0x24/0xd8
stack backtrace:
CPU: 0 UID: 0 PID: 414 Comm: sensors Not tainted 6.17.11 deepin-community#5 NONE
Hardware name: Generic AM33XX (Flattened Device Tree)
Call trace:
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x44/0x60
dump_stack_lvl from print_deadlock_bug+0x2b8/0x334
print_deadlock_bug from __lock_acquire+0x13a4/0x2ab0
__lock_acquire from lock_acquire+0xd0/0x2c0
lock_acquire from __mutex_lock+0xa0/0xe8c
__mutex_lock from mutex_lock_nested+0x1c/0x24
mutex_lock_nested from iio_read_channel_raw+0x20/0x6c
iio_read_channel_raw from rescale_read_raw+0x128/0x1c4
rescale_read_raw from iio_channel_read+0xe4/0xf4
iio_channel_read from iio_read_channel_processed_scale+0x6c/0xd8
iio_read_channel_processed_scale from iio_hwmon_read_val+0x68/0xbc
iio_hwmon_read_val from dev_attr_show+0x18/0x48
dev_attr_show from sysfs_kf_seq_show+0x80/0x110
sysfs_kf_seq_show from seq_read_iter+0xdc/0x4e4
seq_read_iter from vfs_read+0x238/0x2e4
vfs_read from ksys_read+0x6c/0xec
ksys_read from ret_fast_syscall+0x0/0x1c
Just as the mlock_key already has its own lockdep class, add a
lock_class_key for the info_exist mutex.
Note that this has in theory been a problem since before IIO first
left staging, but it only occurs when a chain of consumers is in use
and that is not often done.
Fixes: ac917a8 ("staging:iio:core set the iio_dev.info pointer to null on unregister under lock.")
Signed-off-by: Rasmus Villemoes <ravi@prevas.dk>
Reviewed-by: Peter Rosin <peda@axentia.se>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 629be44a1f51ade3b587a2c0f4d8200cb40780d3)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit ea6b4feba85e996e840e0b661bc42793df6eb701 ] Since commit c6e126d ("of: Keep track of populated platform devices") child devices will not be created by of_platform_populate() if the devices had previously been deregistered individually so that the OF_POPULATED flag is still set in the corresponding OF nodes. Switch to using of_platform_depopulate() instead of open coding so that the child devices are created if the driver is rebound. Fixes: c6e126d ("of: Keep track of populated platform devices") Cc: stable@vger.kernel.org # 3.16 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [ Adjust context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 437a8711116e5095e95d8db9bc0e6c32b7773a69) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 9aee8de970f18c2aaaa348e3de86c38e2d956c1d ] Fix refcount leaks in `exfat_find` related to `exfat_get_dentry_set`. Function `exfat_get_dentry_set` would increase the reference counter of `es->bh` on success. Therefore, `exfat_put_dentry_set` must be called after `exfat_get_dentry_set` to ensure refcount consistency. This patch relocate two checks to avoid possible leaks. Fixes: 82ebecd ("exfat: fix improper check of dentry.stream.valid_size") Fixes: 13940ce ("exfat: add a check for invalid data size") Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Li hongliang <1468888505@139.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit fc9ce762525e73438d31b613f18bca92a4d3d578) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit a257e97 ] For PREEMPT_RT=y kernels, the deferred_irq_workfn() is executed in the per-cpu irq_work/* task context and not disable-irq, if the rq returned by container_of() is current CPU's rq, the following scenarios may occur: lock(&rq->__lock); <Interrupt> lock(&rq->__lock); This commit use IRQ_WORK_INIT_HARD() to replace init_irq_work() to initialize rq->scx.deferred_irq_work, make the deferred_irq_workfn() is always invoked in hard-irq context. Signed-off-by: Zqiang <qiang.zhang@linux.dev> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Chen Yu <xnguchen@sina.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 541959b2fadb832a7d0ceb95041dc52bdcf6bff7) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit a8a3ca23bbd9d849308a7921a049330dc6c91398 ] KMSAN reports: Multiple uninitialized values detected: - KMSAN: uninit-value in ntfs_read_hdr (3) - KMSAN: uninit-value in bcmp (3) Memory is allocated by __getname(), which is a wrapper for kmem_cache_alloc(). This memory is used before being properly cleared. Change kmem_cache_alloc() to kmem_cache_zalloc() to properly allocate and clear memory before use. Fixes: 82cae26 ("fs/ntfs3: Add initialization of super block") Fixes: 78ab59f ("fs/ntfs3: Rework file operations") Tested-by: syzbot+332bd4e9d148f11a87dc@syzkaller.appspotmail.com Reported-by: syzbot+332bd4e9d148f11a87dc@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=332bd4e9d148f11a87dc Fixes: 82cae26 ("fs/ntfs3: Add initialization of super block") Fixes: 78ab59f ("fs/ntfs3: Rework file operations") Tested-by: syzbot+0399100e525dd9696764@syzkaller.appspotmail.com Reported-by: syzbot+0399100e525dd9696764@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0399100e525dd9696764 Reviewed-by: Khalid Aziz <khalid@kernel.org> Signed-off-by: Bartlomiej Kubik <kubik.bartlomiej@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Li hongliang <1468888505@139.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit f7728057220cabd720e27e46097edad48e5bd728) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 00812636df370bedf4e44a0c81b86ea96bca8628 ] Fix 'Memory manager not clean during takedown' warning that occurs when ivpu_gem_bo_free() removes the BO from the BOs list before it gets unmapped. Then file_priv_unbind() triggers a warning in drm_mm_takedown() during context teardown. Protect the unmapping sequence with bo_list_lock to ensure the BO is always fully unmapped when removed from the list. This ensures the BO is either fully unmapped at context teardown time or present on the list and unmapped by file_priv_unbind(). Fixes: 48aea7f ("accel/ivpu: Fix locking in ivpu_bo_remove_all_bos_from_context()") Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20251029071451.184243-1-karol.wachowski@linux.intel.com [ The context change is due to the commit e0c0891cd63b ("accel/ivpu: Rework bind/unbind of imported buffers") and the proper adoption is done. ] Signed-off-by: Rahul Sharma <black.hawk@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 0328bb097bef05a796217c54b3d651cc3782827c) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 38e818718c5e04961eea0fa8feff3f100ce40408 ]
>From the memory-barriers.txt document regarding memory barrier ordering
guarantees:
(*) These guarantees do not apply to bitfields, because compilers often
generate code to modify these using non-atomic read-modify-write
sequences. Do not attempt to use bitfields to synchronize parallel
algorithms.
(*) Even in cases where bitfields are protected by locks, all fields
in a given bitfield must be protected by one lock. If two fields
in a given bitfield are protected by different locks, the compiler's
non-atomic read-modify-write sequences can cause an update to one
field to corrupt the value of an adjacent field.
btrfs_space_info has a bitfield sharing an underlying word consisting of
the fields full, chunk_alloc, and flush:
struct btrfs_space_info {
struct btrfs_fs_info * fs_info; /* 0 8 */
struct btrfs_space_info * parent; /* 8 8 */
...
int clamp; /* 172 4 */
unsigned int full:1; /* 176: 0 4 */
unsigned int chunk_alloc:1; /* 176: 1 4 */
unsigned int flush:1; /* 176: 2 4 */
...
Therefore, to be safe from parallel read-modify-writes losing a write to
one of the bitfield members protected by a lock, all writes to all the
bitfields must use the lock. They almost universally do, except for
btrfs_clear_space_info_full() which iterates over the space_infos and
writes out found->full = 0 without a lock.
Imagine that we have one thread completing a transaction in which we
finished deleting a block_group and are thus calling
btrfs_clear_space_info_full() while simultaneously the data reclaim
ticket infrastructure is running do_async_reclaim_data_space():
T1 T2
btrfs_commit_transaction
btrfs_clear_space_info_full
data_sinfo->full = 0
READ: full:0, chunk_alloc:0, flush:1
do_async_reclaim_data_space(data_sinfo)
spin_lock(&space_info->lock);
if(list_empty(tickets))
space_info->flush = 0;
READ: full: 0, chunk_alloc:0, flush:1
MOD/WRITE: full: 0, chunk_alloc:0, flush:0
spin_unlock(&space_info->lock);
return;
MOD/WRITE: full:0, chunk_alloc:0, flush:1
and now data_sinfo->flush is 1 but the reclaim worker has exited. This
breaks the invariant that flush is 0 iff there is no work queued or
running. Once this invariant is violated, future allocations that go
into __reserve_bytes() will add tickets to space_info->tickets but will
see space_info->flush is set to 1 and not queue the work. After this,
they will block forever on the resulting ticket, as it is now impossible
to kick the worker again.
I also confirmed by looking at the assembly of the affected kernel that
it is doing RMW operations. For example, to set the flush (3rd) bit to 0,
the assembly is:
andb $0xfb,0x60(%rbx)
and similarly for setting the full (1st) bit to 0:
andb $0xfe,-0x20(%rax)
So I think this is really a bug on practical systems. I have observed
a number of systems in this exact state, but am currently unable to
reproduce it.
Rather than leaving this footgun lying around for the future, take
advantage of the fact that there is room in the struct anyway, and that
it is already quite large and simply change the three bitfield members to
bools. This avoids writes to space_info->full having any effect on
writes to space_info->flush, regardless of locking.
Fixes: 957780e ("Btrfs: introduce ticketed enospc infrastructure")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[ The context change is due to the commit cc0517f
("btrfs: tweak extent/chunk allocation for space_info sub-space")
in v6.16 which is irrelevant to the logic of this patch. ]
Signed-off-by: Rahul Sharma <black.hawk@163.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit db4ae18e1b31e0421fb5312e56aefa382bbc6ece)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[ Upstream commit 16c6c35 ] While processing the monitor destination ring, MSDUs are reaped from the link descriptor based on the corresponding buf_id. However, sometimes the driver cannot obtain a valid buffer corresponding to the buf_id received from the hardware. This causes an infinite loop in the destination processing, resulting in a kernel crash. kernel log: ath11k_pci 0000:58:00.0: data msdu_pop: invalid buf_id 309 ath11k_pci 0000:58:00.0: data dp_rx_monitor_link_desc_return failed ath11k_pci 0000:58:00.0: data msdu_pop: invalid buf_id 309 ath11k_pci 0000:58:00.0: data dp_rx_monitor_link_desc_return failed Fix this by skipping the problematic buf_id and reaping the next entry, replacing the break with the next MSDU processing. Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30 Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1 Fixes: d5c6515 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Signed-off-by: P Praneesh <quic_ppranees@quicinc.com> Signed-off-by: Kang Yang <quic_kangyang@quicinc.com> Acked-by: Kalle Valo <kvalo@kernel.org> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://patch.msgid.link/20241219110531.2096-2-quic_kangyang@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Signed-off-by: Li hongliang <1468888505@139.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 9f1a002f0171d27f3554e529f3c70df438f05dfe) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[Upstream commit 87dbae5] virtio_vsock_skb_rx_put() only calls skb_put() if the length in the packet header is not zero even though skb_put() handles this case gracefully. Remove the functionally redundant check from virtio_vsock_skb_rx_put() and, on the assumption that this is a worthwhile optimisation for handling credit messages, augment the existing length checks in virtio_transport_rx_work() to elide the call for zero-length payloads. Since the callers all have the length, extend virtio_vsock_skb_rx_put() to take it as an additional parameter rather than fish it back out of the packet header. Note that the vhost code already has similar logic in vhost_vsock_alloc_skb(). Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20250717090116.11987-4-will@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit ca82ab9fd920a3aeadc182383e3f6d83068de774) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[Upstream commit 2304c64] In preparation for nonlinear allocations for large SKBs, rename virtio_vsock_alloc_skb() to virtio_vsock_alloc_linear_skb() to indicate that it returns linear SKBs unconditionally and switch all callers over to this new interface for now. No functional change. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20250717090116.11987-6-will@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 74ea6184df7fa6497c0edc13ba6949a673132605) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[Upstream commit fac6b82] virtio_vsock_alloc_linear_skb() checks that the requested size is at least big enough for the packet header (VIRTIO_VSOCK_SKB_HEADROOM). Of the three callers of virtio_vsock_alloc_linear_skb(), only vhost_vsock_alloc_skb() can potentially pass a packet smaller than the header size and, as it already has a check against the maximum packet size, extend its bounds checking to consider the minimum packet size and remove the check from virtio_vsock_alloc_linear_skb(). Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20250717090116.11987-7-will@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit bdea2c39fa06cd30c68aaf48dfe399090b5c549e) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[Upstream commit 8ca7615] In preparation for using virtio_vsock_skb_rx_put() when populating SKBs on the vsock TX path, rename virtio_vsock_skb_rx_put() to virtio_vsock_skb_put(). No functional change. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20250717090116.11987-9-will@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 2d651c3c035e7531858313518bf3a38c8c34d9b0) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[Upstream commit ab9aa2f] When receiving a packet from a guest, vhost_vsock_handle_tx_kick() calls vhost_vsock_alloc_linear_skb() to allocate and fill an SKB with the receive data. Unfortunately, these are always linear allocations and can therefore result in significant pressure on kmalloc() considering that the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB allocation for each packet. Rework the vsock SKB allocation so that, for sizes with page order greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated instead with the packet header in the SKB and the receive data in the fragments. Finally, add a debug warning if virtio_vsock_skb_rx_put() is ever called on an SKB with a non-zero length, as this would be destructive for the nonlinear case. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20250717090116.11987-8-will@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 65e808a6023f883f298db734444a9c38e45af740) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
…fers [Upstream commit 6693731] When transmitting a vsock packet, virtio_transport_send_pkt_info() calls virtio_transport_alloc_linear_skb() to allocate and fill SKBs with the transmit data. Unfortunately, these are always linear allocations and can therefore result in significant pressure on kmalloc() considering that the maximum packet size (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM) is a little over 64KiB, resulting in a 128KiB allocation for each packet. Rework the vsock SKB allocation so that, for sizes with page order greater than PAGE_ALLOC_COSTLY_ORDER, a nonlinear SKB is allocated instead with the packet header in the SKB and the transmit data in the fragments. Note that this affects both the vhost and virtio transports. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20250717090116.11987-10-will@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 69c5bf306115e6047e0d902f8c6551fff74a71ff) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[Upstream commit b08a784] In a similar manner to copy_from_iter()/copy_from_iter_full(), introduce skb_copy_datagram_from_iter_full() which reverts the iterator to its initial state when returning an error. A subsequent fix for a vsock regression will make use of this new function. Cc: Christian Brauner <brauner@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://patch.msgid.link/20250818180355.29275-2-will@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit a3fc25e1d78419f3c774623a25cc6785a83c2dcb) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
[Upstream commit 7fb1291] Commit 6693731 ("vsock/virtio: Allocate nonlinear SKBs for handling large transmit buffers") converted the virtio vsock transmit path to utilise nonlinear SKBs when handling large buffers. As part of this change, virtio_transport_fill_skb() was updated to call skb_copy_datagram_from_iter() instead of memcpy_from_msg() as the latter expects a single destination buffer and cannot handle nonlinear SKBs correctly. Unfortunately, during this conversion, I overlooked the error case when the copying function returns -EFAULT due to a fault on the input buffer in userspace. In this case, memcpy_from_msg() reverts the iterator to its initial state thanks to copy_from_iter_full() whereas skb_copy_datagram_from_iter() leaves the iterator partially advanced. This results in a WARN_ONCE() from the vsock code, which expects the iterator to stay in sync with the number of bytes transmitted so that virtio_transport_send_pkt_info() can return -EFAULT when it is called again: ------------[ cut here ]------------ 'send_pkt()' returns 0, but 65536 expected WARNING: CPU: 0 PID: 5503 at net/vmw_vsock/virtio_transport_common.c:428 virtio_transport_send_pkt_info+0xd11/0xf00 net/vmw_vsock/virtio_transport_common.c:426 Modules linked in: CPU: 0 UID: 0 PID: 5503 Comm: syz.0.17 Not tainted 6.16.0-syzkaller-12063-g37816488247d #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call virtio_transport_fill_skb_full() to restore the previous iterator behaviour. Cc: Jason Wang <jasowang@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Fixes: 6693731 ("vsock/virtio: Allocate nonlinear SKBs for handling large transmit buffers") Reported-by: syzbot+b4d960daf7a3c7c2b7b1@syzkaller.appspotmail.com Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://patch.msgid.link/20250818180355.29275-3-will@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> [halves: adjust __zerocopy_sg_from_iter() parameters] Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit e6cee5d4a122ca9d6b7bb8fa09e498c60b9f208c) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Link: https://lore.kernel.org/r/20260128145334.006287341@linuxfoundation.org Tested-by: Brett A C Sheffield <bacs@librecast.net> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Slade Watkins <sr@sladewatkins.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Brett Mastbergen <bmastbergen@ciq.com> Tested-by: Hardik Garg <hargar@linux.microsoft.com> Tested-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 90ecf7d9daa2d0167e20b2e9f6f52e01f47b8147) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
There was a problem hiding this comment.
Sorry @opsiff, your pull request is larger than the review limit of 150000 diff characters
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Update kernel base to 6.12.68.
git log --oneline v6.12.67..v6.12.68 |wc
170 1467 12143