[26.04_linux-nvidia-bos] BPMP ACPI + SoC Hub MBWT sysfs for T410/NVL72#421
Conversation
PR Validation ReportPatchscan ✅ No Missing FixesAll cherry-picked commits checked — no missing upstream fixes found. PR Lint ❌ Errors foundDetailsChecking 3 commits... Cherry-pick digest: ┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐ │ Local │ Referenced upstream / Patch subject │ Patch-ID │ Subject │ SoB chain │ ├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤ │ 6cfd7c422b5b │ [SAUCE] tegra: bpmp: add sysfs for memory bandwidth qos │ N/A │ N/A │ anrao, kobak │ ├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤ │ 84f4a187fc6c │ [SAUCE] tegra: bpmp: add acpi support │ N/A │ N/A │ anrao, kobak │ ├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤ │ 92f1cd122c98 │ [SAUCE] tegra: bpmp: move channel, resource init to helper │ N/A │ N/A │ anrao, kobak │ └──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘ Lint: all checks passed. PR metadata: E: PR targets 26.04_linux-nvidia-bos but body has no https://bugs.launchpad.net/... link |
|
hey koba can you check this compile warnings |
Boro reviewLatest watcher review: open review Head: This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review. |
|
Some findings from review with Codex:
As a side note, didn't we address some of these same issues last time in the version we took into the v6.17 kernel? We need to understand why Aniruddha is not carrying these fixes forward. |
Refactor the BPMP driver by moving channel initialization and Device Tree resource parsing into separate helper functions. This prepares the driver for ACPI support, where these helpers will be skipped because channel initialization is handled by ACPI AML methods on ACPI-based systems. Signed-off-by: Aniruddha Rao <anrao@nvidia.com> (backported from V4 internal mail <20260423140823.2848045-2-anrao@nvidia.com>) [kobak: Preserve threaded channel count/semaphore initialization after the helper split and align rx_channel allocation continuation.] Signed-off-by: Koba Ko <kobak@nvidia.com>
This patch adds required changes in the Tegra BPMP driver to make it compatible with ACPI based platforms. On ACPI systems, IPC is handled through the AML method instead of the core kernel framework using Mailboxes and IVC. Bypass clock, reset and powergate init calls as these are not controlled by the Linux drivers on ACPI based systems. Signed-off-by: Aniruddha Rao <anrao@nvidia.com> (backported from V4 internal mail <20260423140823.2848045-3-anrao@nvidia.com>) [kobak: Add !ACPI_HANDLE(bpmp->dev) guard around bpmp->soc->ops->init because ACPI match driver_data=0 makes bpmp->soc NULL; make BPMP debugfs directory per-device on ACPI systems to avoid duplicate /sys/kernel/debug/bpmp collision on dual NVDA3001 instances; remove unused i and heap-allocate the ACPI BMRQ package to avoid the frame-size warning; guard ACPI deinit/resume paths when bpmp->soc is NULL; reject short BMRQ replies before copying response data.] Signed-off-by: Koba Ko <kobak@nvidia.com>
Tegra410 exposes memory bandwidth QoS for PCIe and GPU UPHY traffic on the path to DRAM. Each bandwidth group can cap PCIe read, PCIe write, or combined GPU UPHY read and write traffic, with target limits. The memory bandwidth QoS is not exposed as ordinary host MMIO and cannot be controlled from the kernel. The bandwidth limits can be programmed by sending the corresponding requests (MBWT MRQ) to the BPMP. On Tegra410, an ACPI-based platform, Linux BPMP driver does not use the device-tree mailbox path for communicating with the BPMP firmware. As a result, there is no existing client driver or interface that can be used to send the memory bandwidth requests to the BPMP. This patch exposes a sysfs directory mbwt_control on the tegra-bpmp platform device with pcie_instance_id, vc_type, and bandwidth. Writing bandwidth issues an MBWT_SET for the selected group (pcie_instance_id) and traffic class (vc_type). A read issues MBWT_GET and returns the bandwidth value reported by firmware. These attributes are exposed only if MBWT QUERY probe reports both MBWT_SET and MBWT_GET commands as supported. ABI documented in Documentation/ABI/testing/sys-platform-tegra-bpmp Signed-off-by: Aniruddha Rao <anrao@nvidia.com> (backported from V4 internal mail <20260423140823.2848045-4-anrao@nvidia.com>) [kobak: Keep functional MRQ_SOCHUB_MBWT ABI definitions and sysfs interface from V4; condense verbose per-field ABI comments while preserving enum/struct layout and Documentation/ABI coverage; use refcounted kobject allocation for mbwt_control; validate pcie_instance_id/vc_type before staging; return only the bandwidth value from bandwidth reads; report BPMP SET rejections to userspace.] Signed-off-by: Koba Ko <kobak@nvidia.com>
758f258 to
6cfd7c4
Compare
|
@clsotog thanks, fixed. @nirmoy thanks I found there's a issue of your review mechanism. it reviews one commit by one commit. not whole changes. @nvmochs thanks, fixed. |
clsotog
left a comment
There was a problem hiding this comment.
Thanks for the changes.
Acked-by: Carol L Soto <csoto@nvidia.com>
|
@kobak2026 Thanks for addressing the other items. Two more findings when reviewing with Codex...
TEGRA_BPMP still supports non-ACPI Tegra builds: drivers/firmware/tegra/Kconfig:14 So CONFIG_TEGRA_BPMP=y with CONFIG_ACPI=n is still a valid DT-only configuration. However, drivers/firmware/tegra/bpmp.c:313 now always compiles tegra_bpmp_transfer_acpi(), and that helper calls ACPI-only routines: Most other ACPI-looking uses in this series are okay with CONFIG_ACPI=n because they only use ACPI_HANDLE(), which has a stub that evaluates false. bpmp-tegra-sysfs.c and the debugfs naming change should therefore become inert in non-ACPI Suggested fix: compile the real helper only when ACPI is enabled and provide a stub otherwise: #if IS_ENABLED(CONFIG_ACPI) Optionally, the ACPI match table can also be guarded for cleanliness, but the required fix is guarding or stubbing tegra_bpmp_transfer_acpi().
After the split, the guard is only in the DT/mailbox path: drivers/firmware/tegra/bpmp.c:456 static int tegra_bpmp_transfer_channel(...) But the exported wrapper routes ACPI devices around that check: drivers/firmware/tegra/bpmp.c:496 if (ACPI_HANDLE(bpmp->dev)) The ACPI path can definitely sleep: acpi_evaluate_object() may execute AML and allocate memory, and the new code also does kzalloc(..., GFP_KERNEL). So an IRQ-disabled caller would now hit sleepable code instead of getting the old -EPERM. Suggested fix: put the guard back in the public wrapper, before choosing transport: if (WARN_ON(irqs_disabled())) Then the rule applies consistently to both DT and ACPI. The atomic API already rejects ACPI with -EOPNOTSUPP, so this preserves the existing contract cleanly. |
Sorry one question at line drivers/firmware/tegra/bpmp.c:951 do we need a goto or let it probe even if sysfs_register failed? |
Summary
Backport Aniruddha's V4 internal-review 3-patch BPMP ACPI + SoC Hub MBWT sysfs series to the 7.0-HWE / 26.04 BOS kernel branch. This is the V4 internal-mail design, not PR352's older export/unexport sysfs layout.
This is based on Aniruddha's V4 internal-review series; V5 is under review, so material upstream deltas will be handled as a follow-up refresh if needed.
V4 vs PR352
/sys/bus/platform/devices/NVDA3001:NN/mbwt_control/{pcie_instance_id,vc_type,bandwidth}.bandwidthreads back a 3-tuple:<pcie_instance_id> <vc_type> <bandwidth>.mbwt_controlis guarded by BPMPQUERY_ABIsupport for both GET_BW and SET_BW.bpmp-tegra410.cis introduced; the BPMP ACPI path is integrated into the existing driver structure./sys/class/bpmp_mbwt/.../{export,unexport,mbwtN}model is intentionally not carried forward.Commits
8a4662e7423f2b1c4250d773c6e25a4b136418b7-NVIDIA: VR: SAUCE: tegra: bpmp: Move channel, resource init to helperfd9097f38b08fe890012e3e97120935dc174047f-NVIDIA: VR: SAUCE: tegra: bpmp: Add ACPI support758f258afbfa39891b95de08226702d7bcb89391-NVIDIA: VR: SAUCE: tegra: bpmp: Add sysfs for memory bandwidth QoSValidation
LTS runtime validation on
comp016was completed on the paired LTS branch:Booted
7.0.0-sudeep-test-dgx16100-v4.mbwt_controlpresent under bothNVDA3001:00andNVDA3001:01platform devices.V4 sysfs unit test:
PASS=16 FAIL=0 SKIP=1.fio throttle sweep on
/dev/nvme0n1, socket00:100 GB/sbaseline:READ: bw=13.3GiB/s (14.3GB/s).50 GB/s:READ: bw=13.8GiB/s (14.8GB/s).10 GB/s:READ: bw=9344MiB/s (9798MB/s).1 GB/s:READ: bw=972MiB/s (1019MB/s).100 GB/srecovery:READ: bw=13.8GiB/s (14.8GB/s).These values match the PR352 throttle behavior: high limits do not cap read bandwidth,
10 GB/scaps near 10 GB/s,1 GB/scaps near 1 GB/s, and recovery returns to uncapped bandwidth.make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- O=/tmp/build-dgx16100-bos olddefconfigmake ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- O=/tmp/build-dgx16100-bos -j$(nproc) drivers/firmware/tegra/Result: exit 0.
Warning observed:
security/apparmor/Kconfig:120:warning: multi-line strings not supportedduringolddefconfig.The previous known
tegra_bpmp_transfer_acpiframe-size warning did not appear in the final targeted build output.Debugfs note: the first LTS runtime run observed the known duplicate BPMP debugfs warning (
debugfs: 'bpmp' already exists in '/',tegra-bpmp NVDA3001:01: debugfs initialization failed: -12). The current commits include the per-device debugfs fix, build-verified.