VR: update LFA support on linux-nvidia-6.18 #427
Open
nirmoy wants to merge 16 commits into
…improve SMC retry pacing" This reverts commit 1137d9f. Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
…tialization race" This reverts commit 1e028a6. Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
This reverts commit 7227499. Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
This reverts commit e2b0a6e. Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
…ware Activation (LFA)" This reverts commit 43325a3. Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
…ivation (LFA)

BugLink: https://bugs.launchpad.net/bugs/2150652

The Arm Live Firmware Activation (LFA) is a specification [1] to describe activating firmware components without a reboot. Those components (like TF-A's BL31, EDK-II, TF-RMM, secure payloads) would be updated the usual way: via fwupd, FF-A or other secure storage methods, or via some IMPDEF out-of-band method. The user can then activate this new firmware, at system runtime, without requiring a reboot. The specification covers the SMCCC interface to list and query available components and eventually trigger the activation.

Add a new directory under /sys/firmware to present firmware components capable of live activation. Each of them is a directory under lfa/, and is identified via its GUID. The activation will be triggered by echoing "1" into the "activate" file:
==========================================
/sys/firmware/lfa # ls -l . 6c*
.:
total 0
drwxr-xr-x 2 0 0 0 Jan 19 11:33 47d4086d-4cfe-9846-9b95-2950cbbd5a00
drwxr-xr-x 2 0 0 0 Jan 19 11:33 6c0762a6-12f2-4b56-92cb-ba8f633606d9
drwxr-xr-x 2 0 0 0 Jan 19 11:33 d6d0eea7-fcea-d54b-9782-9934f234b6e4
6c0762a6-12f2-4b56-92cb-ba8f633606d9:
total 0
--w------- 1 0 0 4096 Jan 19 11:33 activate
-r--r--r-- 1 0 0 4096 Jan 19 11:33 activation_capable
-r--r--r-- 1 0 0 4096 Jan 19 11:33 activation_pending
--w------- 1 0 0 4096 Jan 19 11:33 cancel
-r--r--r-- 1 0 0 4096 Jan 19 11:33 cpu_rendezvous
-r--r--r-- 1 0 0 4096 Jan 19 11:33 current_version
-rw-r--r-- 1 0 0 4096 Jan 19 11:33 force_cpu_rendezvous
-r--r--r-- 1 0 0 4096 Jan 19 11:33 may_reset_cpu
-r--r--r-- 1 0 0 4096 Jan 19 11:33 name
-r--r--r-- 1 0 0 4096 Jan 19 11:33 pending_version
/sys/firmware/lfa/6c0762a6-12f2-4b56-92cb-ba8f633606d9 # grep . *
grep: activate: Permission denied
activation_capable:1
activation_pending:1
grep: cancel: Permission denied
cpu_rendezvous:1
current_version:0.0
force_cpu_rendezvous:1
may_reset_cpu:0
name:TF-RMM
pending_version:0.0
/sys/firmware/lfa/6c0762a6-12f2-4b56-92cb-ba8f633606d9 # echo 1 > activate
[ 2825.797871] Arm LFA: firmware activation succeeded.
/sys/firmware/lfa/6c0762a6-12f2-4b56-92cb-ba8f633606d9 #
==========================================
[1] https://developer.arm.com/documentation/den0147/latest/

Signed-off-by: Salman Nabi <salman.nabi@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
(backported from https://lore.kernel.org/all/20260317103336.1273582-1-andre.przywara@arm.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit 08bef19)
BugLink: https://bugs.launchpad.net/bugs/2150652

After an image activation, the list of firmware images might change, so we have to re-iterate them through the SMC interface. Move the corresponding code from the activate_fw_image() function into update_fw_images_tree(), where it could be reused more easily, for instance when triggered by an interrupt.

Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
[Andre: split off from another patch, rebased]
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
(backported from https://lore.kernel.org/all/20260317103336.1273582-1-andre.przywara@arm.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit cedf5ce)
…hdog

BugLink: https://bugs.launchpad.net/bugs/2150652

Enhance PRIME/ACTIVATION functions to touch watchdog and implement timeout mechanism. This update ensures that any potential hangs are detected promptly and that the LFA process is allocated sufficient execution time before the watchdog timer expires. These changes improve overall system reliability by reducing the risk of undetected process stalls and unexpected watchdog resets.

Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
(backported from https://lore.kernel.org/all/20260317103336.1273582-1-andre.przywara@arm.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit 17cfbbd)
BugLink: https://bugs.launchpad.net/bugs/2150652

The Arm LFA spec describes an ACPI notification mechanism, where the platform (firmware) can notify an LFA client about newly available firmware image updates ("pending images" in LFA terms).

Add a faux device after discovering the existence of an LFA agent via the SMCCC discovery mechanism, and use that device to check for the ACPI notification description. Register this when one is provided.

The notification just conveys the fact that at least one firmware image now has a pending update; it doesn't say which one, and more than one could be pending. Loop through all images to find every one that needs to be activated, and trigger the activation. We need to do this in a loop, since an activation might change the number and the status of available images.

Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
[Andre: convert from platform driver to faux device]
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
(backported from https://lore.kernel.org/all/20260317103336.1273582-1-andre.przywara@arm.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit 195ce64)
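The "rescan after every activation" loop described in this commit can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the driver's code: `demo_inventory`, `demo_find_pending()`, `demo_activate()` and the rotation used to model the inventory being renumbered are all hypothetical, and the round limit is an arbitrary safety bound.

```c
#define DEMO_MAX_IMAGES 8
#define DEMO_MAX_ROUNDS 32 /* hypothetical bound so the loop always terminates */

/* Hypothetical in-memory stand-in for the firmware image inventory. */
struct demo_inventory {
    int pending[DEMO_MAX_IMAGES]; /* 1 if the image at this index has a pending update */
    int count;                    /* number of valid entries in pending[] */
    int activations;              /* how many activations were triggered */
};

/* Return the index of the first pending image, or -1 if none is pending. */
static inline int demo_find_pending(const struct demo_inventory *inv)
{
    for (int i = 0; i < inv->count; i++)
        if (inv->pending[i])
            return i;
    return -1;
}

/*
 * "Activate" one image, then rotate the list left by one position.
 * The rotation is an arbitrary way to model indices changing after
 * an activation; real firmware may reorder the list differently.
 */
static inline void demo_activate(struct demo_inventory *inv, int idx)
{
    inv->pending[idx] = 0;
    inv->activations++;
    if (inv->count > 1) {
        int first = inv->pending[0];
        for (int i = 0; i + 1 < inv->count; i++)
            inv->pending[i] = inv->pending[i + 1];
        inv->pending[inv->count - 1] = first;
    }
}

/*
 * Rescan from the start after every activation instead of walking a
 * stale index. Returns 0 once nothing is pending, -1 if the round
 * limit was hit with work still outstanding.
 */
static inline int demo_activate_all_pending(struct demo_inventory *inv)
{
    for (int round = 0; round < DEMO_MAX_ROUNDS; round++) {
        int idx = demo_find_pending(inv);
        if (idx < 0)
            return 0;
        demo_activate(inv, idx);
    }
    return demo_find_pending(inv) < 0 ? 0 : -1;
}
```

Restarting the search each round is what keeps the sketch correct when entries move: an index remembered from before the activation may point at a different image afterwards.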
BugLink: https://bugs.launchpad.net/bugs/2150652

The Arm LFA spec places control over the actual activation process in the hands of the non-secure host OS. A platform-initiated interrupt or notification signals the availability of an updateable firmware image, but does not necessarily need to trigger it automatically.

Add a sysfs control file that guards such automatic activation. If an administrator wants to allow automatic platform-initiated updates, they can activate that by echoing a "1" into the auto_activate file in the respective sysfs directory. Any incoming notification would then result in the activation being triggered.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
(backported from https://lore.kernel.org/all/20260317103336.1273582-1-andre.przywara@arm.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit bc3733f)
BugLink: https://bugs.launchpad.net/bugs/2150652

The Arm Live Firmware Activation spec describes an asynchronous notification mechanism, where the platform can notify the host OS about newly pending image updates. In the absence of the ACPI notification mechanism, a simple devicetree node can also describe an interrupt.

Add code to find the respective DT node and register the specified interrupt, to trigger the activation if needed.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
(backported from https://lore.kernel.org/all/20260317103336.1273582-1-andre.przywara@arm.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit 07cbad7)
BugLink: https://bugs.launchpad.net/bugs/2150652

After a successful live activation, the list of firmware images might change, which also affects the sequence IDs. We store the sequence ID in a data structure and connect it to its GUID, which is the identifier used to access certain image properties from userland. When an activation is happening, the sequence ID associations might change at any point, so we must be sure to not use any previously learned sequence ID during this time.

Protect the association between a sequence ID and a firmware image (its GUID, really) by a reader/writer lock. In this case it's a R/W semaphore, so it can sleep and we can hold it for longer. Concurrent SMC calls are not blocked on each other; only an activation blocks other calls.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
(backported from https://lore.kernel.org/all/20260317103336.1273582-1-andre.przywara@arm.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit 6724bd7)
… ACTIVATE

BugLink: https://bugs.launchpad.net/bugs/2150652

DEN0147 §2.6: LFA_ACTIVATE can return LFA_BUSY when the firmware postpones the activation. Although the rwsem in this driver prevents concurrent ACTIVATE calls from kernel space, an external agent or internal firmware state may still produce LFA_BUSY. Add an explicit retry loop (same budget and delay as CALL_AGAIN) so the code does not silently treat a retriable condition as a terminal failure. Catching LFA_BUSY explicitly also surfaces potential firmware or driver bugs.

DEN0147 §2.5: LFA_PRIME returning LFA_BUSY means another CPU is running LFA_PRIME concurrently. This driver never issues parallel PRIME, so this is unexpected; log pr_warn and return so the caller can surface the anomaly rather than swallowing it in the generic error path.

Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit 63cbc12)
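A bounded retry loop of the kind this commit describes might look roughly like the sketch below. Everything here is assumed for illustration: the `DEMO_*` status values are not the numeric codes from DEN0147, the budget of 5 is arbitrary, the firmware call is a stub, and the delay between attempts is left as a comment.

```c
/* Illustrative status codes; NOT the numeric values defined by DEN0147. */
enum demo_status {
    DEMO_OK = 0,
    DEMO_BUSY,
    DEMO_CALL_AGAIN,
    DEMO_FAILED,
};

#define DEMO_RETRY_BUDGET 5 /* hypothetical budget shared by BUSY and CALL_AGAIN */

typedef enum demo_status (*demo_call_fn)(void *ctx);

/* Stub firmware: reports BUSY a configurable number of times, then succeeds. */
struct demo_fw {
    unsigned int busy_left;
};

static inline enum demo_status demo_fw_activate(void *ctx)
{
    struct demo_fw *fw = ctx;

    if (fw->busy_left > 0) {
        fw->busy_left--;
        return DEMO_BUSY;
    }
    return DEMO_OK;
}

/*
 * Call the firmware up to DEMO_RETRY_BUDGET times. BUSY and CALL_AGAIN
 * are treated as retriable; any other status ends the loop at once.
 * If the budget runs out, the last retriable status is returned so the
 * caller can tell "gave up while busy" from a hard failure.
 */
static inline enum demo_status demo_activate_with_retry(demo_call_fn call, void *ctx,
                                                        unsigned int *attempts)
{
    enum demo_status st = DEMO_FAILED;
    unsigned int n = 0;

    while (n < DEMO_RETRY_BUDGET) {
        st = call(ctx);
        n++;
        if (st != DEMO_BUSY && st != DEMO_CALL_AGAIN)
            break;
        /* A real driver would sleep or delay here before the next attempt. */
    }
    if (attempts)
        *attempts = n;
    return st;
}
```

Returning the last status unchanged after the budget is exhausted keeps the retriable condition visible to the caller rather than folding it into a generic error.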
…pdates

BugLink: https://bugs.launchpad.net/bugs/2150652

Firmware image directories are plain kobjects under /sys/firmware. udev coldplug does not enumerate them as devices, so rules matching the per-image LFA kobjects do not run reliably at boot.

LFA already creates the arm-lfa faux device. Emit KOBJ_CHANGE from that device after the firmware image tree is refreshed, so user space can use the existing driver-core device as the notification anchor for runtime inventory updates.

The same udev rule then also covers coldplug via the device add event, e.g.:

ACTION=="add|change", SUBSYSTEM=="faux", KERNEL=="arm-lfa", \
    RUN+="/usr/local/sbin/lfa-auto-activate"

Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Brad Figg <bfigg@nvidia.com>
(cherry picked from commit 74a636e)
Add missing parentheses around macro parameter 'x' in TEGRA_SMCCC_* macros to prevent operator precedence issues if invoked with expressions.

Fixes: 579bc50 ("NVIDIA: VR: SAUCE: soc/tegra: misc: Use SMCCC to get chipid")
Signed-off-by: Saurav Sachidanand <sauravsc@amazon.com>
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
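The precedence problem this commit guards against can be reproduced with a small standalone example. The macros below are hypothetical stand-ins, since the actual TEGRA_SMCCC_* definitions are not shown on this page; they only demonstrate why an unparenthesized parameter misbehaves when the argument is an expression.

```c
#include <stdint.h>

/* Hypothetical macro with the bug: 'x' is pasted into the expression unprotected. */
#define DEMO_WORDS_TO_BYTES_UNSAFE(x) (x * 4u)

/* Same macro with the parameter parenthesized. */
#define DEMO_WORDS_TO_BYTES_SAFE(x)   ((x) * 4u)

/* Expands to (a + b * 4u): only 'b' is multiplied. */
static inline uint32_t demo_bytes_unsafe(uint32_t a, uint32_t b)
{
    return DEMO_WORDS_TO_BYTES_UNSAFE(a + b);
}

/* Expands to ((a + b) * 4u): the whole argument is multiplied. */
static inline uint32_t demo_bytes_safe(uint32_t a, uint32_t b)
{
    return DEMO_WORDS_TO_BYTES_SAFE(a + b);
}
```

Both variants give the same result for a plain identifier or literal argument, which is why this class of bug tends to stay hidden until someone passes an expression.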
…anup

Unknown firmware image UUIDs should still produce a useful name in sysfs. Leave the stored image name empty for unknown UUIDs and use the kobject name as the fallback when reporting the image name.

Also unwind resources allocated during init if a later step fails. Destroy the workqueue when kset creation fails, destroy the faux device when inventory initialization fails after the faux device was created, and guard the exit path against a missing faux device.

Suggested-by: Saurav Sachidanand <sauravsc@amazon.com>
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
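The init unwinding described here follows the usual goto-based cleanup idiom, which can be sketched in standalone C. This is a simulation under assumptions, not the driver's init function: resources are modelled as booleans, the `demo_*` names are hypothetical, and the step order (workqueue, kset, faux device, inventory) is inferred from the commit text rather than confirmed by the code.

```c
#include <stdbool.h>

/* Which init step should fail in this simulation (hypothetical). */
enum demo_step {
    DEMO_STEP_NONE,
    DEMO_STEP_WQ,
    DEMO_STEP_KSET,
    DEMO_STEP_FAUX,
    DEMO_STEP_INVENTORY,
};

/* Each flag stands in for one allocated resource. */
struct demo_state {
    bool wq;
    bool kset;
    bool faux_dev;
};

/* Acquire resources in order; on failure, release everything acquired so far. */
static inline int demo_init(struct demo_state *s, enum demo_step fail_at)
{
    if (fail_at == DEMO_STEP_WQ)
        return -1;
    s->wq = true;

    if (fail_at == DEMO_STEP_KSET)
        goto err_destroy_wq;
    s->kset = true;

    if (fail_at == DEMO_STEP_FAUX)
        goto err_remove_kset;
    s->faux_dev = true;

    if (fail_at == DEMO_STEP_INVENTORY)
        goto err_destroy_faux;
    return 0;

    /* Labels run in reverse acquisition order and fall through. */
err_destroy_faux:
    s->faux_dev = false;
err_remove_kset:
    s->kset = false;
err_destroy_wq:
    s->wq = false;
    return -1;
}

/* Exit path: only tear down the faux device if it was actually created. */
static inline void demo_exit(struct demo_state *s)
{
    if (s->faux_dev)
        s->faux_dev = false;
    s->kset = false;
    s->wq = false;
}
```

Because the labels fall through in reverse order, each failure point releases exactly the resources that were acquired before it and nothing else.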
Boro review
Latest watcher review: open review
Head:
This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.
Chris validated the 6.18 LFA debs successfully. BPMP FW, RAS FW, and RMM were updated, and the kernel detected NVIDIA VR SAUCE lfa_v2. Logs: http://tegra-bmc-sol.nvidia.com:4002/shared/10_103_171_145_20260519_2106_edfc8682
It looks like these were picked from the 7.0 branch. That is fine, but it needs to be accounted for in the pick tag (just include the branch name after the SHA). The reverts are also missing a sign-off and a note that they are being replaced by a newer version of the series.
These two also have Change-Ids that you can strip out:
Summary
Status
Draft PR. Remote build, install, and boot smoke completed on nvidia@10.105.57.17; full LFA functional testing is still pending.
Validation
git diff --check upstream/linux-nvidia-6.18..HEAD
nvidia@10.105.57.17
36f52a70c249df084dc13bb874d558d6ccbb1b2d26a0e4dccc29 (build-only: tag LFA detection log, not part of this PR)
6.18.25-lfa-v2-test
BUILD_STATUS=0
nvidia@10.105.57.17
6.18.25-lfa-v2-test
/sys/firmware/lfa enumerated UUIDs:
0509b633-5734-422f-a681-6096e932d93a
3ab71f81-32b9-496d-841b-e3d0e9fd1a48
65922703-2f74-e644-8dff-579ac1ff0610
6c0762a6-12f2-4b56-92cb-ba8f633606d9
Arm LFA: Live Firmware Activation: detected v1.0 [NVIDIA VR SAUCE lfa_v2]
Arm LFA: registered LFA ACPI notification
Notes
The linux-image-6.18.25-lfa-v2-test package is installed and booted, but dpkg reports it as iF because unrelated DKMS modules failed during postinst (iser, isert, mlnx-ofed-kernel, mods, srp). The boot smoke and LFA sysfs/log checks still passed.