Skip to content

<fix>[kvm]: update TLS certs via kvmagent on host reconnect#3706

Open
zstack-robot-2 wants to merge 1 commit into5.5.12from
sync/yingzhe.hu/5.5.12-ZSTAC-83696
Open

<fix>[kvm]: update TLS certs via kvmagent on host reconnect#3706
zstack-robot-2 wants to merge 1 commit into5.5.12from
sync/yingzhe.hu/5.5.12-ZSTAC-83696

Conversation

@zstack-robot-2
Copy link
Copy Markdown
Collaborator

Resolves: ZSTAC-83696

Change-Id: I368cc5af8fb3d553bedc3be5d031015719e68ddc

sync from gitlab !9569

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 9, 2026

Walkthrough

在 KVM 主机连接流程中新增了用于更新 libvirt TLS 证书的命令/响应 DTO、主机端 REST 路径常量,并在连接流程中加入一个无回滚步骤,按条件读取证书/私钥并调用主机代理更新证书;失败仅记录警告,不阻塞连接。

Changes

Cohort / File(s) Summary
TLS 证书更新命令定义
plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java
新增内部 DTO:UpdateTlsCertCmd extends AgentCommand(字段:caCertcaKey(标注 @NoLogging)、certIps,含 getter/setter)和 UpdateTlsCertResponse extends AgentResponse
常量定义
plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java
新增常量 KVM_UPDATE_TLS_CERT_PATH = "/host/updateTlsCert",定义主机端证书更新 HTTP 路径。
主机连接流程扩展
plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java
在主机连接流程插入 update-tls-certs-if-neededNoRollbackFlow 步骤:按条件(非单元测试且启用 TLS)从 JsonLabel 读取 libvirtTLSCA/libvirtTLSPrivateKey,构建去重 IP 列表(managementIp + HostSystemTags.EXTRA_IPS),调用代理 updateTlsCertPath 发送 UpdateTlsCertCmd。代理调用失败仅记录警告并继续流程。

序列流程图

sequenceDiagram
    participant Host as KVMHost
    participant Config as 配置检查
    participant LabelDB as JsonLabel/DB
    participant Agent as KVM代理

    Host->>Config: 检查 UNIT_TEST_ON 与\nKVMGlobalConfig.LIBVIRT_TLS_ENABLED
    Config-->>Host: 是否继续

    alt TLS启用且非单元测试
        Host->>LabelDB: 读取 libvirtTLSCA\n和 libvirtTLSPrivateKey
        LabelDB-->>Host: 返回证书与私钥
        Host->>Host: 构建去重 certIps(managementIp + EXTRA_IPS)
        Host->>Agent: POST `KVM_UPDATE_TLS_CERT_PATH`\n发送 UpdateTlsCertCmd(caCert, caKey, certIps)
        Agent-->>Host: UpdateTlsCertResponse / 错误
        alt 调用成功
            Host->>Host: 继续连接流程(trigger.next)
        else 调用失败
            Host->>Host: 记录警告并继续(trigger.next)
        end
    else 条件不满足
        Host->>Host: 跳过证书更新步骤
    end
Loading

代码审查工作量估算

🎯 4 (Complex) | ⏱️ ~45 minutes

诗歌

🐰 证书新步调轻轻跑,
私钥与 CA 胸中抱,
管理 IP 排排队,去重真周到,
失败也只是眨眨眼,继续往前跳,
小兔欢喜主机稳,TLS 护航笑嘻嘻 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title follows the required format '[scope]: ' with 59 characters, well under the 72-character limit, and clearly describes the main change: updating TLS certificates via kvmagent on host reconnect.
Description check ✅ Passed The pull request description is related to the changeset, referencing ZSTAC-83696 and describing the sync of TLS certificate update logic that matches the code changes in the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/yingzhe.hu/5.5.12-ZSTAC-83696

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.42.1)
plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java`:
- Around line 459-466: The caKey field in KVMAgentCommands is sensitive and
missing the `@NoLogging` annotation; annotate the private String caKey field (and
optionally its getCaKey/setCaKey methods) with `@NoLogging` to prevent it from
appearing in serialized/debug logs, and add the required import for the
NoLogging annotation so compilation succeeds.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 6f02ffd9-b89d-4cc8-bdbf-0149cd73c187

📥 Commits

Reviewing files that changed from the base of the PR and between 73ca9d8 and 09d7ccb.

📒 Files selected for processing (3)
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java

Resolves: ZSTAC-83696

Change-Id: I368cc5af8fb3d553bedc3be5d031015719e68ddc
@MatheMatrix MatheMatrix force-pushed the sync/yingzhe.hu/5.5.12-ZSTAC-83696 branch from 09d7ccb to d2b5dc4 Compare April 9, 2026 06:38
@zstack-robot-1
Copy link
Copy Markdown
Collaborator

Comment from yaohua.wu:

🔍 Code Review — MR !9569 (zstack)

关联 Jira: ZSTAC-83696 — 迁移网络非物理机管理网段时,存储迁移失败 (P1 Bug)
关联 kvmagent 侧 MR: zstack-utility !6899
目标分支: 5.5.12 | 变更文件: 3 | 变更行数: +86

✅ 问题与方案一致性

Java 管理侧正确实现了在主机 reconnect 流程中调用新的 kvmagent TLS 证书更新 API。流程位置合理(在收集 host facts 之后),跳过逻辑正确(单元测试或 TLS 未启用时跳过),错误处理得当(失败不阻断 reconnect)。


🟡 Warning (2)

W1. new JsonLabel() 的依赖注入问题
📄 plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java — 约第 6063-6064 行

String caCert = new JsonLabel().get("libvirtTLSCA", String.class);
String caKey = new JsonLabel().get("libvirtTLSPrivateKey", String.class);

通过 new JsonLabel() 创建实例而非从 Spring 容器获取。请确认 JsonLabel 类是否支持无 Spring 上下文的直接实例化(即内部是否通过 Platform.getComponentLoader() 自行解析依赖)。如果 JsonLabelget() 方法依赖于注入的 DatabaseFacade,直接 new 可能导致 NPE。

建议确认现有代码库中是否有 new JsonLabel() 的先例用法,或考虑通过 @Autowired 注入。

W2. caCert 也应考虑敏感信息保护
📄 plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java — 约第 457-470 行

caKey 已正确标记 @NoLogging(已在最新 commit 中修复)。caCert 虽然技术上是公开信息,但建议确认团队安全策略是否要求对所有证书材料脱敏。当前实现是合理的。


🟢 Suggestion (2)

S1. IP 列表构建可以考虑使用 LinkedHashSet 简化去重
📄 plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java — 约第 6044-6056 行

List<String> allIps = new ArrayList<>();
allIps.add(managementIp);
if (extraIps != null && !extraIps.isEmpty()) {
    for (String ip : extraIps.split(",")) {
        String trimmed = ip.trim();
        if (!trimmed.isEmpty() && !allIps.contains(trimmed)) {
            allIps.add(trimmed);
        }
    }
}

allIps.contains() 在 ArrayList 上是 O(n)。虽然 IP 数量很少(通常 2-3 个),但使用 LinkedHashSet 可以更清晰地表达去重意图:

Set<String> allIps = new LinkedHashSet<>();
allIps.add(managementIp);
if (extraIps != null && !extraIps.isEmpty()) {
    for (String ip : extraIps.split(",")) {
        String trimmed = ip.trim();
        if (!trimmed.isEmpty()) allIps.add(trimmed);
    }
}
String certIps = String.join(",", allIps);

S2. 建议增加日志记录调用前的 IP 列表
在调用 kvmagent API 之前增加 debug 级别日志,记录将要更新的 IP 列表,便于排查问题:

logger.debug(String.format("Requesting TLS cert update on host[uuid:%s] for IPs: %s",
        self.getUuid(), certIps));

👍 优点

  • @NoLogging 已正确标注在 caKey 字段上,防止密钥泄漏到日志
  • 流程跳过条件设计合理:UNIT_TEST_ON!LIBVIRT_TLS_ENABLED 两个条件覆盖完整
  • 失败处理策略正确:TLS 证书更新失败不应阻断主机 reconnect,successfail 回调均调用 trigger.next()
  • 流程位置合理:放在收集 host facts 之后、版本检查之前

🤖 Robot Reviewer

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java`:
- Around line 6034-6096: The flow "update-tls-certs-if-needed" currently
unconditionally pushes libvirt TLS certs; change it to first read the host's
existing libvirt cert metadata and only proceed if the cert is already managed
by ZStack (e.g., compare a stored marker/fingerprint or host tag), otherwise
skip or require an explicit override flag (add a new config like
KVMGlobalConfig.LIBVIRT_TLS_FORCE_UPDATE). If proceeding, before calling
UpdateTlsCertCmd record the original cert/key/fingerprint (store under JsonLabel
keys such as "libvirtTLSOriginalCert"/"libvirtTLSOriginalKey" or a host tag) so
you can roll back, and modify the success/fail handlers to leave originals
intact and expose a rollback path; ensure checks live inside the run(...) of the
NoRollbackFlow and gate the new Http<>(updateTlsCertPath, ...) call by the
marker/fingerprint/or-force-flag logic.
- Around line 6062-6065: The current check only treats caCert/caKey as missing
when null, allowing empty or whitespace-only strings to be sent; update the
retrieval and check around JsonLabel.get("libvirtTLSCA", String.class) and
JsonLabel.get("libvirtTLSPrivateKey", String.class) (used in the block that logs
"TLS CA cert/key not found in database, skipping cert update") to treat blank
values as missing by applying a trimToNull-style normalization (or
StringUtils.isBlank) and then conditionally skipping and logging when the
normalized values are null/blank.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 097fd425-a051-4c5f-bae8-0527680af8c2

📥 Commits

Reviewing files that changed from the base of the PR and between 09d7ccb and d2b5dc4.

📒 Files selected for processing (3)
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java
✅ Files skipped from review due to trivial changes (1)
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java

Comment on lines +6034 to +6096
flow(new NoRollbackFlow() {
String __name__ = "update-tls-certs-if-needed";

@Override
public boolean skip(Map data) {
return CoreGlobalProperty.UNIT_TEST_ON
|| !KVMGlobalConfig.LIBVIRT_TLS_ENABLED.value(Boolean.class);
}

@Override
public void run(FlowTrigger trigger, Map data) {
String managementIp = getSelf().getManagementIp();
String extraIps = HostSystemTags.EXTRA_IPS.getTokenByResourceUuid(
self.getUuid(), HostSystemTags.EXTRA_IPS_TOKEN);

List<String> allIps = new ArrayList<>();
allIps.add(managementIp);
if (extraIps != null && !extraIps.isEmpty()) {
for (String ip : extraIps.split(",")) {
String trimmed = ip.trim();
if (!trimmed.isEmpty() && !allIps.contains(trimmed)) {
allIps.add(trimmed);
}
}
}

String certIps = String.join(",", allIps);

String caCert = new JsonLabel().get("libvirtTLSCA", String.class);
String caKey = new JsonLabel().get("libvirtTLSPrivateKey", String.class);
if (caCert == null || caKey == null) {
logger.warn("TLS CA cert/key not found in database, skipping cert update");
trigger.next();
return;
}

UpdateTlsCertCmd cmd = new UpdateTlsCertCmd();
cmd.setCaCert(caCert);
cmd.setCaKey(caKey);
cmd.setCertIps(certIps);

new Http<>(updateTlsCertPath, cmd, UpdateTlsCertResponse.class)
.call(new ReturnValueCompletion<UpdateTlsCertResponse>(trigger) {
@Override
public void success(UpdateTlsCertResponse ret) {
if (!ret.isSuccess()) {
logger.warn(String.format("Failed to update TLS certs on host[uuid:%s]: %s",
self.getUuid(), ret.getError()));
}
// cert update failure should not block reconnect
trigger.next();
}

@Override
public void fail(ErrorCode errorCode) {
logger.warn(String.format("Failed to update TLS certs on host[uuid:%s]: %s",
self.getUuid(), errorCode));
// cert update failure should not block reconnect
trigger.next();
}
});
}
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

避免在连接流程里无条件覆盖宿主机现有 TLS 证书

这里在 LIBVIRT_TLS_ENABLED 打开后会直接向 host 下发并更新 libvirt TLS 证书,但没有判断当前证书是否由平台托管,也没有记录原状态或提供回退路径。这样会把宿主机上可能由用户手工维护的证书一起覆盖掉,尤其这段逻辑还会跑在新增/重连流程里。建议至少加上“仅处理 ZStack 托管证书”的标记/指纹校验,或者放到显式开关后面。

As per coding guidelines “平台不应直接覆盖用户可能自定义修改过的东西……应当对原状态做好记录,做好二次确认,做好回退准备”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java` around lines 6034 -
6096, The flow "update-tls-certs-if-needed" currently unconditionally pushes
libvirt TLS certs; change it to first read the host's existing libvirt cert
metadata and only proceed if the cert is already managed by ZStack (e.g.,
compare a stored marker/fingerprint or host tag), otherwise skip or require an
explicit override flag (add a new config like
KVMGlobalConfig.LIBVIRT_TLS_FORCE_UPDATE). If proceeding, before calling
UpdateTlsCertCmd record the original cert/key/fingerprint (store under JsonLabel
keys such as "libvirtTLSOriginalCert"/"libvirtTLSOriginalKey" or a host tag) so
you can roll back, and modify the success/fail handlers to leave originals
intact and expose a rollback path; ensure checks live inside the run(...) of the
NoRollbackFlow and gate the new Http<>(updateTlsCertPath, ...) call by the
marker/fingerprint/or-force-flag logic.

Comment on lines +6062 to +6065
String caCert = new JsonLabel().get("libvirtTLSCA", String.class);
String caKey = new JsonLabel().get("libvirtTLSPrivateKey", String.class);
if (caCert == null || caKey == null) {
logger.warn("TLS CA cert/key not found in database, skipping cert update");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

把空白证书内容也当成缺失处理

这里只判断了 null,但 "" 或全空白字符串会继续下发到 agent,绕过“缺失就跳过”的保护分支。这里先做 trimToNull 更稳妥。

🩹 建议修改
-                        String caCert = new JsonLabel().get("libvirtTLSCA", String.class);
-                        String caKey = new JsonLabel().get("libvirtTLSPrivateKey", String.class);
+                        String caCert = StringUtils.trimToNull(new JsonLabel().get("libvirtTLSCA", String.class));
+                        String caKey = StringUtils.trimToNull(new JsonLabel().get("libvirtTLSPrivateKey", String.class));
                         if (caCert == null || caKey == null) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
String caCert = new JsonLabel().get("libvirtTLSCA", String.class);
String caKey = new JsonLabel().get("libvirtTLSPrivateKey", String.class);
if (caCert == null || caKey == null) {
logger.warn("TLS CA cert/key not found in database, skipping cert update");
String caCert = StringUtils.trimToNull(new JsonLabel().get("libvirtTLSCA", String.class));
String caKey = StringUtils.trimToNull(new JsonLabel().get("libvirtTLSPrivateKey", String.class));
if (caCert == null || caKey == null) {
logger.warn("TLS CA cert/key not found in database, skipping cert update");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java` around lines 6062 -
6065, The current check only treats caCert/caKey as missing when null, allowing
empty or whitespace-only strings to be sent; update the retrieval and check
around JsonLabel.get("libvirtTLSCA", String.class) and
JsonLabel.get("libvirtTLSPrivateKey", String.class) (used in the block that logs
"TLS CA cert/key not found in database, skipping cert update") to treat blank
values as missing by applying a trimToNull-style normalization (or
StringUtils.isBlank) and then conditionally skipping and logging when the
normalized values are null/blank.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants