[ZSTAC-83890][5.4.8] HA pre-fence leftover qemu on suspect host via sibling#3899
[ZSTAC-83890][5.4.8] HA pre-fence leftover qemu on suspect host via sibling#3899MatheMatrix wants to merge 1 commit into5.4.8from
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough在 HA 启动路径中添加 pre-fence:新增消息/agent 命令,KVM 主机实现 peer-SSH fencing,并在 VmInstanceBase 的 HA 启动 flow-chain 中插入 pre-fence 步骤、扩展聚合、状态更新与启动步骤,记录并上报调度结果。 变更说明HA 启动前主机隔离
Sequence Diagram(s)sequenceDiagram
participant HA as HA Starter
participant Target as Target KVMHost
participant Peer as Peer KVMHost
participant Agent as Peer Agent
HA->>Target: FenceVmOnHostMsg(hostUuid, vmUuid, siblings)
Target->>Target: pickFencePeerHost(hintedSiblings, cluster, exclude)
alt no peer
Target->>HA: FenceVmOnHostReply(error)
else peer found
Target->>Peer: KVMHostAsyncHttpCallMsg(FenceVmFromPeerCmd)
Peer->>Agent: FenceVmFromPeerCmd
Agent->>Agent: SSH fence suspect host, check qemu
Agent->>Peer: FenceVmFromPeerRsp(qemuConfirmedDead/qemuStillAlive/targetUnreachable)
Peer->>Target: response
alt qemuConfirmedDead or targetUnreachable
Target->>HA: FenceVmOnHostReply(success)
else qemuStillAlive or ambiguous
Target->>HA: FenceVmOnHostReply(failure)
end
end
sequenceDiagram
participant Msg as HaStartVmInstanceMsg
participant Base as VmInstanceBase
participant Recorder as VmSchedHistoryRecorder
participant Fence as Pre-Fence Step
participant Ext as BeforeHaStart Extensions
participant DB as VM DB
participant Starter as startVm()
Msg->>Base: 处理 HA 启动请求
Base->>Base: whetherStartVm()
Base->>Recorder: recorder.begin()
Base->>Fence: ha-pre-fence-vm (FenceVmOnHostMsg)
alt pre-fence 失败
Fence->>Base: fail
Base->>Recorder: recorder.end(error)
Base->>Msg: reply error
else pre-fence 成功
Fence->>Base: continue
Base->>Ext: 执行扩展(聚合错误)
alt 扩展失败
Ext->>Base: fail
Base->>Recorder: recorder.end(error)
Base->>Msg: reply error
else 扩展成功
Ext->>Base: continue
Base->>DB: 标记 VM 为 Stopped, 清 hostUuid, 设置 lastHostUuid
Base->>Starter: startVm(...)
alt 启动成功
Starter->>Base: VM started
Base->>Recorder: recorder.end(startedHostUuid)
Base->>Msg: reply success (inventory)
else 启动失败
Starter->>Base: error
Base->>Recorder: recorder.end(error)
Base->>Msg: reply error
end
end
end
🎯 4 (复杂) | ⏱️ ~45 分钟
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning)
✅ Passed checks (3 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
plugin/kvm/src/main/java/org/zstack/kvm/ha/KvmHaPreFenceVmExtension.java (1)
151-151: 💤 Low value注释中存在拼写错误
"Silbling" 应为 "Sibling"。根据代码规范,注释应使用正确的英文。
📝 建议修复
- // Fresh fallback - mirrors HaKvmHostSiblingChecker.findSilblingFromSameCluster. + // Fresh fallback - mirrors HaKvmHostSiblingChecker.findSiblingFromSameCluster.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/kvm/src/main/java/org/zstack/kvm/ha/KvmHaPreFenceVmExtension.java` at line 151, The comment in KvmHaPreFenceVmExtension that reads "Silbling" is misspelled; update the comment to use "Sibling" and, if it references HaKvmHostSiblingChecker.findSilblingFromSameCluster, correct the referenced method name in the comment to HaKvmHostSiblingChecker.findSiblingFromSameCluster so the comment matches proper English and the actual identifier.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java`:
- Around line 980-1028: preFenceHaStartVm currently treats any
HaPreFenceVmExtensionPoint failure as a hard rejection for all HA-starts; to
preserve backward compatibility, gate this strict behavior behind a
capability/global config or enable it only when the request explicitly indicates
peer-fence context (e.g. msg.getSiblingHostUuids() is non-empty) and the
environment reports peer-fence support via pluginRgty/Capability check; update
preFenceHaStartVm to (1) consult a new GlobalConfig or capability flag (or a
PluginRegistry capability query) before invoking/acting on
HaPreFenceVmExtensionPoint results, (2) fall back to next.run() when the gate is
disabled or siblingHostUuids is null/empty, and (3) only treat extension
failures as fatal when the gate is enabled and peer-fence capability is
confirmed.
In `@plugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerRsp.java`:
- Around line 10-17: The three primitive boolean fields in FenceVmFromPeerRsp
(qemuConfirmedDead, qemuStillAlive, targetHostUnreachable) cannot express
"absent" from older agents; change their types to boxed Boolean to allow null,
update constructors/serialization metadata if needed, and then update the
consumer logic in KvmHaPreFenceVmExtension (the decision branch around
qemuStillAlive and related checks) to treat null as "absent" and enforce a
strict rule: accept only when exactly one of the three Boolean fields is
Boolean.TRUE, otherwise follow the safe/reject branch. Ensure any usages assume
possible nulls and handle NPEs accordingly.
---
Nitpick comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/ha/KvmHaPreFenceVmExtension.java`:
- Line 151: The comment in KvmHaPreFenceVmExtension that reads "Silbling" is
misspelled; update the comment to use "Sibling" and, if it references
HaKvmHostSiblingChecker.findSilblingFromSameCluster, correct the referenced
method name in the comment to HaKvmHostSiblingChecker.findSiblingFromSameCluster
so the comment matches proper English and the actual identifier.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: a23821f8-76e1-41b8-b17f-b013056adf9d
⛔ Files ignored due to path filters (1)
conf/springConfigXml/Kvm.xmlis excluded by!**/*.xml
📒 Files selected for processing (6)
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.javaheader/src/main/java/org/zstack/header/vm/HaPreFenceVmExtensionPoint.javaheader/src/main/java/org/zstack/header/vm/HaStartVmInstanceMsg.javaplugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerCmd.javaplugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerRsp.javaplugin/kvm/src/main/java/org/zstack/kvm/ha/KvmHaPreFenceVmExtension.java
7395a66 to
c7d5bb3
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
header/src/main/java/org/zstack/header/vm/FenceVmOnHostMsg.java (1)
21-45: ⚡ Quick win建议避免
siblingHostUuids可变引用外泄。当前
setSiblingHostUuids()与getSiblingHostUuids()直接传递内部List引用;调用方在发送消息后若继续修改原列表,消息内容会被“隐式改写”。建议在 Line 44 做 defensive copy,并在 Line 40 返回不可变列表(同时给空默认值,减少下游空判)。可参考的修改
+import java.util.ArrayList; +import java.util.Collections; import java.util.List; @@ - private List<String> siblingHostUuids; + private List<String> siblingHostUuids = Collections.emptyList(); @@ public List<String> getSiblingHostUuids() { - return siblingHostUuids; + return siblingHostUuids; } @@ public void setSiblingHostUuids(List<String> siblingHostUuids) { - this.siblingHostUuids = siblingHostUuids; + this.siblingHostUuids = siblingHostUuids == null + ? Collections.emptyList() + : Collections.unmodifiableList(new ArrayList<>(siblingHostUuids)); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@header/src/main/java/org/zstack/header/vm/FenceVmOnHostMsg.java` around lines 21 - 45, Avoid exposing a mutable reference for siblingHostUuids: in setSiblingHostUuids(List<String>) make a defensive copy of the incoming list before assigning to the siblingHostUuids field (handle null by assigning an empty list), and in getSiblingHostUuids() return an unmodifiable/immutable view (or a new copy) instead of the internal list so callers cannot mutate the internal state; update references to siblingHostUuids, setSiblingHostUuids, and getSiblingHostUuids accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java`:
- Around line 982-996: The code in VmInstanceBase unconditionally sends
FenceVmOnHostMsg when self.getHostUuid() != null, but only KVMHost implements
handling; modify the sending logic in the block around FenceVmOnHostMsg so you
first check the host's hypervisor/capability (e.g. via HostVO/host type or a
capability check method) and only construct/send FenceVmOnHostMsg for KVM hosts
(or hosts that advertise the peer-fence capability); for other hypervisors, skip
the fence step and call trigger.next() (or short-circuit a successful reply) to
preserve existing HA behavior and avoid sending FenceVmOnHostMsg to unsupported
hosts.
In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java`:
- Around line 1002-1042: The callback in KVMHost that builds and sends
FenceVmFromPeerCmd must not transmit host SSH credentials or raw remote
stdout/stderr; remove targetHostUsername/targetHostPassword (and
targetHostSshPort if not required) from FenceVmFromPeerCmd and instead pass a
credential reference or one‑time token (e.g., a credentialId or ephemeral key)
and ensure the peer agent uses server-side managed credentials; also sanitize
the error detail logic in the bus.send callback (where FenceVmFromPeerRsp is
inspected) so reply.setError never contains rsp.stdout/rsp.stderr or raw
rsp.getError()—replace them with a short redacted summary or hashed fingerprint
and log full outputs only to a secure local audit with restricted access.
---
Nitpick comments:
In `@header/src/main/java/org/zstack/header/vm/FenceVmOnHostMsg.java`:
- Around line 21-45: Avoid exposing a mutable reference for siblingHostUuids: in
setSiblingHostUuids(List<String>) make a defensive copy of the incoming list
before assigning to the siblingHostUuids field (handle null by assigning an
empty list), and in getSiblingHostUuids() return an unmodifiable/immutable view
(or a new copy) instead of the internal list so callers cannot mutate the
internal state; update references to siblingHostUuids, setSiblingHostUuids, and
getSiblingHostUuids accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: 1e80c1a9-73e4-4f26-9d44-ab1b63bfc09b
📒 Files selected for processing (7)
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.javaheader/src/main/java/org/zstack/header/vm/FenceVmOnHostMsg.javaheader/src/main/java/org/zstack/header/vm/FenceVmOnHostReply.javaheader/src/main/java/org/zstack/header/vm/HaStartVmInstanceMsg.javaplugin/kvm/src/main/java/org/zstack/kvm/KVMHost.javaplugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerCmd.javaplugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerRsp.java
✅ Files skipped from review due to trivial changes (2)
- header/src/main/java/org/zstack/header/vm/FenceVmOnHostReply.java
- header/src/main/java/org/zstack/header/vm/HaStartVmInstanceMsg.java
🚧 Files skipped from review as they are similar to previous changes (2)
- plugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerRsp.java
- plugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerCmd.java
| FenceVmFromPeerCmd cmd = new FenceVmFromPeerCmd(); | ||
| cmd.vmUuid = msg.getVmUuid(); | ||
| cmd.targetHostUuid = suspectHostUuid; | ||
| cmd.targetHostIp = self.getManagementIp(); | ||
| cmd.targetHostUsername = getSelf().getUsername(); | ||
| cmd.targetHostPassword = getSelf().getPassword(); | ||
| cmd.targetHostSshPort = getSelf().getPort() != null ? getSelf().getPort() : 22; | ||
| cmd.sshTimeoutSec = 20; | ||
|
|
||
| KVMHostAsyncHttpCallMsg fmsg = new KVMHostAsyncHttpCallMsg(); | ||
| fmsg.setHostUuid(peerHostUuid); | ||
| fmsg.setPath(FenceVmFromPeerCmd.FENCE_VM_FROM_PEER_PATH); | ||
| fmsg.setCommand(cmd); | ||
| bus.makeTargetServiceIdByResourceUuid(fmsg, HostConstant.SERVICE_ID, peerHostUuid); | ||
|
|
||
| logger.info(String.format("[HA pre-fence] vm[%s] ask sibling[%s] to fence suspect host[%s ip=%s]", | ||
| msg.getVmUuid(), peerHostUuid, suspectHostUuid, self.getManagementIp())); | ||
|
|
||
| bus.send(fmsg, new CloudBusCallBack(msg) { | ||
| @Override | ||
| public void run(MessageReply r) { | ||
| if (!r.isSuccess()) { | ||
| reply.setError(operr("HA-start vm[%s]: failed to ask sibling[%s] to fence suspect host[%s]: %s. " + | ||
| "Refuse to start to prevent split-brain.", | ||
| msg.getVmUuid(), peerHostUuid, suspectHostUuid, r.getError().getDetails())); | ||
| bus.reply(msg, reply); | ||
| return; | ||
| } | ||
| FenceVmFromPeerRsp rsp = ((KVMHostAsyncHttpCallReply) r).toResponse(FenceVmFromPeerRsp.class); | ||
| if (!rsp.isSuccess() && !rsp.qemuStillAlive) { | ||
| reply.setError(operr("HA-start vm[%s]: sibling[%s] failed to fence suspect host[%s]: %s. " + | ||
| "Refuse to start to prevent split-brain.", | ||
| msg.getVmUuid(), peerHostUuid, suspectHostUuid, rsp.getError())); | ||
| bus.reply(msg, reply); | ||
| return; | ||
| } | ||
| if (rsp.qemuStillAlive) { | ||
| reply.setError(operr("HA-start vm[%s]: qemu is still alive on suspect host[%s ip=%s] after sibling[%s] " + | ||
| "force-destroy attempt. Refuse to start to prevent split-brain. stdout=[%s] stderr=[%s]", | ||
| msg.getVmUuid(), suspectHostUuid, self.getManagementIp(), peerHostUuid, | ||
| rsp.stdout, rsp.stderr)); |
There was a problem hiding this comment.
不要把宿主机 SSH 凭据和远端原始输出透传到 sibling fencing 链路
这里把 suspect host 的 username/password 放进 FenceVmFromPeerCmd,随后又把 rsp.stdout/stderr 直接拼进 error detail。这样会把宿主机凭据和远端脚本原始输出暴露给 CloudBus、kvmagent HTTP 以及异常/审计日志链路,明显扩大敏感信息落盘面。建议改成管理节点代执行,或只下发一次性凭据/私钥引用,并在返回给上层的错误里仅保留脱敏摘要。
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java` around lines 1002 -
1042, The callback in KVMHost that builds and sends FenceVmFromPeerCmd must not
transmit host SSH credentials or raw remote stdout/stderr; remove
targetHostUsername/targetHostPassword (and targetHostSshPort if not required)
from FenceVmFromPeerCmd and instead pass a credential reference or one‑time
token (e.g., a credentialId or ephemeral key) and ensure the peer agent uses
server-side managed credentials; also sanitize the error detail logic in the
bus.send callback (where FenceVmFromPeerRsp is inspected) so reply.setError
never contains rsp.stdout/rsp.stderr or raw rsp.getError()—replace them with a
short redacted summary or hashed fingerprint and log full outputs only to a
secure local audit with restricted access.
c7d5bb3 to
7b832c5
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@simulator/simulatorImpl/src/main/java/org/zstack/simulator/kvm/KVMSimulatorController.java`:
- Around line 759-767: The fenceVmFromPeer handler currently skips deserializing
the request and storing the command, making it hard to unit-test; modify the
method fenceVmFromPeer to deserialize the request body into a FenceVmFromPeerCmd
using JSONObjectUtil.toObject(entity.getBody(), FenceVmFromPeerCmd.class),
ensure KVMSimulatorConfig has a public List<FenceVmFromPeerCmd>
fenceVmFromPeerCmds field, add the deserialized cmd to
config.fenceVmFromPeerCmds, then construct FenceVmFromPeerRsp as before (set
qemuConfirmedDead and success) and call replyer.reply(entity, rsp); keep the
rest of the reply behavior unchanged so tests can assert the stored command.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: 1bcd0710-b547-43c3-9a4b-c49692dcd006
📒 Files selected for processing (1)
simulator/simulatorImpl/src/main/java/org/zstack/simulator/kvm/KVMSimulatorController.java
| @RequestMapping(value = FenceVmFromPeerCmd.FENCE_VM_FROM_PEER_PATH, method = RequestMethod.POST) | ||
| public @ResponseBody String fenceVmFromPeer(HttpServletRequest req) { | ||
| HttpEntity<String> entity = restf.httpServletRequestToHttpEntity(req); | ||
| FenceVmFromPeerRsp rsp = new FenceVmFromPeerRsp(); | ||
| rsp.qemuConfirmedDead = true; | ||
| rsp.setSuccess(true); | ||
| replyer.reply(entity, rsp); | ||
| return null; | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
建议遵循模拟器处理器的既定模式以提升可测试性
当前实现与本文件中其他模拟器处理器不一致,缺少以下标准模式:
- 未反序列化命令:其他处理器(如
hardenVmConsole、checkVmState)都会通过JSONObjectUtil.toObject反序列化请求体,以便单元测试可以验证请求参数。 - 未存储命令到 config:其他处理器将命令存储到
config中(例如config.hardenVmConsoleCmds.add(cmd)),使得测试可以断言该端点被正确调用。
这会降低该端点的可测试性。
♻️ 建议的重构方案
首先,在 KVMSimulatorConfig.java 中添加存储字段(如果尚未添加):
public List<FenceVmFromPeerCmd> fenceVmFromPeerCmds = new ArrayList<>();然后修改处理器:
`@RequestMapping`(value = FenceVmFromPeerCmd.FENCE_VM_FROM_PEER_PATH, method = RequestMethod.POST)
public `@ResponseBody` String fenceVmFromPeer(HttpServletRequest req) {
HttpEntity<String> entity = restf.httpServletRequestToHttpEntity(req);
+ FenceVmFromPeerCmd cmd = JSONObjectUtil.toObject(entity.getBody(), FenceVmFromPeerCmd.class);
+ config.fenceVmFromPeerCmds.add(cmd);
FenceVmFromPeerRsp rsp = new FenceVmFromPeerRsp();
rsp.qemuConfirmedDead = true;
rsp.setSuccess(true);
replyer.reply(entity, rsp);
return null;
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@simulator/simulatorImpl/src/main/java/org/zstack/simulator/kvm/KVMSimulatorController.java`
around lines 759 - 767, The fenceVmFromPeer handler currently skips
deserializing the request and storing the command, making it hard to unit-test;
modify the method fenceVmFromPeer to deserialize the request body into a
FenceVmFromPeerCmd using JSONObjectUtil.toObject(entity.getBody(),
FenceVmFromPeerCmd.class), ensure KVMSimulatorConfig has a public
List<FenceVmFromPeerCmd> fenceVmFromPeerCmds field, add the deserialized cmd to
config.fenceVmFromPeerCmds, then construct FenceVmFromPeerRsp as before (set
qemuConfirmedDead and success) and call replyer.reply(entity, rsp); keep the
rest of the reply behavior unchanged so tests can assert the stored command.
d06a0cd to
cb6369c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java`:
- Around line 1051-1079: The change persists VmInstanceVO state/hostUuid to
Stopped before calling startVm(), which causes startVm()'s refreshVO() to treat
the modified state as originState and breaks HA-failure semantics and
changeVmStateInDb callbacks; fix by reverting the
"mark-vm-stopped-and-save-last-host" flow: instead of updating the DB before
start, capture and store the original VmInstanceVO fields (state, hostUuid,
lastHostUuid) into the flow data map (inside the first NoRollbackFlow) and
either (a) perform the UpdateQuery that sets state=Stopped/hostUuid=null only
after startVm() success (move the DB update into the success() callback of
startVm) or (b) if you must keep the DB update early, ensure failures restore
the original values by invoking changeVmStateInDb or executing an UpdateQuery
using the saved originals in the failure path (trigger.fail/trigger.rollback).
Reference: the NoRollbackFlow named "mark-vm-stopped-and-save-last-host",
startVm(...), changeVmStateInDb(...), and VmInstanceVO/VmInstanceVO_.uuid to
locate the code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: 7939f7e6-3080-4a7d-879f-6b8807cda191
📒 Files selected for processing (8)
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.javaheader/src/main/java/org/zstack/header/vm/FenceVmOnHostMsg.javaheader/src/main/java/org/zstack/header/vm/FenceVmOnHostReply.javaheader/src/main/java/org/zstack/header/vm/HaStartVmInstanceMsg.javaplugin/kvm/src/main/java/org/zstack/kvm/KVMHost.javaplugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerCmd.javaplugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerRsp.javasimulator/simulatorImpl/src/main/java/org/zstack/simulator/kvm/KVMSimulatorController.java
🚧 Files skipped from review as they are similar to previous changes (5)
- plugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerRsp.java
- header/src/main/java/org/zstack/header/vm/FenceVmOnHostReply.java
- plugin/kvm/src/main/java/org/zstack/kvm/ha/FenceVmFromPeerCmd.java
- header/src/main/java/org/zstack/header/vm/FenceVmOnHostMsg.java
- plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java
| fchain.then(new NoRollbackFlow() { | ||
| String __name__ = "mark-vm-stopped-and-save-last-host"; | ||
|
|
||
| @Override | ||
| public void run(FlowTrigger trigger, Map data) { | ||
| logger.debug(String.format("HaStartVmJudger[%s] says the VM[uuid:%s, name:%s] is qualified for HA start, now we are starting it", | ||
| judger.getClass(), self.getUuid(), self.getName())); | ||
| UpdateQuery sql = SQL.New(VmInstanceVO.class) | ||
| .eq(VmInstanceVO_.uuid, self.getUuid()) | ||
| .set(VmInstanceVO_.state, VmInstanceState.Stopped) | ||
| .set(VmInstanceVO_.hostUuid, null); | ||
|
|
||
| if (self.getHostUuid() != null) { | ||
| sql.set(VmInstanceVO_.lastHostUuid, self.getHostUuid()); | ||
| } | ||
|
|
||
| sql.update(); | ||
| trigger.next(); | ||
| } | ||
| }); | ||
|
|
||
| startVm(msg, new Completion(msg, chain) { | ||
| fchain.then(new NoRollbackFlow() { | ||
| String __name__ = "start-ha-vm"; | ||
|
|
||
| @Override | ||
| public void run(FlowTrigger trigger, Map data) { | ||
| startVm(msg, new Completion(trigger) { | ||
| @Override | ||
| public void success() { | ||
| reply.setInventory(getSelfInventory()); | ||
| bus.reply(msg, reply); | ||
| recorder.end(reply.getInventory().getHostUuid()); | ||
| chain.next(); | ||
| trigger.next(); |
There was a problem hiding this comment.
不要在真正启动前先把 VM 落库成 Stopped。
这里先改库,再调用 startVm();而 startVm() 一进来会 refreshVO(),随后把已经被改写后的状态当作自己的 originState。这样一来,只要后面的分配、pre-start 或 start 失败,回滚就只会回到 Stopped/hostUuid = null,而不是 HA 进入前的原始状态。这个改动会直接改写 5.4.8 的 HA 失败语义,而且还绕过了 changeVmStateInDb(...) 的状态事件/扩展点回调。
建议把原始 state/hostUuid/lastHostUuid 存进 flow data 并在失败时恢复,或者把这次落库延后到 HA start 真正成功之后。
As per coding guidelines: “向后兼容原则……不应直接改动已有行为……开关控制等”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java` around lines
1051 - 1079, The change persists VmInstanceVO state/hostUuid to Stopped before
calling startVm(), which causes startVm()'s refreshVO() to treat the modified
state as originState and breaks HA-failure semantics and changeVmStateInDb
callbacks; fix by reverting the "mark-vm-stopped-and-save-last-host" flow:
instead of updating the DB before start, capture and store the original
VmInstanceVO fields (state, hostUuid, lastHostUuid) into the flow data map
(inside the first NoRollbackFlow) and either (a) perform the UpdateQuery that
sets state=Stopped/hostUuid=null only after startVm() success (move the DB
update into the success() callback of startVm) or (b) if you must keep the DB
update early, ensure failures restore the original values by invoking
changeVmStateInDb or executing an UpdateQuery using the saved originals in the
failure path (trigger.fail/trigger.rollback). Reference: the NoRollbackFlow
named "mark-vm-stopped-and-save-last-host", startVm(...),
changeVmStateInDb(...), and VmInstanceVO/VmInstanceVO_.uuid to locate the code.
436a16b to
746506a
Compare
|
Comment from yaohua.wu: Review: MR !9783 — ZSTAC-83890关联 MR
总体评价本仓承担 ZSTAC-83890 fix 的核心架构:
设计合理:opt-in(仅当 🔴 Critical(关联仓 zstack-utility !7019)本仓 但 zstack-utility !7019 的 本仓侧没问题、但作为整套 fix 的一部分,utility 那条不修,本仓的 ambiguous-refuse 防护也起不到作用:utility 永远把 qemu D-state(唯一会触发 stillAlive 的真实场景)误报为 unreachable,本仓直接放行 HA → 脑裂。详见 🟡 Warning
🟢 Suggestions
Coverage 评估
VerdictREVISION_REQUIRED —— 本仓自身实现合理,但 关联仓 zstack-utility !7019 有 Critical。三个 MR 必须一起 merge,utility 那条 pkill self-match bug 不修,本仓的 refuse 分支永远走不到、整套 fix 等于失效。本仓侧建议处理 Warning 里的 🤖 Robot Reviewer |
ZSTAC-83890 / TIC-5513: Before allowing HA to start a VM on a new host, ask a sibling KVM host to SSH into the suspect host and force-destroy any leftover qemu process (virsh destroy + pkill). This prevents split-brain when the suspect host is alive but unreachable from MN. - KVMHost handles FenceVmOnHostMsg, picks a Connected sibling, sends FenceVmFromPeerCmd to the sibling's kvmagent via KVMHostAsyncHttpCallMsg - kvmagent (ha_plugin.py) SSHes to target host, runs virsh destroy + pkill; returns qemuConfirmedDead / qemuStillAlive / targetHostUnreachable - FenceVmFromPeerCmd/Rsp defined as KVMHost inner static classes; path constant KVM_HA_FENCE_VM_FROM_PEER_PATH in KVMConstant - @nologging on targetHostPassword; implements Serializable for CloudBusGsonSkipLoggerCase Resolves: ZSTAC-83890 Change-Id: Ibcd7b318f57248795006079c7059911a2ecde333
746506a to
2cb4fef
Compare
Background
TIC-5513: Ceph OSD osd_client_watch_timeout (default 30s) evicts the watcher when watch_ping is delayed (e.g. source host OOM / storage net blocked). rbd status briefly returns Watchers: none even though qemu is still alive. HA's second watcher check hits this transient empty window -> VM starts on a new host -> split-brain after the source host recovers.
Fix
Add HaPreFenceVmExtensionPoint, called by VmInstanceBase.handle(HaStartVmInstanceMsg) before BeforeHaStartVmInstanceExtensionPoint. KVM impl KvmHaPreFenceVmExtension:
Companion MRs
Refs
Resolves: ZSTAC-83890
Refs: TIC-5513
sync from gitlab !9783