Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions conf/springConfigXml/sdnController.xml
Original file line number Diff line number Diff line change
Expand Up @@ -94,4 +94,6 @@
</zstack:plugin>
</bean>

<bean id="DirtySyncTracker" class="org.zstack.sdnController.DirtySyncTrackerImpl" />

</beans>
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
package org.zstack.sdnController;

/**
* Tracks which SDN controllers need a full reconciliation sync.
*
* When external operations (VM NIC create/delete, migration rollback, etc.) encounter failures
* that may leave Cloud and SDN controller out of sync, they call {@link #markNeedSync} to flag
* the affected controller.
*
* The ping tracker checks this flag after consecutive successful pings and triggers a full sync
* when needed.
*/
public interface DirtySyncTracker {
/**
* Mark a controller as needing a full sync.
* Idempotent — calling multiple times has no additional effect.
*/
void markNeedSync(String controllerUuid);

/**
* Check if a controller needs a sync.
*/
boolean needsSync(String controllerUuid);

/**
* Atomically clear the needsSync flag and return its previous value.
* @return true if the controller was marked as needing sync, false otherwise.
*/
boolean clearNeedSync(String controllerUuid);
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
package org.zstack.sdnController;

import org.zstack.utils.Utils;
import org.zstack.utils.logging.CLogger;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
* In-memory implementation of DirtySyncTracker.
* Thread-safe via ConcurrentHashMap.
* Dirty state is ephemeral — lost on MN restart, which is acceptable because
* the ping-triggered sync provides eventual consistency.
*/
public class DirtySyncTrackerImpl implements DirtySyncTracker {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== 查找 DirtySyncTracker 相关实现与注解 =="
rg -n --type=java -C2 'interface\s+DirtySyncTracker|class\s+DirtySyncTrackerImpl|@Component|@Service|@Configuration|@Bean'

echo
echo "== 查找 XML 中是否有 DirtySyncTracker/DirtySyncTrackerImpl Bean 注册 =="
rg -n -C2 'DirtySyncTracker|DirtySyncTrackerImpl|<bean' --iglob '*.xml'

echo
echo "== 查找是否存在其他 DirtySyncTracker 实现 =="
rg -n --type=java -C2 'class\s+\w+\s+implements\s+DirtySyncTracker'

Repository: MatheMatrix/zstack

Length of output: 50375


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== 检查 DirtySyncTrackerImpl 完整内容(前30行)=="
head -30 plugin/sdnController/src/main/java/org/zstack/sdnController/DirtySyncTrackerImpl.java

echo
echo "== 检查 SdnControllerManagerImpl 的 `@Autowired` 位置 =="
sed -n '65,75p' plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerManagerImpl.java

echo
echo "== 查找 sdnController 目录下是否有配置文件 =="
find plugin/sdnController -name "*.xml" -o -name "*.properties" | head -20

Repository: MatheMatrix/zstack

Length of output: 1753


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== 查找 sdnController 模块中的 `@Configuration` 类 =="
rg -n --type=java '@Configuration' plugin/sdnController/

echo
echo "== 查找 sdnController 模块中的 `@SpringBootApplication` 或其他配置入口 =="
rg -n --type=java '@SpringBootApplication|@EnableAutoConfiguration|ComponentScan' plugin/sdnController/

echo
echo "== 查找 SdnControllerManagerImpl 类声明及其注解 =="
sed -n '1,30p' plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerManagerImpl.java

Repository: MatheMatrix/zstack

Length of output: 106


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== 检查 SdnControllerManagerImpl 类声明(前50行)=="
head -50 plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerManagerImpl.java

echo
echo "== 查找 sdnController 模块中所有 `@Component/`@Service/@Aspect 注解的类 =="
rg -n --type=java '@Component|@Service|@Aspect' plugin/sdnController/ -l

echo
echo "== 检查是否存在 Bean 工厂方法创建 DirtySyncTrackerImpl =="
rg -n --type=java '@Bean.*DirtySyncTracker|new DirtySyncTrackerImpl' plugin/sdnController/

Repository: MatheMatrix/zstack

Length of output: 2544


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== 查找 AbstractService 的定义 =="
rg -n --type=java 'class AbstractService' --max-count=5

echo
echo "== 检查 AbstractService 中的注解 =="
rg -B5 'class AbstractService' --type=java | head -20

echo
echo "== 查找所有实现 DirtySyncTracker 的类 =="
rg -n --type=java 'implements DirtySyncTracker'

echo
echo "== 检查 DirtySyncTrackerImpl 是否在测试或其他模块中被显式注册 =="
rg -n 'DirtySyncTrackerImpl' --type=java

Repository: MatheMatrix/zstack

Length of output: 1258


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== 最终确认: 检查 DirtySyncTrackerImpl 完整文件内容 =="
cat plugin/sdnController/src/main/java/org/zstack/sdnController/DirtySyncTrackerImpl.java | head -20

echo
echo "== 检查使用 DirtySyncTracker 注入的其他类 =="
rg -n '@Autowired.*DirtySyncTracker|@Resource.*DirtySyncTracker' --type=java

Repository: MatheMatrix/zstack

Length of output: 897


添加 @Component 注解到 DirtySyncTrackerImpl 以完成 Spring Bean 注册。

SdnControllerManagerImpl 在第 69-70 行通过 @Autowired 注入 DirtySyncTracker 接口,但当前 DirtySyncTrackerImpl 缺少任何 Spring Bean 注册注解(@Component@Service 等)。由于这是该接口的唯一实现,且找不到其他显式的 Bean 注册机制(XML 配置、@Configuration 类或 @Bean 工厂方法),应用启动时将抛出 UnsatisfiedDependencyException

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/sdnController/src/main/java/org/zstack/sdnController/DirtySyncTrackerImpl.java`
at line 16, DirtySyncTrackerImpl is not registered as a Spring bean, causing
UnsatisfiedDependencyException when SdnControllerManagerImpl autowires
DirtySyncTracker; fix by annotating the implementation class
DirtySyncTrackerImpl with `@Component` (and add the corresponding import
org.springframework.stereotype.Component) so Spring can discover and register
the bean for the DirtySyncTracker interface used by SdnControllerManagerImpl
(which relies on `@Autowired`).

private static final CLogger logger = Utils.getLogger(DirtySyncTrackerImpl.class);

private final Map<String, Boolean> needsSyncMap = new ConcurrentHashMap<>();

@Override
public void markNeedSync(String controllerUuid) {
needsSyncMap.put(controllerUuid, Boolean.TRUE);
logger.debug(String.format("[sync-tracker] marked controller[uuid:%s] needsSync", controllerUuid));
}

@Override
public boolean needsSync(String controllerUuid) {
return Boolean.TRUE.equals(needsSyncMap.get(controllerUuid));
}

@Override
public boolean clearNeedSync(String controllerUuid) {
Boolean prev = needsSyncMap.remove(controllerUuid);
if (Boolean.TRUE.equals(prev)) {
logger.info(String.format("[sync-tracker] cleared needsSync for controller[uuid:%s]", controllerUuid));
}
return Boolean.TRUE.equals(prev);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,9 @@ public class SdnControllerBase {

public SdnControllerVO self;

// Consecutive sync failure tracking (in-memory, per controller)
private static final java.util.Map<String, Integer> consecutiveFailCount = new java.util.concurrent.ConcurrentHashMap<>();

public SdnControllerBase(SdnControllerVO self) {
this.self = self;
}
Expand Down Expand Up @@ -141,6 +144,23 @@ public void handleMessage(SdnControllerMessage msg) {

private void handle(SyncSdnControllerDataMsg msg) {
SyncSdnControllerDataReply reply = new SyncSdnControllerDataReply();

SdnControllerFactory factory = sdnMgr.getSdnControllerFactory(self.getVendorType());
if (!factory.isSyncEnabled()) {
logger.info(String.format("sync is disabled for sdn controller[uuid:%s], skip", self.getUuid()));
bus.reply(msg, reply);
return;
}

if (self.getStatus() != SdnControllerStatus.Connected) {
logger.info(String.format("sdn controller[uuid:%s] is not connected (status=%s), skip sync",
self.getUuid(), self.getStatus()));
bus.reply(msg, reply);
return;
}

boolean dryRun = factory.isDryRun();

// Run a sync chain that only syncs data, without touching connection status
thdf.chainSubmit(new ChainTask(reply) {
@Override
Expand All @@ -152,21 +172,34 @@ public String getSyncSignature() {
public void run(SyncTaskChain chain) {
FlowChain flowChain = sdnMgr.getSyncChain(self);
flowChain.getData().put(SDN_CONTROLLER_UUID, self.getUuid());
flowChain.getData().put(SYNC_DRY_RUN, dryRun);
flowChain.setName(String.format("sync-sdn-controller-data-%s-%s", self.getUuid(), self.getName()));

// Start the chain; flows in factory-provided chain should perform data sync operations
flowChain.done(new FlowDoneHandler(msg) {
@Override
public void handle(Map data) {
populateSyncReply(reply, data, dryRun);
bus.reply(msg, reply);

// Reset consecutive fail count and fire completion event
consecutiveFailCount.remove(self.getUuid());
fireSyncCompletedEvent(reply);

chain.next();
}
}).error(new FlowErrorHandler(msg) {
@Override
public void handle(ErrorCode errCode, Map data) {
SyncSdnControllerDataReply r = new SyncSdnControllerDataReply();
r.setError(errCode);
populateSyncReply(r, data, dryRun);
bus.reply(msg, r);

// Increment consecutive fail count and check alarm threshold
int failCount = consecutiveFailCount.merge(self.getUuid(), 1, Integer::sum);
fireSyncFailedEvent(errCode, failCount);
checkSyncAlarm(failCount, factory);

chain.next();
}
}).start();
Expand All @@ -179,6 +212,63 @@ public String getName() {
});
}

private void populateSyncReply(SyncSdnControllerDataReply reply, Map data, boolean dryRun) {
reply.setDryRun(dryRun);
if (data == null) return;
Object statsObj = data.get(SYNC_STATS);
if (statsObj == null) return;
try {
java.lang.reflect.Method getToCreateCount = statsObj.getClass().getMethod("getToCreateCount");
java.lang.reflect.Method getToDeleteCount = statsObj.getClass().getMethod("getToDeleteCount");
java.lang.reflect.Method getToUpdateCount = statsObj.getClass().getMethod("getToUpdateCount");
java.lang.reflect.Method getFailedCount = statsObj.getClass().getMethod("getFailedCount");
reply.setToCreateCount((int) getToCreateCount.invoke(statsObj));
reply.setToDeleteCount((int) getToDeleteCount.invoke(statsObj));
reply.setToUpdateCount((int) getToUpdateCount.invoke(statsObj));
reply.setFailedCount((int) getFailedCount.invoke(statsObj));
} catch (Exception e) {
logger.warn(String.format("failed to read sync stats: %s", e.getMessage()));
}
}

private void fireSyncCompletedEvent(SyncSdnControllerDataReply reply) {
SdnControllerCanonicalEvents.SdnControllerSyncCompletedData d =
new SdnControllerCanonicalEvents.SdnControllerSyncCompletedData();
d.setSdnControllerUuid(self.getUuid());
d.setToCreateCount(reply.getToCreateCount());
d.setToDeleteCount(reply.getToDeleteCount());
d.setToUpdateCount(reply.getToUpdateCount());
d.setFailedCount(reply.getFailedCount());
d.setDryRun(reply.isDryRun());
d.setScope("FULL");
evtf.fire(SdnControllerCanonicalEvents.SDN_CONTROLLER_SYNC_COMPLETED_PATH, d);
}

private void fireSyncFailedEvent(ErrorCode errCode, int failCount) {
SdnControllerCanonicalEvents.SdnControllerSyncFailedData d =
new SdnControllerCanonicalEvents.SdnControllerSyncFailedData();
d.setSdnControllerUuid(self.getUuid());
d.setErrorCode(errCode.getCode());
d.setErrorDetails(errCode.getDetails());
d.setConsecutiveFailCount(failCount);
evtf.fire(SdnControllerCanonicalEvents.SDN_CONTROLLER_SYNC_FAILED_PATH, d);
}

private void checkSyncAlarm(int failCount, SdnControllerFactory factory) {
int alarmThreshold = factory.getSyncFailAlarmThreshold();

if (failCount >= alarmThreshold && (failCount - alarmThreshold) % 5 == 0) {
logger.error(String.format("[sync-alarm] controller[uuid:%s] has %d consecutive sync failures (threshold=%d)",
self.getUuid(), failCount, alarmThreshold));
SdnControllerCanonicalEvents.SdnControllerSyncAlarmData d =
new SdnControllerCanonicalEvents.SdnControllerSyncAlarmData();
d.setSdnControllerUuid(self.getUuid());
d.setConsecutiveFailCount(failCount);
d.setAlarmThreshold(alarmThreshold);
evtf.fire(SdnControllerCanonicalEvents.SDN_CONTROLLER_SYNC_ALARM_PATH, d);
}
}

public void changeSdnControllerStatus(SdnControllerStatus status) {
if (status == self.getStatus()) {
return;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,4 +26,12 @@ public interface SdnControllerFactory {
default SdnControllerDhcp getSdnControllerDhcp(String l2NetworkUuid) {return null;};

default FlowChain getSyncChain() {return FlowChainBuilder.newSimpleFlowChain();};

default boolean isSyncEnabled() { return false; }

default boolean isDryRun() { return false; }
Comment on lines +30 to +32
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== 查找 SdnControllerFactory 实现类 =="
rg -nP --type=java -C2 'class\s+\w+.*implements\s+[^{]*\bSdnControllerFactory\b'

echo
echo "== 查找 isSyncEnabled 覆写 =="
rg -nP --type=java -C2 '\bboolean\s+isSyncEnabled\s*\('

echo
echo "== 查找 isDryRun 覆写 =="
rg -nP --type=java -C2 '\bboolean\s+isDryRun\s*\('

Repository: MatheMatrix/zstack

Length of output: 8881


确认目标厂商实现需覆写 isSyncEnabled()

当前 SdnControllerFactory 的两个已识别实现类(SugonSdnControllerFactoryH3cVcfcSdnControllerFactory)均未覆写此方法,将采用默认的 false 返回值,导致同步功能被静默禁用。建议各实现类明确覆写 isSyncEnabled()isDryRun() 方法,避免意外行为。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerFactory.java`
around lines 30 - 32, SdnControllerFactory defines default methods
isSyncEnabled() and isDryRun() returning false, but SugonSdnControllerFactory
and H3cVcfcSdnControllerFactory do not override them, silently disabling
sync/dry-run; update each concrete factory (SugonSdnControllerFactory and
H3cVcfcSdnControllerFactory) to explicitly override isSyncEnabled() and
isDryRun() with the intended boolean values for that vendor (or compute them
from config), and ensure any callers use the overridden methods rather than
assuming the defaults.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment from shixin.ruan:

这两个类原来有sync功能吗?如果没有这个这个功能,本次修改也不需要增加功能

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "== Check SugonSdnControllerFactory for sync-related code =="
rg -nP --type=java -C3 'getSyncChain|isSyncEnabled|isDryRun|[Ss]ync' --iglob '*SugonSdnControllerFactory*'

echo
echo "== Check H3cVcfcSdnControllerFactory for sync-related code =="
rg -nP --type=java -C3 'getSyncChain|isSyncEnabled|isDryRun|[Ss]ync' --iglob '*H3cVcfcSdnControllerFactory*'

echo
echo "== Find which factory DOES override getSyncChain =="
rg -nP --type=java -C3 'getSyncChain\s*\('

Length of output: 4702


@MatheMatrix,shixin.ruan 的判断完全正确。

脚本确认 SugonSdnControllerFactoryH3cVcfcSdnControllerFactory 均无任何同步相关代码,也没有任何工厂类覆写了 getSyncChain()。这两个方法的默认值 false 正是用于表示"该厂商不支持同步",是合理的设计,无需在这两个实现类中显式覆写。

我之前的 review 评论有误,抱歉造成困扰,可以忽略该建议。

(∩`-´)⊃━☆゚.*・。゚

   (\_/)
  ( •_•)
  / >🙏  my bad!

✏️ Learnings added
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 3777
File: plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerFactory.java:30-32
Timestamp: 2026-04-17T02:23:21.142Z
Learning: In the ZStack SDN controller plugin, `SdnControllerFactory.isSyncEnabled()` and `isDryRun()` are intentional default methods returning `false`. Concrete factory implementations (e.g., SugonSdnControllerFactory, H3cVcfcSdnControllerFactory) that do NOT have sync functionality should NOT override these methods — the default `false` correctly signals that sync is disabled for that vendor. Only factories that actually implement the sync chain need to override these methods.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2419
File: plugin/sdnController/src/main/java/org/zstack/sdnController/h3cVcfc/H3cVcfcSdnController.java:568-570
Timestamp: 2025-08-13T02:40:34.873Z
Learning: 在ZStack的H3C SDN控制器实现中,getDeleteH3cNetworksCmd()方法需要保持private访问修饰符,因为它是非overwrite方法,改为protected可能导致H3cVcfcSdnController和H3cVcfcV2SdnController之间的意外继承行为。

Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2419
File: plugin/sdnController/src/main/java/org/zstack/sdnController/h3cVcfc/H3cVcfcSdnController.java:560-562
Timestamp: 2025-08-13T02:41:52.024Z
Learning: 在ZStack的H3C SDN控制器实现中,getCreateH3cNetworksRspClass()方法需要保持private访问修饰符,因为它是非overwrite方法,改为protected可能导致H3cVcfcSdnController和H3cVcfcV2SdnController之间的意外继承行为。

Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2419
File: plugin/sdnController/src/main/java/org/zstack/sdnController/h3cVcfc/H3cVcfcSdnController.java:556-558
Timestamp: 2025-08-13T02:42:07.774Z
Learning: 在ZStack的H3C SDN控制器实现中,getCreateH3cNetworksCmd()方法需要保持private访问修饰符,因为它是非overwrite方法,改为protected可能导致H3cVcfcSdnController和H3cVcfcV2SdnController之间的意外继承行为。

Learnt from: zstack-robot-2
Repo: MatheMatrix/zstack PR: 2419
File: plugin/sdnController/src/main/java/org/zstack/sdnController/h3cVcfc/H3cVcfcSdnController.java:552-554
Timestamp: 2025-08-12T03:30:52.097Z
Learning: 在ZStack的H3C SDN控制器实现中,getNetworkCmd()方法在H3cVcfcSdnController和H3cVcfcV2SdnController中各自有独立的实现,需要保持private访问修饰符以防止意外的继承/重写行为,不能为了保持一致性而改为protected。

Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 0
File: :0-0
Timestamp: 2026-04-02T01:40:40.916Z
Learning: 在 ZStack 中,资源创建成功路径(如 `handle(APIAddSdnControllerMsg)` 中)对 `tagMgr.createTagsFromAPICreateMessage(msg, vo.getUuid(), SdnControllerVO.class.getSimpleName())` 和 `dbf.findByUuid(vo.getUuid(), SdnControllerVO.class)` 等后续操作,不需要用 try/catch 来防御"资源不存在"异常。ZStack 的设计哲学是:在资源创建流程中,通过生命周期管理和数据库外键约束来保证一致性,一般假设此类情况不会发生;若需要解决框架层面的一致性问题,应通过声明式 API 等机制统一处理,而不是在每个操作中无休止地判断异常。(plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerManagerImpl.java)

Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2419
File: plugin/sdnController/src/main/java/org/zstack/sdnController/h3cVcfc/H3cVcfcSdnController.java:540-542
Timestamp: 2025-08-12T03:30:08.976Z
Learning: 在ZStack的H3C VCFC SDN控制器实现中,getGetH3cTenantsRspClass()方法在H3cVcfcSdnController和H3cVcfcV2SdnController中各自有独立的实现,因此需要保持private访问修饰符以防止意外的继承/重写行为,不能为了保持一致性而改为protected。

Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 3563
File: plugin/kvm/src/main/java/org/zstack/kvm/efi/KvmSecureBootManager.java:328-334
Timestamp: 2026-03-23T09:33:01.215Z
Learning: In ZStack message handlers, ensure your `handleMessage` method is annotated with `MessageSafe` (`org.zstack.core.cloudbus.MessageSafe`). This enables `MessageSafeAspect.aj` to automatically convert exceptions thrown by the handler into appropriate reply errors. Note that `ExceptionSafeAspect.aj` only applies to methods annotated with `ExceptionSafe` and will only log/swallow exceptions without sending a reply—so missing `MessageSafe` can break reply/error semantics for the message bus.


default int getSyncFailAlarmThreshold() { return 3; }

default int getSyncPingSuccessThreshold() { return 10; }
}
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,8 @@ public class SdnControllerManagerImpl extends AbstractService implements SdnCont
private SecurityGroupManager sgMgr;
@Autowired
private SdnControllerPingTracker pingTracker;
@Autowired
private DirtySyncTracker dirtySyncTracker;

private Map<String, SdnControllerFactory> sdnControllerFactories = Collections.synchronizedMap(new HashMap<String, SdnControllerFactory>());

Expand Down Expand Up @@ -405,6 +407,8 @@ public void success() {

@Override
public void fail(ErrorCode errorCode) {
// Mark controller needsSync — port creation failed, ZNS may be out of sync
dirtySyncTracker.markNeedSync(e.getKey());
wcomp.addError(errorCode);
wcomp.allDone();
}
Expand Down Expand Up @@ -443,6 +447,8 @@ public void success() {

@Override
public void fail(ErrorCode errorCode) {
// Mark controller needsSync — port deletion failed, ZNS may be out of sync
dirtySyncTracker.markNeedSync(e.getKey());
wcomp.addError(errorCode);
wcomp.allDone();
}
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package org.zstack.sdnController;

import org.springframework.beans.factory.annotation.Autowired;
import org.zstack.core.cloudbus.CloudBusCallBack;
import org.zstack.core.cloudbus.ResourceDestinationMaker;
import org.zstack.core.componentloader.PluginRegistry;
import org.zstack.core.config.GlobalConfig;
Expand All @@ -27,6 +28,9 @@
import org.zstack.utils.logging.CLogger;

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/**
* Created by shixin on 06/26/2025.
Expand All @@ -43,6 +47,10 @@ public class SdnControllerPingTracker extends PingTracker implements
protected PluginRegistry pluginRgty;
@Autowired
protected SdnControllerManager sdnMgr;
@Autowired
private DirtySyncTracker dirtySyncTracker;

private final Map<String, AtomicInteger> consecutivePingSuccessCount = new ConcurrentHashMap<>();

public String getResourceName() {
return "sdn controller";
Expand Down Expand Up @@ -81,23 +89,59 @@ public void handleReply(final String resourceUuid, MessageReply reply) {
return;
}


if (!reply.isSuccess()) {
logger.warn(String.format("[SDN Ping Tracker]: unable to ping the sdn controller[uuid: %s], %s", resourceUuid, reply.getError()));
consecutivePingSuccessCount.remove(resourceUuid);
new SdnControllerBase(vo).changeSdnControllerStatus(SdnControllerStatus.Disconnected);
return;
}

SdnControllerStatus oldStatus = vo.getStatus();
if (oldStatus == SdnControllerStatus.Disconnected) {
// when reconnect successfully, it will fire event: SdnControllerStatus.Connected
consecutivePingSuccessCount.remove(resourceUuid);
ReconnectSdnControllerMsg msg = new ReconnectSdnControllerMsg();
msg.setControllerUuid(resourceUuid);
bus.makeTargetServiceIdByResourceUuid(msg, SdnControllerConstant.SERVICE_ID, resourceUuid);
bus.send(msg);
return;
}

// Ping success on Connected controller: count consecutive successes and trigger sync if needed
if (vo.getStatus() == SdnControllerStatus.Connected) {
int count = consecutivePingSuccessCount
.computeIfAbsent(resourceUuid, k -> new AtomicInteger(0))
.incrementAndGet();

SdnControllerFactory factory = sdnMgr.getSdnControllerFactory(vo.getVendorType());
if (factory.isSyncEnabled() && dirtySyncTracker.needsSync(resourceUuid)) {
int threshold = factory.getSyncPingSuccessThreshold();
if (count >= threshold) {
consecutivePingSuccessCount.remove(resourceUuid);
dirtySyncTracker.clearNeedSync(resourceUuid);
triggerSync(resourceUuid);
}
Comment on lines +117 to +123
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

clearNeedSync() 作为原子门闸,避免并发下重复触发全量同步

当前先 needsSync()clearNeedSync(),且忽略 clearNeedSync() 返回值,存在竞态窗口;并发回调下可能重复发送 SyncSdnControllerDataMsg。建议只在 clearNeedSync(...) == true 时触发同步。

建议修改
-            if (factory.isSyncEnabled() && dirtySyncTracker.needsSync(resourceUuid)) {
-                int threshold = factory.getSyncPingSuccessThreshold();
-                if (count >= threshold) {
-                    consecutivePingSuccessCount.remove(resourceUuid);
-                    dirtySyncTracker.clearNeedSync(resourceUuid);
-                    triggerSync(resourceUuid);
-                }
-            }
+            if (factory.isSyncEnabled()) {
+                int threshold = Math.max(1, factory.getSyncPingSuccessThreshold());
+                if (count >= threshold && dirtySyncTracker.clearNeedSync(resourceUuid)) {
+                    consecutivePingSuccessCount.remove(resourceUuid);
+                    triggerSync(resourceUuid);
+                }
+            }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (factory.isSyncEnabled() && dirtySyncTracker.needsSync(resourceUuid)) {
int threshold = factory.getSyncPingSuccessThreshold();
if (count >= threshold) {
consecutivePingSuccessCount.remove(resourceUuid);
dirtySyncTracker.clearNeedSync(resourceUuid);
triggerSync(resourceUuid);
}
if (factory.isSyncEnabled()) {
int threshold = Math.max(1, factory.getSyncPingSuccessThreshold());
if (count >= threshold && dirtySyncTracker.clearNeedSync(resourceUuid)) {
consecutivePingSuccessCount.remove(resourceUuid);
triggerSync(resourceUuid);
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugin/sdnController/src/main/java/org/zstack/sdnController/SdnControllerPingTracker.java`
around lines 117 - 123, 当前逻辑先调用 dirtySyncTracker.needsSync(resourceUuid) 再调用
dirtySyncTracker.clearNeedSync(resourceUuid) 并忽略其返回值,存在竞态导致重复触发
triggerSync(resourceUuid)。修改为以 clearNeedSync(...) 的返回值作为原子门闩:在
factory.isSyncEnabled() 且 count >= factory.getSyncPingSuccessThreshold() 时,调用
dirtySyncTracker.clearNeedSync(resourceUuid);只有当该调用返回 true 时才调用
consecutivePingSuccessCount.remove(resourceUuid) 并触发
triggerSync(resourceUuid);否则不做任何操作,从而避免并发下重复发送 SyncSdnControllerDataMsg。

}
}
}

private void triggerSync(String controllerUuid) {
logger.info(String.format("[SDN Ping Tracker]: triggering sync for controller[uuid:%s] after consecutive ping successes", controllerUuid));
SyncSdnControllerDataMsg msg = new SyncSdnControllerDataMsg();
msg.setControllerUuid(controllerUuid);
bus.makeTargetServiceIdByResourceUuid(msg, SdnControllerConstant.SERVICE_ID, controllerUuid);
bus.send(msg, new CloudBusCallBack(null) {
@Override
public void run(MessageReply reply) {
if (!reply.isSuccess()) {
logger.warn(String.format("[SDN Ping Tracker]: sync failed for controller[uuid:%s], re-marking needsSync: %s",
controllerUuid, reply.getError()));
dirtySyncTracker.markNeedSync(controllerUuid);
}
}
});
}

private void trackOurs() {
List<String> sdnControllerUuids = Q.New(SdnControllerVO.class)
.select(SdnControllerVO_.uuid).listValues();
Expand Down Expand Up @@ -154,6 +198,7 @@ protected void trackHook(String resourceUuid) {

@Override
public void deleteNetworkServiceOfSdnController(String sdnControllerUuid, Completion completion) {
consecutivePingSuccessCount.remove(sdnControllerUuid);
super.untrack(sdnControllerUuid);
completion.success();
}
Expand Down
Loading