Skip to content

test(framework): fix test quality issues#6655

Open
Little-Peony wants to merge 25 commits intotronprotocol:developfrom
Little-Peony:fix_tests_v2
Open

test(framework): fix test quality issues#6655
Little-Peony wants to merge 25 commits intotronprotocol:developfrom
Little-Peony:fix_tests_v2

Conversation

@Little-Peony
Copy link
Copy Markdown

What does this PR do?

Systematically fix five categories of test quality issues across framework, actuator, and plugins modules (44 files, 6 commits):

1. Silent-pass tests

  • Add Assert.fail() after code expected to throw exceptions (FetchInvDataMsgHandlerTest, MerkleTreeTest, CredentialsTest, etc.)
  • Fix assertEquals(actual, expected) argument order → assertEquals(expected, actual)

2. Thread / resource leaks

  • Replace bare Executors.newFixedThreadPool() / new Thread[] with ExecutorServiceManager in LogBlockQueryTest, TransactionRegisterTest, ConcurrentHashMapTest; add shutdownAndAwaitTermination
  • Fix InterruptedException swallowed silently → restore interrupt flag in PropUtilTest, DBIteratorTest, SnapshotImplTest, ProposalControllerTest, BlockStoreTest, PbftTest
  • Fix SolidityNode not shutting down correctly in SolidityNodeTest; patch SolidityNode.java shutdown path

3. Dead code / phantom fixtures

  • Remove @After methods cleaning paths never written by any test (DbMoveTest)
  • Remove never-called private static helpers (writeProperty(), deleteDir()) in ArchiveManifestTest, DbArchiveTest
  • Remove unreachable test code in FreezeTest, EventParserTest, RelayServiceTest, etc.

4. Stale / obsolete tests

  • Remove moreThanFrozenNumber test in FreezeBalanceActuatorTest — the corresponding actuator error path no longer exists

5. Test infrastructure unification

  • Extract ClassLevelAppContextFixture (with shutdownChannel / shutdownChannels helpers) to unify AppContext and gRPC channel lifecycle across service tests
  • Migrate CredentialsTest from JUnit 3 (junit.framework.TestCase) to JUnit 4 + Mockito; remove dependency on real SecureRandom
  • Fix package typo: keystroekeystore
  • Add Mockito.verify() assertions in InventoryMsgHandlerTest, PeerStatusCheckMockTest

Why are these changes required?

These issues caused tests to pass silently even when the code under test was broken, leaked threads between test runs (causing flakiness in parallel execution), and kept dead code that obscured the real test intent.

This PR has been tested by:

  • All changed tests pass locally (./gradlew :framework:test :actuator:test :plugins:test)

Breaking Changes

None. Test-only changes except SolidityNode.java shutdown fix (additive only).

@Sunny6889 Sunny6889 changed the title test: fix test quality issues in framework, actuator and plugins test(framework): fix test quality issues Apr 8, 2026
Comment thread framework/src/test/java/org/tron/core/db/AccountAssetStoreTest.java Outdated
@Sunny6889 Sunny6889 requested a review from 3for April 8, 2026 07:53
Comment thread framework/src/test/java/org/tron/core/db/DBIteratorTest.java
Comment thread plugins/src/test/java/org/tron/plugins/DbMoveTest.java
Comment thread framework/src/main/java/org/tron/program/SolidityNode.java Outdated
Little-Peony and others added 15 commits April 13, 2026 13:11
- FetchInvDataMsgHandlerTest: add Assert.fail() before catch blocks so
  tests fail when expected exceptions are not thrown; fix assertEquals
  argument order (expected, actual)
- ConcurrentHashMapTest: restore commented-out Assert.fail() for
  ItemNotFoundException; replace printStackTrace() with interrupt flag
  restoration and proper logging; fail main thread on interrupted join
- InventoryMsgHandlerTest: add Mockito.verify() to assert
  isBlockUnsolidified() is called in the unsolidified-block scenario
- PeerStatusCheckMockTest: add Mockito.verify() to assert statusCheck()
  is invoked by the scheduler after init()
- MerkleTreeTest: replace Assert.assertFalse(true) with Assert.fail()
  for clearer failure message

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- DbLiteTest: remove dead null-dbPath reassignment before init() call;
  restore thread interrupt flag on InterruptedException in
  generateSomeTransactions()
- DbMoveTest: remove legacy @after destroy() that cleaned a hardcoded
  path never written by any test; remove dead deleteDir() helper and
  OUTPUT_DIRECTORY constant; remove now-unused After import
- DbTest: narrow INPUT_DIRECTORY and cli fields from public to protected
- ArchiveManifestTest, DbArchiveTest: remove copy-pasted writeProperty()
  and deleteDir() private static methods that were never called; clean
  up unused imports
- DbLiteRocksDbV2Test: rename testToolsWithRocksDB to
  testToolsWithRocksDbV2 to match the LevelDB v1/v2 naming convention

Note: testMvForRocksDB is a pre-existing failure unrelated to these changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lanceActuatorTest

The method tested a "max frozen number is: X" error that no longer exists
in FreezeBalanceActuator. The actuator was updated to check
"frozenCount must be 0 or 1" instead, and a second freeze with
frozenCount=1 now succeeds rather than throwing. Remove the now-unused
ChainConstant import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nd fix test quality issues

- Replace Executors.newFixedThreadPool / bare new Thread[] with
  ExecutorServiceManager in LogBlockQueryTest, TransactionRegisterTest,
  ConcurrentHashMapTest; add proper shutdownAndAwaitTermination
- Fix InterruptedException handling: restore interrupt flag instead of
  swallowing in PropUtilTest, DBIteratorTest, SnapshotImplTest,
  ProposalControllerTest, BlockStoreTest, PbftTest
- Remove dead/unreachable test code in FreezeTest, EventParserTest,
  EventParserJsonTest, BlockFilterCapsuleTest, MessageTest,
  RelayServiceTest, DelegationServiceTest
- Fix package typo: rename keystroe.CredentialsTest → keystore.CredentialsTest;
  rewrite test using JUnit Assert and Mockito instead of junit.framework.TestCase
- Minor cleanup: unused imports, whitespace, AccountAssetStoreTest assertion style

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move to local .git/info/exclude instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ityNodeTest

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- BandWidthRuntimeTest: deploy contract before setDebug(true) to avoid
  CPU timeout triggered by debug mode during contract deployment
- TronStatsManagerTest: capture P2pStats snapshot before invoking work()
  and compare against snapshot instead of hardcoded 0, preventing
  failures when another test starts the real P2pService with non-zero stats
- DbMoveTest: add @after cleanup of output-directory-toolkit so LevelDB
  and RocksDB tests don't share the same relative destination path
- PeerStatusCheckMockTest: replace 5-second sleep with mock executor;
  verify scheduleWithFixedDelay args and run the captured Runnable directly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ection

Extract relayNodes from a local variable inside setChannel() to a lazy-
initialized instance field. setChannel() now only fetches from Args when
the field is still null, so tests can pre-inject a custom list via
reflection before calling setChannel(). This restores testability of
PeerManagerTest without touching the test itself.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use BigInteger for memory size computation to improve precision and add unit tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
getBlockByNum() and getLastSolidityBlockNum() used while(true) which
ignores the shutdown flag. If either method is blocked retrying on
network errors, shutdown() cannot interrupt the thread, causing
ExecutorServiceManager.shutdownAndAwaitTermination to time out.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread framework/src/test/java/org/tron/common/utils/PropUtilTest.java Outdated
halibobo1205 and others added 3 commits April 16, 2026 16:55
Fix incorrect protobuf types used in any.unpack() within getOwnerAddress()
for three actuators:
- UpdateAssetActuator: AccountUpdateContract -> UpdateAssetContract
- MarketCancelOrderActuator: AssetIssueContract -> MarketCancelOrderContract
- MarketSellAssetActuator: AssetIssueContract -> MarketSellAssetContract

Also remove unused imports and add unit tests for getOwnerAddress().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…#6631)

* feat(protocol): add protoLint script for enum validation

* feat(protocol): optimize protoLint performance and caching

* fix(proto): resolve gradle implicit task dependency warning

* docs(protocol): clarify enum discriminator in protoLint

* build(protocol): harden proto lint buf config

* docs: update comment to reflect dynamic include path derivation
…thods

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

// Reset global static flag that other tests may leave as true, which would prevent
// ConfigLoader.load() from updating VMConfig during VMActuator.execute().
ConfigLoader.disable = false;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[NIT] EventPluginLoader.filterQuery is subject to the same cross-test pollution and is not reset here. FilterQueryTest.testMatchFilter() sets filterQuery = {fromBlock=100, toBlock=190} with no @After cleanup. When the two tests share the same Gradle forkEvery JVM batch, matchFilter() returns false for block 1, and processTrigger gets 0 coverage even though all assertions pass.

Consider adding alongside the existing reset:

EventPluginLoader.getInstance().setFilterQuery(null);

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c8e7ef5. Added EventPluginLoader.getInstance().setFilterQuery(null) to @Before alongside the existing ConfigLoader.disable = false reset.

scheduledTask.run();

Mockito.verify(peerStatusCheck).statusCheck();
Assert.assertNotNull(peerStatusCheck);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[NIT] Assert.assertNotNull(peerStatusCheck) is a dead assertion — peerStatusCheck is a local variable created at the top of the test and never reassigned, so it is trivially non-null. PeerStatusCheck.init() wraps statusCheck() in a try-catch, so scheduledTask.run() always returns normally and this line is always reached. Was this assertion intended to verify something specific, or can it be removed?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c8e7ef5. Removed the dead assertion and the now-unused import org.junit.Assert.


public void shutdown() {
flag = false;
ExecutorServiceManager.shutdownAndAwaitTermination(getBlockEs, "solid-get-block");
Copy link
Copy Markdown
Collaborator

@bladehan1 bladehan1 Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: @halibobo1205 #3077488830 — concrete findings on the shutdown-time verification you asked for:

[SHOULD] shutdown() worst-case latency can reach 240s

Both shutdownAndAwaitTermination calls are sequential, and each waits up to 60s + 60s (see common/src/main/java/org/tron/common/es/ExecutorServiceManager.java). If a worker is blocked inside databaseGrpcClient.getBlock(...) (no deadline set), Thread.interrupt() is not guaranteed to unblock the underlying Netty I/O, so the total shutdown time can reach 240s.

The current SolidityNodeTest uses a mocked node so CI is not affected, but any integration test or operational restart that runs the real worker will see this latency.

Suggestion: run the two shutdownAndAwaitTermination calls in parallel (the two pools are independent), and track a follow-up to add a gRPC deadline on DatabaseGrpcClient.getBlock.


@After
public void tearDown() {
if (subscriberExecutor != null) {
Copy link
Copy Markdown
Collaborator

@bladehan1 bladehan1 Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[SHOULD] Use the project's standard shutdown helper

Every other test in this PR routes executor shutdown through ExecutorServiceManager.shutdownAndAwaitTermination(...), but this @After calls subscriberExecutor.shutdownNow() + awaitTermination(2, SECONDS) directly. ZMQ's subscriber.recv() does not always respond to Thread.interrupt() within 2s, so the subscriber thread may leak across test runs and occasionally interfere with the next test.

Suggestion: replace the manual shutdown block with ExecutorServiceManager.shutdownAndAwaitTermination(subscriberExecutor, "zmq-subscriber") to align with the rest of the codebase.

logger.info("Backup server started, bind port {}", port);

// If close() was called while bind() was in progress, channel was not yet visible
if (shutdown) {
Copy link
Copy Markdown
Collaborator

@bladehan1 bladehan1 Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[SHOULD] The close()/start() race is only partially closed, and the remaining control flow is confusing.

  1. The new if (shutdown) break; at line 81 only covers a narrow window — the case where close() has already set shutdown=true but has not yet read channel. The other window is still open: if close() runs while b.bind(port).sync() is still in progress, close() sees the previous iteration's (already-closed) channel (or null) and its channel.close().await(10s) is a no-op on the channel that is about to be bound. The newly-bound channel is then closed indirectly by group.shutdownGracefully().sync() in start()'s finally, meaning there are two distinct close paths with two different timeout semantics (10s vs. group's default quiet period).

  2. Ordering in close(): backupManager.stop() runs before channel.close(). Any in-flight UDP datagram already picked up by the event loop will still be dispatched through the pipeline into a stopped BackupManager. The usual order is to stop ingress (channel) first, then business state (backupManager).

  3. channel.close().await(10, SECONDS) is effectively redundant — ExecutorServiceManager.shutdownAndAwaitTermination(executor, name) already waits for start() to return, and start() only returns after group.shutdownGracefully().sync() has closed the channel. Its only real job is to wake up the channel.closeFuture().sync() in start(); a fire-and-forget channel.close() would do that.

  4. The while (!shutdown) + logger.warn("Restart backup server ...") branch is practically dead code for a UDP NioDatagramChannel: closeFuture() only completes on close() (→ shutdown=true → break) or on an exception (→ outer catch → finally). There is no tested path that loops back and rebinds, so the three shutdown checks + restart loop read as defensive code that will never run.

Suggestion: drop the restart loop and extra if (shutdown) break; guards, call channel.close() (fire-and-forget) in close(), rely on ExecutorServiceManager.shutdownAndAwaitTermination(executor, name) as the single sync barrier, and stop backupManager only after the executor has terminated.

@Getter
private P2pRateLimiter p2pRateLimiter = new P2pRateLimiter();
@Getter
private List<InetSocketAddress> relayNodes;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[NIT] Three small follow-ups on the relayNodes rewrite

Moving relayNodes from a class-load-time static field to a lazy instance field is a meaningful behaviour change — it avoids capturing an empty Args.getInstance().getFastForwardNodes() before Spring wiring completes. The new shape is race-free today, but three small improvements would make the intent durable:

  1. Document why lazy. Without a one-line comment above the null-check, a future reader may "optimise" it back to a static or eager field and silently reintroduce the original empty-list bug. The if (relayNodes == null) guard also lets tests pre-inject a value via ReflectUtils.setFieldValue before setChannel runs; that contract is worth stating.

  2. Tighten publication. The safety argument is non-local: it relies on (a) PeerConnection being @Scope("prototype"), (b) PeerManager.add(...) being public static synchronized and being the only caller of setChannel, and (c) PeerManager.peers being a Collections.synchronizedList so that readers obtained via getPeers() inherit a lock release/acquire edge. If any of those change, the field is neither volatile nor final and @Getter already exposes it publicly, so a concurrent reader could observe a stale null with no data-race detector to warn about it. Prefer either volatile, a single unconditional assignment (no guard), or — best — final initialised at declaration / in the constructor, since the prototype bean is created after Spring finishes parsing Args.

  3. Narrow the getter. @Getter on a field that exists to support tests exposes it as a public API. @VisibleForTesting (from com.google.common.annotations) or @Getter(AccessLevel.PACKAGE) signals intent and keeps future changes to the field free of external consumers.

Suggestion: add a one-line comment above the null-check explaining the Args-init ordering + test-injection contract; make the field final (or at least volatile); and narrow the getter's visibility.

@@ -18,6 +22,7 @@

@Slf4j
public class ConcurrentHashMapTest {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[NIT] Make the test executor daemon

ExecutorServiceManager.newFixedThreadPool(EXECUTOR_NAME, 4) defaults to non-daemon. Normal runs are fine (finally shuts the pool down), but if the JVM exits abruptly, non-daemon workers can hold it open. Other project tests using this helper with short-lived test scope typically pass isDaemon=true.

Suggestion: switch to newFixedThreadPool(EXECUTOR_NAME, 4, true).


@Rule
public Timeout globalTimeout = Timeout.seconds(60);
public Timeout globalTimeout = Timeout.seconds(90);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[QUESTION] Why does this test need 90s?

The globalTimeout goes 60s → 90s and the method-level @Test(timeout=60_000) is removed. If this is driven by the 60s+60s wait inside ExecutorServiceManager.shutdownAndAwaitTermination, say so in a comment; otherwise the +30s is unexplained and future maintainers won't know how to tune it.

Suggestion: add a short comment explaining the new upper bound, or tighten the shutdown path so 60s still fits.

awaitShutdown(appT, node);
}

static void awaitShutdown(Application appT, SolidityNode node) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[NIT] Signal test-only visibility

awaitShutdown(Application, SolidityNode) is package-private purely to support the two new mock-based tests. Marking it @VisibleForTesting communicates that intent and discourages accidental external use.

Suggestion: add @VisibleForTesting (from com.google.common.annotations).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants