Speed up and clean up acceptance tests #3248
Conversation
⚠️ **Warning: Rate limit exceeded.** Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 40 minutes and 20 seconds. After the wait time has elapsed, a review can be triggered manually. We recommend that you space out your commits to avoid hitting the rate limit. CodeRabbit enforces hourly rate limits for each developer per organization; paid plans have higher rate limits than the trial, open-source and free plans, and further reviews are re-allowed after a brief timeout. Please see the FAQ for further information.
📝 Walkthrough

This PR enhances the acceptance testing infrastructure by introducing cached Go dependency restoration in CI workflows, adding explicit CLI tool installation (tkn, kubectl), refactoring container-based test dependencies with proper lifecycle management, implementing file-based per-scenario logging with configurable output formats, and optimizing image building with caching and concurrent operations.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant Kind as Kind Cluster
    participant Registry as Image Registry
    participant Podman
    participant Docker
    Caller->>Kind: Start()
    Kind->>Kind: ensureConfigMapRBAC()
    Kind->>Kind: waitForDeployment(image-registry)
    par Concurrent Operations
        Kind->>Kind: buildCliImage()
        activate Kind
        Kind->>Kind: computeSourceHash()
        Kind->>Kind: check cache marker
        opt Cache miss
            Kind->>Docker: go build ec binary
            Kind->>Kind: locate/build kubectl
            Kind->>Podman: build with acceptance.Dockerfile
            Podman->>Registry: push image
            Kind->>Kind: write cache marker
        end
        deactivate Kind
    and
        Kind->>Kind: buildTaskBundleImage()
        activate Kind
        par Bundle Tasks
            Kind->>Docker: make task-bundle (concurrent)
            Kind->>Docker: make task-bundle (concurrent)
            Kind->>Docker: make task-bundle (concurrent)
        end
        deactivate Kind
    and
        Kind->>Kind: waitForDeployment(tekton-pipelines)
    end
    Kind->>Caller: return
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
acceptance/conftest/conftest.go (1)
95-115: ⚠️ Potential issue | 🟡 Minor

Debug output is now suppressed for non-execution failures.

Previously the deferred block emitted stdout/stderr whenever the function returned an error; it now only fires when `cmd.Run()` itself failed. Failures in the subsequent `ReadFile`/diff path (e.g., conftest ran but produced unexpected content) will no longer surface the command output, which can make such failures harder to debug. If that tradeoff is intentional (quieter passing runs, per-scenario log files elsewhere), ignore; otherwise, consider gating on the outer `error` or on `testenv.VerboseOutput` instead of `cmdErr`.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@acceptance/conftest/conftest.go` around lines 95-115: The deferred debug-print block only runs when cmdErr is non-nil and cmdErr is set by cmd.Run(), so errors that occur later (e.g., file reads/diff checks after cmd.Run()) no longer trigger stdout/stderr printing; change the gating so the defer prints when the function returns any non-nil err or when verbose output is enabled: capture the named return error (err) or check testenv.VerboseOutput(ctx) inside the defer (instead of only cmdErr) and use that to decide whether to print; refer to the defer func, cmdErr, cmd.Run(), and testenv.VerboseOutput to locate and modify the condition.

acceptance/wiremock/wiremock.go (1)
288-336: ⚠️ Potential issue | 🟠 Major

Consolidate After hooks: termination currently runs before the unmatched-requests check.

Two `sc.After` hooks are registered at lines 289 and 318. In godog v0.15.0, After hooks execute in LIFO order (last registered first), so the termination hook runs before the unmatched-requests check. This terminates the WireMock container before `UnmatchedRequests()` can call its admin API, causing the check to fail. Consolidate both into a single hook, running the unmatched-requests check before termination.
Suggested consolidation

```diff
-	sc.After(func(ctx context.Context, finished *godog.Scenario, scenarioErr error) (context.Context, error) {
-		if !IsRunning(ctx) {
-			return ctx, nil
-		}
-
-		w, err := wiremockFrom(ctx)
-		if err != nil {
-			// wiremock wasn't launched, we don't need to proceed
-			return ctx, err
-		}
-
-		unmatched, err := w.UnmatchedRequests()
-		if err != nil {
-			return ctx, err
-		}
-
-		if len(unmatched) == 0 {
-			return ctx, nil
-		}
-
-		logger, ctx := log.LoggerFor(ctx)
-		logger.Log("Found unmatched WireMock requests:")
-		for i, u := range unmatched {
-			logger.Logf("[%d]: %s", i, u)
-		}
-
-		return ctx, nil
-	})
-
-	sc.After(func(ctx context.Context, finished *godog.Scenario, scenarioErr error) (context.Context, error) {
-		if testenv.Persisted(ctx) {
-			return ctx, nil
-		}
-
-		if !testenv.HasState[wiremockState](ctx) {
-			return ctx, nil
-		}
-
-		state := testenv.FetchState[wiremockState](ctx)
-		if state.Container != nil {
-			if err := state.Container.Terminate(ctx); err != nil {
-				logger, _ := log.LoggerFor(ctx)
-				logger.Warnf("failed to terminate wiremock container: %v", err)
-			}
-		}
-
-		return ctx, nil
-	})
+	sc.After(func(ctx context.Context, finished *godog.Scenario, scenarioErr error) (context.Context, error) {
+		if !IsRunning(ctx) {
+			return ctx, nil
+		}
+
+		if w, err := wiremockFrom(ctx); err == nil {
+			if unmatched, err := w.UnmatchedRequests(); err == nil && len(unmatched) > 0 {
+				logger, _ := log.LoggerFor(ctx)
+				logger.Log("Found unmatched WireMock requests:")
+				for i, u := range unmatched {
+					logger.Logf("[%d]: %s", i, u)
+				}
+			}
+		}
+
+		if testenv.Persisted(ctx) {
+			return ctx, nil
+		}
+
+		state := testenv.FetchState[wiremockState](ctx)
+		if state.Container != nil {
+			if err := state.Container.Terminate(ctx); err != nil {
+				logger, _ := log.LoggerFor(ctx)
+				logger.Warnf("failed to terminate wiremock container: %v", err)
+			}
+		}
+
+		return ctx, nil
+	})
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/wiremock/wiremock.go` around lines 288 - 336, The two sc.After hooks in AddStepsTo run LIFO so the termination hook (which uses testenv.FetchState[wiremockState] and state.Container.Terminate) currently runs before the UnmatchedRequests check (wiremockFrom(ctx).UnmatchedRequests()), causing the admin API to be unreachable; consolidate them into a single sc.After hook that first calls wiremockFrom(ctx) and w.UnmatchedRequests() (logging as currently done) and only after processing unmatched requests performs the testenv.HasState[wiremockState] / testenv.FetchState[wiremockState] termination logic (including the existing logger.Warnf on Terminate error), preserving the same early returns and error handling.
🧹 Nitpick comments (5)
acceptance/registry/registry.go (1)
330-348: Terminate uses the scenario's context, which may already be cancelled.

If godog cancels `ctx` on scenario failure/timeout, `state.Container.Terminate(ctx)` can fail immediately and the container may leak (Ryuk still reaps it, but termination becomes best-effort). Consider using `context.WithTimeout(context.Background(), ...)` for teardown to ensure clean shutdown independent of scenario cancellation. The same pattern likely applies to the analogous After hooks added in `acceptance/git/git.go` and `acceptance/wiremock/wiremock.go`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/registry/registry.go` around lines 330-348: The After hook uses the scenario's ctx which may be cancelled; change the teardown to call state.Container.Terminate with a fresh context that has a timeout (e.g., context.WithTimeout(context.Background(), <reasonableDuration>)) so termination runs independent of scenario cancellation; create the cancel func and defer cancel, call state.Container.Terminate(newCtx) and log errors as before. Apply the same pattern to the equivalent After hooks referencing registryState/state.Container.Terminate in acceptance/git/git.go and acceptance/wiremock/wiremock.go.

acceptance/kubernetes/kind/acceptance.Dockerfile (1)
22-22: Consider separating `microdnf upgrade` from install and cleaning caches.

Chaining `upgrade` and `install` works, but running them as separate steps (and ending with `microdnf clean all`) is the idiomatic UBI pattern and produces slightly smaller images. Minor, non-blocking.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@acceptance/kubernetes/kind/acceptance.Dockerfile` at line 22: Split the combined microdnf command into two RUN steps: first run "microdnf upgrade --assumeyes --nodocs --setopt=keepcache=0 --refresh" and then run "microdnf -y --nodocs --setopt=keepcache=0 install gzip jq ca-certificates", and finally add a "microdnf clean all" (or equivalent) at the end of the install step to clear caches; update the Dockerfile commands around the existing microdnf upgrade and install invocations so they run in separate layers and include the clean step to reduce image size.

acceptance/log/log.go (1)
165-168: Panic on temp-file creation failure is harsh but acceptable; consider returning an error instead for testability.

`panic` terminates the entire test binary on a transient filesystem issue. Returning an error (and changing the signature) would let callers degrade gracefully or retry.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@acceptance/log/log.go` around lines 165-168: Replace the panic on os.CreateTemp failure with error propagation: change the enclosing function (the one that calls os.CreateTemp(fmt.Sprintf("scenario-%010d-*.log", id))) to return an error (and the temp file or nil) instead of panicking, return the wrapped error when os.CreateTemp fails, and update all callers to handle/propagate the error; specifically modify the function signature to include error return, replace panic(fmt.Sprintf(..., err)) with returning fmt.Errorf("failed to create scenario log file: %w", err) (or equivalent), and adjust callers/tests accordingly.

.github/workflows/checks-codecov.yaml (1)
128-139: Fragile grep parsing may pick the wrong version if multiple matching lines exist.

`grep 'tektoncd/cli' tools/go.mod` and `grep 'k8s.io/kubernetes' tools/kubectl/go.mod` match any line (including transitive/indirect entries) containing the substring. If `go.mod` ever grows additional matching entries, `awk '{print $2}'` on multi-line output will break silently. Prefer `go list -m -modfile=... <module>` or a more precise anchored regex.

```diff
-          TKN_VERSION=$(grep 'tektoncd/cli' tools/go.mod | awk '{print $2}' | sed 's/^v//')
+          TKN_VERSION=$(go list -m -modfile=tools/go.mod -f '{{.Version}}' github.com/tektoncd/cli | sed 's/^v//')
 ...
-          KUBECTL_VERSION=$(grep 'k8s.io/kubernetes' tools/kubectl/go.mod | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+')
+          KUBECTL_VERSION=$(go list -m -modfile=tools/kubectl/go.mod -f '{{.Version}}' k8s.io/kubernetes)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/checks-codecov.yaml around lines 128-139: The current TKN_VERSION and KUBECTL_VERSION extraction using grep (the lines that set TKN_VERSION=$(grep 'tektoncd/cli' tools/go.mod ...) and KUBECTL_VERSION=$(grep 'k8s.io/kubernetes' tools/kubectl/go.mod ...)) is fragile; replace those grep/awk commands with precise module resolution using go list -m -modfile=... (e.g., run go list -m -modfile=tools/go.mod tektoncd/cli and go list -m -modfile=tools/kubectl/go.mod k8s.io/kubernetes) to get exact versions, assign their stdout to TKN_VERSION and KUBECTL_VERSION, strip the leading "v" from TKN_VERSION if needed, and keep the existing curl/tar and curl/chmod steps unchanged; add a simple check to fail early if the go list command returns empty.

Makefile (1)
137-140: Minor: `test_passed` preserves exit status, but a failure in `go tool covdata textfmt` on line 139 will short-circuit before the final `[ ... ]` check.

With `.SHELLFLAGS=-e -c`, if `go tool covdata textfmt` exits non-zero (e.g., no coverage files produced after test failure), the recipe will fail with that error and you'll lose the `test_passed` signal. Consider making the covdata step tolerant of missing data or placing the final check before it.

```diff
 	cd acceptance && go test -timeout $(ACCEPTANCE_TIMEOUT) ./... && test_passed=1 || test_passed=0; \
 	echo "[`date '+%H:%M:%S'`] Tests finished in $$((SECONDS/60))m$$((SECONDS%60))s"; \
-	go tool covdata textfmt -i=$${GOCOVERDIR} -o="$(ROOT_DIR)/coverage-acceptance.out"; \
+	go tool covdata textfmt -i=$${GOCOVERDIR} -o="$(ROOT_DIR)/coverage-acceptance.out" || true; \
 	[ "$$test_passed" = "1" ]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 137 - 140, The recipe currently sets test_passed based on the `go test` result but then runs `go tool covdata textfmt` which can abort the shell (because of `.SHELLFLAGS=-e -c`) and lose the `test_passed` signal; update the Makefile so the covdata step cannot fail the recipe (e.g., make the `go tool covdata textfmt -i=${GOCOVERDIR} -o="${ROOT_DIR}/coverage-acceptance.out"` invocation tolerant of missing data by appending a conditional fallback such as `|| true` or otherwise swallowing non-zero exits) or move the final `[ "$$test_passed" = "1" ]` check to run before the covdata line; reference the `test_passed` variable, the `go tool covdata textfmt` invocation, and the `GOCOVERDIR`/`ROOT_DIR` variables when applying the fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acceptance/acceptance_test.go`:
- Around line 235-241: The current TestMain uses klog.LogToStderr(false) and
klog.SetOutput(io.Discard) unconditionally which hides all k8s client warnings;
change TestMain so klog is silenced only when tests are not verbose (use
testing.Verbose() or a -verbose flag): if not verbose, call
klog.LogToStderr(false) and klog.SetOutput(io.Discard); otherwise restore normal
output (e.g., klog.LogToStderr(true) and klog.SetOutput(os.Stderr)) or wire klog
into the per-scenario logger used by your tests; update imports accordingly and
keep references to TestMain, klog.LogToStderr, klog.SetOutput, io.Discard (and
os.Stderr/testing.Verbose) so the behavior is reversible for debugging.
In `@acceptance/kubernetes/kind/image.go`:
- Around line 98-112: The code currently copies any kubectl found via
exec.LookPath into kubectlBinary which can lead to client/server version drift;
change the logic in this block (symbols: kubectlBinary, exec.LookPath, copyFile,
kubectlBuildCmd, tools/kubectl/go.mod) to prefer building the pinned kubectl
from tools/kubectl/go.mod by default and only use the PATH binary when an
explicit opt-in (env var or config flag like USE_HOST_KUBECTL) is set; if opt-in
is allowed, detect the host kubectl version (invoke kubectl version --client
--short) and log that version before copying, or compare it to the pinned
version and warn/error on mismatch to make cache/CI differences traceable.
- Around line 54-66: In buildCliImage, the cache file path currently uses a
fixed /tmp/ec-cli-image-cache-<port>.hash which can collide across users and
repos; change the cacheFile construction in computeSourceHash logic to use a
user-scoped cache directory (os.UserCacheDir(), falling back to os.TempDir())
and include a repo-specific and user-unique component (e.g., the current working
directory or repo root hash and UID) plus k.registryPort to form the filename so
cache files are isolated per user/checkout; update references to cacheFile in
buildCliImage to use that new path.
---
Outside diff comments:
In `@acceptance/conftest/conftest.go`:
- Around line 95-115: The deferred debug-print block only runs when cmdErr is
non-nil and cmdErr is set by cmd.Run(), so errors that occur later (e.g., file
reads/diff checks after cmd.Run()) no longer trigger stdout/stderr printing;
change the gating so the defer prints when the function returns any non-nil err
or when verbose output is enabled: capture the named return error (err) or check
testenv.VerboseOutput(ctx) inside the defer (instead of only cmdErr) and use
that to decide whether to print; refer to the defer func, cmdErr, cmd.Run(), and
testenv.VerboseOutput to locate and modify the condition.
In `@acceptance/wiremock/wiremock.go`:
- Around line 288-336: The two sc.After hooks in AddStepsTo run LIFO so the
termination hook (which uses testenv.FetchState[wiremockState] and
state.Container.Terminate) currently runs before the UnmatchedRequests check
(wiremockFrom(ctx).UnmatchedRequests()), causing the admin API to be
unreachable; consolidate them into a single sc.After hook that first calls
wiremockFrom(ctx) and w.UnmatchedRequests() (logging as currently done) and only
after processing unmatched requests performs the testenv.HasState[wiremockState]
/ testenv.FetchState[wiremockState] termination logic (including the existing
logger.Warnf on Terminate error), preserving the same early returns and error
handling.
---
Nitpick comments:
In @.github/workflows/checks-codecov.yaml:
- Around line 128-139: The current TKN_VERSION and KUBECTL_VERSION extraction
using grep (the lines that set TKN_VERSION=$(grep 'tektoncd/cli' tools/go.mod
...) and KUBECTL_VERSION=$(grep 'k8s.io/kubernetes' tools/kubectl/go.mod ...) )
is fragile; replace those grep/awk commands with precise module resolution using
go list -m -modfile=... (e.g., run go list -m -modfile=tools/go.mod tektoncd/cli
and go list -m -modfile=tools/kubectl/go.mod k8s.io/kubernetes) to get exact
versions, assign their stdout to TKN_VERSION and KUBECTL_VERSION, strip the
leading "v" from TKN_VERSION if needed, and keep the existing curl/tar and
curl/chmod steps unchanged; add a simple check to fail early if the go list
command returns empty.
In `@acceptance/kubernetes/kind/acceptance.Dockerfile`:
- Line 22: Split the combined microdnf command into two RUN steps: first run
"microdnf upgrade --assumeyes --nodocs --setopt=keepcache=0 --refresh" and then
run "microdnf -y --nodocs --setopt=keepcache=0 install gzip jq ca-certificates",
and finally add a "microdnf clean all" (or equivalent) at the end of the install
step to clear caches; update the Dockerfile commands around the existing
microdnf upgrade and install invocations so they run in separate layers and
include the clean step to reduce image size.
In `@acceptance/log/log.go`:
- Around line 165-168: Replace the panic on os.CreateTemp failure with error
propagation: change the enclosing function (the one that calls
os.CreateTemp(fmt.Sprintf("scenario-%010d-*.log", id))) to return an error (and
the temp file or nil) instead of panicking, return the wrapped error when
os.CreateTemp fails, and update all callers to handle/propagate the error;
specifically modify the function signature to include error return, replace
panic(fmt.Sprintf(..., err)) with returning fmt.Errorf("failed to create
scenario log file: %w", err) (or equivalent), and adjust callers/tests
accordingly.
In `@acceptance/registry/registry.go`:
- Around line 330-348: The After hook uses the scenario's ctx which may be
cancelled; change the teardown to call state.Container.Terminate with a fresh
context that has a timeout (e.g., context.WithTimeout(context.Background(),
<reasonableDuration>)) so termination runs independent of scenario cancellation;
create the cancel func and defer cancel, call state.Container.Terminate(newCtx)
and log errors as before. Apply the same pattern to the equivalent After hooks
referencing registryState/state.Container.Terminate in acceptance/git/git.go and
acceptance/wiremock/wiremock.go.
In `@Makefile`:
- Around line 137-140: The recipe currently sets test_passed based on the `go
test` result but then runs `go tool covdata textfmt` which can abort the shell
(because of `.SHELLFLAGS=-e -c`) and lose the `test_passed` signal; update the
Makefile so the covdata step cannot fail the recipe (e.g., make the `go tool
covdata textfmt -i=${GOCOVERDIR} -o="${ROOT_DIR}/coverage-acceptance.out"`
invocation tolerant of missing data by appending a conditional fallback such as
`|| true` or otherwise swallowing non-zero exits) or move the final `[
"$$test_passed" = "1" ]` check to run before the covdata line; reference the
`test_passed` variable, the `go tool covdata textfmt` invocation, and the
`GOCOVERDIR`/`ROOT_DIR` variables when applying the fix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 20300c2d-e7e2-4dbb-85de-0d9a7a817b75
⛔ Files ignored due to path filters (1)
`features/__snapshots__/validate_input.snap` is excluded by `!**/*.snap`
📒 Files selected for processing (18)
- .github/workflows/checks-codecov.yaml
- Makefile
- acceptance/acceptance_test.go
- acceptance/cli/cli.go
- acceptance/conftest/conftest.go
- acceptance/git/git.go
- acceptance/go.mod
- acceptance/kubernetes/kind/acceptance.Dockerfile
- acceptance/kubernetes/kind/image.go
- acceptance/kubernetes/kind/kind.go
- acceptance/kubernetes/kind/kubernetes.go
- acceptance/log/log.go
- acceptance/log/log_test.go
- acceptance/registry/registry.go
- acceptance/testenv/testenv.go
- acceptance/wiremock/wiremock.go
- features/validate_input.feature
- hack/ubi-base-image-bump.sh
All 285 containers accumulated during the full test run with zero cleanup, wasting memory and adding Docker daemon overhead. Each container type (registry, git, wiremock) now stores the testcontainers.Container reference and calls Terminate(ctx) in the sc.After hook, respecting the persist flag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
Each scenario now writes to its own temp file, which eliminates log interleaving across parallel goroutines. Remove shouldSuppress(), which incorrectly filtered Error-level messages, and remove /dev/tty writes that failed silently in CI. Replace os.Exit(1) with t.Fatalf() so TestMain cleanup runs. Print failed scenario log file paths in the test summary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
Instead of running make push-image (full multi-stage Dockerfile that compiles Go inside the container), build the ec and kubectl binaries locally with the host Go cache, then inject them into a minimal ubi-minimal base image via acceptance.Dockerfile. The old approach compiled Go twice: once on the host, once in the container. Skipping the in-container build significantly reduces CI build time. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
The make push-image step was one of the slowest steps in CI even when source had not changed. Compute a SHA-256 hash of all build inputs (Go source, go.mod, go.sum, Dockerfile, build.sh, Makefile, hack/reduce-snapshot.sh) and compare it against a per-registry-port cache marker file. When the hash matches, skip the build entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
The sequential per-version task bundle loop was the slowest step in CI, much slower than the same step locally. Each version produces an independent bundle image, so the builds are safe to run concurrently. Use errgroup to propagate the first error and cancel remaining builds via context. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
After applying cluster resources, wait only for the in-cluster registry before starting image builds. The CLI image build and task bundle build now run concurrently with each other and with the Tekton Pipelines deployment, since they only need the registry to push to. Significantly reduces CI setup time. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
The acceptance job's Go build cache was broken: it used actions/cache/restore (read-only) with path '**' and key 'main', which did not match the real cache paths (~/.cache/go-build, ~/go/pkg/mod). Switch to read-write actions/cache targeting the correct directories with a content-based key from go.sum. Download pre-built tkn and kubectl binaries (versions extracted from tools/go.mod and tools/kubectl/go.mod) instead of compiling from source every run. The Makefile and image.go prefer binaries from PATH when available, falling back to go build locally. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
Print the test suite start time so the user can estimate when it will finish, and output total elapsed time at the end. The shell runs with -e, so a go test failure would abort before the timing echo. Capture the exit code and re-propagate it after printing duration and collecting coverage so timing info survives test failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
Switch godog formatter from "pretty" to "progress" (overridable via -format flag or EC_ACCEPTANCE_FORMAT env var), with output suppressed by default via progress:/dev/null. Gate logExecution and conftest debug output on scenario or command failure so passing scenarios stay silent. Route snapshot artifact debug prints through the per-scenario file logger instead of stdout. Suppress k8s client-side throttling warnings by disabling klog logtostderr. Move failed scenario summaries and the profiling report to TestMain so they appear after all go test output. Gate verbose execution details behind a -verbose flag and use diff for stderr assertions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
Remove CPU resource requests from task bundle steps in the acceptance tests to eliminate the Tekton scheduling waterfall. Each TaskRun pod requested 2600m CPU, limiting concurrent pods to 1-2 on CI (3.5 allocatable CPUs), which effectively serialized 26 Kind scenarios. Without the requests all pods schedule immediately. Set imagePullPolicy: IfNotPresent on CLI image steps as a defensive measure. Parallelize namespaced resource application in applyResources by applying cluster-scoped resources first, then namespaced resources concurrently via errgroup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
The ensureConfigMapRBAC() call lived only in CreateConfigMap, so scenarios that never create a ConfigMap (like "Collect keyless signing parameters when the namespace does not exist") relied on another scenario running first to set up the ClusterRole and ClusterRoleBinding. Parallel execution broke this assumption. Move the call into Start() right after applyResources, so RBAC is in place before any scenario runs. The call in CreateConfigMap stays as an idempotent guard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://issues.redhat.com/browse/EC-1710
Two scenarios in validate_input.feature both write policy.yaml
and input.json to the shared repo root. When godog schedules
them concurrently, one overwrites the other's files before ec
reads them, causing spurious exit code mismatches.
Write per-scenario files to ${TMPDIR} instead, which is unique
per scenario and avoids the collision.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ref: https://issues.redhat.com/browse/EC-1710
Force-pushed from 4daaa9f to b758c00 (Compare)
Went through the CodeRabbit review and addressed two findings:
For the rest of the CodeRabbit suggestions, here's why we're leaving them as-is:
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Flags with carried forward coverage won't be shown. Click here to find out more.
There's a CVE.
The acceptance tests were slow in CI and hard to debug when
they failed. Most of the wall-clock time went into sequential
image builds and a setup phase that blocked on each step
finishing before the next could start. On top of that, test
output was noisy and interleaved across parallel goroutines,
making failure diagnosis a chore.
The setup phase now overlaps image builds with the Tekton
deployment, builds task bundles in parallel, and skips the
CLI image build entirely when source hasn't changed. The old
multi-stage Docker build has been replaced with a host-side
binary injection into a minimal base image, and CI downloads
pre-built tool binaries instead of compiling them from source.
Kind scenario scheduling no longer bottlenecks on CPU resource
requests that serialized pod creation.
Logging has been reworked so each scenario writes to its own
file, eliminating cross-goroutine interleaving. Passing runs
are now quiet by default, and failed scenario summaries with
log file paths are printed at the end. Test containers are
properly terminated after each scenario instead of
accumulating for the entire run.
The increased parallelism exposed a couple of latent race
conditions: two validate_input scenarios were writing to the
same files in the repo root, and ConfigMap RBAC setup depended
on scenario execution order. Both have been fixed so tests
run reliably regardless of scheduling.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ref: https://issues.redhat.com/browse/EC-1710