Conversation
…to avoid shadowing Co-authored-by: gkostkowski <12532923+gkostkowski@users.noreply.github.com>
Fix variable shadowing in pytest_configure hook
ci(sonarcloud): make SonarCloud scan conditional on SONAR_TOKEN
…ation-hardening ops: containerization hardening
…ytest config Merge of PR#10 (feature/ERE1-123): fixes variable shadowing in pytest_configure. Conflict resolved by preserving HEAD's TEST_RESOURCES_DIR path and encoding="utf-8".
Merge of PR#21 (feature/ERS1-124/code-formatting). Conflicts resolved: kept HEAD config paths (config/ not infra/config/) which were updated by the infra layout refactor after PR#21 branched.
Addresses the SonarQube quality gate failure from PR#21 (69% coverage on new code). New test modules: - test/unit/adapters/test_utils.py: message parsing (get_request/response_from_message) - test/unit/adapters/test_adapter_factories.py: build_rdf_mapper factory - test/unit/entrypoints/test_queue_worker.py: RedisQueueWorker with mocked Redis - test/unit/utils/test_logging.py: configure_logging and TRACE level - test/unit/services/test_services_factories.py: build_entity_resolver (in-memory + persistent DuckDB) - test/unit/test_models.py: MentionLink/ResolutionResult edge cases + app.main() failure paths Extended test files: - stubs.py: add StubRDFMapper and find_by_id to InMemoryMentionRepository - test_entity_resolution_service.py: EntityResolutionService process_request paths - test_duckdb_adapters.py: load_all and save_all([]) coverage Coverage: 61% → 85% (unit + BDD combined)
Renames LOG_LEVEL → ERE_LOG_LEVEL to match the ERSys unified .env.example, allowing integration tests to run against the shared ERSys infrastructure without any compose-level variable mapping. Changes: - src/ere/utils/logging.py: read ERE_LOG_LEVEL instead of LOG_LEVEL - src/ere/entrypoints/app.py: update env var name in docstring - demo/demo.py: read ERE_LOG_LEVEL instead of LOG_LEVEL - test/unit/utils/test_logging.py: update env var references - infra/.env.example: new file, ERE-relevant subset of ERSys .env.example - infra/compose.dev.yaml: remove LOG_LEVEL mapping (no longer needed)
AGENTS.md (and its CLAUDE.md mirror) now contains ERE-specific guidance: commits/PR rules, dev workflow, make targets reference, architecture rules, memory conventions, and gotchas. Replaces GitNexus-only boilerplate. Also aligns dev tooling: - Makefile: test-integration depends on check-env; test target sources .env - infra/.env.example: REDIS_HOST defaults to localhost for standalone dev - test/e2e/test_app.py: replace walrus operator with explicit env default
There was a problem hiding this comment.
Pull request overview
This PR upgrades the project to v0.4.0 and aligns configuration, infrastructure, and tests with the new layout and runtime expectations (notably moving configs to config/ and standardizing ERE_LOG_LEVEL).
Changes:
- Move configuration from
infra/config/to top-levelconfig/and update factories/docs accordingly. - Replace Docker Compose setup (
infra/docker-compose.yml+.env.local) withinfra/compose.dev.yaml+infra/.env.exampleand expand Makefile infra targets. - Add/expand unit tests around logging, factories, queue worker/message parsing, and service/model edge-cases; update existing tests for new behavior and formatting.
Reviewed changes
Copilot reviewed 56 out of 64 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| test/unit/utils/test_logging.py | New tests for configure_logging() and TRACE support. |
| test/unit/utils/init.py | Package marker for utils unit tests. |
| test/unit/test_models.py | New unit tests for model edge-cases and app.main() failure paths. |
| test/unit/services/test_services_factories.py | New tests for services.factories builder functions. |
| test/unit/services/test_entity_resolution_service.py | Extends resolver/service tests and adds request-processing coverage. |
| test/unit/entrypoints/test_queue_worker.py | New unit tests for Redis queue worker behavior and error handling. |
| test/unit/entrypoints/init.py | Package marker for entrypoints unit tests. |
| test/unit/adapters/test_utils.py | New tests for adapter message parsing utilities. |
| test/unit/adapters/test_duckdb_adapters.py | Extends DuckDB adapter coverage (incl. load_all/save_all edge cases). |
| test/unit/adapters/test_adapter_factories.py | New tests for RDF mapper factory. |
| test/unit/adapters/stubs.py | Adds stub RDF mapper and expands repository stubs for tests. |
| test/stress/stress_test.py | Updates stress runner default config path and formatting. |
| test/stress/README.md | Updates stress docs to new config path. |
| test/integration/test_redis_integration.py | Formatting improvements in integration test assertions/decoding. |
| test/integration/test_entity_resolver.py | Formatting improvements and readability tweaks. |
| test/features/steps/test_entity_resolution_algorithm_steps.py | Step formatting changes for readability. |
| test/features/steps/test_direct_service_resolution_steps.py | Step formatting changes; clarifies multi-line parsers/steps. |
| test/e2e/test_ere.py | Minor assertion formatting change. |
| test/e2e/test_app.py | Adjusts env setup for app e2e run (notably REDIS_PASSWORD). |
| test/conftest.py | Renames local logging config var; minor formatting. |
| src/ere/utils/logging.py | Switches env var to ERE_LOG_LEVEL for log-level configuration. |
| src/ere/services/resolver_config.py | Formatting for DuckDBConfig path field. |
| src/ere/services/factories.py | Default resolver config path updated to config/resolver.yaml. |
| src/ere/services/entity_resolution_service.py | Formatting and minor structural readability improvements. |
| src/ere/models/resolver/mention.py | Formatting for _from_flat_dict guard. |
| src/ere/models/exceptions.py | Formatting for ConflictError init signature. |
| src/ere/entrypoints/queue_worker.py | Formatting for Redis calls and error response builder signature. |
| src/ere/entrypoints/app.py | Updates docs/env var to ERE_LOG_LEVEL; minor formatting. |
| src/ere/adapters/utils.py | Formatting for supported request/response class registries. |
| src/ere/adapters/splink_linker_impl.py | Formatting/quoting consistency and readability changes. |
| src/ere/adapters/repositories.py | Formatting for resolver model imports. |
| src/ere/adapters/rdf_mapper.py | Simplifies ValueError formatting. |
| src/ere/adapters/rdf_mapper_impl.py | Default mapping path moved to config/; minor formatting. |
| src/ere/adapters/duckdb_repositories.py | Formatting for resolver model imports. |
| README.md | Updates paths/docs for new config/infra layout and Make targets. |
| pyproject.toml | Bumps version to 0.4.0, renames project, updates ers-spec dependency. |
| Makefile | Adds compose/env vars wiring and expands infra targets. |
| infra/README.md | New infra documentation for compose.dev workflow and env setup. |
| infra/Dockerfile | Switches to multi-stage build (builder/runtime) and copies config/. |
| infra/docker-compose.yml | Removed (replaced by infra/compose.dev.yaml). |
| infra/compose.dev.yaml | New dev compose stack (redis/redisinsight/ere) with compose watch. |
| infra/.env.local | Removed (replaced by infra/.env.example + infra/.env). |
| infra/.env.example | New env template for local dev. |
| docs/algorithm.md | Updates configuration link to config/README.md. |
| demo/README.md | Updates demo instructions for new infra workflow (make infra-*). |
| demo/demo.py | Updates env/logging variable usage and infra instructions. |
| config/resolver.yaml | New default resolver config under top-level config/. |
| config/resolver_multirule.yaml | Formatting normalization for YAML file. |
| config/resolver_compound.yaml | Formatting normalization for YAML file. |
| config/README.md | New configuration documentation (resolver + rdf mapping). |
| config/rdf_mapping.yaml | Formatting normalization for YAML file. |
| CLAUDE.md | Adds agent operating instructions / repo guidance. |
| CHANGELOG.md | Updates infra compose filename reference. |
| AGENTS.md | Adds agent operating instructions (duplicated guidance for agents). |
| .gitignore | Ignores infra/.env and adds .idea. |
| .github/workflows/code-quality.yaml | Adjusts SonarCloud step expectations and env cleanup. |
| .gitattributes | Adds LF line-ending enforcement for some file types. |
| .dockerignore | Adds docker build context exclusions. |
| .claude/skills/gitnexus/gitnexus-refactoring/SKILL.md | New GitNexus guidance docs. |
| .claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md | New GitNexus guidance docs. |
| .claude/skills/gitnexus/gitnexus-guide/SKILL.md | New GitNexus guidance docs. |
| .claude/skills/gitnexus/gitnexus-exploring/SKILL.md | New GitNexus guidance docs. |
| .claude/skills/gitnexus/gitnexus-debugging/SKILL.md | New GitNexus guidance docs. |
| .claude/skills/gitnexus/gitnexus-cli/SKILL.md | New GitNexus guidance docs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| import logging | ||
| from unittest.mock import call, patch | ||
|
|
||
| import pytest |
There was a problem hiding this comment.
call and pytest are imported but never used in this test module. This will trigger lint failures (e.g., pylint unused-import) and should be removed (or used) to keep CI green.
| monkeypatch.setenv("REDIS_HOST", os.environ.get("REDIS_HOST", "localhost")) | ||
| monkeypatch.setenv("REDIS_PORT", os.environ.get("REDIS_PORT", "6379")) | ||
| monkeypatch.setenv("REDIS_DB", os.environ.get("REDIS_DB", "0")) | ||
| if redis_password := os.environ.get("REDIS_PASSWORD"): | ||
| monkeypatch.setenv("REDIS_PASSWORD", redis_password) | ||
| monkeypatch.setenv("REDIS_PASSWORD", os.environ.get("REDIS_PASSWORD", "changeme")) | ||
| monkeypatch.setenv("REQUEST_QUEUE", req_queue) |
There was a problem hiding this comment.
Setting REDIS_PASSWORD to a default value ("changeme") will break against Redis instances that do not require authentication (e.g., the GitHub Actions redis service in this repo). Prefer only setting REDIS_PASSWORD when it is present in the environment, or set it to an empty string/omit it so unauthenticated Redis works.
| DUCKDB_PATH=/data/app.duckdb | ||
|
|
||
| # --- Logging --- | ||
| # ERSys uses ERE_LOG_LEVEL; compose.dev.yaml maps it to LOG_LEVEL internally. |
There was a problem hiding this comment.
This comment says compose.dev.yaml maps ERE_LOG_LEVEL to LOG_LEVEL, but the compose file passes env vars through as-is and the app reads ERE_LOG_LEVEL directly. Please update/remove this note to avoid misleading users.
| # ERSys uses ERE_LOG_LEVEL; compose.dev.yaml maps it to LOG_LEVEL internally. | |
| # ERSys uses ERE_LOG_LEVEL for logging configuration. |
| test: ## Run all tests | ||
| @ echo -e "$(BUILD_PRINT)$(ICON_PROGRESS) Running all tests$(END_BUILD_PRINT)" | ||
| @ poetry run pytest $(TEST_PATH) | ||
| @ set -a && . $(ENV_FILE) && set +a && poetry run pytest $(TEST_PATH) | ||
| @ echo -e "$(BUILD_PRINT)$(ICON_DONE) All tests passed$(END_BUILD_PRINT)" |
There was a problem hiding this comment.
The test / test-integration / test-coverage targets source infra/.env unconditionally. This will make make test fail on a fresh clone unless the user has copied the env file, even though unit tests don’t require infrastructure. Consider either (a) making test depend on check-env, or (b) sourcing infra/.env only when it exists (and keeping unit tests runnable without extra setup).
| @@ -0,0 +1,4 @@ | |||
| # Enforce Unix line endings | |||
| *.sh text eol=lf | |||
| Dockerfile text eol=lf | |||
There was a problem hiding this comment.
.gitattributes enforces LF for Dockerfile, but your Dockerfile lives at infra/Dockerfile and won’t match this pattern. If the intent is to enforce LF for Dockerfiles in subdirectories too, update the pattern (e.g., *Dockerfile / **/Dockerfile depending on git attribute matching needs).
| Dockerfile text eol=lf | |
| *Dockerfile text eol=lf |
|
|
||
| # TODO: should we have a registry? | ||
| ers-spec = { git = "https://github.com/OP-TED/entity-resolution-spec.git", branch = "0.3.0-rc.1" } | ||
| ers-spec = { git = "https://github.com/OP-TED/entity-resolution-spec.git", branch = "develop" } |
There was a problem hiding this comment.
Using a moving git branch (branch = "develop") for ers-spec makes builds non-reproducible and can introduce breaking changes unexpectedly. Prefer pinning to a released tag/version or a specific commit SHA (and bump intentionally when needed).
| ers-spec = { git = "https://github.com/OP-TED/entity-resolution-spec.git", branch = "develop" } | |
| ers-spec = { git = "https://github.com/OP-TED/entity-resolution-spec.git", rev = "<pin-to-commit-sha>" } |
No description provided.