Skip to content

Feature/upgrade to 0.4.0#7

Closed
costezki wants to merge 19 commits intoOP-TED:developfrom
meaningfy-ws:feature/upgrade-to-0.3.0
Closed

Feature/upgrade to 0.4.0#7
costezki wants to merge 19 commits intoOP-TED:developfrom
meaningfy-ws:feature/upgrade-to-0.3.0

Conversation

@costezki
Copy link
Copy Markdown
Contributor

@costezki costezki commented Apr 2, 2026

No description provided.

Copilot AI and others added 19 commits February 26, 2026 08:07
…to avoid shadowing

Co-authored-by: gkostkowski <12532923+gkostkowski@users.noreply.github.com>
Fix variable shadowing in pytest_configure hook
ci(sonarcloud): make SonarCloud scan conditional on SONAR_TOKEN
…ation-hardening

ops: containerization hardening
…ytest config

Merge of PR#10 (feature/ERE1-123): fixes variable shadowing in pytest_configure.
Conflict resolved by preserving HEAD's TEST_RESOURCES_DIR path and encoding="utf-8".
Merge of PR#21 (feature/ERS1-124/code-formatting).
Conflicts resolved: kept HEAD config paths (config/ not infra/config/)
which were updated by the infra layout refactor after PR#21 branched.
Addresses the SonarQube quality gate failure from PR#21 (69% coverage on new code).

New test modules:
- test/unit/adapters/test_utils.py: message parsing (get_request/response_from_message)
- test/unit/adapters/test_adapter_factories.py: build_rdf_mapper factory
- test/unit/entrypoints/test_queue_worker.py: RedisQueueWorker with mocked Redis
- test/unit/utils/test_logging.py: configure_logging and TRACE level
- test/unit/services/test_services_factories.py: build_entity_resolver (in-memory + persistent DuckDB)
- test/unit/test_models.py: MentionLink/ResolutionResult edge cases + app.main() failure paths

Extended test files:
- stubs.py: add StubRDFMapper and find_by_id to InMemoryMentionRepository
- test_entity_resolution_service.py: EntityResolutionService process_request paths
- test_duckdb_adapters.py: load_all and save_all([]) coverage

Coverage: 61% → 85% (unit + BDD combined)
Renames LOG_LEVEL → ERE_LOG_LEVEL to match the ERSys unified .env.example,
allowing integration tests to run against the shared ERSys infrastructure
without any compose-level variable mapping.

Changes:
- src/ere/utils/logging.py: read ERE_LOG_LEVEL instead of LOG_LEVEL
- src/ere/entrypoints/app.py: update env var name in docstring
- demo/demo.py: read ERE_LOG_LEVEL instead of LOG_LEVEL
- test/unit/utils/test_logging.py: update env var references
- infra/.env.example: new file, ERE-relevant subset of ERSys .env.example
- infra/compose.dev.yaml: remove LOG_LEVEL mapping (no longer needed)
AGENTS.md (and its CLAUDE.md mirror) now contains ERE-specific guidance:
commits/PR rules, dev workflow, make targets reference, architecture
rules, memory conventions, and gotchas. Replaces GitNexus-only boilerplate.

Also aligns dev tooling:
- Makefile: test-integration depends on check-env; test target sources .env
- infra/.env.example: REDIS_HOST defaults to localhost for standalone dev
- test/e2e/test_app.py: replace walrus operator with explicit env default
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR upgrades the project to v0.4.0 and aligns configuration, infrastructure, and tests with the new layout and runtime expectations (notably moving configs to config/ and standardizing ERE_LOG_LEVEL).

Changes:

  • Move configuration from infra/config/ to top-level config/ and update factories/docs accordingly.
  • Replace Docker Compose setup (infra/docker-compose.yml + .env.local) with infra/compose.dev.yaml + infra/.env.example and expand Makefile infra targets.
  • Add/expand unit tests around logging, factories, queue worker/message parsing, and service/model edge-cases; update existing tests for new behavior and formatting.

Reviewed changes

Copilot reviewed 56 out of 64 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
test/unit/utils/test_logging.py New tests for configure_logging() and TRACE support.
test/unit/utils/init.py Package marker for utils unit tests.
test/unit/test_models.py New unit tests for model edge-cases and app.main() failure paths.
test/unit/services/test_services_factories.py New tests for services.factories builder functions.
test/unit/services/test_entity_resolution_service.py Extends resolver/service tests and adds request-processing coverage.
test/unit/entrypoints/test_queue_worker.py New unit tests for Redis queue worker behavior and error handling.
test/unit/entrypoints/init.py Package marker for entrypoints unit tests.
test/unit/adapters/test_utils.py New tests for adapter message parsing utilities.
test/unit/adapters/test_duckdb_adapters.py Extends DuckDB adapter coverage (incl. load_all/save_all edge cases).
test/unit/adapters/test_adapter_factories.py New tests for RDF mapper factory.
test/unit/adapters/stubs.py Adds stub RDF mapper and expands repository stubs for tests.
test/stress/stress_test.py Updates stress runner default config path and formatting.
test/stress/README.md Updates stress docs to new config path.
test/integration/test_redis_integration.py Formatting improvements in integration test assertions/decoding.
test/integration/test_entity_resolver.py Formatting improvements and readability tweaks.
test/features/steps/test_entity_resolution_algorithm_steps.py Step formatting changes for readability.
test/features/steps/test_direct_service_resolution_steps.py Step formatting changes; clarifies multi-line parsers/steps.
test/e2e/test_ere.py Minor assertion formatting change.
test/e2e/test_app.py Adjusts env setup for app e2e run (notably REDIS_PASSWORD).
test/conftest.py Renames local logging config var; minor formatting.
src/ere/utils/logging.py Switches env var to ERE_LOG_LEVEL for log-level configuration.
src/ere/services/resolver_config.py Formatting for DuckDBConfig path field.
src/ere/services/factories.py Default resolver config path updated to config/resolver.yaml.
src/ere/services/entity_resolution_service.py Formatting and minor structural readability improvements.
src/ere/models/resolver/mention.py Formatting for _from_flat_dict guard.
src/ere/models/exceptions.py Formatting for ConflictError init signature.
src/ere/entrypoints/queue_worker.py Formatting for Redis calls and error response builder signature.
src/ere/entrypoints/app.py Updates docs/env var to ERE_LOG_LEVEL; minor formatting.
src/ere/adapters/utils.py Formatting for supported request/response class registries.
src/ere/adapters/splink_linker_impl.py Formatting/quoting consistency and readability changes.
src/ere/adapters/repositories.py Formatting for resolver model imports.
src/ere/adapters/rdf_mapper.py Simplifies ValueError formatting.
src/ere/adapters/rdf_mapper_impl.py Default mapping path moved to config/; minor formatting.
src/ere/adapters/duckdb_repositories.py Formatting for resolver model imports.
README.md Updates paths/docs for new config/infra layout and Make targets.
pyproject.toml Bumps version to 0.4.0, renames project, updates ers-spec dependency.
Makefile Adds compose/env vars wiring and expands infra targets.
infra/README.md New infra documentation for compose.dev workflow and env setup.
infra/Dockerfile Switches to multi-stage build (builder/runtime) and copies config/.
infra/docker-compose.yml Removed (replaced by infra/compose.dev.yaml).
infra/compose.dev.yaml New dev compose stack (redis/redisinsight/ere) with compose watch.
infra/.env.local Removed (replaced by infra/.env.example + infra/.env).
infra/.env.example New env template for local dev.
docs/algorithm.md Updates configuration link to config/README.md.
demo/README.md Updates demo instructions for new infra workflow (make infra-*).
demo/demo.py Updates env/logging variable usage and infra instructions.
config/resolver.yaml New default resolver config under top-level config/.
config/resolver_multirule.yaml Formatting normalization for YAML file.
config/resolver_compound.yaml Formatting normalization for YAML file.
config/README.md New configuration documentation (resolver + rdf mapping).
config/rdf_mapping.yaml Formatting normalization for YAML file.
CLAUDE.md Adds agent operating instructions / repo guidance.
CHANGELOG.md Updates infra compose filename reference.
AGENTS.md Adds agent operating instructions (duplicated guidance for agents).
.gitignore Ignores infra/.env and adds .idea.
.github/workflows/code-quality.yaml Adjusts SonarCloud step expectations and env cleanup.
.gitattributes Adds LF line-ending enforcement for some file types.
.dockerignore Adds docker build context exclusions.
.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md New GitNexus guidance docs.
.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md New GitNexus guidance docs.
.claude/skills/gitnexus/gitnexus-guide/SKILL.md New GitNexus guidance docs.
.claude/skills/gitnexus/gitnexus-exploring/SKILL.md New GitNexus guidance docs.
.claude/skills/gitnexus/gitnexus-debugging/SKILL.md New GitNexus guidance docs.
.claude/skills/gitnexus/gitnexus-cli/SKILL.md New GitNexus guidance docs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +3 to +6
import logging
from unittest.mock import call, patch

import pytest
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call and pytest are imported but never used in this test module. This will trigger lint failures (e.g., pylint unused-import) and should be removed (or used) to keep CI green.

Copilot uses AI. Check for mistakes.
Comment thread test/e2e/test_app.py
Comment on lines 42 to 46
monkeypatch.setenv("REDIS_HOST", os.environ.get("REDIS_HOST", "localhost"))
monkeypatch.setenv("REDIS_PORT", os.environ.get("REDIS_PORT", "6379"))
monkeypatch.setenv("REDIS_DB", os.environ.get("REDIS_DB", "0"))
if redis_password := os.environ.get("REDIS_PASSWORD"):
monkeypatch.setenv("REDIS_PASSWORD", redis_password)
monkeypatch.setenv("REDIS_PASSWORD", os.environ.get("REDIS_PASSWORD", "changeme"))
monkeypatch.setenv("REQUEST_QUEUE", req_queue)
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting REDIS_PASSWORD to a default value ("changeme") will break against Redis instances that do not require authentication (e.g., the GitHub Actions redis service in this repo). Prefer only setting REDIS_PASSWORD when it is present in the environment, or set it to an empty string/omit it so unauthenticated Redis works.

Copilot uses AI. Check for mistakes.
Comment thread infra/.env.example
DUCKDB_PATH=/data/app.duckdb

# --- Logging ---
# ERSys uses ERE_LOG_LEVEL; compose.dev.yaml maps it to LOG_LEVEL internally.
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment says compose.dev.yaml maps ERE_LOG_LEVEL to LOG_LEVEL, but the compose file passes env vars through as-is and the app reads ERE_LOG_LEVEL directly. Please update/remove this note to avoid misleading users.

Suggested change
# ERSys uses ERE_LOG_LEVEL; compose.dev.yaml maps it to LOG_LEVEL internally.
# ERSys uses ERE_LOG_LEVEL for logging configuration.

Copilot uses AI. Check for mistakes.
Comment thread Makefile
Comment on lines 104 to 107
test: ## Run all tests
@ echo -e "$(BUILD_PRINT)$(ICON_PROGRESS) Running all tests$(END_BUILD_PRINT)"
@ poetry run pytest $(TEST_PATH)
@ set -a && . $(ENV_FILE) && set +a && poetry run pytest $(TEST_PATH)
@ echo -e "$(BUILD_PRINT)$(ICON_DONE) All tests passed$(END_BUILD_PRINT)"
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test / test-integration / test-coverage targets source infra/.env unconditionally. This will make make test fail on a fresh clone unless the user has copied the env file, even though unit tests don’t require infrastructure. Consider either (a) making test depend on check-env, or (b) sourcing infra/.env only when it exists (and keeping unit tests runnable without extra setup).

Copilot uses AI. Check for mistakes.
Comment thread .gitattributes
@@ -0,0 +1,4 @@
# Enforce Unix line endings
*.sh text eol=lf
Dockerfile text eol=lf
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.gitattributes enforces LF for Dockerfile, but your Dockerfile lives at infra/Dockerfile and won’t match this pattern. If the intent is to enforce LF for Dockerfiles in subdirectories too, update the pattern (e.g., *Dockerfile / **/Dockerfile depending on git attribute matching needs).

Suggested change
Dockerfile text eol=lf
*Dockerfile text eol=lf

Copilot uses AI. Check for mistakes.
Comment thread pyproject.toml

# TODO: should we have a registry?
ers-spec = { git = "https://github.com/OP-TED/entity-resolution-spec.git", branch = "0.3.0-rc.1" }
ers-spec = { git = "https://github.com/OP-TED/entity-resolution-spec.git", branch = "develop" }
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a moving git branch (branch = "develop") for ers-spec makes builds non-reproducible and can introduce breaking changes unexpectedly. Prefer pinning to a released tag/version or a specific commit SHA (and bump intentionally when needed).

Suggested change
ers-spec = { git = "https://github.com/OP-TED/entity-resolution-spec.git", branch = "develop" }
ers-spec = { git = "https://github.com/OP-TED/entity-resolution-spec.git", rev = "<pin-to-commit-sha>" }

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants