Add time filtering to SDK + extra model fields #278

Open
klpoland wants to merge 7 commits into master from feature-kpoland-enhanced-capture-download-options
Conversation

@klpoland
Collaborator

No description provided.

@semanticdiff-com
semanticdiff-com Bot commented Apr 24, 2026

Review changes with SemanticDiff

Changed Files

| File | Status |
| --- | --- |
| sdk/src/spectrumx/ops/pagination.py | 31% smaller |
| gateway/sds_gateway/api_methods/tests/test_celery_tasks.py | 29% smaller |
| gateway/sds_gateway/api_methods/helpers/temporal_filtering.py | 28% smaller |
| sdk/src/spectrumx/client.py | 26% smaller |
| sdk/src/spectrumx/api/sds_files.py | 16% smaller |
| sdk/src/spectrumx/gateway.py | 7% smaller |
| gateway/sds_gateway/api_methods/views/file_endpoints.py | 6% smaller |
| gateway/sds_gateway/api_methods/serializers/capture_serializers.py | 6% smaller |
| sdk/src/spectrumx/models/captures.py | 5% smaller |
| gateway/pyproject.toml | Unsupported file format |
| gateway/sds_gateway/api_methods/serializers/dataset_serializers.py | 0% smaller |
| gateway/sds_gateway/api_methods/serializers/file_serializers.py | 0% smaller |
| gateway/sds_gateway/api_methods/tests/test_composite_capture_serialization.py | 0% smaller |
| gateway/sds_gateway/api_methods/tests/test_dataset_endpoints.py | 0% smaller |
| gateway/sds_gateway/api_methods/tests/test_file_endpoints.py | 0% smaller |
| gateway/sds_gateway/api_methods/utils/swagger_example_schema.py | 0% smaller |
| gateway/sds_gateway/api_methods/views/dataset_endpoints.py | 0% smaller |
| sdk/docs/mkdocs/changelog.md | Unsupported file format |
| sdk/pyproject.toml | Unsupported file format |
| sdk/src/spectrumx/api/datasets.py | 0% smaller |
| sdk/src/spectrumx/models/capture_enums.py | 0% smaller |
| sdk/src/spectrumx/models/datasets.py | 0% smaller |
| sdk/tests/integration/test_file_ops.py | 0% smaller |
| sdk/tests/ops/test_files.py | 0% smaller |
| sdk/tests/ops/test_paginator.py | 0% smaller |

klpoland self-assigned this Apr 24, 2026
klpoland added the feature (New feature or request) and sdk (SDK component) labels Apr 24, 2026
klpoland requested a review from lucaspar April 24, 2026 21:01
lucaspar requested a review from Copilot April 24, 2026 21:05
Contributor

Copilot AI left a comment

Pull request overview

This PR adds end-to-end temporal filtering for DigitalRF (rf@*.h5) file listings/downloads, and expands dataset/capture models and APIs to expose richer relationship/ownership metadata (including a new dataset detail endpoint).

Changes:

  • SDK: Add start_time / end_time support to list_files and download, plus pagination propagation and warning forwarding.
  • Gateway: Add temporal query params to file listing with a warnings array included in paginated responses; add dataset detail retrieve returning captures + artifact files.
  • Models/serializers/tests: Introduce shared capture enums, expand dataset/capture fields, and add unit/integration coverage for temporal filtering and composite capture serialization.
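
For illustration only, "pagination propagation and warning forwarding" could be sketched roughly as below. This is not the PR's code: `iter_files`, `fetch_page`, and the `results` / `next` keys are hypothetical names chosen for the example; only the ideas of re-sending `start_time` / `end_time` on every page and logging first-page `warnings` come from the summary above.

```python
import logging
from collections.abc import Callable, Iterator
from typing import Any

log = logging.getLogger("sdk_sketch")

# Hypothetical page fetcher: takes a page number plus query kwargs and returns
# a dict shaped like {"results": [...], "next": ..., "warnings": [...]}.
PageFetcher = Callable[..., dict[str, Any]]


def iter_files(fetch_page: PageFetcher, **query: Any) -> Iterator[Any]:
    """Yield items from every page, re-sending the same query on each request."""
    page = 1
    while True:
        # The same kwargs (e.g. start_time / end_time) go out with every page.
        payload = fetch_page(page=page, **query)
        if page == 1:
            # Forward API warnings once, from the first page only.
            for message in payload.get("warnings", []):
                log.warning("API warning: %s", message)
        yield from payload.get("results", [])
        if not payload.get("next"):
            break
        page += 1
```

If the temporal kwargs were only attached to the first request, later pages would silently return unfiltered files, which is presumably what the new paginator tests guard against.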

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| sdk/tests/ops/test_paginator.py | Adds unit tests for preserving temporal kwargs across pages and logging API warnings. |
| sdk/tests/ops/test_files.py | Adds unit tests for temporal param validation and datetime-to-ISO query formatting. |
| sdk/tests/integration/test_file_ops.py | Adds integration tests for temporal listing/download behavior and warning logging. |
| sdk/src/spectrumx/ops/pagination.py | Forwards first-page warnings from API via log_user_warning. |
| sdk/src/spectrumx/models/datasets.py | Adds captures/files to Dataset and introduces nested dataset-side models. |
| sdk/src/spectrumx/models/captures.py | Moves enums to shared module; adds additional indexed time display fields and sharing flags. |
| sdk/src/spectrumx/models/capture_enums.py | New shared CaptureType / CaptureOrigin enums module. |
| sdk/src/spectrumx/gateway.py | Adds start_time/end_time query params to list_files; adds get_dataset. |
| sdk/src/spectrumx/client.py | Adds temporal params to download/list_files; adds dataset convenience methods. |
| sdk/src/spectrumx/api/sds_files.py | Validates paired temporal params and formats datetimes as UTC ISO for gateway. |
| sdk/src/spectrumx/api/datasets.py | Adds dataset detail fetch + helpers to list captures/artifact files from that payload. |
| sdk/pyproject.toml | Updates Ruff per-file ignores for monorepo path layouts. |
| sdk/docs/mkdocs/changelog.md | Documents new temporal filtering and model expansions. |
| gateway/sds_gateway/api_methods/views/file_endpoints.py | Adds temporal query params, RF filtering, and always returns warnings in paginated responses. |
| gateway/sds_gateway/api_methods/views/dataset_endpoints.py | Adds dataset retrieve endpoint returning dataset metadata with captures + artifact files. |
| gateway/sds_gateway/api_methods/utils/swagger_example_schema.py | Adds warnings key to example paginated file list response. |
| gateway/sds_gateway/api_methods/tests/test_file_endpoints.py | Adds tests for warnings key presence and temporal filtering behavior. |
| gateway/sds_gateway/api_methods/tests/test_dataset_endpoints.py | Adds test for dataset detail retrieval containing captures/files. |
| gateway/sds_gateway/api_methods/tests/test_composite_capture_serialization.py | Adds serializer-level tests for multi-channel composite capture output. |
| gateway/sds_gateway/api_methods/tests/test_celery_tasks.py | Updates temporal filtering task docstring references. |
| gateway/sds_gateway/api_methods/serializers/file_serializers.py | Adds a nested artifact file summary serializer for dataset payloads. |
| gateway/sds_gateway/api_methods/serializers/dataset_serializers.py | Extends dataset serializer to embed captures/artifact files and break serializer cycles. |
| gateway/sds_gateway/api_methods/serializers/capture_serializers.py | Adds new derived time fields and enriches composite channel rows with OpenSearch bounds. |
| gateway/sds_gateway/api_methods/helpers/temporal_filtering.py | Refactors temporal filtering to share filter_files_by_temporal_bounds. |
| gateway/pyproject.toml | Adds Ruff ignore for composite serialization test magic numbers. |
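
As a rough sketch of the SDK-side step described for sds_files.py (validating paired temporal params and formatting datetimes as UTC ISO), something like the following could work. The function names are hypothetical, and two rules are assumptions not confirmed by the summary: that "paired" means both bounds must be given together, and that naive datetimes are treated as UTC.

```python
from __future__ import annotations

from datetime import datetime, timezone


def _as_utc(value: datetime) -> datetime:
    """Return the value in UTC; naive datetimes are assumed to already be UTC."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def build_temporal_params(
    start_time: datetime | None, end_time: datetime | None
) -> dict[str, str]:
    """Return query params for a temporal filter, or {} when no filter is set."""
    if start_time is None and end_time is None:
        return {}
    if start_time is None or end_time is None:
        # Assumed rule: the two bounds must be supplied together.
        raise ValueError("start_time and end_time must be provided together")
    start_utc, end_utc = _as_utc(start_time), _as_utc(end_time)
    if start_utc > end_utc:
        raise ValueError("start_time must not be later than end_time")
    return {
        "start_time": start_utc.isoformat(),
        "end_time": end_utc.isoformat(),
    }
```

Normalizing both bounds before comparing them avoids the `TypeError` Python raises when a naive datetime is compared with an aware one.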

Comment on lines +216 to +220
    def _datetime_string_to_milliseconds(self, datetime_string: str) -> int:
        """Converts a datetime string to milliseconds since start of capture."""
        parsed = datetime.fromisoformat(datetime_string)
        return int(parsed.timestamp() * 1000)

Copilot AI Apr 24, 2026

_datetime_string_to_milliseconds uses datetime.fromisoformat() without validation/handling. If a client passes an invalid datetime (or a naive datetime), this will raise ValueError and bubble up as a 500; and naive datetimes will be interpreted in the server's local timezone when calling .timestamp(), which is likely not what you want for temporal filtering. Consider catching ValueError and returning a 400 with a clear message, and either requiring timezone-aware inputs or treating naive inputs as UTC (to match SDK behavior). Also, the docstring says "milliseconds since start of capture" but this function is computing epoch milliseconds.
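
One possible shape for the fix suggested here, as a standalone sketch rather than the PR's code: `iso_to_epoch_ms` is a hypothetical name, and treating naive inputs as UTC is one of the two options the comment mentions. The calling view would still need to translate the `ValueError` into a 400 response.

```python
from datetime import datetime, timezone


def iso_to_epoch_ms(datetime_string: str) -> int:
    """Convert an ISO 8601 string to milliseconds since the Unix epoch."""
    try:
        parsed = datetime.fromisoformat(datetime_string)
    except ValueError as err:
        # Re-raise with a message that is safe to surface to the client.
        raise ValueError(f"Invalid ISO 8601 datetime: {datetime_string!r}") from err
    if parsed.tzinfo is None:
        # Assumption: naive inputs are UTC, not the server's local timezone.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)
```

With an explicit timezone attached, `.timestamp()` no longer depends on where the server runs, so the same query string yields the same bound on every host.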

)
log.warning(msg)
warnings.append(msg)
elif start_time or end_time:
Copilot AI Apr 24, 2026

In the non-RF warning branch you use elif start_time or end_time: but start_time/end_time are converted to integers earlier. A valid bound at the Unix epoch converts to 0, which is falsy, so this can incorrectly skip the warning (and makes the check inconsistent with the RF branch which uses is not None). Prefer checking start_time is not None or end_time is not None after conversion.

Suggested change
- elif start_time or end_time:
+ elif start_time is not None or end_time is not None:
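
To make the epoch case concrete, here is a small sketch contrasting the two checks. Both helper names are hypothetical and exist only for this illustration.

```python
from typing import Optional


def has_bounds_truthy(start_time: Optional[int], end_time: Optional[int]) -> bool:
    """Original style: relies on truthiness, so a bound of 0 counts as absent."""
    return bool(start_time or end_time)


def has_bounds_explicit(start_time: Optional[int], end_time: Optional[int]) -> bool:
    """Suggested style: only None means 'no bound was provided'."""
    return start_time is not None or end_time is not None


# A start bound exactly at the Unix epoch converts to 0 milliseconds.
epoch_start_ms = 0
print(has_bounds_truthy(epoch_start_ms, None))    # False: the warning is skipped
print(has_bounds_explicit(epoch_start_ms, None))  # True: the warning is emitted
```

The two checks agree whenever at least one provided bound is non-zero, which is why the difference is easy to miss in testing.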

Comment on lines +487 to +542
    def _enriched_channels(self, obj: dict[str, Any]) -> list[dict[str, Any]]:
        """Per-channel rows with OpenSearch bounds (each channel may differ)."""
        key = str(obj.get("uuid", ""))
        if not hasattr(self, "_enriched_channels_cache"):
            self._enriched_channels_cache: dict[str, list[dict[str, Any]]] = {}
        if key not in self._enriched_channels_cache:
            out: list[dict[str, Any]] = []
            for ch in obj.get("channels") or []:
                entry: dict[str, Any] = {
                    "channel": ch["channel"],
                    "uuid": ch["uuid"],
                    "channel_metadata": ch.get("channel_metadata", {}),
                }
                try:
                    capture = Capture.objects.get(uuid=ch["uuid"])
                except Capture.DoesNotExist:
                    entry["capture_start_epoch_sec"] = None
                    entry["capture_end_epoch_sec"] = None
                    entry["capture_start_iso_utc"] = None
                    entry["capture_end_iso_utc"] = None
                    entry["capture_start_display"] = None
                    entry["capture_end_display"] = None
                    entry["length_of_capture_ms"] = None
                    entry["file_cadence_ms"] = None
                else:
                    # Per-channel bounds/cadence (Capture.get_opensearch_metadata).
                    start_sec = capture.start_time
                    end_sec = capture.end_time
                    entry["capture_start_epoch_sec"] = start_sec
                    entry["capture_end_epoch_sec"] = end_sec
                    entry["capture_start_iso_utc"] = (
                        _epoch_sec_to_iso_utc_z(start_sec)
                        if start_sec is not None
                        else None
                    )
                    entry["capture_end_iso_utc"] = (
                        _epoch_sec_to_iso_utc_z(end_sec)
                        if end_sec is not None
                        else None
                    )
                    entry["capture_start_display"] = (
                        _epoch_sec_to_local_display(start_sec)
                        if start_sec is not None
                        else None
                    )
                    entry["capture_end_display"] = (
                        _epoch_sec_to_local_display(end_sec)
                        if end_sec is not None
                        else None
                    )
                    if start_sec is None or end_sec is None:
                        entry["length_of_capture_ms"] = None
                    else:
                        entry["length_of_capture_ms"] = (end_sec - start_sec) * 1000
                    entry["file_cadence_ms"] = capture.file_cadence
                out.append(entry)
Copilot AI Apr 24, 2026

CompositeCaptureSerializer._enriched_channels() introduces per-channel Capture.objects.get(...) lookups and then reads capture.start_time / end_time / file_cadence, each of which calls get_opensearch_metadata(). This creates an N+1 pattern (DB + OpenSearch) for composite serialization and can become very expensive for multi-channel captures and list endpoints. Consider passing the Capture instances (or their already-fetched OpenSearch metadata) into the composite payload from build_composite_capture_data, or bulk-fetching captures with filter(uuid__in=...) and caching get_opensearch_metadata() results per UUID so each channel is resolved once.
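
The bulk-fetch idea can be sketched independently of Django and OpenSearch. In the sketch below, `enrich_channels` and `load_bounds` are hypothetical: `load_bounds` stands in for whatever single call resolves all channel UUIDs at once (for example a `filter(uuid__in=...)` query plus one cached metadata lookup per UUID), and the `start_time` / `end_time` keys of its result are assumed for illustration.

```python
from collections.abc import Callable
from typing import Any, Optional

# Hypothetical bulk loader: given all channel UUIDs, returns
# {uuid: {"start_time": <epoch sec>, "end_time": <epoch sec>}} in one call.
BulkLoader = Callable[[set[str]], dict[str, dict[str, Any]]]


def enrich_channels(
    channels: list[dict[str, Any]], load_bounds: BulkLoader
) -> list[dict[str, Any]]:
    """Build per-channel rows using a single bulk lookup instead of one per channel."""
    bounds_by_uuid = load_bounds({ch["uuid"] for ch in channels})
    out: list[dict[str, Any]] = []
    for ch in channels:
        bounds = bounds_by_uuid.get(ch["uuid"]) or {}
        start: Optional[int] = bounds.get("start_time")
        end: Optional[int] = bounds.get("end_time")
        out.append(
            {
                "channel": ch["channel"],
                "uuid": ch["uuid"],
                "capture_start_epoch_sec": start,
                "capture_end_epoch_sec": end,
                # Length is only defined when both bounds are known.
                "length_of_capture_ms": (
                    None if start is None or end is None else (end - start) * 1000
                ),
            }
        )
    return out
```

Channels whose UUID is missing from the bulk result fall through to `None` values, mirroring the `Capture.DoesNotExist` branch in the quoted code, while the number of backend round trips stays constant regardless of channel count.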
