10 changes: 10 additions & 0 deletions NEWS.md
@@ -1,3 +1,13 @@
**04/23/2026:** Added `waterdata.get_nearest_continuous(targets, ...)` — for each of N target timestamps, fetches the single continuous observation closest to that timestamp in one HTTP round-trip (auto-chunked when the resulting CQL filter is long, via the facility added in #238). The helper is designed for workflows that pair many discrete-measurement timestamps with surrounding instantaneous data, which the OGC `time` parameter can't express since it only accepts one instant or one interval per request. Ties at window midpoints are resolved per a configurable `on_tie` ∈ {`"first"`, `"last"`, `"mean"`}; the default `window="PT7M30S"` matches a 15-minute continuous gauge.
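
The selection rule described above can be sketched locally with `pandas`. This is a minimal illustration, not the code in `dataretrieval.waterdata.nearest`: the helper name `pick_nearest`, the `time` / `value` column names, the inclusive window boundary, and the reading of `"first"` as "earliest in time" are all assumptions made for the example.

```python
import pandas as pd


def pick_nearest(obs, target, window=pd.Timedelta(minutes=7, seconds=30),
                 on_tie="first"):
    """Return the value of the observation closest to ``target``.

    Hypothetical helper. ``obs`` is assumed to have a datetime ``time``
    column and a numeric ``value`` column. The default ``window`` is the
    timedelta equivalent of the ISO 8601 duration PT7M30S.
    """
    target = pd.Timestamp(target)
    dist = (obs["time"] - target).abs()
    # Keep only observations inside the window (boundary treated as inclusive).
    in_window = obs[dist <= window]
    if in_window.empty:
        return None
    d = dist[in_window.index]
    # Several rows can share the minimum distance, e.g. at a window midpoint.
    tied = in_window[d == d.min()].sort_values("time")
    if on_tie == "first":
        return float(tied["value"].iloc[0])
    if on_tie == "last":
        return float(tied["value"].iloc[-1])
    if on_tie == "mean":
        return float(tied["value"].mean())
    raise ValueError(f"unknown on_tie: {on_tie!r}")


obs = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2023-06-01 12:00", "2023-06-01 12:15", "2023-06-01 12:30"]
        ),
        "value": [1.0, 3.0, 10.0],
    }
)
# 12:07:30 is exactly halfway between the 12:00 and 12:15 readings.
midpoint = "2023-06-01 12:07:30"
first = pick_nearest(obs, midpoint, on_tie="first")
last = pick_nearest(obs, midpoint, on_tie="last")
mean = pick_nearest(obs, midpoint, on_tie="mean")
```

With a 15-minute gauge, a half-width of 7 minutes 30 seconds means every target falls within the window of at least one reading, and only exact midpoints produce a tie.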

**04/22/2026:** Highlights since the `v1.1.0` release (2025-11-26), which shipped the `waterdata` module:

- Added `get_channel` for channel-measurement data (#218) and `get_stats_por` / `get_stats_date_range` for period-of-record and daily statistics (#207).
- Added `get_reference_table` (and made it considerably simpler and faster in #209), then extended it to accept arbitrary collections-API query parameters (#214).
- Removed the deprecated `waterwatch` module (#228) and several defunct NWIS stubs (#222, #225), and added `py.typed` so `dataretrieval` ships type information to downstream users (#186).
- Now supports `pandas` 3.x (#221).
- The OGC `waterdata` getters (`get_continuous`, `get_daily`, `get_field_measurements`, and the six others built on the same OGC collections) now accept `filter` and `filter_lang` kwargs that are passed through to the service's CQL filter parameter. This enables advanced server-side filtering that isn't expressible via the other kwargs — most commonly, OR'ing multiple time ranges into a single request. A long expression made up of a top-level `OR` chain is transparently split into multiple requests that each fit under the server's URI length limit, and the results are concatenated.
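
As a rough sketch of the splitting idea in the last bullet, the snippet below builds one CQL clause per time window and greedily packs the clauses into `OR` chains that stay under a length budget. The helper names and the `max_len` budget are hypothetical; the library's actual limit and splitting logic live in `dataretrieval.waterdata.filters` and may differ.

```python
def time_window_clause(start, end):
    """Build one parenthesised CQL clause for a closed time window."""
    return f"(time >= '{start}' AND time <= '{end}')"


def chunk_or_chain(clauses, max_len):
    """Greedily pack clauses into OR chains no longer than ``max_len``.

    Hypothetical helper. A single clause longer than ``max_len`` cannot be
    split further and is emitted as its own chunk.
    """
    chunks = []
    current = ""
    for clause in clauses:
        candidate = clause if not current else f"{current} OR {clause}"
        if current and len(candidate) > max_len:
            # Adding this clause would exceed the budget: close the chunk.
            chunks.append(current)
            current = clause
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


windows = [
    ("2023-06-01T12:00:00Z", "2023-06-01T13:00:00Z"),
    ("2023-06-15T12:00:00Z", "2023-06-15T13:00:00Z"),
    ("2023-07-01T12:00:00Z", "2023-07-01T13:00:00Z"),
]
clauses = [time_window_clause(start, end) for start, end in windows]
# Budget chosen for illustration so that exactly two clauses fit per chunk.
budget = 2 * len(clauses[0]) + len(" OR ")
chunks = chunk_or_chain(clauses, max_len=budget)
```

Each chunk would be sent as the `filter` argument of a separate request, and the resulting frames concatenated, which is the behaviour the bullet describes.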

**12/04/2025:** Added `get_continuous()` to the `waterdata` module. It provides access to measurements collected by automated sensors at high frequency (often at 15-minute intervals) at a monitoring location. This is an early version of the continuous endpoint and should be used with caution while the API team improves its performance. We anticipate adding one or more endpoints specifically for large data requests, so power users may want to hold off on heavy development against the new continuous endpoint.

**11/24/2025:** `dataretrieval` is pleased to offer a new module, `waterdata`, which gives users access to USGS's modernized [Water Data APIs](https://api.waterdata.usgs.gov/). The Water Data API endpoints include daily values, instantaneous values, field measurements (modernized groundwater levels service), time series metadata, and discrete water quality data from the Samples database. Though there will be a period of overlap, the functions within `waterdata` will eventually replace the `nwis` module, which currently provides access to the legacy [NWIS Water Services](https://waterservices.usgs.gov/). More example workflows and functions are coming soon. Check `help(waterdata)` for more information.
4 changes: 4 additions & 0 deletions dataretrieval/waterdata/__init__.py
@@ -25,6 +25,8 @@
get_stats_por,
get_time_series_metadata,
)
from .filters import FILTER_LANG
from .nearest import get_nearest_continuous
from .types import (
CODE_SERVICES,
PROFILE_LOOKUP,
@@ -34,6 +36,7 @@

__all__ = [
"CODE_SERVICES",
"FILTER_LANG",
"PROFILES",
"PROFILE_LOOKUP",
"SERVICES",
@@ -45,6 +48,7 @@
"get_latest_continuous",
"get_latest_daily",
"get_monitoring_locations",
"get_nearest_continuous",
"get_reference_table",
"get_samples",
"get_stats_date_range",
72 changes: 72 additions & 0 deletions dataretrieval/waterdata/api.py
@@ -16,6 +16,7 @@
from requests.models import PreparedRequest

from dataretrieval.utils import BaseMetadata, to_str
from dataretrieval.waterdata.filters import FILTER_LANG
from dataretrieval.waterdata.types import (
CODE_SERVICES,
METADATA_COLLECTIONS,
@@ -51,6 +52,8 @@ def get_daily(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Daily data provide one data value to represent water conditions for the
@@ -177,6 +180,11 @@ def get_daily(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (NA) will set the
limit to the maximum allowable limit for the service.
filter, filter_lang : optional
Server-side CQL filter passed through as the OGC ``filter`` /
``filter-lang`` query parameters. See
:mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
and the lexicographic-comparison pitfall.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -228,6 +236,8 @@ def get_continuous(
last_modified: str | None = None,
time: str | list[str] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""
@@ -348,6 +358,11 @@ def get_continuous(
allowable limit is 10000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter, filter_lang : optional
Server-side CQL filter passed through as the OGC ``filter`` /
``filter-lang`` query parameters. See
:mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
and the lexicographic-comparison pitfall.
convert_type : boolean, optional
If True, the function will convert the data to dates and qualifier to
string vector
@@ -370,6 +385,21 @@ def get_continuous(
... parameter_code="00065",
... time="2021-01-01T00:00:00Z/2022-01-01T00:00:00Z",
... )

>>> # Pull several disjoint time windows in one call via a CQL
>>> # ``filter``. See ``dataretrieval.waterdata.filters`` for the
>>> # full grammar, auto-chunking, and pitfalls.
>>> df, md = dataretrieval.waterdata.get_continuous(
... monitoring_location_id="USGS-02238500",
... parameter_code="00060",
... filter=(
... "(time >= '2023-06-01T12:00:00Z' "
... "AND time <= '2023-06-01T13:00:00Z') "
... "OR (time >= '2023-06-15T12:00:00Z' "
... "AND time <= '2023-06-15T13:00:00Z')"
... ),
... filter_lang="cql-text",
... )
"""
service = "continuous"
output_id = "continuous_id"
@@ -426,6 +456,8 @@ def get_monitoring_locations(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Location information is basic information about the monitoring location
@@ -635,6 +667,11 @@ def get_monitoring_locations(
The returning object will be a data frame with no spatial information.
Note that the USGS Water Data APIs use camelCase "skipGeometry" in
CQL2 queries.
filter, filter_lang : optional
Server-side CQL filter passed through as the OGC ``filter`` /
``filter-lang`` query parameters. See
:mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
and the lexicographic-comparison pitfall.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -697,6 +734,8 @@ def get_time_series_metadata(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Daily data and continuous measurements are grouped into time series,
@@ -851,6 +890,11 @@ def get_time_series_metadata(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter, filter_lang : optional
Server-side CQL filter passed through as the OGC ``filter`` /
``filter-lang`` query parameters. See
:mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
and the lexicographic-comparison pitfall.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -903,6 +947,8 @@ def get_latest_continuous(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""This endpoint provides the most recent observation for each time series
@@ -1026,6 +1072,11 @@ def get_latest_continuous(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter, filter_lang : optional
Server-side CQL filter passed through as the OGC ``filter`` /
``filter-lang`` query parameters. See
:mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
and the lexicographic-comparison pitfall.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -1075,6 +1126,8 @@ def get_latest_daily(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Daily data provide one data value to represent water conditions for the
@@ -1200,6 +1253,11 @@ def get_latest_daily(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter, filter_lang : optional
Server-side CQL filter passed through as the OGC ``filter`` /
``filter-lang`` query parameters. See
:mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
and the lexicographic-comparison pitfall.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -1251,6 +1309,8 @@ def get_field_measurements(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Field measurements are physically measured values collected during a
@@ -1366,6 +1426,11 @@ def get_field_measurements(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter, filter_lang : optional
Server-side CQL filter passed through as the OGC ``filter`` /
``filter-lang`` query parameters. See
:mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
and the lexicographic-comparison pitfall.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -2017,6 +2082,8 @@ def get_channel(
skip_geometry: bool | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""
@@ -2123,6 +2190,11 @@ def get_channel(
vertical_velocity_description, longitudinal_velocity_description,
measurement_type, last_modified, channel_measurement_type. The default (None) will
return all columns of the data.
filter, filter_lang : optional
Server-side CQL filter passed through as the OGC ``filter`` /
``filter-lang`` query parameters. See
:mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
and the lexicographic-comparison pitfall.
convert_type : boolean, optional
If True, the function will convert the data to dates and qualifier to
string vector