8 changes: 8 additions & 0 deletions NEWS.md
@@ -1,3 +1,11 @@
**04/22/2026:** Highlights since the `v1.1.0` release (2025-11-26), which shipped the `waterdata` module:

- Added `get_channel` for channel-measurement data (#218) and `get_stats_por` / `get_stats_date_range` for period-of-record and daily statistics (#207).
- Added `get_reference_table` (and made it considerably simpler and faster in #209), then extended it to accept arbitrary collections-API query parameters (#214).
- Removed the deprecated `waterwatch` module (#228) and several defunct NWIS stubs (#222, #225), and added `py.typed` so `dataretrieval` ships type information to downstream users (#186).
- Now supports `pandas` 3.x (#221).
- The OGC `waterdata` getters (`get_continuous`, `get_daily`, `get_field_measurements`, and the six others built on the same OGC collections) now accept `filter` and `filter_lang` kwargs that are passed through to the service's CQL filter parameter. This enables advanced server-side filtering that isn't expressible via the other kwargs — most commonly, OR'ing multiple time ranges into a single request. A long expression made up of a top-level `OR` chain is transparently split into multiple requests that each fit under the server's URI length limit, and the results are concatenated.

**12/04/2025:** The `get_continuous()` function was added to the `waterdata` module, which provides access to measurements collected via automated sensors at a high frequency (often 15-minute intervals) at a monitoring location. This is an early version of the continuous endpoint and should be used with caution as the API team improves its performance. In the future, we anticipate the addition of one or more endpoints specifically for handling large data requests, so it may make sense for power users to hold off on heavy development using the new continuous endpoint.

**11/24/2025:** `dataretrieval` is pleased to offer a new module, `waterdata`, which gives users access to USGS's modernized [Water Data APIs](https://api.waterdata.usgs.gov/). The Water Data API endpoints include daily values, instantaneous values, field measurements (modernized groundwater levels service), time series metadata, and discrete water quality data from the Samples database. Though there will be a period of overlap, the functions within `waterdata` will eventually replace the `nwis` module, which currently provides access to the legacy [NWIS Water Services](https://waterservices.usgs.gov/). More example workflows and functions coming soon. Check `help(waterdata)` for more information.
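
The last highlight in the 04/22/2026 entry says that a long top-level `OR` chain is split into several requests that each fit under the server's URI length limit. The diff shown here does not include that implementation, so the following is only a minimal sketch of how such splitting could work. The helper names `split_top_level_or` and `chunk_or_chain` and the `max_encoded_len` budget are hypothetical and are not part of `dataretrieval`.

```python
from urllib.parse import quote


def split_top_level_or(expr: str) -> list[str]:
    """Split a CQL-text expression on OR keywords outside parentheses and quotes."""
    parts: list[str] = []
    depth = 0          # current parenthesis nesting level
    in_quote = False   # True while inside a single-quoted literal
    start = 0
    i = 0
    while i < len(expr):
        ch = expr[i]
        if ch == "'":
            in_quote = not in_quote
        elif not in_quote:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif depth == 0 and expr[i:i + 4].upper() == " OR ":
                parts.append(expr[start:i].strip())
                start = i + 4
                i += 4
                continue
        i += 1
    parts.append(expr[start:].strip())
    return parts


def chunk_or_chain(expr: str, max_encoded_len: int) -> list[str]:
    """Greedily regroup top-level OR clauses so each group's URL-encoded
    length stays within max_encoded_len (a hypothetical budget).
    A single clause longer than the budget is kept as its own chunk."""
    chunks: list[str] = []
    current: list[str] = []
    for clause in split_top_level_or(expr):
        candidate = " OR ".join(current + [clause])
        if current and len(quote(candidate)) > max_encoded_len:
            chunks.append(" OR ".join(current))
            current = [clause]
        else:
            current.append(clause)
    if current:
        chunks.append(" OR ".join(current))
    return chunks
```

Under these assumptions, each returned chunk would be sent as its own `filter` value and the resulting data frames concatenated by the caller. The real client may measure the full URL rather than only the filter, and may handle edge cases differently.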
2 changes: 2 additions & 0 deletions dataretrieval/waterdata/__init__.py
@@ -27,13 +27,15 @@
)
from .types import (
CODE_SERVICES,
FILTER_LANG,
PROFILE_LOOKUP,
PROFILES,
SERVICES,
)

__all__ = [
"CODE_SERVICES",
"FILTER_LANG",
"PROFILES",
"PROFILE_LOOKUP",
"SERVICES",
144 changes: 144 additions & 0 deletions dataretrieval/waterdata/api.py
@@ -18,6 +18,7 @@
from dataretrieval.utils import BaseMetadata, to_str
from dataretrieval.waterdata.types import (
CODE_SERVICES,
FILTER_LANG,
METADATA_COLLECTIONS,
PROFILES,
SERVICES,
@@ -51,6 +52,8 @@ def get_daily(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Daily data provide one data value to represent water conditions for the
@@ -177,6 +180,18 @@ def get_daily(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter : string, optional
A CQL text or JSON expression passed through to the OGC API
``filter`` query parameter. Commonly used to OR several time
ranges into a single request. At the time of writing the server
accepts ``cql-text`` (default) and ``cql-json``; ``cql2-text`` /
``cql2-json`` are not yet supported. A long expression made up
of a top-level ``OR`` chain is automatically split into
multiple requests that each fit under the server's URI length
limit; the results are concatenated.
filter_lang : string, optional
Language of the ``filter`` expression, for example ``cql-text``
(default) or ``cql-json``. Sent as ``filter-lang`` in the URL.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -228,6 +243,8 @@ def get_continuous(
last_modified: str | None = None,
time: str | list[str] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""
@@ -348,6 +365,18 @@ def get_continuous(
allowable limit is 10000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter : string, optional
A CQL text or JSON expression passed through to the OGC API
``filter`` query parameter. Commonly used to OR several time
ranges into a single request. At the time of writing the server
accepts ``cql-text`` (default) and ``cql-json``; ``cql2-text`` /
``cql2-json`` are not yet supported. A long expression made up
of a top-level ``OR`` chain is automatically split into
multiple requests that each fit under the server's URI length
limit; the results are concatenated.
filter_lang : string, optional
Language of the ``filter`` expression, for example ``cql-text``
(default) or ``cql-json``. Sent as ``filter-lang`` in the URL.
convert_type : boolean, optional
If True, the function will convert the data to dates and qualifier to
string vector
@@ -370,6 +399,37 @@ def get_continuous(
... parameter_code="00065",
... time="2021-01-01T00:00:00Z/2022-01-01T00:00:00Z",
... )

>>> # The ``time`` parameter accepts a single instant or a single
>>> # interval. To pull several disjoint windows in one call, pass a
>>> # CQL-text ``filter`` expression instead:
>>> df, md = dataretrieval.waterdata.get_continuous(
... monitoring_location_id="USGS-02238500",
... parameter_code="00060",
... filter=(
... "(time >= '2023-06-01T12:00:00Z' "
... "AND time <= '2023-06-01T13:00:00Z') "
... "OR (time >= '2023-06-15T12:00:00Z' "
... "AND time <= '2023-06-15T13:00:00Z')"
... ),
... filter_lang="cql-text",
... )

>>> # Long top-level ``OR`` chains (e.g. one window per discrete
>>> # measurement timestamp) are built up the same way. If the
>>> # resulting URL would exceed the server's length limit, the
>>> # client transparently splits it into multiple sub-requests and
>>> # returns the concatenated, deduplicated result.
>>> windows = [
... f"(time >= '2023-{m:02d}-15T00:00:00Z' "
... f"AND time <= '2023-{m:02d}-15T00:30:00Z')"
... for m in range(1, 13)
... ]
>>> df, md = dataretrieval.waterdata.get_continuous(
... monitoring_location_id="USGS-02238500",
... parameter_code="00060",
... filter=" OR ".join(windows),
... )
"""
service = "continuous"
output_id = "continuous_id"
@@ -426,6 +486,8 @@ def get_monitoring_locations(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Location information is basic information about the monitoring location
@@ -635,6 +697,18 @@ def get_monitoring_locations(
The returning object will be a data frame with no spatial information.
Note that the USGS Water Data APIs use camelCase "skipGeometry" in
CQL2 queries.
filter : string, optional
A CQL text or JSON expression passed through to the OGC API
``filter`` query parameter. Commonly used to OR several time
ranges into a single request. At the time of writing the server
accepts ``cql-text`` (default) and ``cql-json``; ``cql2-text`` /
``cql2-json`` are not yet supported. A long expression made up
of a top-level ``OR`` chain is automatically split into
multiple requests that each fit under the server's URI length
limit; the results are concatenated.
filter_lang : string, optional
Language of the ``filter`` expression, for example ``cql-text``
(default) or ``cql-json``. Sent as ``filter-lang`` in the URL.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -697,6 +771,8 @@ def get_time_series_metadata(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Daily data and continuous measurements are grouped into time series,
@@ -851,6 +927,18 @@ def get_time_series_metadata(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter : string, optional
A CQL text or JSON expression passed through to the OGC API
``filter`` query parameter. Commonly used to OR several time
ranges into a single request. At the time of writing the server
accepts ``cql-text`` (default) and ``cql-json``; ``cql2-text`` /
``cql2-json`` are not yet supported. A long expression made up
of a top-level ``OR`` chain is automatically split into
multiple requests that each fit under the server's URI length
limit; the results are concatenated.
filter_lang : string, optional
Language of the ``filter`` expression, for example ``cql-text``
(default) or ``cql-json``. Sent as ``filter-lang`` in the URL.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -903,6 +991,8 @@ def get_latest_continuous(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""This endpoint provides the most recent observation for each time series
@@ -1026,6 +1116,18 @@ def get_latest_continuous(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter : string, optional
A CQL text or JSON expression passed through to the OGC API
``filter`` query parameter. Commonly used to OR several time
ranges into a single request. At the time of writing the server
accepts ``cql-text`` (default) and ``cql-json``; ``cql2-text`` /
``cql2-json`` are not yet supported. A long expression made up
of a top-level ``OR`` chain is automatically split into
multiple requests that each fit under the server's URI length
limit; the results are concatenated.
filter_lang : string, optional
Language of the ``filter`` expression, for example ``cql-text``
(default) or ``cql-json``. Sent as ``filter-lang`` in the URL.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -1075,6 +1177,8 @@ def get_latest_daily(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Daily data provide one data value to represent water conditions for the
@@ -1200,6 +1304,18 @@ def get_latest_daily(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter : string, optional
A CQL text or JSON expression passed through to the OGC API
``filter`` query parameter. Commonly used to OR several time
ranges into a single request. At the time of writing the server
accepts ``cql-text`` (default) and ``cql-json``; ``cql2-text`` /
``cql2-json`` are not yet supported. A long expression made up
of a top-level ``OR`` chain is automatically split into
multiple requests that each fit under the server's URI length
limit; the results are concatenated.
filter_lang : string, optional
Language of the ``filter`` expression, for example ``cql-text``
(default) or ``cql-json``. Sent as ``filter-lang`` in the URL.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -1251,6 +1367,8 @@ def get_field_measurements(
time: str | list[str] | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""Field measurements are physically measured values collected during a
@@ -1366,6 +1484,18 @@ def get_field_measurements(
allowable limit is 50000. It may be beneficial to set this number lower
if your internet connection is spotty. The default (None) will set the
limit to the maximum allowable limit for the service.
filter : string, optional
A CQL text or JSON expression passed through to the OGC API
``filter`` query parameter. Commonly used to OR several time
ranges into a single request. At the time of writing the server
accepts ``cql-text`` (default) and ``cql-json``; ``cql2-text`` /
``cql2-json`` are not yet supported. A long expression made up
of a top-level ``OR`` chain is automatically split into
multiple requests that each fit under the server's URI length
limit; the results are concatenated.
filter_lang : string, optional
Language of the ``filter`` expression, for example ``cql-text``
(default) or ``cql-json``. Sent as ``filter-lang`` in the URL.
convert_type : boolean, optional
If True, converts columns to appropriate types.

@@ -2017,6 +2147,8 @@ def get_channel(
skip_geometry: bool | None = None,
bbox: list[float] | None = None,
limit: int | None = None,
filter: str | None = None,
filter_lang: FILTER_LANG | None = None,
convert_type: bool = True,
) -> tuple[pd.DataFrame, BaseMetadata]:
"""
@@ -2123,6 +2255,18 @@ def get_channel(
vertical_velocity_description, longitudinal_velocity_description,
measurement_type, last_modified, channel_measurement_type. The default (None) will
return all columns of the data.
filter : string, optional
A CQL text or JSON expression passed through to the OGC API
``filter`` query parameter. Commonly used to OR several time
ranges into a single request. At the time of writing the server
accepts ``cql-text`` (default) and ``cql-json``; ``cql2-text`` /
``cql2-json`` are not yet supported. A long expression made up
of a top-level ``OR`` chain is automatically split into
multiple requests that each fit under the server's URI length
limit; the results are concatenated.
filter_lang : string, optional
Language of the ``filter`` expression, for example ``cql-text``
(default) or ``cql-json``. Sent as ``filter-lang`` in the URL.
convert_type : boolean, optional
If True, the function will convert the data to dates and qualifier to
string vector
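
The docstrings added in `api.py` state that `filter` is passed through to the OGC API `filter` query parameter and that `filter_lang` is "sent as `filter-lang` in the URL". The sketch below illustrates that keyword-to-parameter mapping with the standard library only. The function `build_query` is hypothetical, and how the library actually assembles its requests is not shown in this diff.

```python
from urllib.parse import urlencode


def build_query(filter=None, filter_lang=None, **other_params):
    """Build a query string, renaming the Python-friendly ``filter_lang``
    keyword to the hyphenated ``filter-lang`` parameter.

    Hypothetical helper for illustration; not the dataretrieval API.
    """
    params = dict(other_params)
    if filter is not None:
        params["filter"] = filter
    if filter_lang is not None:
        # Hyphens are not valid in Python identifiers, hence the rename.
        params["filter-lang"] = filter_lang
    return urlencode(params)


query = build_query(
    filter="time >= '2023-06-01T12:00:00Z' AND time <= '2023-06-01T13:00:00Z'",
    filter_lang="cql-text",
    limit=10,
)
```

Because `urlencode` percent-encodes the quotes, colons, and comparison operators in the expression, the encoded filter is noticeably longer than the raw string, which is why very long `OR` chains can run into URI length limits.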
2 changes: 2 additions & 0 deletions dataretrieval/waterdata/types.py
@@ -40,6 +40,8 @@
"results",
]

FILTER_LANG = Literal["cql-text", "cql-json"]

PROFILES = Literal[
"actgroup",
"actmetric",
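
`FILTER_LANG` is declared in `types.py` as a `Literal`, which static type checkers enforce but Python itself does not check at runtime. As a minimal sketch, assuming a caller wants an early runtime check as well, the allowed values can be read back with `typing.get_args`. The helper `check_filter_lang` is hypothetical; this diff does not show whether the library validates `filter_lang` itself.

```python
from typing import Literal, Optional, get_args

# Same definition as the one added to dataretrieval/waterdata/types.py.
FILTER_LANG = Literal["cql-text", "cql-json"]


def check_filter_lang(value: Optional[str]) -> Optional[str]:
    """Return ``value`` unchanged if it is None or an allowed language,
    otherwise raise ValueError. Hypothetical helper for illustration."""
    allowed = get_args(FILTER_LANG)  # ("cql-text", "cql-json")
    if value is not None and value not in allowed:
        raise ValueError(
            f"filter_lang must be one of {allowed}, got {value!r}"
        )
    return value
```

With this check, a value such as `"cql2-text"`, which the docstrings describe as not yet supported by the server, would be rejected before any request is made.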