Commit 0d079ca

thodson-usgs and claude authored
Add waterdata.get_field_measurements_metadata (#268)
Wraps the OGC /collections/field-measurements-metadata collection. Returns one row per (location, parameter) field-measurement series describing its period of record, units, etc., without the underlying observations. Discrete-measurement analogue to get_time_series_metadata.

Mirrors R's read_waterdata_field_meta in DOI-USGS/dataRetrieval, with the same output_id ("field_series_id") and parameter list. Body is the standard service-agnostic dispatch through get_ogc_data, with no new infrastructure required.

Two live tests cover the single-site happy path and the multi-site POST path.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
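
The "service-agnostic dispatch" mentioned above can be pictured roughly as follows. This is a stand-alone sketch, not the library's code: `collect_args` and `fake_get_ogc_data` are hypothetical stand-ins for `_get_args` and `get_ogc_data`, the wrapper takes only three of the real parameters, and the idea that bookkeeping names and unset (`None`) arguments are dropped before dispatch is an assumption about the real helper.

```python
from __future__ import annotations

from typing import Any


def collect_args(local_vars: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for ``_get_args``.

    Assumption: bookkeeping names and unset (None) arguments are dropped.
    """
    skip = {"service", "output_id"}
    return {k: v for k, v in local_vars.items() if k not in skip and v is not None}


def fake_get_ogc_data(args: dict[str, Any], output_id: str, service: str) -> dict[str, Any]:
    """Hypothetical stand-in for ``get_ogc_data``; makes no network request."""
    return {"service": service, "output_id": output_id, "args": args}


def get_field_measurements_metadata_sketch(
    monitoring_location_id=None, parameter_code=None, limit=None
):
    # Only these two names differ from one collection wrapper to the next.
    service = "field-measurements-metadata"
    output_id = "field_series_id"
    # locals() holds the keyword arguments plus the two names above.
    args = collect_args(locals())
    return fake_get_ogc_data(args, output_id, service)


request = get_field_measurements_metadata_sketch(
    monitoring_location_id="USGS-02238500"
)
print(request["args"])  # {'monitoring_location_id': 'USGS-02238500'}
```

Under these assumptions, adding a wrapper for a new collection only requires a new signature, a `service` string, and an `output_id`, which would be consistent with the "no new infrastructure required" remark.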
1 parent f10f08b commit 0d079ca

4 files changed

Lines changed: 152 additions & 0 deletions

NEWS.md

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+**05/06/2026:** Added `waterdata.get_field_measurements_metadata(...)` — wraps the OGC `field-measurements-metadata` collection. Returns one row per (location, parameter) field-measurement series describing its period of record, units, etc., without the underlying observations. Discrete-measurement analogue to `get_time_series_metadata`. Mirrors R's `read_waterdata_field_meta`.
+
 **05/05/2026:** Added `waterdata.get_combined_metadata(...)` — wraps the Water Data API's `combined-metadata` collection, which joins the monitoring-locations catalog with the time-series-metadata catalog and returns one row per (location, parameter, statistic) inventory entry. This is the most flexible "what data is available" endpoint in the API: any location attribute (state, HUC, site type, drainage area, well-construction depth, …) can be combined with any time-series attribute (parameter code, statistic, data type, period of record, …) in a single query. Mirrors R's `read_waterdata_combined_meta`.
 
 **05/05/2026:** Added `waterdata.get_samples_summary(monitoringLocationIdentifier=...)` — wraps the Samples database `/summary/{id}` endpoint, returning per-characteristic result and activity counts plus first / most recent activity dates for a single monitoring location. Useful for taking inventory of available discrete-sample data before pulling observations with `get_samples`.

dataretrieval/waterdata/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -17,6 +17,7 @@
     get_continuous,
     get_daily,
     get_field_measurements,
+    get_field_measurements_metadata,
     get_latest_continuous,
     get_latest_daily,
     get_monitoring_locations,
@@ -48,6 +49,7 @@
     "get_continuous",
     "get_daily",
     "get_field_measurements",
+    "get_field_measurements_metadata",
     "get_latest_continuous",
     "get_latest_daily",
     "get_monitoring_locations",

dataretrieval/waterdata/api.py

Lines changed: 117 additions & 0 deletions
@@ -1761,6 +1761,123 @@ def get_field_measurements(
     return get_ogc_data(args, output_id, service)
 
 
+def get_field_measurements_metadata(
+    monitoring_location_id: str | list[str] | None = None,
+    parameter_code: str | list[str] | None = None,
+    parameter_name: str | list[str] | None = None,
+    parameter_description: str | list[str] | None = None,
+    begin: str | list[str] | None = None,
+    end: str | list[str] | None = None,
+    last_modified: str | list[str] | None = None,
+    properties: str | list[str] | None = None,
+    skip_geometry: bool | None = None,
+    bbox: list[float] | None = None,
+    limit: int | None = None,
+    filter: str | None = None,
+    filter_lang: FILTER_LANG | None = None,
+    convert_type: bool = True,
+) -> tuple[pd.DataFrame, BaseMetadata]:
+    """Get field-measurement metadata: one row per (location, parameter) series.
+
+    Each row describes a single field-measurement series — what parameter is
+    measured at the location, the period of record (``begin`` / ``end``), the
+    units, and so on — without returning the underlying observations
+    themselves. Use :func:`get_field_measurements` to fetch the values.
+
+    This is the discrete-measurement analogue to
+    :func:`get_time_series_metadata` (which describes daily and continuous
+    series). It's primarily useful for inventory queries: "what
+    field-measurement parameters does this site have, and over what date
+    range?"
+
+    See the OpenAPI reference for the full list of supported fields:
+    https://api.waterdata.usgs.gov/ogcapi/v0/openapi?f=html#/field-measurements-metadata
+    The R analogue is ``read_waterdata_field_meta`` in
+    https://github.com/DOI-USGS/dataRetrieval/.
+
+    Parameters
+    ----------
+    monitoring_location_id : string or list of strings, optional
+        A unique identifier representing a single monitoring location, in
+        ``AGENCY-ID`` form (e.g. ``"USGS-02238500"``).
+    parameter_code : string or list of strings, optional
+        5-digit parameter code. See
+        https://help.waterdata.usgs.gov/codes-and-parameters/parameters.
+    parameter_name : string or list of strings, optional
+        A human-understandable name corresponding to ``parameter_code``.
+    parameter_description : string or list of strings, optional
+        A human-readable description of what is being measured.
+    begin, end, last_modified : string, optional
+        Datetime fields that accept either an RFC 3339 datetime, an
+        interval (``"start/end"``, optionally half-bounded with ``..``),
+        or an ISO 8601 duration (e.g. ``"P1M"``, ``"PT36H"``). See
+        :func:`get_time_series_metadata` for the full grammar.
+    properties : string or list of strings, optional
+        Subset of columns to return. Defaults to every available property.
+    skip_geometry : boolean, optional
+        Skip per-feature geometries; the returned object will be a plain
+        ``DataFrame`` with no spatial information.
+    bbox : list of numbers, optional
+        Only features whose geometry intersects the bounding box are
+        selected. Format: ``[xmin, ymin, xmax, ymax]`` in CRS 4326
+        (longitude / latitude, west-south-east-north).
+    limit : numeric, optional
+        Page size; the maximum allowable value is 50000. Default
+        (``None``) requests the maximum allowable limit.
+    filter, filter_lang : optional
+        Server-side CQL filter passed through as the OGC ``filter`` /
+        ``filter-lang`` query parameters. See
+        :mod:`dataretrieval.waterdata.filters` for syntax, auto-chunking,
+        and the lexicographic-comparison pitfall.
+    convert_type : boolean, optional
+        If True, converts columns to appropriate types.
+
+    Returns
+    -------
+    df : ``pandas.DataFrame`` or ``geopandas.GeoDataFrame``
+        Formatted data returned from the API query.
+    md : :obj:`dataretrieval.utils.Metadata`
+        A custom metadata object pertaining to the query.
+
+    Examples
+    --------
+    .. code::
+
+        >>> # All field-measurement series at a surface-water site
+        >>> df, md = dataretrieval.waterdata.get_field_measurements_metadata(
+        ...     monitoring_location_id="USGS-02238500"
+        ... )
+
+        >>> # Same, for a groundwater well
+        >>> df, md = dataretrieval.waterdata.get_field_measurements_metadata(
+        ...     monitoring_location_id="USGS-375907091432201"
+        ... )
+
+        >>> # Multi-site, narrowed to two parameter codes
+        >>> df, md = dataretrieval.waterdata.get_field_measurements_metadata(
+        ...     monitoring_location_id=[
+        ...         "USGS-451605097071701",
+        ...         "USGS-263819081585801",
+        ...     ],
+        ...     parameter_code=["62611", "72019"],
+        ... )
+
+        >>> # Series modified in the last year — useful for incremental ETL
+        >>> df, md = dataretrieval.waterdata.get_field_measurements_metadata(
+        ...     monitoring_location_id="USGS-375907091432201",
+        ...     parameter_code="72019",
+        ...     last_modified="P1Y",
+        ... )
+
+    """
+    service = "field-measurements-metadata"
+    output_id = "field_series_id"
+
+    args = _get_args(locals())
+
+    return get_ogc_data(args, output_id, service)
+
+
 def get_reference_table(
     collection: str,
     limit: int | None = None,
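
The docstring above lists three accepted forms for `begin`, `end`, and `last_modified`: a single RFC 3339 datetime, a `"start/end"` interval that may be left open on one side with `..`, and an ISO 8601 duration. A minimal sketch of building the first two from `datetime` objects might look like this; `rfc3339` and `datetime_interval` are hypothetical helpers, not part of `dataretrieval`, and the exact timestamp precision the service accepts has not been checked here.

```python
from __future__ import annotations

from datetime import datetime, timezone


def rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def datetime_interval(
    start: datetime | None = None, end: datetime | None = None
) -> str:
    """Hypothetical helper: build "start/end", using ".." for an open side."""
    if start is None and end is None:
        raise ValueError("at least one bound is required")
    left = rfc3339(start) if start is not None else ".."
    right = rfc3339(end) if end is not None else ".."
    return f"{left}/{right}"


since_2020 = datetime_interval(start=datetime(2020, 1, 1, tzinfo=timezone.utc))
window = datetime_interval(
    start=datetime(2020, 1, 1, tzinfo=timezone.utc),
    end=datetime(2021, 1, 1, tzinfo=timezone.utc),
)
print(since_2020)  # 2020-01-01T00:00:00Z/..
print(window)      # 2020-01-01T00:00:00Z/2021-01-01T00:00:00Z
```

Either string could then be passed as, for example, `last_modified=since_2020`. Durations such as `"P1Y"` need no helper and can be passed as plain strings, as in the docstring's last example.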

tests/waterdata_test.py

Lines changed: 31 additions & 0 deletions
@@ -13,6 +13,7 @@
     get_continuous,
     get_daily,
     get_field_measurements,
+    get_field_measurements_metadata,
     get_latest_continuous,
     get_latest_daily,
     get_monitoring_locations,
@@ -368,6 +369,36 @@ def test_get_combined_metadata_multi_site_post():
     assert (df["parameter_code"] == "00060").all()
 
 
+def test_get_field_measurements_metadata():
+    df, md = get_field_measurements_metadata(
+        monitoring_location_id="USGS-02238500", skip_geometry=True
+    )
+    assert "field_series_id" in df.columns
+    assert "begin" in df.columns
+    assert "end" in df.columns
+    assert (df["monitoring_location_id"] == "USGS-02238500").all()
+    assert hasattr(md, "url")
+    assert hasattr(md, "query_time")
+
+
+def test_get_field_measurements_metadata_multi_site():
+    df, _ = get_field_measurements_metadata(
+        monitoring_location_id=[
+            "USGS-07069000",
+            "USGS-07064000",
+            "USGS-07068000",
+        ],
+        parameter_code="00060",
+        skip_geometry=True,
+    )
+    assert (df["parameter_code"] == "00060").all()
+    assert set(df["monitoring_location_id"].unique()) == {
+        "USGS-07069000",
+        "USGS-07064000",
+        "USGS-07068000",
+    }
+
+
 def test_get_reference_table():
     df, md = get_reference_table("agency-codes")
     assert "agency_code" in df.columns