Auto-retry live-API tests on transient upstream errors by thodson-usgs · Pull Request #277 · DOI-USGS/dataretrieval-python

thodson-usgs · 2026-05-15T14:57:51Z

Summary

After #273 merged, the next CI run on main failed with a transient HTTP 502: Bad Gateway from the live USGS Water Data API surfacing through test_get_field_measurements (run). The 502 was real upstream behavior — pre-#273, _walk_pages would have silently swallowed it and the test would have passed on truncated data; post-#273, _raise_for_non_200 correctly propagates it.

That's the intended behavior for users, but it means every existing test that hits the live API is now flaky against any transient blip the USGS API has. There are ~25 such tests in waterdata_test.py. This PR makes the CI test suite tolerant of those transients — no library code changes, no new tests, only test infrastructure.

Fix

tests/waterdata_test.py:

Adds a module-level pytestmark = pytest.mark.flaky(...) declaring the retry policy explicitly at the file scope where live tests live. Opt-in is per-file, not auto-applied by a global collection hook.
Configures pytest-rerunfailures to retry up to twice on a 5-second backoff, but only when the failure trace matches a narrow set of transient-upstream patterns: RuntimeError: 429: / 5\d\d: (from _raise_for_non_200), ConnectionError, and Timeout shapes from the requests stack.
Library bugs that raise other exception types fail on the first attempt — the only_rerun set deliberately omits generic RuntimeError.
The marker is attached to every test in the module, but because the only_rerun regex matches only traces from real network round-trips, mocked tests (those using requests_mock or mock.patch) are no-ops for the rerun by construction.

pyproject.toml: adds pytest-rerunfailures to the test optional-deps set.

Also bumps actions/checkout@v4 → @v6 and actions/setup-python@v5 → @v6 across all three workflows ahead of the GitHub Actions Node 20 deprecation (warning surfaced on the same CI run).

Design note

An earlier revision of this PR placed the retry logic in tests/conftest.py with a pytest_collection_modifyitems hook that auto-applied the marker based on a _PUBLIC_GETTERS source-inspection heuristic. That approach was rejected during review in favor of explicit module-level opt-in (commit 047130f): the test author who imports the live API declares the marker at the top of the file. Trade-off is intentional — small amount of "marker on a mocked test" in exchange for not maintaining implicit auto-marking infrastructure.

What this PR is NOT

No library behavior changes — only test infrastructure.
No new tests — the existing tests are the thing being kept green; adding tests for the marker itself would be infrastructure-on-infrastructure.
No NEWS entry — CI plumbing isn't user-facing.

Test plan

pip install -e '.[test]' resolves cleanly with the new dep
pytest --collect-only shows the marker attached to 61 parametrize-expanded items in waterdata_test.py and zero items outside it
All offline tests in waterdata_test.py pass unchanged
CI on this branch validates end-to-end against real CI infrastructure

🤖 Generated with Claude Code

DOI-USGS#273 surfaced a transient HTTP 502 from upstream on its merge-to-main CI run that pre-PR-273 would have been silently swallowed by _walk_pages and turned into an empty DataFrame. The status-code-aware behavior is correct for users, but it makes every test that hits the live USGS Water Data API susceptible to flaking on a transient blip. This PR adds: - pytest-rerunfailures to the `test` optional dependency set. - tests/conftest.py that (a) registers a `live` pytest marker; (b) auto-applies it to every test that does not take `requests_mock` as a fixture (the existing mock-driven convention in this repo); and (c) configures live-marked tests to retry up to twice on a 5-second backoff — but ONLY when the failure trace matches one of a narrow set of transient-upstream patterns: `429:` / `5xx:` prefixes from `_raise_for_non_200`, `ConnectionError` shapes, and timeout strings from the requests/urllib3 stack. Library bugs raising other exception types are not retried — the `only_rerun` pattern set deliberately matches on the failure-text shape produced by `_raise_for_non_200` rather than on generic `RuntimeError`, so a real bug fails on the first attempt. Verified: pytest collection identifies 59 of 61 tests in waterdata_test.py as live (the two `*_mock_*` tests correctly stay unmarked). Offline tests (`_get_args`, `_check_*`, `_normalize_str_*`, `_construct_api_requests_*`) pass unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@v6

Simplify-review consensus across three reviewers (reuse / quality / efficiency) on the conftest.py from the previous commit: - Drop the `pytest.mark.live` marker double-application: no `-m live` selector is wired anywhere in CI; only `pytest.mark.flaky` carries behavior. The `live` marker plus its `pytest_configure` registration was dead metadata. - Tighten `_TRANSIENT_PATTERNS`: drop `r"timed out"` (subsumed by the `Timeout` arm) and `r"Bad Gateway|Service Unavailable|Gateway Timeout"` (subsumed by the `RuntimeError:\s*5\d\d:` prefix from `_raise_for_non_200`). Three load-bearing patterns left. - Drop defensive `getattr(item, "fixturenames", ())` — pytest items reliably expose `fixturenames`; the default-tuple fallback is dead defense. - Shorten the module docstring: keep the WHY paragraph (PR DOI-USGS#273 → surfaced errors → CI flake risk) and drop the WHAT bullets that narrate the code below. - Add a one-line justification for the 5-second backoff (USGS upstream typically recovers from brief 5xx within that window). - Tighten the NEWS.md entry: drop conftest implementation detail (registers a marker, applies via collection hook, …) and keep the user-relevant claim. Skipped from the review (reviewer can decide): - Inverting the auto-mark heuristic to opt-in `@pytest.mark.live` would force decorating ~35 test functions; the current "everything not using requests_mock is live" is zero-cost on success and self-extending for new tests. - Centralizing the transient-error regex set in `dataretrieval/ waterdata/utils.py` would couple library code to test infrastructure for one pattern. Separately, bump GitHub Actions to Node-24-compatible versions (@v6 across all three workflows) ahead of the deprecation warning surfaced on the previous CI run: actions/checkout@v4 → v6, actions/setup-python@v5 → v6. Node 20 is deprecated and will be removed from runners on 2026-09-16; the default flips on 2026-06-02. NEWS.md entry updated to mention the bump. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

This PR exists only to keep existing tests green against transient upstream blips; no user-facing behavior changes. NEWS entries are for library behavior changes, not CI plumbing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR aims to reduce CI flakiness from transient upstream failures in live USGS Water Data API tests by automatically re-running only those tests when failures match narrow “transient” error signatures. It also updates GitHub Actions workflow action versions.

Changes:

Add tests/conftest.py to mark non-requests_mock tests as flaky and auto-rerun them (with delay) only on transient upstream/network error patterns.
Add pytest-rerunfailures to the test optional dependency set.
Bump actions/checkout and actions/setup-python versions across workflows.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
tests/conftest.py	Adds collection hook to auto-mark “live” tests for conditional reruns on transient errors.
pyproject.toml	Adds `pytest-rerunfailures` to test extras.
.github/workflows/sphinx-docs.yml	Updates checkout/setup-python action versions.
.github/workflows/python-publish.yml	Updates checkout/setup-python action versions (but see review comment re: YAML validity).
.github/workflows/python-package.yml	Updates checkout/setup-python action versions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

The "auto-mark every test that doesn't take requests_mock" heuristic was overkill — out of 244 collected tests it marked 200, including every pure unit test of internal helpers (rdb parsing, _to_str, CQL filter splitting, NWIS_Metadata setters, validation logic, etc). Marker over-application is cheap on the success path but conveys the wrong intent and is one more thing to keep in sync. Refined heuristic: a test is live iff (a) it does not take requests_mock as a fixture, AND (b) its function-body source references one of the package's public user-facing getters (the _PUBLIC_GETTERS set — all `get_*` / `what_*` entry points exposed by waterdata, wqp, samples, nadp, streamstats, nldi, nwis, and ngwmn modules). The set is maintained explicitly because there's no programmatic way for the conftest to know "what counts as a user-facing entry point"; adding a new public getter means adding it here. A regex with `\b` word boundaries avoids matching internal helpers that happen to share a prefix (e.g. `_get_args` won't match `get_args`). Dry-run against the current test inventory: - Old heuristic marked: 200 / 244 - New heuristic marked: 89 / 244 - Now-unmarked unit tests: 111 Spot-checked the 111 de-marked tests — all are genuinely offline: rdb_test.py parsers, utils_test.py to_str/BaseMetadata/datetime helpers, waterdata_filters_test.py CQL-filter logic, nldi_test.py validators, nwis_test.py metadata setters, etc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

tests/ngwmn_test.py is local-only work not yet in the repo. The ngwmn module itself is committed, but no upstream tests reference it through the conftest's selection path, so listing get_data/get_levels/get_sites in _PUBLIC_GETTERS adds maintenance surface for no upstream benefit. Add these back if/when ngwmn tests land. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PR DOI-USGS#273 fixed pagination loops in dataretrieval.waterdata.utils specifically; other modules (wqp, samples, nadp, streamstats, nldi, nwis) reach the network through different paths and don't share that regression. Narrow _PUBLIC_GETTERS to waterdata's 16 user-facing getters and add a tests/waterdata file-path scope check, which also prevents name-collisions with legacy nwis getters (get_ratings, get_daily) from spuriously marking nwis tests. After this change, the retry marker is applied to 55 tests across tests/waterdata_test.py, tests/waterdata_nearest_test.py, tests/waterdata_filters_test.py, and tests/waterdata_ratings_test.py, with zero matches outside that directory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous conftest hook auto-applied pytest.mark.flaky to tests by inspecting function source for references to a central _PUBLIC_GETTERS frozenset. This was implicit and required keeping that set in sync with the public API. Switch to explicit opt-in: drop conftest.py and declare the marker at the top of waterdata_test.py via pytestmark. The other waterdata_*_test.py files are entirely mocked or pure unit tests and need no retry. The retry config (reruns count, delay, transient-error trace patterns) is co-located with the tests it protects. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Make explicit that the marker is attached to every test in the module, but the only_rerun patterns are tied to real-network failure traces, so mocked tests are no-ops for the rerun by construction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

thodson-usgs and others added 3 commits May 15, 2026 09:57

thodson-usgs requested a review from Copilot May 15, 2026 15:05

Copilot started reviewing on behalf of thodson-usgs May 15, 2026 15:06 View session

Copilot AI reviewed May 15, 2026

View reviewed changes

Comment thread .github/workflows/python-publish.yml

thodson-usgs and others added 4 commits May 15, 2026 10:12

thodson-usgs requested a review from Copilot May 15, 2026 15:57

Copilot started reviewing on behalf of thodson-usgs May 15, 2026 15:58 View session

Copilot AI reviewed May 15, 2026

View reviewed changes

Comment thread tests/waterdata_test.py Outdated

thodson-usgs requested a review from Copilot May 15, 2026 16:12

Copilot started reviewing on behalf of thodson-usgs May 15, 2026 16:12 View session

Copilot AI reviewed May 15, 2026

View reviewed changes

Comment thread tests/waterdata_test.py

thodson-usgs requested a review from Copilot May 15, 2026 16:48

Copilot started reviewing on behalf of thodson-usgs May 15, 2026 16:49 View session

Copilot AI reviewed May 15, 2026

View reviewed changes

thodson-usgs marked this pull request as ready for review May 15, 2026 18:49

thodson-usgs merged commit c3bc2b8 into DOI-USGS:main May 15, 2026
16 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Auto-retry live-API tests on transient upstream errors#277

Auto-retry live-API tests on transient upstream errors#277
thodson-usgs merged 8 commits into
DOI-USGS:mainfrom
thodson-usgs:fix/live-test-retries

thodson-usgs commented May 15, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

thodson-usgs commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fix

Design note

What this PR is NOT

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

thodson-usgs commented May 15, 2026 •

edited

Loading