Polish OpenSHA v25.4 port, upgrade pyrecodes, swap to pandarm #428

Merged
fmckenna merged 10 commits into NHERI-SimCenter:master from zsarnoczay:master on May 20, 2026

@zsarnoczay zsarnoczay commented May 20, 2026

A polish pass on the existing OpenSHA v25.4.1 port, an OpenSHA jar
upgrade to v26.1.1, plus two unrelated dependency upgrades. Together,
these are a step on the path to modern macOS arm64 support and numpy 2
compatibility.

OpenSHA polish + v26.1.1 jar bump (8 commits, regionalGroundMotion)

Builds on the v1.5.2 → v25.4.1 upgrade work landed earlier:
consolidates the parallel FetchOpenSHA_25_4.py into the canonical
FetchOpenSHA.py path while preserving the pre-upgrade module as
FetchOpenSHA_1.5.2.py for users who still need it. Adds a multi-arch
JVM lookup so JPype picks the right architecture when several JDKs are
installed, pins point-source distance corrections to NSHM_2013, fixes
a Vs30 provider-index regression introduced by v25.4 (Thompson 2022
shifted the provider order), drops runtime pip install scaffolding,
replaces ~30 sys.exit() calls with exceptions, and swaps ~30 star
imports for explicit named imports. Bumps the OpenSHA jar from
v25.4.3 to v26.1.1 (no API changes required; picks up multi-server
ERF download fallback and GetFile offline support).

pyrecodes 0.3.0 (performREC)

Adopts the new pyrecodes.main.run(...) entry point and API; replaces
hard-coded calculator indices with configuration-driven lookups. Adds a
pandana → pandarm sys.modules shim (pyrecodes still imports
pandana via a vendored helper) and a libomp preload on macOS.
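
A minimal sketch of the sys.modules aliasing idea described above, not the PR's actual shim. The module names `legacy_pkg` and `replacement_pkg`, the helper `install_alias`, and the placeholder `Network` class are all hypothetical stand-ins so that the example runs without pandana or pandarm installed.

```python
import importlib
import sys
import types


def install_alias(alias, target, submodule, names):
    """Expose selected attributes of `target` as `alias` and `alias.<submodule>`."""
    top = types.ModuleType(alias)
    sub = types.ModuleType(f"{alias}.{submodule}")
    for name in names:
        obj = getattr(target, name)
        setattr(top, name, obj)
        setattr(sub, name, obj)
    setattr(top, submodule, sub)
    # Registering both names lets `import alias.submodule` resolve from the
    # module cache instead of searching the file system.
    sys.modules[alias] = top
    sys.modules[f"{alias}.{submodule}"] = sub
    return top


# Hypothetical stand-in for the replacement package (pandarm in the PR).
replacement = types.ModuleType("replacement_pkg")


class Network:
    """Placeholder for the routing class the vendored helper needs."""


replacement.Network = Network

install_alias("legacy_pkg", replacement, "network", ["Network"])

# Code written against the old name now receives the replacement's class.
legacy_network = importlib.import_module("legacy_pkg.network")
same_class = legacy_network.Network is Network
```

The alias has to be installed before the first import of the code that still uses the old name; once that code drops the old import, the alias can be deleted without other changes.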

pandana → pandarm (ResidualDemand)

pandana is unmaintained, pins numpy<2.0, and ships no macOS arm64
wheels. pandarm is a drop-in fork. Also replaces np.in1d (removed in
numpy 2.0) with np.isin.

zsarnoczay added 10 commits May 18, 2026 10:52
…hazard

pandana is unmaintained (last release 2022), pins numpy<2.0, and ships
no macOS arm64 wheels — three blockers for a clean backend install on
modern macOS. pandarm is an actively maintained fork that preserves
the C++ routing engine (unlike pandana2, which trades the C++
extension for a pure-Python rewrite) and reinstates numpy 2
compatibility.

The migration in transportation.py is a one-line swap. pandarm exposes
Network at its top level with the same constructor and method
signatures as pandana.network.Network, so every pdna.* call site
(Network, set, shortest_paths, shortest_path_lengths, node_ids) works
unchanged. API parity verified against upstream pandana 0.7 and
pandarm sources.

Also fix the only numpy-2 removal surfaced by an audit of this module:
run_residual_demand.py:103 used np.in1d (removed in numpy 2.0).
Replace with np.isin — both inputs are 1-D, so the semantics are
identical — and drop the # noqa: NPY201 suppression.
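
As an illustration of the np.in1d replacement, here is a short sketch with made-up 1-D arrays (the array names and values are hypothetical, not taken from run_residual_demand.py). For 1-D inputs np.isin returns a boolean mask with the same shape as its first argument.

```python
import numpy as np

# Hypothetical 1-D inputs standing in for the arrays at the call site.
node_ids = np.array([10, 20, 30, 40])
closed_ids = np.array([20, 40])

# True where an element of node_ids also appears in closed_ids.
mask = np.isin(node_ids, closed_ids)
open_ids = node_ids[~mask]
```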
…-code removal

Adopts the pyrecodes 0.3.0 API surface, papers over the upstream rough
edges that block a local install on macOS arm64, switches from hard-coded
calculator indices to a configuration-driven lookup, and deletes a
stale duplicate transportation.py from modules/performREC. Example 17
now runs end-to-end against pyrecodes 0.3 in R2D.

modules/performREC/pyrecodes/run_pyrecodes.py

- Adopt pyrecodes.main.run(folder_name, main_file_name). pyrecodes 0.3
  resolves every path in the main JSON relative to folder_name, so
  modify_main_file now copies the component library into the per-
  realization run directory and writes basename references for both
  the component library and the system configuration.
- Rename R2D_GeoVisualizer -> R2DGeoVisualizer (upstream class rename).
- Rename create_current_state_figure -> create_current_recovery_state_figure.
  Upstream split the prior method into a low-level primitive plus a
  v0.2-compatible wrapper; the wrapper preserves the original "render
  component states at time T" semantics.
- Rename the create_recovery_gif kwarg file_name -> load_file_name.
- Replace hard-coded resilience_calculators[0]/[1] with two helpers:
  find_display_recodes_calculator(system) returns the first
  ReCoDeSCalculator with scope="All"; find_recovery_time_calculator(system)
  returns the first calculator exposing component_recovery_times. If
  either is missing, the affected step is skipped with a warning
  rather than silently using the wrong calculator.
- Drop the redundant default-calc-id save_supply_demand_consumption call
  from the per-resource loop; the per-calculator loop above already
  writes the same files for every ReCoDeS calculator.
- Move save_component_recovery_progress out of the per-resource loop.
  pyrecodes 0.3 writes a single aggregated
  R2D_component_recovery_progress.json, so the per-resource calls
  produced the same file N times.
- Add a sys.modules shim that exposes pandarm under the pandana name.
  pyrecodes 0.3 still bundles residual_demand_API/transportation.py
  (a vendored older SimCenter file) which top-level imports
  pandana.network. The shim bridges the only API the bundled file
  uses (Network class with .set, .shortest_paths,
  .shortest_path_lengths, .node_ids). Removable once pyrecodes ships
  a release that has dropped the pandana import.
- Preload libomp on macOS via ctypes.CDLL(..., RTLD_GLOBAL) before
  importing pandarm. Pandarm's cyaccess extension uses OpenMP
  symbols without an explicit libomp load command, so dyld can't
  find them in R2D's stripped-env subprocess. Mirrors the preload
  block in systemPerformance/ResidualDemand/transportation.py.
- Silence pandarm's CRS-default UserWarning. residual_demand_API
  rebuilds the Network every substep without passing a CRS, which
  flooded the log with hundreds of warnings.
- Replace np.in1d with np.isin (np.in1d was removed in numpy 2.0).
- Add a completion print at the end of run_pyrecodes naming the
  aggregated results file and the per-realization output directory.
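
The calculator lookup described in the list above could look roughly like the sketch below. The helper names come from the commit message, but their bodies are a guess at the described behavior, and the two calculator classes are hypothetical stand-ins for pyrecodes objects (only the class name, `scope`, and `component_recovery_times` matter here).

```python
import warnings
from types import SimpleNamespace


class ReCoDeSCalculator:
    """Hypothetical stand-in; not the pyrecodes class."""

    def __init__(self, scope):
        self.scope = scope


class RecoveryTimeCalculator:
    """Hypothetical stand-in exposing component_recovery_times."""

    def __init__(self):
        self.component_recovery_times = {}


def find_display_recodes_calculator(system):
    """Return the first ReCoDeSCalculator with scope 'All', else None."""
    for calc in system.resilience_calculators:
        is_recodes = type(calc).__name__ == "ReCoDeSCalculator"
        if is_recodes and getattr(calc, "scope", None) == "All":
            return calc
    return None


def find_recovery_time_calculator(system):
    """Return the first calculator exposing component_recovery_times, else None."""
    for calc in system.resilience_calculators:
        if hasattr(calc, "component_recovery_times"):
            return calc
    return None


system = SimpleNamespace(
    resilience_calculators=[
        ReCoDeSCalculator("Housing"),
        RecoveryTimeCalculator(),
        ReCoDeSCalculator("All"),
    ]
)

display_calc = find_display_recodes_calculator(system)
recovery_calc = find_recovery_time_calculator(system)
if display_calc is None:
    # Skip the step with a warning instead of falling back to a fixed index.
    warnings.warn("No ReCoDeS calculator with scope 'All'; skipping figures.")
```

Selecting by property rather than by position keeps the lookup correct when the configuration lists calculators in a different order.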

modules/performREC/

- Delete transportation.py. The file was a duplicate of
  systemPerformance/ResidualDemand/transportation.py with no active
  importers in the backend. Active recovery dispatch always goes
  through pyrecodes/run_pyrecodes.py. The CMake registration is
  removed alongside.
…dule

Surgical port of FetchOpenSHA.py to OpenSHA v25.4.1: the file is otherwise
the legacy module with inline `# v25.4:` comments marking each forced
change (jar path, geo/json shadowing, PtSrcDistCorr removal, GMM
constructor signatures, PointSource zHyp). The pre-upgrade module is
snapshotted as FetchOpenSHA_1.5.2.py for users who still need the legacy
jar.

Reviewer note: git's default rename heuristic pairs the removed
FetchOpenSHA_25_4.py (Frank's parallel WIP, now deleted) with
FetchOpenSHA_1.5.2.py. The actual mapping is:

  FetchOpenSHA.py (v1.5.2)  -> FetchOpenSHA_1.5.2.py (verbatim + 3-line header)
  FetchOpenSHA.py           -> FetchOpenSHA.py (surgical v25.4 changes)
  FetchOpenSHA_25_4.py      -> deleted (folded into FetchOpenSHA.py)

calculation_single_proc.py still pins the legacy jar and is not invoked by
R2D; a warning header marks it as out of scope for this upgrade.

Subsequent commits in this series layer audit fixes, JVM bootstrap
hardening, scaffolding removal, and import cleanups on top.

Three audit fixes on top of the v25.4 base port; each addresses a silent
v25.4 regression that the inline `# v25.4:` comments cover in more detail:

  - NGAW2_Wrappers: bind the nested wrapper classes explicitly, since
    v25.4 reorganized them under a class (not a sub-package) and the
    star import was importing nothing.

  - Vs30 fallback: look up the Wald & Allen global provider by name
    instead of by index. v25.4 inserted Thompson 2022 at index 0 and
    shifted the global provider from index 3 to index 4; the old
    hard-coded index now returns the CA-only Wills Map 2006.

  - Point-source distance correction: pin it to NSHM_2013 (the current
    USGS standard) per ERF rather than rely on the v25.4 per-ERF
    defaults (UCERF2/MeanUCERF2 still default to NSHM_2008).
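
The Vs30 fallback fix above amounts to selecting a provider by name instead of by position. The sketch below uses plain Python objects rather than OpenSHA classes; the `Provider` dataclass, the `find_provider` helper, the two "placeholder" entries, and the exact name strings are assumptions made for illustration. Only the positions of Thompson 2022, Wills Map 2006, and the global provider follow the commit message.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    coverage: str


# Order as described for v25.4: Thompson 2022 inserted at index 0,
# the global provider pushed from index 3 to index 4.
providers = [
    Provider("Thompson 2022", "unspecified"),
    Provider("placeholder A", "unspecified"),
    Provider("placeholder B", "unspecified"),
    Provider("Wills Map 2006", "California only"),
    Provider("Wald & Allen global", "global"),
]


def find_provider(candidates, name_fragment):
    """Return the first provider whose name contains name_fragment."""
    for provider in candidates:
        if name_fragment.lower() in provider.name.lower():
            return provider
    raise LookupError(f"No site-data provider matching {name_fragment!r}")


by_old_index = providers[3]  # silently wrong after the reorder
by_name = find_provider(providers, "Wald & Allen")
```

A name-based lookup fails loudly if the provider disappears, whereas a stale index keeps returning whichever provider happens to occupy that slot.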

Also guards `updateForecast()` with `if erf is not None:` so an
unrecognized ERF name no longer crashes with `None.updateForecast()`.

When JDKs of different architectures are installed side by side (for
example, an x86_64 JDK next to an arm64 JDK on an Apple Silicon Mac),
JPype's default JVM lookup picks the first one it finds and may load a
JVM whose architecture does not match Python's, failing with
`FileNotFoundError: JVM DLL not found`.

GlobalVariable.find_compatible_jvm_path() resolves the right JVM via
`/usr/libexec/java_home -a <arch>` on macOS and a JAVA_HOME / Program
Files scan on Windows; on Linux and other platforms it returns None and
lets JPype fall back to its default lookup. The four entry-point scripts
(FetchOpenSHA, HazardSimulation, HazardSimulationEQ, ScenarioForecast)
now pass the result through to jpype.startJVM(jvmpath=...) unconditionally.
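
A rough sketch of the macOS branch of such a lookup follows. It is not the code in GlobalVariable: the function names `java_home_arch` and `sketch_find_jvm_path` are hypothetical, the libjvm locations checked are assumptions about common JDK layouts, and the Windows scan is omitted, so every non-macOS platform returns None here.

```python
import platform
import subprocess
import sys
from pathlib import Path


def java_home_arch(machine):
    """Map a platform.machine() value to the arch name passed to java_home -a."""
    return "arm64" if machine.lower() in ("arm64", "aarch64") else "x86_64"


def sketch_find_jvm_path():
    """Return a libjvm path matching the running Python, or None."""
    if sys.platform != "darwin":
        return None  # let JPype use its default lookup
    arch = java_home_arch(platform.machine())
    try:
        result = subprocess.run(
            ["/usr/libexec/java_home", "-a", arch],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    home = Path(result.stdout.strip())
    # Assumed layouts: newer JDKs, then the older jre/ sub-directory layout.
    for relative in ("lib/server/libjvm.dylib", "jre/lib/server/libjvm.dylib"):
        candidate = home / relative
        if candidate.is_file():
            return str(candidate)
    return None


jvm_path = sketch_find_jvm_path()
# The result would then be handed to jpype.startJVM(jvmpath=jvm_path).
```

Returning None rather than raising keeps the call site unconditional, since a None jvmpath leaves the choice to JPype's default lookup.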

HazardSimulation.py was previously calling jpype.startJVM without
importing GlobalVariable; this commit adds the missing import.

Eight files were calling `subprocess.check_call([..., 'pip', 'install', ...])`
at import time to bootstrap their dependencies. All of this is removed:

  - JPype1 install (six entry points)
  - joblib and contextlib install (CreateStation)
  - OpenQuake engine install/uninstall/version-juggling (FetchOpenQuake)

These dependencies are now declared upstream in the SimCenter Python
distribution (nheri_simcenter[r2d]); the bootstrap was redundant in a
properly-configured environment and silently dangerous in a misconfigured
one — it could downgrade an existing OpenQuake install, target the wrong
site-packages with `--user`, or fail mid-import on read-only systems.

FetchOpenQuake's OQVersion handling becomes a detect-and-warn: if the
installed engine version does not match the scenario's OQVersion, the
user gets a warning and the run proceeds with whatever is installed.
Local OpenQuake checkouts via OQLocal still work.
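
One way to express the detect-and-warn behavior is sketched below. The function name `check_engine_version`, the distribution name `openquake.engine`, and the version strings are assumptions for illustration; the `get_version` parameter exists only so the example runs without OpenQuake installed.

```python
import warnings
from importlib import metadata


def check_engine_version(requested, package="openquake.engine",
                         get_version=metadata.version):
    """Warn, rather than reinstall, when the installed version differs."""
    try:
        installed = get_version(package)
    except metadata.PackageNotFoundError:
        raise ImportError(
            f"{package} is not installed; install it before running."
        ) from None
    if requested and installed != requested:
        warnings.warn(
            f"Scenario requests {package} {requested}, but {installed} is "
            "installed; continuing with the installed version.",
            stacklevel=2,
        )
    return installed


# Illustrative version numbers only.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    installed = check_engine_version("3.16.0", get_version=lambda _pkg: "3.17.0")
```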

Stale `import subprocess` / `import importlib` / `import sys` statements
that existed only to support the scaffolding are removed where they
become unused; lines that retain other uses are kept.
…ions

Library code was terminating the process via `sys.exit(message)` on ~45
error conditions across 14 files. `sys.exit` raises SystemExit, which
ordinary `except Exception` handlers in callers do not catch; this
breaks unit-testability and prevents graceful cleanup (e.g., JVM
teardown).

Each `sys.exit(message)` becomes the most appropriate exception:

  - FileNotFoundError:   source rupture file missing
  - ValueError:          bad input (empty file, CRS mismatch, missing IM,
                         duplicated stations, unsupported user choice)
  - NotImplementedError: unsupported model / generator / correlation method
  - OSError:             raster file unreadable
  - RuntimeError:        OpenQuake DbServer / classical-PSHA failures, etc.
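
To make the mapping above concrete, here is a small sketch of the pattern for the missing-file case. The function names and the path are hypothetical; the point is that a caller can handle a specific exception, whereas SystemExit is not a subclass of Exception and passes through `except Exception` handlers.

```python
from pathlib import Path


def load_rupture_file(path):
    """Raise a specific exception instead of calling sys.exit(message)."""
    source = Path(path)
    if not source.is_file():
        raise FileNotFoundError(f"Source rupture file not found: {source}")
    return source.read_text()


def run_with_fallback(path):
    """A caller that can now recover, log, or clean up before exiting."""
    try:
        return load_rupture_file(path)
    except FileNotFoundError as err:
        return f"handled: {err}"


outcome = run_with_fallback("/nonexistent/rupture.json")

# SystemExit derives from BaseException, so `except Exception` misses it.
exit_is_ordinary_exception = issubclass(SystemExit, Exception)
```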

Three top-level `sys.exit(0)` calls at the end of __main__ blocks
(HazardSimulation, HazardSimulationEQ, ScenarioForecast) are dropped: a
process that reaches the end of __main__ already exits with status 0.

`exit()` / `exit(1)` / `exit(-1)` calls inside narrow `except:` blocks
for missing user-supplied input files are kept as deliberate exits
(annotated with `# noqa: PLR1722`).

`import sys` is removed from files where this commit makes it unused.
…med imports

The regionalGroundMotion module relied on ~30 `from X import *` statements.
Star imports made it impossible to see which symbols a file actually needed,
allowed the v25.4 `org.opensha.commons.geo.json` sub-package to silently
shadow Python's stdlib `json`, and brought in dozens of unused Java classes
per file.

Every active star import is replaced with an explicit named import list
limited to the symbols each file actually uses. The legacy snapshot
(FetchOpenSHA_1.5.2.py) is untouched.
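
The json shadowing can be reproduced without a JVM. In the sketch below, `fake_geo_pkg` and its `Location` attribute are hypothetical stand-ins for a package that happens to export a public name `json`; the imports are run through exec() into separate namespaces so the two styles can be compared side by side.

```python
import json
import sys
import types

# Hypothetical package exporting a public attribute called `json`.
fake_geo = types.ModuleType("fake_geo_pkg")
fake_geo.json = types.ModuleType("fake_geo_pkg.json")
fake_geo.Location = type("Location", (), {})
sys.modules["fake_geo_pkg"] = fake_geo

# Star import: every public name is copied, including `json`.
star_ns = {}
exec("import json\nfrom fake_geo_pkg import *", star_ns)
stdlib_json_shadowed = star_ns["json"] is not json

# Named import: only the requested symbol is bound.
named_ns = {}
exec("import json\nfrom fake_geo_pkg import Location", named_ns)
stdlib_json_intact = named_ns["json"] is json
```

After the star import, a call such as `json.dumps(...)` would be looked up on the shadowing object rather than on the standard library module, which is why the aliasing workarounds were needed before this refactor.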

Notable consequences:

  - `import ujson as python_json` / `import json as python_json` aliases
    (workarounds for the geo.json shadowing) are reverted to the natural
    `as json` / `import json` forms in CreateScenario and HazardOccurrence.

  - HazardSimulation and HazardSimulationEQ previously called compute_im,
    create_stations, select_ground_motion etc. while having the matching
    star imports commented out — the explicit imports are now in place
    so the calls actually resolve. (References to download_ground_motion
    and parse_record are NOT added; those are phantom names handled in
    the next commit.)

  - FetchOpenSHA: the defensive try/except around
    PointSourceDistanceCorrection(s) is replaced with a direct
    `from ...utils import PointSourceDistanceCorrections`. The singular
    form was imported but never used.

  - ScenarioForecast's `from FetchOpenQuake import *` block was dead and
    is dropped.

  - `from jpype.types import *` (live in five files) brought in JBoolean,
    JInt, JString etc. that none of the files actually used. The one
    site that needs JInt (testFetchOpenSHA) imports it locally.

  - Several Java/OpenSHA star imports (java.io, java.lang.reflect,
    commons.data.function, commons.param.constraint, sha.calc,
    sha.faultSurface, sha.imr, gcim.calc, gcim.imr.param.*) provided no
    symbols actually referenced by the code and are dropped entirely.

`# noqa: F405` annotations on call sites are left in place.
…er dead imports

After the explicit-imports refactor in the previous commit, seven call
sites in HazardSimulation and HazardSimulationEQ still reference names
that are not actually imported:

  - download_ground_motion, parse_record: SelectGroundMotion.py defines
    these only inside a triple-quoted "Uncommenting below" block, so they
    are not part of its public surface. The NGAWest2 record-download
    branch that calls them is dead code.

  - export_sampled_gmms (HazardSimulation): the actual function is a
    method on the occurrence model object (the EQ variant calls it
    correctly as occurrence_model_gmm.export_sampled_gmms(...)). The
    standalone call here was a pre-existing typo.

  - dir_info (HazardSimulationEQ, two sites): hazard_info['Directory']
    is used in the surrounding scope; dir_info itself is never bound.
    Pre-existing latent bug in the OpenQuake Classical PSHA branch.

All seven sites previously carried `# noqa: F405` annotations, which were
silently misleading: F405 means "may be defined via star import", but
once the star imports are gone (previous commit) it is clear that they
are not defined. They are re-annotated with `# noqa: F821 -- pre-existing
latent bug; ...` and a NOTE comment in each entry-point's __main__ block
documents the phantom imports for future readers.

Also drops three top-level imports in CreateStation.py that earlier
cleanup commits left orphaned:

  - `import sys` (sys.path is only used inside get_soil_model_user(),
    which already has its own local import)
  - `import importlib` (importlib.__import__ is only used inside the
    same function, with its own local import)
  - `import multiprocessing` (only referenced in comments)

Drops the v25.4.3-built `opensha-all.jar` in favor of the v26.1.1
build. All API surfaces this module touches (GMM constructors, ERF
classes, PointSourceDistanceCorrections, NGAW2_Wrappers nested
classes, OrderedSiteDataProviderList, SiteTranslator, etc.) are
unchanged between v25.4.3 and v26.1.1 — no Python code changes were
required beyond bumping the version strings in the `OPENSHA_JAR`
comments and the testFetchOpenSHA banner. Picks up multi-server ERF
download fallback (CARC project2 → OpenSHA alias → hardcoded) and
GetFile v25.11.0 offline ERF support along the way.

fmckenna merged commit 9642d70 into NHERI-SimCenter:master on May 20, 2026
0 of 4 checks passed