From 8fa7dd7c77da2bfedeb4e0fe0e00180f35c79146 Mon Sep 17 00:00:00 2001
From: Philippe Llerena
Date: Sun, 17 May 2026 10:05:02 +0200
Subject: [PATCH 1/2] feat(python): add PackageData.from_strings raw-string fast path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes #88.

## Why

`pyrer.PackageData.from_rez(pkg)` is hot in rez integrations: every
package the shim materialises pays the full cost. The current path walks
rez's `AttributeForwardMeta` per attribute, lets rez parse each
requirement string into a `Requirement` object, and then immediately
calls `str(req)` on every one to round-trip back to the raw string pyrer
wants in the first place.

Per the issue, on the 188-case rez benchmark via the downstream shim
that's ~50 packages materialised per resolve × ~10–20 round-trips per
package: a few percent of total end-to-end wall time, completely
avoidable when the caller has access to `pkg.resource.data` (which
already holds these as raw `list[str]` in the common non-late-bound
case).

## What

A new classmethod symmetric with `from_rez`:

```python
pyrer.PackageData.from_strings(
    name: str,
    version: str,
    requires: Iterable[str] | None = None,
    variants: Iterable[Iterable[str]] | None = None,
) -> PackageData
```

Skips:

- `AttributeForwardMeta` lookup (no `pkg.requires` walk).
- The `Requirement` parse (no `Version` / `VersionRange` AST built).
- The `str(Requirement)` round-trip per requirement.

Accepts `None` for `requires` / `variants` to play well with
`data.get("requires")`, so no `or ()` boilerplate.

## Honest framing on the perf claim

`from_strings` is **functionally equivalent** to the four-arg
constructor `PackageData(name, version, requires, variants)`. Both take
the same fast PyO3 extraction path (PyO3 extracts `Vec<String>` directly
from any iterable of `PyUnicode`, with no `.str()` round-trip). Any
rez-shim caller using the constructor with raw strings from
`pkg.resource.data` already gets this perf today.

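
For readers without the extension built, the contract described above can be modelled in pure Python. This is only a sketch: `PackageDataModel` and `_strict_strings` are hypothetical stand-ins, not pyrer's API, and the real extraction happens in Rust via PyO3. The strictness check is an assumption that mirrors the "raw strings only" contract this patch tests for.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


def _strict_strings(items: Optional[Iterable[object]]) -> List[str]:
    """Treat None as empty and reject anything that is not already a str."""
    out: List[str] = []
    for item in items or ():
        if not isinstance(item, str):
            # Mirror the "raw strings only" contract: never call str() here.
            raise TypeError(f"expected str, got {type(item).__name__}")
        out.append(item)
    return out


@dataclass
class PackageDataModel:
    """Hypothetical stand-in for pyrer.PackageData; not the real PyO3 class."""

    name: str
    version: str
    requires: Optional[Iterable[str]] = None
    variants: Optional[Iterable[Iterable[str]]] = None

    def __post_init__(self) -> None:
        # Normalise once so the constructor and from_strings share one path.
        self.requires = _strict_strings(self.requires)
        self.variants = [_strict_strings(v) for v in self.variants or ()]

    @classmethod
    def from_strings(cls, name, version, requires=None, variants=None):
        # A named alias: same arguments, same result as the constructor.
        return cls(name, version, requires, variants)


data = {"name": "maya", "version": "2024.0"}  # no "requires" / "variants" keys
pd = PackageDataModel.from_strings(
    data["name"], data["version"], data.get("requires"), data.get("variants")
)
print(pd.requires, pd.variants)
```

Routing the classmethod through the constructor keeps a single normalisation path, which is the property the "functionally equivalent" claim relies on.
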
The classmethod form's value isn't a new fast path, it's:

- A named, documented contract ("raw strings only: no wrapper objects,
  no late-bound source code"). Mirrors `from_rez`'s naming.
- Discoverability: a place to land in autocomplete and docs when the
  caller is wondering "what's the fast path?".
- One canonical site to update if we ever add real fast-path
  specialisations (e.g. interning the family name, pre-allocating the
  `Vec`s sized to the iterable's `__len__`).

The docs (`docs/content/docs/getting-started/rez-integration.md`) get a
worked example showing the recommended shim pattern: try `from_strings`
against `pkg.resource.data`, fall back to `from_rez` for `@early` /
`@late`-bound attributes where the raw data is a `SourceCode` instance
instead of a `list[str]`.

## Tests

7 new tests in `tests/test_rich_api.py`:

- `test_from_strings_basic`: happy path
- `test_from_strings_defaults_to_empty`: `requires` / `variants` default
- `test_from_strings_accepts_none_for_collections`: `dict.get` ergonomics
- `test_from_strings_accepts_tuples_and_iterables`: non-list iterables
- `test_from_strings_matches_constructor`: same `PackageData` as `__new__`
- `test_from_strings_drives_solve_like_from_rez`: end-to-end, a solve
  fed via `from_strings` resolves identically to one fed via `from_rez`
  against an equivalent fake-rez Package
- `test_from_strings_rejects_non_string_requires`: contract-strict,
  passing an object with `__str__` raises `TypeError` rather than
  silently stringifying. The contract is "raw strings only"; use
  `from_rez` (or pre-stringify) for object inputs.

## Verification

- `cargo build`: clean.
- `cargo test --lib -p rer-resolver`: **44/44**.
- `pytest tests/`: **94/94** (was 87 + 7 new).
- `cargo test --release -p rer-resolver --test test_rez_benchmark -- --ignored`
  (strict 188-case rez differential): **188/188 in 16.52 s**, unchanged.

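
The "pre-stringify" option mentioned under Tests could look roughly like the sketch below. `prestringify` and `FakeReq` are hypothetical names used for illustration only and are not part of pyrer or rez.

```python
from typing import Iterable, List, Optional, Tuple


class FakeReq:
    """Hypothetical stand-in for a rez Requirement: renders only via __str__."""

    def __init__(self, text: str) -> None:
        self._text = text

    def __str__(self) -> str:
        return self._text


def prestringify(
    requires: Optional[Iterable[object]],
    variants: Optional[Iterable[Iterable[object]]],
) -> Tuple[List[str], List[List[str]]]:
    """Convert requirement-like objects to plain strings (None means empty)."""
    flat = [str(req) for req in requires or ()]
    nested = [[str(req) for req in variant] for variant in variants or ()]
    return flat, nested


requires, variants = prestringify(
    [FakeReq("python-3")],
    [[FakeReq("python-3.10")], [FakeReq("python-3.11")]],
)
# Both results now hold plain str values, which is the shape
# PackageData.from_strings(name, version, requires, variants) expects.
print(requires, variants)
```

Pre-stringifying pays the same `str()` cost the fast path is meant to avoid, so it only makes sense for inputs that are not already raw strings.
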
No new Rust code on the solver hot path, no shape change to `PackageData` itself; this is a one-method addition to the PyO3 bridge. Co-Authored-By: Claude Opus 4.7 --- CHANGELOG.md | 11 ++ crates/rer-python/src/lib.rs | 62 ++++++++++ .../docs/getting-started/rez-integration.md | 44 +++++++ tests/test_rich_api.py | 113 ++++++++++++++++++ 4 files changed, 230 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 2aa6ea4..5d7055f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,6 +14,17 @@ page. ### Added +- **`PackageData.from_strings(name, version, requires=None, variants=None)`** — + classmethod constructor for raw-string callers, symmetric with + `from_rez(pkg)`. Skips rez's `AttributeForwardMeta` chain, the + `Requirement` parse, and the `str(Requirement)` round-trip — the + latter being a measurable fraction of integration overhead on + rez-shim hot paths (per-package, every package, every resolve). + Functionally equivalent to the four-arg constructor; the + classmethod form exists so callers wiring `pkg.resource.data` + through pyrer have a named, documented contract to reach for. + Falls back to `from_rez` for `@early` / `@late`-bound attributes. + Closes #88. - **`load_family` callback** on `pyrer.solve()` — opt-in lazy package discovery: pass `load_family: Callable[[str], list[PackageData]]` and the solver calls it on demand the first time it needs a family it hasn't seen. diff --git a/crates/rer-python/src/lib.rs b/crates/rer-python/src/lib.rs index 75952d8..c6792be 100644 --- a/crates/rer-python/src/lib.rs +++ b/crates/rer-python/src/lib.rs @@ -85,6 +85,16 @@ impl PackageData { /// `pyrer` so every integration site doesn't have to write the same /// extraction loop. `pyrer` itself does not import rez; this method is /// duck-typed and works against any object with the four attributes. 
+ /// + /// **Faster alternative for raw-data callers:** if you already have the + /// raw strings (typically from `pkg.resource.data` on a rez `Package`, + /// which stores `requires` / `variants` as raw `list[str]` / + /// `list[list[str]]` in the common non-late-bound case), prefer + /// [`Self::from_strings`]. It skips the per-attribute + /// `AttributeForwardMeta` lookup, the late-bound wrapping, the + /// `Requirement` parse, and the `str(Requirement)` round-trip — none + /// of which produce a different `PackageData` for the common case, + /// but all of which take real time per package. #[classmethod] fn from_rez(_cls: &Bound<'_, PyType>, pkg: &Bound<'_, PyAny>) -> PyResult { let name: String = pkg.getattr("name")?.extract()?; @@ -98,6 +108,58 @@ impl PackageData { variants, }) } + + /// Build a [`PackageData`] from raw strings, skipping any rez + /// wrapper-object resolution. Use this when you already have raw + /// `(name, version, requires, variants)` data — typically pulled from + /// `pkg.resource.data` on a rez `Package`: + /// + /// ```python + /// data = pkg.resource.data + /// pd = pyrer.PackageData.from_strings( + /// data["name"], + /// data["version"], + /// data.get("requires"), # may be None / list[str] + /// data.get("variants"), # may be None / list[list[str]] + /// ) + /// ``` + /// + /// Faster than [`Self::from_rez`] on rez-integration hot paths because + /// it does not trigger rez's `AttributeForwardMeta` per attribute, does + /// not parse each requirement string into a `Requirement` object, and + /// does not round-trip each `Requirement` back through `__str__`. + /// + /// `requires` and `variants` accept `None` (interpreted as empty), + /// matching `dict.get(...)` ergonomics. + /// + /// Functionally equivalent to the four-arg constructor + /// `PackageData(name, version, requires, variants)` — both take the + /// same fast PyO3 extraction path. 
The classmethod form exists to make + /// the fast path discoverable alongside [`Self::from_rez`] and to give + /// the contract a name in callers' code. Closes #88. + /// + /// **Caveat — late-bound requirements:** for packages where rez stores + /// `requires` or `variants` as a `SourceCode` instance (`@early` / + /// `@late` binding), `pkg.resource.data["requires"]` is *not* a + /// `list[str]` and this method will raise. Fall back to + /// [`Self::from_rez`] for those packages — it walks rez's lazy + /// attribute path which evaluates the source code. + #[classmethod] + #[pyo3(signature = (name, version, requires=None, variants=None))] + fn from_strings( + _cls: &Bound<'_, PyType>, + name: String, + version: String, + requires: Option<Vec<String>>, + variants: Option<Vec<Vec<String>>>, + ) -> Self { + PackageData { + name, + version, + requires: requires.unwrap_or_default(), + variants: variants.unwrap_or_default(), + } + } } /// Pull a flat list of requirement strings from a Python object that is diff --git a/docs/content/docs/getting-started/rez-integration.md b/docs/content/docs/getting-started/rez-integration.md index d633c33..89f79e7 100644 --- a/docs/content/docs/getting-started/rez-integration.md +++ b/docs/content/docs/getting-started/rez-integration.md @@ -79,6 +79,50 @@ is duck-typed — `pyrer` itself does not import rez — so you can also pass any object exposing the same four attributes (e.g. a test fixture). +### Faster construction with `from_strings` + +`from_rez(pkg)` triggers rez's `AttributeForwardMeta` chain on every +attribute and parses each requirement string into a `Requirement` +object only to immediately turn it back into a string.
When you +already have the raw strings, prefer +`PackageData.from_strings(name, version, requires, variants)` — +it skips the wrapper round-trip entirely: + +```python +def build_pyrer_packages_fast(package_paths): + for family in iter_package_families(paths=package_paths): + for pkg in family.iter_packages(): + data = pkg.resource.data + # `data["requires"]` is a raw list[str] in the common + # (non-late-bound) case; fall back to from_rez otherwise. + if isinstance(data.get("requires", []), list) and \ + isinstance(data.get("variants", []), list): + yield pyrer.PackageData.from_strings( + data["name"], + data["version"], + data.get("requires"), + data.get("variants"), + ) + else: + # @early / @late bindings — let rez evaluate them. + yield pyrer.PackageData.from_rez(pkg) +``` + +The `from_strings` method: + +- Skips the per-attribute `AttributeForwardMeta` lookup. +- Skips the `Requirement` parse (no `Version` / `VersionRange` AST + is built then discarded). +- Skips the `str(Requirement)` round-trip per requirement. +- Accepts `None` for `requires` / `variants` (matches + `dict.get(...)` ergonomics — no `or ()` boilerplate needed). + +Functionally equivalent to the four-arg constructor; the +classmethod form exists so the contract has a name. **Always fall +back to `from_rez` for packages with `@early` or `@late` binding** — +in those cases `resource.data["requires"]` is a `SourceCode` +instance, not a `list[str]`, and `from_strings` will raise. 
+ Two notes on this step: - It is **eager** — every package on every path is loaded before the diff --git a/tests/test_rich_api.py b/tests/test_rich_api.py index 3370e8b..c20d2ba 100644 --- a/tests/test_rich_api.py +++ b/tests/test_rich_api.py @@ -265,6 +265,119 @@ class NotAPackage: pyrer.PackageData.from_rez(NotAPackage()) +# --------------------------------------------------------------------------- +# PackageData.from_strings — raw-string fast path (issue #88) +# --------------------------------------------------------------------------- + + +def test_from_strings_basic(): + """All four args supplied as raw strings — no wrapper objects involved.""" + pd = pyrer.PackageData.from_strings( + "maya", + "2024.0", + ["python-3"], + [["python-3.10"], ["python-3.11"]], + ) + assert pd.name == "maya" + assert pd.version == "2024.0" + assert pd.requires == ["python-3"] + assert pd.variants == [["python-3.10"], ["python-3.11"]] + + +def test_from_strings_defaults_to_empty(): + """requires=None and variants=None default to empty lists.""" + pd = pyrer.PackageData.from_strings("foo", "1.0") + assert pd.requires == [] + assert pd.variants == [] + + +def test_from_strings_accepts_none_for_collections(): + """`dict.get("requires")` returns None for a missing key — must accept it.""" + pd = pyrer.PackageData.from_strings("foo", "1.0", None, None) + assert pd.requires == [] + assert pd.variants == [] + + +def test_from_strings_accepts_tuples_and_iterables(): + """PyO3 extracts Vec from any iterable, not just list.""" + pd = pyrer.PackageData.from_strings( + "tool", + "1.0", + ("python-3", "qt-5"), + (("linux", "python-3.10"),), + ) + assert pd.requires == ["python-3", "qt-5"] + assert pd.variants == [["linux", "python-3.10"]] + + +def test_from_strings_matches_constructor(): + """`from_strings` must produce the same PackageData as the four-arg + constructor — same fast PyO3 extraction path, classmethod is just a + named alias for callers wiring rez's resource.data through 
pyrer.""" + args = ("maya", "2024.0", ["python-3"], [["python-3.10"], ["python-3.11"]]) + via_classmethod = pyrer.PackageData.from_strings(*args) + via_constructor = pyrer.PackageData(*args) + assert via_classmethod.name == via_constructor.name + assert via_classmethod.version == via_constructor.version + assert via_classmethod.requires == via_constructor.requires + assert via_classmethod.variants == via_constructor.variants + + +def test_from_strings_drives_solve_like_from_rez(): + """End-to-end: a solve fed via from_strings produces the same result as + one fed via from_rez against an equivalent fake-rez Package.""" + + class FakeReq: + def __init__(self, s): + self._s = s + + def __str__(self): + return self._s + + class FakePkg: + def __init__(self, name, version, requires=None, variants=None): + self.name = name + self.version = version + self.requires = ( + [FakeReq(r) for r in requires] if requires else None + ) + self.variants = ( + [[FakeReq(r) for r in v] for v in variants] if variants else None + ) + + fakes = [ + FakePkg("app", "1.0.0", requires=["lib-2"]), + FakePkg("lib", "1.0.0"), + FakePkg("lib", "2.0.0"), + ] + via_from_rez = [pyrer.PackageData.from_rez(p) for p in fakes] + + via_from_strings = [ + pyrer.PackageData.from_strings("app", "1.0.0", ["lib-2"]), + pyrer.PackageData.from_strings("lib", "1.0.0"), + pyrer.PackageData.from_strings("lib", "2.0.0"), + ] + + result_a = pyrer.solve(["app"], via_from_rez) + result_b = pyrer.solve(["app"], via_from_strings) + assert result_a.resolved == result_b.resolved + assert result_a.status == result_b.status == "solved" + + +def test_from_strings_rejects_non_string_requires(): + """from_strings is the contract-strict fast path — pass it a non-string + in `requires` and it raises rather than silently stringifying. 
Use + `from_rez` (or pre-stringify) for object inputs.""" + import pytest + + class NotAString: + def __str__(self): + return "python-3" + + with pytest.raises(TypeError): + pyrer.PackageData.from_strings("foo", "1.0", [NotAString()]) + + # --------------------------------------------------------------------------- # variant_select_mode — rez's intersection_priority vs version_priority # --------------------------------------------------------------------------- From e79644f505f0f1ba450ff9984665fbe5e2f5ae27 Mon Sep 17 00:00:00 2001 From: Philippe Llerena Date: Sun, 17 May 2026 10:18:21 +0200 Subject: [PATCH 2/2] bench(python): micro-benchmark PackageData construction paths (#88) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Companion to #89 (`PackageData.from_strings`). Establishes the baseline for the issue's perf claim — how much does `from_rez(pkg)` actually cost versus `from_strings(...)` / the four-arg constructor, and what share of `pyrer.solve()` wall time is construction. `scripts/bench_python_construction.py` is a stdlib-`timeit`-based script (no extra deps) that measures: 1. Per-call construction (μs/call, isolated). 2. Per-batch construction for N packages (ms/batch). 3. End-to-end `pyrer.solve()` time with packages built each way. 4. Construction-vs-solve share of total wall time. `FakeRezPackage` mimics rez's `Package` shape: `requires` / `variants` surface `FakeRequirement` objects (not `str`), so `from_rez` pays the per-element `__str__` round-trip the issue talks about. Not part of CI — perf is machine-dependent and noisy. Run on demand; the script reports its own input parameters in the output. ## Results on this machine Intel Xeon E5-2699 v4 @ 2.20 GHz, CPython 3.13.9, pyrer 0.1.0-rc.8. 
### N=50 packages (1 build/solve iteration ≈ 1 ms)

```
Per-call construction (median μs, one package):
  PackageData(name, version, requires, variants): 4.57 μs
  PackageData.from_strings(...): 4.93 μs
  PackageData.from_rez(fake_rez_pkg): 6.96 μs

  from_strings vs from_rez: -2.02 μs / -29.1%
  __new__ vs from_rez: -2.39 μs / -34.3% (matches from_strings, same PyO3 path)

End-to-end pyrer.solve(['app'], pkgs):
  solve() alone (pre-built): 0.625 ms
  build (from_strings) + solve: 0.892 ms
  build (from_rez) + solve: 0.996 ms

  Build-phase share of total (from_strings): 29.9%
  Build-phase share of total (from_rez): 37.2%

  End-to-end savings (from_rez → from_strings): 0.104 ms (10.4%)
```

### N=500 packages

```
Per-batch construction (502 pkgs/batch):
  via PackageData(...): 2.260 ms / 4.50 μs/pkg
  via from_strings(...): 2.427 ms / 4.83 μs/pkg
  via from_rez(fake_pkg): 3.087 ms / 6.15 μs/pkg

  Savings switching from_rez → from_strings: 0.661 ms per batch

End-to-end:
  solve() alone (pre-built): 2.590 ms
  build (from_strings) + solve: 5.091 ms
  build (from_rez) + solve: 5.698 ms

  Build-phase share of total (from_strings): 49.1%
  Build-phase share of total (from_rez): 54.5%

  End-to-end savings (from_rez → from_strings): 0.607 ms (10.6%)
```

## What the numbers say

The figures are the minimum of 5 `timeit` repeats (`best_of` takes
`min(times)`), even though the captured output labels them "median".

- **`__new__` ≈ `from_strings`** (4.57 vs 4.93 μs/pkg). Confirms the
  claim in #89: they take the same fast PyO3 extraction path. The
  ~0.4 μs delta is classmethod dispatch overhead.
- **`from_rez` is ~30-40% slower** per package than the raw-string
  paths (6.15 to 6.96 μs versus 4.50 to 4.93 μs in the runs above).
  Real cost. Linear in package count.
- **Construction is 30-55% of end-to-end** on these synthetic resolves.
  Not negligible.
- **End-to-end savings switching to `from_strings`: consistent ~10%**
  across sizes, on this synthetic workload.
- **Production rez Packages will be slower than FakeRezPackage**:
  `FakeRezPackage` uses plain `__slots__` and skips rez's
  `AttributeForwardMeta` chain, the late-bound check, and the
  `Requirement` parse.
The numbers above are a lower bound on the `from_rez` cost; the `from_strings` numbers are realistic for the fast path. ## What this tells us about further optimisation The 10% end-to-end savings is real but modest. The bigger remaining costs after `from_rez → from_strings` are: 1. The solver itself (50-65% of total here — already 34× rez). 2. PyO3 wrapper overhead per `PackageData` instance (~4.5 μs/pkg even on the fast path). A `solve_from_raw(requests, raw_tuples)` that skips `PackageData` entirely would save another ~1-2 μs/pkg (~5-10% more end-to-end). Worth doing if `from_strings` itself isn't enough on real workloads. The right next move is to A/B `from_strings` against `from_rez` in the rez shim on real studio gear before pursuing further changes. This script tells us what the upper bound on the win could be; production will confirm where it actually lands. Co-Authored-By: Claude Opus 4.7 --- scripts/bench_python_construction.py | 305 +++++++++++++++++++++++++++ 1 file changed, 305 insertions(+) create mode 100644 scripts/bench_python_construction.py diff --git a/scripts/bench_python_construction.py b/scripts/bench_python_construction.py new file mode 100644 index 0000000..307a440 --- /dev/null +++ b/scripts/bench_python_construction.py @@ -0,0 +1,305 @@ +#!/usr/bin/env python3 +"""Micro-benchmark the `pyrer.PackageData` construction paths. + +Establishes the baseline for issue #88's perf claim: how much does +`from_rez(pkg)` actually cost vs. `from_strings(...)` / the four-arg +constructor, and how does that scale with package count? Run on +demand — not part of CI; results are machine-dependent. + +Measures: + 1. Per-call construction cost for each path (μs, isolated). + 2. Per-batch construction cost for N packages (ms). + 3. End-to-end `pyrer.solve(...)` time with packages built each way. + 4. Construction-vs-solve share of total wall time. 
+ +Usage: + python scripts/bench_python_construction.py + python scripts/bench_python_construction.py --packages 500 --iters 200 + +Requires: + pip install maturin + cd crates/rer-python && maturin develop +""" +import argparse +import sys +import timeit +from typing import List + +import pyrer + + +# --------------------------------------------------------------------------- +# Fake-rez Package mimics +# --------------------------------------------------------------------------- + + +class FakeRequirement: + """Mimics `rez.version.Requirement` — not a str, only renders via __str__.""" + + __slots__ = ("_s",) + + def __init__(self, s: str) -> None: + self._s = s + + def __str__(self) -> str: + return self._s + + +class FakeVersion: + """Mimics `rez.version.Version` — not a str, only renders via __str__.""" + + __slots__ = ("_s",) + + def __init__(self, s: str) -> None: + self._s = s + + def __str__(self) -> str: + return self._s + + +class FakeRezPackage: + """Stand-in for `rez.packages.Package`. The four duck-typed attributes + surface `FakeVersion` / `FakeRequirement` objects (not `str`) so that + `from_rez` pays the `__str__` round-trip on each one — the cost issue + #88 was filed against. 
+    """
+
+    __slots__ = ("name", "version", "requires", "variants")
+
+    def __init__(
+        self,
+        name: str,
+        version: str,
+        requires: List[str],
+        variants: List[List[str]],
+    ) -> None:
+        self.name = name
+        self.version = FakeVersion(version)
+        self.requires = [FakeRequirement(r) for r in requires] if requires else None
+        self.variants = (
+            [[FakeRequirement(r) for r in v] for v in variants] if variants else None
+        )
+
+
+# ---------------------------------------------------------------------------
+# Synthetic repo
+# ---------------------------------------------------------------------------
+
+
+def synth_packages(n: int):
+    """Return two parallel lists covering N packages plus `lib` / `util`:
+
+    - raw_specs: (name, version, requires, variants) tuples, i.e. what
+      `pkg.resource.data` would hand you (and what `from_strings`
+      consumes).
+    - fake_pkgs: `FakeRezPackage` instances, i.e. what `from_rez` consumes.
+    The resolve seed is not returned; the benchmarks always solve ["app"].
+
+    `app` and the first half of the packages require `lib` and `util`; every
+    third generated package also has two single-entry `python-3.x` variants.
+    """
+    raw_specs = []
+    fake_pkgs = []
+    for i in range(n):
+        if i == 0:
+            name, version = "app", "1.0.0"
+            requires = ["lib", "util"]
+            variants = []
+        else:
+            name = f"pkg{i:04d}"
+            version = "1.0.0"
+            requires = ["lib", "util"] if i < n // 2 else []
+            variants = (
+                [[f"python-3.{(i + 10) % 12}"], [f"python-3.{(i + 11) % 12}"]]
+                if i % 3 == 0
+                else []
+            )
+        raw_specs.append((name, version, requires, variants))
+        fake_pkgs.append(FakeRezPackage(name, version, requires, variants))
+    # lib / util: referenced by app's requires.
+    for extra in (("lib", "1.0.0"), ("util", "1.0.0")):
+        raw_specs.append((extra[0], extra[1], [], []))
+        fake_pkgs.append(FakeRezPackage(extra[0], extra[1], [], []))
+    return raw_specs, fake_pkgs
+
+
+# ---------------------------------------------------------------------------
+# Timers
+# ---------------------------------------------------------------------------
+
+
+def best_of(stmt, setup, number, repeat) -> float:
+    """Best (minimum) microseconds-per-call across `repeat` timeit runs."""
+    t = timeit.Timer(stmt=stmt, setup=setup, globals=globals())
+    times = t.repeat(repeat=repeat, number=number)
+    return (min(times) / number) * 1_000_000  # μs/call
+
+
+def bench_single_call(raw_specs, fake_pkgs, iters: int) -> None:
+    """Per-call construction cost for each path (the first package, `app`)."""
+    spec = raw_specs[0]
+    fake = fake_pkgs[0]
+    name, version, requires, variants = spec
+    globals().update(
+        spec=spec, fake=fake,
+        name=name, version=version, requires=requires, variants=variants,
+    )
+
+    print("Per-call construction (best-of-5 μs, one package, deep inputs)")
+    print("-" * 60)
+
+    t_new = best_of(
+        "pyrer.PackageData(name, version, requires, variants)",
+        "",
+        number=iters,
+        repeat=5,
+    )
+    t_strs = best_of(
+        "pyrer.PackageData.from_strings(name, version, requires, variants)",
+        "",
+        number=iters,
+        repeat=5,
+    )
+    t_rez = best_of(
+        "pyrer.PackageData.from_rez(fake)",
+        "",
+        number=iters,
+        repeat=5,
+    )
+
+    print(f" PackageData(name, version, requires, variants): {t_new:7.2f} μs")
+    print(f" PackageData.from_strings(...): {t_strs:7.2f} μs")
+    print(f" PackageData.from_rez(fake_rez_pkg): {t_rez:7.2f} μs")
+    delta_strs = t_rez - t_strs
+    pct_strs = (delta_strs / t_rez) * 100
+    print()
+    print(f" from_strings vs from_rez: -{delta_strs:.2f} μs / -{pct_strs:.1f}%")
+    delta_new = t_rez - t_new
+    pct_new = (delta_new / t_rez) * 100
+    print(f" __new__ vs from_rez: -{delta_new:.2f} μs / -{pct_new:.1f}% "
+          "(should match from_strings, same PyO3 path)")
+    print()
+
+
+def bench_batch(raw_specs, fake_pkgs, iters: int) -> None:
+    """Total cost of materialising the whole repo N times."""
+    globals().update(raw_specs=raw_specs, fake_pkgs=fake_pkgs)
+    n = len(raw_specs)
+
+    print(f"Per-batch construction (best-of-5 ms, {n} packages built per iteration)")
+    print("-" * 60)
+
+    t_new = best_of(
+        "[pyrer.PackageData(n, v, r, vr) for (n, v, r, vr) in raw_specs]",
+        "",
+        number=iters,
+        repeat=5,
+    )
+    t_strs = best_of(
+        "[pyrer.PackageData.from_strings(n, v, r, vr) for (n, v, r, vr) in raw_specs]",
+        "",
+        number=iters,
+        repeat=5,
+    )
+    t_rez = best_of(
+        "[pyrer.PackageData.from_rez(p) for p in fake_pkgs]",
+        "",
+        number=iters,
+        repeat=5,
+    )
+
+    # best_of returns μs/call; one "call" is one batch of N constructions.
+    print(f" list-comp via PackageData(...): {t_new / 1000:7.3f} ms / batch ({t_new / n:6.2f} μs/pkg)")
+    print(f" list-comp via from_strings(...): {t_strs / 1000:7.3f} ms / batch ({t_strs / n:6.2f} μs/pkg)")
+    print(f" list-comp via from_rez(fake_pkg): {t_rez / 1000:7.3f} ms / batch ({t_rez / n:6.2f} μs/pkg)")
+    print()
+    delta_ms = (t_rez - t_strs) / 1000
+    print(f" Savings switching from_rez → from_strings: {delta_ms:6.3f} ms per batch")
+    print()
+
+
+def bench_end_to_end(raw_specs, fake_pkgs, iters: int) -> None:
+    """Build + solve, isolating construction's share of total time."""
+    globals().update(raw_specs=raw_specs, fake_pkgs=fake_pkgs)
+
+    print("End-to-end pyrer.solve(): construction's share of wall time")
+    print("-" * 60)
+
+    t_solve_only = best_of(
+        "pyrer.solve(['app'], pkgs)",
+        "pkgs = [pyrer.PackageData.from_strings(n, v, r, vr) for (n, v, r, vr) in raw_specs]",
+        number=iters,
+        repeat=5,
+    )
+    t_e2e_strs = best_of(
+        "pyrer.solve(['app'], [pyrer.PackageData.from_strings(n, v, r, vr) for (n, v, r, vr) in raw_specs])",
+        "",
+        number=iters,
+        repeat=5,
+    )
+    t_e2e_rez = best_of(
+        "pyrer.solve(['app'], [pyrer.PackageData.from_rez(p) for p in fake_pkgs])",
+        "",
+        number=iters,
+        repeat=5,
+    )
+
+
print(f" solve() alone (pre-built packages): {t_solve_only / 1000:7.3f} ms") + print(f" build (from_strings) + solve: {t_e2e_strs / 1000:7.3f} ms") + print(f" build (from_rez) + solve: {t_e2e_rez / 1000:7.3f} ms") + print() + build_share_strs = (t_e2e_strs - t_solve_only) / t_e2e_strs * 100 + build_share_rez = (t_e2e_rez - t_solve_only) / t_e2e_rez * 100 + print(f" Build-phase share of total (from_strings): {build_share_strs:5.1f}%") + print(f" Build-phase share of total (from_rez): {build_share_rez:5.1f}%") + print() + e2e_delta = (t_e2e_rez - t_e2e_strs) / 1000 + e2e_pct = (t_e2e_rez - t_e2e_strs) / t_e2e_rez * 100 + print(f" End-to-end savings switching to from_strings: " + f"{e2e_delta:.3f} ms ({e2e_pct:.1f}%)") + print() + + +# --------------------------------------------------------------------------- +# Main +# --------------------------------------------------------------------------- + + +def main() -> int: + parser = argparse.ArgumentParser(description=__doc__.splitlines()[0]) + parser.add_argument( + "--packages", + type=int, + default=50, + help="Number of packages in the synthetic repo (default: 50)", + ) + parser.add_argument( + "--iters", + type=int, + default=500, + help="`timeit` `number` argument — batches per repeat (default: 500)", + ) + args = parser.parse_args() + + print(f"pyrer version: {getattr(pyrer, '__version__', '?')}") + print(f"synthetic repo: {args.packages} packages") + print(f"iterations per timing: {args.iters} (best of 5)") + print() + + raw_specs, fake_pkgs = synth_packages(args.packages) + + bench_single_call(raw_specs, fake_pkgs, iters=args.iters * 10) + bench_batch(raw_specs, fake_pkgs, iters=max(args.iters // 5, 50)) + bench_end_to_end(raw_specs, fake_pkgs, iters=max(args.iters // 10, 20)) + + print("Note: these are Python-side construction costs. The Rust solver") + print("itself is unchanged. 
End-to-end wall time on a real rez integration") + print("will also include rez's own per-attribute AttributeForwardMeta cost") + print("(not modelled by FakeRezPackage), so the from_rez line above is a") + print("lower bound on the cost in production. The from_strings line is a") + print("realistic upper bound for the fast path.") + return 0 + + +if __name__ == "__main__": + sys.exit(main())