ltm: sliced reducer sub-expressions are not hoisted into aggregate nodes (stay on the wildcard-link-score path) #514

@bpowers

Summary

When LTM (Loops that Matter) is enabled, the aggregate-node enumerator enumerate_agg_nodes (in src/simlin-engine/src/ltm_agg.rs) hoists an array-reducer sub-expression into a synthetic $⁚ltm⁚agg⁚{n} node only when the reducer reads the full extent of its source (a bare Var(x) or an all-wildcard subscript: x[*], x[*,*]). A reducer over an explicit slice used as a sub-expression, such as x[r] = ... + SUM(pop[NYC,*]) or x[D1] = ... + SUM(matrix[D1,*]), is deliberately not minted as a synthetic agg.

The carve-out lives in reducer_is_full_reduce / expr_is_full_extent (src/simlin-engine/src/ltm_agg.rs:573-643), consulted from walk_subexpr_for_aggs (ltm_agg.rs:297-441; the gate is at ltm_agg.rs:345). The reason: the AggNode descriptor carries only the source variable name, not which slice rows/axes the reducer actually reads. If such a reducer were hoisted, the element-graph reroute (which routes reducer references through aggregate nodes) and the per-element reducer link scores would over-approximate, assigning nonzero garbage to the unread rows. For example, the slice pop[NYC,*] reads only the NYC row, but the agg edge pop[*] -> agg and the per-element link scores would cover every element of pop.
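
To make the carve-out concrete, here is a minimal sketch of a full-extent predicate. It is not the code in ltm_agg.rs: the Expr and Index types and the is_full_extent function are hypothetical stand-ins invented for illustration, and the real AST in simlin-engine is richer.

```rust
// Hypothetical, simplified model of a reducer argument.
// These types are illustrative only; they are not the simlin-engine AST.
#[derive(Debug, Clone, PartialEq)]
enum Index {
    Wildcard,          // `*`: every element along this axis
    Element(String),   // a pinned element such as `NYC`
    Dimension(String), // a result-axis dimension such as `D1`
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),                   // bare variable: `pop`
    Subscript(String, Vec<Index>), // subscripted variable: `pop[NYC,*]`
}

/// Returns true when the reducer argument reads every element of its source,
/// i.e. it is a bare variable or every subscript index is a wildcard.
/// Only such arguments would qualify for hoisting under the carve-out.
fn is_full_extent(arg: &Expr) -> bool {
    match arg {
        Expr::Var(_) => true,
        Expr::Subscript(_, indices) => {
            indices.iter().all(|i| matches!(i, Index::Wildcard))
        }
    }
}
```

Under this sketch, pop and pop[*,*] pass the gate, while pop[NYC,*] and matrix[D1,*] do not, which mirrors the behavior described above.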

Consequence

A reducer over an explicit slice used as a sub-expression stays on the conservative full-cross-product element graph and gets a …⁚wildcard per-shape link score (the pre-Phase-5 behavior). It does not get the fine-grained per-element aggregate edges + per-element reducer link scores that whole-RHS slice reducers (agg[D1] = SUM(matrix[D1,*]), where the user variable becomes the agg) and whole-extent reducers (SUM(pop[*])) get.

The test slice_reducer_subexpression_is_not_hoisted (src/simlin-engine/src/ltm_agg.rs:1022) pins this behavior, and there's a code comment at the carve-out documenting it.

Why this matters

Silent over-approximation, not a regression: the loops through a slice reducer in a sub-expression use the conservative full-element cross-product (which can be combinatorially large for a dimension with many elements) and a per-shape …⁚wildcard link score that lumps all source elements into one number. This is strictly weaker than what Phase 5 delivers for the full-extent and whole-RHS-slice cases, and it is the one remaining …⁚wildcard-link-score reducer path that the cross-element-aggregate design (AC5: retire the :wildcard/:dynamic paths) did not eliminate.

Possible approaches

  • Have AggNode (in src/simlin-engine/src/ltm_agg.rs) carry the slice's pinned elements / result-axis dimensions in addition to the source variable name. Then enumerate_agg_nodes can hoist a sliced sub-expression reducer too, and the element-graph reroute + per-element reducer link-score emitter can emit edges and per-element scores over just the read slice (e.g. pop[NYC] -> agg instead of pop[*] -> agg).
  • If that's deemed not worth the descriptor complexity, the carve-out is at least correct and documented; in that case the issue is "won't fix, here's why" plus a pointer to the test.
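
As a rough illustration of the first approach, the sketch below shows a descriptor that records, per axis, whether the reducer reads every element or a single pinned one, and then enumerates only the source elements actually read. Everything here is an assumption made for illustration: SlicedAggNode, AxisRead, and read_elements are hypothetical names, not the existing AggNode or any function in ltm_agg.rs.

```rust
// Hypothetical per-axis read descriptor; not part of the real AggNode.
#[derive(Debug, Clone, PartialEq)]
enum AxisRead {
    All,            // wildcard: every element along this axis is read
    Pinned(String), // only this element is read (e.g. `NYC`)
}

// Hypothetical descriptor: the source variable name plus the slice it reads.
struct SlicedAggNode {
    source: String,
    axes: Vec<AxisRead>,
}

/// Enumerate the source elements the reducer reads.
/// `dims[i]` lists the element names of the source's i-th axis.
/// Pinned elements are taken as given and are not validated against `dims`.
fn read_elements(node: &SlicedAggNode, dims: &[Vec<&str>]) -> Vec<String> {
    // Start with one empty index tuple and extend it axis by axis.
    let mut tuples: Vec<Vec<String>> = vec![Vec::new()];
    for (axis, elems) in node.axes.iter().zip(dims) {
        let choices: Vec<String> = match axis {
            AxisRead::All => elems.iter().map(|e| e.to_string()).collect(),
            AxisRead::Pinned(e) => vec![e.clone()],
        };
        let mut next = Vec::new();
        for t in &tuples {
            for c in &choices {
                let mut extended = t.clone();
                extended.push(c.clone());
                next.push(extended);
            }
        }
        tuples = next;
    }
    tuples
        .into_iter()
        .map(|t| format!("{}[{}]", node.source, t.join(",")))
        .collect()
}
```

With a two-axis pop over {NYC, LA} and {young, old}, a node with axes [Pinned(NYC), All] yields only pop[NYC,young] and pop[NYC,old], whereas [All, All] yields all four elements. Edges and per-element scores emitted from such a list would cover just the read slice, which is the property the first bullet is after.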

Locations

  • src/simlin-engine/src/ltm_agg.rs -- reducer_is_full_reduce, expr_is_full_extent (the carve-out predicate), walk_subexpr_for_aggs (the gate at the reducer_is_full_reduce(...) call), enumerate_agg_nodes, AggNode (the descriptor that would need to grow a slice field)
  • src/simlin-engine/src/ltm_agg.rs:1022 -- slice_reducer_subexpression_is_not_hoisted test pinning the behavior

Discovery context

Identified during code review of Phase 5 ("reducer hoist / aggregate nodes") of the "LTM Cross-Element Aggregate Scoring" implementation plan (branch ltm-503-cross-element-agg; design plan under docs/implementation-plans/2026-05-09-ltm-503-cross-element-agg/). An acknowledged trade-off, not a Phase 5 blocker -- the carve-out is intentional and tested.

Tracking

Part of LTM tracking epic: #488. Related to the LTM array-support umbrella #273 and to #503 (cross-element loops normalizing by Δ-aggregate) -- but distinct: #503 is about whole-extent wildcard-reducer cross-element loops normalizing by the wrong denominator; this is about sliced reducer sub-expressions not being hoisted into aggregate nodes at all.

Labels

ltm: Loops that Matter (LTM) analysis subsystem
