Arena ToT: incremental multi-page construction #549

Merged
evaleev merged 4 commits into master from feature/arena_incremental
May 19, 2026

evaleev commented May 19, 2026

Summary

Lets an arena-backed tensor-of-tensors outer tile (Tensor<ArenaTensor>)
be built incrementally — inner cells sized and filled one at a time —
instead of requiring every inner range up front via a range_fn.

Stacks on #548 (base is feature/arena_tensor); retarget to master
once #548 merges.

  • Arena → multi-page bump allocator. A growing list of pages; each
    page buffer is a stable heap block, so the raw Cell* of every
    ArenaTensor view stays valid as pages are added. claim_bytes()
    bumps the current page and appends a fresh one on a miss; a request
    larger than a page gets its own dedicated, exactly-sized page.
    reserve_page() lays down a single exact page — the up-front path
    uses it, so kernel/einsum-built tiles keep their one contiguous slab,
    byte-identical to before. Page size is a knob
    (TILEDARRAY_ARENA_PAGE_BYTES).
  • ArenaToTBuilder — one-pass incremental construction: the caller
    discovers each inner range and fills the returned cell in a single
    step. arena_compact coalesces a multi-page tile into one slab.
  • Single-pass DistArray construction. make_nested_tile,
    make_arena_nested_tile, init_elements, set(i, InIter), and the
    ArrayImpl retile path are rebuilt on the builder: the two-pass
    size-then-fill walk and the std::vector temporaries that buffered a
    whole outer tile's worth of inner tensors are gone. foreach /
    make_array were reviewed — tile-type-agnostic, no two-pass machinery,
    no change needed.
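
The paging behavior described above can be illustrated with a minimal sketch. This is not the PR's Arena: the class name PagedArena, its members, and the helper first_cell_survives_rollover are hypothetical, and alignment is simplified to alignof(std::max_align_t). It only shows why pointers handed out earlier stay valid when a page is appended, and how an oversized request gets a dedicated page.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a multi-page bump allocator (not the PR's class).
class PagedArena {
 public:
  explicit PagedArena(std::size_t page_bytes) : page_bytes_(page_bytes) {}

  // Returns zero-initialized storage of at least `bytes` bytes.
  std::byte* claim_bytes(std::size_t bytes) {
    constexpr std::size_t align = alignof(std::max_align_t);
    bytes = (bytes + align - 1) / align * align;  // round up to the alignment
    // Oversized request: dedicated, exactly-sized page; current page untouched.
    if (bytes > page_bytes_) return new_page(bytes);
    // Miss on the current page: append a fresh standard page and bump that.
    if (cur_ == nullptr || used_ + bytes > page_bytes_) {
      cur_ = new_page(page_bytes_);
      used_ = 0;
    }
    std::byte* p = cur_ + used_;
    used_ += bytes;
    return p;
  }

  std::size_t page_count() const { return pages_.size(); }

 private:
  // Each page is its own heap block, so growing `pages_` never moves a page.
  std::byte* new_page(std::size_t bytes) {
    pages_.push_back(std::make_unique<std::byte[]>(bytes));
    return pages_.back().get();
  }

  std::size_t page_bytes_;
  std::vector<std::unique_ptr<std::byte[]>> pages_;
  std::byte* cur_ = nullptr;
  std::size_t used_ = 0;
};

// Writes into the first cell, forces a rollover and a dedicated page,
// then checks that the first cell is still readable through the old pointer.
bool first_cell_survives_rollover() {
  PagedArena arena(64);
  auto* a = reinterpret_cast<double*>(arena.claim_bytes(4 * sizeof(double)));
  a[0] = 42.0;
  arena.claim_bytes(48);   // does not fit next to the first cell: new page
  arena.claim_bytes(200);  // larger than a page: dedicated page
  return a[0] == 42.0 && arena.page_count() == 3;
}
```

The vector of page handles may reallocate as it grows, but the page buffers themselves do not move, which is the property the raw cell pointers rely on.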

Test plan

  • Arena unit suite — page rollover, oversized/dedicated pages,
    single exact page, aliasing survival.
  • ArenaToTBuilder + arena_compact coverage for both TA::Tensor
    and ArenaTensor inner cells.
  • DistArray-level incremental construction via init_tiles +
    ArenaToTBuilder.
  • Full serial unit suite (tiledarray/unit/run-np-1) passes 100%.
  • CI green.

evaleev added 3 commits May 19, 2026 21:23
Replace the one-shot Arena (a single slab sized by a pre-walk of every
inner range) with a multi-page bump allocator, so an arena-backed ToT
outer tile no longer has to know all inner-tensor sizes before
construction.

Arena (arena.h):
- A growing list of pages; each page buffer is a stable heap block, so
  the raw Cell* of every ArenaTensor view stays valid as pages are
  added. claim_bytes() bumps the current page and appends a fresh one
  on a miss; a request larger than a page (or needing finer alignment)
  gets its own dedicated, exactly-sized page.
- reserve_page() lays down a single exact page -- the up-front path
  uses it so kernel/einsum-built tiles keep their one contiguous slab,
  byte-identical to before.
- Page size is a knob (TILEDARRAY_ARENA_PAGE_BYTES). Construction is
  single-threaded by contract (one task per outer tile); the bump path
  is intentionally unsynchronized.
- Drop the unused plan()/ArenaPlan helpers.
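
As a rough illustration of the up-front path, the sketch below reserves one exact page and then claims cells sequentially, so they land back to back in a single slab. SlabArena and build_up_front are hypothetical names, only the single-page case is modeled, and the real reserve_page/claim_bytes signatures may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical single-page model of the up-front path (not the PR's Arena).
struct SlabArena {
  std::unique_ptr<std::byte[]> page;
  std::size_t cap = 0;
  std::size_t used = 0;

  // Lay down one page of exactly `bytes` bytes (zero-initialized).
  void reserve_page(std::size_t bytes) {
    page = std::make_unique<std::byte[]>(bytes);
    cap = bytes;
    used = 0;
  }

  // Bump within the reserved page; returns nullptr once it is exhausted.
  std::byte* claim_bytes(std::size_t bytes) {
    if (used + bytes > cap) return nullptr;
    std::byte* p = page.get() + used;
    used += bytes;
    return p;
  }
};

// All inner volumes are known, so one exact page holds every cell contiguously.
std::vector<double*> build_up_front(SlabArena& arena,
                                    const std::vector<std::size_t>& volumes) {
  const std::size_t total =
      std::accumulate(volumes.begin(), volumes.end(), std::size_t{0});
  arena.reserve_page(total * sizeof(double));
  std::vector<double*> cells;
  for (std::size_t v : volumes)
    cells.push_back(
        reinterpret_cast<double*>(arena.claim_bytes(v * sizeof(double))));
  return cells;
}
```

Because no synchronization is involved, a sketch like this is only safe when a single task builds a given outer tile, which matches the single-threaded contract stated above.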

arena_kernels.h:
- arena_outer_init reimplemented against the new Arena (reserve_page +
  sequential claim_bytes); signature unchanged, so einsum/tensor.h
  callers are untouched.
- ArenaToTBuilder: one-pass incremental construction -- the caller
  discovers each inner range and fills the returned cell in a single
  step, driving its own loop. A cell larger than a page and a
  single-cell tile both route to an exactly-sized dedicated page.
- arena_compact: coalesce a multi-page incrementally-built tile into
  one contiguous slab.
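
The builder-plus-compaction flow can be sketched as follows. Everything here is an assumption made for illustration: CellView, ToTBuilderSketch, its compact() member, and build_jagged are hypothetical, each cell gets its own heap block instead of sharing arena pages, and compaction is a member function rather than a free arena_compact.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical non-owning inner cell: pointer into builder storage plus length.
struct CellView {
  double* data = nullptr;
  std::size_t size = 0;
};

// Hypothetical builder; one zero-initialized block per cell stands in for pages.
class ToTBuilderSketch {
 public:
  explicit ToTBuilderSketch(std::size_t n_cells) : cells_(n_cells) {}

  // Size the cell at ordinal `ord` and hand it back for the caller to fill.
  CellView& emplace(std::size_t ord, std::size_t volume) {
    CellView& cell = cells_.at(ord);
    if (volume == 0) return cell;  // zero-volume view cell stays null
    blocks_.push_back(std::make_unique<double[]>(volume));
    cell = {blocks_.back().get(), volume};
    return cell;
  }

  // Copy every cell into one contiguous slab and repoint the views at it.
  void compact() {
    std::size_t total = 0;
    for (const CellView& c : cells_) total += c.size;
    auto slab = std::make_unique<double[]>(total);
    double* out = slab.get();
    for (CellView& c : cells_) {
      if (c.data == nullptr) continue;
      std::copy(c.data, c.data + c.size, out);
      c.data = out;
      out += c.size;
    }
    blocks_.clear();  // old per-cell blocks are released after the copy
    blocks_.push_back(std::move(slab));
  }

  const std::vector<CellView>& cells() const { return cells_; }

 private:
  std::vector<CellView> cells_;
  std::vector<std::unique_ptr<double[]>> blocks_;
};

// Jagged one-pass build: cell i has i entries and is filled right after sizing.
ToTBuilderSketch build_jagged(std::size_t n) {
  ToTBuilderSketch b(n);
  for (std::size_t i = 0; i < n; ++i) {
    CellView& c = b.emplace(i, i);
    for (std::size_t k = 0; k < c.size; ++k) c.data[k] = 10.0 * i + k;
  }
  return b;
}
```

The caller drives its own loop and never needs to know the size of cell i+1 while filling cell i; compaction is an optional second step for consumers that want one slab.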

Tests: rewrite tests/arena.cpp for the new API (page rollover,
oversized/dedicated pages, single exact page, aliasing survival); add
ArenaToTBuilder + arena_compact coverage for both TA::Tensor and
ArenaTensor inner cells.

Add a test that builds a TA::DistArray<Tensor<ArenaTensor>> by calling
ArenaToTBuilder inside the init_tiles callback -- each outer tile's
inner cells are sized (jagged) and filled one at a time, with no
up-front range_fn. Confirms the incremental builder composes with
init_tiles and needs no new DistArray API.

The arena-ToT construction paths pre-walked their cells twice: the
two-pass make_nested_tile invoked its source once to size each cell and
again to fill it, so callers with a single-pass source materialized the
whole outer tile into a temporary vector first. ArenaToTBuilder makes a
single ascending pass possible everywhere.

- make_nested_tile (arena_kernels.h): rebuilt on ArenaToTBuilder --
  inner_range_fn and inner_fill_fn are now interleaved per cell instead
  of two full passes; no separate all-ranges walk. Cells stay
  zero-initialized so the no-op-fill (shape-only) path is unchanged.
- DistArray::make_arena_nested_tile: rebuilt on ArenaToTBuilder;
  cell_source is invoked exactly once per cell in ascending order.
- DistArray::init_elements (arena branch): drops the std::vector<R>
  that collected every inner tensor of the outer tile before building.
- DistArray::set(i, InIter) (arena branch): drops the std::vector that
  buffered the single-pass iterator; it now feeds straight through.
- ArrayImpl retile (arena-ToT branch): builds each target tile with
  ArenaToTBuilder, one source-cell lookup per cell instead of two.

Eliminates a peak-memory doubling during construction (the temporary
held the whole tile's data alongside the arena slab). foreach /
make_array were also reviewed: both are tile-type-agnostic (the result
tile is default-constructed and the user op populates it) -- no
two-pass machinery there, nothing to relax.
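
The difference between the two construction shapes can be sketched by counting how often the cell source runs. The names Cell, CellSource, build_two_pass, build_one_pass, and count_calls are hypothetical and use std::vector cells rather than arena-backed tensors; only the call pattern mirrors the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Cell = std::vector<double>;
// Hypothetical source: produces the inner cell for outer ordinal `ord`.
using CellSource = std::function<Cell(std::size_t)>;

// Two-pass shape: size every cell, then fill; the source runs twice per cell.
std::vector<Cell> build_two_pass(std::size_t n, const CellSource& source) {
  std::vector<std::size_t> sizes(n);
  for (std::size_t i = 0; i < n; ++i) sizes[i] = source(i).size();  // pass 1
  std::vector<Cell> tile(n);
  for (std::size_t i = 0; i < n; ++i) {  // pass 2
    tile[i].reserve(sizes[i]);
    const Cell src = source(i);
    tile[i].assign(src.begin(), src.end());
  }
  return tile;
}

// One-pass shape: each cell is sized and filled in ascending order,
// so the source runs exactly once per cell and nothing is buffered.
std::vector<Cell> build_one_pass(std::size_t n, const CellSource& source) {
  std::vector<Cell> tile(n);
  for (std::size_t i = 0; i < n; ++i) tile[i] = source(i);
  return tile;
}

// Counts source invocations for either strategy over `n` cells.
std::size_t count_calls(bool one_pass, std::size_t n) {
  std::size_t calls = 0;
  CellSource source = [&calls](std::size_t ord) {
    ++calls;
    return Cell(ord + 1, static_cast<double>(ord));
  };
  if (one_pass) {
    build_one_pass(n, source);
  } else {
    build_two_pass(n, source);
  }
  return calls;
}
```

A source that can only be traversed once (such as an input iterator) cannot serve the two-pass shape directly, which is why callers previously had to buffer the whole outer tile first.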

Copilot AI left a comment

Pull request overview

This PR enables incremental construction of arena-backed tensor-of-tensors tiles by replacing one-shot slab allocation assumptions with a multi-page arena and a new builder API.

Changes:

  • Adds multi-page Arena allocation with standard/dedicated pages and allocation accounting.
  • Introduces ArenaToTBuilder and arena_compact for one-pass ToT construction and compaction.
  • Updates DistArray/retile paths and tests to exercise incremental arena-backed construction.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/TiledArray/tensor/arena.h | Reworks Arena into a multi-page bump allocator and updates ArenaResource. |
| src/TiledArray/tensor/arena_kernels.h | Adds ArenaToTBuilder, updates nested tile construction, and adds compaction support. |
| src/TiledArray/dist_array.h | Removes buffering in arena ToT set/init_elements paths via one-pass construction. |
| src/TiledArray/array_impl.h | Updates arena ToT retile construction to use the incremental builder. |
| tests/arena.cpp | Updates arena unit tests for multi-page allocation behavior. |
| tests/arena_kernels.cpp | Adds builder/compaction tests for Tensor<Tensor<...>> inners. |
| tests/arena_tensor_kernels.cpp | Adds builder/compaction and DistArray incremental construction tests for ArenaTensor inners. |

Comment thread src/TiledArray/tensor/arena_kernels.h Outdated
Comment on lines +191 to +198
  /// fill. A zero-volume range leaves the cell null. Outer element indices
  /// translate via `outer_range().ordinal(idx)`.
  inner_t& emplace(std::size_t ord, inner_range_t inner_range) {
    TA_ASSERT(ord < n_cells_);
    inner_t& cell = data_[ord];
    const std::size_t vol = inner_range.volume();
    if (vol == 0) return cell;  // stays null
    constexpr bool arena = is_arena_tensor_v<inner_t>;
Reply from the PR author (evaleev):

Good catch — fixed in a02f28e. ArenaToTBuilder::emplace now mirrors arena_outer_init's non-arena zero-volume handling: an owning (non-view) inner given a rank>0 zero-volume range is built as inner_t(range) (empty but rank-preserving); a rank-0 range — and any arena view inner, which cannot carry a standalone range — stays null. Added a regression test builder_zero_volume_nonscalar_range_keeps_rank.

ArenaToTBuilder::emplace, given a zero-volume range, used to leave the
cell default/null. For an owning (non-view) inner that drops the range
metadata -- arena_outer_init keeps a rank>0 zero-volume range as an
empty-but-ranked tensor and only collapses a rank-0 range to null. Since
make_nested_tile now routes through the builder, mirror that handling so
a TA::Tensor inner with e.g. Range{0} stays an empty rank-1 tensor.
Arena view inners (which cannot carry a standalone range) still go null.
Adds a regression test.
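
The zero-volume rule described in this commit message can be written down as a small decision function. CellState and classify are hypothetical names used only to restate the rule; the actual emplace works on range and tensor types rather than on a rank/volume pair.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical outcomes for a cell after emplace.
enum class CellState { Null, EmptyRanked, Allocated };

// Hypothetical restatement of the zero-volume handling described above.
constexpr CellState classify(std::size_t rank, std::size_t volume,
                             bool inner_is_view) {
  if (volume > 0) return CellState::Allocated;
  // An arena view cannot carry a standalone range, so it stays null.
  if (inner_is_view) return CellState::Null;
  // Owning inner: keep a rank>0 empty range (e.g. Range{0}) as an empty,
  // rank-preserving tensor; a rank-0 range collapses to null.
  return rank > 0 ? CellState::EmptyRanked : CellState::Null;
}
```

Under this reading, an owning inner given a rank-1, zero-volume range yields EmptyRanked, while the same range for an arena view inner yields Null.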

Also drops a stale type_traits.h comment that listed TensorInterface as
an is_tensor_view specialization -- it is deliberately not a view.

Base automatically changed from feature/arena_tensor to master May 19, 2026 20:54
evaleev merged commit 8ca1ff2 into master May 19, 2026
9 checks passed
evaleev deleted the feature/arena_incremental branch May 19, 2026 21:43