[RFC] Bring your own type erasure #1068

Open
hildebrandmw wants to merge 4 commits into main from mhildebr/byo-type-erasure

@hildebrandmw hildebrandmw commented May 14, 2026

A small RFC describing a pattern we can use to reduce levels of indirection in distance function composition. No immediate action is needed in the codebase; I'm just writing down what I've recommended to several people for an issue I'm seeing more frequently.

Rendered

@hildebrandmw hildebrandmw changed the title from "[RFC] Bring your own type erasure." to "[RFC] Bring your own type erasure" May 14, 2026
@hildebrandmw hildebrandmw marked this pull request as ready for review May 14, 2026 17:57
@hildebrandmw hildebrandmw requested review from a team and Copilot May 14, 2026 17:57

@arkrishn94 arkrishn94 left a comment

Thanks Mark, this is super helpful.

Copilot AI left a comment

Pull request overview

This PR adds an RFC documenting a “bring your own type erasure” pattern intended to reduce nested dispatch in distance-computation composition.

Changes:

  • Adds a new RFC describing the motivation, example pattern, generated-code comparison, and trade-offs.
  • Identifies potential application areas such as DistanceProvider, spherical quantization kernels, and multi-vector backends.
Comments suppressed due to low confidence (3)

rfcs/01068-byo-type-erasure.md:31

  • Grammar: the article and plural noun do not agree here.
The combination of unwrapping + delegation is used to create another trait object, leading to an unavoidable situations where we have at least two levels of dynamic dispatch.

rfcs/01068-byo-type-erasure.md:19

  • Grammar: this compound modifier should be hyphenated before the noun.
Lower level APIs in our library use various flavors of type-erasure to enable polymorphism over metric, micro-architecture, and length specialization.

rfcs/01068-byo-type-erasure.md:44

  • Grammar: this compound modifier should be hyphenated before the noun.
How can we redesign our lower level APIs to allow composition of distance computations?

Comment on lines +29 to +31
Here, an inner distance computation (using one of the type-erasure approaches outlined above) needs to be composed with a small unwrapping layer.
For `diskann-garnet` specifically, this unwrapping layer reifies the type of raw byte slices (translates from `&[u8]` to the type needed by the inner computer).
The combination of unwrapping + delegation is used to create another trait object, leading to an unavoidable situation where we have at least two levels of dynamic dispatch.
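
To make the two levels concrete, here is a minimal sketch of the situation described above. Every name in it (`InnerDistance`, `inner_factory`, `decode`, `outer_factory`) is hypothetical and chosen for illustration; none of them is taken from `diskann-garnet` or the library's actual API, and the byte layout (little-endian `f32`) is an assumption.

```rust
// Hypothetical names throughout; this is not the library's actual API.

/// Level 1: a distance function that has already been type-erased.
type InnerDistance = Box<dyn Fn(&[f32], &[f32]) -> f32>;

/// Stand-in for a lower-level factory returning an erased squared-L2 distance.
fn inner_factory() -> InnerDistance {
    Box::new(|a: &[f32], b: &[f32]| -> f32 {
        a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>()
    })
}

/// Unwrapping layer: reinterprets raw bytes as little-endian `f32` values.
fn decode(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

/// Level 2: the wrapper is erased again, so every call makes two indirect
/// calls: one into the outer closure and one into `inner`.
fn outer_factory() -> Box<dyn Fn(&[u8], &[u8]) -> f32> {
    let inner = inner_factory();
    Box::new(move |a: &[u8], b: &[u8]| -> f32 {
        let (a, b) = (decode(a), decode(b));
        inner(a.as_slice(), b.as_slice())
    })
}
```

Because `inner` is only known as a `Box<dyn Fn>` inside `outer_factory`, the compiler generally cannot inline it into the outer closure.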
## Proposal

The solution is relatively simple and is likely a variant of the visitor pattern.
For the purposes of this demonstration, assume we have two levels of distance function factories.
Multiply,
}

// Instead of returning a function pointer from level 1, we visit implementations of `Level1`.
The main trade-offs here are API complexity and compile times.
If the `level_1_factory` dispatches to many possible implementations, like the `DistanceProvider` API which dispatches across micro-architecture, metric, and length specialization, each higher level essentially redoes that work.

However, for distance functions that are called millions or billions of times in a hot loop, the extra complexity to minimize overhead is often worth it.
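
The excerpts on this page show only fragments of the RFC's example, so the following is an independent sketch of the visitor idea rather than the RFC's code. It assumes a scalar `(f32, f32) -> f32` computation and invents the names `Metric`, `Level1Visitor`, `Erase`, `decode_scalar`, and `level_2_factory`; only `Level1`, `Multiply`, and `level_1_factory` appear in the excerpts, and their definitions here are guesses.

```rust
// Hypothetical sketch; names and signatures are assumptions, not the RFC's code.

#[derive(Clone, Copy)]
enum Metric {
    Add,
    Multiply,
}

/// A concrete level-1 computation that has not been erased yet.
trait Level1: 'static {
    fn eval(&self, a: f32, b: f32) -> f32;
}

struct Add;
struct Multiply;

impl Level1 for Add {
    #[inline]
    fn eval(&self, a: f32, b: f32) -> f32 { a + b }
}

impl Level1 for Multiply {
    #[inline]
    fn eval(&self, a: f32, b: f32) -> f32 { a * b }
}

/// The caller brings its own type erasure by implementing this visitor.
trait Level1Visitor {
    type Output;
    fn visit<T: Level1>(self, inner: T) -> Self::Output;
}

/// Level 1 dispatches once and hands the concrete type to the visitor
/// instead of returning a function pointer or trait object.
fn level_1_factory<V: Level1Visitor>(metric: Metric, visitor: V) -> V::Output {
    match metric {
        Metric::Add => visitor.visit(Add),
        Metric::Multiply => visitor.visit(Multiply),
    }
}

/// Unwrapping layer: reads one little-endian `f32` from the first 4 bytes.
fn decode_scalar(bytes: &[u8]) -> f32 {
    f32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// Level 2 wraps the concrete type and erases it exactly once.
struct Erase;

impl Level1Visitor for Erase {
    type Output = Box<dyn Fn(&[u8], &[u8]) -> f32>;

    fn visit<T: Level1>(self, inner: T) -> Self::Output {
        // `inner.eval` is a static call on a known `T`, so it can be inlined.
        Box::new(move |a: &[u8], b: &[u8]| -> f32 {
            inner.eval(decode_scalar(a), decode_scalar(b))
        })
    }
}

fn level_2_factory(metric: Metric) -> Box<dyn Fn(&[u8], &[u8]) -> f32> {
    level_1_factory(metric, Erase)
}
```

Each visitor is monomorphized once per `Level1` implementation reachable from `level_1_factory`, which is where the compile-time cost mentioned above comes from. In exchange, the returned closure contains a single indirect call.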
## Summary

This RFC outlines a pattern for tackling composition of distance computers with only a single level of type erasure.
The goal is to streamline patterns like #1050, where trait-object-based distance computers are embedded in newtype wrappers to create yet another trait object.
mulss xmm0, xmm1
ret
```
Everything is inlined!

codecov-commenter commented May 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.51%. Comparing base (3fb79a7) to head (d25a81d).

@@           Coverage Diff           @@
##             main    #1068   +/-   ##
=======================================
  Coverage   89.51%   89.51%           
=======================================
  Files         461      461           
  Lines       85920    85920           
=======================================
  Hits        76912    76912           
  Misses       9008     9008           
| Flag | Coverage Δ |
|---|---|
| miri | 89.51% <ø> (ø) |
| unittests | 89.14% <ø> (ø) |

Flags with carried forward coverage won't be shown.



6 participants