content/blog/2025-10-10-1760088945.md
+6 -1 (6 additions & 1 deletion)
@@ -84,4 +84,9 @@ At fp32:
- IREE (CUDA): 5.4 ms / it
- IREE (Vulkan): 12.8 ms / it
```
-{{< /details >}}
+{{< /details >}}
+
+### Footnote: What about "glue code"?
+Glue code is the code that connects the sub-models to each other (text encoder, VAE, UNet, transformer, etc.), plus the scheduling/sampling code (Euler, Heun, etc.). It isn't part of the model, so it isn't included in the ONNX export and won't be compiled by ML compilers. It uses torch to modify tensors and to generate intermediate tensors (random, zeros, etc.).
+
+This means the glue code has to be ported to C++ (to avoid the installation size overhead of Python and torch). That's a decent amount of work. Projects like [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) implement it in C++, but those lines are fairly entangled with the rest of the codebase.
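
To make the footnote concrete, here is a minimal sketch of the kind of glue code it describes: an Euler sampling loop wrapped around a model call. It is an illustration under stated assumptions, not the pipeline's actual code. `fake_denoiser` is a hypothetical stand-in for the compiled model, `make_sigmas` is an illustrative noise schedule rather than any specific model's schedule, and plain Python lists replace torch tensors so the example stays self-contained.

```python
import random


def make_sigmas(n_steps, sigma_max=10.0, sigma_min=0.1):
    """Illustrative geometric noise schedule with a trailing 0.0 (an assumption)."""
    ratio = (sigma_min / sigma_max) ** (1.0 / (n_steps - 1))
    return [sigma_max * ratio ** i for i in range(n_steps)] + [0.0]


def fake_denoiser(x, sigma):
    """Hypothetical stand-in for the exported model: predicts an all-zero clean sample."""
    return [0.0 for _ in x]


def euler_sample(denoiser, n_values, n_steps, seed=0):
    rng = random.Random(seed)
    sigmas = make_sigmas(n_steps)
    # Glue: create the initial random latent (a torch pipeline would use a random tensor here).
    x = [rng.gauss(0.0, 1.0) * sigmas[0] for _ in range(n_values)]
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        # The model call is the only part an ML compiler would cover.
        denoised = denoiser(x, sigma)
        # Glue: Euler step, x <- x + d * (sigma_next - sigma), with d = (x - denoised) / sigma.
        d = [(xi - di) / sigma for xi, di in zip(x, denoised)]
        x = [xi + di * (sigma_next - sigma) for xi, di in zip(x, d)]
    return x


if __name__ == "__main__":
    print(euler_sample(fake_denoiser, n_values=4, n_steps=5))
```

Everything outside the `denoiser(...)` call (the schedule, the random latent, the per-step arithmetic) is what would need a C++ counterpart once Python and torch are removed.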