    Repositories list

    • sglang

      Public
      SGLang is a high-performance serving framework for large language models and multimodal models.
      Python
      Apache License 2.0
      5.5k26k6632.3k Updated Apr 24, 2026
    • SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models
      Python
      MIT License
      912234640 Updated Apr 24, 2026
    • This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang
      HTML
      35128122 Updated Apr 24, 2026
    • JAX backend for SGL
      Python
      Apache License 2.0
      902679246 Updated Apr 24, 2026
    • rbg

      Public
      A workload for deploying LLM inference services on Kubernetes
      Go
      Apache License 2.0
      542083024 Updated Apr 24, 2026
    • sgl-kernel-xpu

      Public
      SGLang kernel library for Intel XPU
      Python
      MIT License
      2224113 Updated Apr 24, 2026
    • sgl-docs

      Public
      MDX
      Apache License 2.0
      16400 Updated Apr 24, 2026
    • Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.
      Python
      MIT License
      5129498 Updated Apr 23, 2026
    • Cookbook of SGLang - Recipe
      JavaScript
      Apache License 2.0
      63127613 Updated Apr 22, 2026
    • sgl-kernel-npu

      Public
      SGLang kernel library for NPU
      C++
      MIT License
      1161251948 Updated Apr 22, 2026
    • whl

      Public
      SGLang Kernel Wheel Index
      HTML
      MIT License
      112202 Updated Apr 21, 2026
    • DeepGEMM

      Public
      DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
      Cuda
      MIT License
      9232200 Updated Apr 17, 2026
    • FlashMLA

      Public
      FlashMLA: Efficient Multi-head Latent Attention Kernels
      C++
      MIT License
      1k000 Updated Apr 13, 2026
    • Fast and memory-efficient exact attention
      Python
      BSD 3-Clause "New" or "Revised" License
      2.6k2100 Updated Apr 10, 2026
    • cuLA

      Public
      Python
      Apache License 2.0
      0100 Updated Apr 8, 2026
    • SpecForge

      Public
      Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
      Python
      MIT License
      2157996749 Updated Apr 2, 2026
    • rbg-api

      Public
      Go
      1001 Updated Mar 26, 2026
    • A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
      Python
      MIT License
      5944.1k935 Updated Mar 13, 2026
    • The test files for SGLang.
      MIT License
      3101 Updated Feb 23, 2026
    • ome-crd

      Public
      0000 Updated Jan 15, 2026
    • sgl-learning-materials

      Public
      Materials for learning SGLang
      MIT License
      6180300 Updated Jan 5, 2026
    • Fast Hadamard transform in CUDA, with a PyTorch interface
      C
      BSD 3-Clause "New" or "Revised" License
      60100 Updated Oct 15, 2025
    • sgl-whl

      Public
      SGLang wheels for multiple platforms
      MIT License
      21110 Updated Oct 13, 2025