Model features #15

Merged: caseymcc merged 1 commit into main from model_features on Apr 3, 2026

@caseymcc caseymcc commented Apr 3, 2026

Model Features

Hardware Detection & Model Fit

  • Vulkan memory budget detection via VK_EXT_memory_budget with automatic fallback
  • Model fit calculator accounts for multi-GPU tensor splitting and context scaling
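
To make the fit calculation concrete, here is a minimal sketch of how such an estimate might work. It assumes the KV cache grows linearly with context length and that weights and cache are divided across GPUs in proportion to a tensor-split vector. `ModelShape`, `FitResult`, and `estimate_fit` are hypothetical names for illustration and are not taken from this PR. The per-GPU budgets would come from whatever the hardware detector reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; none of these names come from the PR.
struct ModelShape {
    std::uint64_t weight_bytes;        // total size of the model weights
    std::uint64_t kv_bytes_per_token;  // KV-cache bytes per context token
};

struct FitResult {
    bool fits;                     // true if every GPU can hold its share
    std::uint64_t required_bytes;  // weights + KV cache at the requested context
};

// Assumption: each GPU receives a share of the total footprint proportional
// to its tensor_split entry and must hold that share within its own budget.
FitResult estimate_fit(const ModelShape& model, std::uint32_t n_ctx,
                       const std::vector<std::uint64_t>& gpu_budget_bytes,
                       const std::vector<double>& tensor_split) {
    assert(gpu_budget_bytes.size() == tensor_split.size());

    // Context scaling (assumed linear): the KV cache grows with n_ctx.
    const std::uint64_t required =
        model.weight_bytes + model.kv_bytes_per_token * n_ctx;

    double split_sum = 0.0;
    for (double s : tensor_split) split_sum += s;

    // With no usable split there is nowhere to place the model.
    FitResult result{split_sum > 0.0, required};
    for (std::size_t i = 0; result.fits && i < tensor_split.size(); ++i) {
        const double share =
            static_cast<double>(required) * (tensor_split[i] / split_sum);
        if (share > static_cast<double>(gpu_budget_bytes[i])) result.fits = false;
    }
    return result;
}
```

Under these assumptions, an 8 GiB model with 128 KiB of KV cache per token needs 9 GiB at a context of 8192. That fits on two 6 GiB GPUs split evenly, while a context of 65536 (16 GiB total) does not.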

Model Runtime & Downloads

  • Async model downloads with progress tracking and cancellation
  • Split-GGUF support (multi-part .gguf files reassembled on download)
  • Detailed load-error diagnostics surfaced through ModelRuntime
  • Storage manager for model cache directory and disk-space queries
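
As an illustration of the split-GGUF handling, the sketch below recognizes shard filenames and lists the parts a downloader would need to fetch. It assumes the `<prefix>-00001-of-00003.gguf` naming pattern produced by llama.cpp's gguf-split tool. Whether this PR relies on that pattern is not stated, and `ShardName`, `parse_shard_name`, and `expected_shards` are hypothetical helpers. The reassembly step itself is not shown.

```cpp
#include <cstdio>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper type; not part of the PR.
struct ShardName {
    std::string prefix;  // filename without the shard suffix
    int index;           // 1-based shard number
    int count;           // total number of shards
};

// Parses "<prefix>-NNNNN-of-MMMMM.gguf"; returns false for any other name.
bool parse_shard_name(const std::string& filename, ShardName& out) {
    static const std::regex pattern(R"((.+)-(\d{5})-of-(\d{5})\.gguf)");
    std::smatch m;
    if (!std::regex_match(filename, m, pattern)) return false;
    const int index = std::stoi(m[2].str());
    const int count = std::stoi(m[3].str());
    if (index < 1 || index > count) return false;  // reject impossible shards
    out.prefix = m[1].str();
    out.index = index;
    out.count = count;
    return true;
}

// Lists every shard filename that must be present before reassembly.
std::vector<std::string> expected_shards(const std::string& prefix, int count) {
    std::vector<std::string> names;
    for (int i = 1; i <= count; ++i) {
        char suffix[32];
        std::snprintf(suffix, sizeof(suffix), "-%05d-of-%05d.gguf", i, count);
        names.push_back(prefix + suffix);
    }
    return names;
}
```

Given any one shard name, a downloader could call `parse_shard_name` to learn the total count, then use `expected_shards` to queue the remaining parts.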

Llama Provider

  • Upgraded llama.cpp from b4743 to b8573
  • Chat-template formatting via llama_chat_apply_template
  • Streaming completion support with per-token callbacks
  • Telemetry recording (tokens/sec, latency, memory)
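
The streaming and telemetry items can be pictured with a small sketch like the following. It does not call llama.cpp. A vector of pre-generated pieces stands in for the decode loop, and `TokenCallback`, `StreamStats`, and `stream_tokens` are hypothetical names. The callback returns `false` to stop generation early, which is one common way to support cancellation. The PR does not say how its callbacks signal this.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback type: receives each generated piece of text and
// returns false to request that generation stop.
using TokenCallback = std::function<bool(const std::string& piece)>;

struct StreamStats {
    std::size_t tokens = 0;  // pieces delivered to the callback
    double seconds = 0.0;    // wall-clock time spent streaming

    double tokens_per_second() const {
        return seconds > 0.0 ? static_cast<double>(tokens) / seconds : 0.0;
    }
};

// Stand-in for a decode loop: `pieces` plays the role of tokens produced by
// the model, one per iteration.
StreamStats stream_tokens(const std::vector<std::string>& pieces,
                          const TokenCallback& on_token) {
    using clock = std::chrono::steady_clock;
    StreamStats stats;
    const auto start = clock::now();
    for (const auto& piece : pieces) {
        ++stats.tokens;
        if (!on_token(piece)) break;  // caller asked to stop
    }
    stats.seconds = std::chrono::duration<double>(clock::now() - start).count();
    return stats;
}
```

A caller might append each piece to an output buffer or forward it to a client, then pass the returned `StreamStats` to a telemetry collector.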

Server & Dashboard

  • New REST endpoints: model load/unload/pin/download, hardware info, telemetry snapshots
  • Live HTML dashboard with VRAM/RAM gauges, GPU utilization charts, model management controls, and download progress bars
  • Server-Sent Events for real-time stat updates
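
For readers unfamiliar with the Server-Sent Events wire format, the sketch below shows how one stat update could be framed before it is written to a `text/event-stream` response. The framing (`event:` and `data:` fields, terminated by a blank line) follows the SSE specification. The function name `format_sse_event` and the `stats` event name and payload in the usage note are assumptions, since the PR does not list the server's actual event names. The event name is assumed to contain no newline.

```cpp
#include <sstream>
#include <string>

// Frames one Server-Sent Event. Multi-line payloads are emitted as one
// "data:" field per line, as the SSE format requires.
std::string format_sse_event(const std::string& event, const std::string& data) {
    std::string out;
    if (!event.empty()) out += "event: " + event + "\n";

    std::istringstream lines(data);
    std::string line;
    bool wrote_data = false;
    while (std::getline(lines, line)) {
        out += "data: " + line + "\n";
        wrote_data = true;
    }
    if (!wrote_data) out += "data: \n";  // empty payload still needs a data field

    out += "\n";  // blank line terminates the event
    return out;
}
```

For example, `format_sse_event("stats", "{\"vram_used\":1}")` yields `event: stats`, then `data: {"vram_used":1}`, then a blank line. A dashboard could consume this through the browser's `EventSource` API.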

Config & Schema

  • Model config schema extended with variants, hardware_requirements, and context_scaling
  • ConfigDownloader fleshed out to support git-based clone/pull from a remote config repo

Tests

  • New test suites: hardware detector, model runtime, telemetry collector, llama provider, storage manager, server connectivity
  • Server connectivity tests auto-skip when the server is unreachable or no model is loaded

caseymcc merged commit 55d2f59 into main Apr 3, 2026
1 check passed
caseymcc deleted the model_features branch May 10, 2026 16:15