A production-ready web application for deploying your own AI assistant. Works with any OpenAI-compatible API (TabbyAPI, Ollama, vLLM) and includes document search, long-term memory, and enterprise admin features.
Quick Start • Wiki • Features • Architecture
- Dark-themed UI inspired by Claude.AI - Professional, distraction-free design
- Real-time streaming - SSE-powered responses with typewriter effect
- Markdown rendering - Code blocks with syntax highlighting, tables, lists
- Mobile responsive - Collapsible sidebar, adaptive layout
- Conversation management - Search, pin, export, and organize chats
- Vector Search - ChromaDB semantic search across all conversation history
- Knowledge Graphs - Automatic entity extraction and relationship mapping with NetworkX
- Contextual Awareness - Relevant history automatically injected into each response
- Cross-conversation memory - AI remembers context from past sessions
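Conceptually, memory retrieval embeds the current prompt and pulls the most similar past messages back into context. A minimal pure-Python sketch of that idea, using toy bag-of-words vectors as a stand-in for ChromaDB's real neural embeddings (function names are illustrative, not the project's actual code):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (real deployments use a neural model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall(query: str, history: list[str], k: int = 2) -> list[str]:
    """Return the k past messages most similar to the query."""
    q = embed(query)
    return sorted(history, key=lambda m: cosine(q, embed(m)), reverse=True)[:k]

history = [
    "We deployed the staging cluster on Tuesday",
    "My favorite color is green",
    "The staging cluster uses three worker nodes",
]
print(recall("how many nodes does the staging cluster have?", history))
```

The recalled messages are then prepended to the LLM prompt, which is what makes answers context-aware across sessions.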
- Document upload - Supports PDF, DOCX, TXT, CSV formats
- Automatic processing - Smart chunking (2000 chars) with overlap for context preservation
- Semantic search - Vector embeddings for accurate document retrieval
- Source citations - Responses include references to specific document chunks
- Collection management - Organize documents into searchable collections
- Persistence verified - All uploads survive container rebuilds
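The 2000-character chunking with overlap described above can be sketched as follows; the chunk size mirrors the default mentioned here, while the overlap value and function name are illustrative (the project's actual splitter may differ):

```python
def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks; each chunk repeats the last
    `overlap` characters of the previous one so context isn't cut mid-thought."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "x" * 4500
parts = chunk_text(doc)
print([len(p) for p in parts])  # [2000, 2000, 900]
```

Each chunk is then embedded and stored in ChromaDB, so a query can match the middle of a document without losing the surrounding sentences.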
- Role-based authentication - Admin and user roles with session-based auth
- User management - Create, update, delete users via admin dashboard
- Model switching - Hot-swap between LLMs without restart (TabbyAPI integration)
- Service monitoring - View and manage Docker container status
- Real-time logs - Stream logs from backend, frontend, or LLM services
- Audit trail - Track user actions and system events
- LLM: Works with any OpenAI-compatible API
- TabbyAPI (ExLlamaV2, llama.cpp)
- Ollama (Llama, Mistral, Qwen, etc.)
- vLLM (production inference server)
- LocalAI, LM Studio, text-generation-webui
- Memory: ChromaDB for vector embeddings and semantic search
- Database: SQLite by default (PostgreSQL, MySQL via SQLAlchemy)
- Deployment: Docker Compose with hot-reload for development
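Because every listed backend speaks the OpenAI chat-completions format, a client only has to build one request shape and point it at a different base URL. A sketch of that payload (the model name and system prompt are placeholders):

```python
import json

def build_chat_request(prompt: str, model: str = "local-model", stream: bool = True) -> str:
    """Build an OpenAI-compatible /v1/chat/completions request body.
    The same JSON works against TabbyAPI, Ollama, vLLM, LM Studio, etc."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "stream": stream,  # True => the server answers with SSE chunks
    }
    return json.dumps(payload)

print(build_chat_request("Hello!"))
```

Swapping LLM providers is then a configuration change (`TABBY_HOST`), not a code change.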
- Docker & Docker Compose (recommended path)
- OpenAI-compatible LLM API (TabbyAPI, Ollama, vLLM, etc.)
- ChromaDB instance (optional but recommended for memory features)
```bash
git clone https://github.com/jlwestsr/nebulus-gantry.git
cd nebulus-gantry

# Create environment file
cat > .env << 'EOF'
DATABASE_URL=sqlite:///./data/gantry.db
CHROMA_HOST=http://chromadb:8000
TABBY_HOST=http://your-llm-api:5000
SECRET_KEY=change-this-to-a-random-secret
SESSION_EXPIRE_HOURS=24
VITE_API_URL=http://localhost:8000
EOF

docker compose up -d
```

Access Points:
- Frontend: http://localhost:3001
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs (interactive Swagger UI)
```bash
docker compose exec backend python -c "
from backend.services.auth_service import AuthService
from backend.database import get_engine, get_session_maker

engine = get_engine()
Session = get_session_maker(engine)
db = Session()
auth = AuthService(db)
user = auth.create_user(
    email='admin@example.com',
    password='your-secure-password',
    role='admin',
    display_name='Admin'
)
print(f'Created admin user: {user.email}')
db.close()
"
```

Navigate to http://localhost:3001, log in with your admin credentials, and start your first conversation. The AI will remember context across sessions, and you can upload documents for RAG-powered answers.
```
┌────────────────────────────────────────────────────────────┐
│                    Browser (React SPA)                     │
│  ┌──────────┬──────────┬──────────┬──────────┬──────────┐  │
│  │ Chat UI  │Knowledge │  Admin   │ Settings │  Search  │  │
│  │          │  Vault   │  Panel   │          │ (Ctrl+K) │  │
│  └──────────┴──────────┴──────────┴──────────┴──────────┘  │
└────────────────────┬───────────────────────────────────────┘
                     │ HTTP / SSE Streaming
┌────────────────────┼───────────────────────────────────────┐
│                  FastAPI Backend (:8000)                   │
│  ┌──────────────────────────────────────────────────────┐  │
│  │ Routers:  /auth  /chat  /admin  /documents           │  │
│  ├──────────────────────────────────────────────────────┤  │
│  │ Services: Auth │ Chat │ LLM │ Memory │ Graph │ Docs  │  │
│  └──────────────────────────────────────────────────────┘  │
│                                                            │
│  ┌────────────┬────────────┬────────────┬────────────┐     │
│  │  SQLite    │ ChromaDB   │  NetworkX  │  LLM API   │     │
│  │ (users,    │ (vector    │ (knowledge │ (streaming │     │
│  │ messages,  │ embeddings)│ graph JSON)│ responses) │     │
│  │  docs)     │            │            │            │     │
│  └────────────┴────────────┴────────────┴────────────┘     │
└────────────────────────────────────────────────────────────┘
```
Technology Stack:
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React 19, TypeScript, Vite 7 | Modern SPA with hot reload |
| Styling | Tailwind CSS v4 | Utility-first, dark theme |
| State | Zustand | Lightweight global state |
| Backend | FastAPI 0.109, Python 3.12 | Async API with auto docs |
| ORM | SQLAlchemy 2 | Type-safe database queries |
| Database | SQLite (default) | Zero-config, file-based DB |
| Vectors | ChromaDB | Semantic search & embeddings |
| Graphs | NetworkX | Entity relationship mapping |
| Auth | Session cookies, bcrypt | Secure, httponly sessions |
| Streaming | SSE (Server-Sent Events) | Real-time chat responses |
| LLM | OpenAI-compatible API | TabbyAPI, Ollama, vLLM, etc. |
| Containers | Docker Compose | Orchestrated deployment |
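The SSE layer in the table is just a line-oriented text protocol: each token the LLM produces is flushed to the browser as a `data:` frame, and a blank line terminates each event. A minimal sketch of how a backend might serialize a token stream (the `delta` field name and `[DONE]` sentinel follow common OpenAI-style conventions and are illustrative here):

```python
import json
from typing import Iterable, Iterator

def sse_frames(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each token in a Server-Sent Events frame.
    Browsers consume these via EventSource or fetch with a ReadableStream."""
    for tok in tokens:
        yield f"data: {json.dumps({'delta': tok})}\n\n"
    yield "data: [DONE]\n\n"  # conventional end-of-stream sentinel

for frame in sse_frames(["Hel", "lo"]):
    print(repr(frame))
```

The frontend concatenates the `delta` values as they arrive, which produces the typewriter effect described above.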
Comprehensive guides available in the GitHub Wiki:
| Guide | Description |
|---|---|
| Installation | Detailed setup for Docker and manual deployment |
| Configuration | Environment variables, LLM backends, ChromaDB |
| Knowledge Vault | Document upload, RAG setup, search optimization |
| Long-Term Memory | Vector search, knowledge graphs, context injection |
| Admin Dashboard | User management, model switching, logs |
| API Reference | REST endpoints, SSE streaming, authentication |
| Architecture | System design, data flow, service interaction |
| Deployment | Production checklist, HTTPS, reverse proxy, scaling |
| Developer Guide | Local setup, testing, contributing |
- Self-hosted ChatGPT alternative - Complete control over your AI infrastructure
- Clean architecture - FastAPI + React with proper separation of concerns
- Full API access - Build integrations, bots, or custom frontends
- Docker-first - Reproducible environments, easy deployment
- Open to extend - Add custom models, tools, or integrations
- Private deployment - No data leaves your infrastructure
- Compliance-friendly - GDPR, HIPAA, SOC2 compatible when self-hosted
- Multi-user with RBAC - Admin and user roles, session management
- Audit controls - User actions, system logs, conversation exports
- Security-first - Bcrypt passwords, httponly cookies, CORS controls
- Beautiful UI for local LLMs - Llama, Mistral, Qwen, Yi, DeepSeek, etc.
- Built-in RAG - Upload PDFs, query your documents with citations
- Memory system - AI remembers context across sessions
- One-command deploy - `docker compose up` and you're running
- Claude.AI-inspired UX - Professional dark theme, smooth animations
Backend:

```bash
cd backend
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload --port 8000
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```

Frontend dev server: http://localhost:5173
Backend API: http://localhost:8000
```bash
cd backend
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=backend --cov-report=html
```

Current test status: 276 tests passing
See Contributing Guide for:
- Code style guidelines (Black, Flake8, ESLint)
- Commit conventions (Conventional Commits)
- Pull request process
- Development workflow
```bash
# Production deployment
docker compose up -d

# View logs
docker compose logs -f

# Restart services
docker compose restart backend frontend
```

Before deploying to production:

- Change `SECRET_KEY` to a random 32+ character string
- Configure HTTPS via reverse proxy (nginx, Caddy, Traefik)
- Set up backups for the `./data` directory (SQLite, knowledge graphs)
- Review CORS settings in `backend/main.py`
- Create a strong admin password (min 12 chars, mixed case, symbols)
- Configure ChromaDB persistence with an external volume
- Set `SESSION_EXPIRE_HOURS` to an appropriate value (default: 24)
- Enable firewall rules (allow 80/443; block 8000/3001 externally)
- Set up monitoring (uptime, logs, resource usage)

See Production Deployment Guide for detailed instructions.
Key environment variables:
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | `sqlite:///./data/gantry.db` | SQLAlchemy database connection string |
| `SECRET_KEY` | `dev-secret-change-in-production` | Session signing secret (change in prod!) |
| `CHROMA_HOST` | `http://localhost:8000` | ChromaDB HTTP endpoint for vectors |
| `TABBY_HOST` | `http://localhost:5000` | LLM API endpoint (OpenAI-compatible) |
| `SESSION_EXPIRE_HOURS` | `24` | Session cookie lifetime in hours |
| `VITE_API_URL` | `http://localhost:8000` | Backend URL for frontend (build-time) |
Full configuration guide: Wiki > Configuration
Contributions are welcome! Please:

- Fork the repository
- Create a feature branch (`git checkout -b feat/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feat/amazing-feature`)
- Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
- ✅ v2.0 - Complete rewrite (FastAPI + React)
- ✅ Streaming chat - SSE real-time responses
- ✅ Long-term memory - ChromaDB + NetworkX
- ✅ Knowledge Vault - RAG with document upload
- ✅ Admin panel - Users, models, services, logs
- ✅ Model switching - Hot-swap LLMs via TabbyAPI
- ✅ Conversation export - JSON and PDF formats
- ✅ Searchable history - Ctrl+K command palette
- 🚧 Personas - Custom system prompts (planned)
- 🚧 Multi-modal - Image upload support (planned)
Proprietary - West AI Labs LLC
This software is proprietary and not open source. All rights reserved.
Built with excellent open-source tools:
- FastAPI - Modern Python web framework
- React - UI library
- ChromaDB - Vector database
- Tailwind CSS - Utility-first CSS
- Zustand - State management
- NetworkX - Graph algorithms
UI design inspired by Claude.AI and ChatGPT.
Part of the Nebulus AI Ecosystem:
- Nebulus Prime - Complete local AI infrastructure (Linux/NVIDIA)
- Nebulus Edge - macOS deployment with MLX (Apple Silicon)
- Nebulus Core - Shared Python library and CLI framework
Made with ❤️ for the self-hosted AI community
