The Intelligent Analytics Assistant is a full-stack, AI-powered application that allows users to query a large-scale MySQL database using natural language. It leverages a powerful Large Language Model (LLM) to translate business questions into complex, executable SQL, providing real-time analytics on a 1.3 million-row dataset.
- Natural Language Query: Translate English questions into valid MySQL queries, including complex joins and window functions.
- Dual-Engine LLM: Switch easily between Gemini 2.5 Flash (via API, for reliability) and TinyLlama (via Ollama, for local execution).
- High-Volume Data: Successfully handles a database containing over 1.3 million records across three key tables (`orders`, `order_items`, `customer_meetings`).
- Full-Stack Architecture: Separates concerns across a dedicated frontend, backend API, and database layer.
- Debugging/Transparency: Displays the generated SQL query before execution for transparency and debugging.
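Before the generated SQL can be displayed or executed, the LLM's raw reply usually needs light cleanup, since models often wrap queries in Markdown code fences. The helper below is a hypothetical sketch of that step, not code taken from the project:

```python
import re

def extract_sql(llm_output: str) -> str:
    """Strip Markdown code fences and surrounding whitespace from an LLM reply."""
    # Match ```sql ... ``` or bare ``` ... ``` fences anywhere in the reply
    match = re.search(r"```(?:sql)?\s*(.*?)```", llm_output, re.DOTALL | re.IGNORECASE)
    sql = match.group(1) if match else llm_output
    # Normalize: trim whitespace and ensure exactly one trailing semicolon
    return sql.strip().rstrip(";") + ";"

print(extract_sql("```sql\nSELECT SUM(total_qty) FROM orders;\n```"))  # SELECT SUM(total_qty) FROM orders;
```

A step like this keeps the query shown in the UI identical to the one actually executed.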
The project is built on three main components:
| Component | Technology | Role |
|---|---|---|
| Frontend | Streamlit | Provides the interactive chat interface and visualizes the results. |
| Backend | FastAPI (Python) | The core API that handles all logic: accepts the natural language query, calls the LLM, and executes the final SQL against the database. |
| LLM (AI Brain) | Gemini 2.5 Flash (or TinyLlama) | Generates the required MySQL query based on the database schema. |
| Database | MySQL | Stores the 1.3M+ rows of raw analytical data. |
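The request flow through these components can be sketched end to end: accept the question, ask the LLM for SQL, run the SQL, and return both so the UI can show the query alongside the results. Everything below is illustrative, with the function names, stub return values, and schema string as assumptions, and the LLM call and MySQL execution stubbed out:

```python
def generate_sql(question: str, schema: str) -> str:
    """Stub for the LLM call (Gemini 2.5 Flash or TinyLlama in the real backend)."""
    return "SELECT SUM(total_qty) FROM orders;"

def run_query(sql: str) -> list:
    """Stub for executing the SQL against MySQL."""
    return [(42,)]

def answer(question: str) -> dict:
    schema = "orders(order_id, total_qty, ...)"  # the real backend would send the live schema
    sql = generate_sql(question, schema)
    rows = run_query(sql)
    # The generated SQL is returned alongside the rows so the UI can display it
    return {"sql": sql, "rows": rows}

result = answer("What is the total quantity of all orders?")
print(result["sql"])
```

The key design point is that the backend returns the SQL with the results rather than hiding it, which is what makes the transparency/debugging feature possible.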
Follow these steps to get the application running on your local machine.
You must have the following installed:
- Python 3.10+
- MySQL Server (and administrative access)
- Ollama (optional; only needed to run TinyLlama locally)
```
git clone https://github.com/HenryMorganDibie/intelligent-analytics-assistant.git
cd intelligent-analytics-assistant
```

Create and activate a Python virtual environment, and install the required dependencies.
```
python -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt
```

You must set up the MySQL database and load the data.
- Create Database: Log into your MySQL server and create an empty database named `analytics_db`.
- Configure `.env`: Fill in your MySQL credentials and LLM preferences as shown below.
Create a file named `.env` in the project root with the following content:

```
# Database Credentials
DB_NAME=analytics_db
DB_USER=root
DB_PASS="YOUR_ACTUAL_PASSWORD"
DB_HOST=localhost
DB_PORT=3306

# LLM Engine Configuration (set USE_GEMINI=True for the reliable, API-based setup)
USE_GEMINI=True
GEMINI_API_KEY="YOUR_GEMINI_API_KEY"
GEMINI_MODEL="gemini-2.5-flash"

# (Optional) Ollama Configuration
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=tinyllama
```

- Load Data: Use the provided script to create the tables and load the data.

```
python data_loader.py
```
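Loading 1.3M+ rows is far faster with batched inserts than with one INSERT per row. The sketch below shows the batching idea only; it is hypothetical, and the real `data_loader.py` may load the data differently:

```python
def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# With a real MySQL connection, each chunk would go through one executemany() call:
# for batch in chunked(rows, 10_000):
#     cursor.executemany("INSERT INTO orders (order_id, total_qty) VALUES (%s, %s)", batch)
#     conn.commit()

print([len(b) for b in chunked(range(25), 10)])  # [10, 10, 5]
```

Committing once per chunk rather than once per row keeps transaction overhead low at this data volume.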
(Optional) If you are running TinyLlama locally, pull the model via Ollama:

```
# Navigate to your Ollama installation directory
cd "$env:LOCALAPPDATA\Programs\Ollama"

# Pull the required model
.\ollama pull tinyllama
```

You must run the backend and frontend in two separate terminals.
- Launch the Backend API (Terminal 1):

```
# Ensure you are in the project root and the .venv is active
uvicorn backend.main:app --reload
```

- Launch the Frontend UI (Terminal 2):

```
# Ensure you are in the project root and the .venv is active
streamlit run frontend/app.py
```

Try these questions to test its full functionality:
| Complexity | Question | Expected SQL Component |
|---|---|---|
| Simple Aggregate | "What is the total quantity of all orders?" | `SELECT SUM(total_qty) FROM orders` |
| Relational Join | "What is the average total price of orders placed by customers who have never had a meeting?" | `LEFT JOIN ... WHERE meetingid IS NULL` |
| Window Function | "For every customer, list the date of their meeting and the date of their previous meeting." | `LAG(meetingdate) OVER (PARTITION BY customer_id ...)` |
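The window-function case pairs each meeting with the same customer's previous one. The LAG semantics can be illustrated in plain Python (toy data and column names, not the real schema):

```python
# Toy rows: (customer_id, meeting_date) -- illustrative data, not the real table
meetings = [
    (1, "2024-01-05"),
    (1, "2024-03-10"),
    (2, "2024-02-01"),
    (1, "2024-06-20"),
]

# Equivalent of LAG(meetingdate) OVER (PARTITION BY customer_id ORDER BY meetingdate):
# sorting groups rows per customer in date order, and we carry the last date seen.
previous = {}  # customer_id -> most recent meeting date seen so far
result = []
for customer_id, date in sorted(meetings):
    result.append((customer_id, date, previous.get(customer_id)))  # None plays the role of SQL NULL
    previous[customer_id] = date

for row in result:
    print(row)  # first row per customer has None as its "previous meeting"
```

This is why the expected SQL partitions by `customer_id`: without the partition, LAG would leak one customer's meeting date into another customer's row.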
Henry Dibie