Commit b63151a

Neverdecel authored
chore: tighten repo tooling (#2)
Co-authored-by: Neverdecel <neverdecel@proton.me>
1 parent 42b17f6 commit b63151a

13 files changed

Lines changed: 278 additions & 128 deletions

.github/workflows/ci-tests.yml

Lines changed: 11 additions & 8 deletions
@@ -5,32 +5,34 @@ on:
     branches: [ main, master, develop ]
   pull_request:
     branches: [ main, master, develop ]
+  schedule:
+    - cron: "0 3 * * *"
 
 jobs:
   test-imports:
     runs-on: ubuntu-latest
 
     steps:
     - uses: actions/checkout@v4
-
+
     - name: Set up Python 3.11
       uses: actions/setup-python@v5
       with:
         python-version: '3.11'
-
+
     - name: Cache pip dependencies
       uses: actions/cache@v4
       with:
         path: ~/.cache/pip
         key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
         restore-keys: |
           ${{ runner.os }}-pip-
-
+
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
         pip install -r requirements.txt
-
+
     - name: Test Import Structure
       run: |
         python -c "import coderag.config; print('✓ Config import successful')"
@@ -53,13 +55,14 @@ jobs:
       run: |
         python -m pip install --upgrade pip
         pip install -r requirements.txt
-        pip install black flake8 isort mypy pytest
-    - name: Lint and type-check
+    - name: Format check
       run: |
         black --check .
         isort --check-only .
-        flake8 . --max-line-length=88 --ignore=E203,W503
-        mypy .
+    - name: Lint
+      run: flake8 . --max-line-length=88 --ignore=E203,W503
+    - name: Type check
+      run: mypy .
     - name: Run tests
       env:
         PYTHONPATH: ${{ github.workspace }}

.pre-commit-config.yaml

Lines changed: 17 additions & 24 deletions
@@ -1,36 +1,29 @@
 repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.6.0
+    hooks:
+      - id: check-added-large-files
+      - id: end-of-file-fixer
+      - id: trailing-whitespace
   - repo: https://github.com/psf/black
-    rev: 23.12.1
+    rev: 24.8.0
     hooks:
       - id: black
         language_version: python3
-        args: ['--line-length=88']
-
-  - repo: https://github.com/pycqa/flake8
-    rev: 7.0.0
-    hooks:
-      - id: flake8
-        args: ['--max-line-length=88', '--ignore=E203,W503']
-
-  - repo: https://github.com/pycqa/isort
+  - repo: https://github.com/PyCQA/isort
     rev: 5.13.2
     hooks:
       - id: isort
         args: ["--profile", "black"]
-
+  - repo: https://github.com/PyCQA/flake8
+    rev: 7.1.1
+    hooks:
+      - id: flake8
+        additional_dependencies: ["flake8-bugbear==24.4.26"]
+        args: ["--max-line-length=88", "--ignore=E203,W503"]
   - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.8.0
+    rev: v1.11.1
     hooks:
       - id: mypy
-        additional_dependencies: [types-all]
-        args: [--ignore-missing-imports, --no-strict-optional]
-
-  - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.5.0
-    hooks:
-      - id: trailing-whitespace
-      - id: end-of-file-fixer
-      - id: check-yaml
-      - id: check-added-large-files
-      - id: check-merge-conflict
-      - id: debug-statements
+        additional_dependencies: ["types-requests"]
+        args: ["--config-file=pyproject.toml"]

DEVELOPMENT.md

Lines changed: 21 additions & 12 deletions
@@ -10,18 +10,22 @@ cd CodeRAG
 python -m venv venv
 source venv/bin/activate # Windows: venv\Scripts\activate
 pip install -r requirements.txt
+
+> The requirements file delegates to `-e .[dev]`, so you can also run
+> `pip install -e .[dev]` directly if you prefer editable installs.
 ```
 
 ### 2. Configure Pre-commit Hooks
 
 ```bash
 pip install pre-commit
 pre-commit install
+pre-commit run --all-files
 ```
 
 This will run code quality checks on every commit:
 - **Black**: Code formatting
-- **isort**: Import sorting
+- **isort**: Import sorting
 - **Flake8**: Linting and style checks
 - **MyPy**: Type checking
 - **Basic hooks**: Trailing whitespace, file endings, etc.
@@ -71,11 +75,11 @@ Use concise docstrings for public functions:
 ```python
 def search_code(query: str, k: int = 5) -> List[Dict[str, Any]]:
     \"\"\"Search the FAISS index using a text query.
-
+
     Args:
         query: The search query text
         k: Number of results to return
-
+
     Returns:
         List of search results with metadata
     \"\"\"
@@ -94,18 +98,16 @@ streamlit run app.py
 
 ### Code Quality Checks
 ```bash
-# Format code
+pre-commit run --all-files
+```
+
+If you need to run a specific tool locally:
+
+```bash
 black .
 isort .
-
-# Check linting
 flake8 .
-
-# Type checking
 mypy .
-
-# Run all pre-commit checks
-pre-commit run --all-files
 ```
 
 ## Adding New Features
@@ -173,6 +175,13 @@ print(f"Shape: {result.shape if result is not None else 'None'}")
 - Check file permissions
 - Ensure consistent embedding dimensions
 
+## Routine Maintenance
+
+- **Regenerate the FAISS index** after large code refactors: `python scripts/initialize_index.py`.
+- **Rotate environment secrets** by updating `.env` or your deployment variables, then restarting services.
+- **Refresh dependencies** with `pip install --upgrade -r requirements.txt` and run `pre-commit run --all-files` plus `pytest -q`.
+- **Keep hooks current** using `pre-commit autoupdate` followed by a commit once checks pass.
+
 ## Project Structure
 
 ```
@@ -191,4 +200,4 @@ CodeRAG/
 ├── app.py # Streamlit frontend
 ├── prompt_flow.py # RAG orchestration
 └── requirements.txt # Dependencies
-```
+```

README.md

Lines changed: 15 additions & 13 deletions
@@ -2,7 +2,7 @@
 
 [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
-[![Code Quality](https://github.com/neverdecel/CodeRAG/workflows/Code%20Quality/badge.svg)](https://github.com/neverdecel/CodeRAG/actions)
+[![CI Tests](https://github.com/Neverdecel/CodeRAG/actions/workflows/ci-tests.yml/badge.svg?branch=main)](https://github.com/Neverdecel/CodeRAG/actions/workflows/ci-tests.yml)
 
 > **Note**: This POC was innovative for its time, but modern tools like Cursor and Windsurf now apply this principle directly in IDEs. This remains an excellent educational project for understanding RAG implementation.
 
@@ -14,7 +14,7 @@ CodeRAG combines **Retrieval-Augmented Generation (RAG)** with AI to provide int
 
 Most coding assistants work with limited scope, but CodeRAG provides the full context of your project by:
 - **Real-time indexing** of your entire codebase using FAISS vector search
-- **Semantic code search** powered by OpenAI embeddings
+- **Semantic code search** powered by OpenAI embeddings
 - **Contextual AI responses** that understand your project structure
 
 ## 🚀 Quick Start
@@ -34,14 +34,17 @@ cd CodeRAG
 python -m venv venv
 source venv/bin/activate # On Windows: venv\\Scripts\\activate
 
-# Install dependencies
+# Install dependencies (installs the package with dev extras)
 pip install -r requirements.txt
 
 # Configure environment
 cp example.env .env
 # Edit .env with your OpenAI API key and settings
 ```
 
+> The requirements file simply references `-e .[dev]`; feel free to run
+> `pip install -e .[dev]` directly if you prefer editable installs.
+
 ### Configuration
 
 Create a `.env` file with your settings:
@@ -63,6 +66,9 @@ python main.py
 
 # In a separate terminal, start the web interface
 streamlit run app.py
+
+# Query the local index from the terminal (after indexing completes)
+coderag-cli "how is faiss configured?"
 ```
 
 ## 📖 How It Works
@@ -88,7 +94,7 @@ graph LR
 
 ```
 CodeRAG/
-├── 🧠 coderag/ # Core RAG functionality
+├── 🧠 coderag/ # Core RAG functionality
 │ ├── config.py # Environment configuration
 │ ├── embeddings.py # OpenAI embedding generation
 │ ├── index.py # FAISS vector operations
@@ -103,7 +109,7 @@ CodeRAG/
 ### Key Components
 
 - **🔍 Vector Search**: FAISS-powered similarity search for code retrieval
-- **🎯 Smart Embeddings**: OpenAI embeddings capture semantic code meaning
+- **🎯 Smart Embeddings**: OpenAI embeddings capture semantic code meaning
 - **📡 Real-time Updates**: Watchdog monitors file changes for live indexing
 - **💬 Conversational UI**: Streamlit interface with chat-like experience
 
@@ -116,7 +122,7 @@ CodeRAG/
 "Show me examples of the embedding generation process"
 ```
 
-### Get Improvements
+### Get Improvements
 ```
 "How can I optimize the search performance?"
 "What are potential security issues in this code?"
@@ -125,7 +131,7 @@ CodeRAG/
 
 ### Debug Issues
 ```
-"Why might the search return no results?"
+"Why might the search return no results?"
 "How do I troubleshoot OpenAI connection issues?"
 "What could cause indexing to fail?"
 ```
@@ -138,11 +144,7 @@ CodeRAG/
 # Install pre-commit hooks
 pip install pre-commit
 pre-commit install
-
-# Run formatting and linting
-black .
-flake8 .
-mypy .
+pre-commit run --all-files
 ```
 
 ### Testing
@@ -165,7 +167,7 @@ python scripts/run_monitor.py
 - Verify OpenAI API key is working
 - Ensure your query relates to indexed Python files
 
-**OpenAI API errors**
+**OpenAI API errors**
 - Verify API key in `.env` file
 - Check API usage limits and billing
 - Ensure model names are correct (gpt-4, text-embedding-ada-002)

coderag/cli.py

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
"""Minimal command-line interface for querying an existing CodeRAG index."""
2+
3+
import argparse
4+
import logging
5+
import textwrap
6+
from typing import List
7+
8+
from coderag.search import search_code
9+
10+
11+
def _format_result(result: dict, index: int) -> str:
12+
snippet = textwrap.shorten(
13+
result.get("content", "").replace("\n", " "), width=200, placeholder="..."
14+
)
15+
return (
16+
f"{index}. {result.get('filename')} ({result.get('filepath')})\n"
17+
f" similarity={result.get('distance', 0.0):.3f}\n"
18+
f" {snippet}"
19+
)
20+
21+
22+
def main(argv: List[str] | None = None) -> int:
23+
parser = argparse.ArgumentParser(
24+
description="Query a local CodeRAG FAISS index without the Streamlit UI."
25+
)
26+
parser.add_argument("query", help="Text to search for in the indexed codebase.")
27+
parser.add_argument(
28+
"-k",
29+
type=int,
30+
default=5,
31+
help="Maximum number of matches to display (defaults to 5).",
32+
)
33+
parser.add_argument(
34+
"--log-level",
35+
default="WARNING",
36+
choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
37+
help="Logging verbosity for debugging issues.",
38+
)
39+
40+
args = parser.parse_args(argv)
41+
42+
logging.basicConfig(level=getattr(logging, args.log_level))
43+
44+
results = search_code(args.query, k=args.k)
45+
if not results:
46+
print("No results found; ensure the FAISS index exists and contains data.")
47+
return 1
48+
49+
for idx, item in enumerate(results, start=1):
50+
print(_format_result(item, idx))
51+
52+
return 0
53+
54+
55+
if __name__ == "__main__":
56+
raise SystemExit(main())
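
The control flow of the new CLI can be tried without an index or an API key. The sketch below is illustrative only: `fake_search_code` is a hypothetical stand-in for `coderag.search.search_code`, the sample hits are made-up data, and the three-space indent inside the formatted output is an assumption (the page does not preserve the exact whitespace). It assumes result dicts carry the `filename`, `filepath`, `distance`, and `content` keys that the diff reads.

```python
import argparse
import textwrap
from typing import Dict, List, Optional


def fake_search_code(query: str, k: int = 5) -> List[Dict[str, object]]:
    """Hypothetical stand-in for coderag.search.search_code (not the real function)."""
    hits: List[Dict[str, object]] = [
        {
            "filename": "index.py",
            "filepath": "coderag/index.py",
            "distance": 0.12345,
            "content": "import faiss\n\nindex = faiss.IndexFlatL2(1536)",
        },
        {
            "filename": "config.py",
            "filepath": "coderag/config.py",
            "distance": 0.5,
            "content": "EMBEDDING_DIM = 1536",
        },
    ]
    return hits[:k]


def format_result(result: Dict[str, object], index: int) -> str:
    # Flatten newlines, then cap the snippet at 200 characters.
    snippet = textwrap.shorten(
        str(result.get("content", "")).replace("\n", " "), width=200, placeholder="..."
    )
    return (
        f"{index}. {result.get('filename')} ({result.get('filepath')})\n"
        f"   similarity={result.get('distance', 0.0):.3f}\n"
        f"   {snippet}"
    )


def run(argv: Optional[List[str]] = None) -> int:
    parser = argparse.ArgumentParser(description="Stubbed CodeRAG query sketch.")
    parser.add_argument("query")
    parser.add_argument("-k", type=int, default=5)
    args = parser.parse_args(argv)

    results = fake_search_code(args.query, k=args.k)
    if not results:
        print("No results found.")
        return 1  # a non-zero status signals an empty result set
    for idx, item in enumerate(results, start=1):
        print(format_result(item, idx))
    return 0


exit_code = run(["how is faiss configured?", "-k", "1"])
```

One thing the sketch makes visible: the output label reads `similarity`, but the value comes from the `distance` key. If the index reports a distance metric, smaller values would mean closer matches, so the label may read inverted.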

coderag/embeddings.py

Lines changed: 1 addition & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -3,11 +3,7 @@
33

44
import numpy as np
55
from openai import OpenAI
6-
from tenacity import (
7-
retry,
8-
stop_after_attempt,
9-
wait_exponential,
10-
)
6+
from tenacity import retry, stop_after_attempt, wait_exponential
117

128
from coderag.config import OPENAI_API_KEY, OPENAI_EMBEDDING_MODEL
139

example.env

Lines changed: 1 addition & 1 deletion
@@ -8,4 +8,4 @@ WATCHED_DIR=/home/user/projects/my_codebase
 
 # FAISS Configuration
 FAISS_INDEX_FILE=/home/user/projects/coderag/faiss_index.bin
-EMBEDDING_DIM=1536
+EMBEDDING_DIM=1536
