Add comprehensive secret detection and update documentation

christopherpaquin · christopherpaquin · commit cf83351d4a9c · 2026-01-02T15:06:35.000-05:00
- Enhanced .gitignore with all secrets categories from CONTEXT.md
  - API keys, access tokens, cloud credentials, private keys
  - Service account credentials, webhook secrets
  - Organized with CONTEXT.md section references
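
How broadly the new ignore globs match can be checked with a short sketch. It uses Python's `fnmatch` as a rough stand-in for gitignore matching, which is only an approximation that holds for slash-free patterns tested against bare file names. The patterns are copied from the `.gitignore` hunk in this commit; the candidate file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the .gitignore hunk in this commit.
PATTERNS = ["*pat*", "*api*key*", "*credentials*.json"]

# Hypothetical file names, chosen only for illustration.
CANDIDATES = ["github_pat.txt", "path_utils.py", "patch_notes.md", "api_key.txt", "README.md"]


def matching_patterns(name: str) -> list[str]:
    """Return the patterns that match a bare file name (case-sensitive)."""
    return [p for p in PATTERNS if fnmatchcase(name, p)]


if __name__ == "__main__":
    for name in CANDIDATES:
        print(f"{name}: {matching_patterns(name) or 'not ignored'}")
```

Under this approximation, a short glob such as `*pat*` also matches ordinary names like `path_utils.py`, so it may be worth confirming real behavior with `git check-ignore -v <path>` before relying on it.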

- Created scripts/detect-secrets.sh for pre-commit secret scanning
  - Detects API keys (Stripe, OpenAI, GitHub, AWS, etc.)
  - Detects tokens, credentials, private keys, JWTs
  - Uses entropy analysis for high-entropy strings
  - Filters false positives (variable names, examples, URLs, comments)
  - Excludes test files and example files from scanning
  - Provides clear error messages with file location and context
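
The entropy check mentioned above can be illustrated with a minimal sketch of Shannon entropy over the characters of a candidate string. This is not the script's implementation (the script body is not shown in this commit); `MIN_LENGTH` and `ENTROPY_THRESHOLD` are hypothetical tuning values picked for the example.

```python
import math
from collections import Counter

# Hypothetical tuning values; the script's real thresholds are not shown here.
MIN_LENGTH = 20
ENTROPY_THRESHOLD = 4.0


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_high_entropy(token: str) -> bool:
    """Flag tokens that are both long enough and close to random-looking."""
    return len(token) >= MIN_LENGTH and shannon_entropy(token) >= ENTROPY_THRESHOLD
```

A repeated character scores 0 bits and a string of 32 distinct characters scores 5 bits, so ordinary identifiers and short placeholders tend to fall below a threshold of this kind while random tokens tend to exceed it.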

- Updated .pre-commit-config.yaml
  - Added detect-secrets hook as local hook
  - Runs automatically on commit
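
Because the hook is registered with `pass_filenames: false`, the script has to discover the staged files itself. One common way to do that is `git diff --cached --name-only`, sketched below in Python. How `scripts/detect-secrets.sh` actually does it is not shown in this commit; `run_git` and `staged_files` are hypothetical helper names.

```python
import subprocess
from typing import Callable

# --cached: compare the index (staged content) against HEAD.
# --diff-filter=ACM: keep only added, copied, and modified paths.
# -z: NUL-terminate paths so names with spaces or newlines stay intact.
GIT_STAGED_CMD = ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM", "-z"]


def run_git(cmd: list[str]) -> bytes:
    """Run a git command and return its raw stdout."""
    return subprocess.run(cmd, check=True, capture_output=True).stdout


def staged_files(runner: Callable[[list[str]], bytes] = run_git) -> list[str]:
    """Return the paths currently staged for commit."""
    raw = runner(GIT_STAGED_CMD)
    return [p.decode("utf-8", errors="replace") for p in raw.split(b"\0") if p]
```

The `runner` parameter exists only so the parsing can be exercised without a real repository, for example `staged_files(lambda cmd: b"a.py\0docs/b c.md\0")`.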

- Updated bootstrap-template-structure.sh
  - Added scripts/detect-secrets.sh to file list

- Updated docs/ai/CONTEXT.md
  - Section 7.1.4: Documented detect-secrets.sh implementation
  - Section 11.3: Added AI agent requirements for secret detection
  - Instructions for handling false positives
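
The interplay between known-format matching and false-positive filtering can be sketched as follows. This is an illustration, not the script's logic: the two regular expressions cover only the AWS access key ID and classic GitHub personal access token formats, and `PLACEHOLDER_MARKERS` is a hypothetical allowlist.

```python
import re

# Two well-known token shapes; a real scanner would carry many more.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub personal access token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}

# Hypothetical allowlist: substrings that mark a value as a placeholder.
PLACEHOLDER_MARKERS = ("YOUR_", "EXAMPLE", "PLACEHOLDER", "CHANGEME")


def find_secrets(line: str) -> list[str]:
    """Return labels of secret formats found in one line of text."""
    stripped = line.strip()
    if stripped.startswith("#"):
        # Comment lines are skipped, mirroring the filtering described above.
        return []
    hits = []
    for label, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(stripped):
            token = match.group(0).upper()
            if any(marker in token for marker in PLACEHOLDER_MARKERS):
                continue  # Looks like documentation or a template value.
            hits.append(label)
    return hits
```

With this shape, handling a new false positive means adding one entry to the allowlist rather than loosening a detection pattern.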

- Updated README.md
  - Added Secret Detection subsection in Configuration
  - Documented what it detects and false positive filtering
  - Updated Security Notes section
  - Added detect-secrets.sh to Key Files Reference table

All pre-commit checks passing
diff --git a/.gitignore b/.gitignore
@@ -8,6 +8,7 @@
 # ---------------------------------------------------------------------------
 # Environment variables / secrets
 # ---------------------------------------------------------------------------
+# Per docs/ai/CONTEXT.md Section 7.1 - Secrets Management
 
 # Standard environment files
 .env
@@ -24,21 +25,94 @@ secrets.*
 *.secret
 *.secrets
 
-# Credential files
-*.pem
-*.key
-*.crt
-*.pfx
-*.p12
-id_rsa
-id_ed25519
+# ---------------------------------------------------------------------------
+# API Keys and Access Tokens (Section 7.1.2.1, 7.1.2.2)
+# ---------------------------------------------------------------------------
+# API key files
+*api*key*
+*apikey*
+*access*token*
+*oauth*token*
+*refresh*token*
+*pat*
+*personal*access*token*
 
-# Cloud / tooling credentials
+# ---------------------------------------------------------------------------
+# Cloud Provider Credentials (Section 7.1.2.4)
+# ---------------------------------------------------------------------------
 .aws/
 .gcp/
 .azure/
+.terraform/
 terraform.tfvars
 terraform.tfvars.json
+*.tfvars
+*.tfvars.json
+
+# GCP service account keys
+*service-account*.json
+*gcp*.json
+*gcloud*.json
+
+# AWS credentials
+.aws/credentials
+.aws/config
+aws-credentials.json
+
+# Azure credentials
+.azure/credentials
+azure-credentials.json
+
+# ---------------------------------------------------------------------------
+# Cryptographic Private Keys (Section 7.1.2.5)
+# ---------------------------------------------------------------------------
+# SSH keys
+id_rsa
+id_ed25519
+id_ecdsa
+id_dsa
+*.pem
+*.key
+*.p12
+*.pfx
+*.crt
+*.cer
+*.der
+
+# TLS/SSL certificates (private keys only - public certs may be OK)
+*.key
+*.pem
+!*.pub
+!*.public
+
+# Signing keys
+*.signing.key
+*.private.key
+
+# ---------------------------------------------------------------------------
+# Service Account Credentials (Section 7.1.2.6)
+# ---------------------------------------------------------------------------
+*service-account*.json
+*service-account*.yaml
+*service-account*.yml
+*credentials*.json
+*credentials*.yaml
+*credentials*.yml
+
+# ---------------------------------------------------------------------------
+# Webhook Secrets (Section 7.1.2.7)
+# ---------------------------------------------------------------------------
+*webhook*secret*
+*webhook*signing*
+*hook*secret*
+
+# ---------------------------------------------------------------------------
+# Generated logs and caches that may contain secrets
+# ---------------------------------------------------------------------------
+*.log
+!artifacts/pre-commit.log
+*.cache
+*.tmp
 
 # ---------------------------------------------------------------------------
 # Python
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -12,6 +12,16 @@ repos:
         args: [--fix=lf]
       - id: detect-private-key
 
+  # Custom secret detection (comprehensive API keys, tokens, credentials)
+  - repo: local
+    hooks:
+      - id: detect-secrets
+        name: Detect Secrets (API Keys, Tokens, Credentials)
+        entry: scripts/detect-secrets.sh
+        language: system
+        pass_filenames: false
+        always_run: false
+
   # Python: Ruff (lint + autofix) and Ruff formatter
   - repo: https://github.com/astral-sh/ruff-pre-commit
     rev: v0.8.4
diff --git a/README.md b/README.md
@@ -367,7 +367,32 @@ Pre-commit hooks are configured in `.pre-commit-config.yaml`. The default config
 | 🐍 Python | Ruff linting and formatting | ✅ Active |
 | 🐚 Bash | ShellCheck and shfmt formatting | ✅ Active |
 | 📄 Markdown | PyMarkdown validation | ✅ Active |
-| 🔒 Security | Private key detection | ✅ Active |
+| 🔒 Security | Private key detection, API key detection, token scanning | ✅ Active |
+
+#### Secret Detection
+
+The framework includes comprehensive secret detection via `scripts/detect-secrets.sh`:
+
+**What It Detects**:
+
+- ✅ API keys (Stripe, OpenAI, Google, AWS, etc.)
+- ✅ GitHub tokens (PATs, OAuth tokens)
+- ✅ Cloud provider credentials (AWS, GCP, Azure)
+- ✅ Private keys (SSH, TLS, signing keys)
+- ✅ OAuth tokens and refresh tokens
+- ✅ JSON Web Tokens (JWTs)
+- ✅ High-entropy strings (potential secrets)
+
+**False Positive Filtering**:
+
+- ✅ Ignores variable names (e.g., `api_key =`)
+- ✅ Ignores example/placeholder values
+- ✅ Ignores URLs and API endpoints
+- ✅ Ignores comments and documentation
+- ✅ Excludes test files and example files
+
+If secrets are detected, the commit will be blocked. Use example placeholders like
+`YOUR_API_KEY_HERE` instead of real secrets.
 
 ### CI/CD Configuration
 
@@ -429,7 +454,7 @@ This framework is designed with security as the highest priority:
 | Security Feature | Status | Description |
 |-----------------|--------|-------------|
 | 🔐 Secrets Protection | ✅ Active | Comprehensive `.gitignore` prevents accidental secret commits |
-| 🔍 Automated Detection | ✅ Active | Pre-commit hooks detect private keys and other secrets |
+| 🔍 Automated Detection | ✅ Active | Pre-commit hooks detect secrets via `scripts/detect-secrets.sh` |
 | 🛡️ Least Privilege | ✅ Active | All scripts should use least privilege principles |
 | ✅ Input Validation | ✅ Active | All inputs should be treated as untrusted |
 | 📋 Audit Trail | ✅ Active | Pre-commit logs provide an audit trail of quality checks |
@@ -475,6 +500,7 @@ Licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for full
 | `.pre-commit-config.yaml` | Quality check configuration | ✅ Required |
 | `.github/workflows/ci.yaml` | CI/CD pipeline definition | ✅ Required |
 | `scripts/run-precommit.sh` | Pre-commit execution wrapper (use this, not `pre-commit` directly) | ✅ Required |
+| `scripts/detect-secrets.sh` | Secret detection script (runs automatically via pre-commit) | ✅ Required |
 | `bootstrap-template-structure.sh` | Recreate template structure | ✅ Optional |
 
 ---
diff --git a/bootstrap-template-structure.sh b/bootstrap-template-structure.sh
@@ -29,6 +29,7 @@ FILES=(
   ".github/ISSUE_TEMPLATE/feature_request.yml"
   ".github/ISSUE_TEMPLATE/bug_report.yml"
   "scripts/run-precommit.sh"
+  "scripts/detect-secrets.sh"
 )
 
 echo "==> Creating directory structure (no overwrite)"
diff --git a/docs/ai/CONTEXT.md b/docs/ai/CONTEXT.md
@@ -294,10 +294,23 @@ Secrets must not appear:
 
 Repositories must enforce:
 
-- Pre-commit secret scanning
-- CI-level secret detection
+- Pre-commit secret scanning via `scripts/detect-secrets.sh`
+- CI-level secret detection (runs same pre-commit hooks)
 - AI agent instructions prohibiting secret creation or logging
 
+**Secret Detection Implementation**: This framework includes `scripts/detect-secrets.sh`, a
+comprehensive secret detection script that:
+
+- Scans staged files for API keys, tokens, credentials, and private keys
+- Uses pattern matching for known secret formats (GitHub tokens, AWS keys, Stripe keys, etc.)
+- Applies entropy analysis to detect high-entropy strings
+- Filters false positives (variable names, example values, API endpoints, comments)
+- Excludes test files, example files, and documentation from scanning
+- Provides clear error messages with file location and context
+
+AI agents must understand that this script will block commits containing secrets and must
+never attempt to bypass or disable this protection.
+
 High-entropy strings and known token patterns must be treated as potential secrets until proven otherwise.
 
 #### 7.1.5 Security Posture Statement
@@ -509,6 +522,15 @@ AI agents must execute pre-commit using the repository helper script:
 ./scripts/run-precommit.sh
 ```
 
+**Secret Detection**: The pre-commit hooks include `scripts/detect-secrets.sh`, which
+automatically scans all staged files for secrets, API keys, and credentials. If secrets
+are detected, the commit is blocked. AI agents must:
+
+- Never attempt to bypass secret detection
+- Use example placeholders (e.g., `YOUR_API_KEY_HERE`) instead of real secrets
+- Ensure all secrets are stored in `.env` files (which are excluded from git)
+- If a false positive is detected, add appropriate allowlist patterns to the script
+
 ---
 
 ## 12. Why Security Matters in This Framework
diff --git a/scripts/detect-secrets.sh b/scripts/detect-secrets.sh
