Commit f91540c

docs: Update README and AGENTS.md for new features
1 parent 96c0df0 commit f91540c

2 files changed

Lines changed: 73 additions & 31 deletions

AGENTS.md

Lines changed: 28 additions & 29 deletions
@@ -1,4 +1,4 @@
1-
# Agents Guide – Logmaster
1+
# Agents Guide – privlog
22

33
This document is for **coding agents** that work on this repo.
44
It explains what the project does and where the important code lives.
@@ -7,52 +7,51 @@ It explains what the project does and where the important code lives.
77

88
## 1. Purpose of this project
99

10-
- **`logmaster`** is a Python CLI tool built with Typer.
11-
- Its purpose is to analyze log files, identify patterns, and provide formatted output.
12-
- It uses a `rules/` directory to define custom analysis rules for Semgrep.
10+
- **`privlog`** is a Python CLI tool built with Typer for finding and preventing sensitive data leaks.
11+
- It uses a hybrid approach, combining pattern-based Semgrep rules with a high-precision, language-aware AST-based scanner.
1312

1413
---
1514

1615
## 2. Key files and modules
1716

1817
- `pyproject.toml`
19-
- **Purpose:** Defines project metadata, dependencies (`typer`, `pyyaml`), and entry points.
20-
- **Responsibilities:** Manages the package and its dependencies using setuptools. Includes configuration for package data to ensure rule files are included.
18+
- **Purpose:** Defines project metadata, dependencies (`typer`, `pyyaml`, `semgrep`), and the `privlog` entry point.
19+
- **Responsibilities:** Manages the package and its dependencies.
2120

2221
- `README.md`
23-
- **Purpose:** Provides a high-level overview of the project for human users.
22+
- **Purpose:** Provides a high-level overview for human users.
2423

2524
- `logmaster/`
26-
- The main Python package directory.
27-
28-
- `logmaster/__init__.py`
29-
- **Purpose:** Makes the `logmaster` directory a Python package.
25+
- The main Python package directory. (Note: the project is named `privlog`, but the package directory is still `logmaster`.)
3026

3127
- `logmaster/cli.py`
3228
- **Purpose:** The main entry point for the CLI application.
33-
- **Responsibilities:** Defines the CLI commands and arguments using Typer. It orchestrates calls to the runner and formatter.
29+
- **Responsibilities:** Defines commands and arguments using Typer. Implements the `--warnings`/`-w` flag and filters findings based on severity (`ERROR` vs. `WARNING`).
3430

3531
- `logmaster/runner.py`
36-
- **Purpose:** The main analysis engine, orchestrating checks from multiple sources.
37-
- **Responsibilities:** Runs both the Semgrep-based pattern checks (`_run_semgrep`) and the high-precision `ast_checks`. It then merges the findings from both sources into a single, unified list for the formatter and CLI. It also contains the data classes for the results (`Finding`, `RunResult`).
32+
- **Purpose:** The main analysis engine.
33+
- **Responsibilities:** Runs both Semgrep and AST checks, converts all findings into a common `Finding` object, and determines the final exit code based *only* on the presence of `ERROR`-level findings.
3834

3935
- `logmaster/formatter.py`
40-
- **Purpose:** Handles the presentation of the analysis results.
41-
- **Responsibilities:** Takes a list of `Finding` objects from the runner and prints them to the console in a compact, `Flake8`-like format (`path:line:col CODE message`).
36+
- **Purpose:** Handles the presentation of results.
37+
- **Responsibilities:** Prints findings in a `Flake8`-like format, with color-coding for severities.
4238

4339
- `logmaster/ast_checks.py`
44-
- **Purpose:** A high-precision Python linter using the built-in `ast` module.
45-
- **Responsibilities:** Parses Python source code into an Abstract Syntax Tree to perform complex, language-aware checks that are difficult with pattern-matching alone. It specializes in detecting sensitive variables inside f-strings that are not wrapped in known sanitizing functions (e.g., `get_salted_identifier`).
46-
47-
- `logmaster/rules/`
48-
- **Purpose:** Stores custom analysis rules.
40+
- **Purpose:** A high-precision Python linter using the `ast` module. It is the core of the tool's intelligence.
41+
- **Responsibilities:**
42+
1. **Severity System**: Divides sensitive variable names into `HIGH_CONFIDENCE_SENSITIVE_NAMES` (`ERROR`) and `WARNING_SENSITIVE_NAMES` (`WARNING`).
43+
2. **Multi-Format Detection**: Understands and inspects arguments within f-strings, `.format()` calls, and `%`-style formatting.
44+
3. **`print()` Check**: Scans `print()` statements for sensitive variables, applying the same severity logic as logging calls.
45+
4. **Heuristic Analysis**: Flags risky but not definitively incorrect patterns as `WARNING`s.
46+
- **Finding Codes**:
47+
- `LM2101`: A direct sensitive identifier was found in a logging call. Severity can be `ERROR` or `WARNING`.
48+
- `LM2201`: A logging call uses the `extra` parameter, which could hide sensitive data. Severity is `WARNING`.
49+
- `LM2202`: `json.dumps()` is used in a logging call. Severity is `WARNING`.
50+
- `LM2203`: `.to_dict()` is used in a logging call. Severity is `WARNING`.
51+
- `LM2301`: A direct sensitive identifier was found in a `print()` call. Severity can be `ERROR` or `WARNING`.
52+
- `LM2302`: `json.dumps()` is used in a `print()` call. Severity is `WARNING`.
53+
- `LM2303`: `.to_dict()` is used in a `print()` call. Severity is `WARNING`.
4954

5055
- `logmaster/rules/logmaster.yml`
51-
- **Purpose:** The core Semgrep ruleset for LogMaster, based on production-proven patterns.
52-
- **Responsibilities:** Defines specific, categorized patterns to detect common logging anti-patterns. The rules are grouped by ID prefixes:
53-
- `LM11xx`: High-signal PII leaks (e.g., raw emails, user IDs, IP addresses).
54-
- `LM12xx`: High-confidence secret leakage, focusing on raw authentication headers (`Authorization`, `Cookie`). Complex variable name checks are handled by the AST module.
55-
- `LM13xx`: Raw payload and header dumping.
56-
- `LM14xx`: Unsafe exception logging that may leak sensitive data.
57-
- `LM15xx`: Unbounded logging of vendor API responses.
58-
Each rule has a unique ID, severity, and a clear message.
56+
- **Purpose:** The core Semgrep ruleset, which complements the AST checker by finding broader, less precise patterns.
57+
- **Responsibilities:** Defines rules for detecting PII, secrets, and unsafe logging patterns like payload dumping.

README.md

Lines changed: 45 additions & 2 deletions
@@ -1,3 +1,46 @@
1-
# Logmaster
1+
# privlog
22

3-
A CLI for mastering logs.
3+
A privacy-aware linter for Python projects, designed to catch accidental leaks of sensitive data in logs and `print` statements before they reach production.
4+
5+
`privlog` is built to be a developer's first line of defense, integrating directly into your local workflow and CI/CD pipelines to enforce logging hygiene.
6+
7+
## Features
8+
9+
- **High-Precision AST Analysis**: Goes beyond simple regex to parse Python code, understanding variable names inside f-strings, `.format()` calls, and more.
10+
- **Severity System**: Differentiates between definite leaks (`ERROR`) and suspicious patterns that require manual review (`WARNING`), preventing false positives from breaking your build.
11+
- **Built-in Heuristics**: Flags risky patterns like logging entire dictionaries (`extra=...`) or `json.dumps()` output.
12+
- **`print()` Statement Detection**: Catches sensitive data in leftover `print()` statements, a common source of leaks.
13+
- **CI/CD Friendly**: Exits with a non-zero code only on `ERROR` findings, allowing warnings to be reviewed without blocking development.
14+
- **Extensible**: Powered by a combination of custom AST checks and a Semgrep rule engine.
15+
16+
## Usage
17+
18+
First, install the tool in your project's virtual environment:
19+
```sh
20+
pip install -e .
21+
```
22+
23+
To run the checks, use the `privlog` command.
24+
25+
### Default (Errors Only)
26+
27+
By default, `privlog` only reports high-confidence `ERROR`s. If any are found, it will exit with a non-zero code, failing your build.
28+
29+
```sh
30+
privlog /path/to/your/project
31+
```
32+
33+
If only warnings are found, the command will pass and provide a helpful message:
34+
```
35+
✅ privlog passed. No errors found.
36+
(Warnings were found. Run with -w to show them)
37+
```
38+
39+
### Show Warnings
40+
41+
To see both `ERROR`s and `WARNING`s, use the `-w` or `--warnings` flag.
42+
43+
```sh
44+
privlog -w /path/to/your/project
45+
```
46+
This will display all findings, color-coded by severity, but will still only fail the build if `ERROR`s are present.
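
The reporting and exit-code behavior described in this README can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `logmaster/cli.py` or `logmaster/runner.py`: the diff confirms that a `Finding` object exists, but the fields shown here and the helper names `visible_findings` and `exit_code` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical fields; the real Finding class is not shown in the diff.
    path: str
    line: int
    col: int
    code: str
    severity: str  # "ERROR" or "WARNING"
    message: str


def visible_findings(findings, show_warnings=False):
    """Select what to print: errors only by default, everything with -w/--warnings."""
    if show_warnings:
        return list(findings)
    return [f for f in findings if f.severity == "ERROR"]


def exit_code(findings):
    """Return 1 only when an ERROR-level finding exists; warnings never fail the run."""
    return 1 if any(f.severity == "ERROR" for f in findings) else 0
```

Keeping the display filter separate from the exit-code decision means the `-w` flag changes only what is shown, never whether the build fails.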
