Commit 3f9e5f1

docs: Add section on configuring custom wrappers to README
1 parent 675c102 commit 3f9e5f1

2 files changed

Lines changed: 48 additions & 22 deletions

CONTRIBUTING.md

Lines changed: 21 additions & 21 deletions
Original file line number | Diff line number | Diff line change
@@ -1,36 +1,38 @@
1-
# Agents Guide – privlog
1+
# Contributing to privlog
22

3-
This document is for **coding agents** that work on this repo.
4-
It explains what the project does and where the important code lives.
3+
This guide is for developers who want to contribute to the `privlog` project. It explains the project's architecture and where key logic lives.
54

65
---
76

87
## 1. Purpose of this project
98

10-
- **`privlog`** is a Python CLI tool built with Typer for finding and preventing sensitive data leaks.
11-
- It uses a hybrid approach, combining pattern-based Semgrep rules with a high-precision, language-aware AST-based scanner.
9+
- **`privlog`** is a privacy-aware linter for Python with a Typer-based CLI. Its analysis is powered by a hybrid engine that combines pattern-based Semgrep rules with a high-precision, language-aware AST-based scanner.
1210

1311
---
1412

1513
## 2. Key files and modules
1614

1715
- `pyproject.toml`
18-
- **Purpose:** Defines project metadata, dependencies (`typer`, `pyyaml`, `semgrep`), and the `privlog` entry point.
19-
- **Responsibilities:** Manages the package and its dependencies.
16+
- **Purpose:** Defines project metadata, dependencies, and the `privlog` entry point. It is also the location for user-defined configuration under the `[tool.privlog]` section.
2017

2118
- `README.md`
22-
- **Purpose:** Provides a high-level overview for human users.
19+
- **Purpose:** Provides a high-level overview and usage instructions for users.
2320

2421
- `privlog/`
2522
- The main Python package directory.
2623

2724
- `privlog/cli.py`
2825
- **Purpose:** The main entry point for the CLI application.
29-
- **Responsibilities:** Defines commands and arguments using Typer. Implements the `--warnings`/`-w` flag and filters findings based on severity (`ERROR` vs. `WARNING`).
26+
- **Responsibilities:** Defines commands and arguments using Typer. Implements the `--warnings`/`-w` flag and filters findings based on severity.
3027

3128
- `privlog/runner.py`
3229
- **Purpose:** The main analysis engine.
33-
- **Responsibilities:** Runs both Semgrep and AST checks, converts all findings into a common `Finding` object, and determines the final exit code based *only* on the presence of `ERROR`-level findings.
30+
- **Responsibilities:**
31+
1. Loads user configuration from `pyproject.toml` via the `_load_config` function.
32+
2. Runs the Semgrep scanner.
33+
3. Runs the AST checker, passing the loaded configuration to it.
34+
4. Merges findings from both sources.
35+
5. Determines the final exit code based *only* on the presence of `ERROR`-level findings.
3436

3537
- `privlog/formatter.py`
3638
- **Purpose:** Handles the presentation of results.
@@ -41,17 +43,15 @@ It explains what the project does and where the important code lives.
4143
- **Responsibilities:**
4244
1. **Severity System**: Divides sensitive variable names into `HIGH_CONFIDENCE_SENSITIVE_NAMES` (`ERROR`) and `WARNING_SENSITIVE_NAMES` (`WARNING`).
4345
2. **Multi-Format Detection**: Understands and inspects arguments within f-strings, `.format()` calls, and `%`-style formatting.
44-
3. **`print()` Check**: Scans `print()` statements for sensitive variables, applying the same severity logic as logging calls.
45-
4. **Heuristic Analysis**: Flags risky but not definitively incorrect patterns as `WARNING`s.
46+
3. **`print()` Check**: Scans `print()` statements for sensitive variables.
47+
4. **Heuristic Analysis**: Flags risky patterns like logging with `extra=...` or `json.dumps()`.
48+
5. **Custom Wrapper Analysis**: Receives the `PrivlogConfig` object and inspects function calls. When a call's function name matches an entry in the `custom_wrappers` configuration, its keyword arguments are checked against the ones configured as sensitive for that wrapper.
4649
- **Finding Codes**:
47-
- `LM2101`: A direct sensitive identifier was found in a logging call. Severity can be `ERROR` or `WARNING`.
48-
- `LM2201`: A logging call uses the `extra` parameter, which could hide sensitive data. Severity is `WARNING`.
49-
- `LM2202`: `json.dumps()` is used in a logging call. Severity is `WARNING`.
50-
- `LM2203`: `.to_dict()` is used in a logging call. Severity is `WARNING`.
51-
- `LM2301`: A direct sensitive identifier was found in a `print()` call. Severity can be `ERROR` or `WARNING`.
52-
- `LM2302`: `json.dumps()` is used in a `print()` call. Severity is `WARNING`.
53-
- `LM2303`: `.to_dict()` is used in a `print()` call. Severity is `WARNING`.
50+
- `LM2101`: A direct sensitive identifier was found in a logging call.
51+
- `LM2201-2203`: A heuristic pattern (like `extra=...` or `json.dumps`) was found in a logging call.
52+
- `LM2301-2303`: A sensitive identifier or heuristic pattern was found in a `print()` call.
53+
- `LM2401`: A sensitive argument was passed to a custom logging wrapper defined in the user's configuration.
5454

5555
- `privlog/rules/privlog.yml`
56-
- **Purpose:** The core Semgrep ruleset, which complements the AST checker by finding broader, less precise patterns.
57-
- **Responsibilities:** Defines rules for detecting PII, secrets, and unsafe logging patterns like payload dumping.
56+
- **Purpose:** The core Semgrep ruleset, which complements the AST checker.
57+
- **Responsibilities:** Defines rules for detecting PII, secrets, and unsafe logging patterns.
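
The custom wrapper analysis described for the AST checker can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the function name `find_wrapper_findings`, the plain dictionary standing in for `PrivlogConfig`, the tuple shape of a finding, and the decision to match both bare calls and attribute calls are all hypothetical.

```python
import ast

# Hypothetical stand-in for the `custom_wrappers` part of PrivlogConfig:
# {function_name: {keyword_argument: severity}}
CUSTOM_WRAPPERS = {
    "audit": {"actor_id": "ERROR"},
    "log_event": {"details": "WARNING"},
}


def find_wrapper_findings(source, wrappers):
    """Return sorted (line, code, severity, description) tuples for wrapper calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Assumption: match bare calls (audit(...)) and attribute calls (obj.audit(...)).
        if isinstance(node.func, ast.Name):
            name = node.func.id
        elif isinstance(node.func, ast.Attribute):
            name = node.func.attr
        else:
            continue
        sensitive_kwargs = wrappers.get(name)
        if not sensitive_kwargs:
            continue
        for keyword in node.keywords:
            # keyword.arg is None for **kwargs expansion, which never matches.
            if keyword.arg in sensitive_kwargs:
                findings.append(
                    (
                        node.lineno,
                        "LM2401",
                        sensitive_kwargs[keyword.arg],
                        f"{name}({keyword.arg}=...)",
                    )
                )
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(findings)
```

In this sketch, a call such as `audit(actor_id=user.id, event="login")` yields one `LM2401` finding with severity `ERROR`, while a call to a function that is not listed in the configuration is ignored even if it uses the same keyword name.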

README.md

Lines changed: 27 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -11,7 +11,7 @@ A privacy-aware linter for Python projects, designed to catch accidental leaks o
1111
- **Built-in Heuristics**: Flags risky patterns like logging entire dictionaries (`extra=...`) or `json.dumps()` output.
1212
- **`print()` Statement Detection**: Catches sensitive data in leftover `print()` statements, a common source of leaks.
1313
- **CI/CD Friendly**: Exits with a non-zero code only on `ERROR` findings, allowing warnings to be reviewed without blocking development.
14-
- **Extensible**: Powered by a combination of custom AST checks and a Semgrep rule engine.
14+
- **Configurable & Extensible**: Teach `privlog` about your project's custom logging functions via a simple `pyproject.toml` configuration.
1515

1616
## Installation
1717

@@ -42,7 +42,11 @@ Once installed, run the `privlog` command on your project directory.
4242
By default, `privlog` only reports high-confidence `ERROR`s. If any are found, it will exit with a non-zero code, failing your build.
4343

4444
```sh
45+
# Scan a specific directory
4546
privlog /path/to/your/project
47+
48+
# Or, from inside a project, scan the current directory
49+
privlog .
4650
```
4751

4852
If only warnings are found, the command will pass and provide a helpful message:
@@ -56,10 +60,32 @@ If only warnings are found, the command will pass and provide a helpful message:
5660
To see both `ERROR`s and `WARNING`s, use the `-w` or `--warnings` flag.
5761

5862
```sh
63+
# Scan a specific directory with warnings
5964
privlog -w /path/to/your/project
65+
66+
# Or, from inside a project, scan the current directory with warnings
67+
privlog -w .
6068
```
69+
6170
This will display all findings, color-coded by severity, but will still only fail the build if `ERROR`s are present.
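
The relationship between the `-w` flag and the exit code can be illustrated with a small sketch. It assumes a simplified `Finding` record with hypothetical fields, hypothetical helper names, and an exit code of `1` for failures (the text only promises a non-zero code), so treat it as an illustration of the behavior rather than the tool's real code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Simplified, hypothetical finding record."""
    code: str
    severity: str  # "ERROR" or "WARNING"
    message: str


def visible_findings(findings, show_warnings):
    """Select what gets displayed: ERRORs always, WARNINGs only with -w/--warnings."""
    if show_warnings:
        return list(findings)
    return [f for f in findings if f.severity == "ERROR"]


def exit_code(findings):
    """Fail only when at least one ERROR exists, whether or not warnings are shown."""
    return 1 if any(f.severity == "ERROR" for f in findings) else 0
```

The point of keeping the two functions separate is that display filtering never influences the build result: a run with only `WARNING` findings exits with `0` regardless of whether `-w` was passed.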
6271

72+
### Configuring Custom Wrappers
73+
74+
You can teach `privlog` to recognize your own custom logging functions. In your project's `pyproject.toml` file, add a `[tool.privlog.custom_wrappers]` section.
75+
76+
For each custom function, specify its name and which of its keyword arguments should be treated as sensitive, along with the desired severity (`ERROR` or `WARNING`).
77+
78+
**Example `pyproject.toml`:**
79+
```toml
80+
[tool.privlog.custom_wrappers]
81+
# For a function call like: audit(actor_id=user.id, event="login")
82+
audit = { actor_id = "ERROR" }
83+
84+
# For a function call like: log_event("payment_failed", details=evt)
85+
log_event = { details = "WARNING" }
86+
```
87+
`privlog` will automatically find and use this configuration when you run it.
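
To make the shape of this configuration concrete, here is a minimal sketch of how such a table could be read in Python. The helper `load_custom_wrappers` and its validation step are assumptions made for illustration and are not taken from `privlog` itself. The standard-library `tomllib` module requires Python 3.11 or newer, and the fallback to the third-party `tomli` package is likewise an assumption.

```python
try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:  # assumption: older interpreters have `tomli` installed
    import tomli as tomllib

EXAMPLE_PYPROJECT = """
[tool.privlog.custom_wrappers]
audit = { actor_id = "ERROR" }
log_event = { details = "WARNING" }
"""

VALID_SEVERITIES = {"ERROR", "WARNING"}


def load_custom_wrappers(toml_text):
    """Return {function_name: {keyword_argument: severity}} from pyproject.toml text."""
    data = tomllib.loads(toml_text)
    # A missing table at any level simply means no custom wrappers are configured.
    wrappers = data.get("tool", {}).get("privlog", {}).get("custom_wrappers", {})
    for function_name, keyword_map in wrappers.items():
        for keyword, severity in keyword_map.items():
            if severity not in VALID_SEVERITIES:
                raise ValueError(
                    f"{function_name}.{keyword}: unknown severity {severity!r}"
                )
    return wrappers


wrappers = load_custom_wrappers(EXAMPLE_PYPROJECT)
```

Each inline table maps a keyword argument name to a severity, so `wrappers["audit"]["actor_id"]` is `"ERROR"` in this example.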
88+
6389
---
6490

6591
## For Developers
