The Python SDK collects per-test code coverage during Tusk Drift replay using `coverage.py`. Unlike Node.js (which uses V8's built-in coverage), Python requires the `coverage` package to be installed.

## Requirements

```bash
# Quoting stops shells such as zsh from treating the brackets as a glob pattern
pip install "tusk-drift-python-sdk[coverage]"
```

If `coverage` is not installed when coverage is enabled, the SDK logs a warning and skips coverage collection. Tests still run normally.
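
The fallback described above is the usual optional-dependency pattern. The following is a minimal sketch of that pattern, not the SDK's actual code: the function name `load_coverage_module` and the logger name are hypothetical.

```python
import importlib
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("tusk_drift.coverage")


def load_coverage_module(module_name: str = "coverage"):
    """Return the imported module, or None (after logging a warning) if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        logger.warning(
            "Coverage was requested but %r is not installed; skipping coverage.",
            module_name,
        )
        return None
```

A caller would treat a `None` result as "run the tests without coverage", which matches the behavior described above.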

## How It Works

### coverage.py Integration

When coverage is enabled (via `--show-coverage`, `--coverage-output`, or `coverage.enabled: true` in config), the CLI sets `TUSK_COVERAGE=true`. The SDK detects this during initialization and starts coverage.py:

- `source` is set to the real path of the working directory (symlinks resolved)
- Third-party code (site-packages, venv) is excluded by default
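
As a rough illustration of the startup step, the sketch below checks the environment variable and builds keyword arguments that could be passed to `coverage.Coverage(...)`. The helper names, the exact string comparison, and the `omit` patterns are assumptions made for this example; the SDK's real exclusion mechanism is not shown in this document.

```python
import os
from typing import Mapping


def coverage_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """True when TUSK_COVERAGE is set to 'true' (exact-match check is an assumption)."""
    return environ.get("TUSK_COVERAGE", "") == "true"


def build_coverage_kwargs(working_dir: str) -> dict:
    """Build keyword arguments that could be passed to coverage.Coverage(...)."""
    return {
        # Resolve symlinks so reported file paths match the source root.
        "source": [os.path.realpath(working_dir)],
        # Arc tracking is what makes branch coverage possible.
        "branch": True,
        # Hypothetical patterns for excluding third-party code.
        "omit": ["*/site-packages/*", "*/venv/*", "*/.venv/*"],
    }
```

With `coverage` installed, this could be used as `cov = coverage.Coverage(**build_coverage_kwargs(os.getcwd()))` followed by `cov.start()`, guarded by `coverage_enabled()`.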

### Snapshot Flow

1. **Baseline**: The CLI sends `CoverageSnapshotRequest(baseline=true)`. The SDK:
   - Calls `cov.stop()`
   - Calls `cov.analysis2(filename)` for each measured file to get all coverable lines (the executable statements, plus which of them are missing)
   - Returns every coverable line, with count=0 for uncovered and count=1 for covered
   - Calls `cov.erase()` then `cov.start()` to reset counters

2. **Per-test**: The CLI sends `CoverageSnapshotRequest(baseline=false)`. The SDK:
   - Calls `cov.stop()`
   - Calls `cov.get_data().lines(filename)` to get only the lines executed since the last reset
   - Returns only covered lines (count=1)
   - Calls `cov.erase()` then `cov.start()` to reset counters

3. **Communication**: Results are sent back to the CLI over the existing protobuf channel, the same socket used for replay. No HTTP server or extra ports are needed.
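
The two snapshot modes can be sketched as a single function. This is an illustrative outline under stated assumptions, not the SDK's implementation: `take_snapshot` and its return shape (`{filename: {line: count}}`) are hypothetical, while `stop()`, `get_data()`, `analysis2()`, `erase()`, and `start()` are the coverage.py calls named in the steps above.

```python
from typing import Dict


def take_snapshot(cov, baseline: bool) -> Dict[str, Dict[int, int]]:
    """Return {filename: {line: count}} and reset counters for the next test.

    `cov` is expected to behave like a coverage.Coverage instance.
    """
    cov.stop()
    data = cov.get_data()
    snapshot: Dict[str, Dict[int, int]] = {}
    for filename in data.measured_files():
        if baseline:
            # analysis2 returns (filename, executable, excluded, missing, missing_text)
            _, statements, _, missing, _ = cov.analysis2(filename)
            missing_set = set(missing)
            snapshot[filename] = {
                line: 0 if line in missing_set else 1 for line in statements
            }
        else:
            # Only the lines executed since the last erase()
            executed = data.lines(filename) or []
            snapshot[filename] = {line: 1 for line in executed}
    cov.erase()
    cov.start()
    return snapshot
```

Taking `cov` as a parameter keeps the sketch easy to exercise with a stand-in object when coverage.py is not installed.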

### Branch Coverage

Branch coverage uses coverage.py's arc tracking. The SDK extracts per-line branch data using:

```python
data = cov.get_data()
analysis = cov._analyze(filename)  # Private API
missing_arcs = analysis.missing_branch_arcs()
executed_arcs = set(data.arcs(filename) or [])
```

For each branch point (a line with multiple execution paths), the SDK reports:

- `total`: number of branch paths from that line
- `covered`: number of paths that were actually taken

**Note:** `_analyze()` is a private coverage.py API. The SDK relies on it because the public `analysis2()` API does not expose per-line branch arc data. As a result, branch coverage may break when coverage.py is upgraded.
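
To make the `total` and `covered` numbers concrete, here is one way they could be derived from arc data. This is a sketch under assumptions: the function name `per_line_branches` is hypothetical, and treating a line as a branch point when it has more than one known destination is a simplification chosen for this example. Arcs are `(from_line, to_line)` pairs, and `missing_arcs` maps a line to the destinations that were never reached.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Tuple


def per_line_branches(
    executed_arcs: Iterable[Tuple[int, int]],
    missing_arcs: Mapping[int, List[int]],
) -> Dict[int, Dict[str, int]]:
    """Summarize branch points as {line: {"total": n, "covered": m}}."""
    taken = defaultdict(set)
    for src, dst in executed_arcs:
        # coverage.py uses negative source numbers for code-object entry.
        if src > 0:
            taken[src].add(dst)

    result: Dict[int, Dict[str, int]] = {}
    for line in set(taken) | set(missing_arcs):
        missing = set(missing_arcs.get(line, []))
        covered = len(taken[line] - missing)
        total = covered + len(missing)
        # Assumption: only lines with more than one destination are branch points.
        if total > 1:
            result[line] = {"total": total, "covered": covered}
    return result
```

For example, a line whose `if` and `else` paths both ran yields `{"total": 2, "covered": 2}`, while a line with one path taken and one missing yields `{"total": 2, "covered": 1}`.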

### Path Handling

The SDK uses `os.path.realpath()` for the source root to handle symlinked project directories. File paths reported by coverage.py are also resolved via `realpath` before comparison. Without this, a symlinked path never matches the resolved source root, and every file is silently filtered out.
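
The comparison can be illustrated with a small helper that resolves both sides before checking containment. The name `is_under_source_root` is hypothetical, and using `os.path.commonpath` for the containment check is a choice made for this sketch rather than something the source specifies.

```python
import os


def is_under_source_root(path: str, source_root: str) -> bool:
    """True if `path` lies inside `source_root` once symlinks are resolved."""
    root = os.path.realpath(source_root)
    resolved = os.path.realpath(path)
    try:
        return os.path.commonpath([root, resolved]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

Because both arguments are resolved, it does not matter whether the project was opened through the symlink or through its real location.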

## Environment Variables

These are set automatically by the CLI. Do not set them manually.

| Variable | Description |
|----------|-------------|
| `TUSK_COVERAGE` | Set to `true` by the CLI when coverage is enabled. The SDK checks this to decide whether to start coverage.py. |

Note: `NODE_V8_COVERAGE` is also set by the CLI (for Node.js), but the Python SDK ignores it and only checks `TUSK_COVERAGE`.

## Thread Safety

Coverage collection uses a module-level lock (`threading.Lock`) to ensure thread safety:

- `start_coverage_collection()`: Acquires the lock while initializing. Guards against double initialization: if called twice, it stops the existing instance first.
- `take_coverage_snapshot()`: Acquires the lock for the entire stop/read/erase/start cycle.
- `stop_coverage_collection()`: Acquires the lock while stopping and cleaning up.

This matters because the protobuf communicator runs coverage handlers in a background thread.
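
The locking discipline above might look roughly like the following. The three function names come from the list above, but their parameters (`factory`, `read`) and the `_state` holder are hypothetical, introduced only so the sketch can run without coverage.py installed.

```python
import threading

_lock = threading.Lock()
_state = {"cov": None}  # holds the active coverage object, if any


def start_coverage_collection(factory):
    """Create and start a coverage object, stopping any existing one first."""
    with _lock:
        if _state["cov"] is not None:
            _state["cov"].stop()  # guard against double initialization
        _state["cov"] = factory()
        _state["cov"].start()
        return _state["cov"]


def take_coverage_snapshot(read):
    """Run the whole stop/read/erase/start cycle under the lock."""
    with _lock:
        cov = _state["cov"]
        if cov is None:
            return None
        cov.stop()
        try:
            return read(cov)
        finally:
            cov.erase()
            cov.start()


def stop_coverage_collection():
    """Stop collection and drop the reference."""
    with _lock:
        if _state["cov"] is not None:
            _state["cov"].stop()
            _state["cov"] = None
```

Holding the lock for the full snapshot cycle means a second request from the background thread cannot observe a stopped or half-erased collector.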

## Limitations

- **`coverage` package required**: Unlike Node.js (where V8 coverage is built in), Python needs `pip install coverage`. If it is not installed, coverage is skipped and a warning is logged.
- **Performance overhead**: coverage.py uses `sys.settrace()`, which adds roughly 10-30% execution overhead. This only applies during coverage replay runs.
- **Multi-process servers**: gunicorn with `--workers > 1` forks worker processes. The SDK starts coverage.py in the main process; forked workers don't inherit it. Use `--workers 1` during coverage runs.
- **Private API for branches**: `_analyze()` is not part of coverage.py's public API. Branch coverage detail may break on future coverage.py versions.
- **Python 3.12+ recommended for async**: coverage.py's `sys.settrace` tracer can miss some async lines on Python < 3.12. On Python 3.12+, coverage.py can use `sys.monitoring`, which tracks async code better.
- **Startup ordering**: coverage.py starts during SDK initialization. Code that executes before `TuskDrift.initialize()` (e.g., module-level code in `tusk_drift_init.py`) isn't tracked. This is why `tusk_drift_init.py` typically shows 0% coverage.
- **C extensions invisible**: coverage.py can't track C extensions (numpy, Cython modules). This rarely matters for typical web API servers.