Commit 6765f19

docs: rewrite AGENTS.md tree with init-agents

Why:
* Replace generic generated AGENTS.md files with concise, repo-specific agent instructions grounded in observable architecture and conventions

What:
* Root AGENTS.md: 4-layer architecture, change rules, validation commands, gotchas
* Package AGENTS.md: service file map, thin-wrapper invariant, new-service checklist
* Core AGENTS.md: instrument/serializer/enum boundaries, high-impact change warnings
* Tests AGENTS.md: BaseTest pattern, dual sync/async requirement, rate-limiting fixtures

Changes:
* AGENTS.md: rewrite from generic knowledge base to structured agent guide
* src/python3_capsolver/AGENTS.md: focus on service-layer rules and boundaries
* src/python3_capsolver/core/AGENTS.md: focus on instrument and serialization internals
* tests/AGENTS.md: focus on test patterns, fixtures, and live API dependency

Stats:
* 4 files changed
* ~210 lines replaced

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

1 parent b98b447 commit 6765f19

4 files changed

Lines changed: 146 additions & 192 deletions

AGENTS.md

Lines changed: 60 additions & 63 deletions
Original file line number | Diff line number | Diff line change
@@ -1,72 +1,69 @@
1-
# PROJECT KNOWLEDGE BASE
1+
# AGENTS.md
22

3-
**Generated:** 2026-03-15
4-
**Commit:** b797332
5-
**Branch:** main
3+
## Repository overview
64

7-
## OVERVIEW
8-
Python 3.8+ library for Capsolver service API. Dual sync (`requests`) / async (`aiohttp`) support. `msgspec` for serialization, `tenacity` for retries.
5+
Python 3.8+ client library for the Capsolver captcha-solving API. Single-package `src/` layout, dual sync (`requests`) / async (`aiohttp`) execution, `msgspec` for serialization, `tenacity` for async retries.
96

10-
## STRUCTURE
11-
```
7+
## Where to work
8+
9+
```text
1210
./
13-
├── src/python3_capsolver/ # Main library (service implementations)
14-
│ ├── core/ # Base classes, instruments, serializers
15-
│ └── *.py # Service-specific (ReCaptcha, Cloudflare, etc.)
16-
├── tests/ # Pytest suite (matches source structure)
17-
├── docs/ # Sphinx documentation
18-
├── ARCHITECTURE.md # System architecture (matklad-style)
19-
└── pyproject.toml # Build, uv, pytest, black/isort config
11+
├── src/python3_capsolver/ # Main library
12+
│ ├── core/ # Base classes, instruments, serializers, enums
13+
│ ├── control.py # Raw API: get_balance, create_task, get_task_result
14+
│ ├── recaptcha.py # ReCaptcha V2/V3/Enterprise
15+
│ ├── cloudflare.py # Cloudflare Turnstile/Challenge
16+
│ └── *.py # Other captcha services (see package AGENTS.md)
17+
├── tests/ # Pytest suite mirroring source structure
18+
│ ├── conftest.py # BaseTest class, fixtures, rate-limiting delays
19+
│ └── test_*.py # One file per service + test_core.py + test_instrument.py
20+
├── docs/ # Sphinx documentation (make html)
21+
├── ARCHITECTURE.md # Layered architecture, data flow, invariants
22+
├── pyproject.toml # Build, deps, black/isort/pytest config
23+
└── Makefile # make tests, make refactor, make build, make doc
2024
```
2125

22-
## WHERE TO LOOK
23-
| Task | Location | Notes |
24-
|------|----------|-------|
25-
| **Architecture** | `ARCHITECTURE.md` | Layered design, invariants, life of a request |
26-
| **Base Logic** | `src/python3_capsolver/core/` | `base.py`, `serializer.py`, `enum.py`, instruments |
27-
| **Service Implementations** | `src/python3_capsolver/*.py` | `recaptcha.py`, `cloudflare.py`, `control.py` |
28-
| **Tests** | `tests/` | `conftest.py` (BaseTest, fixtures), per-service tests |
29-
| **Configuration** | `pyproject.toml` | uv, black (120), isort, pytest (asyncio auto) |
30-
| **Commands** | `Makefile` | `make tests`, `make build`, `make upload` |
31-
32-
## CONVENTIONS
33-
- **Toolchain**: `uv` for package management (`uv sync`, `uv run`, `uv build`, `uv publish`)
34-
- **Formatter**: `black` (line-length 120), `isort` (profile "black")
35-
- **Cleanup**: `autoflake` (remove unused imports/variables)
36-
- **Serialization**: `msgspec` (not `json`) for performance
37-
- **Concurrency**: Dual sync/async required for all instruments
38-
- **Retries**: `tenacity` (async), `requests.Retry` (sync) — 5 attempts, exponential backoff
39-
- **Testing**: pytest 7.0+, `pytest-asyncio` (auto mode), rate-limiting fixtures (1s func, 2s class)
40-
41-
## ANTI-PATTERNS (THIS PROJECT)
42-
- **Empty `__init__.py` files**: `src/python3_capsolver/__init__.py` only exports `__version__`; `core/__init__.py` is completely empty. Users must import via full paths (`from python3_capsolver.recaptcha import ReCaptcha`)
43-
- **AGENTS.md in package dirs**: Will ship with distribution unless excluded in `pyproject.toml`
44-
- **No CLI entry points**: Library-only, no console_scripts defined
45-
46-
## UNIQUE STYLES
47-
- **Service Pattern**: Each captcha service inherits from `CaptchaParams` with `captcha_handler()` (sync) + `aio_captcha_handler()` (async)
48-
- **Task Payload**: Dict merged with internal params, passed to `create_task()` API
49-
- **Context Managers**: All services support `with` / `async with` for session cleanup
50-
- **Test Duplication**: Every sync test (`def test_*`) has async counterpart (`async def test_aio_*`)
51-
52-
## COMMANDS
26+
## Architecture and boundaries
27+
28+
Four layers with strict dependency direction (top → bottom only):
29+
30+
1. **Service layer** (`src/python3_capsolver/*.py`) — thin wrappers inheriting `CaptchaParams`, zero HTTP logic
31+
2. **Base layer** (`core/base.py`) — `CaptchaParams` merges payloads, delegates to instruments
32+
3. **Instrument layer** (`core/*_instrument.py`) — `SIOCaptchaInstrument` (sync) and `AIOCaptchaInstrument` (async) handle all HTTP, retries, and polling
33+
4. **Support layer** (`core/serializer.py`, `core/enum.py`, `core/const.py`) — `msgspec.Struct` classes, enums, constants
34+
35+
**Forbidden**: service files importing `requests`/`aiohttp` directly; support layer depending on upper layers.
36+
37+
## Change rules
38+
39+
- Every new captcha service must inherit `CaptchaParams`, provide `captcha_handler()` + `aio_captcha_handler()`, and register its type in `CaptchaTypeEnm`
40+
- All serialization must use `msgspec.Struct` — never raw `json` module
41+
- Dual sync/async is mandatory for any new instrument or service
42+
- Service classes are thin: only `__init__` with captcha-type-specific params; all HTTP goes through instruments
43+
- Context manager support (`with` / `async with`) is required on all service classes
44+
45+
## Validation
46+
5347
```bash
54-
# Development
55-
uv sync --all-groups # Install all dependencies
56-
uv run pytest tests/ # Run tests
57-
uv run black src/ tests/ # Format
58-
uv run isort src/ tests/ # Sort imports
59-
60-
# Build & Publish
61-
uv build # Build wheel/sdist
62-
uv publish # Upload to PyPI
63-
64-
# Documentation
65-
cd docs/ && uv run --group docs make html -e
48+
make tests # pytest + coverage (HTML + XML)
49+
make refactor # autoflake + black + isort on src/ and tests/
50+
make lint # autoflake --check, black --check, isort --check-only
51+
uv run pytest tests/ -k <name> # run specific tests
6652
```
6753

68-
## NOTES
69-
- **API Key**: Tests require `API_KEY` environment variable
70-
- **Coverage**: HTML reports in `coverage/html/`, XML in `coverage/coverage.xml`
71-
- **Python Support**: 3.8–3.12 (tested via `target-version = ['py310']`)
72-
- **Dependencies**: `requests>=2.21.0`, `aiohttp>=3.9.2`, `msgspec>=0.18,<=0.21`, `tenacity>=8,<10`
54+
Tests require the `API_KEY` environment variable. Rate-limiting fixtures (`delay_func` 1s, `delay_class` 2s) prevent API throttling.
55+
56+
## Key docs
57+
58+
- `ARCHITECTURE.md` — full system map, data flow, invariants
59+
- `src/python3_capsolver/AGENTS.md` — service-level conventions
60+
- `src/python3_capsolver/core/AGENTS.md` — core module internals
61+
- `tests/AGENTS.md` — test patterns and fixtures
62+
63+
## Repository-specific gotchas
64+
65+
- **Empty `__init__.py` files**: users import via full path (`from python3_capsolver.recaptcha import ReCaptcha`), never from top-level package
66+
- **`AGENTS.md` in package dirs**: these ship with the wheel unless excluded in `pyproject.toml` — do not add more inside `src/`
67+
- **`control.py` is the largest file** (~431 lines) and provides direct API access without the captcha-handling abstraction
68+
- **Toolchain is `uv`**: use `uv run`, `uv sync`, `uv build` — not bare `pip` or `pytest`
69+
- **`captcha_instrument.py` is ~9.3k lines**: contains both `CaptchaInstrumentBase` and `FileInstrument`; edits here affect all services
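
The "Forbidden" rule in the new root file (service files must not import `requests` or `aiohttp` directly) could be checked mechanically. The following is a minimal sketch using only the standard library `ast` module. The helper name `forbidden_imports` and the `FORBIDDEN` set are assumptions made for illustration; nothing in this commit shows such a check existing in the repository.

```python
import ast
from typing import List, Set

# Assumption: the two HTTP libraries named in the "Forbidden" rule.
FORBIDDEN: Set[str] = {"requests", "aiohttp"}


def forbidden_imports(source: str, forbidden: Set[str] = FORBIDDEN) -> List[str]:
    """Return the sorted forbidden top-level modules imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b, c` yields one alias per imported name.
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute `from x.y import z`; relative imports (level > 0) are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in forbidden:
                found.add(root)
    return sorted(found)


if __name__ == "__main__":
    sample = "import requests\nfrom .core.base import CaptchaParams\n"
    print(forbidden_imports(sample))
```

A check like this could be applied to each service module's text (for example by reading every `*.py` file directly under `src/python3_capsolver/`), while leaving `core/` out, since the instruments are the one place where those imports are expected.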

src/python3_capsolver/AGENTS.md

Lines changed: 28 additions & 44 deletions
@@ -1,55 +1,39 @@
1-
# PYTHON3_CAPSOLVER PACKAGE
1+
# AGENTS.md
22

3-
**Generated:** 2026-03-15
4-
**Commit:** b797332
3+
## Scope
54

6-
## OVERVIEW
7-
Main library package containing service-specific captcha solving implementations. Provides high-level interfaces for various captcha types (ReCaptcha, Cloudflare Turnstile, DataDome, etc.) through a unified API.
5+
Service implementations for individual captcha types. Each file is a self-contained solver class.
86

9-
Each service class inherits from `core.CaptchaParams` and implements synchronous (`requests`) and asynchronous (`aiohttp`) solving methods with automatic retry logic and response polling.
7+
## What lives here
108

11-
## STRUCTURE
12-
```
9+
```text
1310
src/python3_capsolver/
14-
├── core/ # Base classes, instruments, serializers
15-
├── control.py # Direct API access (get_balance, create_task, get_task_result)
11+
├── core/ # Base classes, instruments, serializers (own AGENTS.md)
12+
├── control.py # Raw API: get_balance, create_task, get_task_result
1613
├── recaptcha.py # ReCaptcha V2/V3/Enterprise
1714
├── cloudflare.py # Cloudflare Turnstile/Challenge
18-
├── vision_engine.py # AI-based image recognition
19-
├── image_to_text.py # OCR text extraction
20-
├── datadome_slider.py # DataDome slider captcha
21-
├── mt_captcha.py # MtCaptcha solver
22-
├── gee_test.py # GeeTest solver
15+
├── gee_test.py # GeeTest V3/V4
16+
├── datadome_slider.py # DataDome slider
17+
├── mt_captcha.py # MtCaptcha
2318
├── aws_waf.py # AWS WAF bypass
24-
├── friendly_captcha.py # FriendlyCaptcha solver
25-
├── yandex.py # Yandex captcha solver
26-
├── __init__.py # Exports only __version__ (minimal)
19+
├── friendly_captcha.py # FriendlyCaptcha
20+
├── yandex.py # Yandex SmartCaptcha
21+
├── image_to_text.py # OCR text extraction
22+
├── vision_engine.py # AI-based image recognition
23+
├── __init__.py # Exports only __version__
2724
└── __version__.py # Version string
2825
```
2926

30-
## WHERE TO LOOK
31-
| Task | Location | Notes |
32-
|------|----------|-------|
33-
| **Direct API Access** | `control.py` | `Control.get_balance()`, `create_task()`, `get_task_result()` |
34-
| **ReCaptcha** | `recaptcha.py` | V2, V3, Enterprise variants |
35-
| **Cloudflare** | `cloudflare.py` | Turnstile, Challenge modes |
36-
| **Image-based** | `vision_engine.py`, `image_to_text.py` | AI recognition, OCR |
37-
| **Other Services** | `*.py` | DataDome, GeeTest, AWS WAF, etc. |
38-
| **Base Logic** | `core/` | `CaptchaParams`, instruments, serializers |
39-
40-
## CONVENTIONS
41-
- **Service Pattern**: Each service class inherits from `CaptchaParams` with `api_key` and `captcha_type` params
42-
- **Dual Handlers**: All services provide `captcha_handler()` (sync) and `aio_captcha_handler()` (async)
43-
- **Retry Logic**: Built-in exponential backoff with configurable `sleep_time` (default: 5s)
44-
- **Task Payload**: Passed via `task_payload` dict, merged with internal task params
45-
- **Response Structure**: Returns dict with `errorId`, `taskId`, `status`, `solution` fields
46-
- **Context Managers**: Support `with` and `async with` for session cleanup
47-
48-
## ANTI-PATTERNS (THIS PACKAGE)
49-
- **Minimal `__init__.py`**: Does NOT export service classes. Users cannot do `from python3_capsolver import ReCaptcha` — must use full path `from python3_capsolver.recaptcha import ReCaptcha`
50-
- **AGENTS.md in package dir**: Will ship with distribution unless excluded in `pyproject.toml`
51-
52-
## UNIQUE STYLES
53-
- **control.py (431 lines)**: Largest file, provides raw API access without abstraction
54-
- **Service files**: 2-5k lines each, focused on single captcha type
55-
- **No type stubs**: Type hints inline, no `.pyi` files
27+
## Local boundaries and invariants
28+
29+
- Service classes inherit `CaptchaParams` and define only `__init__` with captcha-type-specific params
30+
- No HTTP imports (`requests`, `aiohttp`) allowed in service files — all HTTP goes through `core/` instruments
31+
- Every service must provide both `captcha_handler()` (sync) and `aio_captcha_handler()` (async) via inheritance
32+
- Every service must support context managers (`with` / `async with`) via `SIOContextManager` + `AIOContextManager` mixins
33+
34+
## Safe change rules
35+
36+
- To add a new captcha type: create `new_service.py`, inherit `CaptchaParams`, add type to `CaptchaTypeEnm` in `core/enum.py`, add serializer structs if needed in `core/serializer.py`
37+
- `control.py` is unique: it provides raw API methods (`get_balance`, `create_task`, `get_task_result`) without the create-then-poll abstraction — do not convert it to the standard pattern
38+
- Users import via full path (`from python3_capsolver.recaptcha import ReCaptcha`) — do not add re-exports to `__init__.py`
39+
- This file ships inside the wheel; keep it concise and avoid sensitive information
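
The thin-wrapper invariant described in the package file (a service class defines only `__init__`, while both handlers come from the base class) can be illustrated with a self-contained sketch. Everything here is a stand-in: `CaptchaParamsSketch`, `FakeInstrument`, `NewService`, the task type string, and the payload keys `clientKey`, `task`, and `websiteURL` are assumptions for illustration and are not taken from the library's actual code.

```python
import asyncio
from typing import Any, Dict, List, Optional


class FakeInstrument:
    """Stand-in for the HTTP instruments: records payloads instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []

    def solve(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        self.sent.append(payload)
        return {"errorId": 0, "status": "ready", "solution": {}}


class CaptchaParamsSketch:
    """Hypothetical base: merges payloads and delegates to an instrument."""

    def __init__(self, api_key: str, captcha_type: str, **task_params: Any) -> None:
        self.api_key = api_key
        self.task_params: Dict[str, Any] = {"type": captcha_type, **task_params}
        self.instrument = FakeInstrument()

    def _merged(self, task_payload: Optional[Dict[str, Any]]) -> Dict[str, Any]:
        # Caller-supplied keys are merged on top of the internal task params.
        task = {**self.task_params, **(task_payload or {})}
        return {"clientKey": self.api_key, "task": task}

    def captcha_handler(self, task_payload: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        return self.instrument.solve(self._merged(task_payload))

    async def aio_captcha_handler(self, task_payload: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        # A real async instrument would await network I/O here.
        return self.instrument.solve(self._merged(task_payload))


class NewService(CaptchaParamsSketch):
    """Thin wrapper: only __init__, no HTTP imports, handlers are inherited."""

    def __init__(self, api_key: str, website_url: str) -> None:
        super().__init__(api_key, "NewServiceTask", websiteURL=website_url)


if __name__ == "__main__":
    service = NewService("demo-key", "https://example.com")
    print(service.captcha_handler({"websiteKey": "abc"}))
    print(asyncio.run(service.aio_captcha_handler()))
```

The point of the sketch is the shape of `NewService`: adding a captcha type touches a constructor and an enum entry, and never the request or polling logic.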

src/python3_capsolver/core/AGENTS.md

Lines changed: 31 additions & 40 deletions
@@ -1,46 +1,37 @@
1-
# CORE MODULE
1+
# AGENTS.md
22

3-
**Generated:** 2026-03-15
4-
**Commit:** b797332
3+
## Scope
54

6-
## OVERVIEW
7-
Core module provides foundational classes for synchronous (`requests`) and asynchronous (`aiohttp`) captcha solving operations.
5+
Core infrastructure shared by all captcha services: base classes, HTTP instruments, serialization, enums, and constants.
86

9-
Base classes establish patterns for Sync/Async instruments, enabling dual concurrency support across the library. Serialization leverages `msgspec` for high-performance JSON handling.
7+
## What lives here
108

11-
## STRUCTURE
12-
```
13-
src/python3_capsolver/core/
14-
├── base.py # CaptchaParams entry class (Sync/Async handlers)
15-
├── captcha_instrument.py # CaptchaInstrumentBase, FileInstrument (9.3k lines)
16-
├── aio_captcha_instrument.py # AIOCaptchaInstrument (async implementation)
17-
├── sio_captcha_instrument.py # SIOCaptchaInstrument (sync implementation)
18-
├── serializer.py # Request/Response msgspec Struct classes
19-
├── enum.py # EndpointPostfixEnm, CaptchaTypeEnm, ResponseStatusEnm
20-
├── const.py # API URLs, retry configurations
21-
├── utils.py # Utility functions (attempts_generator)
22-
├── context_instr.py # Context manager instrumentation
23-
└── __init__.py # Empty (anti-pattern)
9+
```text
10+
core/
11+
├── base.py # CaptchaParams: merges payloads, delegates to instruments
12+
├── captcha_instrument.py # CaptchaInstrumentBase + FileInstrument (~9.3k lines)
13+
├── sio_captcha_instrument.py # SIOCaptchaInstrument: sync HTTP via requests
14+
├── aio_captcha_instrument.py # AIOCaptchaInstrument: async HTTP via aiohttp + tenacity
15+
├── serializer.py # msgspec.Struct classes for API payloads/responses
16+
├── enum.py # CaptchaTypeEnm, ResponseStatusEnm, EndpointPostfixEnm
17+
├── const.py # REQUEST_URL, RETRIES, sleep intervals, status codes
18+
├── context_instr.py # SIOContextManager, AIOContextManager mixins
19+
├── utils.py # attempts_generator and helpers
20+
└── __init__.py # Empty — import via full path
2421
```
2522

26-
## WHERE TO LOOK
27-
| Task | File | Notes |
28-
|------|------|-------|
29-
| **Entry Point** | `base.py` | `CaptchaParams` class with `captcha_handler()` and `aio_captcha_handler()` |
30-
| **Base Classes** | `captcha_instrument.py` | `CaptchaInstrumentBase` for instruments, `FileInstrument` for file processing |
31-
| **Sync Instrument** | `sio_captcha_instrument.py` | `SIOCaptchaInstrument` - requests-based implementation |
32-
| **Async Instrument** | `aio_captcha_instrument.py` | `AIOCaptchaInstrument` - aiohttp-based implementation |
33-
| **Serialization** | `serializer.py` | `PostRequestSer`, `CaptchaResponseSer`, `MyBaseModel.to_dict()` |
34-
| **Enums** | `enum.py` | `CaptchaTypeEnm`, `ResponseStatusEnm`, `EndpointPostfixEnm` |
35-
| **Configuration** | `const.py` | `REQUEST_URL`, `RETRIES`, `VALID_STATUS_CODES` |
36-
37-
## CONVENTIONS
38-
- **Base Classes**: All instruments inherit from `CaptchaInstrumentBase`
39-
- **Dual Support**: Every captcha operation provides both sync and async methods
40-
- **Serialization**: `msgspec.Struct` classes with `to_dict()` method for API payloads
41-
- **Retries**: `tenacity` for async, `requests.Retry` for sync (5 attempts, exponential backoff)
42-
- **File Processing**: `FileInstrument` handles local files, URLs, and base64 in both sync/async contexts
43-
- **Session Management**: Instruments maintain their own HTTP sessions with retry adapters
44-
45-
## ANTI-PATTERNS (THIS MODULE)
46-
- **Empty `__init__.py`**: Does NOT re-export base classes. Users must import via full path `from python3_capsolver.core.base import CaptchaParams`
23+
## Local boundaries and invariants
24+
25+
- `captcha_instrument.py` is the largest file (~9.3k lines) and contains both `CaptchaInstrumentBase` (abstract) and `FileInstrument` (file/URL/base64 processing) — edits here affect every service
26+
- Instruments are the only place `requests` and `aiohttp` are imported — service layer must never touch HTTP libraries
27+
- All serialization uses `msgspec.Struct` with `to_dict()` — never use the `json` module directly
28+
- Enums in `enum.py` are the single source of truth for captcha types, response statuses, and endpoint names
29+
- `base.py:CaptchaParams` is the single entry point; service classes inherit it and add only type-specific `__init__` params
30+
31+
## Safe change rules
32+
33+
- Adding a new captcha type requires updating `CaptchaTypeEnm` in `enum.py` and optionally adding structs to `serializer.py`
34+
- Changes to `captcha_instrument.py` are high-impact: it is shared by all services and both sync/async paths
35+
- Retry configuration lives in `const.py` (`RETRIES`, `ASYNC_RETRIES`) — do not hardcode retry counts in instruments
36+
- `context_instr.py` provides `__enter__`/`__exit__` and `__aenter__`/`__aexit__` — all services depend on these mixins
37+
- `__init__.py` is intentionally empty; do not add re-exports
