
Commit f0d4164

Merge pull request #59 from warestack/feat/issue-24-diff-and-codeowners
feat: issue 24 diff and codeowners
2 parents 60efa70 + b74bd5a commit f0d4164

36 files changed

Lines changed: 2048 additions & 212 deletions

CHANGELOG.md

Lines changed: 276 additions & 0 deletions
@@ -0,0 +1,276 @@
1+
# Changelog
2+
3+
All notable changes to this project will be documented in this file.
4+
5+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
6+
7+
## [Unreleased] -- PR #59
8+
9+
### Added
10+
11+
- **Diff pattern scanning** -- `DiffPatternCondition` checks added lines in PR
12+
diffs against user-defined restricted regex patterns (e.g. `console\.log`,
13+
`TODO:`). Violations include the filename and matched patterns.
14+
- **Security pattern detection** -- `SecurityPatternCondition` flags hardcoded
15+
secrets, API keys, and sensitive data in PR diffs with CRITICAL severity.
16+
Both conditions share a new `_PatchPatternCondition` base class to eliminate
17+
duplication.
18+
- **Unresolved review comments gate** -- `UnresolvedCommentsCondition` blocks
19+
PR merges when unresolved (non-outdated) review comment threads exist,
20+
using GraphQL `reviewThreads` data from the enricher.
21+
- **Test coverage enforcement** -- `TestCoverageCondition` requires that PRs
22+
modifying source files also touch test files matching a configurable regex
23+
pattern (`test_file_pattern`).
24+
- **Comment response time SLA** -- `CommentResponseTimeCondition` flags
25+
unresolved review threads that have exceeded a configurable hour-based SLA
26+
(`max_comment_response_time_hours`).
27+
- **Signed commits verification** -- `SignedCommitsCondition` ensures all
28+
commits in a PR are cryptographically signed (GPG/SSH/S/MIME), for
29+
regulated environments that require commit provenance.
30+
- **Changelog requirement** -- `ChangelogRequiredCondition` blocks PRs that
31+
modify source code without a corresponding `CHANGELOG` or `.changeset`
32+
update.
33+
- **Self-approval prevention** -- `NoSelfApprovalCondition` enforces
34+
separation of duties by preventing PR authors from approving their own
35+
code (CRITICAL severity).
36+
- **Cross-team approval** -- `CrossTeamApprovalCondition` requires approvals
37+
from members of specified GitHub teams before merge. Uses a simplified
38+
`requested_teams` check (full team-membership resolution via GraphQL is
39+
tracked for a future iteration).
40+
- **Diff parsing utilities** -- New `src/rules/utils/diff.py` module with
41+
`extract_added_lines`, `extract_removed_lines`, and
42+
`match_patterns_in_patch` for reusable patch analysis.
43+
- **CODEOWNERS parser** -- New `src/rules/utils/codeowners.py` with
44+
`CodeOwnersParser` class supporting glob-to-regex conversion, owner
45+
lookup, and critical-file detection. CODEOWNERS content is now fetched
46+
dynamically from the GitHub API instead of reading from disk.
47+
- **Webhook handlers for review events** -- `PullRequestReviewEventHandler`
48+
and `PullRequestReviewThreadEventHandler` re-evaluate PR rules when
49+
reviews are submitted/dismissed or threads are resolved/unresolved.
50+
- **Review thread enrichment** -- `PullRequestEnricher` now fetches
51+
`reviewThreads` via GraphQL and attaches them to the event context,
52+
enabling `UnresolvedCommentsCondition` and `CommentResponseTimeCondition`.
53+
- **Full rule evaluation wiring** -- All new conditions are registered in
54+
`ConditionRegistry` (`AVAILABLE_CONDITIONS`, `RULE_ID_TO_CONDITION`) with
55+
corresponding `RuleID` enum values, violation-text mappings, and
56+
human-readable descriptions so they are routed through the fast
57+
condition-class evaluation path and support acknowledgment workflows.
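As a rough illustration of the diff pattern scanning and diff parsing utilities listed above, the sketch below pulls added lines out of a unified-diff patch and reports which restricted patterns match. The function names mirror those in the changelog, but the signatures and bodies are assumptions for illustration, not the code in `src/rules/utils/diff.py`.

```python
import re


def extract_added_lines(patch: str) -> list[str]:
    """Return added lines from a unified-diff patch, without the leading '+'."""
    added = []
    for line in patch.splitlines():
        # Simplification: '+++' is treated as the new-file header, so an added
        # line whose own content starts with '++' would be skipped as well.
        if line.startswith("+") and not line.startswith("+++"):
            added.append(line[1:])
    return added


def match_patterns_in_patch(patch: str, patterns: list[str]) -> list[str]:
    """Return the patterns that match at least one added line."""
    added = extract_added_lines(patch)
    compiled = [(pattern, re.compile(pattern)) for pattern in patterns]
    return [
        pattern
        for pattern, regex in compiled
        if any(regex.search(line) for line in added)
    ]


patch = """@@ -1,2 +1,3 @@
 const x = 1;
-const y = 2;
+console.log(x);
+const y = 3;
"""
print(match_patterns_in_patch(patch, [r"console\.log", r"TODO:"]))
```

Only added lines are inspected, so a pattern that already existed in the file (or that a PR removes) does not trigger a match in this sketch.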
58+
59+
### Changed
60+
61+
- **GraphQL client consolidation** -- Removed the standalone
62+
`graphql_client.py` module; all GraphQL operations now go through the
63+
unified `GitHubAPI` class with Pydantic-typed response models.
64+
- **CODEOWNERS fetched from API** -- `PathHasCodeOwnerCondition` and
65+
`RequireCodeOwnerReviewersCondition` now receive CODEOWNERS content via
66+
the event context (fetched by the enricher) rather than reading from the
67+
local filesystem.
68+
- **`_PatchPatternCondition` base class** -- `DiffPatternCondition` and
69+
`SecurityPatternCondition` now share a common abstract base, reducing
70+
~60 lines of duplicated iteration/matching logic.
71+
- **Removed redundant `validate()` overrides** -- Conditions in
72+
`compliance.py` and `access_control_advanced.py` that simply delegated to
73+
`evaluate()` now rely on `BaseCondition.validate()` which does the same
74+
thing.
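The shared base class and the inherited `validate()` described above could look roughly like the following. This is a minimal sketch under stated assumptions: the context dictionary shape, the `parameter_name` and `label` attributes, and the use of plain strings for violations are all hypothetical. Only the class names and the parameter names `diff_restricted_patterns` and `security_patterns` come from this commit.

```python
import re
from abc import ABC, abstractmethod


class BaseCondition(ABC):
    @abstractmethod
    def evaluate(self, context: dict) -> list[str]:
        """Return violation messages; an empty list means the rule passes."""

    def validate(self, context: dict) -> bool:
        # Legacy bool interface: passes when evaluate() reports nothing.
        return not self.evaluate(context)


class _PatchPatternCondition(BaseCondition):
    parameter_name = ""  # hypothetical: rule parameter holding the patterns
    label = ""  # hypothetical: prefix used in violation messages

    def evaluate(self, context: dict) -> list[str]:
        patterns = context.get("parameters", {}).get(self.parameter_name, [])
        violations = []
        for changed in context.get("files", []):
            added = [
                line[1:]
                for line in changed.get("patch", "").splitlines()
                if line.startswith("+") and not line.startswith("+++")
            ]
            hits = [p for p in patterns if any(re.search(p, a) for a in added)]
            if hits:
                name = changed["filename"]
                violations.append(f"{self.label} in {name}: {', '.join(hits)}")
        return violations


class DiffPatternCondition(_PatchPatternCondition):
    parameter_name = "diff_restricted_patterns"
    label = "Restricted pattern"


class SecurityPatternCondition(_PatchPatternCondition):
    parameter_name = "security_patterns"
    label = "Security-sensitive pattern"
```

With this layout the two concrete conditions differ only in configuration, and neither needs its own `validate()` override.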
75+
76+
### Fixed
77+
78+
- **Fail-closed on invalid regex** -- `TestCoverageCondition` now returns a
79+
violation (and `validate()` returns `False`) when `test_file_pattern` is
80+
an invalid regex, instead of silently passing.
81+
- **Consistent file-extension filtering** -- `TestCoverageCondition.validate()`
82+
now ignores `.txt` and `.json` files, matching the behavior of `evaluate()`.
83+
- **`max_hours=0` edge case** -- `CommentResponseTimeCondition` now uses
84+
`if max_hours is None` instead of `if not max_hours`, so a 0-hour SLA
85+
(immediate response required) is correctly enforced.
86+
- **Overly generic violation mapping key** -- Changed the
87+
`COMMENT_RESPONSE_TIME` acknowledgment mapping from `"exceeded the"` to
88+
`"response SLA"` to avoid false matches against unrelated violation text.
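Two of the fixes above are easy to get wrong, so here is a small sketch of each. The helper names `sla_exceeded` and `has_test_file` are hypothetical and do not appear in the commit; they only isolate the `is None` check and the fail-closed regex handling.

```python
import re
from typing import Optional


def sla_exceeded(age_hours: float, max_hours: Optional[float]) -> bool:
    """Return True when a thread is older than the configured SLA."""
    # `if not max_hours` would also be true for 0 and silently disable a
    # 0-hour SLA; checking for None keeps 0 as a valid limit.
    if max_hours is None:
        return False
    return age_hours > max_hours


def has_test_file(changed_files: list[str], test_file_pattern: str) -> bool:
    """Return True when any changed file matches the test-file regex."""
    try:
        regex = re.compile(test_file_pattern)
    except re.error:
        # Fail closed: an invalid pattern is treated as "no tests found"
        # rather than letting the check pass.
        return False
    return any(regex.search(name) for name in changed_files)
```

Whether a thread exactly at the limit counts as exceeded is an assumption here (strictly greater than); the commit does not say.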
89+
90+
## [2026-02-27] -- PRs #54, #58
91+
92+
### Added
93+
94+
- **Disabled rule filtering** -- Rules with `enabled: false` in
95+
`rules.yaml` are now skipped during loading.
96+
- **CodeRabbit-style PR comments** -- Collapsible `<details>` sections for
97+
violations, acknowledgment summaries, and check run output.
98+
- **Watchflow footer** -- Branded footer appended to PR comments.
99+
- **Severity grouping fix** -- `INFO` severity rules are now grouped
100+
correctly instead of falling back to `LOW`.
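Disabled rule filtering can be pictured as a filter over the parsed rule list. The function name and the rule dictionary shape below are hypothetical, and treating a missing `enabled` key as enabled is an assumption.

```python
def load_enabled_rules(raw_rules: list[dict]) -> list[dict]:
    """Keep only rules that are not marked `enabled: false`."""
    # Assumption: a rule without an `enabled` key is considered enabled.
    return [rule for rule in raw_rules if rule.get("enabled", True)]


rules = [
    {"description": "Require linked issue"},
    {"description": "Limit PR size", "enabled": False},
]
print([rule["description"] for rule in load_enabled_rules(rules)])
```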
101+
102+
### Changed
103+
104+
- **Default rules aligned with watchflow.dev** -- Canonical rule set updated
105+
to match the published documentation examples.
106+
- **`max_pr_loc` parameter alias** -- `MaxPrLocCondition` now accepts
107+
`max_pr_loc` and `max_changed_lines` in addition to `max_lines`.
108+
- **CODEOWNERS reviewer exclusion** -- PR author is excluded from the
109+
required code-owner reviewers list.
110+
- **Legacy rule ID references removed** -- Generated PR comments and error
111+
messages no longer expose internal `RuleID` strings.
112+
113+
### Fixed
114+
115+
- **Acknowledgment text matching** -- Violation text keys updated to
116+
exactly match the messages emitted by conditions.
117+
- **GitHub App auth env vars** -- Standardized to `APP_CLIENT_ID_GITHUB`
118+
and `APP_CLIENT_SECRET_GITHUB`.
119+
120+
## [2026-02-26] -- PRs #43 (cont.), event filtering
121+
122+
### Added
123+
124+
- **Event filtering** -- Irrelevant GitHub events (e.g. bot-only,
125+
label-only) are now dropped before reaching the rule engine, reducing
126+
noise and unnecessary LLM calls.
127+
128+
### Fixed
129+
130+
- **Deployment status blocking** -- Resolved an issue where deployment
131+
status events could block without a clear reason.
132+
- **Deployment approval gating** -- Addressed CodeRabbit feedback on
133+
retry logic, falsy checks, and callback URL handling.
134+
135+
## [2026-01-31] -- PR #43
136+
137+
### Added
138+
139+
- **Core event processing infrastructure** -- `PullRequestProcessor`,
140+
`PushEventProcessor`, `DeploymentProcessor`, and `CheckRunProcessor`
141+
with enrichment, rule evaluation, and GitHub reporting pipeline.
142+
- **Task queue with deduplication** -- Async `TaskQueue` for enqueuing
143+
webhook processing with delivery-ID-based dedup.
144+
- **Rule engine agent (LangGraph)** -- `RuleEngineAgent` with a multi-node
145+
workflow: analyze rules, select strategy (condition class vs LLM
146+
reasoning vs hybrid), execute, and validate.
147+
- **Acknowledgment agent** -- `AcknowledgmentAgent` parses `@watchflow ack`
148+
comments and maps violations to `RuleID` enum values.
149+
- **Webhook dispatcher and handlers** -- Modular handler registry for
150+
`pull_request`, `push`, `check_run`, `deployment`, `deployment_status`,
151+
`deployment_protection_rule`, `deployment_review`, and `issue_comment`
152+
events.
153+
- **Condition-based rule evaluation** -- `BaseCondition` ABC with
154+
`evaluate()` (returns `list[Violation]`) and `validate()` (legacy bool
155+
interface). Initial conditions: `TitlePatternCondition`,
156+
`MinDescriptionLengthCondition`, `RequiredLabelsCondition`,
157+
`MinApprovalsCondition`, `RequireLinkedIssueCondition`,
158+
`MaxFileSizeCondition`, `MaxPrLocCondition`, `FilePatternCondition`,
159+
`PathHasCodeOwnerCondition`, `RequireCodeOwnerReviewersCondition`,
160+
`CodeOwnersCondition`, `ProtectedBranchesCondition`,
161+
`NoForcePushCondition`, `AuthorTeamCondition`, `AllowedHoursCondition`,
162+
`DaysCondition`, `WeekendCondition`, `WorkflowDurationCondition`.
163+
- **Condition registry** -- `ConditionRegistry` with parameter-pattern
164+
matching to automatically wire YAML rule parameters to condition classes.
165+
- **`RuleID` enum and acknowledgment system** -- Type-safe rule
166+
identifiers, violation-text-to-rule mapping, and acknowledgment comment
167+
parsing.
168+
- **Webhook auth** -- HMAC-SHA256 signature verification for GitHub
169+
webhooks.
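For readers unfamiliar with the webhook auth entry above, GitHub signs each payload with the shared secret and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal verification sketch follows; the function name is hypothetical and this is not necessarily how the project implements it.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check a GitHub webhook body against its X-Hub-Signature-256 header."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The comparison must use the raw request bytes; re-serializing parsed JSON can change whitespace and break the signature.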
170+
171+
### Changed
172+
173+
- **Architectural modernization** -- Migrated from monolithic processor to
174+
modular event-processor / agent / handler architecture with Pydantic
175+
models throughout.
176+
- **Documentation overhaul** -- All docs aligned with the rule engine
177+
architecture, description-based rule format, and supported validation
178+
logic.
179+
180+
### Fixed
181+
182+
- **Dead code removal** -- Cleaned up unused webhook and PR processing code.
183+
- **JSON parse errors** -- Webhook handler now returns proper error
184+
responses on malformed payloads.
185+
- **WebhookResponse status normalization** -- Consistent status field
186+
values across all handlers.
187+
188+
## [2025-12-01] -- PRs #27-35
189+
190+
### Added
191+
192+
- **Repository Analysis Agent** -- `RepositoryAnalysisAgent` with LangGraph
193+
workflow analyzing PR history, contributing guidelines, and repository
194+
hygiene. Includes Pydantic models, LLM prompt templates, and API
195+
endpoints for rule recommendations.
196+
- **Diff-aware validators** -- `diff_pattern`, `related_tests`, and
197+
`required_field_in_diff` validators with normalized diff metadata and
198+
LLM-friendly summaries for PR files.
199+
- **Feasibility agent validator selection** -- `FeasibilityAgent` now
200+
dynamically chooses validators from a catalog.
201+
- **AI Immune System metrics** -- Repository health scoring with hygiene
202+
metrics and structured API responses.
203+
- **PR automation** -- Automated PR creation from repository analysis
204+
recommendations.
205+
206+
### Changed
207+
208+
- **Diff-aware rule presets** -- Default rule bundles updated to use the
209+
new diff-aware parameters and threading guardrails.
210+
211+
### Fixed
212+
213+
- **PR creation 404 prevention** -- Proper error handling for `create_git_ref`
214+
422 responses and repository analysis caching.
215+
- **Repository analysis reliability** -- Improved logging, formatting, and
216+
content checks in analysis nodes.
217+
218+
## [2025-10-01] -- PRs #18-21
219+
220+
### Added
221+
222+
- **Multi-provider AI abstraction** -- Provider-agnostic `get_chat_model()`
223+
factory supporting OpenAI, AWS Bedrock, and Google Vertex AI (Model
224+
Garden). Registry pattern for provider selection.
225+
- **Python version compatibility checks** -- Pre-commit hook validates
226+
syntax against target Python version.
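The registry pattern behind a provider-agnostic factory can be sketched as below. Only the name `get_chat_model()` comes from the changelog; its real signature is not shown in this commit, and the decorator, the registry dictionary, and the stub provider are hypothetical stand-ins.

```python
from typing import Callable

# Maps a provider name to a factory that builds a chat model (hypothetical).
_PROVIDERS: dict[str, Callable[..., object]] = {}


def register_provider(name: str):
    """Decorator that adds a factory to the provider registry."""

    def decorator(factory: Callable[..., object]):
        _PROVIDERS[name] = factory
        return factory

    return decorator


@register_provider("openai")
def _build_openai(**kwargs):
    # Stand-in for constructing a real client; returns a tuple for the demo.
    return ("openai-stub", kwargs)


def get_chat_model(provider: str, **kwargs):
    """Look up the provider's factory and build a model with it."""
    try:
        factory = _PROVIDERS[provider]
    except KeyError:
        known = ", ".join(sorted(_PROVIDERS))
        raise ValueError(f"Unknown provider {provider!r}; known: {known}") from None
    return factory(**kwargs)
```

Callers depend only on `get_chat_model()`, so adding a provider means registering one more factory rather than touching call sites.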
227+
228+
### Changed
229+
230+
- **Provider-agnostic LLM usage** -- Replaced direct `ChatOpenAI`
231+
instantiation with the `get_chat_model()` abstraction throughout.
232+
- **Module restructuring** -- Reorganized package layout and updated
233+
configuration.
234+
235+
## [2025-08-05] -- PRs #10-13
236+
237+
### Added
238+
239+
- **CODEOWNERS integration** -- Initial CODEOWNERS file parsing and
240+
contributor analysis.
241+
- **Agent architecture enhancements** -- Improved consistency and
242+
reliability for `FeasibilityAgent` and `RuleEngineAgent`.
243+
- **Structured output for FeasibilityAgent** -- LLM responses parsed into
244+
Pydantic models.
245+
- **Testing framework** -- Coverage reporting, CI test pipeline, and
246+
mocking infrastructure for agents and LLM clients.
247+
- **GitHub Pages documentation** -- MkDocs site deployed via GitHub
248+
Actions.
249+
250+
### Changed
251+
252+
- **FastAPI lifespan** -- Replaced deprecated `on_event` handlers with
253+
lifespan context manager.
254+
- **Description-based rule format** -- Rules in YAML now use natural
255+
language descriptions matched to conditions.
256+
257+
### Fixed
258+
259+
- **CI pipeline** -- Python setup, coverage reporting, Codecov auth,
260+
MkDocs dependencies.
261+
- **Test isolation** -- Proper mocking of agent creation, config
262+
validation, and LLM client initialization.
263+
264+
## [2025-07-18] -- Initial release
265+
266+
### Added
267+
268+
- **Watchflow AI governance engine** -- First open-source release.
269+
LangGraph-based rule evaluation for GitHub webhook events
270+
(pull requests, pushes, deployments).
271+
- **EKS deployment** -- Helm chart, Kubernetes manifests, and GitHub
272+
Actions workflow for AWS EKS.
273+
- **Pre-commit hooks** -- Ruff linting and formatting, YAML checks,
274+
trailing whitespace, large file detection.
275+
- **Development tooling** -- `uv` package management, development guides,
276+
contributor guidelines.

README.md

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44

55
GitHub governance that runs where you already work. No new dashboards, no “AI-powered” fluff—just rules in YAML, evaluated on every PR and push, with check runs and comments that maintainers actually read.
66

7-
Watchflow is the governance layer for your repo: it enforces the policies you define (CODEOWNERS, approvals, linked issues, PR size, title patterns, branch protection) so you don’t have to chase reviewers or guess what’s allowed. Built for teams that still care about traceability and review quality.
7+
Watchflow is the governance layer for your repo: it enforces the policies you define (CODEOWNERS, approvals, linked issues, PR size, title patterns, branch protection, diff scanning, review thread SLAs, signed commits, and more) so you don’t have to chase reviewers or guess what’s allowed. Built for teams that still care about traceability and review quality.
88

99
---
1010

@@ -46,11 +46,19 @@ Rules are **description + event_types + parameters**. The engine matches paramet
4646
| **PR** | `critical_owners: []` / code owners | pull_request | Changes to critical paths require code-owner review. |
4747
| **PR** | `require_path_has_code_owner: true` | pull_request | Every changed path must have an owner in CODEOWNERS. |
4848
| **PR** | `protected_branches: ["main"]` | pull_request | Block direct targets to these branches. |
49+
| **PR** | `diff_restricted_patterns: ["console\\.log", "TODO:"]` | pull_request | Flag restricted regex patterns in PR diff added lines. |
50+
| **PR** | `security_patterns: ["api_key", "secret"]` | pull_request | Detect hardcoded secrets or sensitive data in diffs (critical). |
51+
| **PR** | `block_on_unresolved_comments: true` | pull_request | Block merge when unresolved review threads exist. |
52+
| **PR** | `require_tests: true` | pull_request | Source changes must include corresponding test file changes. |
53+
| **PR** | `max_comment_response_time_hours: 24` | pull_request | Review threads must be addressed within SLA. |
54+
| **PR** | `require_signed_commits: true` | pull_request | All commits must be cryptographically signed (GPG/SSH). |
55+
| **PR** | `require_changelog_update: true` | pull_request | Source changes must include a CHANGELOG or .changeset update. |
56+
| **PR** | `block_self_approval: true` | pull_request | PR authors cannot approve their own code. |
57+
| **PR** | `required_team_approvals: ["backend", "security"]` | pull_request | Require approvals from specified GitHub teams. |
4958
| **Push** | `no_force_push: true` | push | Reject force pushes. |
5059
| **Files** | `max_file_size_mb: 1` | pull_request | No single file > N MB. |
5160
| **Files** | `pattern` + `condition_type: "files_match_pattern"` | pull_request | Changed files must (or must not) match glob/regex. |
5261
| **Time** | `allowed_hours`, `days`, weekend | deployment / workflow | Restrict when actions can run. |
53-
| **Deploy** | `environment`, approvals | deployment | Deployment protection. |
5462

5563
Rules are read from the **default branch** (e.g. `main`). Each webhook delivery is deduplicated by `X-GitHub-Delivery` so handler and processor both run; comments and check runs stay in sync.
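Deduplication by delivery ID amounts to remembering which `X-GitHub-Delivery` values have already been accepted. The class below is a hypothetical in-memory sketch, not the project's `TaskQueue`; a real deployment would likely need expiry and shared storage.

```python
class DeliveryDeduplicator:
    """Tracks webhook delivery IDs that have already been accepted."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def should_process(self, delivery_id: str) -> bool:
        """Return True the first time an ID is seen, False on redelivery."""
        if delivery_id in self._seen:
            return False
        self._seen.add(delivery_id)
        return True


dedup = DeliveryDeduplicator()
for delivery_id in ["a1", "b2", "a1"]:
    print(delivery_id, dedup.should_process(delivery_id))
```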
5664

docs/concepts/overview.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,7 @@ graph TD
4141
### Condition registry
4242

4343
- Maps parameter names to condition classes (e.g. `require_linked_issue` → `RequireLinkedIssueCondition`, `max_lines` → `MaxPrLocCondition`, `require_code_owner_reviewers` → `RequireCodeOwnerReviewersCondition`).
44-
- Supported conditions: linked issue, title pattern, description length, labels, approvals, PR size (lines), CODEOWNERS (path has owner, require owners as reviewers), protected branches, no force push, file size, file pattern, time/deploy rules. See [Configuration](../getting-started/configuration.md).
44+
- Supported conditions: linked issue, title pattern, description length, labels, approvals, PR size (lines), CODEOWNERS (path has owner, require owners as reviewers), protected branches, no force push, file size, file pattern, diff pattern scanning, security pattern detection, unresolved comments, test coverage, comment response SLA, signed commits, changelog required, self-approval prevention, cross-team approval, time/deploy rules. See [Configuration](../getting-started/configuration.md).
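The parameter-to-class mapping can be pictured as a lookup table. In the sketch below the class bodies are empty placeholders and `conditions_for` is a hypothetical helper; the parameter names and the `max_pr_loc` / `max_changed_lines` aliases are taken from this commit's README and changelog.

```python
class RequireLinkedIssueCondition:
    """Placeholder for the real condition class."""


class MaxPrLocCondition:
    """Placeholder for the real condition class."""


PARAM_TO_CONDITION = {
    "require_linked_issue": RequireLinkedIssueCondition,
    "max_lines": MaxPrLocCondition,
    "max_pr_loc": MaxPrLocCondition,  # alias
    "max_changed_lines": MaxPrLocCondition,  # alias
}


def conditions_for(parameters: dict) -> list[type]:
    """Return the condition classes selected by a rule's parameter names."""
    selected = []
    for name in parameters:
        condition = PARAM_TO_CONDITION.get(name)
        # Skip unknown parameters and avoid adding one class twice via aliases.
        if condition is not None and condition not in selected:
            selected.append(condition)
    return selected


print(conditions_for({"max_pr_loc": 500, "require_linked_issue": True}))
```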
4545

4646
### PR enricher
4747
