Use `/release` to prepare the changelog, bump versions, commit, and tag. Use `/commit` for day-to-day commits.

+## Planning & Feature Design
+
+### Design-First, Discuss Before Building
+For any non-trivial feature, enter plan mode and work through the design iteratively with the user before writing code. Don't jump to implementation; discuss trade-offs, edge cases, and security implications first. The goal is alignment on approach before any code is written.
+
+**Planning workflow:**
+1. **Explore**: read the relevant code paths end-to-end. Understand what exists before proposing what to build.
+2. **Design**: propose the approach with concrete trade-offs. Present options with pros/cons, not just one solution.
+3. **Discuss**: ask the user targeted questions about design decisions. Don't make assumptions on ambiguous points. Use AskUserQuestion for specific choices, not open-ended "what do you think?" questions.
+4. **Harden**: after the core design is agreed, proactively ask: "What else can we improve?" Look for security gaps, edge cases, performance concerns, and missing test coverage. Iterate until the user says "enough."
+5. **Finalize**: write the plan with all decisions documented, then exit plan mode.
+
+### Test Vector Design During Planning (Non-Optional)
+Every feature plan MUST include a comprehensive test case inventory before implementation begins. Tests are designed during planning, not added as an afterthought. The test cases serve as the specification: if you can't write the test case, you don't understand the feature well enough.
+
+**Systematic test categories to cover for every feature:**
+
+| Category | What to ask | Examples |
+|----------|------------|---------|
+| **Happy path** | Does the basic flow work? | CRUD operations, expected inputs, normal usage |
+| **Attack vectors** | Can it be exploited? | SQL injection, parameter tampering, scope mismatches, privilege escalation |
+| **Deny-wins / security invariants** | Do security guarantees hold? | Deny overrides allow, deactivation blocks access, audit can't be tampered with |
+| **State interactions** | How does it interact with existing features? | `is_active` flags, `is_enabled` flags, `access_mode`, template variables |
+| **FK cascades / data integrity** | What happens when related entities are deleted? | Delete parent → child cleanup, unique constraint violations |
+| **Cache consistency** | Do changes take effect immediately? | Mutation → cache invalidation → next query reflects change |
+| **Timing / concurrency** | What about race conditions? | Mid-session changes, concurrent mutations, rapid successive operations |
+| **Edge cases** | What about boundary conditions? | Empty sets, max lengths, zero members, duplicate entries |
+| **Audit integrity** | Are all mutations tracked? | Every CRUD op logged, correct actor, accurate before/after snapshots |
+| **Multi-entity interaction** | How do multiple instances interact? | Multiple roles, multiple datasources, overlapping policies, priority conflicts |
+| **Backward compatibility** | Does existing functionality still work? | Old API formats, migration of existing data, default values |
+
+**Test naming convention:** Group tests by category with descriptive names. Map security-relevant tests to vector numbers in `docs/permission-security-tests.md`.
+
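As an illustration of the naming convention, here is a minimal Rust sketch that groups tests by category for a deny-wins invariant. `Effect` and `resolve` are hypothetical stand-ins rather than the project's actual types, and the default-deny behavior for an empty policy set is an assumption made for the example.

```rust
/// Hypothetical policy effect; not the project's real type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Effect {
    Permit,
    Deny,
}

/// Hypothetical resolver: access is granted only if at least one policy
/// permits and no policy denies (deny wins). An empty set denies by
/// assumption.
pub fn resolve(effects: &[Effect]) -> bool {
    !effects.is_empty() && !effects.contains(&Effect::Deny)
}

#[cfg(test)]
mod tests {
    // One module per test category, so test names read as
    // `tests::deny_wins::deny_overrides_permit`.
    mod happy_path {
        use super::super::{resolve, Effect};

        #[test]
        fn single_permit_grants_access() {
            assert!(resolve(&[Effect::Permit]));
        }
    }

    mod deny_wins {
        use super::super::{resolve, Effect};

        // A security-relevant test would also reference its vector number
        // from docs/permission-security-tests.md in a comment here.
        #[test]
        fn deny_overrides_permit() {
            assert!(!resolve(&[Effect::Permit, Effect::Deny]));
        }
    }

    mod edge_cases {
        use super::super::resolve;

        #[test]
        fn empty_policy_set_denies() {
            assert!(!resolve(&[]));
        }
    }
}
```

Nesting one module per category keeps the category visible in test output without encoding it into every function name.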
+### Security-First Thinking
+This is a data security product. Every feature that touches access control, policy resolution, or data visibility must be evaluated through a security lens:
+- **What can an attacker do?** Enumerate bypass vectors before building defenses.
+- **What breaks when state changes?** Deactivation, deletion, membership changes, policy mutations.
+- **What's the blast radius?** How many users/connections are affected by a change?
+- **Is the audit trail complete?** Can every mutation be traced back to who did it and when?
+
 ## Migrations (`migration/`)

 ### Rules (violations here cause hard-to-fix production incidents)
@@ -301,16 +305,16 @@ After authentication succeeds in `handler.rs`, a background task pre-builds the
 Access control is enforced **before** any query reaches the engine:

 1. `validate_data_source()`: datasource must exist and be active
-2. `check_access(user_id, datasource_name)`: user must have an explicit `user_data_source` row
+2. `check_access(user_id, datasource_name)`: user must have access via `data_source_access` (direct, role-based, or all-scoped)
 3. If either check fails → `FATAL` PG error, connection rejected before `get_ctx()` is ever called
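The ordering of these checks can be sketched as follows. This is a minimal illustration under stated assumptions, not the proxy's actual code: `Store`, `AuthError`, `authorize_connection`, and the `build_ctx` closure (standing in for `get_ctx()`) are hypothetical, and the in-memory maps replace whatever storage the real handler queries.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical error type; the real proxy reports a `FATAL` PG error.
#[derive(Debug, PartialEq)]
pub enum AuthError {
    Fatal(String),
}

/// Hypothetical in-memory stand-in for the admin store.
pub struct Store {
    /// Datasource name -> is_active flag.
    pub datasources: HashMap<String, bool>,
    /// (user_id, datasource name) pairs that resolve to access.
    pub access: HashSet<(u32, String)>,
}

impl Store {
    /// Step 1: the datasource must exist and be active.
    pub fn validate_data_source(&self, name: &str) -> Result<(), AuthError> {
        match self.datasources.get(name).copied() {
            Some(true) => Ok(()),
            Some(false) => Err(AuthError::Fatal(format!("datasource '{}' is inactive", name))),
            None => Err(AuthError::Fatal(format!("datasource '{}' does not exist", name))),
        }
    }

    /// Step 2: the user must have access to the datasource.
    pub fn check_access(&self, user_id: u32, name: &str) -> Result<(), AuthError> {
        if self.access.contains(&(user_id, name.to_string())) {
            Ok(())
        } else {
            Err(AuthError::Fatal(format!("user {} has no access to '{}'", user_id, name)))
        }
    }
}

/// Runs both checks in order. `build_ctx` stands in for `get_ctx()` and is
/// only invoked after both checks pass (step 3).
pub fn authorize_connection<T>(
    store: &Store,
    user_id: u32,
    name: &str,
    build_ctx: impl FnOnce() -> T,
) -> Result<T, AuthError> {
    store.validate_data_source(name)?;
    store.check_access(user_id, name)?;
    Ok(build_ctx())
}
```

Passing the context builder as a closure makes the "never called on failure" property easy to assert in a test.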

 ### Why the Shared Pool Is Safe

 The upstream connection pool carries **no user identity**; it is pure TCP connectivity to the upstream Postgres server. All identity and access decisions are made at the pgwire auth layer (steps 1–2 above), not at the pool layer.

 Per-user isolation is enforced by:
-- **Data plane**: `user_data_source` allowlist (no row → connection rejected)
-- **RLS hook**: per-query `WHERE tenant = '<value>'` filter injected via DataFusion's logical plan tree, based on the authenticated user's tenant metadata
+- **Data plane**: `data_source_access` allowlist (no matching row → connection rejected). Access can be granted directly to a user, via role membership (including inherited roles), or to all users.
+- **Policy hook**: per-query row filters, column masks, and access controls injected via DataFusion's logical plan tree, based on the authenticated user's policy assignments (direct, role-based, or wildcard)
 - **Virtual catalog**: the stored catalog is an allowlist; tables/columns not explicitly saved are invisible to the engine

 The shared pool is safe for all authorized users of a datasource: Pool = "how to talk to upstream". Auth + RLS = "what this user can see". These are orthogonal.
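The three grant scopes named for `data_source_access` (direct, role-based including inherited roles, and all users) could be resolved roughly as below. This is a sketch under assumptions: the `Grant` enum, the `parents` map, and both function names are hypothetical, and it assumes a member of a role also holds every role that role inherits from.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical scope of one `data_source_access` row.
#[derive(Debug, Clone, PartialEq)]
pub enum Grant {
    User(u32),
    Role(String),
    AllUsers,
}

/// Collects a user's direct roles plus every role reachable through
/// `parents` (role -> roles it inherits from). The `seen` set also guards
/// against inheritance cycles.
pub fn effective_roles(
    direct: &[String],
    parents: &HashMap<String, Vec<String>>,
) -> HashSet<String> {
    let mut seen = HashSet::new();
    let mut stack: Vec<String> = direct.to_vec();
    while let Some(role) = stack.pop() {
        if seen.insert(role.clone()) {
            if let Some(inherited) = parents.get(&role) {
                stack.extend(inherited.iter().cloned());
            }
        }
    }
    seen
}

/// Allowlist semantics: access exists only if some grant matches the user
/// directly, through an effective role, or through the all-users scope.
pub fn has_access(user_id: u32, roles: &HashSet<String>, grants: &[Grant]) -> bool {
    grants.iter().any(|grant| match grant {
        Grant::User(id) => *id == user_id,
        Grant::Role(role) => roles.contains(role),
        Grant::AllUsers => true,
    })
}
```

With no matching grant, `has_access` returns `false`, which corresponds to the "no matching row → connection rejected" rule.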
@@ -328,16 +332,16 @@ QueryProxy enforces a two-layer access control model:
 **Management plane**: controlled by `is_admin` flag. Admins manage users, data sources, policies, and catalogs via the Admin API. Non-admins have no Admin API access.

 **Data plane**: controlled by two independent mechanisms:
-1. *Connection access*: explicit `user_data_source` assignment. A user can only connect to a datasource with an explicit row. Being an admin does **not** automatically grant data plane access.
-2. *Query policy*: `PolicyHook` applies row filters, column masks, and column access controls per-query based on assigned policies. If the datasource `access_mode` is `"policy_required"`, tables with no matching permit policy return empty results.
+1. *Connection access*: `data_source_access` entries. A user can connect to a datasource via direct assignment, role membership (including inherited roles), or all-user scope. Being an admin does **not** automatically grant data plane access.
+2. *Query policy*: `PolicyHook` applies row filters, column masks, and column access controls per-query based on assigned policies (direct, role-based, or all-scoped). If the datasource `access_mode` is `"policy_required"`, tables with no matching permit policy return empty results.

 See `docs/permission-system.md` for the full policy system user guide.
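A rough sketch of how the two data plane mechanisms stay independent of the `is_admin` flag, and how `"policy_required"` leads to empty results. Everything here is hypothetical: `AccessMode::Open` is an assumed name for the non-restrictive mode (only `"policy_required"` appears in the text), and `ScanDecision`, `decide_scan`, and `can_connect` are illustrative rather than part of `PolicyHook`.

```rust
/// Hypothetical datasource access modes. `Open` is an assumed name for the
/// mode that does not require a permit policy.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AccessMode {
    PolicyRequired,
    Open,
}

/// Hypothetical outcome of policy resolution for one table.
#[derive(Debug, PartialEq)]
pub enum ScanDecision {
    /// No permit policy matched under `PolicyRequired`: return no rows.
    Empty,
    /// No policy matched and none is required: scan as-is.
    Unfiltered,
    /// Apply the row filters from the matching permit policies.
    Filtered(Vec<String>),
}

/// Decides how a table scan proceeds given the row filters of the permit
/// policies that matched the user (direct, role-based, or all-scoped).
pub fn decide_scan(mode: AccessMode, permit_filters: &[&str]) -> ScanDecision {
    match (mode, permit_filters.is_empty()) {
        (AccessMode::PolicyRequired, true) => ScanDecision::Empty,
        (AccessMode::Open, true) => ScanDecision::Unfiltered,
        (_, false) => {
            ScanDecision::Filtered(permit_filters.iter().map(|f| f.to_string()).collect())
        }
    }
}

/// Connection access depends only on `data_source_access`; the admin flag
/// is deliberately ignored.
pub fn can_connect(_is_admin: bool, has_data_source_access: bool) -> bool {
    has_data_source_access
}
```

Keeping `_is_admin` as an explicit but unused parameter documents the rule that admin status grants no data plane access.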
Catalog entity IDs (schemas, tables, columns) are deterministic UUID v5 fingerprints derived from their natural keys. Re-discovering the same upstream object always produces the same ID, so re-syncs are safe upserts.