Commit 6d0dd18

Full tasks manager and gating rebuild (Release v0.8.0)
1 parent c5b583f commit 6d0dd18

31 files changed

Lines changed: 2415 additions & 2999 deletions

AGENTS.md

Lines changed: 37 additions & 0 deletions
@@ -268,4 +268,41 @@ Do not restate this contract.

 ---

+## 10. Server-Governed Prompts (Phase 1–5)
+
+tinyMem now runs a fixed series of server-controlled prompts so agent output becomes an intent ledger, not an execution trace.
+
+### Prompt 1 — TaskManager Ownership
+* `tinyTasks.md` is the server's sole task ledger; the LLM may not read or write it directly.
+* All task mutations flow through the shared TaskManager, which loads/parses the file, validates structure, and exposes add, update, complete, and list operations.
+* MCP and proxy mode call the same TaskManager path, so any other file access to `tinyTasks.md` is rejected.
+*Implementation evidence: `internal/tasks/manager.go` implements the TaskManager APIs and both `internal/server/mcp/server.go` (lines 90-106) and `internal/server/proxy/server.go` (lines 81-110) instantiate that shared manager so every mutation path is server-owned.
+
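The ownership rule in Prompt 1 can be sketched in Go. This is a minimal illustration under stated assumptions, not the code in `internal/tasks/manager.go`, which this diff does not show: the `Task` struct, the error values, and the method signatures are all hypothetical, the ledger is held in memory rather than loaded from `tinyTasks.md`, and only a subset of the operations (add, complete, render) is covered.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// Task is a hypothetical ledger entry; the real struct is not shown in the diff.
type Task struct {
	Title string
	Done  bool
}

// Illustrative error values returned when validation fails.
var (
	ErrEmptyTitle = errors.New("task title must not be empty")
	ErrNotFound   = errors.New("task not found")
)

// TaskManager stands in for the shared manager: the only code path allowed
// to mutate the ledger. Callers never touch the backing file themselves.
type TaskManager struct {
	mu    sync.Mutex
	tasks []Task
}

// Add validates the title and appends an unchecked task, returning its index.
func (m *TaskManager) Add(title string) (int, error) {
	title = strings.TrimSpace(title)
	if title == "" {
		return 0, ErrEmptyTitle
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.tasks = append(m.tasks, Task{Title: title})
	return len(m.tasks) - 1, nil
}

// Complete marks the task at index i as done.
func (m *TaskManager) Complete(i int) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if i < 0 || i >= len(m.tasks) {
		return ErrNotFound
	}
	m.tasks[i].Done = true
	return nil
}

// Render serializes the ledger as a markdown checklist.
func (m *TaskManager) Render() string {
	m.mu.Lock()
	defer m.mu.Unlock()
	var b strings.Builder
	for _, t := range m.tasks {
		mark := " "
		if t.Done {
			mark = "x"
		}
		fmt.Fprintf(&b, "- [%s] %s\n", mark, t.Title)
	}
	return b.String()
}

func main() {
	m := &TaskManager{}
	i, _ := m.Add("Wire proxy mode to the shared manager")
	_ = m.Complete(i)
	fmt.Print(m.Render())
}
```

Because every mutation goes through one mutex-guarded type, an MCP handler and a proxy handler that share the same `*TaskManager` cannot diverge in how they validate or serialize tasks.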
+### Prompt 2 — Intent Interpretation
+* Each tool call or proxy mutation maps to exactly one intent category (file_write, task_update, memory_write, diagnostics, mode declaration, etc.) so the server always knows what the LLM is formally asking for.
+* The server now exposes machine-readable intent metadata (category, minimum mode, recall requirement, scope, and side effects) for every tool so both MCP and proxy layers load the same contract instead of inferring intent from prose.
+* Validation uses that metadata and the shared intent gate (`ensureIntent`) to confirm the declared category exists, the requested mode meets the minimum, recall/authority/evidence prerequisites are satisfied, and any scope constraints (e.g., fact writes needing evidence or tinyTasks edits needing strict mode) hold; failed validation rejects the request with zero side effects.
+* MCP and proxy both consult this registry, so no mode-determining logic lives in prompts—the LLM is treated as making intent declarations, not executing actions.
+*Implementation evidence: `internal/intent/definition.go` defines every tool's metadata, `internal/server/tool_definitions.go` attaches it to each MCP tool, and `internal/server/mcp/server.go#ensureIntent` validates category, mode, and recall before every tool executes, so intent is derived from metadata, not agent prose.
+
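A rough Go sketch of the metadata-driven gate described in Prompt 2 follows. The name `ensureIntent` and the tool names `memory_set_mode` and `memory_write` come from the diff; everything else is an assumption made for illustration: the `Metadata` and `Session` structs, the function signature, the registry contents, and the ordering PASSIVE < GUARDED < STRICT. The real gate also checks authority, evidence, and scope, which are left out here.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode is ordered so that a higher value means stricter enforcement (assumed ordering).
type Mode int

const (
	Passive Mode = iota
	Guarded
	Strict
)

// Metadata is a hypothetical, trimmed-down version of the per-tool intent contract.
type Metadata struct {
	Category       string
	MinimumMode    Mode
	RequiresRecall bool
}

// registry maps tool names to metadata. The values are illustrative only.
var registry = map[string]Metadata{
	"memory_set_mode": {Category: "mode_declaration", MinimumMode: Passive},
	"memory_write":    {Category: "memory_write", MinimumMode: Guarded, RequiresRecall: true},
}

// Session holds the server-side state the gate checks against.
type Session struct {
	Mode       Mode
	RecallDone bool
}

var (
	ErrUnknownTool   = errors.New("no intent metadata registered for tool")
	ErrModeTooLow    = errors.New("declared mode is below the tool's minimum")
	ErrRecallMissing = errors.New("recall required before mutation")
)

// ensureIntent rejects a call before any side effect can happen.
// It reads only the registry and the session, never the agent's prose.
func ensureIntent(tool string, s Session) error {
	meta, ok := registry[tool]
	if !ok {
		return ErrUnknownTool
	}
	if s.Mode < meta.MinimumMode {
		return ErrModeTooLow
	}
	if meta.RequiresRecall && !s.RecallDone {
		return ErrRecallMissing
	}
	return nil
}

func main() {
	s := Session{Mode: Guarded}
	fmt.Println(ensureIntent("memory_write", s)) // recall required before mutation
	s.RecallDone = true
	fmt.Println(ensureIntent("memory_write", s)) // <nil>
}
```

Returning an error before the handler body runs is what makes "zero side effects" on failed validation straightforward to guarantee in a design like this.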
+### Prompt 3 — Unified Enforcement
+* All mutating requests—MCP tool calls and proxy mutations alike—flow through a single enforcement gate.
+* Enforcement decisions are deterministic, policy-driven, and executed on the server; prompt text is advisory, not authoritative.
+*Implementation evidence: `internal/server/mcp/server.go#ensureIntent`, `internal/execution/controller.go`, and `internal/enforcement/recorder.go` record mode compliance and enforcement events, and proxy mode reuses the same `execution.Controller`, so MCP and proxy share one deterministic gate.
+
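One way to picture the single gate in Prompt 3 is a controller that both entry points call and that records every decision. The sketch below is an assumption-laden illustration, not `execution.Controller` or the enforcement recorder themselves: the `Request`, `Event`, and `Controller` types and the `handleMCP` / `handleProxy` helpers are hypothetical names, and the policy is reduced to two boolean checks.

```go
package main

import "fmt"

// Request is a hypothetical normalized mutation request.
type Request struct {
	Source       string // "mcp" or "proxy"; recorded for audit, never used to decide
	Tool         string
	ModeDeclared bool
	RecallDone   bool
}

// Event is an illustrative audit record of one enforcement decision.
type Event struct {
	Source  string
	Tool    string
	Allowed bool
	Reason  string
}

// Controller is the single gate; it also acts as the recorder in this sketch.
type Controller struct {
	Events []Event
}

// Authorize applies the same policy to every request and logs the outcome.
func (c *Controller) Authorize(r Request) bool {
	reason := ""
	switch {
	case !r.ModeDeclared:
		reason = "mode not declared"
	case !r.RecallDone:
		reason = "recall not performed"
	}
	allowed := reason == ""
	c.Events = append(c.Events, Event{Source: r.Source, Tool: r.Tool, Allowed: allowed, Reason: reason})
	return allowed
}

// handleMCP and handleProxy differ only in the label they attach.
func handleMCP(c *Controller, r Request) bool {
	r.Source = "mcp"
	return c.Authorize(r)
}

func handleProxy(c *Controller, r Request) bool {
	r.Source = "proxy"
	return c.Authorize(r)
}

func main() {
	c := &Controller{}
	r := Request{Tool: "file_write", ModeDeclared: true}
	fmt.Println(handleMCP(c, r), handleProxy(c, r))
	for _, e := range c.Events {
		fmt.Printf("%s %s allowed=%t reason=%q\n", e.Source, e.Tool, e.Allowed, e.Reason)
	}
}
```

Since the decision depends only on the request fields and not on which front end delivered it, the two paths cannot disagree, which is the property the bullets above describe.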
+### Prompt 4 — Memory Governance
+* Agents submit structured memory proposals; the server decides what gets persisted.
+* The server validates recall/evidence/duplication rules, then assigns IDs, timestamps, and provenance before writing.
+* No memory write occurs unless all prerequisites are satisfied.
+*Implementation evidence: `internal/server/mcp/server.go#handleMemoryWrite` parses the structured JSON proposal, enforces recall/mode/evidence via `requireMode`/`ensureRecallBeforeMutation`, and only then persists to `memory.Service`, guaranteeing the server owns every memory mutation.
+
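The proposal flow in Prompt 4 might look roughly like the following. This is a sketch under assumptions, not `handleMemoryWrite` or `memory.Service`: the JSON field names, the `Proposal`, `Record`, and `Store` types, and the duplicate rule (same type and summary) are invented for illustration. The rule that facts need evidence is taken from the text; recall and mode checks are omitted to keep the example short.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// Proposal is what the agent submits; field names are hypothetical.
type Proposal struct {
	Type     string `json:"type"`
	Summary  string `json:"summary"`
	Evidence string `json:"evidence,omitempty"`
}

// Record is what the server persists. ID and CreatedAt are server-assigned.
type Record struct {
	ID        int
	CreatedAt time.Time
	Proposal
}

// Store is an in-memory stand-in for the persistence layer.
type Store struct {
	records []Record
}

var (
	ErrEvidence  = errors.New("fact requires evidence")
	ErrDuplicate = errors.New("duplicate memory")
)

// NewStore returns an empty store.
func NewStore() *Store { return &Store{} }

// Write validates a raw JSON proposal and persists it only if every check passes.
func (s *Store) Write(raw []byte) (Record, error) {
	var p Proposal
	if err := json.Unmarshal(raw, &p); err != nil {
		return Record{}, fmt.Errorf("invalid proposal: %w", err)
	}
	if p.Type == "fact" && p.Evidence == "" {
		return Record{}, ErrEvidence
	}
	for _, r := range s.records {
		if r.Type == p.Type && r.Summary == p.Summary {
			return Record{}, ErrDuplicate
		}
	}
	// Only the server sets identity and timestamp; the agent cannot supply them.
	rec := Record{ID: len(s.records) + 1, CreatedAt: time.Now(), Proposal: p}
	s.records = append(s.records, rec)
	return rec, nil
}

func main() {
	s := NewStore()
	_, err := s.Write([]byte(`{"type":"fact","summary":"proxy shares the gate"}`))
	fmt.Println(err)
	rec, err := s.Write([]byte(`{"type":"fact","summary":"proxy shares the gate","evidence":"internal/server/proxy/server.go"}`))
	fmt.Println(rec.ID, err)
}
```

Because `Proposal` has no ID or timestamp field, an agent that invents either in its JSON has it silently dropped at decode time, so provenance stays on the server side.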
+### Prompt 5 — Metadata as Protocol
+* Every tool carries machine-readable metadata that states its intent category, side effects, prerequisites, and allowed scope.
+* Enforcement consumes that metadata directly, not prose, so behavior is deterministic and auditable.
+* Tool descriptions stay short and focus on capabilities rather than policy.
+*Implementation evidence: `intent.Definition.Metadata` plus `server.ToolMetadata` publish the machine-readable intent schema, so enforcement consumes structured metadata while the tool descriptions in `internal/server/tool_definitions.go` remain concise.
+
+Together these prompts guarantee that `tinyTasks.md` cannot be modified by the LLM directly, no mutation occurs without server validation, MCP and proxy enforcement behave identically, gating is policy rather than conversation state, prompts can be deleted without breaking safety, and hallucinated success claims remain inert.
+
+
 **End of tinyMem Protocol**

CLAUDE.md

Lines changed: 8 additions & 267 deletions
@@ -1,271 +1,12 @@
 **Start of tinyMem Protocol**
+You operate under TinyMem governance.

-# TINYMEM AGENT CONTRACT (Governed — Task-Externalised)
-
-This contract governs all repository-related behavior when tinyMem is present.
-Non-compliance invalidates the response.
-
----
-
-## 0. Scope
-
-A request is **repository-related** if it touches:
-
-* code
-* files
-* documentation
-* configuration
-* architecture
-* tasks
-* planning
-* repository state
-
----
-
-## 1. Core Principle
-
-Observation is free.
-Sequencing is authority.
-Mutation is explicit.
-
----
-
-## 2. Tool Definitions (Authoritative)
-
-### Memory Recall
-
-* `memory_query`
-* `memory_recent`
-
-Available in ALL modes (implementing "Observation is free").
-Required before any mutation in GUARDED/STRICT modes.
-
-### Intent Declaration
-
-* `memory_set_mode`
-
-Required before any mutation.
-
-### Memory Write
-
-* `memory_write`
-
-The **only** permitted mechanism for durable memory.
-
-### Task Authority
-
-* `tinyTasks.md` in the project root
-* Optional task-authority helper tool
-
----
-
-## 3. Definitions
-
-### Observation
-
-Reading, inspecting, analyzing, summarizing, or asking questions.
-
-### Mutation
-
-Any durable state change, including:
-
-* writing or modifying files
-* creating, updating, or completing tasks
-* writing memory
-* promoting a claim to a fact, decision, or constraint
-
-### Task Authority
-
-`tinyTasks.md` is the single source of truth for task state.
-Task state must never be inferred.
-
-### Task Identification
-
-The moment the agent identifies, implies, or sequences more than one actionable step.
-
-This includes:
-
-* plans
-* approaches
-* checklists
-* ordered bullets
-* “first / then / next”
-* step-by-step reasoning
-
----
-
-## 4. Modes (Intent)
-
-You operate in exactly one mode:
-
-* **PASSIVE** — observation only
-* **GUARDED** — bounded, reversible mutation
-* **STRICT** — maximum caution, full enforcement
-
-Mode MUST be declared via `memory_set_mode` before mutation.
-
----
-
-## 5. Rule Set (Stable IDs)
-
-### R1 — Recall Before Mutation
-
-Memory recall tools (`memory_query`, `memory_recent`) are available in ALL modes (implementing "Observation is free").
-
-Before any mutation in GUARDED/STRICT modes, you MUST:
-
-* call `memory_query` or `memory_recent`
-* acknowledge the result (even if empty)
-
----
-
-### R2 — Task Externalisation Is Mandatory
-
-The agent may NOT hold a task list internally.
-
-If **Task Identification** occurs:
-
-1. All steps MUST be externalised into `tinyTasks.md`
-2. No mutation may occur until task authority is resolved
-
-If `tinyTasks.md` does NOT exist:
-
-* Create the inert template
-* Populate it with a proposed task list
-* STOP
-* Request the human to review, edit, reorder, or approve the proposed tasks
-
-Creation or population of `tinyTasks.md` does NOT authorize work.
-
-Planning in the response body is prohibited once this rule triggers.
-
-#### Task Proposal Allowance
-
-The agent MAY populate `tinyTasks.md` with a proposed task list.
-
-Proposed tasks are NOT authorized until a human:
-- confirms them explicitly, or
-- edits or reorders them, or
-- states approval in plain language
-
-The agent MUST stop after proposing tasks and wait for human authorization.
-
----
-
-### R3 — Tasks Are Authoritative
-
-If `tinyTasks.md` exists:
-
-* Continue the **first unchecked subtask**
-* If no unchecked subtasks exist, STOP and request user input
-
-The agent may NOT:
-
-* skip tasks
-* reorder tasks
-* redefine tasks
-* invent progress
-
----
-
-### R4 — Mutation Requires Intent
-
-Before any mutation, ALL of the following MUST be true, in order:
-
-1. R1 satisfied (memory recall in GUARDED/STRICT modes)
-2. Intent declared via `memory_set_mode`
-3. R2 satisfied (task externalised if required)
-4. R3 satisfied (task authority confirmed)
-
----
-
-### R5 — Durable Memory Is Tool-Only
-
-* Use `memory_write` only
-* Facts require evidence
-* Decisions and constraints require rationale
-* Never claim a memory write unless the tool succeeded
-
----
-
-### R6 — Fail Closed
-
-If recall, intent, task authority, or enforcement is uncertain:
-
-* Continue with observation only, OR
-* STOP and request user input
-
-Never guess. Never proceed optimistically.
-
----
-
-## 6. tinyTasks.md Templates
-
-### Inert Auto-Creation Template
-
-```md
-# Tasks — PROPOSED
->
-> These tasks were proposed by the agent.
-> No work is authorised until a human reviews and confirms them.
->
-## Tasks
-<!-- No tasks defined yet -->
-```
-
-### Active Task Structure
-
-```md
-# Tasks – <Goal>
-
-- [ ] Top-level task
-- [ ] Atomic subtask
-- [ ] Task definition and details
-```
-
-Rules:
-
-* Two levels only
-* Order matters
-* Unchecked == authorized *after human confirmation*
-
----
-
-## 7. Enforcement Expectations
-
-Expected to be enforceable at the boundary:
-
-* block mutation without recall
-* block mutation without intent
-* block mutation when tasks are required but missing
-* block mutation when tasks exist but none are unchecked
-* track violations for audit
-
-If enforcement is unavailable, self-enforce and fail closed (R6).
-
----
-
-## 8. Error Handling
-
-If a required tool fails:
-
-1. State the failure
-2. Retry up to 2 times
-3. STOP and request human intervention
-
----
-
-## 9. End-of-Response Checklist (When Mutation Occurs)
-
-Confirm explicitly:
-
-* recall completed in GUARDED/STRICT modes (R1)
-* mode declared (R4)
-* task authority resolved (R2, R3)
-* memory writes completed or not required (R5)
-
-Do not restate this contract.
-
----
+If a request would change files, tasks, or memory:
+- use TinyMem MCP tools
+- do not assume state
+- do not claim changes unless a tool call succeeds

+Tool calls are the only way to affect reality.
+If a tool call is blocked or rejected, adapt and retry.
+Otherwise, respond normally.
 **End of tinyMem Protocol**
