Graph Schema Specification — v1

Authoritative contract for git-mind's knowledge graph. All validators, importers, and views implement against this document. Ref: #180 (BDK-001)

This document is the graph contract, not the canonical product narrative. Some prefixes and examples reflect legacy manual-authoring and roadmap-oriented workflows that remain supported, but they are not the current center of the product story.


1. Schema Version

All import files must declare a schema version at the root level.

version: 1
  • Missing version → hard error.
  • Unknown version (e.g., version: 2) → hard error: "Unknown schema version 2. This version of git-mind supports version 1."
  • Version is checked before any other validation.
  • Minor forward-compatible keys marked optional in this spec may be added without bumping the version.
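The version gate described above can be sketched as follows. This is a minimal illustration, not git-mind's actual code: the function name `checkSchemaVersion`, the `SUPPORTED_VERSIONS` constant, and the wording of the missing-version message are assumptions. Only the unknown-version message is taken from this spec.

```javascript
// Hypothetical helper; names are illustrative, not git-mind's actual API.
const SUPPORTED_VERSIONS = [1];

function checkSchemaVersion(doc) {
  // A missing version is a hard error (message wording is an assumption).
  if (doc == null || doc.version === undefined) {
    throw new Error('Missing schema version.');
  }
  // An unknown version is a hard error; this wording follows the spec.
  if (!SUPPORTED_VERSIONS.includes(doc.version)) {
    throw new Error(
      `Unknown schema version ${doc.version}. ` +
      'This version of git-mind supports version 1.'
    );
  }
  return doc.version;
}
```

Because the check throws before returning, a caller that runs it first cannot reach any later validation step with an unsupported file.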

2. Node ID Grammar

Every node ID follows the format prefix:identifier.

Formal Grammar

node-id    = prefix ":" identifier
prefix     = LOWER ( LOWER | DIGIT | "-" )*
identifier = ID_CHAR+
LOWER      = [a-z]
DIGIT      = [0-9]
ID_CHAR    = [A-Za-z0-9._/@-]

Canonical Regex

JS regex literal (source of truth):

/^[a-z][a-z0-9-]*:[A-Za-z0-9._\/@-]+$/

JSON-string-safe pattern (for configs, tools, other languages):

^[a-z][a-z0-9-]*:[A-Za-z0-9._/@-]+$

The / does not require escaping outside JS regex literals.

Rules

| Rule | Constraint |
|---|---|
| Prefix casing | Always lowercase: milestone, not Milestone |
| Identifier casing | Case-preserving: BEDROCK stays BEDROCK |
| Max total length | 256 characters |
| Empty string | Invalid |
| Prefix-only | Invalid; milestone: is rejected (identifier required) |
| Colon-only | Invalid; a bare : is rejected |
| Whitespace | Invalid anywhere in the ID; no trimming, no normalization |
| Comparison | Exact byte/character match; no Unicode normalization |
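A validator that combines the canonical regex with the length rule might look like the sketch below. The regex is copied from this spec; the helper name `isValidNodeId` and the `MAX_ID_LENGTH` constant are illustrative assumptions.

```javascript
// Regex copied from this spec; helper and constant names are assumptions.
const NODE_ID = /^[a-z][a-z0-9-]*:[A-Za-z0-9._\/@-]+$/;
const MAX_ID_LENGTH = 256;

function isValidNodeId(id) {
  if (typeof id !== 'string') return false;
  // Empty strings and IDs longer than 256 characters are rejected.
  if (id.length === 0 || id.length > MAX_ID_LENGTH) return false;
  // No trimming or normalization: the raw string must match as-is.
  return NODE_ID.test(id);
}
```

The regex admits only ASCII characters, so for any ID that passes it the string length equals the character count.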

Cross-Repo IDs

Cross-repo IDs reference nodes in other repositories.

cross-repo-id  = "repo:" owner "/" name ":" prefix ":" identifier

Example: repo:neuroglyph/echo:crate:echo-core

Rules:

  • The repo: prefix is a system prefix — it cannot be used for regular nodes
  • extractPrefix returns the inner prefix (e.g., crate for repo:owner/name:crate:id)
  • Cross-repo IDs are valid in any context where a node ID is accepted
  • Use git mind link --remote <owner/name> to qualify local IDs as cross-repo
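The cross-repo grammar above can be parsed with a short helper. This is a sketch under stated assumptions: the spec names `extractPrefix` but does not show its implementation, `parseCrossRepoId` is a hypothetical helper, and the rule that owner and name are non-empty and contain no ":", "/", or whitespace is an assumption, since the spec does not define their character sets.

```javascript
// Local node ID regex, copied from this spec.
const NODE_ID = /^[a-z][a-z0-9-]*:[A-Za-z0-9._\/@-]+$/;

// Hypothetical parser for "repo:" owner "/" name ":" prefix ":" identifier.
// Assumption: owner and name are non-empty, with no ":", "/", or whitespace.
function parseCrossRepoId(id) {
  const parts = id.split(':');
  if (parts.length !== 4 || parts[0] !== 'repo') return null;
  const [, repo, prefix, identifier] = parts;
  const slash = repo.indexOf('/');
  const oneSlash = slash > 0 && slash < repo.length - 1 &&
    repo.indexOf('/', slash + 1) === -1;
  if (!oneSlash || /\s/.test(repo)) return null;
  const local = `${prefix}:${identifier}`;
  if (!NODE_ID.test(local)) return null;
  return {
    owner: repo.slice(0, slash),
    name: repo.slice(slash + 1),
    prefix,
    identifier,
    local,
  };
}

// Sketch of extractPrefix: returns the inner prefix for cross-repo IDs.
function extractPrefix(id) {
  const cross = parseCrossRepoId(id);
  if (cross) return cross.prefix;
  const colon = id.indexOf(':');
  return colon > 0 ? id.slice(0, colon) : null;
}
```

For the example in this section, `extractPrefix('repo:neuroglyph/echo:crate:echo-core')` yields `'crate'`, the same value it returns for the local ID `crate:echo-core`.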

3. Prefix Taxonomy

Node prefixes are grouped into five user-facing categories, plus a reserved system-generated group. Unknown prefixes produce a warning, not a hard error, which allows organic taxonomy growth without schema changes.

Project Management

| Prefix | Purpose | Example |
|---|---|---|
| milestone: | Major project phase or release | milestone:BEDROCK |
| feature: | Feature grouping within a milestone | feature:BDK-SCHEMA |
| task: | Atomic unit of work | task:BDK-001 |
| issue: | GitHub/tracker issue | issue:180 |
| pr: | Pull request referenced from repo artifacts | pr:308 |
| phase: | Project phase (alias for milestone in views) | phase:alpha |

Knowledge

| Prefix | Purpose | Example |
|---|---|---|
| spec: | Specification document | spec:graph-schema |
| adr: | Architecture decision record | adr:001-crdt-storage |
| doc: | General documentation | doc:ROADMAP |
| concept: | Abstract idea or principle | concept:zero-trust |
| decision: | Recorded decision | decision:use-warp |

Architecture

| Prefix | Purpose | Example |
|---|---|---|
| crate: | Internal package/module | crate:echo-core |
| module: | Software module | module:auth |
| pkg: | External dependency | pkg:chalk |
| file: | Source file or path | file:src/auth.js |

People & Tools

| Prefix | Purpose | Example |
|---|---|---|
| person: | Team member or stakeholder | person:james |
| tool: | Tool or service | tool:github-actions |

Observability

| Prefix | Purpose | Example |
|---|---|---|
| event: | Dated event or milestone | event:v2-launch |
| metric: | Measured quantity | metric:test-coverage |

System-Generated (Reserved)

| Prefix | Purpose | Example |
|---|---|---|
| commit: | Git commit (auto-generated by hooks) | commit:934b6e3 |
| repo: | Cross-repo qualifier namespace for remote node IDs | repo:owner/name:crate:echo-core |
| epoch: | System-generated temporal/version marker for time-travel | epoch:934b6e3 |

System prefix rules:

  • commit: nodes are created automatically by the post-commit hook.
  • User-provided commit: nodes in import files are rejected (hard error).
  • repo: is reserved for system-managed cross-repo qualification and should not be introduced ad hoc in import files.
  • epoch: nodes are created automatically by the epoch/timeline machinery and are not authored directly in import files.
  • Future system prefixes will use the sys- namespace (e.g., sys-audit:).
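The taxonomy above reduces to a three-way classification of a prefix. The sketch below uses the prefix lists from this spec; the function name `classifyPrefix` and its return values are illustrative assumptions.

```javascript
// Prefix lists taken from this spec.
const CANONICAL_PREFIXES = [
  'milestone', 'feature', 'task', 'issue', 'pr', 'phase',
  'spec', 'adr', 'doc', 'concept', 'decision',
  'crate', 'module', 'pkg', 'file',
  'person', 'tool',
  'event', 'metric',
];
const SYSTEM_PREFIXES = ['commit', 'repo', 'epoch'];

// Hypothetical classifier; the return values are illustrative.
function classifyPrefix(prefix) {
  if (SYSTEM_PREFIXES.includes(prefix)) return 'system';
  if (CANONICAL_PREFIXES.includes(prefix)) return 'canonical';
  // Unknown prefixes warrant a warning, not a hard error.
  return 'unknown';
}
```

A caller would typically log a warning for `'unknown'` and keep going, and reject `commit:` nodes that arrive through an import file.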

4. Edge Types

Eleven directed edge types are defined. Unknown edge types are a hard error.

| Type | Definition | Direction | Example |
|---|---|---|---|
| implements | Source implements the target | task → feature, code → spec | task:BDK-002 → feature:BDK-SCHEMA |
| augments | Source extends or enhances target | extension → base | module:auth-oauth → module:auth |
| relates-to | General semantic association | either direction | concept:zero-trust → adr:001 |
| references | Source explicitly references target | artifact → referenced artifact | doc:release-notes → issue:180 |
| touches | Source changes or directly touches target | commit → file | commit:934b6e3 → file:src/auth.js |
| groups | Source groups or contains target | parent → child | module:auth → file:src/auth.js |
| blocks | Source blocks progress on target | blocker → blocked | task:BDK-001 → task:BDK-002 |
| belongs-to | Source is a child/member of target | child → parent | feature:BDK-SCHEMA → milestone:BEDROCK |
| consumed-by | Source is consumed/used by target | resource → consumer | pkg:chalk → module:format |
| depends-on | Source depends on target | dependent → dependency | milestone:INTAKE → milestone:BEDROCK |
| documents | Source documents/describes target | doc → subject | doc:ROADMAP → milestone:BEDROCK |

Edge Constraints

| Constraint | Rule |
|---|---|
| Self-edges | Forbidden for blocks and depends-on. Other types allow self-reference. |
| Cycles | Warned in v1 for blocks and depends-on. Not a hard error, but views may behave unexpectedly. |
| Uniqueness | Edges are unique per (source, target, type) tuple. Re-adding an existing edge updates its properties. |
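The three constraints above could be checked along these lines. All names here are hypothetical, and the depth-first search is one common way to detect a cycle; the spec does not say which algorithm git-mind uses.

```javascript
// Edge types for which self-edges are forbidden, per this spec.
const NO_SELF_EDGE_TYPES = new Set(['blocks', 'depends-on']);

function isForbiddenSelfEdge(edge) {
  return edge.source === edge.target && NO_SELF_EDGE_TYPES.has(edge.type);
}

// Uniqueness key for the (source, target, type) tuple.
function edgeKey(edge) {
  return JSON.stringify([edge.source, edge.target, edge.type]);
}

// Depth-first search for a cycle among edges of a single type.
function hasCycle(edges, type) {
  const adjacency = new Map();
  for (const e of edges) {
    if (e.type !== type) continue;
    if (!adjacency.has(e.source)) adjacency.set(e.source, []);
    adjacency.get(e.source).push(e.target);
  }
  const state = new Map(); // node -> 'visiting' | 'done'
  const visit = (node) => {
    if (state.get(node) === 'visiting') return true; // back edge: cycle
    if (state.get(node) === 'done') return false;
    state.set(node, 'visiting');
    for (const next of adjacency.get(node) || []) {
      if (visit(next)) return true;
    }
    state.set(node, 'done');
    return false;
  };
  for (const node of adjacency.keys()) {
    if (visit(node)) return true;
  }
  return false;
}
```

Since cycles are only warned in v1, a caller would surface a `true` result from `hasCycle` as a warning and continue, while a `true` result from `isForbiddenSelfEdge` would be treated as an error.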

5. Edge Properties

Every edge carries three properties.

confidence

  • Type: finite number
  • Range: [0.0, 1.0] inclusive
  • Default: 1.0
  • Rejected: NaN, Infinity, -Infinity, non-number types, string representations (e.g., "0.9"). No coercion — hard error.
  • Precision: any finite float accepted, no clamping.
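The confidence rules above translate into a strict check with no coercion. This is a minimal sketch: the function name `validateConfidence`, the error classes, and the message wording are assumptions.

```javascript
// Hypothetical validator; name, error classes, and messages are assumptions.
function validateConfidence(value) {
  // An omitted confidence falls back to the default of 1.0.
  if (value === undefined) return 1.0;
  // No coercion: strings such as "0.9", NaN, and ±Infinity are rejected.
  if (typeof value !== 'number' || !Number.isFinite(value)) {
    throw new TypeError('confidence must be a finite number');
  }
  // Out-of-range values are rejected rather than clamped.
  if (value < 0 || value > 1) {
    throw new RangeError('confidence must be in [0.0, 1.0]');
  }
  return value;
}
```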

Confidence semantics:

| Range | Meaning |
|---|---|
| 1.0 | Verified by human |
| 0.8 | High confidence, not reviewed (e.g., commit directive) |
| 0.3–0.5 | Suggestion (AI-generated) |
| < 0.3 | Low confidence, needs review |

createdAt

  • Type: string
  • Format: ISO 8601 (e.g., "2026-02-10T14:30:00.000Z")
  • Set: automatically on edge creation
  • Mutability: immutable — preserved on upsert, never overwritten
  • Import behavior: if provided in import file, ignored and overwritten by the system

rationale

  • Type: string (optional)
  • Semantics: human-readable explanation of why this edge exists
  • Default: undefined (absent)

6. Node Properties

Node properties are arbitrary key-value pairs.

  • Keys: non-empty strings
  • Values: JSON-serializable (string, number, boolean, null, array, object)
  • Update: shallow merge — existing keys not mentioned are preserved
  • Non-existent node: setting properties on a node that does not exist returns an error and leaves the graph unchanged; the node is not created.
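The shallow-merge behaviour can be illustrated with a plain `Map` standing in for the graph. The store shape, the function name `setNodeProps`, and the result object are assumptions made for this sketch; git-mind's real storage layer is not shown in this document.

```javascript
// Hypothetical in-memory store: a Map from node ID to its property object.
function setNodeProps(nodes, id, props) {
  const existing = nodes.get(id);
  if (existing === undefined) {
    // Non-existent node: report an error and change nothing.
    return { ok: false, error: `Node not found: ${id}` };
  }
  if (Object.keys(props).some((key) => key === '')) {
    return { ok: false, error: 'Property keys must be non-empty strings' };
  }
  // Shallow merge: incoming keys replace existing ones; others are preserved.
  nodes.set(id, { ...existing, ...props });
  return { ok: true };
}
```

Because the merge is shallow, a nested object supplied in `props` replaces the stored object under that key wholesale; its inner keys are not merged.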

Reserved Property Names

The following property names are conventional (not enforced in v1):

| Name | Purpose | Example values |
|---|---|---|
| name | Human-readable display name | "BEDROCK", "Schema Contract" |
| type | Node category | "milestone", "feature", "task" |
| status | Work status | "pending", "in-progress", "done" |

Reserved properties are convention only in v1. No enforcement, no errors, no warnings. Future versions may promote them to enforced fields.


7. Update Semantics

Nodes

  • Upsert by default. Adding a node that already exists merges properties (shallow merge).
  • On a key conflict, the incoming value replaces the existing one (last write wins).
  • Duplicate node declarations in a single import file: hard error. Each node ID must appear at most once per import payload.

Edges

  • Upsert by (source, target, type) tuple. Re-adding an existing edge updates confidence and rationale.
  • createdAt is preserved (immutable).
  • Duplicate edge declarations in a single import file: hard error. Each (source, target, type) tuple must appear at most once per import payload.
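An edge upsert that preserves `createdAt` might be sketched as below. The `Map`-based store and the name `upsertEdge` are hypothetical. The sketch also assumes that fields omitted on a re-add fall back to their defaults; the spec does not say whether they should instead keep their previous values.

```javascript
// Hypothetical in-memory edge store keyed by the (source, target, type) tuple.
function upsertEdge(edges, edge, now = new Date()) {
  const key = JSON.stringify([edge.source, edge.target, edge.type]);
  const existing = edges.get(key);
  const next = {
    source: edge.source,
    target: edge.target,
    type: edge.type,
    // Assumption: an omitted confidence falls back to the default of 1.0.
    confidence: edge.confidence === undefined ? 1.0 : edge.confidence,
    // createdAt is set once by the system and preserved on later upserts;
    // any createdAt supplied by the caller is ignored.
    createdAt: existing ? existing.createdAt : now.toISOString(),
  };
  if (edge.rationale !== undefined) next.rationale = edge.rationale;
  edges.set(key, next);
  return existing ? 'updated' : 'created';
}
```

Calling `upsertEdge` twice with the same edge returns `'created'` and then `'updated'`, and the stored edge is identical after both calls.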

Idempotency

Applying the same valid import file twice produces identical graph state. The second application reports updates (not creates) with zero net changes.


8. Non-Examples

| Input | Rule Violated | Why It's Wrong |
|---|---|---|
| "" | Non-empty ID | Node ID must contain at least one character |
| milestone: | Identifier required | Prefix must be followed by a non-empty identifier |
| :foo | Valid prefix | Prefix must start with a lowercase letter |
| Milestone:BEDROCK | Lowercase prefix | Use milestone:BEDROCK; the prefix is always lowercase |
| my node | No whitespace | Whitespace is not allowed in node IDs |
| commit:abc123 (in import) | System prefix | commit: nodes cannot be created via import |
| task:A --[explodes]--> task:B | Known edge type | explodes is not one of the 11 valid edge types |
| task:X blocks task:X | Self-edge | blocks does not allow self-edges |
| confidence: 1.5 | Confidence range | Must be in [0.0, 1.0] |
| confidence: NaN | Finite number | NaN, Infinity, and non-numbers are rejected |
| confidence: "0.9" | No coercion | String "0.9" is not a number (hard error) |
| YAML without version: | Version required | All import files must declare version: 1 |
| version: 2 | Known version | Only version 1 is supported |

9. Import File Format

Import files use YAML with three root keys.

version: 1

nodes:
  - id: "milestone:BEDROCK"
    props:
      name: "BEDROCK"
      type: "milestone"
      theme: "Schema & Node Foundations"

  - id: "task:BDK-001"
    props:
      name: "Write GRAPH_SCHEMA.md"
      type: "task"
      estHours: 3

edges:
  - source: "task:BDK-001"
    target: "feature:BDK-SCHEMA"
    type: "implements"
    confidence: 1.0
    rationale: "Task delivers the schema spec"

  - source: "feature:BDK-SCHEMA"
    target: "milestone:BEDROCK"
    type: "belongs-to"

Validation Order

  1. Check version field (missing or unknown → hard error)
  2. Validate all node IDs (grammar, prefix warning)
  3. Check for duplicate node IDs within the file (→ hard error)
  4. Validate all edges (known types, valid source/target IDs, confidence range)
  5. Check for duplicate (source, target, type) tuples within the file (→ hard error)
  6. Check for commit: prefix in nodes (→ hard error)
  7. Reference validation: all edge sources/targets must exist (in file or in graph)
  8. If all pass → write in a single atomic patch
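The validation order above can be sketched as one function that takes an already-parsed import document. Everything here is illustrative: `validateImport`, the `graphIds` parameter (IDs already present in the graph), and the message wording are assumptions. For brevity the sketch omits cross-repo IDs and the self-edge and cycle checks.

```javascript
// Hypothetical validator sketch; names and message wording are assumptions.
const NODE_ID = /^[a-z][a-z0-9-]*:[A-Za-z0-9._\/@-]+$/;
const KNOWN_PREFIXES = new Set([
  'milestone', 'feature', 'task', 'issue', 'pr', 'phase',
  'spec', 'adr', 'doc', 'concept', 'decision',
  'crate', 'module', 'pkg', 'file', 'person', 'tool', 'event', 'metric',
  'commit', 'repo', 'epoch', // system prefixes: known, so no "unknown" warning
]);
const EDGE_TYPES = new Set([
  'implements', 'augments', 'relates-to', 'references', 'touches', 'groups',
  'blocks', 'belongs-to', 'consumed-by', 'depends-on', 'documents',
]);
const isId = (id) => typeof id === 'string' && id.length <= 256 && NODE_ID.test(id);
const badConfidence = (c) =>
  c !== undefined && (typeof c !== 'number' || !Number.isFinite(c) || c < 0 || c > 1);

function validateImport(doc, graphIds = new Set()) {
  const errors = [];
  const warnings = [];
  // 1. Version gate: checked before any other validation.
  if (doc.version === undefined) {
    return { ok: false, errors: ['Missing schema version.'], warnings };
  }
  if (doc.version !== 1) {
    const msg = `Unknown schema version ${doc.version}. ` +
      'This version of git-mind supports version 1.';
    return { ok: false, errors: [msg], warnings };
  }
  const nodes = doc.nodes || [];
  const edges = doc.edges || [];
  // 2. Node ID grammar; unknown prefixes only warn.
  for (const { id } of nodes) {
    if (!isId(id)) { errors.push(`Invalid node ID: ${id}`); continue; }
    const prefix = id.slice(0, id.indexOf(':'));
    if (!KNOWN_PREFIXES.has(prefix)) warnings.push(`Unknown prefix: ${prefix}`);
  }
  // 3. Duplicate node IDs within the file.
  const fileIds = new Set();
  for (const { id } of nodes) {
    if (fileIds.has(id)) errors.push(`Duplicate node: ${id}`);
    fileIds.add(id);
  }
  // 4. Edge types, endpoint IDs, confidence.
  for (const e of edges) {
    if (!EDGE_TYPES.has(e.type)) errors.push(`Unknown edge type: ${e.type}`);
    if (!isId(e.source)) errors.push(`Invalid source ID: ${e.source}`);
    if (!isId(e.target)) errors.push(`Invalid target ID: ${e.target}`);
    if (badConfidence(e.confidence)) errors.push(`Invalid confidence: ${e.confidence}`);
  }
  // 5. Duplicate (source, target, type) tuples within the file.
  const tuples = new Set();
  for (const e of edges) {
    const key = JSON.stringify([e.source, e.target, e.type]);
    if (tuples.has(key)) errors.push(`Duplicate edge: ${key}`);
    tuples.add(key);
  }
  // 6. User-authored commit: nodes are rejected.
  for (const { id } of nodes) {
    if (typeof id === 'string' && id.startsWith('commit:')) {
      errors.push(`Reserved prefix in import: ${id}`);
    }
  }
  // 7. Every edge endpoint must exist in the file or in the graph.
  for (const e of edges) {
    for (const end of [e.source, e.target]) {
      if (!fileIds.has(end) && !graphIds.has(end)) errors.push(`Unknown node: ${end}`);
    }
  }
  // 8. The caller writes a single atomic patch only when ok is true.
  return { ok: errors.length === 0, errors, warnings };
}
```

Collecting every error before returning, rather than stopping at the first, lets an author fix a file in one pass. Only the version check returns early, matching the rule that version is checked before any other validation.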

Appendix: Quick Reference

Node ID Regex

// JS regex literal
const NODE_ID = /^[a-z][a-z0-9-]*:[A-Za-z0-9._\/@-]+$/;

// JSON-safe string
// "^[a-z][a-z0-9-]*:[A-Za-z0-9._/@-]+$"

Prefix Array

const CANONICAL_PREFIXES = [
  'milestone', 'feature', 'task', 'issue', 'pr', 'phase',
  'spec', 'adr', 'doc', 'concept', 'decision',
  'crate', 'module', 'pkg', 'file',
  'person', 'tool',
  'event', 'metric',
];

Edge Type Array

const EDGE_TYPES = [
  'implements', 'augments', 'relates-to', 'references',
  'touches', 'groups', 'blocks',
  'belongs-to', 'consumed-by', 'depends-on', 'documents',
];

System Prefixes

const SYSTEM_PREFIXES = ['commit', 'repo', 'epoch'];