NotarAI

Intent captured. Drift reconciled.

NotarAI is a continuous intent reconciliation tool that keeps your specs, code, and documentation in sync as all three evolve. It uses LLMs as a bidirectional reconciliation engine: not just to generate code from specs, but to detect drift, surface conflicts, and propose updates across your entire artifact chain.

What is NotarAI?

Spec-anchored – Structured YAML specs capture intent as the canonical source of truth, validated by JSON Schema.

Bidirectional – Detects drift in any direction (code, spec, or docs) and proposes aligned updates.

Propose and approve – Never auto-syncs. All changes are proposed for human review.

Composable – Specs reference each other via $ref for hierarchical and cross-cutting composition.

Installation

Quick Install (Linux / macOS)

curl -fsSL https://raw.githubusercontent.com/davidroeca/NotarAI/main/scripts/install.sh | sh

This detects your OS and architecture, downloads the appropriate binary from GitHub Releases, and installs it to ~/.local/bin. If that directory is not in your PATH, the script will print a one-line export command to add it.

From crates.io

If you have Rust installed:

cargo install notarai

Manual Download

Download the binary for your platform from the latest release:

| Platform | Binary |
| --- | --- |
| Linux x86_64 (glibc) | notarai-x86_64-linux-gnu |
| Linux x86_64 (musl) | notarai-x86_64-linux-musl |
| Linux aarch64 (glibc) | notarai-aarch64-linux-gnu |
| Linux aarch64 (musl) | notarai-aarch64-linux-musl |
| macOS x86_64 | notarai-x86_64-macos |
| macOS aarch64 (Apple Silicon) | notarai-aarch64-macos |
| Windows x86_64 | notarai-x86_64-windows.exe |

Make the binary executable and move it to a directory in your PATH:

chmod +x notarai-*
mkdir -p ~/.local/bin
mv notarai-* ~/.local/bin/notarai

If ~/.local/bin is not already in your PATH, add this to your shell profile (~/.bashrc, ~/.zshrc, etc.):

export PATH="$HOME/.local/bin:$PATH"

From Source

git clone https://github.com/davidroeca/NotarAI
cd NotarAI
cargo build --release
# Binary is at target/release/notarai

Updating

If NotarAI is already installed, check for and install updates with:

notarai update

This detects how NotarAI was installed and acts accordingly — downloading a new binary for GitHub Release installs, or printing the appropriate cargo install command for Cargo installs. Use notarai update --check to check without installing.

NotarAI also prints a passive update hint on notarai validate and notarai init when a newer version is available (checked at most once every 24 hours).

Requirements

  • No runtime dependencies – NotarAI is a single static binary
  • Claude Code for reconciliation features (optional for validation-only usage)

Quick Start

Initialize your project

Run notarai init in your project root:

notarai init

This does several things:

  1. Adds a PostToolUse hook to .claude/settings.json so spec files are automatically validated when Claude Code writes or edits them.
  2. Copies the /notarai-reconcile skill to .claude/skills/ for drift detection.
  3. Copies the /notarai-bootstrap skill to .claude/skills/ for bootstrapping specs from an existing codebase.
  4. Copies notarai.spec.json to .notarai/notarai.spec.json so the schema is available for validation.
  5. Writes .notarai/README.md with workflow instructions.
  6. Replaces the ## NotarAI section in CLAUDE.md with a concise description of the workflow.
  7. Appends .notarai/.cache/ to .gitignore so the hash cache DB is never committed.
  8. Writes .mcp.json registering notarai mcp as a local MCP server, so MCP-accelerated reconciliation works out of the box.

Running init again is safe: it always refreshes skills and the schema copy, and replaces the ## NotarAI section in CLAUDE.md with the current content.

Create your first spec

Specs live in a .notarai/ directory at the root of your repository:

project/
  .notarai/
    system.spec.yaml
    auth.spec.yaml
    billing.spec.yaml
    _shared/
      security.spec.yaml
  src/
  docs/

Here’s a minimal spec:

# .notarai/auth.spec.yaml
schema_version: '0.6'

intent: |
  Users can sign up, log in, and reset passwords.
  Sessions expire after 30 min of inactivity.

behaviors:
  - name: 'signup'
    given: 'valid email + password (>= 12 chars)'
    then: 'account created, welcome email sent'
  - name: 'login'
    given: 'valid credentials'
    then: 'JWT issued, session created'

artifacts:
  code:
    - path: 'src/auth/**'
      role: 'primary implementation'
  docs:
    - path: 'docs/auth.md'

Validate specs

# Validate all spec files in .notarai/
notarai validate

# Validate a specific file
notarai validate .notarai/auth.spec.yaml

# Validate a directory
notarai validate .notarai/subsystems/

Output is PASS <file> or FAIL <file> with an indented error list. Exit code is 0 if all files pass, 1 if any fail.
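
For scripting around this output, the PASS/FAIL lines can be grouped with the errors listed under them. The sketch below is a hypothetical helper, not part of NotarAI. It assumes each result line starts with PASS or FAIL followed by a space and the file path, and that error lines are indented with whitespace; the sample text is made up for illustration.

```python
def parse_validate_output(text: str) -> dict:
    """Map each file path to its status and the indented errors listed under it."""
    results = {}
    current = None
    for line in text.splitlines():
        if line.startswith(("PASS ", "FAIL ")):
            status, path = line.split(" ", 1)
            current = path.strip()
            results[current] = {"status": status, "errors": []}
        elif current is not None and line[:1].isspace() and line.strip():
            # Indented, non-empty line: an error belonging to the last file seen.
            results[current]["errors"].append(line.strip())
    return results


# Made-up sample in the assumed format; not captured from a real run.
sample = (
    "PASS .notarai/auth.spec.yaml\n"
    "FAIL .notarai/billing.spec.yaml\n"
    "  missing required field: intent\n"
)
report = parse_validate_output(sample)
failed = [path for path, result in report.items() if result["status"] == "FAIL"]
print(failed)  # ['.notarai/billing.spec.yaml']
```

In a CI job the exit code alone is usually enough; parsing is only useful when the per-file errors need to be reported elsewhere.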

Update NotarAI

Check for and install updates:

notarai update

NotarAI will also print a hint when a newer version is available during validate or init.

Bump schema version

When you upgrade to a new version of NotarAI, update all spec files with:

notarai schema-bump

This overwrites .notarai/notarai.spec.json with the bundled schema and updates the schema_version field in every .notarai/*.spec.yaml file.

Bootstrap from an existing codebase

Use the /notarai-bootstrap skill in Claude Code to generate specs from your existing code via a structured developer interview.

Detect drift

Use the /notarai-reconcile skill in Claude Code to detect drift between specs and code, and propose aligned updates.

Spec Format Reference

Specs are YAML files validated against a JSON Schema (notarai.spec.json). The format uses progressive disclosure: a small set of required fields for minimum viability, with optional fields for precision as needed.

Required fields

schema_version

Pins the JSON Schema version. Current version: "0.7". Versions "0.6" and "0.5" are also accepted for backward compatibility.

schema_version: '0.7'

intent

Natural language description of what the system or feature should do.

intent: |
  Users can sign up, log in, and reset passwords.
  Sessions expire after 30 min of inactivity.

behaviors

Structured Given/Then entries describing expected behavior. Each behavior has a name, a given condition, and a then outcome. Required for full tier specs; optional for registered and derived tier specs.

behaviors:
  - name: 'signup'
    given: 'valid email + password (>= 12 chars)'
    then: 'account created, welcome email sent'
  - name: 'session_timeout'
    given: '30 min inactivity'
    then: 'session invalidated'

Behaviors may also include optional interaction and state_transition sub-fields:

behaviors:
  - name: 'submit_form'
    given: 'user submits a valid form'
    then: 'data saved, confirmation shown'
    interaction:
      trigger: user_action # user_action | timer | system_event | data_change | schedule | external_signal | threshold | manual | lifecycle
      sequence:
        - validate fields
        - post to API
        - show confirmation
    state_transition:
      from: editing
      to: confirmed

artifacts

Glob patterns mapping the spec to the files it governs. The schema accepts any string as a category key. Convention categories:

| Category | When to use |
| --- | --- |
| code | Source code |
| docs | Documentation |
| tests | Test files |
| slides | Presentation files |
| data | Data files, CSVs |
| configs | Configuration, IaC |
| notebooks | Jupyter/R notebooks |
| assets | Media, images, fonts |
| templates | Reusable templates |
| schemas | Data schemas, API specs |

artifacts:
  code:
    - path: 'src/auth/**'
      role: 'primary implementation'
  docs:
    - path: 'docs/auth.md'
  tests:
    - path: 'tests/auth/**'

Each artifact ref may include an optional integer tier override (1-4) for files that belong to a different tier than the spec itself:

artifacts:
  code:
    - path: 'dist/bundle.js'
      tier: 4 # derived output — tracked for staleness, not authored directly
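
To make the path-to-spec mapping concrete, here is a minimal sketch of how a changed file could be matched against a spec's artifact globs. It is an illustration only: the glob rules below ("**" spans directories, "*" stays within one path segment) are an assumption, and NotarAI's actual matcher may behave differently. The names glob_to_regex and categories_for are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern":
    """Translate a path glob into a regex: '**' spans directories, '*' does not."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")  # anything, including '/'
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")  # exactly one character within a segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def categories_for(path: str, artifacts: dict) -> list:
    """Return the artifact categories whose globs match the given file path."""
    return [
        category
        for category, refs in artifacts.items()
        if any(glob_to_regex(ref["path"]).fullmatch(path) for ref in refs)
    ]


# Same shape as the artifacts block above, written as a Python dict.
artifacts = {
    "code": [{"path": "src/auth/**", "role": "primary implementation"}],
    "docs": [{"path": "docs/auth.md"}],
    "tests": [{"path": "tests/auth/**"}],
}
print(categories_for("src/auth/jwt/sign.rs", artifacts))  # ['code']
```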

Optional fields

constraints

Rules the system must follow.

constraints:
  - 'rate limit: 5 login attempts per minute per IP'
  - 'passwords must be >= 12 characters'

invariants

Conditions that must never be violated.

invariants:
  - 'no plaintext passwords stored anywhere'
  - 'all API responses include request-id header'

decisions

Architectural decision log with date, choice, and rationale.

decisions:
  - date: '2025-01-15'
    choice: 'JWT over session cookies'
    rationale: 'Stateless auth simplifies horizontal scaling'

open_questions

Unresolved design questions.

open_questions:
  - 'Should we support OAuth providers beyond Google?'
  - "What's the session timeout for mobile clients?"

dependencies

References to other specs this one interacts with.

dependencies:
  - $ref: 'billing.spec.yaml'
    relationship: 'auth gates billing endpoints'

notes

Freeform hints for the LLM about implicit relationships.

notes: |
  The auth module shares a rate limiter with the API gateway.
  Session storage is Redis in production, in-memory in dev.

output

Describes what the spec ultimately produces. Useful for non-software artifacts like presentations or reports.

output:
  type: presentation # app | presentation | interactive-doc | game | dashboard | report | library | service | document | course | api | infrastructure | dataset | design-system | campaign | template
  format: pptx
  runtime: static-file # browser | native | static-file | embedded | server
  entry_point: dist/deck.pptx

content

Describes the output’s logical structure in content terms (slides, scenes, sections) rather than file terms.

content:
  structure: graph # ordered | hierarchical | graph | free-form
  sections:
    - id: level_1
      type: scene
      intent: 'Tutorial level introducing movement mechanics'
      duration: { value: 5, unit: minutes }
      connections:
        - to: level_2
          label: completion
        - to: game_over
          label: player_death
      depends_on:
        - id: intro_cutscene
          relationship: 'must complete before this section unlocks'
      evidence:
        - type: data
          source: playtests/run_3.csv
          claim: '85% of players complete within 5 minutes'

states

Top-level state machine definition for interactive artifacts.

states:
  initial: idle
  definitions:
    - id: idle
      transitions:
        - to: running
          on: start
          guard: 'all required fields are populated'
          action: 'initialize timer, log start event'
    - id: running
      transitions:
        - to: idle
          on: stop
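
A state machine like this is easy to sanity-check mechanically. The sketch below is a hypothetical lint, not something the source says NotarAI performs: it checks that the initial state and every transition target refer to a defined state id. The states block is written as a Python dict to keep the example dependency-free.

```python
def undefined_state_refs(states: dict) -> list:
    """List references to state ids that have no entry under definitions."""
    definitions = states.get("definitions", [])
    defined = {definition["id"] for definition in definitions}
    problems = []
    if states.get("initial") not in defined:
        problems.append(f"initial state {states.get('initial')!r} is not defined")
    for definition in definitions:
        for transition in definition.get("transitions", []):
            if transition["to"] not in defined:
                problems.append(
                    f"{definition['id']} -> {transition['to']}: target state is not defined"
                )
    return problems


# The states example above, as a dict.
states = {
    "initial": "idle",
    "definitions": [
        {"id": "idle", "transitions": [{"to": "running", "on": "start"}]},
        {"id": "running", "transitions": [{"to": "idle", "on": "stop"}]},
    ],
}
print(undefined_state_refs(states))  # []
```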

design

Visual and design specifications for brand-governed artifacts.

design:
  theme:
    palette: ['#1a1a2e', '#16213e']
    typography:
      heading: Inter
      body: Roboto
    modes:
      light: { palette: ['#ffffff', '#f0f0f0'] }
      dark: { palette: ['#1a1a2e', '#16213e'] }
  layout:
    type: paginated # slide-deck | scrolling | spatial | grid | free-form | paginated | canvas | timeline | tabbed
    dimensions: letter
  print:
    margins: { top: '1in', right: '1in', bottom: '1in', left: '1in' }
    headers: true
    footers: true
    page_numbers: true
  responsive:
    breakpoints:
      - name: mobile
        max_width: 768
        layout_override: scrolling
      - name: desktop
        min_width: 769

audience

Context about who the output is for.

audience:
  role: 'Series B investors'
  assumed_knowledge: 'Familiar with SaaS metrics, not technical infrastructure'
  tone: formal-but-engaging
  locale: en-US
  accessibility:
    - high-contrast
    - screen-reader-friendly

variants

Multiple versions of the same artifact with selective field overrides.

variants:
  - id: investor-deck
    description: 'Condensed version for investor meetings'
    overrides:
      audience.role: 'Series B investors'
  - id: engineering-deep-dive
    description: 'Full technical version for the eng team'

Variants are declarative metadata by default. Set variants_resolved: true at the spec top level to opt in to programmatic override resolution (scalar replacement, array replacement with + prefix for append, deep merge for objects, null to clear).
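
The four resolution rules can be sketched as follows. This is one possible reading, not NotarAI's implementation. It assumes that dotted keys such as audience.role address nested fields, and that the + prefix goes on the override key (for example +audience.accessibility) to append to an existing array instead of replacing it. The function names are hypothetical.

```python
import copy


def deep_merge(base: dict, patch: dict) -> dict:
    """Recursively merge patch into a copy of base; patch wins on conflicts."""
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_overrides(spec: dict, overrides: dict) -> dict:
    """Return a resolved copy of spec with one variant's overrides applied."""
    resolved = copy.deepcopy(spec)
    for raw_key, value in overrides.items():
        append = raw_key.startswith("+")  # assumed meaning of the + prefix
        parts = (raw_key[1:] if append else raw_key).split(".")
        node = resolved
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        leaf = parts[-1]
        if value is None:
            node.pop(leaf, None)  # null clears the field
        elif append and isinstance(node.get(leaf), list):
            node[leaf] = node[leaf] + (value if isinstance(value, list) else [value])
        elif isinstance(value, dict) and isinstance(node.get(leaf), dict):
            node[leaf] = deep_merge(node[leaf], value)  # objects are deep-merged
        else:
            node[leaf] = copy.deepcopy(value)  # scalars and arrays are replaced
    return resolved


base = {
    "audience": {"role": "engineers", "tone": "casual", "accessibility": ["high-contrast"]},
    "output": {"type": "presentation", "format": "pptx"},
}
overrides = {
    "audience.role": "Series B investors",
    "+audience.accessibility": ["screen-reader-friendly"],
    "audience.tone": None,
    "output": {"format": "pdf"},
}
resolved = apply_overrides(base, overrides)
```

The base spec is deep-copied first, so resolving one variant never mutates the shared spec that other variants start from.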

pipeline

Describes the build or generation process for the output artifact.

pipeline:
  env:
    NODE_ENV: production
  steps:
    - name: compile
      tool: tsc
      input: 'src/**/*.ts'
      output: dist/
      condition: "output.format == 'web'"
    - name: export_pdf
      command: 'pandoc input.md -o output.pdf'
      condition: "output.format == 'pdf'"
      on_failure: skip
      depends_on: [compile]
      env:
        PANDOC_DATA_DIR: ./templates
  preview:
    command: npx serve dist/
    url: 'http://localhost:3000'

feedback

Connects output performance metrics back to the spec for reconciliation triggers.

feedback:
  metrics:
    - name: avg_completion_rate
      source: analytics/completion.csv
      threshold: '>= 0.7'
    - name: build_time
      threshold: '< 5s'
  triggers:
    - condition:
        metric: avg_completion_rate
        operator: below_threshold
        duration: { value: 3, unit: days }
      action: reconcile
      priority: high

Note: reconciliation_trigger (free-form string) is deprecated in favor of triggers but still accepted.

compliance

Maps invariants and constraints to regulatory or standards frameworks. The reconciliation engine verifies that framework-required invariants still exist in the spec.

compliance:
  frameworks:
    - name: SOC2
      controls:
        - id: CC6.1
          satisfied_by:
            invariants: ['no plaintext passwords stored anywhere']
            constraints: ['rate limit: 5 login attempts per minute per IP']
    - name: WCAG
      level: AA
      satisfied_by:
        invariants: ['all interactive elements have visible focus indicators']
  audit_trail: true
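
The existence check described above can be approximated in a few lines. This is a sketch under assumptions: it matches invariant and constraint text by exact string equality (the real engine is LLM-driven and may match more loosely), and it accepts satisfied_by either on a control or directly on a framework, as both shapes appear in the example. The function name is hypothetical.

```python
def missing_compliance_refs(spec: dict) -> list:
    """Return (framework, kind, text) for each satisfied_by entry absent from the spec."""
    present = {
        "invariants": set(spec.get("invariants", [])),
        "constraints": set(spec.get("constraints", [])),
    }
    missing = []
    for framework in spec.get("compliance", {}).get("frameworks", []):
        # satisfied_by may sit on the framework itself or on each control.
        holders = [framework] + framework.get("controls", [])
        for holder in holders:
            for kind, texts in holder.get("satisfied_by", {}).items():
                for text in texts:
                    if text not in present.get(kind, set()):
                        missing.append((framework["name"], kind, text))
    return missing


# A spec where the rate-limit constraint was deleted but is still cited by SOC2.
spec = {
    "invariants": ["no plaintext passwords stored anywhere"],
    "constraints": [],
    "compliance": {
        "frameworks": [
            {
                "name": "SOC2",
                "controls": [
                    {
                        "id": "CC6.1",
                        "satisfied_by": {
                            "invariants": ["no plaintext passwords stored anywhere"],
                            "constraints": ["rate limit: 5 login attempts per minute per IP"],
                        },
                    }
                ],
            }
        ]
    },
}
gaps = missing_compliance_refs(spec)
```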

Coverage tiers

Every file in the repo is expected to fall into one of four tiers:

  • Tier 1 (Full) — Business logic, APIs, user-facing features. Full behavioral specification required.
  • Tier 2 (Registered) — Utilities, config, sidecars. Intent and artifact mapping only; behaviors not required.
  • Tier 3 (Excluded) — Explicitly out of scope. Declared via exclude globs on the system spec.
  • Tier 4 (Derived) — Generated outputs tracked for staleness but not authored directly (e.g., build artifacts, compiled bundles). Use tier: derived on the spec or tier: 4 on individual artifact refs.

Files not covered by any tier are flagged as “unspecced” — a lint warning, not a blocker.

Set the spec-level tier with the tier field:

tier: registered # full (default) | registered | derived

Composition

Specs compose via $ref (borrowed from JSON Schema/OpenAPI):

  • subsystems — hierarchical references (system → services)
  • applies — cross-cutting specs (e.g., security, logging) that apply to all subsystems

A top-level system.spec.yaml serves as the manifest, referencing subsystem specs and declaring exclusion patterns for Tier 3 files.
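
Because specs reference each other, the references form a graph, and a circular chain of $ref entries would make composition ambiguous. The sketch below is a hypothetical check, not NotarAI code: it takes a mapping from each spec file to the spec files it references and reports the first cycle found with a depth-first search. The file names follow the directory layout shown earlier; the reference edges are invented for the example.

```python
def find_ref_cycle(refs: dict):
    """Return one circular reference chain as a list of spec names, or None."""
    visiting = []  # current DFS path, in order
    done = set()   # specs whose references are fully explored

    def visit(name):
        if name in done:
            return None
        if name in visiting:
            # Back edge: slice the path from the first occurrence to close the loop.
            return visiting[visiting.index(name):] + [name]
        visiting.append(name)
        for target in refs.get(name, []):
            cycle = visit(target)
            if cycle:
                return cycle
        visiting.pop()
        done.add(name)
        return None

    for name in refs:
        cycle = visit(name)
        if cycle:
            return cycle
    return None


# Hypothetical reference edges between the specs in the example layout.
acyclic = {
    "system.spec.yaml": ["auth.spec.yaml", "billing.spec.yaml", "_shared/security.spec.yaml"],
    "auth.spec.yaml": ["billing.spec.yaml"],
}
cyclic = {**acyclic, "billing.spec.yaml": ["auth.spec.yaml"]}

print(find_ref_cycle(acyclic))  # None
print(find_ref_cycle(cyclic))   # ['auth.spec.yaml', 'billing.spec.yaml', 'auth.spec.yaml']
```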

Reconciliation

How reconciliation works

The reconciliation engine detects three scenarios:

1. Someone edits code

The engine detects that code has drifted from the spec and proposes spec and doc updates.

2. Someone edits spec

The engine propagates the spec change to code and documentation.

3. Conflict

Code says one thing, the spec says another. The engine surfaces the disagreement and the user decides which is correct.

The system is always propose-and-approve, never auto-sync. Both users and LLMs can edit everything; the spec is the tiebreaker.

Using reconciliation

After running notarai init, use the /notarai-reconcile slash command in Claude Code to trigger a reconciliation pass.

The reconciliation engine uses the notarai MCP server to serve pre-filtered data, keeping context usage proportional to what actually changed:

  1. Calls list_affected_specs to identify which specs govern changed files.
  2. For each affected spec, calls get_spec_diff to get only the diff for files that spec governs. Files already reconciled (per the BLAKE3 hash cache) are skipped and listed in the skipped field. Pass exclude_patterns to suppress noisy files like lockfiles; pass bypass_cache: true to force a full diff without clearing the cache.
  3. Loads any applies cross-cutting specs and merges their invariants and constraints into the analysis.
  4. Notes any dependencies refs for ripple-effect analysis.
  5. Calls get_changed_artifacts to get only doc artifacts that changed since the last reconciliation.
  6. Reads only those files, analyzes drift against the spec’s behaviors, constraints, and invariants.
  7. Proposes targeted updates to bring spec, code, and docs back into alignment.
  8. Calls mark_reconciled to update the hash cache for the next run.

See the MCP Server reference for full tool parameters and return shapes.

If the MCP server is unavailable, the command falls back to a manual flow using git diff directly.

Automatic validation

After notarai init, spec files are validated automatically whenever Claude Code writes or edits a file in .notarai/. Invalid specs block the tool use with errors on stderr. Non-spec files are ignored silently.

Specs vs Claude Rules

NotarAI specs and Claude rules (CLAUDE.md / .claude/rules/) both express project conventions, but they serve different purposes and trigger at different times. This guide explains when to use each – and when to use both.

Decision framework

| Use a spec when… | Use a Claude rule when… |
| --- | --- |
| The concern describes what artifacts must look like | The concern describes how Claude should work |
| You want reconciliation to detect drift retroactively | You want to prevent violations proactively |
| The rule maps to files you can diff against | The rule is about process, workflow, or tool usage |
| Cross-cutting specs (applies) can propagate it | The convention only matters during generation |

Use a spec

Specs are the right home for artifact-facing rules – invariants, constraints, and behaviors that describe what code, docs, or configs should look like. The reconciliation engine diffs artifacts against these rules and proposes fixes when they drift.

Examples from this project:

  • “American English throughout” – style.spec.yaml catches existing files that use British spellings
  • “The engine must never silently auto-modify code” – an invariant in system.spec.yaml that reconciliation checks against code changes
  • “CLI validates spec files against bundled JSON Schema” – a behavior in cli.spec.yaml tied to source files

Cross-cutting specs (referenced via applies in the system spec) propagate invariants and constraints across all subsystems without duplication.

Use a Claude rule

Claude rules are the right home for workflow-facing instructions – how Claude should run commands, what tools to prefer, what process to follow. These have no artifact to reconcile against; they shape how Claude works, not what the output looks like.

Examples:

  • “Tests use cargo test” – tells Claude which command to run
  • “When bumping schema version, update these five files” – a checklist for a multi-step process
  • “Unit tests are inline #[cfg(test)] modules” – convention for where to put new tests

These belong in .claude/rules/ files (or CLAUDE.md) because there is no meaningful way to diff project files against them.

Use both

Some conventions benefit from both proactive prevention and retroactive detection. Style rules are the classic example:

  • Claude rule prevents new violations: Claude follows the rule as it generates code, so new files are correct from the start.
  • Spec catches existing drift: reconciliation scans all governed files and flags violations that predate the rule or were introduced by humans.

This is intentional duplication, not redundancy. The two mechanisms cover different failure modes.

Examples from this project:

| Convention | Claude rule | Spec |
| --- | --- | --- |
| American English | .claude/rules/style.md | style.spec.yaml |
| QWERTY-typable characters | .claude/rules/style.md | style.spec.yaml |

Anti-patterns

Don’t put process instructions in specs. A spec behavior like “given a schema version bump, then update these five files” has no artifact to diff against. It belongs in a Claude rule or checklist.

Don’t put formal behavioral specs in Claude rules. A rule like “the CLI must validate spec files against the bundled schema” is a testable behavior. If it lives only in CLAUDE.md, reconciliation can’t detect when code drifts away from it.

Don’t duplicate without purpose. If a convention only needs proactive prevention (e.g., “run prettier on generated code”), a Claude rule is sufficient. If it only needs retroactive detection (e.g., “no circular $ref chains”), a spec invariant is sufficient. Use both only when both failure modes are real.

Non-Software Examples

NotarAI works for any artifact with intent, not just code. These examples show how the schema applies to presentations, legal documents, and research reports.


Presentation spec

A conference talk governed for audience alignment and slide drift.

schema_version: '0.7'
domain: presentation

intent: >
  A 30-minute conference talk introducing NotarAI to developers unfamiliar with
  spec-driven workflows. Attendees should leave understanding the three-body drift
  problem and how to run notarai init on their own project.

behaviors:
  - name: opening_hook
    given: 'speaker takes the stage'
    then: 'the intro slide presents a relatable drift scenario in under 90 seconds'
  - name: demo_live
    given: 'the demo section'
    then: 'speaker runs notarai init and reconcile live on a sample repo; audience sees a real drift report'

audience:
  role: 'mid-to-senior developers at a software conference'
  assumed_knowledge: 'Familiar with git, CI/CD, and code review workflows; may not know NotarAI'
  tone: formal-but-engaging
  locale: en-US

output:
  type: presentation
  format: pptx
  runtime: static-file
  entry_point: dist/talk.pptx

content:
  structure: ordered
  sections:
    - id: intro
      type: slide
      intent: 'Hook: show a real drift incident and its cost'
      duration: { value: 3, unit: minutes }
    - id: problem
      type: slide
      intent: 'Explain the three-body drift problem (spec, code, docs)'
      duration: { value: 5, unit: minutes }
    - id: demo
      type: interactive
      intent: 'Live notarai init + reconcile demo on a sample repo'
      duration: { value: 10, unit: minutes }
      content_ref: demo/sample-repo/
    - id: takeaways
      type: slide
      intent: 'Three action items the audience can do today'
      duration: { value: 2, unit: minutes }

design:
  theme:
    palette: ['#0f172a', '#6366f1', '#ffffff']
    typography:
      heading: Inter
      body: Inter
  layout:
    type: slide-deck
    dimensions: '16:9'

artifacts:
  slides:
    - path: 'slides/**/*.md'
      role: 'slide source content'
  assets:
    - path: 'assets/**'
      role: 'images and diagrams'

What this demonstrates: output.type: presentation, content.sections with duration, audience, and design. The reconciliation engine uses duration to detect if the talk now runs over time, and intent per section to detect off-message slides.
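
The time-budget check amounts to simple arithmetic over the section durations. The sketch below is an illustration, not the engine's actual logic: it sums the durations from the spec above and compares them with the 30 minutes stated in the intent. The budget is passed in by hand because the spec has no structured field for it, and the unit table is an assumption beyond the minutes used here.

```python
# Assumed conversion table; the example spec only uses "minutes".
UNIT_MINUTES = {"seconds": 1 / 60, "minutes": 1, "hours": 60}


def total_minutes(sections: list) -> float:
    """Sum the duration of every section that declares one, in minutes."""
    return sum(
        section["duration"]["value"] * UNIT_MINUTES[section["duration"]["unit"]]
        for section in sections
        if "duration" in section
    )


# Durations copied from content.sections in the presentation spec.
sections = [
    {"id": "intro", "duration": {"value": 3, "unit": "minutes"}},
    {"id": "problem", "duration": {"value": 5, "unit": "minutes"}},
    {"id": "demo", "duration": {"value": 10, "unit": "minutes"}},
    {"id": "takeaways", "duration": {"value": 2, "unit": "minutes"}},
]
budget = 30  # from the intent: "A 30-minute conference talk"
planned = total_minutes(sections)
print(planned, planned <= budget)  # 20 True
```

With 20 of 30 minutes allocated, the talk has room for questions; adding a 15-minute section would push the total to 35 and exceed the budget.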


Legal document spec

A service agreement governed for compliance and clause integrity.

schema_version: '0.7'
domain: legal

intent: >
  A standard SaaS service agreement for enterprise customers. Governs payment
  terms, liability limits, data processing obligations, and termination rights.
  The spec tracks which clauses satisfy which regulatory requirements so that
  removing or weakening a clause triggers a compliance drift alert.

behaviors:
  - name: data_processing
    given: 'customer data is processed by the service'
    then: 'the DPA clause defines processing purposes, data categories, and sub-processor obligations per GDPR Article 28'
  - name: liability_cap
    given: 'a dispute arises'
    then: 'liability is capped at 12 months of fees paid, except for gross negligence or data breach'

constraints:
  - 'All clause changes must be reviewed by legal counsel before execution'
  - 'Governing law must match the entity jurisdiction for each signed copy'

invariants:
  - 'The DPA clause must never be removed from the agreement'
  - 'Liability cap language must reference the specific cap amount'

compliance:
  frameworks:
    - name: GDPR
      controls:
        - id: Art28
          satisfied_by:
            invariants:
              ['The DPA clause must never be removed from the agreement']
    - name: SOC2
      controls:
        - id: CC9.2
          satisfied_by:
            constraints:
              [
                'All clause changes must be reviewed by legal counsel before execution',
              ]
  audit_trail: true

output:
  type: document
  format: pdf

content:
  structure: ordered
  sections:
    - id: definitions
      type: clause
      intent: 'Define all capitalized terms used in the agreement'
    - id: services
      type: clause
      intent: 'Describe the scope and delivery of services'
    - id: payment
      type: clause
      intent: 'Payment terms, invoicing cycle, and late payment penalties'
    - id: dpa
      type: clause
      intent: 'Data Processing Agreement per GDPR Article 28'
      depends_on:
        - id: definitions
          relationship: 'References defined terms for data categories and processing'
    - id: liability
      type: clause
      intent: 'Limit liability to 12 months fees; carve out gross negligence and data breach'
    - id: termination
      type: clause
      intent: 'Termination for convenience (30 days notice) and for cause (material breach)'

design:
  layout:
    type: paginated
    dimensions: letter
  print:
    margins: { top: '1in', right: '1in', bottom: '1in', left: '1in' }
    headers: true
    footers: true
    page_numbers: true

artifacts:
  docs:
    - path: 'contracts/service-agreement.md'
      role: 'master agreement source'
  configs:
    - path: 'contracts/variables.yaml'
      role: 'per-customer variable substitutions (entity name, jurisdiction, fees)'

What this demonstrates: domain: legal, compliance.frameworks with control mappings, content.sections with type: clause and depends_on, design.print for paginated layout. The compliance block creates an explicit link between the GDPR requirement and the DPA clause – if someone removes the DPA clause, the reconciliation engine flags it as a high-priority drift event.


Research report spec

An evidence-backed technical report governed for citation integrity.

schema_version: '0.7'
domain: research

intent: >
  A technical report evaluating three approaches to LLM-assisted code review:
  prompt-only, RAG-augmented, and spec-anchored. Reports accuracy, latency, and
  reviewer acceptance metrics from a 90-day study across 12 repositories.

behaviors:
  - name: methodology_reproducible
    given: 'a reader follows the methodology section'
    then: 'they can reproduce the experimental setup using the linked code and dataset'
  - name: results_traceable
    given: 'a claim appears in the results section'
    then: 'it is linked to a specific row or aggregate in the dataset'

constraints:
  - 'All quantitative claims must cite a specific data source in evidence'
  - 'Comparison tables must include confidence intervals'
  - 'Methodology must describe exclusion criteria for repositories'

output:
  type: document
  format: pdf

content:
  structure: ordered
  sections:
    - id: abstract
      type: section
      intent: 'Summarize the study question, methods, and key finding in 150 words'
      duration: { value: 2, unit: minutes }
    - id: methodology
      type: section
      intent: 'Describe the 90-day study design, repository selection criteria, and evaluation metrics'
      content_ref: sections/methodology.md
      evidence:
        - type: reference
          ref: 'Chen et al. 2023 -- LLM code review benchmarks'
          claim: 'Our accuracy metric aligns with the Chen et al. framework'
          relationship: 'supports methodology choice'
    - id: results
      type: section
      intent: 'Present accuracy, latency, and acceptance metrics per approach with confidence intervals'
      content_ref: sections/results.md
      evidence:
        - type: data
          source: data/results_final.csv
          claim: 'Spec-anchored approach achieves 94% accuracy vs 81% for prompt-only'
          relationship: 'primary quantitative result'
        - type: data
          source: data/latency.csv
          claim: 'Median review latency under 4 seconds for all approaches'
    - id: discussion
      type: section
      intent: 'Interpret results, discuss limitations, and suggest future work'
      depends_on:
        - id: results
          relationship: 'Interpretation requires results to be finalized'
    - id: conclusion
      type: section
      intent: 'State the recommendation: spec-anchored review for accuracy-critical workflows'

feedback:
  metrics:
    - name: peer_review_score
      threshold: '>= 3.5 / 5'
    - name: reproduction_success_rate
      threshold: '>= 0.8'
  triggers:
    - condition:
        metric: peer_review_score
        operator: below_threshold
      action: reconcile
      priority: high

artifacts:
  docs:
    - path: 'sections/**/*.md'
      role: 'report section source content'
  data:
    - path: 'data/**/*.csv'
      role: 'experimental results datasets'
  configs:
    - path: 'analysis/**/*.py'
      role: 'analysis scripts that produce data/ outputs'

What this demonstrates: domain: research, content.sections with evidence entries linking claims to data sources, depends_on between sections, feedback.triggers for structured review thresholds, and duration for time-budgeted writing. When data/results_final.csv changes, the reconciliation engine flags the results section’s claim for review because it is linked via evidence.
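
The evidence link works like a reverse index from data files to the claims that cite them. A minimal sketch, assuming the changed file paths are already known (for example from a diff) and using a hypothetical function name:

```python
def sections_to_review(sections: list, changed_files: list) -> dict:
    """Map section id -> claims whose evidence source is among the changed files."""
    changed = set(changed_files)
    flagged = {}
    for section in sections:
        claims = [
            entry["claim"]
            for entry in section.get("evidence", [])
            if entry.get("source") in changed
        ]
        if claims:
            flagged[section["id"]] = claims
    return flagged


# Evidence entries copied from the research report spec above.
sections = [
    {
        "id": "methodology",
        "evidence": [
            {
                "type": "reference",
                "ref": "Chen et al. 2023 -- LLM code review benchmarks",
                "claim": "Our accuracy metric aligns with the Chen et al. framework",
            }
        ],
    },
    {
        "id": "results",
        "evidence": [
            {
                "type": "data",
                "source": "data/results_final.csv",
                "claim": "Spec-anchored approach achieves 94% accuracy vs 81% for prompt-only",
            },
            {
                "type": "data",
                "source": "data/latency.csv",
                "claim": "Median review latency under 4 seconds for all approaches",
            },
        ],
    },
]
flagged = sections_to_review(sections, ["data/results_final.csv"])
```

Only the accuracy claim in the results section is flagged; the latency claim and the methodology reference are untouched because their sources did not change.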

CLI Commands

NotarAI is distributed as a single static binary with no runtime dependencies. All commands use the notarai prefix.

notarai validate

Validate spec files against the JSON Schema.

# Validate all specs in .notarai/ (default)
notarai validate

# Validate a specific file
notarai validate .notarai/auth.spec.yaml

# Validate a directory
notarai validate .notarai/subsystems/

Arguments:

| Argument | Required | Description |
| --- | --- | --- |
| path | No | File or directory to validate. Defaults to .notarai/ |

Behavior:

  • Single file: validates against the schema, prints PASS or FAIL with indented errors.
  • Directory: recursively finds all .spec.yaml files and validates each.
  • No specs found: exits 0 with a warning on stderr.
  • Stale schema warning: if .notarai/notarai.spec.json exists but its $id differs from the bundled schema, prints a warning suggesting notarai init to update.

Exit codes: 0 all files pass, 1 any file fails.


notarai init

Set up NotarAI in a project. Running init again is safe: it always refreshes skills and the schema copy.

notarai init

What it does:

  1. Adds a PostToolUse hook to .claude/settings.json so spec files are automatically validated when Claude Code writes or edits them (command: notarai hook validate).
  2. Copies notarai.spec.json to .notarai/notarai.spec.json so Claude has the schema available (always refreshed to keep current).
  3. Writes .notarai/README.md with workflow instructions (always overwritten).
  4. Copies notarai-reconcile and notarai-bootstrap skills to .claude/skills/ (always overwritten to stay in sync with the binary).
  5. Replaces the ## NotarAI section in CLAUDE.md with a concise workflow description. Appends if the section is absent.
  6. Appends .notarai/.cache/ to .gitignore so the hash cache DB is never committed.
  7. Writes .mcp.json registering notarai mcp as a local MCP server.

Exit codes: 0 success, 1 error.


notarai schema-bump

Update the schema version across all specs in the project.

notarai schema-bump

Detects the schema version in .notarai/notarai.spec.json (if it exists) and compares it to the bundled schema. If they differ:

  1. Overwrites .notarai/notarai.spec.json with the bundled schema.
  2. Updates the schema_version field in every .notarai/*.spec.yaml file.
  3. Validates all updated specs and reports any failures.

If versions already match, prints “Already at current schema version” and exits 0.

Exit codes: 0 success or already current, 1 validation error after update.
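Step 2 above amounts to rewriting one field per spec file. The following is a minimal sketch in Python, not the CLI's actual implementation. It assumes the field appears as a top-level `schema_version: '<version>'` line, as in the spec examples elsewhere in this book; `bump_schema_version` is a hypothetical helper name.

```python
import re

# Matches a top-level `schema_version:` line whose value is optionally quoted.
# Assumption: the field sits at column 0 on a line of its own.
_SCHEMA_VERSION_RE = re.compile(
    r"^schema_version:[ \t]*['\"]?[^'\"\n]*['\"]?[ \t]*$", re.MULTILINE
)


def bump_schema_version(spec_text: str, new_version: str) -> str:
    """Return spec_text with its top-level schema_version set to new_version."""
    # A function replacement avoids backslash-escape surprises in new_version.
    return _SCHEMA_VERSION_RE.sub(
        lambda _match: f"schema_version: '{new_version}'", spec_text, count=1
    )
```

A text-level rewrite like this leaves comments and formatting in the rest of the file untouched, which a parse-and-dump round trip through a YAML library may not.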


notarai hook validate

PostToolUse hook handler. Validates spec files when Claude Code writes or edits them.

# Called automatically by Claude Code, not typically invoked manually
notarai hook validate

Reads PostToolUse JSON from stdin. If the file path matches .notarai/**/*.spec.yaml, reads the file from disk and validates it. Invalid specs block the tool use with errors on stderr.

Behavior:

| Stdin | Result |
|-------|--------|
| Spec file path (.notarai/**/*.spec.yaml) | Validates; exits 1 with errors if invalid |
| Non-spec file path | Exits 0 silently |
| Invalid JSON or missing file | Exits 0 silently (graceful degradation) |

Exit codes: 0 valid or non-spec file, 1 invalid spec.
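The decision table above can be sketched in a few lines of Python. This is an illustration only, not the hook's real code. It assumes the PostToolUse payload carries the edited path under `tool_input.file_path` (the shape Claude Code uses for file-writing tools), and `validate` is a hypothetical callable standing in for schema validation that returns a list of error strings.

```python
import json
from fnmatch import fnmatchcase


def is_spec_path(path: str) -> bool:
    """True for paths such as .notarai/auth.spec.yaml or .notarai/_shared/x.spec.yaml."""
    normalized = path.replace("\\", "/")
    # fnmatch's `*` also matches `/`, so one pattern covers nested directories.
    return fnmatchcase(normalized, ".notarai/*.spec.yaml") or fnmatchcase(
        normalized, "*/.notarai/*.spec.yaml"
    )


def hook_exit_code(stdin_text: str, validate) -> int:
    """Map a PostToolUse payload to the exit code described in the table."""
    try:
        payload = json.loads(stdin_text)
        path = payload["tool_input"]["file_path"]  # assumed field names
    except (ValueError, KeyError, TypeError):
        return 0  # invalid JSON or unexpected shape: degrade gracefully
    if not isinstance(path, str) or not is_spec_path(path):
        return 0  # non-spec file: nothing to do
    try:
        errors = validate(path)  # hypothetical validator
    except OSError:
        return 0  # missing or unreadable file: degrade gracefully
    return 1 if errors else 0
```

Keeping every failure mode except "spec is invalid" on exit code 0 means a malformed payload can never block an unrelated tool use.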


notarai cache

BLAKE3 + SQLite hash cache for tracking file changes between reconciliation runs. The cache database lives at .notarai/.cache/notarai.db.

notarai cache status

Show cache status: database path, entry count, and newest entry timestamp.

notarai cache status

Creates an empty database if none exists.

Exit codes: 0 success, 1 error.

notarai cache clear

Delete the cache database.

notarai cache clear

Prints Cache cleared, or Cache not initialized if the database didn’t exist (in which case the command is a no-op).

Exit codes: 0 success, 1 error.


notarai state

Manage the persistent reconciliation state file (.notarai/reconciliation_state.json). The state file records the last reconciliation timestamp, git hash, branch, and BLAKE3 fingerprints for all governed files and specs. It can be committed to the repo to give collaborators a baseline.

notarai state show

Display the current reconciliation state.

notarai state show

Prints the timestamp, git hash, branch, and counts of tracked files and specs. Prints No reconciliation state found. if no state file exists.

Exit codes: 0 success, 1 error.

notarai state reset

Delete the reconciliation state file, forcing the next reconciliation to treat everything as changed.

notarai state reset

Prints Reconciliation state reset. or No reconciliation state to reset. (if the file didn’t exist).

Exit codes: 0 success, 1 error.

notarai state snapshot

Build a new state snapshot from the current SQLite cache and save it to .notarai/reconciliation_state.json.

notarai state snapshot

Reads all entries from the cache, partitions them into file fingerprints and spec fingerprints, captures the current git HEAD and branch, and writes the result. This is the CLI equivalent of the snapshot_state MCP tool.

Exit codes: 0 success, 1 error.


notarai update

Check for and install updates.

# Check if an update is available
notarai update --check

# Update to the latest version
notarai update

Arguments:

| Flag | Required | Description |
|------|----------|-------------|
| --check | No | Only check, don’t install |

Behavior:

The command queries the GitHub API for the latest release, compares its version against the current binary, and prints the result. Without --check, it also attempts to install the update:

| Install method | Detection | Action |
|----------------|-----------|--------|
| GitHub Release | Binary is not in .cargo/bin or target/ | Downloads and replaces the binary in place |
| cargo install | Binary path contains .cargo/bin | Prints cargo install notarai |
| Dev build | Debug build or path contains target/ | Prints cargo install --path . |
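The detection rules in the table can be expressed as a small classifier. This Python sketch is illustrative only: the function name and return labels are hypothetical, and checking the dev-build rule first is an assumption about precedence that the table does not state.

```python
def detect_install_method(binary_path: str, debug_build: bool = False) -> str:
    """Classify how the running binary was installed, following the table above."""
    path = binary_path.replace("\\", "/")
    # Assumed precedence: dev build first, then cargo, then GitHub Release.
    if debug_build or "target/" in path:
        return "dev"  # action: print `cargo install --path .`
    if ".cargo/bin" in path:
        return "cargo"  # action: print `cargo install notarai`
    return "github-release"  # action: download and replace in place
```

Only the last case modifies the binary; the other two print the command the user should run, which avoids fighting with cargo's own bookkeeping.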

Passive update hints:

notarai validate and notarai init automatically check for updates in the background using a global cache with a 24-hour TTL and a 5-second network timeout. If a newer version is available, a one-line hint is printed to stderr. All errors are silently swallowed, so the hint never interferes with normal output.

Exit codes: 0 success or up to date, 1 error or update failure.


notarai mcp

Start a synchronous JSON-RPC 2.0 MCP server over stdio. Typically configured automatically by notarai init rather than invoked manually.

notarai mcp

The server reads JSON-RPC messages line-by-line from stdin and writes responses to stdout. It exits cleanly on stdin EOF.

Protocol: JSON-RPC 2.0 over stdio (synchronous, no async runtime).

Setup: notarai init writes .mcp.json to the project root, which Claude Code reads to auto-start the server:

{
  "mcpServers": {
    "notarai": {
      "type": "stdio",
      "command": "notarai",
      "args": ["mcp"]
    }
  }
}

See the MCP Server reference for the full tool API, parameters, and return shapes.

Exit codes: 0 on stdin EOF.

MCP Server

NotarAI includes a built-in Model Context Protocol (MCP) server that serves pre-filtered diffs and change data to the reconciliation engine. This keeps context usage proportional to what actually changed rather than the full repository.

Setup

notarai init writes an .mcp.json file to the project root that registers the MCP server:

{
  "mcpServers": {
    "notarai": {
      "type": "stdio",
      "command": "notarai",
      "args": ["mcp"]
    }
  }
}

Claude Code reads this file and starts the server automatically. No manual configuration needed.

Protocol

  • Transport: stdio (stdin/stdout)
  • Format: JSON-RPC 2.0, one message per line
  • Execution: synchronous (no async runtime)
  • Protocol version: 2024-11-05
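Because the format is one message per line, a client frames each request as a single line of JSON. The sketch below builds such a line in Python. It assumes the standard MCP `tools/call` method with `name` and `arguments` parameters, which comes from the MCP specification rather than from this reference; `tool_call_line` is a hypothetical helper.

```python
import json


def tool_call_line(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # Without `indent`, json.dumps escapes newlines inside strings,
    # so the serialized message always fits on one line.
    return json.dumps(message) + "\n"
```

A client would write this line to the stdin of a running `notarai mcp` process and read one line back from its stdout, after first completing the `initialize` handshake.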

Initialize response

The initialize response includes standard MCP fields (protocolVersion, capabilities, serverInfo, tools). When the local schema (.notarai/notarai.spec.json) is out of date relative to the bundled schema, the response includes an additional schemaNote field:

{
  "schemaNote": "Schema is out of date (local: .../0.5/..., bundled: .../0.6/...). Run `notarai init` to update."
}

This surfaces schema staleness to Claude at session start without requiring a separate check.

When the project’s NotarAI configs are behind the running CLI version (detected via the version in .notarai/README.md), the response includes an additional projectNote field:

{
  "projectNote": "hint: project was initialized with notarai v0.3.1. Run `notarai init` to update project configs to v0.3.2."
}

This surfaces project config staleness to Claude at session start so reconciliation uses up-to-date slash commands and schema.

Tools

list_affected_specs

Identify which specs govern files that changed on the current branch relative to a base branch.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| base_branch | string | Yes | Branch to diff against (e.g., "main") |

Returns:

{
  "changed_files": ["src/auth.rs", "src/main.rs"],
  "affected_specs": [
    {
      "spec_path": ".notarai/cli.spec.yaml",
      "behaviors": [],
      "constraints": [],
      "invariants": []
    }
  ]
}

Each affected spec includes its behaviors, constraints, and invariants so the reconciliation engine has the context to evaluate drift without additional file reads.


get_spec_diff

Get the git diff filtered to files governed by a specific spec. Uses the hash cache to skip files that haven’t changed since the last reconciliation.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| spec_path | string | Yes | Relative path to the spec file |
| base_branch | string | Yes | Branch to diff against |
| exclude_patterns | string[] | No | Glob patterns to exclude via git :(exclude) pathspecs (e.g., ["Cargo.lock", "*.lock"]) |
| bypass_cache | boolean | No | If true, diff all governed files regardless of cache state. Defaults to false |

Returns:

{
  "diff": "unified diff of non-spec governed files...",
  "files": ["src/auth.rs"],
  "skipped": ["src/utils.rs"],
  "excluded": ["Cargo.lock"],
  "spec_changes": [
    {
      "path": ".notarai/cli.spec.yaml",
      "content": "full file content..."
    }
  ],
  "system_spec": {
    "path": ".notarai/system.spec.yaml",
    "content": "full file content..."
  },
  "binary_changes": ["assets/logo.png", "slides/deck.pptx"],
  "file_categories": {
    "src/auth.rs": "code",
    "docs/auth.md": "docs",
    "assets/logo.png": "assets"
  }
}

| Field | Description |
|-------|-------------|
| diff | Unified diff output for non-spec, non-binary artifact files only |
| files | Non-spec files included in the diff (binary files are listed here by path and again in binary_changes; their content is not diffed) |
| skipped | Non-spec files whose BLAKE3 hash matched the cache (already reconciled) |
| excluded | Patterns passed via exclude_patterns |
| spec_changes | Array of {path, content} for each governed .notarai/**/*.spec.yaml file that changed |
| system_spec | The system spec (the spec with a subsystems key) – included whenever spec_changes is non-empty; null otherwise |
| binary_changes | File paths of binary files (images, PPTX, PDF, etc.) whose content cannot be usefully diffed |
| file_categories | Object mapping each changed file path to its artifact category from the spec (e.g., "code", "docs", "assets") |

Why full content for spec files?

Spec files express intent, not implementation. The reconciliation engine needs the complete spec to evaluate drift – diff hunks showing only changed lines lack the context to determine whether behavior is still satisfied. Returning full content also avoids the ambiguity of partial context when the spec is the source of truth.

Spec deduplication: If the system spec itself changed, it appears in spec_changes with full content and system_spec contains only {path} (a reference) to avoid duplicating the content.
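A consumer of get_spec_diff therefore has to handle both forms of system_spec. A minimal Python sketch of that lookup, assuming the result has already been parsed into a dict with the fields shown above (`system_spec_content` is a hypothetical helper name):

```python
def system_spec_content(diff_result: dict):
    """Return the system spec's full content from a get_spec_diff result, or None."""
    system_spec = diff_result.get("system_spec")
    if not system_spec:
        return None  # null: no spec changes, so no system spec was attached
    if "content" in system_spec:
        return system_spec["content"]
    # Reference-only form: the content lives in spec_changes under the same path.
    for change in diff_result.get("spec_changes", []):
        if change["path"] == system_spec["path"]:
            return change["content"]
    return None
```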

Cache behavior:

  • Files whose on-disk BLAKE3 hash matches the cached hash are listed in skipped (for artifact files) or omitted from spec_changes (for spec files).
  • A cold or absent cache causes all governed files to be included. This is a safe fallback that ensures nothing is missed.
  • bypass_cache: true forces a full diff without destroying the cache (useful for re-checking everything).

get_changed_artifacts

Get artifact files governed by a spec that have changed since the last cache update. Useful for identifying which docs or other artifacts need review during reconciliation.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| spec_path | string | Yes | Relative path to the spec file |
| artifact_type | string | No | Filter by artifact type (e.g., "docs", "code", "configs") |

Returns:

{
  "changed_artifacts": ["docs/auth.md", "docs/api-reference.md"]
}

Only files whose content differs from the cached hash are included. If no artifact_type is specified, all artifact types are checked.


mark_reconciled

Update the hash cache after reconciliation is complete. Call this at the end of a reconciliation pass so that subsequent runs skip files that haven’t changed.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| files | string[] | Yes | Relative file paths to cache |

Returns:

{
  "updated": 5
}

Files are hashed with BLAKE3 and stored with their relative paths as cache keys. Non-existent files are silently skipped.


clear_cache

Delete the reconciliation cache database, forcing the next get_spec_diff call to diff all governed files.

Parameters: None.

Returns:

{
  "cleared": true
}

Returns true if the database was deleted, false if it didn’t exist.


snapshot_state

Persist the current reconciliation cache as a state snapshot at .notarai/reconciliation_state.json. Call this at the end of a successful reconciliation pass.

Parameters: None.

Returns:

{
  "state_path": ".notarai/reconciliation_state.json",
  "files": 42,
  "specs": 5,
  "git_hash": "a1b2c3d..."
}

| Field | Description |
|-------|-------------|
| state_path | Absolute path where the state file was written |
| files | Number of non-spec file fingerprints stored |
| specs | Number of spec fingerprints stored |
| git_hash | git HEAD at snapshot time (empty string if not in a repo) |

The state file is pretty-printed JSON and safe to commit. It gives collaborators a baseline so subsequent get_spec_diff calls can skip files that haven’t changed since the last reconciliation. Use notarai state show / notarai state reset to inspect or clear state from the CLI.

Cache semantics

The cache is a SQLite database at .notarai/.cache/notarai.db with a single table:

file_cache(path TEXT PRIMARY KEY, blake3_hash TEXT, updated_at INTEGER)

Key details:

  • Hash algorithm: BLAKE3 – fast cryptographic hash.
  • Path format: MCP tools use relative paths as cache keys. Seed the MCP cache via mark_reconciled, not notarai cache update.
  • Cold cache: When the cache is empty or absent, get_spec_diff diffs all governed files. This is the safe default.
  • Cache location: .notarai/.cache/ is gitignored by notarai init so the cache is never committed.
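The skip-if-unchanged behavior can be sketched against the same table layout. This Python example is an illustration under stated assumptions, not the server's code. It hashes with BLAKE2b from the standard library as a stand-in, because BLAKE3 is not available in Python's `hashlib`. It also takes file contents directly instead of reading paths from disk so that it stays self-contained. The function names mirror the MCP tools but are hypothetical.

```python
import hashlib
import sqlite3
import time


def open_cache(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a cache with the single-table layout described above."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_cache("
        "path TEXT PRIMARY KEY, blake3_hash TEXT, updated_at INTEGER)"
    )
    return conn


def fingerprint(content: bytes) -> str:
    # Stand-in hash: BLAKE2b, NOT BLAKE3 (which needs a third-party package).
    return hashlib.blake2b(content).hexdigest()


def mark_reconciled(conn: sqlite3.Connection, files: dict) -> int:
    """Store hashes for {relative_path: content}; returns the number of rows written."""
    now = int(time.time())
    rows = [(path, fingerprint(data), now) for path, data in files.items()]
    conn.executemany(
        "INSERT OR REPLACE INTO file_cache(path, blake3_hash, updated_at) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


def partition(conn: sqlite3.Connection, files: dict):
    """Split files into (changed, skipped) by comparing against cached hashes."""
    changed, skipped = [], []
    for path, data in sorted(files.items()):
        row = conn.execute(
            "SELECT blake3_hash FROM file_cache WHERE path = ?", (path,)
        ).fetchone()
        if row is not None and row[0] == fingerprint(data):
            skipped.append(path)
        else:
            changed.append(path)  # cold cache or changed content
    return changed, skipped
```

With an empty table every file lands in `changed`, which mirrors the cold-cache fallback described above.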

Error codes

| Code | Meaning |
|------|---------|
| -32700 | Parse error (malformed JSON) |
| -32601 | Method not found |
| -32602 | Invalid params (missing required parameter) |
| -32603 | Internal error (git failure, file I/O, cache unavailable) |
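On the client side, these codes arrive in the standard JSON-RPC 2.0 `error` object (`code`, `message`). A small Python sketch of turning a parsed response into a readable message follows; `describe_error` is a hypothetical helper, and the wording of the labels is illustrative.

```python
# Codes from the table above; anything else is reported as unknown.
ERROR_MEANINGS = {
    -32700: "parse error",
    -32601: "method not found",
    -32602: "invalid params",
    -32603: "internal error",
}


def describe_error(response: dict):
    """Return a one-line description of a JSON-RPC error response, or None on success."""
    error = response.get("error")
    if error is None:
        return None
    code = error.get("code")
    label = f"{ERROR_MEANINGS.get(code, 'unknown error')} ({code})"
    message = error.get("message")
    return f"{label}: {message}" if message else label
```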

Motivation

The problem

With LLMs generating both code and documentation from natural language prompts, there’s no authoritative representation of intent that persists across changes. Code and docs drift out of sync – and unlike the pre-LLM era where code was the single source of truth, now either artifact can be the one that’s “right.” This is the three-body problem: intent, code, and docs can all diverge.

The idea

Introduce a NotarAI spec – a structured YAML document governed by a JSON Schema – that captures user intent as the canonical source of truth. An LLM acts as the reconciliation engine, keeping code and documentation in sync with the spec (and vice versa).

Coverage model

Four tiers ensure every file in the repo is accounted for without over-specifying:

  • Tier 1 (Full Spec): Business logic, APIs, user-facing features – full behaviors and constraints
  • Tier 2 (Registered): Utility libs, sidecars, config – just intent + artifact mapping, no behaviors
  • Tier 3 (Excluded): Generated code, vendor deps, editor configs – explicitly out of scope
  • Tier 4 (Derived): Generated outputs tracked for staleness but not authored directly (build artifacts, compiled bundles)

Anything not covered by any tier is flagged as “unspecced” – a lint warning, not a blocker.
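The "unspecced" check is essentially a first-match classification over path patterns. In the Python sketch below, the tier labels and glob patterns are hypothetical examples chosen for illustration; a real project would derive them from its specs.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration only.
TIERS = [
    ("tier3-excluded", ["node_modules/*", "vendor/*", ".vscode/*"]),
    ("tier4-derived", ["dist/*"]),
    ("tier1-full-spec", ["src/auth/*", "src/api/*"]),
    ("tier2-registered", ["src/utils/*", "config/*"]),
]


def classify(path: str) -> str:
    """Return the first tier whose patterns match path, else 'unspecced'."""
    for tier, patterns in TIERS:
        # fnmatch's `*` also matches `/`, so each pattern covers a whole subtree.
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return tier
    return "unspecced"


def unspecced(paths):
    """Paths that no tier accounts for: candidates for a lint warning."""
    return [p for p in paths if classify(p) == "unspecced"]
```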

Bootstrap

For existing codebases: ingest code + docs + commit history, then the LLM interviews the developer about goals and undocumented rules, drafts a spec with required fields only, and the user reviews and enriches. The spec accrues precision over time.

Inspirations

See the Inspirations page.

Design Diagrams

All diagrams from the design process, illustrating the NotarAI name and .notarai/ directory convention.


1. The Problem: Pre-LLM vs Current LLM Era

1a. Pre-LLM: Code Is the Spec

flowchart LR
    Dev["Developer<br/>(intent in head)"]
    Code["Source Code<br/>authoritative spec"]
    Docs["Docs<br/>second-class, often stale"]

    Dev -->|writes| Code
    Code -.->|describes| Docs

1b. Current LLM Era: The Three-Body Problem

flowchart TD
    Intent["User Intent<br/>natural language prompt"]
    LLM["LLM"]
    Code["Source Code"]
    Docs["Documentation"]

    Intent --> LLM
    Intent -.->|"edits directly"| Code
    Intent -.->|"edits directly"| Docs
    LLM -->|generates| Code
    LLM -->|generates| Docs
    Code <-..->|"drift / desync"| Docs

2. NotarAI: Spec State File as Single Source of Truth

flowchart TD
    Intent["User Intent<br/>natural language"]
    Spec["NotarAI Spec<br/>structured intent representation<br/>canonical source of truth"]
    LLM["LLM (sync engine)"]
    Code["Source Code"]
    Docs["Documentation"]

    Intent -->|updates| Spec
    Spec -->|read by| LLM
    LLM -->|derives| Code
    LLM -->|derives| Docs
    Code -.->|reconcile back| Spec
    Docs -.->|reconcile back| Spec
    Code <-.->|"always in sync via spec"| Docs

3. Spec File Anatomy

3a. Required Core

# .notarai/auth.spec.yaml
schema_version: '0.6'

intent: |
  Users can sign up, log in, and
  reset passwords. Sessions expire
  after 30 min of inactivity.

behaviors:
  - name: 'signup'
    given: 'valid email + password'
    then: 'account created, welcome email sent'
  - name: 'session_timeout'
    given: '30 min inactivity'
    then: 'session invalidated'

artifacts:
  code:
    - path: 'src/auth/**'
  docs:
    - path: 'docs/auth.md'

3b. Optional Extensions

# Power users add precision as needed

constraints:
  - 'passwords >= 12 chars'
  - 'rate limit: 5 login attempts / min'

invariants:
  - 'no plaintext passwords in DB'
  - 'all endpoints require HTTPS'

decisions:
  - date: '2025-03-12'
    choice: 'JWT over session cookies'
    rationale: 'stateless scaling'

open_questions:
  - 'Should we support OAuth2 providers?'
  - 'MFA timeline?'

Design note: The behaviors field uses Given/Then language (BDD-adjacent) but stays in natural language – not formal Gherkin. Structured enough to diff and validate, informal enough that non-engineers can author it.


4. Reconciliation Lifecycle

4a. Scenario A: Human Edits Code

flowchart LR
    A1["Human edits code<br/>adds OAuth endpoint"]
    A2["LLM detects drift<br/>code != spec behaviors"]
    A3["LLM proposes spec update<br/>+ add behavior: oauth_login<br/>+ update docs/auth.md"]
    A4["Human approves<br/>or adjusts and approves"]

    A1 -->|trigger| A2
    A2 -->|reconcile| A3
    A3 -->|resolve| A4

4b. Scenario B: Human Edits Spec

flowchart LR
    B1["Human edits spec<br/>changes session to 60 min"]
    B2["LLM updates code to match"]
    B3["LLM updates docs to match"]
    B4["Human reviews<br/>code + docs diff<br/>as a single PR"]

    B1 -->|direct| B2
    B1 -->|direct| B3
    B2 --> B4
    B3 --> B4

4c. Scenario C: Conflict Detected

flowchart LR
    C1["Conflict detected<br/>code says X, spec says Y<br/>docs say Z"]
    C2["LLM presents options<br/>code does X, but spec<br/>says Y -- which is right?"]
    C3["Human decides intent<br/>LLM propagates decision<br/>across spec + code + docs"]
    C4["All three aligned<br/>conflict resolved"]

    C1 -->|detect| C2
    C2 -->|reconcile| C3
    C3 -->|resolve| C4

5. Post-Push Reconciliation in Practice

flowchart LR
    S1["Dev + LLM<br/>write code freely<br/>no spec friction"]
    S2["git push<br/>or open PR"]
    S3["CI hook: LLM reviews<br/>diff vs affected specs<br/>proposes spec updates<br/>proposes doc updates"]
    S4["Adds to PR<br/>spec diff + docs diff<br/>alongside code diff"]
    S5["Single review<br/>code + spec + docs<br/>all land together or not"]

    S1 --> S2 --> S3 --> S4 --> S5

The artifacts field in the spec tells the CI hook which specs are affected by which file paths – so it only reconciles what changed.
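That lookup can be sketched as a glob match between changed paths and each spec's artifacts entries. The Python example below assumes the specs are already parsed into dicts shaped like the artifacts block in section 3a. It uses `fnmatch` as a simplification, since its `*` matches across `/` and a real implementation may follow stricter glob semantics. `affected_specs` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def affected_specs(changed_files, specs):
    """Return spec paths whose artifact globs match at least one changed file.

    `specs` maps a spec path to its artifacts mapping:
    category -> list of {"path": glob}.
    """
    affected = []
    for spec_path, artifacts in specs.items():
        globs = [entry["path"] for entries in artifacts.values() for entry in entries]
        if any(fnmatchcase(f, g) for f in changed_files for g in globs):
            affected.append(spec_path)
    return affected
```

A change that touches only `README.md` matches no globs here and returns an empty list, so no spec would be reconciled.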


6. Spec Composition – The Import Model

6a. Directory Structure

project/
+-- .notarai/
|   +-- system.spec.yaml          # top-level system spec
|   +-- auth.spec.yaml            # auth service (Tier 1)
|   +-- billing.spec.yaml         # billing service (Tier 1)
|   +-- api.spec.yaml             # API layer (Tier 1)
|   +-- utils.spec.yaml           # shared utilities (Tier 2)
|   +-- redis-cache.spec.yaml     # sidecar process (Tier 2)
|   +-- _shared/
|       +-- security.spec.yaml    # cross-cutting
|       +-- logging.spec.yaml     # cross-cutting
+-- src/
+-- docs/

6b. Composition Relationships

flowchart TD
    System["system.spec.yaml<br/>top-level intent + invariants"]

    Auth[".notarai/auth.spec.yaml"]
    Billing[".notarai/billing.spec.yaml"]
    API[".notarai/api.spec.yaml"]

    Security["_shared/security.spec.yaml<br/>applies to: all subsystems"]
    Logging["_shared/logging.spec.yaml<br/>applies to: all subsystems"]

    System -->|"$ref"| Auth
    System -->|"$ref"| Billing
    System -->|"$ref"| API

    Security -.->|applies| Auth
    Security -.->|applies| Billing
    Security -.->|applies| API
    Logging -.->|applies| Auth
    Logging -.->|applies| Billing
    Logging -.->|applies| API

When the LLM checks auth.spec.yaml, it also loads security.spec.yaml and validates that auth code satisfies both specs’ invariants. Cross-cutting concerns are defined once and enforced everywhere.
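One way to picture this is as a merge of invariant lists before the check runs. The Python sketch below assumes specs are parsed into dicts with an optional `invariants` list, as in section 3b. How cross-cutting specs are selected for a given subsystem is not shown, and `effective_invariants` is a hypothetical helper rather than part of NotarAI.

```python
def effective_invariants(spec: dict, cross_cutting: list) -> list:
    """Combine a spec's own invariants with those of applicable cross-cutting specs."""
    combined = list(spec.get("invariants", []))
    for shared in cross_cutting:
        for invariant in shared.get("invariants", []):
            if invariant not in combined:  # keep order, drop duplicates
                combined.append(invariant)
    return combined
```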


7. Coverage Model – Three Tiers

flowchart LR
    subgraph T1["Tier 1: Full Spec"]
        T1a["Business logic services"]
        T1b["API endpoints"]
        T1c["Data models / schemas"]
        T1d["Anything user-facing"]
    end

    subgraph T2["Tier 2: Registered"]
        T2a["Utility libraries"]
        T2b["Shared helpers / constants"]
        T2c["Config files"]
        T2d["Sidecar processes"]
    end

    subgraph T3["Tier 3: Excluded"]
        T3a["Generated code / build output"]
        T3b["Vendored dependencies"]
        T3c["IDE / editor configs"]
        T3d["node_modules, .git, etc."]
    end

Coverage equation: Tier 1 + Tier 2 + Tier 3 = entire repo

Anything not covered = unspecced (a lint warning, not a block).


8. Bootstrap Flow for Existing Codebases

flowchart LR
    S1["1. Ingest<br/>code + docs +<br/>commit history +<br/>README / ADRs"]
    S2["2. LLM interviews<br/>What's the goal?<br/>Any undocumented rules?"]
    S3["3. Draft spec<br/>required fields only<br/>intent + behaviors +<br/>artifact mappings"]
    S4["4. Human review<br/>correct, enrich,<br/>add constraints /<br/>open questions"]
    S5["5. Activate<br/>sync engine<br/>watches for drift<br/>from this point on"]

    S1 --> S2 --> S3 --> S4 --> S5

Bootstrap starts minimal and accrues precision over time – the spec is a living document.

Comparison to SDD

Spec-driven development (SDD) has emerged as a major pattern for AI-assisted coding, but the term covers several distinct approaches. Birgitta Böckeler’s taxonomy identifies three levels:

  • Spec-first: Write a spec, generate code, discard or ignore the spec afterward.
  • Spec-anchored: Keep the spec around for ongoing maintenance, but how it stays current is left vague.
  • Spec-as-source: The spec replaces code as the primary artifact. People never touch code directly.

Most SDD tools (Kiro, Spec Kit, OpenSpec) are spec-first in practice: they help you go from intent to plan to tasks to code, but once the code exists, the spec quietly goes stale. Superpowers takes the spec-first workflow further with a structured seven-stage methodology and subagent-driven execution, but its plans are task-scoped artifacts. Tessl is exploring spec-as-source, where code is generated from specs and marked “DO NOT EDIT”, but this sacrifices the flexibility of direct code editing.

NotarAI occupies the gap that Böckeler’s taxonomy identifies but no current tool fills: spec-anchored with automated maintenance. The spec persists for the lifetime of the feature, and an LLM reconciliation engine actively keeps it aligned with code and docs as all three evolve.

SDD tools solve the cold-start problem. NotarAI solves the entropy problem.

SDD tools help you write specs. NotarAI helps you keep them true.

  • A developer adds a feature – NotarAI detects the spec doesn’t account for it and proposes an update
  • A team lead updates the spec – NotarAI propagates the change to code and docs
  • Code contradicts a spec constraint – NotarAI flags the conflict and asks the user to decide

The spec isn’t just a blueprint. It’s a witness – a living contract the LLM continuously verifies against reality.

Landscape comparison

| Tool | SDD Level | Direction | Spec Lifespan | Brownfield Support |
|------|-----------|-----------|---------------|--------------------|
| Kiro | Spec-first | Spec -> code | Change request | Limited |
| Spec Kit | Spec-first (aspires to anchored) | Spec -> code | Branch / change request | Limited |
| Tessl | Spec-as-source | Spec -> code (human edits spec only) | Feature lifetime | Reverse-engineering CLI |
| OpenSpec | Spec-first | Spec -> code | Change request | Limited |
| Superpowers | Spec-first (workflow methodology) | Spec -> plan -> subagent execution | Task / branch | Git worktree isolation |
| Semcheck | Compliance checking | Spec -> code (one-way check) | Ongoing | Yes |
| NotarAI | Spec-anchored + active reconciliation | Spec <-> code <-> docs | Feature lifetime | Bootstrap flow with LLM interview |

Inspirations

NotarAI draws from several established traditions:

  • Cucumber / Gherkin: The Given/Then behavior format in NotarAI specs comes from BDD’s structured scenario language, but kept in natural language rather than formal Gherkin syntax to lower the authoring barrier.
  • Terraform and Infrastructure-as-Code: The reconciliation model (declare desired state, detect drift from actual state, propose a plan to converge) is borrowed from IaC tools like Terraform, Pulumi, and CloudFormation. NotarAI’s spec is a state file for intent, not infrastructure.
  • JSON Schema / OpenAPI: The $ref composition model and the use of a JSON Schema to govern spec validity come directly from these standards.
  • Design by Contract (Eiffel): The distinction between constraints (what the system enforces) and invariants (what must never be violated) echoes Eiffel’s preconditions, postconditions, and class invariants.
  • Architecture Decision Records: The decisions field in the spec is a lightweight ADR log, capturing the why alongside the what.

Contributing

Thank you for your interest in contributing to this project. The instructions below should stay up to date, since NotarAI itself helps manage them. If any step seems out of date or misaligned with current practice in the repo, updating this document is a high-value first or second contribution.

Note that the project’s own spec drift is self-managed, so please get acquainted with the tool and make sure your contributions stay in sync.

Development Setup

Install Rust (stable toolchain). Install pre-commit for pre-commit hooks.

Temporarily (until biome supports markdown), install prettier.

Set up clippy and rustfmt via:

rustup component add rustfmt clippy

Then set up the repo:

git clone https://github.com/davidroeca/NotarAI.git
cd NotarAI
cargo build
cargo install biome
cargo install --path .
pre-commit install

The cargo install --path . step installs the notarai binary to ~/.cargo/bin so the Claude Code hook (notarai hook validate) resolves correctly. Re-run it whenever you want the installed binary to reflect your latest local changes.

Making Changes

  1. Create a branch from main
  2. Make your changes
  3. Run cargo build to verify compilation
  4. Run cargo test to run the test suite
  5. Run cargo fmt --check to verify formatting
  6. Run cargo clippy -- -D warnings to check for lint issues
  7. Use the /notarai-reconcile Claude Code command to check for spec drift
  8. Open a pull request

Code Style

  • Rust 2024 edition
  • cargo fmt for Rust formatting
  • cargo clippy for Rust lints
  • biome format --check for non-Rust file formatting (JSON, JS/TS, CSS, etc.)
  • prettier --check for Markdown formatting (temporary until biome#3718 is resolved)
  • Functional style preferred over excessive use of structs with methods
  • Core library lives in src/core/ (not src/lib/ due to Rust’s reserved module name)

Project Structure

See CLAUDE.md in the repository root for a detailed layout and architectural constraints.

Good First Contributions

These changes will drive broader adoption but are not yet a priority:

  • Support other coding agents (e.g. Codex, Aider, Cline, OpenHands, Goose, opencode)
  • Find/create new issues and reference them here

License

By contributing, you agree that your contributions will be licensed under the Apache License 2.0.