
NotarAI

Intent captured. Drift reconciled.

NotarAI is a continuous intent reconciliation tool that keeps your specs, code, and documentation in sync as all three evolve. It uses LLMs as a bidirectional reconciliation engine: not just to generate code from specs, but to detect drift, surface conflicts, and propose updates across your entire artifact chain.

What is NotarAI?

Spec-anchored – Structured YAML specs capture intent as the canonical source of truth, validated by JSON Schema.

Bidirectional – Detects drift in any direction (code, spec, or docs) and proposes aligned updates.

Propose and approve – Never auto-syncs. All changes are proposed for human review.

Composable – Specs reference each other via $ref for hierarchical and cross-cutting composition.

Installation

Quick Install (Linux / macOS)

curl -fsSL https://raw.githubusercontent.com/davidroeca/NotarAI/main/scripts/install.sh | sh

This detects your OS and architecture, downloads the appropriate binary from GitHub Releases, and installs it to ~/.local/bin. If that directory is not in your PATH, the script will print a one-line export command to add it.

From crates.io

If you have Rust installed:

cargo install notarai

Manual Download

Download the binary for your platform from the latest release:

| Platform | Binary |
|---|---|
| Linux x86_64 (glibc) | notarai-x86_64-linux-gnu |
| Linux x86_64 (musl) | notarai-x86_64-linux-musl |
| Linux aarch64 (glibc) | notarai-aarch64-linux-gnu |
| Linux aarch64 (musl) | notarai-aarch64-linux-musl |
| macOS x86_64 | notarai-x86_64-macos |
| macOS aarch64 (Apple Silicon) | notarai-aarch64-macos |
| Windows x86_64 | notarai-x86_64-windows.exe |

Make the binary executable and move it to a directory in your PATH:

chmod +x notarai-*
mkdir -p ~/.local/bin
mv notarai-* ~/.local/bin/notarai

If ~/.local/bin is not already in your PATH, add this to your shell profile (~/.bashrc, ~/.zshrc, etc.):

export PATH="$HOME/.local/bin:$PATH"

From Source

git clone https://github.com/davidroeca/NotarAI
cd NotarAI
cargo build --release -p notarai
# Binary is at target/release/notarai

Updating

If NotarAI is already installed, check for and install updates with:

notarai update

This detects how NotarAI was installed and acts accordingly — downloading a new binary for GitHub Release installs, or printing the appropriate cargo install command for Cargo installs. Use notarai update --check to check without installing.

NotarAI also prints a passive update hint on notarai validate and notarai init when a newer version is available (checked at most once every 24 hours).

Requirements

  • No runtime dependencies – NotarAI is a single static binary
  • Claude Code for reconciliation features (optional for validation-only usage)

Quick Start

Initialize your project

Run notarai init in your project root:

notarai init

This does several things:

  1. Copies notarai.spec.json to .notarai/notarai.spec.json so the schema is available for validation.
  2. Writes .notarai/README.md with workflow instructions.
  3. Writes .notarai/reconcile-prompt.md (reconciliation prompt template).
  4. Writes .notarai/bootstrap-prompt.md (bootstrap prompt template).
  5. Appends .notarai/.cache/ to .gitignore so the hash cache DB is never committed.
  6. Writes .mcp.json registering notarai mcp as a local MCP server, so MCP-accelerated reconciliation works out of the box.
  7. Writes or section-merges AGENTS.md with a ## NotarAI section describing the workflow.
  8. For the Claude adapter: adds a PostToolUse hook to .claude/settings.json so spec files are automatically validated when Claude Code writes or edits them; copies reconcile and bootstrap skills to .claude/skills/; creates or section-merges CLAUDE.md as an @AGENTS.md pointer.

Running init again is safe: it always refreshes skills, templates, and the schema copy, and replaces the ## NotarAI section in AGENTS.md (and adapter pointer files) with the current content.

Create your first spec

Specs live in a .notarai/ directory at the root of your repository:

project/
  .notarai/
    system.spec.yaml
    auth.spec.yaml
    billing.spec.yaml
    _shared/
      security.spec.yaml
  src/
  docs/

Here’s a minimal spec:

# .notarai/auth.spec.yaml
schema_version: '0.8'

intent: |
  Users can sign up, log in, and reset passwords.
  Sessions expire after 30 min of inactivity.

behaviors:
  - name: 'signup'
    given: 'valid email + password (>= 12 chars)'
    then: 'account created, welcome email sent'
  - name: 'login'
    given: 'valid credentials'
    then: 'JWT issued, session created'

artifacts:
  code:
    - path: 'src/auth/**'
      role: 'primary implementation'
  docs:
    - path: 'docs/auth.md'

Validate specs

# Validate all spec files in .notarai/
notarai validate

# Validate a specific file
notarai validate .notarai/auth.spec.yaml

# Validate a directory
notarai validate .notarai/subsystems/

Output is PASS <file> or FAIL <file> with an indented error list. Exit code is 0 if all files pass, 1 if any fail.

Update NotarAI

Check for and install updates:

notarai update

NotarAI will also print a hint when a newer version is available during validate or init.

Bump schema version

When you upgrade to a new version of NotarAI, update all spec files with:

notarai schema-bump

This overwrites .notarai/notarai.spec.json with the bundled schema and updates the schema_version field in every .notarai/*.spec.yaml file.

Bootstrap from an existing codebase

Use the /notarai-bootstrap skill in Claude Code to generate specs from your existing code via a structured developer interview.

Check for drift

Run notarai check to detect structural drift without an LLM:

# See what's drifted
notarai check

# Strict mode for CI (any finding = exit code 1)
notarai check --strict

This reports coverage gaps, orphaned globs, changed files since last reconciliation, overlapping coverage, circular $ref chains, and incomplete behaviors. See the CLI reference for details.

For automated PR checks, add the GitHub Action to your CI workflow.
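
This guide does not spell out the action's usage, so as an illustrative sketch, the same check can be wired into a plain workflow using the documented install script and the --strict flag; the workflow name, trigger, and job layout below are placeholders, not the official action:

```yaml
# .github/workflows/notarai.yml -- illustrative sketch, not the published action.
# Only the install script URL and `notarai check --strict` come from this guide.
name: notarai-check
on: [pull_request]
jobs:
  drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install NotarAI
        run: curl -fsSL https://raw.githubusercontent.com/davidroeca/NotarAI/main/scripts/install.sh | sh
      - name: Check for drift (any finding fails the job)
        run: |
          export PATH="$HOME/.local/bin:$PATH"
          notarai check --strict
```

Because --strict exits non-zero on any finding, the job fails the PR check whenever drift is detected.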

Reconcile with an LLM

Use the /notarai-reconcile skill in Claude Code to perform a full semantic reconciliation: detect drift, propose spec/code/doc updates, and walk through each finding interactively.

Next steps

  • Existing codebase? See the Brownfield Adoption Guide for a step-by-step walkthrough of adding NotarAI to a project that already has code.
  • Not sure how much spec detail you need? Progressive Adoption describes three maturity levels so you can start light and add depth where it matters.

Spec Format Reference

Specs are YAML files validated against a JSON Schema (notarai.spec.json). The format uses progressive disclosure: a small set of required fields for minimum viability, with optional fields for precision as needed.

Required fields

schema_version

Pins the JSON Schema version. Current version: "0.8". Versions "0.7", "0.6", and "0.5" are also accepted for backward compatibility.

schema_version: '0.8'

intent

Natural language description of what the system or feature should do.

intent: |
  Users can sign up, log in, and reset passwords.
  Sessions expire after 30 min of inactivity.

behaviors

Structured Given/Then entries describing expected behavior. Each behavior has a name, a given condition, and a then outcome. Required for full tier specs; optional for registered and derived tier specs.

behaviors:
  - name: 'signup'
    given: 'valid email + password (>= 12 chars)'
    then: 'account created, welcome email sent'
  - name: 'session_timeout'
    given: '30 min inactivity'
    then: 'session invalidated'

Behaviors may also include optional interaction and state_transition sub-fields:

behaviors:
  - name: 'submit_form'
    given: 'user submits a valid form'
    then: 'data saved, confirmation shown'
    interaction:
      trigger: user_action # user_action | timer | system_event | data_change | schedule | external_signal | threshold | manual | lifecycle
      sequence:
        - validate fields
        - post to API
        - show confirmation
    state_transition:
      from: editing
      to: confirmed

Behaviors may also declare the tests that verify them via tested_by (introduced in schema 0.8). notarai check uses this to surface test-alignment drift:

behaviors:
  - name: 'signup'
    given: 'valid email and password'
    then: 'account created, welcome email sent'
    tested_by:
      - path: 'tests/auth/signup_test.rs'
        assertion: 'signup_creates_account'

| Check | Severity | Trigger |
|---|---|---|
| T001 | Warning | A tier-1 behavior has no tested_by entry. |
| T002 | Error | A tested_by.path does not exist on disk. |

artifacts

Glob patterns mapping the spec to the files it governs. The schema accepts any string as a category key. Convention categories:

| Category | When to use |
|---|---|
| code | Source code |
| docs | Documentation |
| tests | Test files |
| slides | Presentation files |
| data | Data files, CSVs |
| configs | Configuration, IaC |
| notebooks | Jupyter/R notebooks |
| assets | Media, images, fonts |
| templates | Reusable templates |
| schemas | Data schemas, API specs |

artifacts:
  code:
    - path: 'src/auth/**'
      role: 'primary implementation'
  docs:
    - path: 'docs/auth.md'
  tests:
    - path: 'tests/auth/**'

Each artifact ref may include an optional integer tier override (1-4) for files that belong to a different tier than the spec itself:

artifacts:
  code:
    - path: 'dist/bundle.js'
      tier: 4 # derived output — tracked for staleness, not authored directly

Optional fields

constraints

Rules the system must follow.

constraints:
  - 'rate limit: 5 login attempts per minute per IP'
  - 'passwords must be >= 12 characters'

invariants

Conditions that must never be violated.

invariants:
  - 'no plaintext passwords stored anywhere'
  - 'all API responses include request-id header'

decisions

Architectural decision log with date, choice, and rationale.

decisions:
  - date: '2025-01-15'
    choice: 'JWT over session cookies'
    rationale: 'Stateless auth simplifies horizontal scaling'

open_questions

Unresolved design questions.

open_questions:
  - 'Should we support OAuth providers beyond Google?'
  - "What's the session timeout for mobile clients?"

dependencies

References to other specs this one interacts with.

dependencies:
  - $ref: 'billing.spec.yaml'
    relationship: 'auth gates billing endpoints'

notes

Freeform hints for the LLM about implicit relationships.

notes: |
  The auth module shares a rate limiter with the API gateway.
  Session storage is Redis in production, in-memory in dev.

output

Describes what the spec ultimately produces. Useful for non-software artifacts like presentations or reports.

output:
  type: presentation # app | presentation | interactive-doc | game | dashboard | report | library | service | document | course | api | infrastructure | dataset | design-system | campaign | template
  format: pptx
  runtime: static-file # browser | native | static-file | embedded | server
  entry_point: dist/deck.pptx

content

Describes the output’s logical structure in content terms (slides, scenes, sections) rather than file terms.

content:
  structure: graph # ordered | hierarchical | graph | free-form
  sections:
    - id: level_1
      type: scene
      intent: 'Tutorial level introducing movement mechanics'
      duration: { value: 5, unit: minutes }
      connections:
        - to: level_2
          label: completion
        - to: game_over
          label: player_death
      depends_on:
        - id: intro_cutscene
          relationship: 'must complete before this section unlocks'
      evidence:
        - type: data
          source: playtests/run_3.csv
          claim: '85% of players complete within 5 minutes'

states

Top-level state machine definition for interactive artifacts.

states:
  initial: idle
  definitions:
    - id: idle
      transitions:
        - to: running
          on: start
          guard: 'all required fields are populated'
          action: 'initialize timer, log start event'
    - id: running
      transitions:
        - to: idle
          on: stop

design

Visual and design specifications for brand-governed artifacts.

design:
  theme:
    palette: ['#1a1a2e', '#16213e']
    typography:
      heading: Inter
      body: Roboto
    modes:
      light: { palette: ['#ffffff', '#f0f0f0'] }
      dark: { palette: ['#1a1a2e', '#16213e'] }
  layout:
    type: paginated # slide-deck | scrolling | spatial | grid | free-form | paginated | canvas | timeline | tabbed
    dimensions: letter
  print:
    margins: { top: '1in', right: '1in', bottom: '1in', left: '1in' }
    headers: true
    footers: true
    page_numbers: true
  responsive:
    breakpoints:
      - name: mobile
        max_width: 768
        layout_override: scrolling
      - name: desktop
        min_width: 769

audience

Context about who the output is for.

audience:
  role: 'Series B investors'
  assumed_knowledge: 'Familiar with SaaS metrics, not technical infrastructure'
  tone: formal-but-engaging
  locale: en-US
  accessibility:
    - high-contrast
    - screen-reader-friendly

variants

Multiple versions of the same artifact with selective field overrides.

variants:
  - id: investor-deck
    description: 'Condensed version for investor meetings'
    overrides:
      audience.role: 'Series B investors'
  - id: engineering-deep-dive
    description: 'Full technical version for the eng team'

Variants are declarative metadata by default. Set variants_resolved: true at the spec top level to opt in to programmatic override resolution (scalar replacement, array replacement with + prefix for append, deep merge for objects, null to clear).
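
As a sketch of those resolution rules (the field values are invented for illustration, and the exact placement of the + append prefix on the override key is an assumption, not confirmed by the schema):

```yaml
variants_resolved: true # opt in to programmatic override resolution
constraints:
  - 'base constraint shared by all variants'
audience:
  role: 'general audience'
  tone: formal-but-engaging
variants:
  - id: investor-deck
    overrides:
      audience.role: 'Series B investors'    # scalar: replaced
      audience.tone: null                    # null: cleared
      +constraints: ['investor NDA applies'] # + prefix: appended rather than replaced
```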

pipeline

Describes the build or generation process for the output artifact.

pipeline:
  env:
    NODE_ENV: production
  steps:
    - name: compile
      tool: tsc
      input: 'src/**/*.ts'
      output: dist/
      condition: "output.format == 'web'"
    - name: export_pdf
      command: 'pandoc input.md -o output.pdf'
      condition: "output.format == 'pdf'"
      on_failure: skip
      depends_on: [compile]
      env:
        PANDOC_DATA_DIR: ./templates
  preview:
    command: npx serve dist/
    url: 'http://localhost:3000'

feedback

Connects output performance metrics back to the spec for reconciliation triggers.

feedback:
  metrics:
    - name: avg_completion_rate
      source: analytics/completion.csv
      threshold: '>= 0.7'
    - name: build_time
      threshold: '< 5s'
  triggers:
    - condition:
        metric: avg_completion_rate
        operator: below_threshold
        duration: { value: 3, unit: days }
      action: reconcile
      priority: high

Note: reconciliation_trigger (free-form string) is deprecated in favor of triggers but still accepted.

compliance

Maps invariants and constraints to regulatory or standards frameworks. The reconciliation engine verifies that framework-required invariants still exist in the spec.

compliance:
  frameworks:
    - name: SOC2
      controls:
        - id: CC6.1
          satisfied_by:
            invariants: ['no plaintext passwords stored anywhere']
            constraints: ['rate limit: 5 login attempts per minute per IP']
    - name: WCAG
      level: AA
      satisfied_by:
        invariants: ['all interactive elements have visible focus indicators']
  audit_trail: true

Coverage tiers

Every file in the repo falls into one of four tiers:

  • Tier 1 (Full) — Business logic, APIs, user-facing features. Full behavioral specification required.
  • Tier 2 (Registered) — Utilities, config, sidecars. Intent and artifact mapping only; behaviors not required.
  • Tier 3 (Excluded) — Explicitly out of scope. Declared via exclude globs on the system spec.
  • Tier 4 (Derived) — Generated outputs tracked for staleness but not authored directly (e.g., build artifacts, compiled bundles). Use tier: derived on the spec or tier: 4 on individual artifact refs.

Files not covered by any tier are flagged as “unspecced” — a lint warning, not a blocker.

Set the spec-level tier with the tier field:

tier: registered # full (default) | registered | derived

Composition

Specs compose via $ref (borrowed from JSON Schema/OpenAPI):

  • subsystems — hierarchical references (system → services)
  • applies — cross-cutting specs (e.g., security, logging) that apply to all subsystems

A top-level system.spec.yaml serves as the manifest, referencing subsystem specs and declaring exclusion patterns for Tier 3 files.
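
A minimal manifest sketch combining these pieces, using the file layout from the Quick Start; the exact list-of-$ref shape under subsystems and applies is inferred from the $ref convention shown for dependencies:

```yaml
# .notarai/system.spec.yaml -- illustrative manifest sketch
schema_version: '0.8'
intent: >
  Top-level manifest: composes subsystem specs and layers
  cross-cutting concerns across all of them.
subsystems: # hierarchical references (system -> services)
  - $ref: 'auth.spec.yaml'
  - $ref: 'billing.spec.yaml'
applies: # cross-cutting specs applied to every subsystem
  - $ref: '_shared/security.spec.yaml'
exclude: # Tier 3: explicitly out of scope
  - 'vendor/**'
  - 'dist/**'
```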

Cross-cutting specs

A spec that expresses concerns spanning multiple subsystems (style, security, logging, compliance) should set cross_cutting: true:

schema_version: '0.8'
cross_cutting: true
intent: >
  American English spelling across all code and documentation.
behaviors:
  - name: american_english
    given: 'british spelling appears in a governed file'
    then: 'reconciliation flags it as drift'
invariants:
  - 'All documentation uses American English spellings throughout'

Cross-cutting specs:

  • Omit artifacts — they govern no files directly. Their invariants and behaviors layer onto the specs that include them via applies.
  • Cannot be top-level — they must not declare subsystems or exclude.
  • Must be referenced via applies, not subsystems — L011 flags misplacement.

This avoids glob overlap with subsystem specs (since two specs governing the same file raises an OverlappingCoverage finding) while still letting the spec layer its invariants across the whole system.

Reconciliation

How reconciliation works

The reconciliation engine detects three scenarios:

1. Someone edits code

The engine detects that code has drifted from the spec and proposes spec and doc updates.

2. Someone edits spec

The engine propagates the spec change to code and documentation.

3. Conflict

Code says one thing, the spec says another. The engine surfaces the disagreement and the user decides which is correct.

The system is always propose-and-approve, never auto-sync. Both users and LLMs can edit everything; the spec is the tiebreaker.

Using reconciliation

After running notarai init, use the /notarai-reconcile slash command in Claude Code to trigger a reconciliation pass.

The skill is a thin orchestrator that delegates context assembly to the notarai export-context CLI command:

  1. Determines a baseline (from .notarai/reconciliation_state.json if available, or asks for a base branch).
  2. Runs notarai export-context --all --base-branch <baseline> --format markdown to gather per-spec reconciliation blocks containing spec content and changed-file lists.
  3. For small changesets (10 or fewer changed files), analyzes all specs inline. For larger changesets, spawns one parallel sub-agent per spec.
  4. Reads changed files, runs git diff per file, and evaluates each behavior, constraint, and invariant against the changes.
  5. Notes applies cross-cutting specs and dependencies refs for ripple-effect analysis.
  6. Produces a structured report (DRIFT / VIOLATED / UNSPECCED / STALE REF findings).
  7. Walks through findings interactively, proposing exact changes for approval.
  8. Calls mark_reconciled (via MCP or CLI) to update the hash cache, then snapshots reconciliation state.

The MCP server is used for mark_reconciled and snapshot_state when available, with CLI fallbacks (notarai state snapshot) when it is not.

For non-Claude agents, run notarai export-context directly and paste the output into your agent’s prompt. See the CLI reference for details.

Automatic validation

After notarai init, spec files are validated automatically whenever Claude Code writes or edits a file in .notarai/. Invalid specs block the tool use with errors on stderr. Non-spec files are ignored silently.

Specs vs Claude Rules

NotarAI specs and Claude rules (CLAUDE.md / .claude/rules/) both express project conventions, but they serve different purposes and trigger at different times. This guide explains when to use each – and when to use both.

Decision framework

| Use a spec when… | Use a Claude rule when… |
|---|---|
| The concern describes what artifacts must look like | The concern describes how Claude should work |
| You want reconciliation to detect drift retroactively | You want to prevent violations proactively |
| The rule maps to files you can diff against | The rule is about process, workflow, or tool usage |
| Cross-cutting specs (applies) can propagate it | The convention only matters during generation |

Use a spec

Specs are the right home for artifact-facing rules – invariants, constraints, and behaviors that describe what code, docs, or configs should look like. The reconciliation engine diffs artifacts against these rules and proposes fixes when they drift.

Examples from this project:

  • “American English throughout” – style.spec.yaml catches existing files that use British spellings
  • “The engine must never silently auto-modify code” – an invariant in system.spec.yaml that reconciliation checks against code changes
  • “CLI validates spec files against bundled JSON Schema” – a behavior in cli.spec.yaml tied to source files

Cross-cutting specs (referenced via applies in the system spec) propagate invariants and constraints across all subsystems without duplication.

Use a Claude rule

Claude rules are the right home for workflow-facing instructions – how Claude should run commands, what tools to prefer, what process to follow. These have no artifact to reconcile against; they shape how Claude works, not what the output looks like.

Examples:

  • “Tests use cargo test” – tells Claude which command to run
  • “When bumping schema version, update these five files” – a checklist for a multi-step process
  • “Unit tests are inline #[cfg(test)] modules” – convention for where to put new tests

These belong in .claude/rules/ files (or CLAUDE.md) because there is no meaningful way to diff project files against them.

Use both

Some conventions benefit from both proactive prevention and retroactive detection. Style rules are the classic example:

  • Claude rule prevents new violations: Claude follows the rule as it generates code, so new files are correct from the start.
  • Spec catches existing drift: reconciliation scans all governed files and flags violations that predate the rule or were introduced by humans.

This is intentional duplication, not redundancy. The two mechanisms cover different failure modes.

Examples from this project:

| Convention | Claude rule | Spec |
|---|---|---|
| American English | .claude/rules/style.md | style.spec.yaml |
| QWERTY-typable characters | .claude/rules/style.md | style.spec.yaml |

Anti-patterns

Don’t put process instructions in specs. A spec behavior like “given a schema version bump, then update these five files” has no artifact to diff against. It belongs in a Claude rule or checklist.

Don’t put formal behavioral specs in Claude rules. A rule like “the CLI must validate spec files against the bundled schema” is a testable behavior. If it lives only in CLAUDE.md, reconciliation can’t detect when code drifts away from it.

Don’t duplicate without purpose. If a convention only needs proactive prevention (e.g., “run prettier on generated code”), a Claude rule is sufficient. If it only needs retroactive detection (e.g., “no circular $ref chains”), a spec invariant is sufficient. Use both only when both failure modes are real.

Non-Software Examples

NotarAI works for any artifact with intent, not just code. These examples show how the schema applies to presentations, legal documents, and research reports.


Presentation spec

A conference talk governed for audience alignment and slide drift.

schema_version: '0.7'
domain: presentation

intent: >
  A 30-minute conference talk introducing NotarAI to developers unfamiliar with
  spec-driven workflows. Attendees should leave understanding the three-body drift
  problem and how to run notarai init on their own project.

behaviors:
  - name: opening_hook
    given: 'speaker takes the stage'
    then: 'the intro slide presents a relatable drift scenario in under 90 seconds'
  - name: demo_live
    given: 'the demo section'
    then: 'speaker runs notarai init and reconcile live on a sample repo; audience sees a real drift report'

audience:
  role: 'mid-to-senior developers at a software conference'
  assumed_knowledge: 'Familiar with git, CI/CD, and code review workflows; may not know NotarAI'
  tone: formal-but-engaging
  locale: en-US

output:
  type: presentation
  format: pptx
  runtime: static-file
  entry_point: dist/talk.pptx

content:
  structure: ordered
  sections:
    - id: intro
      type: slide
      intent: 'Hook: show a real drift incident and its cost'
      duration: { value: 3, unit: minutes }
    - id: problem
      type: slide
      intent: 'Explain the three-body drift problem (spec, code, docs)'
      duration: { value: 5, unit: minutes }
    - id: demo
      type: interactive
      intent: 'Live notarai init + reconcile demo on a sample repo'
      duration: { value: 10, unit: minutes }
      content_ref: demo/sample-repo/
    - id: takeaways
      type: slide
      intent: 'Three action items the audience can do today'
      duration: { value: 2, unit: minutes }

design:
  theme:
    palette: ['#0f172a', '#6366f1', '#ffffff']
    typography:
      heading: Inter
      body: Inter
  layout:
    type: slide-deck
    dimensions: '16:9'

artifacts:
  slides:
    - path: 'slides/**/*.md'
      role: 'slide source content'
  assets:
    - path: 'assets/**'
      role: 'images and diagrams'

What this demonstrates: output.type: presentation, content.sections with duration, audience, and design. The reconciliation engine uses duration to detect if the talk now runs over time, and intent per section to detect off-message slides.


Legal document spec

A service agreement governed for compliance and clause integrity.

schema_version: '0.7'
domain: legal

intent: >
  A standard SaaS service agreement for enterprise customers. Governs payment
  terms, liability limits, data processing obligations, and termination rights.
  The spec tracks which clauses satisfy which regulatory requirements so that
  removing or weakening a clause triggers a compliance drift alert.

behaviors:
  - name: data_processing
    given: 'customer data is processed by the service'
    then: 'the DPA clause defines processing purposes, data categories, and sub-processor obligations per GDPR Article 28'
  - name: liability_cap
    given: 'a dispute arises'
    then: 'liability is capped at 12 months of fees paid, except for gross negligence or data breach'

constraints:
  - 'All clause changes must be reviewed by legal counsel before execution'
  - 'Governing law must match the entity jurisdiction for each signed copy'

invariants:
  - 'The DPA clause must never be removed from the agreement'
  - 'Liability cap language must reference the specific cap amount'

compliance:
  frameworks:
    - name: GDPR
      controls:
        - id: Art28
          satisfied_by:
            invariants:
              ['The DPA clause must never be removed from the agreement']
    - name: SOC2
      controls:
        - id: CC9.2
          satisfied_by:
            constraints:
              [
                'All clause changes must be reviewed by legal counsel before execution',
              ]
  audit_trail: true

output:
  type: document
  format: pdf

content:
  structure: ordered
  sections:
    - id: definitions
      type: clause
      intent: 'Define all capitalized terms used in the agreement'
    - id: services
      type: clause
      intent: 'Describe the scope and delivery of services'
    - id: payment
      type: clause
      intent: 'Payment terms, invoicing cycle, and late payment penalties'
    - id: dpa
      type: clause
      intent: 'Data Processing Agreement per GDPR Article 28'
      depends_on:
        - id: definitions
          relationship: 'References defined terms for data categories and processing'
    - id: liability
      type: clause
      intent: 'Limit liability to 12 months fees; carve out gross negligence and data breach'
    - id: termination
      type: clause
      intent: 'Termination for convenience (30 days notice) and for cause (material breach)'

design:
  layout:
    type: paginated
    dimensions: letter
  print:
    margins: { top: '1in', right: '1in', bottom: '1in', left: '1in' }
    headers: true
    footers: true
    page_numbers: true

artifacts:
  docs:
    - path: 'contracts/service-agreement.md'
      role: 'master agreement source'
  configs:
    - path: 'contracts/variables.yaml'
      role: 'per-customer variable substitutions (entity name, jurisdiction, fees)'

What this demonstrates: domain: legal, compliance.frameworks with control mappings, content.sections with type: clause and depends_on, design.print for paginated layout. The compliance block creates an explicit link between the GDPR requirement and the DPA clause – if someone removes the DPA clause, the reconciliation engine flags it as a high-priority drift event.


Research report spec

An evidence-backed technical report governed for citation integrity.

schema_version: '0.7'
domain: research

intent: >
  A technical report evaluating three approaches to LLM-assisted code review:
  prompt-only, RAG-augmented, and spec-anchored. Reports accuracy, latency, and
  reviewer acceptance metrics from a 90-day study across 12 repositories.

behaviors:
  - name: methodology_reproducible
    given: 'a reader follows the methodology section'
    then: 'they can reproduce the experimental setup using the linked code and dataset'
  - name: results_traceable
    given: 'a claim appears in the results section'
    then: 'it is linked to a specific row or aggregate in the dataset'

constraints:
  - 'All quantitative claims must cite a specific data source in evidence'
  - 'Comparison tables must include confidence intervals'
  - 'Methodology must describe exclusion criteria for repositories'

output:
  type: document
  format: pdf

content:
  structure: ordered
  sections:
    - id: abstract
      type: section
      intent: 'Summarize the study question, methods, and key finding in 150 words'
      duration: { value: 2, unit: minutes }
    - id: methodology
      type: section
      intent: 'Describe the 90-day study design, repository selection criteria, and evaluation metrics'
      content_ref: sections/methodology.md
      evidence:
        - type: reference
          ref: 'Chen et al. 2023 -- LLM code review benchmarks'
          claim: 'Our accuracy metric aligns with the Chen et al. framework'
          relationship: 'supports methodology choice'
    - id: results
      type: section
      intent: 'Present accuracy, latency, and acceptance metrics per approach with confidence intervals'
      content_ref: sections/results.md
      evidence:
        - type: data
          source: data/results_final.csv
          claim: 'Spec-anchored approach achieves 94% accuracy vs 81% for prompt-only'
          relationship: 'primary quantitative result'
        - type: data
          source: data/latency.csv
          claim: 'Median review latency under 4 seconds for all approaches'
    - id: discussion
      type: section
      intent: 'Interpret results, discuss limitations, and suggest future work'
      depends_on:
        - id: results
          relationship: 'Interpretation requires results to be finalized'
    - id: conclusion
      type: section
      intent: 'State the recommendation: spec-anchored review for accuracy-critical workflows'

feedback:
  metrics:
    - name: peer_review_score
      threshold: '>= 3.5 / 5'
    - name: reproduction_success_rate
      threshold: '>= 0.8'
  triggers:
    - condition:
        metric: peer_review_score
        operator: below_threshold
      action: reconcile
      priority: high

artifacts:
  docs:
    - path: 'sections/**/*.md'
      role: 'report section source content'
  data:
    - path: 'data/**/*.csv'
      role: 'experimental results datasets'
  configs:
    - path: 'analysis/**/*.py'
      role: 'analysis scripts that produce data/ outputs'

What this demonstrates: domain: research, content.sections with evidence entries linking claims to data sources, depends_on between sections, feedback.triggers for structured review thresholds, and duration for time-budgeted writing. When data/results_final.csv changes, the reconciliation engine flags the results section’s claim for review because it is linked via evidence.

Brownfield Adoption Guide

This guide walks through adopting NotarAI on an existing codebase. You do not need to spec everything at once. Start with the most critical modules and expand coverage incrementally.

Prerequisites

  • A Git repository with existing code
  • NotarAI installed (Installation)

Step 1: Initialize NotarAI

notarai init

This sets up the .notarai/ directory, validation hooks, slash commands, and MCP server configuration. See the Quick Start for details on what init creates.

Step 2: Create a system spec with broad exclusions

Start with a system spec that explicitly excludes files you do not want to track:

# .notarai/system.spec.yaml
schema_version: '0.8'
intent: >
  Top-level system spec for the project. Defines subsystem
  composition and excludes vendor, generated, and config files.
artifacts:
  configs:
    - path: 'package.json'
      role: 'package manifest'
exclude:
  - 'vendor/**'
  - 'node_modules/**'
  - 'dist/**'
  - 'build/**'
  - '.github/**'
  - '*.lock'
  - '*.config.*'

The exclude patterns use glob syntax. Files matching these patterns will not be flagged as “unspecced” by notarai check.

Step 3: Bootstrap specs for critical modules

Use the bootstrap interview to create specs for your 2-3 most important modules:

# In Claude Code:
/notarai-bootstrap

# Or for any agent:
notarai export-context --bootstrap | pbcopy

The bootstrap flow interviews you about the module’s purpose, behaviors, constraints, and invariants, then drafts a spec. You review and approve before anything is written.

Focus on modules where drift would cause the most damage: authentication, billing, core business logic.

Step 4: Run your first check

notarai check

This reports:

  • Coverage gaps: Files not governed by any spec (expected to be large initially)
  • Orphaned globs: Spec artifact patterns matching no files (should be zero for freshly created specs)

Do not try to eliminate all coverage gaps immediately. A large brownfield codebase will have many unspecced files, and that is fine. The goal is progressive coverage.

Step 5: Add specs as modules are touched

When you modify a module, that is the natural time to add a spec for it. The incremental approach:

  1. Before making changes, create a spec for the module (or use /notarai-bootstrap to interview about it)
  2. Make your code changes
  3. Run /notarai-reconcile to verify the spec still aligns
  4. Commit the spec alongside the code changes

Over time, your most-changed modules will naturally accumulate spec coverage.

Step 6: Set up CI drift detection

Add the NotarAI GitHub Action to your PR workflow:

# .github/workflows/notarai.yml
name: NotarAI Check
on:
  pull_request:
    branches: [main]

permissions:
  contents: read
  pull-requests: write

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: davidroeca/NotarAI/crates/notarai-action@v0.7.0

This runs notarai check on every PR and posts a comment summarizing findings. See the GitHub Action reference for configuration options.

Common pitfalls

Over-speccing too early. Do not try to write full behavioral specs for every module on day one. Start with tier: registered specs that just map intent to artifacts. Add behaviors later for critical paths. See Progressive Adoption.

Ignoring coverage gaps. Coverage gaps are warnings, not errors. They are informational. Do not suppress them globally; instead, add exclude patterns for directories you genuinely do not want to track (vendor, generated code).

Speccing generated code. Files generated by build tools, compilers, or code generators should be excluded (Tier 3) or tracked as derived (Tier 4). Do not write behavioral specs for generated output.

Realistic timeline

For a medium-sized project (50-100 source files, 5-10 logical modules):

  • Day 1: notarai init, system spec with exclusions, 2-3 module specs via bootstrap
  • Week 1-2: Add specs for modules you are actively working on
  • Month 1: 40-60% coverage of source files; CI drift checks running on PRs
  • Ongoing: Coverage grows naturally as modules are touched

There is no pressure to reach 100% coverage. Many teams find that 60-80% coverage of source files (with the remainder excluded or registered) provides the right balance of safety and maintenance cost.

Progressive Adoption

You do not need full behavioral specs to get value from NotarAI. This guide describes three maturity levels. Most teams should aim for Level 2 on critical modules and Level 1 everywhere else.

Level 1: Intent and artifacts only

The simplest useful spec. Maps files to a purpose without describing behaviors.

schema_version: '0.8'
intent: >
  HTTP client library with retry logic, connection pooling,
  and timeout configuration.
tier: registered
artifacts:
  code:
    - path: 'src/http/**/*.rs'
      role: 'HTTP client implementation'
  tests:
    - path: 'tests/http_*.rs'
      role: 'HTTP client integration tests'
  docs:
    - path: 'docs/http.md'
      role: 'HTTP client usage guide'

What you get at Level 1:

  • notarai check detects coverage gaps and orphaned globs
  • Reconciliation surfaces which files changed and which spec governs them
  • You have a searchable index of what each module owns

When Level 1 is enough: Utilities, configuration, internal tools, and any module where the intent is self-evident from the code.

Level 2: Add behaviors for critical paths

Add given/then behavior descriptions for the paths that matter most: error handling, security boundaries, data validation, and user-facing features.

schema_version: '0.8'
intent: >
  HTTP client library with retry logic, connection pooling,
  and timeout configuration.
behaviors:
  - name: retry_on_transient_failure
    given: 'a request fails with a 502, 503, or 429 status'
    then: 'retries up to 3 times with exponential backoff (1s, 2s, 4s)'

  - name: timeout_enforcement
    given: 'a request exceeds the configured timeout'
    then: 'aborts the request and returns a timeout error'

  - name: connection_pool_reuse
    given: 'multiple requests to the same host within the keep-alive window'
    then: 'reuses the existing TCP connection'
constraints:
  - 'All HTTP errors must be wrapped in a typed error enum'
  - 'Connection pool size must be configurable at initialization'
artifacts:
  code:
    - path: 'src/http/**/*.rs'
      role: 'HTTP client implementation'
  tests:
    - path: 'tests/http_*.rs'
      role: 'HTTP client integration tests'

What you get at Level 2:

  • Everything from Level 1
  • Reconciliation can detect when code behavior drifts from spec (e.g., retry logic changed but spec still says “3 retries”)
  • notarai lint flags incomplete behaviors (missing given or then)
  • New team members can read the spec to understand intended behavior without reading the full implementation

When to use Level 2: Business logic, APIs, authentication, data pipelines, and anything where behavioral correctness matters.

Level 3: Full SDD coverage

Add constraints, invariants, decisions, and cross-cutting concerns. This is the full power of the spec format.

schema_version: '0.8'
intent: >
  HTTP client library with retry logic, connection pooling,
  and timeout configuration.
behaviors:
  - name: retry_on_transient_failure
    given: 'a request fails with a 502, 503, or 429 status'
    then: 'retries up to 3 times with exponential backoff (1s, 2s, 4s)'
  - name: timeout_enforcement
    given: 'a request exceeds the configured timeout'
    then: 'aborts the request and returns a timeout error'
constraints:
  - 'All HTTP errors must be wrapped in a typed error enum'
  - 'Connection pool size must be configurable at initialization'
  - 'No unbounded retries: retry count must have a hard cap'
invariants:
  - 'A timed-out request must never silently succeed'
  - 'Connection pool must never leak file descriptors'
decisions:
  - date: '2026-03-15'
    choice: 'Use ureq (blocking) instead of reqwest (async)'
    rationale: >
      Project is synchronous throughout. Adding tokio for HTTP
      alone adds 200KB to the binary and complicates error handling.
artifacts:
  code:
    - path: 'src/http/**/*.rs'
      role: 'HTTP client implementation'
  tests:
    - path: 'tests/http_*.rs'
      role: 'HTTP client integration tests'

What you get at Level 3:

  • Everything from Level 2
  • Invariants serve as hard constraints that reconciliation checks against
  • Decisions capture the “why” behind architectural choices, preventing them from being unknowingly reversed
  • notarai lint checks decision freshness and rationale completeness

When to use Level 3: Core system components, security-critical modules, and modules where architectural decisions are frequently revisited or questioned.

Guidance

Start at Level 1 for everything, then promote. When you find yourself explaining a module’s behavior to a teammate or an LLM, that is a signal to promote it to Level 2.

Level 3 is not always better. Over-specified specs create maintenance burden. A utility module with three functions does not need invariants and decisions. Match the spec depth to the module’s complexity and risk.

You do not need Level 3 to get value from NotarAI. Most teams find that Level 2 on 5-10 critical modules plus Level 1 everywhere else provides the right balance.

See Brownfield Adoption for a step-by-step guide to getting started.

Severity Tiers

NotarAI classifies every check and lint finding into one of three severity tiers. Tiers help you prioritize: fix critical issues first, review drift during development, and handle housekeeping when convenient.

Tier definitions

Tier  Name          Meaning
----  ------------  ----------------------------------------------------
1     Critical      Broken references or structural violations.
                    Examples: orphaned glob (spec references deleted
                    code), circular $ref cycle, schema version mismatch
                    (L009), missing $ref target (L004)
2     Drift         Code changed in ways that may not align with spec.
                    Examples: files changed since last reconciliation,
                    Tier 1 spec with no behaviors (L001), duplicate
                    behavior names (L010)
3     Housekeeping  Documentation, style, or organizational misalignment.
                    Examples: coverage gaps, overlapping coverage,
                    incomplete behaviors, stale decisions (L006), open
                    questions (L007), broad globs (L008)

Tier assignment

Each check type and lint rule is assigned a fixed tier:

Critical: CircularRef, OrphanedGlob, L004, L009

Drift: ChangedSinceReconciliation, L001, L010

Housekeeping: CoverageGap, OverlappingCoverage, BehaviorIncomplete, L002, L003, L005, L006, L007, L008

Output

Human output groups findings by tier (critical first):

--- Critical (2 findings) ---
  Orphaned Globs (1 findings)
    error  : src/deleted/**/*.rs
             in .notarai/cli.spec.yaml
  Circular $ref Cycles (1 findings)
    error  : .notarai/a.spec.yaml
             Circular $ref chain: a -> b -> a

--- Drift (1 findings) ---
  Changed Since Last Reconciliation (1 findings)
    warning: src/lib.rs

--- Housekeeping (3 findings) ---
  Coverage Gaps (3 findings)
    warning: README.md
    warning: LICENSE
    warning: CONTRIBUTING.md

6 issues found (2 errors, 4 warnings).

JSON output includes a tier field on each finding:

{
  "findings": [
    {
      "type": "orphaned_glob",
      "severity": "error",
      "tier": "critical",
      "spec_path": ".notarai/cli.spec.yaml",
      "file_path": null,
      "glob_pattern": "src/deleted/**/*.rs",
      "message": "Artifact glob matches no files: src/deleted/**/*.rs"
    }
  ],
  "summary": { "errors": 1, "warnings": 0 }
}

Configuring CI thresholds

Create .notarai/check.yaml to control which tiers cause CI failures:

# Fail the check if any critical or drift finding is present.
fail_on: drift

# Only show critical and drift findings (suppress housekeeping).
warn_on: drift

fail_on: The minimum tier that causes a non-zero exit code. Accepts critical, drift, or housekeeping. When set, any finding at or above the configured tier (that is, with an equal or lower tier number) causes exit code 1. When omitted, the check fails only on error-severity findings.

warn_on: The minimum tier to include in output. Tiers below this threshold are suppressed entirely. Default: housekeeping (show everything).

The --strict flag overrides fail_on and causes any finding at all to produce exit code 1.
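The threshold semantics above can be sketched in a few lines. This is an illustrative model, not NotarAI's actual implementation; the function name and finding dictionaries are hypothetical, but the tier ordering and the `--strict` / default behaviors follow the descriptions above.

```python
# Lower tier number = more severe, per the tier definitions table.
TIER_ORDER = {"critical": 1, "drift": 2, "housekeeping": 3}

def exit_code(findings, fail_on=None, strict=False):
    """Model the documented exit-code rules for notarai check."""
    if strict and findings:
        return 1  # --strict: any finding at all fails
    if fail_on is None:
        # Default: fail only on error-severity findings
        return 1 if any(f["severity"] == "error" for f in findings) else 0
    threshold = TIER_ORDER[fail_on]
    # Fail on any finding at or above (equal or lower number) the threshold
    return 1 if any(TIER_ORDER[f["tier"]] <= threshold for f in findings) else 0

findings = [{"tier": "drift", "severity": "warning"}]
```

With `fail_on: drift`, the drift-tier warning above fails the check; under the default behavior it does not, because it is not error-severity.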

LLM reconciliation tiers

The /notarai-reconcile skill and notarai export-context prompt also use the three-tier classification. When the LLM evaluates semantic drift, it classifies each finding as:

  • Critical: Code contradicts a spec constraint, invariant, or behavior.
  • Drift: Code has changed in ways not reflected in the spec, or new code is not covered by any behavior.
  • Housekeeping: Documentation references outdated information, or style/naming diverges from spec conventions.

Drift Scoring

notarai score computes a numeric drift score for each spec so you can prioritize reconciliation work. Scores are deterministic, produced without LLM calls, and always informational (exit code 0).

Score range and status

Each spec gets a score in [0.0, 1.0]:

Range      Status   Meaning
---------  -------  -----------------------------------
0.0 - 0.3  healthy  Spec is aligned with code and docs.
0.3 - 0.6  review   Some drift signals; review soon.
0.6 - 1.0  overdue  High drift; reconcile now.

The overall score is the mean across all specs.

Signals

Six weighted signals contribute to each spec’s score. All weights are configurable via .notarai/scoring.yaml; defaults are shown below.

Signal                     Default weight  What it measures
-------------------------  --------------  ---------------------------------------------------
files_changed              0.30            Governed files changed since last reconciliation.
days_since_reconciliation  0.20            Days since the reconciliation state was updated.
unresolved_decisions       0.15            Proposed (not yet accepted) decisions for the spec.
orphaned_globs             0.15            Artifact glob patterns that match zero files.
open_questions             0.10            open_questions entries in the spec YAML.
unspecced_files            0.10            Tracked files in governed directories not covered.

Each signal is normalized to [0.0, 1.0] before being multiplied by its weight. The final score is clamped to 1.0.
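Given normalized signals, the score is a weighted sum clamped to 1.0, and the status follows the range table above. A minimal sketch, using the documented default weights and thresholds (the function names and example signal values are illustrative):

```python
# Default signal weights from the table above.
WEIGHTS = {
    "files_changed": 0.30,
    "days_since_reconciliation": 0.20,
    "unresolved_decisions": 0.15,
    "orphaned_globs": 0.15,
    "open_questions": 0.10,
    "unspecced_files": 0.10,
}

def drift_score(signals):
    """Weighted sum of normalized [0.0, 1.0] signals, clamped to 1.0."""
    raw = sum(WEIGHTS[name] * value for name, value in signals.items())
    return min(raw, 1.0)

def status(score):
    """Map a score to its status band per the range table."""
    if score < 0.3:
        return "healthy"
    if score < 0.6:
        return "review"
    return "overdue"

# Example: heavy recent churn, everything else quiet.
signals = {name: 0.0 for name in WEIGHTS}
signals["files_changed"] = 1.0            # all governed files changed
signals["days_since_reconciliation"] = 0.6
score = drift_score(signals)              # 0.30 + 0.12, about 0.42 (review)
```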

Usage

notarai score
notarai score --format json
notarai score --spec .notarai/cli.spec.yaml

Human output

Spec                                          Score   Status
-----------------------------------------------------------------
.notarai/cli.spec.yaml                        0.42    review
.notarai/system.spec.yaml                     0.18    healthy

Overall: 0.30 (review)

JSON output

{
  "specs": [
    {
      "spec_path": ".notarai/cli.spec.yaml",
      "score": 0.42,
      "status": "review"
    },
    {
      "spec_path": ".notarai/system.spec.yaml",
      "score": 0.18,
      "status": "healthy"
    }
  ],
  "overall": { "score": 0.3, "status": "review" }
}

Configuration

Create .notarai/scoring.yaml to override default weights:

files_changed: 0.4
days_since_reconciliation: 0.1
unresolved_decisions: 0.2
orphaned_globs: 0.1
open_questions: 0.1
unspecced_files: 0.1

Weights do not need to sum to 1.0, but keeping them normalized makes the resulting score easier to interpret.
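If custom weights drift away from summing to 1.0, renormalizing them restores the 0.0-1.0 interpretation. A small sketch (the helper is illustrative, not a NotarAI command):

```python
def normalize(weights):
    """Rescale weights so they sum to 1.0 while keeping their ratios."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# These two weights sum to 1.2; after normalizing they keep the same
# 2:1 ratio but sum to 1.0.
custom = {"files_changed": 0.8, "days_since_reconciliation": 0.4}
norm = normalize(custom)
```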

MCP integration

The initialize response from notarai mcp includes:

  • driftScore – overall score (0.0-1.0)
  • driftStatushealthy / review / overdue
  • mostDrifted – spec path with the highest score

This lets agents quickly orient without running a separate command.

Troubleshooting Reconciliation

Common issues encountered during reconciliation and how to resolve them.

“Reconciliation flags too many false positives”

Symptom: Reconciliation reports drift on files that have not meaningfully changed, or flags changes that are intentional.

Causes and fixes:

  • Overly broad artifact globs. A glob like src/**/*.rs may include utility files, generated code, or test helpers that change frequently without affecting the spec’s intent. Narrow the globs to match only the files the spec actually governs.

  • Missing exclude patterns. Generated files, lock files, and build artifacts should be excluded at the system spec level. Add patterns to the exclude array in your system spec.

  • Spec covers too much. If a single spec governs 50+ files, it will flag drift on every PR that touches any of them. Split large specs into focused subsystems.

“Reconciliation misses obvious drift”

Symptom: Code has clearly diverged from the spec, but reconciliation reports no findings.

Causes and fixes:

  • Stale cache. The BLAKE3 cache may show files as unchanged if they were marked reconciled after a previous session. Try running with bypass_cache: true on the MCP get_spec_diff tool, or clear the cache:

    notarai cache clear
    
  • Behaviors do not cover the changed area. Reconciliation evaluates drift against the behaviors listed in the spec. If the changed code is not described by any behavior, the LLM has nothing to compare against. Add behaviors for the critical paths you want monitored.

  • Wrong base branch. The reconciliation prompt compares against a base branch (default: main). If you are working on a long-lived feature branch, the diff may not include the changes you expect. Specify the correct base:

    notarai export-context --all --base-branch develop
    

“Context window is too large”

Symptom: The reconciliation prompt exceeds the LLM’s context window, or the LLM produces shallow analysis because the context is too dense.

Causes and fixes:

  • Use exclude_patterns in get_spec_diff. The MCP tool accepts an exclude_patterns array of glob strings that suppress noisy files from the diff output. Common candidates: lock files, snapshot files, auto-generated code.

  • Split large specs into subsystems. A spec governing 30+ files produces a large diff. Break it into focused subsystems with $ref links. Reconciliation processes each spec independently, keeping context proportional to each subsystem’s changes.

  • Let the cache work. Files that have not changed since the last reconciliation are automatically excluded from the diff. Run notarai cache status to verify the cache is populated. If it is empty, the first reconciliation will include everything; subsequent runs will be incremental.

“Spec and code intentionally diverged”

Symptom: You know the code has changed in ways that do not match the spec, and you want to acknowledge this without updating the spec immediately.

Fix: Add a decisions entry to the spec explaining the divergence:

decisions:
  - date: '2026-04-10'
    choice: 'Temporarily diverge from retry spec during migration'
    rationale: >
      The old retry logic is being replaced incrementally. The spec
      describes the target state. Code will converge over the next
      two sprints.

Then mark the affected files as reconciled so they do not trigger repeated warnings:

# Via MCP (in a Claude Code session):
# The mark_reconciled tool updates the cache for specified files.

# Via CLI:
notarai state snapshot

“notarai check reports errors but reconciliation says everything is fine”

Symptom: notarai check finds issues (orphaned globs, coverage gaps) but the LLM-based reconciliation does not mention them.

Explanation: notarai check runs deterministic, structural checks. LLM-based reconciliation evaluates semantic alignment. They are complementary:

  • check catches: orphaned globs, coverage gaps, overlapping specs, circular refs, stale cache entries
  • reconciliation catches: behavioral drift, outdated constraints, misaligned documentation

Run both. Use notarai check in CI for fast, deterministic feedback. Use /notarai-reconcile during development for deeper semantic analysis.

“Bootstrap interview produces a spec that does not validate”

Symptom: The spec generated by /notarai-bootstrap fails notarai validate.

Fix: This usually means the LLM produced a field or value not in the JSON Schema. Common issues:

  • Unknown artifact category. The schema allows code, docs, tests, configs, and several others. If the LLM invented a category like scripts, rename it to a supported one or use a custom key.

  • Missing required fields. Every spec needs schema_version, intent, and artifacts. If the LLM omitted one, add it.

  • Wrong schema_version. The LLM may have used an older version string. Run notarai schema-bump to update all specs to the current version.

CLI Commands

NotarAI is distributed as a single static binary with no runtime dependencies. All commands use the notarai prefix.

notarai validate

Validate spec files against the JSON Schema.

# Validate all specs in .notarai/ (default)
notarai validate

# Validate a specific file
notarai validate .notarai/auth.spec.yaml

# Validate a directory
notarai validate .notarai/subsystems/

Arguments:

Argument  Required  Description
--------  --------  ----------------------------------------------------
path      No        File or directory to validate. Defaults to .notarai/

Behavior:

  • Single file: validates against the schema, prints PASS or FAIL with indented errors.
  • Directory: recursively finds all .spec.yaml files and validates each.
  • No specs found: exits 0 with a warning on stderr.
  • Stale schema warning: if .notarai/notarai.spec.json exists but its $id differs from the bundled schema, prints a warning suggesting notarai init to update.

Exit codes: 0 all files pass, 1 any file fails.


notarai check

Deterministic, LLM-free drift detection. Reports coverage gaps, orphaned globs, changed files, overlapping coverage, circular $ref chains, and incomplete behaviors.

# Human-readable output (default)
notarai check

# JSON output
notarai check --format json

# Custom base branch
notarai check --base-branch develop

# Strict mode: promote all warnings to errors (useful for CI)
notarai check --strict

Arguments:

Flag           Required  Default  Description
-------------  --------  -------  --------------------------------------------------
--format       No        human    Output format: human or json
--base-branch  No        main     Base branch for changed-file detection
--strict       No        false    Promote all warnings to errors (zero-tolerance CI)

Checks performed:

Check                         Severity  Tier          Description
----------------------------  --------  ------------  ---------------------------------------------------------
Orphaned globs                Error     Critical      Artifact glob patterns matching zero files
Circular $ref chains          Error     Critical      Cycles in subsystems, applies, or dependencies references
Changed since reconciliation  Warning   Drift         Governed files changed since last cache update
Coverage gaps                 Warning   Housekeeping  Tracked files not governed by any spec (minus excludes)
Overlapping coverage          Warning   Housekeeping  Files governed by two or more specs
Behavior completeness         Warning   Housekeeping  Behaviors missing a given or then field
T001 Test coverage missing    Warning   Housekeeping  Tier-1 behavior without a tested_by entry
T002 Test path missing        Error     Critical      tested_by.path does not exist on disk

Lint rules (L001-L011) are also run and merged into check output. See Lint Rules.

Severity tiers: Each finding is classified as Critical, Drift, or Housekeeping. Human output groups findings by tier. JSON output includes a tier field. See Severity Tiers for details.

Configuration: Create .notarai/check.yaml to control CI thresholds:

fail_on: drift # Fail on critical or drift findings
warn_on: drift # Suppress housekeeping from output

With --strict, all warning-severity findings are promoted to errors and any finding causes exit code 1.

The check command never modifies files or the cache database.

Exit codes: 0 no error-severity findings (or no findings at or above fail_on tier), 1 errors found (including warnings promoted under --strict), 2 not initialized (.notarai/ missing).


notarai lint

Lint spec files for quality issues beyond JSON Schema conformance. A superset of notarai validate that checks semantic quality.

# Human-readable output (default)
notarai lint

# JSON output
notarai lint --format json

Flag      Default  Description
--------  -------  ----------------------------
--format  human    Output format: human or json

Runs 11 deterministic rules (L001-L011) covering missing behaviors, broken $ref targets, stale decisions, schema mismatches, and more. Rules can be configured via .notarai/lint.yaml. Lint results are also integrated into notarai check.

See Lint Rules for the full rule reference.

Exit codes: 0 no error-severity findings, 1 errors found, 2 not initialized.


notarai decisions

Manage decision proposals from reconciliation. Proposals are stored in .notarai/decision-log.json and can be accepted (appended to the spec’s decisions array) or rejected (marked in the log with an optional reason).

notarai decisions list

# List all decisions
notarai decisions list

# Filter by status
notarai decisions list --status proposed

Flag      Default  Description
--------  -------  ---------------------------------------
--status  (all)    Filter: proposed, accepted, or rejected

notarai decisions accept

notarai decisions accept .notarai/auth.spec.yaml 0

Accepts the proposal at the given index: removes it from the log, appends { date, choice, rationale } to the spec’s YAML decisions array, and validates the spec afterward.

notarai decisions reject

notarai decisions reject .notarai/auth.spec.yaml 0 --reason "Not relevant"

Marks the proposal as rejected in the log. Does not modify the spec. The optional --reason flag records why the decision was rejected.

Exit codes: 0 success, 1 error, 2 not initialized.


notarai score

Compute drift scores for each spec. Deterministic, no LLM calls. Exit code is always 0 (informational). See the Drift Scoring guide for signal details and configuration.

notarai score
notarai score --format json
notarai score --spec .notarai/cli.spec.yaml

Flag      Default  Description
--------  -------  -----------------------------
--format  human    Output format: human or json.
--spec    (all)    Score a single spec by path.

Scores are in [0.0, 1.0] with thresholds: < 0.3 healthy, < 0.6 review, otherwise overdue.


notarai init

Set up NotarAI in a project. Running init again is safe: it always refreshes skills and the schema copy.

# Interactive prompt (defaults to claude)
notarai init

# Explicit agent selection
notarai init --agents claude
notarai init --agents opencode
notarai init --agents claude,gemini

# All known adapters
notarai init --agents all

# Agent-agnostic artifacts only (no adapter-specific setup)
notarai init --agents none

# Deprecated alias (claude -> claude, generic -> opencode)
notarai init --agent claude
notarai init --agent generic

Arguments:

Flag      Required  Description
--------  --------  ---------------------------------------------------
--agents  No        Comma-separated list of agents: claude, gemini,
                    codex, opencode, plus meta-tokens all and none.
                    Prompts interactively if omitted and stdin is a
                    TTY; auto-detects if stdin is not a TTY
--agent   No        Deprecated alias for --agents. claude maps to
                    --agents claude; generic maps to --agents opencode

Shared setup (all modes):

  1. Copies notarai.spec.json to .notarai/notarai.spec.json (always refreshed).
  2. Writes .notarai/README.md with workflow instructions (always overwritten).
  3. Writes .notarai/reconcile-prompt.md (reconciliation prompt template).
  4. Writes .notarai/bootstrap-prompt.md (bootstrap prompt template).
  5. Appends .notarai/.cache/ to .gitignore.
  6. Writes .mcp.json registering notarai mcp as a local MCP server.
  7. Writes or section-merges AGENTS.md so user content outside the ## NotarAI section is preserved.

Per-adapter setup (for each selected adapter):

  1. If the adapter declares a pointer file (CLAUDE.md, GEMINI.md), creates it as a single-line @AGENTS.md stub when absent, leaves it unchanged when it already contains @AGENTS.md, or section-merges a ## NotarAI block when it has other content.
  2. If the adapter declares a skills directory, always overwrites SKILL.md for notarai-reconcile and notarai-bootstrap (Claude-flavor for the claude adapter, generic-flavor for all others).
  3. If the adapter declares a hook installer, installs it (only claude installs a PostToolUse hook in .claude/settings.json).

Exit codes: 0 success, 1 error (unparseable JSON, unknown agent, symlink pointer file, non-directory skills path).


notarai export-context

Export reconciliation context for any LLM agent. Outputs spec content, changed files, and diffs in a format suitable for feeding into a reconciliation prompt.

# Single spec, markdown output (default)
notarai export-context --spec .notarai/auth.spec.yaml

# All affected specs, JSON output
notarai export-context --all --format json

# Custom base branch
notarai export-context --spec .notarai/api.spec.yaml --base-branch develop

Arguments:

Flag           Required        Default   Description
-------------  --------------  --------  -------------------------------------
--spec         One of the two            Path to a single spec file
--all          One of the two            Export context for all affected specs
--base-branch  No              main      Base branch for diff
--format       No              markdown  Output format: markdown or json

Exactly one of --spec or --all is required.

Markdown output fills the bundled reconcile-prompt.md template with spec content, changed file list, and diff. Multiple specs are separated by ---.

JSON output includes spec_path, spec_name, spec_content, changed_files, diff, binary_changes, and file_categories. A single spec produces an object; --all with multiple specs produces an array.
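Because a single spec produces an object and --all with multiple specs produces an array, a consumer script should handle both shapes. A minimal sketch (load_contexts is an illustrative helper, not part of notarai; the field names follow the description above):

```python
import json

def load_contexts(raw):
    """Normalize notarai export-context JSON output to a list of specs."""
    data = json.loads(raw)
    return data if isinstance(data, list) else [data]

# --spec yields a single object; --all may yield an array.
single = '{"spec_path": ".notarai/auth.spec.yaml", "changed_files": []}'
many = '[{"spec_path": "a.spec.yaml"}, {"spec_path": "b.spec.yaml"}]'
```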

Exit codes: 0 success, 1 error (bad arguments, missing spec, git failure), 2 not initialized (.notarai/ missing).


notarai schema-bump

Update the schema version across all specs in the project.

notarai schema-bump

Detects the schema version in .notarai/notarai.spec.json (if it exists) and compares it to the bundled schema. If they differ:

  1. Overwrites .notarai/notarai.spec.json with the bundled schema.
  2. Updates the schema_version field in every .notarai/*.spec.yaml file.
  3. Validates all updated specs and reports any failures.

If versions already match, prints “Already at current schema version” and exits 0.

Exit codes: 0 success or already current, 1 validation error after update.


notarai hook validate

PostToolUse hook handler. Validates spec files when Claude Code writes or edits them.

# Called automatically by Claude Code, not typically invoked manually
notarai hook validate

Reads PostToolUse JSON from stdin. If the file path matches .notarai/**/*.spec.yaml, reads the file from disk and validates it. Invalid specs block the tool use with errors on stderr.

Behavior:

Stdin                                     Result
----------------------------------------  -----------------------------------------
Spec file path (.notarai/**/*.spec.yaml)  Validates; exits 1 with errors if invalid
Non-spec file path                        Exits 0 silently
Invalid JSON or missing file              Exits 0 silently (graceful degradation)

Exit codes: 0 valid or non-spec file, 1 invalid spec.


notarai cache

BLAKE3 + SQLite hash cache for tracking file changes between reconciliation runs. The cache database lives at .notarai/.cache/notarai.db.

notarai cache status

Show cache status: database path, entry count, and newest entry timestamp.

notarai cache status

Creates an empty database if none exists.

Exit codes: 0 success, 1 error.

notarai cache clear

Delete the cache database.

notarai cache clear

Prints Cache cleared, or Cache not initialized if the database file did not exist (in which case the command is a no-op).

Exit codes: 0 success, 1 error.


notarai state

Manage the persistent reconciliation state file (.notarai/reconciliation_state.json). The state file records the last reconciliation timestamp, git hash, branch, and BLAKE3 fingerprints for all governed files and specs. It can be committed to the repo to give collaborators a baseline.

notarai state show

Display the current reconciliation state.

notarai state show

Prints the timestamp, git hash, branch, and counts of tracked files and specs. Prints No reconciliation state found. if no state file exists.

Exit codes: 0 success, 1 error.

notarai state reset

Delete the reconciliation state file, forcing the next reconciliation to treat everything as changed.

notarai state reset

Prints Reconciliation state reset. or No reconciliation state to reset. (if the file didn’t exist).

Exit codes: 0 success, 1 error.

notarai state snapshot

Build a new state snapshot from the current SQLite cache and save it to .notarai/reconciliation_state.json.

notarai state snapshot

Reads all entries from the cache, partitions them into file fingerprints and spec fingerprints, captures the current git HEAD and branch, and writes the result. This is the CLI equivalent of the snapshot_state MCP tool.

Exit codes: 0 success, 1 error.
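The partitioning step can be modeled as follows. This is an illustrative Python sketch, not the Rust implementation; the function name is hypothetical, and it assumes specs are identified by living under .notarai/ with a .spec.yaml suffix, as the docs describe:

```python
def partition_fingerprints(cache_entries: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Split cached {path: hash} entries into (file, spec) fingerprint maps.

    Spec files live under .notarai/ and end in .spec.yaml; everything else
    counts as a governed artifact file.
    """
    specs = {
        path: digest
        for path, digest in cache_entries.items()
        if path.startswith(".notarai/") and path.endswith(".spec.yaml")
    }
    files = {p: d for p, d in cache_entries.items() if p not in specs}
    return files, specs
```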


notarai update

Check for and install updates.

# Check if an update is available
notarai update --check

# Update to the latest version
notarai update

Arguments:

| Flag | Required | Description |
|---|---|---|
| `--check` | No | Only check, don’t install |

Behavior:

The command queries the GitHub API for the latest release, compares its version against the current binary, and prints the result. Without --check, it also attempts to install the update:

| Install method | Detection | Action |
|---|---|---|
| GitHub Release | Binary is not in `.cargo/bin` or `target/` | Downloads and replaces the binary in place |
| cargo install | Binary path contains `.cargo/bin` | Prints `cargo install notarai` |
| Dev build | Debug build or path contains `target/` | Prints `cargo install --path crates/notarai` |
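The detection logic amounts to a path classification. A minimal Python sketch of the decision table (the function name and return labels are illustrative, not the tool’s actual internals):

```python
def detect_install_method(binary_path: str, debug_build: bool = False) -> str:
    """Classify the running binary's install method, mirroring the table above."""
    if debug_build or "target/" in binary_path:
        return "dev-build"       # suggests: cargo install --path crates/notarai
    if ".cargo/bin" in binary_path:
        return "cargo-install"   # suggests: cargo install notarai
    return "github-release"      # safe to download and replace in place
```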

Passive update hints:

notarai validate and notarai init automatically check for updates in the background using a global cache with a 24-hour TTL and a 5-second network timeout. If a newer version is available, a one-line hint is printed to stderr. All errors are silently swallowed — the hint never interferes with normal output.

Exit codes: 0 success or up to date, 1 error or update failure.


notarai mcp

Start a synchronous JSON-RPC 2.0 MCP server over stdio. Typically configured automatically by notarai init rather than invoked manually.

notarai mcp

The server reads JSON-RPC messages line-by-line from stdin and writes responses to stdout. It exits cleanly on stdin EOF.

Protocol: JSON-RPC 2.0 over stdio (synchronous, no async runtime).
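A client speaks to the server by writing one JSON-RPC object per line and reading one response line back. A minimal Python sketch of that framing (helper names are illustrative; `tools/list` is a standard MCP method):

```python
import json

def frame_request(request_id: int, method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request as a single newline-terminated line."""
    msg = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(msg) + "\n"

def parse_response(line: str) -> dict:
    """Parse one response line; raise on a JSON-RPC error object."""
    msg = json.loads(line)
    if "error" in msg:
        raise RuntimeError(f"{msg['error']['code']}: {msg['error']['message']}")
    return msg["result"]
```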

Setup: notarai init writes .mcp.json to the project root, which Claude Code reads to auto-start the server:

{
  "mcpServers": {
    "notarai": {
      "type": "stdio",
      "command": "notarai",
      "args": ["mcp"]
    }
  }
}

See the MCP Server reference for the full tool API, parameters, and return shapes.

Exit codes: 0 on stdin EOF.

Lint Rules

notarai lint checks spec quality beyond JSON Schema conformance. Each rule has a stable ID, a default severity, and a description. Rules can be configured per-project via .notarai/lint.yaml.

Rules

| Rule | Default Severity | Description |
|---|---|---|
| L001 | error | Tier 1 (full) spec has zero behaviors. A full spec should describe at least one behavior. |
| L002 | warning | Behavior missing `given` field. The trigger condition is unspecified. |
| L003 | warning | Behavior missing `then` field. The expected outcome is unspecified. |
| L004 | error | `$ref` target file does not exist on disk. A subsystem, applies, or dependency reference points to a missing file. |
| L005 | warning | Circular `$ref` dependency detected. Spec reference chains form a cycle. |
| L006 | warning | Decision older than 90 days with no rationale. Stale decisions without context lose value over time. |
| L007 | info | Spec has `open_questions`. Consider resolving before reconciliation. |
| L008 | warning | Artifact glob is `**/*` (overly broad). Likely matches unintended files. |
| L009 | error | `schema_version` does not match the bundled schema. Run `notarai schema-bump` to update. |
| L010 | warning | Duplicate behavior names within a spec. Each behavior should have a unique name. |
| L011 | error | Cross-cutting spec (`cross_cutting: true`) referenced from another spec’s subsystems. Move it to `applies`. |

Usage

# Human-readable output (default)
notarai lint

# JSON output for CI
notarai lint --format json

Exit codes:

  • 0: No error-severity findings.
  • 1: At least one error-severity finding.
  • 2: .notarai/ directory not found.

Configuration

Create .notarai/lint.yaml to customize rule behavior:

rules:
  L006:
    severity: info # downgrade from warning
    decision_age_days: 180 # custom threshold (default: 90)
  L007:
    enabled: false # disable rule entirely
  L008:
    severity: error # promote to error

Each rule supports:

  • enabled (bool): Set to false to disable the rule. Default: true.
  • severity (string): Override the default severity. Values: error, warning, info.
  • Rule-specific parameters (e.g., decision_age_days for L006).

Integration with notarai check

Lint rules run automatically as part of notarai check. Findings from L002, L003, and L005 are deduplicated against their check equivalents (BehaviorIncomplete and CircularRef). All other lint errors count as check errors and affect the exit code.

JSON Output Format

{
  "findings": [
    {
      "rule_id": "L001",
      "severity": "error",
      "spec_path": ".notarai/auth.spec.yaml",
      "message": "Tier 1 (full) spec has zero behaviors: .notarai/auth.spec.yaml"
    }
  ],
  "summary": {
    "errors": 1,
    "warnings": 0,
    "infos": 0
  }
}

Rule IDs are stable across versions and will never be renumbered.
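A CI consumer only needs the summary to reproduce the exit-code policy. A short Python sketch of parsing the JSON output above (the function name is illustrative; exit code 2 is omitted since it signals a missing `.notarai/` directory, not a report):

```python
import json

def lint_exit_code(report_json: str) -> int:
    """Map a `notarai lint --format json` report to exit code 0 or 1:
    1 when at least one error-severity finding exists, else 0."""
    report = json.loads(report_json)
    return 1 if report["summary"]["errors"] > 0 else 0
```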

MCP Server

NotarAI includes a built-in Model Context Protocol (MCP) server that serves pre-filtered diffs and change data to the reconciliation engine. This keeps context usage proportional to what actually changed rather than the full repository.

Setup

notarai init writes an .mcp.json file to the project root that registers the MCP server:

{
  "mcpServers": {
    "notarai": {
      "type": "stdio",
      "command": "notarai",
      "args": ["mcp"]
    }
  }
}

Claude Code reads this file and starts the server automatically. No manual configuration needed.

Protocol

  • Transport: stdio (stdin/stdout)
  • Format: JSON-RPC 2.0, one message per line
  • Execution: synchronous (no async runtime)
  • Protocol version: 2024-11-05

Initialize response

The initialize response includes standard MCP fields (protocolVersion, capabilities, serverInfo, tools). When the local schema (.notarai/notarai.spec.json) is out of date relative to the bundled schema, the response includes an additional schemaNote field:

{
  "schemaNote": "Schema is out of date (local: .../0.5/..., bundled: .../0.6/...). Run `notarai init` to update."
}

This surfaces schema staleness to Claude at session start without requiring a separate check.

When the project’s NotarAI configs are behind the running CLI version (detected via the version in .notarai/README.md), the response includes an additional projectNote field:

{
  "projectNote": "hint: project was initialized with notarai v0.3.1. Run `notarai init` to update project configs to v0.3.2."
}

This surfaces project config staleness to Claude at session start so reconciliation uses up-to-date slash commands and schema.

The response also includes a drift score snapshot so agents can prioritize reconciliation work without calling a separate tool:

{
  "driftScore": 0.42,
  "driftStatus": "review",
  "mostDrifted": ".notarai/cli.spec.yaml"
}

See the Drift Scoring guide for signal details.

Tools

list_affected_specs

Identify which specs govern files that changed on the current branch relative to a base branch.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `base_branch` | string | Yes | Branch to diff against (e.g., `"main"`) |

Returns:

{
  "changed_files": ["src/auth.rs", "src/main.rs"],
  "affected_specs": [
    {
      "spec_path": ".notarai/cli.spec.yaml",
      "behaviors": [],
      "constraints": [],
      "invariants": []
    }
  ]
}

Each affected spec includes its behaviors, constraints, and invariants so the reconciliation engine has the context to evaluate drift without additional file reads.


get_spec_diff

Get the git diff filtered to files governed by a specific spec. Uses the hash cache to skip files that haven’t changed since the last reconciliation.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `spec_path` | string | Yes | Relative path to the spec file |
| `base_branch` | string | Yes | Branch to diff against |
| `exclude_patterns` | string[] | No | Glob patterns to exclude via git `:(exclude)` pathspecs (e.g., `["Cargo.lock", "*.lock"]`) |
| `bypass_cache` | boolean | No | If true, diff all governed files regardless of cache state. Defaults to false |

Returns:

{
  "diff": "unified diff of non-spec governed files...",
  "files": ["src/auth.rs"],
  "skipped": ["src/utils.rs"],
  "excluded": ["Cargo.lock"],
  "spec_changes": [
    {
      "path": ".notarai/cli.spec.yaml",
      "content": "full file content..."
    }
  ],
  "system_spec": {
    "path": ".notarai/system.spec.yaml",
    "content": "full file content..."
  },
  "binary_changes": ["assets/logo.png", "slides/deck.pptx"],
  "file_categories": {
    "src/auth.rs": "code",
    "docs/auth.md": "docs",
    "assets/logo.png": "assets"
  },
  "spec_invalidated": ["src/utils.rs"]
}

| Field | Description |
|---|---|
| `diff` | Unified diff output for non-spec, non-binary artifact files only |
| `files` | Non-spec files included in the diff (includes binary files by path, but their content is in `binary_changes`) |
| `skipped` | Non-spec files whose BLAKE3 hash matched the cache (already reconciled) |
| `excluded` | Patterns passed via `exclude_patterns` |
| `spec_changes` | Array of `{path, content}` for each governed `.notarai/**/*.spec.yaml` file that changed |
| `system_spec` | The system spec (the spec with a `subsystems` key), included whenever `spec_changes` is non-empty; `null` otherwise |
| `binary_changes` | File paths of binary files (images, PPTX, PDF, etc.) whose content cannot be usefully diffed |
| `file_categories` | Object mapping each changed file path to its artifact category from the spec (e.g., `"code"`, `"docs"`, `"assets"`) |
| `spec_invalidated` | Cached artifact paths whose governing spec has changed since the last reconciliation, indicating they need review even though the artifacts themselves have not changed on disk. Empty when `bypass_cache` is true |

Why full content for spec files?

Spec files express intent, not implementation. The reconciliation engine needs the complete spec to evaluate drift – diff hunks showing only changed lines lack the context to determine whether behavior is still satisfied. Returning full content also avoids the ambiguity of partial context when the spec is the source of truth.

Spec deduplication: If the system spec itself changed, it appears in spec_changes with full content and system_spec contains only {path} (a reference) to avoid duplicating the content.

Cache behavior:

  • Files whose on-disk BLAKE3 hash matches the cached hash are listed in skipped (for artifact files) or omitted from spec_changes (for spec files).
  • A cold or absent cache causes all governed files to be included. This is a safe fallback that ensures nothing is missed.
  • bypass_cache: true forces a full diff without destroying the cache (useful for re-checking everything).
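The skip/diff decision reduces to a hash comparison per governed file. An illustrative Python model of the cache behavior above (the function name is hypothetical; the empty dict stands in for a cold or absent cache):

```python
def partition_by_cache(
    governed: dict[str, str],
    cache: dict[str, str],
    bypass_cache: bool = False,
) -> tuple[list[str], list[str]]:
    """Split governed {path: current_hash} files into (to_diff, skipped).

    A cold or absent cache (empty dict) matches nothing, so every file is
    diffed -- the safe fallback described above.
    """
    if bypass_cache:
        return sorted(governed), []
    to_diff = sorted(p for p, h in governed.items() if cache.get(p) != h)
    skipped = sorted(p for p, h in governed.items() if cache.get(p) == h)
    return to_diff, skipped
```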

get_changed_artifacts

Get artifact files governed by a spec that have changed since the last cache update. Useful for identifying which docs or other artifacts need review during reconciliation.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `spec_path` | string | Yes | Relative path to the spec file |
| `artifact_type` | string | No | Filter by artifact type (e.g., `"docs"`, `"code"`, `"configs"`) |

Returns:

{
  "changed_artifacts": ["docs/auth.md", "docs/api-reference.md"],
  "spec_invalidated": ["docs/overview.md"]
}

| Field | Description |
|---|---|
| `changed_artifacts` | Files whose on-disk content differs from the cached hash |
| `spec_invalidated` | Cached artifact paths whose governing spec has changed since the last reconciliation, indicating they need review even though the artifacts themselves have not changed on disk |

If no artifact_type is specified, all artifact types are checked. When artifact_type is set, both changed_artifacts and spec_invalidated are filtered to that type.


mark_reconciled

Update the hash cache after reconciliation is complete. Call this at the end of a reconciliation pass so that subsequent runs skip files that haven’t changed.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `files` | string[] | Yes | Relative file paths to cache |

Returns:

{
  "updated": 5
}

Files are hashed with BLAKE3 and stored with their relative paths as cache keys. Non-existent files are silently skipped.


clear_cache

Delete the reconciliation cache database, forcing the next get_spec_diff call to diff all governed files.

Parameters: None.

Returns:

{
  "cleared": true
}

Returns true if the database was deleted, false if it didn’t exist.


snapshot_state

Persist the current reconciliation cache as a state snapshot at .notarai/reconciliation_state.json. Call this at the end of a successful reconciliation pass.

Parameters: None.

Returns:

{
  "state_path": ".notarai/reconciliation_state.json",
  "files": 42,
  "specs": 5,
  "git_hash": "a1b2c3d..."
}

| Field | Description |
|---|---|
| `state_path` | Absolute path where the state file was written |
| `files` | Number of non-spec file fingerprints stored |
| `specs` | Number of spec fingerprints stored |
| `git_hash` | git HEAD at snapshot time (empty string if not in a repo) |

The state file is pretty-printed JSON and safe to commit. It gives collaborators a baseline so subsequent get_spec_diff calls can skip files that haven’t changed since the last reconciliation. Use notarai state show / notarai state reset to inspect or clear state from the CLI.

Cache semantics

The cache is a SQLite database at .notarai/.cache/notarai.db with a single table:

file_cache(path TEXT PRIMARY KEY, blake3_hash TEXT, updated_at INTEGER)

Key details:

  • Hash algorithm: BLAKE3 – fast cryptographic hash.
  • Path format: MCP tools use relative paths as cache keys. Seed the MCP cache via mark_reconciled, not notarai cache update.
  • Cold cache: When the cache is empty or absent, get_spec_diff diffs all governed files. This is the safe default.
  • Cache location: .notarai/.cache/ is gitignored by notarai init so the cache is never committed.
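The table and `mark_reconciled` semantics can be modeled directly with Python’s stdlib. A sketch only: BLAKE3 is not in Python’s standard library, so SHA-256 stands in for the hash here, and the helper names are illustrative:

```python
import hashlib
import sqlite3
import time
from pathlib import Path

def open_cache(db_path: str) -> sqlite3.Connection:
    """Open (or create) the single-table cache described above."""
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS file_cache (
        path TEXT PRIMARY KEY, blake3_hash TEXT, updated_at INTEGER)""")
    return conn

def mark_reconciled(conn: sqlite3.Connection, paths: list[str]) -> int:
    """Hash each existing file and upsert it; missing files are skipped."""
    updated = 0
    for rel in paths:
        p = Path(rel)
        if not p.is_file():
            continue  # non-existent files are silently skipped
        digest = hashlib.sha256(p.read_bytes()).hexdigest()  # stand-in for BLAKE3
        conn.execute("INSERT OR REPLACE INTO file_cache VALUES (?, ?, ?)",
                     (rel, digest, int(time.time())))
        updated += 1
    conn.commit()
    return updated
```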

Error codes

| Code | Meaning |
|---|---|
| -32700 | Parse error (malformed JSON) |
| -32601 | Method not found |
| -32602 | Invalid params (missing required parameter) |
| -32603 | Internal error (git failure, file I/O, cache unavailable) |

GitHub Action

NotarAI provides a composite GitHub Action that runs notarai check on pull requests and posts a summary comment. No Rust toolchain is required on the runner.

Setup

Add a workflow file to your repository:

# .github/workflows/notarai.yml
name: NotarAI Check
on:
  pull_request:
    branches: [main]

permissions:
  contents: read
  pull-requests: write

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: davidroeca/NotarAI/crates/notarai-action@v0.7.0

The action downloads the notarai binary from GitHub Releases, runs the check, and posts (or updates) a PR comment with the results.

Inputs

| Input | Default | Description |
|---|---|---|
| `version` | latest | NotarAI version to install |
| `base-branch` | main | Branch to diff against for changed-file detection |
| `strict` | false | Promote all warnings to errors (fail on any drift) |
| `comment` | true | Post a PR comment with findings |

What it does

  1. Detects platform: determines runner OS and architecture.
  2. Downloads binary: fetches the matching notarai release binary from GitHub Releases.
  3. Runs check: executes notarai check --format json --base-branch <base-branch> (with --strict if enabled).
  4. Posts comment: renders findings into a Markdown comment grouped by type, with collapsible details.
  5. Sets exit code: fails the step if any error-severity findings are present.

PR comment

The comment includes a summary line and collapsible sections for each finding type:

## NotarAI Drift Check

113 finding(s) | 0 error(s) | 113 warning(s)

> Orphaned globs (1)
>   - Artifact glob matches no files: src/legacy/** (in legacy.spec.yaml)

> Coverage gaps (3)
>   - File not governed by any spec: src/new_module.rs
>   - ...

Re-runs update the existing comment in place rather than posting duplicates. The comment is identified by a <!-- notarai-action --> HTML marker.
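The find-or-update logic keyed on that marker can be sketched as a pure function over the PR’s existing comments. Illustrative Python only; the real action talks to the GitHub API, and the comment dict shape here is an assumption:

```python
MARKER = "<!-- notarai-action -->"

def upsert_comment(existing: list[dict], body: str) -> tuple[str, dict]:
    """Decide whether to update the marked comment or create a new one.

    `existing` is a list of {"id": ..., "body": ...} PR comments; returns
    ("update", {"id", "body"}) or ("create", {"body"}).
    """
    new_body = f"{MARKER}\n{body}"
    for comment in existing:
        if MARKER in comment["body"]:
            return "update", {"id": comment["id"], "body": new_body}
    return "create", {"body": new_body}
```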

Strict mode

Use strict: 'true' to fail the check on any finding, not just errors:

- uses: davidroeca/NotarAI/crates/notarai-action@v0.7.0
  with:
    strict: 'true'

This is useful for repositories that want zero-tolerance drift detection in CI.

Uninitialized repositories

If the repository does not have a .notarai/ directory, the action posts a comment noting that NotarAI is not initialized and exits successfully (does not fail the workflow).

Requirements

  • Runs on ubuntu-latest (Linux x86_64 or aarch64).
  • No Rust toolchain required on the runner.
  • Requires pull-requests: write permission for posting comments.
  • Uses GITHUB_TOKEN (automatic) for downloading releases and posting comments.

Motivation

The problem

With LLMs generating both code and documentation from natural language prompts, there’s no authoritative representation of intent that persists across changes. Code and docs drift out of sync – and unlike the pre-LLM era where code was the single source of truth, now either artifact can be the one that’s “right.” This is the three-body problem: intent, code, and docs can all diverge.

The idea

Introduce a NotarAI spec – a structured YAML document governed by a JSON Schema – that captures user intent as the canonical source of truth. An LLM acts as the reconciliation engine, keeping code and documentation in sync with the spec (and vice versa).

Coverage model

Four tiers ensure every file in the repo is accounted for without over-specifying:

  • Tier 1 (Full Spec): Business logic, APIs, user-facing features – full behaviors and constraints
  • Tier 2 (Registered): Utility libs, sidecars, config – just intent + artifact mapping, no behaviors
  • Tier 3 (Excluded): Generated code, vendor deps, editor configs – explicitly out of scope
  • Tier 4 (Derived): Generated outputs tracked for staleness but not authored directly (build artifacts, compiled bundles)

Anything not covered by any tier is flagged as “unspecced” – a lint warning, not a blocker.

Bootstrap

For existing codebases: ingest code + docs + commit history, then the LLM interviews the developer about goals and undocumented rules, drafts a spec with required fields only, and the user reviews and enriches. The spec accrues precision over time.

Inspirations

See the Inspirations page.

Design Diagrams

All diagrams from the design process, illustrating the NotarAI name and .notarai/ directory convention.


1. The Problem: Pre-LLM vs Current LLM Era

1a. Pre-LLM: Code Is the Spec

flowchart LR
    Dev["Developer<br/>(intent in head)"]
    Code["Source Code<br/>authoritative spec"]
    Docs["Docs<br/>second-class, often stale"]

    Dev -->|writes| Code
    Code -.->|describes| Docs

1b. Current LLM Era: The Three-Body Problem

flowchart TD
    Intent["User Intent<br/>natural language prompt"]
    LLM["LLM"]
    Code["Source Code"]
    Docs["Documentation"]

    Intent --> LLM
    Intent -.->|"edits directly"| Code
    Intent -.->|"edits directly"| Docs
    LLM -->|generates| Code
    LLM -->|generates| Docs
    Code <-.->|"drift / desync"| Docs

2. NotarAI: Spec State File as Single Source of Truth

flowchart TD
    Intent["User Intent<br/>natural language"]
    Spec["NotarAI Spec<br/>structured intent representation<br/>canonical source of truth"]
    LLM["LLM (sync engine)"]
    Code["Source Code"]
    Docs["Documentation"]

    Intent -->|updates| Spec
    Spec -->|reads| LLM
    LLM -->|derives| Code
    LLM -->|derives| Docs
    Code -.->|reconcile back| Spec
    Docs -.->|reconcile back| Spec
    Code <-.->|"always in sync via spec"| Docs

3. Spec File Anatomy

3a. Required Core

# .notarai/auth.spec.yaml
schema_version: '0.6'

intent: |
  Users can sign up, log in, and
  reset passwords. Sessions expire
  after 30 min of inactivity.

behaviors:
  - name: 'signup'
    given: 'valid email + password'
    then: 'account created, welcome email sent'
  - name: 'session_timeout'
    given: '30 min inactivity'
    then: 'session invalidated'

artifacts:
  code:
    - path: 'src/auth/**'
  docs:
    - path: 'docs/auth.md'

3b. Optional Extensions

# Power users add precision as needed

constraints:
  - 'passwords >= 12 chars'
  - 'rate limit: 5 login attempts / min'

invariants:
  - 'no plaintext passwords in DB'
  - 'all endpoints require HTTPS'

decisions:
  - date: '2025-03-12'
    choice: 'JWT over session cookies'
    rationale: 'stateless scaling'

open_questions:
  - 'Should we support OAuth2 providers?'
  - 'MFA timeline?'

Design note: The behaviors field uses Given/Then language (BDD-adjacent) but stays in natural language – not formal Gherkin. Structured enough to diff and validate, informal enough that non-engineers can author it.


4. Reconciliation Lifecycle

4a. Scenario A: Human Edits Code

flowchart LR
    A1["Human edits code<br/>adds OAuth endpoint"]
    A2["LLM detects drift<br/>code != spec behaviors"]
    A3["LLM proposes spec update<br/>+ add behavior: oauth_login<br/>+ update docs/auth.md"]
    A4["Human approves<br/>or adjusts and approves"]

    A1 -->|trigger| A2
    A2 -->|reconcile| A3
    A3 -->|resolve| A4

4b. Scenario B: Human Edits Spec

flowchart LR
    B1["Human edits spec<br/>changes session to 60 min"]
    B2["LLM updates code to match"]
    B3["LLM updates docs to match"]
    B4["Human reviews<br/>code + docs diff<br/>as a single PR"]

    B1 -->|direct| B2
    B1 -->|direct| B3
    B2 --> B4
    B3 --> B4

4c. Scenario C: Conflict Detected

flowchart LR
    C1["Conflict detected<br/>code says X, spec says Y<br/>docs say Z"]
    C2["LLM presents options<br/>spec says X, but code<br/>does Y -- which is right?"]
    C3["Human decides intent<br/>LLM propagates decision<br/>across spec + code + docs"]
    C4["All three aligned<br/>conflict resolved"]

    C1 -->|detect| C2
    C2 -->|reconcile| C3
    C3 -->|resolve| C4

5. Post-Push Reconciliation in Practice

flowchart LR
    S1["Dev + LLM<br/>write code freely<br/>no spec friction"]
    S2["git push<br/>or open PR"]
    S3["CI hook: LLM reviews<br/>diff vs affected specs<br/>proposes spec updates<br/>proposes doc updates"]
    S4["Adds to PR<br/>spec diff + docs diff<br/>alongside code diff"]
    S5["Single review<br/>code + spec + docs<br/>all land together or not"]

    S1 --> S2 --> S3 --> S4 --> S5

The artifacts field in the spec tells the CI hook which specs are affected by which file paths – so it only reconciles what changed.
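That path-to-spec routing is essentially glob matching over the changed-file list. A minimal Python sketch (using `fnmatch`, which is looser than real gitignore-style `**` matching; function and variable names are illustrative):

```python
from fnmatch import fnmatch

def affected_specs(changed_files: list[str], spec_globs: dict[str, list[str]]) -> list[str]:
    """Return the specs whose artifact globs match any changed file.

    `spec_globs` maps spec path -> list of artifact glob patterns.
    """
    hits = set()
    for spec, globs in spec_globs.items():
        if any(fnmatch(f, g) for f in changed_files for g in globs):
            hits.add(spec)
    return sorted(hits)
```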


6. Spec Composition – The Import Model

6a. Directory Structure

project/
+-- .notarai/
|   +-- system.spec.yaml          # top-level system spec
|   +-- auth.spec.yaml            # auth service (Tier 1)
|   +-- billing.spec.yaml         # billing service (Tier 1)
|   +-- api.spec.yaml             # API layer (Tier 1)
|   +-- utils.spec.yaml           # shared utilities (Tier 2)
|   +-- redis-cache.spec.yaml     # sidecar process (Tier 2)
|   +-- _shared/
|       +-- security.spec.yaml    # cross-cutting
|       +-- logging.spec.yaml     # cross-cutting
+-- src/
+-- docs/

6b. Composition Relationships

flowchart TD
    System["system.spec.yaml<br/>top-level intent + invariants"]

    Auth[".notarai/auth.spec.yaml"]
    Billing[".notarai/billing.spec.yaml"]
    API[".notarai/api.spec.yaml"]

    Security["_shared/security.spec.yaml<br/>applies to: all subsystems"]
    Logging["_shared/logging.spec.yaml<br/>applies to: all subsystems"]

    System -->|"$ref"| Auth
    System -->|"$ref"| Billing
    System -->|"$ref"| API

    Security -.->|applies| Auth
    Security -.->|applies| Billing
    Security -.->|applies| API
    Logging -.->|applies| Auth
    Logging -.->|applies| Billing
    Logging -.->|applies| API

When the LLM checks auth.spec.yaml, it also loads security.spec.yaml and validates that auth code satisfies both specs’ invariants. Cross-cutting concerns are defined once and enforced everywhere.
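As a concrete sketch of the composition above, a system spec might look like the following. This is illustrative only: the field names (`subsystems`, `applies`, `$ref`) follow the diagram and the lint-rule descriptions, but the exact shapes are governed by the bundled JSON Schema.

```yaml
# .notarai/system.spec.yaml -- illustrative sketch only
schema_version: '0.6'
intent: |
  Top-level system intent and invariants.
subsystems:
  - $ref: './auth.spec.yaml'
  - $ref: './billing.spec.yaml'
  - $ref: './api.spec.yaml'
applies:
  - $ref: './_shared/security.spec.yaml'
  - $ref: './_shared/logging.spec.yaml'
```

Note that cross-cutting specs (`cross_cutting: true`) belong under `applies`; referencing one from `subsystems` trips lint rule L011.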


7. Coverage Model – Three Tiers

flowchart LR
    subgraph T1["Tier 1: Full Spec"]
        T1a["Business logic services"]
        T1b["API endpoints"]
        T1c["Data models / schemas"]
        T1d["Anything user-facing"]
    end

    subgraph T2["Tier 2: Registered"]
        T2a["Utility libraries"]
        T2b["Shared helpers / constants"]
        T2c["Config files"]
        T2d["Sidecar processes"]
    end

    subgraph T3["Tier 3: Excluded"]
        T3a["Generated code / build output"]
        T3b["Vendored dependencies"]
        T3c["IDE / editor configs"]
        T3d["node_modules, .git, etc."]
    end

Coverage equation: Tier 1 + Tier 2 + Tier 3 = entire repo

Anything not covered = unspecced (a lint warning, not a block).


8. Bootstrap Flow for Existing Codebases

flowchart LR
    S1["1. Ingest<br/>code + docs +<br/>commit history +<br/>README / ADRs"]
    S2["2. LLM interviews<br/>What's the goal?<br/>Any undocumented rules?"]
    S3["3. Draft spec<br/>required fields only<br/>intent + behaviors +<br/>artifact mappings"]
    S4["4. Human review<br/>correct, enrich,<br/>add constraints /<br/>open questions"]
    S5["5. Activate<br/>sync engine<br/>watches for drift<br/>from this point on"]

    S1 --> S2 --> S3 --> S4 --> S5

Bootstrap starts minimal and accrues precision over time – the spec is a living document.

Comparison to SDD

Spec-driven development (SDD) has emerged as a major pattern for AI-assisted coding, but the term covers several distinct approaches. Birgitta Böckeler’s taxonomy identifies three levels:

  • Spec-first: Write a spec, generate code, discard or ignore the spec afterward.
  • Spec-anchored: Keep the spec around for ongoing maintenance, but how it stays current is left vague.
  • Spec-as-source: The spec replaces code as the primary artifact. People never touch code directly.

Most SDD tools (Kiro, Spec Kit, OpenSpec) are spec-first in practice: they help you go from intent to plan to tasks to code, but once the code exists, the spec quietly goes stale. Superpowers takes the spec-first workflow further with a structured seven-stage methodology and subagent-driven execution, but its plans are task-scoped artifacts. Tessl is exploring spec-as-source, where code is generated from specs and marked “DO NOT EDIT”, but this sacrifices the flexibility of direct code editing.

NotarAI occupies the gap that Böckeler’s taxonomy identifies but no current tool fills: spec-anchored with automated maintenance. The spec persists for the lifetime of the feature, and an LLM reconciliation engine actively keeps it aligned with code and docs as all three evolve.

SDD tools solve the cold-start problem. NotarAI solves the entropy problem.

SDD tools help you write specs. NotarAI helps you keep them true.

  • A developer adds a feature – NotarAI detects the spec doesn’t account for it and proposes an update
  • A team lead updates the spec – NotarAI propagates the change to code and docs
  • Code contradicts a spec constraint – NotarAI flags the conflict and asks the user to decide

The spec isn’t just a blueprint. It’s a witness – a living contract the LLM continuously verifies against reality.

Landscape comparison

| Tool | SDD Level | Direction | Spec Lifespan | Brownfield Support |
|---|---|---|---|---|
| Kiro | Spec-first | Spec -> code | Change request | Limited |
| Spec Kit | Spec-first (aspires to anchored) | Spec -> code | Branch / change request | Limited |
| Tessl | Spec-as-source | Spec -> code (human edits spec only) | Feature lifetime | Reverse-engineering CLI |
| OpenSpec | Spec-first | Spec -> code | Change request | Limited |
| Superpowers | Spec-first (workflow methodology) | Spec -> plan -> subagent execution | Task / branch | Git worktree isolation |
| Semcheck | Compliance checking | Spec -> code (one-way check) | Ongoing | Yes |
| NotarAI | Spec-anchored + active reconciliation | Spec <-> code <-> docs | Feature lifetime | Bootstrap flow with LLM interview |

For a practical, feature-by-feature comparison of NotarAI against specific tools (Spec Kit, OpenSpec, Intent, Kiro), see How NotarAI Compares.

How NotarAI Compares

This page provides a fair, practical comparison of NotarAI against the major spec-driven development tools available today. For the conceptual positioning of NotarAI within SDD taxonomy, see Comparison to SDD.

At a Glance

| Dimension | NotarAI | Spec Kit | OpenSpec | Intent | Kiro |
|---|---|---|---|---|---|
| Focus | Continuous reconciliation | Generative workflow | Proposal-based | Living specs | IDE-integrated |
| Agent support | Claude Code + any via export | 14+ agents | 20+ agents | Proprietary | Claude only |
| Spec format | Structured YAML + JSON Schema | Markdown | Markdown | Proprietary | EARS notation |
| CI integration | `notarai check` + GitHub Action | Manual | Manual | Built-in | Built-in |
| Deterministic checks | Yes (LLM-free) | No | No | Partial | Partial |
| Cost | Free / OSS | Free / OSS | Free / OSS | $60-200/mo | Free tier + paid |
| Brownfield | `/notarai-bootstrap` interview | Manual | Delta markers | Context Engine | Limited |

Where NotarAI fits

Most SDD tools solve the cold-start problem: turning intent into code. NotarAI solves the entropy problem: keeping specs, code, and docs aligned as all three evolve independently after the initial generation.

This makes NotarAI complementary to, not competitive with, tools like Spec Kit and OpenSpec. You can use Spec Kit to generate your initial codebase, then install NotarAI to watch for drift as the project evolves.

Key differentiators

Post-generation drift detection. NotarAI keeps watching after the initial generation is done. When code changes but the spec stays the same (or the reverse), NotarAI surfaces the conflict and proposes updates.

Deterministic CI checks. notarai check runs without network access, API keys, or LLM calls. It completes in under 2 seconds and produces structured JSON for CI consumption. No other SDD tool offers a fully deterministic, headless drift analysis.

Structured specs with schema validation. Specs are machine-readable YAML validated against a JSON Schema. This enables deterministic tooling (lint, check, scoring) that Markdown-based specs cannot support.

Agent-agnostic reconciliation. notarai export-context produces self-contained prompts that any LLM can process. The MCP server speaks a standard protocol. You are not locked into a single agent ecosystem.

Propose-and-approve only. NotarAI never auto-modifies code or specs. Every change is surfaced for human review. The spec is the tiebreaker when code and spec disagree, but the human decides what to do about it.

When to choose NotarAI

NotarAI is a good fit when:

  • You already have a codebase and want to add spec coverage incrementally
  • You want CI to catch spec drift automatically, without LLM calls
  • You care about structured, machine-readable specs rather than freeform Markdown
  • You want to use multiple LLM agents without lock-in

NotarAI may not be the right choice when:

  • You need a tool that generates entire codebases from specs (use Spec Kit, OpenSpec, or Kiro instead, then add NotarAI for ongoing maintenance)
  • Your team prefers freeform Markdown specs without schema constraints
  • You need a fully proprietary, managed solution with built-in billing

Using NotarAI alongside other tools

NotarAI’s .notarai/ spec format captures the same intent, behaviors, and constraints that other SDD tools produce. A typical combined workflow:

  1. Use Spec Kit or OpenSpec to bootstrap your initial codebase from specs
  2. Run notarai init and notarai-bootstrap to create .notarai/ specs from the existing code
  3. Use notarai check in CI and /notarai-reconcile during development
  4. Continue using your preferred generation tool for new features; NotarAI watches everything

See Brownfield Adoption for a step-by-step guide.

Inspirations

NotarAI draws from several established traditions:

  • Cucumber / Gherkin: The Given/Then behavior format in NotarAI specs comes from BDD’s structured scenario language, but kept in natural language rather than formal Gherkin syntax to lower the authoring barrier.
  • Terraform and Infrastructure-as-Code: The reconciliation model (declare desired state, detect drift from actual state, propose a plan to converge) is borrowed from IaC tools like Terraform, Pulumi, and CloudFormation. NotarAI’s spec is a state file for intent, not infrastructure.
  • JSON Schema / OpenAPI: The $ref composition model and the use of a JSON Schema to govern spec validity come directly from these standards.
  • Design by Contract (Eiffel): The distinction between constraints (what the system enforces) and invariants (what must never be violated) echoes Eiffel’s preconditions, postconditions, and class invariants.
  • Architecture Decision Records: The decisions field in the spec is a lightweight ADR log, capturing the why alongside the what.

Contributing

Your interest in contributing to this project is appreciated. Below is a series of instructions that will hopefully remain up to date because this tool should help manage that. However, if you notice that the steps seem out of date or misaligned with current practices in the repo, an update to this document could be a high-value first or second contribution to the project.

Note that the project manages its own spec drift with NotarAI, so please get acquainted with the tool and keep your contributions in sync.

Development Setup

Install Rust (stable toolchain) and install pre-commit to manage the repository’s git hooks.

Until biome supports Markdown formatting, also install prettier.

Set up clippy and rustfmt via:

rustup component add rustfmt clippy

Then set up the repo:

git clone https://github.com/davidroeca/NotarAI.git
cd NotarAI
cargo build
cargo install biome
cargo install --path crates/notarai
pre-commit install

The cargo install step installs the notarai binary to ~/.cargo/bin so the Claude Code hook (notarai hook validate) resolves correctly. Re-run it whenever you want the installed binary to reflect your latest local changes.

Making Changes

  1. Create a branch from main
  2. Make your changes
  3. Run cargo build to verify compilation
  4. Run cargo test to run the test suite
  5. Run cargo fmt --check to verify formatting
  6. Run cargo clippy -- -D warnings to check for lint issues
  7. Use the /notarai-reconcile Claude Code command to check for spec drift
  8. Add a changeset if your PR should trigger a release (see below)
  9. Open a pull request
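Steps 3–6 can be run as a single gate before opening a pull request. This is a sketch that assumes a working Rust toolchain with the rustfmt and clippy components installed, run from the repo root:

```shell
# Compile, test, formatting, and lint checks in order;
# && stops at the first failure, mirroring steps 3-6 above
cargo build \
  && cargo test \
  && cargo fmt --check \
  && cargo clippy -- -D warnings
```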

Changesets

This project uses sampo for versioning and changelogs. If your PR introduces user-visible changes (new features, bug fixes, breaking changes), add a changeset:

sampo add

This creates a Markdown file in .sampo/changesets/ describing the change and the bump level (patch, minor, or major). Commit this file with your PR.

When changesets are merged to main, a release PR is automatically created. Merging the release PR publishes the new version.

Code Style

  • Rust 2024 edition
  • cargo fmt for Rust formatting
  • cargo clippy for Rust lints
  • biome format --check for non-Rust file formatting (JSON, JS/TS, CSS, etc.)
  • prettier --check for Markdown formatting (temporary until biome#3718 is resolved)
  • Functional style preferred over excessive use of structs with methods
  • Core library lives in crates/notarai/src/core/ (not src/lib/ due to Rust’s reserved module name)
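As an illustration of the functional-style preference, a minimal sketch (all names here are hypothetical, not from the codebase): a pure free function over plain data, rather than a struct accumulating state behind methods.

```rust
// Hypothetical example of the preferred style: plain data plus a pure
// free function, instead of a method-heavy stateful struct.
#[derive(Debug, PartialEq)]
struct SpecStatus {
    name: String,
    in_sync: bool,
}

/// Pure function: input in, output out, no hidden state to drift.
fn drifted(specs: &[SpecStatus]) -> Vec<&SpecStatus> {
    specs.iter().filter(|s| !s.in_sync).collect()
}

fn main() {
    let specs = vec![
        SpecStatus { name: "auth".into(), in_sync: true },
        SpecStatus { name: "billing".into(), in_sync: false },
    ];
    let out = drifted(&specs);
    assert_eq!(out.len(), 1);
    assert_eq!(out[0].name, "billing");
    println!("{} spec(s) drifted", out.len());
}
```

Pure functions like this are trivially unit-testable and keep state where NotarAI expects it: in the specs, not in objects.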

Project Structure

See CLAUDE.md in the repository root for a detailed layout and architectural constraints.

Good First Contributions

These changes would drive broader adoption but are not yet a priority:

  • Support other coding agents (e.g. Codex, Aider, Cline, OpenHands, Goose, opencode)
  • Find/create new issues and reference them here

License

By contributing, you agree that your contributions will be licensed under the Apache License 2.0.