Specora Core is a five-tier, LLM-native Contract-Driven Development engine. It takes declarative YAML contracts as input and produces working software as output. The primary interface is your LLM coding agent (Claude Code, Cursor, Windsurf), which reads CLAUDE.md and operates the entire system via Python API calls.
This document describes the full architecture: the five tiers, the LLM-native workflow, the compiler pipeline, the Intermediate Representation, the generator system, the repository pattern, the self-healing loop, and the project structure.
Tier 1: FORGE — Deterministic generation
Contract -> Compile -> IR -> Generate -> Code
Zero tokens. Sub-second. Repeatable.
Tier 2: FACTORY — LLM-powered authoring
Human describes feature -> LLM writes contract via Python API -> Forge generates
One-time LLM cost. Conversational. Guided.
Tier 3: HEALER — Self-healing
Error detected -> Healer classifies -> Proposes diff -> Auto-apply or human approval -> Forge regenerates
Autonomous bug fixing at the specification level.
Tier 4: EXTRACTOR — Reverse-engineering
Existing codebase -> Scan -> Extract -> Cross-reference -> Emit contracts
Converts legacy code into the contract system.
Tier 5: ADVISOR — Proactive evolution (planned)
Telemetry observed -> Advisor detects patterns -> Proposes new contracts -> Human approves
The platform evolves based on how it's actually used.
The core engine. Contracts go in, code comes out. The pipeline is entirely deterministic – no LLM calls, no network requests, no randomness. Given the same contracts, it produces the same output every time.
Forge handles: validation, dependency resolution, IR compilation, and code generation for all targets (TypeScript, FastAPI, PostgreSQL, Docker, production FastAPI with repositories, and tests).
The Factory is a contract authoring system. Your LLM coding agent describes what you want and calls Python emitter functions (emit_entity, emit_route, emit_page, emit_workflow) to generate well-formed contracts. Those contracts then flow through Forge for code generation. The Factory can also be used via the CLI (spc factory new) for interactive interviews.
Requires at least one LLM provider configured (see LLM Providers).
The Healer watches for errors – validation failures, compilation errors, runtime exceptions – and proposes fixes at the contract level. Tier 1 fixes (naming normalization, format corrections) are auto-applied. Tier 2-3 fixes (structural changes, missing fields) require human approval.
See Healer Documentation for the full pipeline.
The Extractor analyzes existing Python and TypeScript codebases to produce Specora contracts. It scans source files, extracts entities and routes, cross-references relationships, detects workflows, and emits .contract.yaml files.
See Extractor Documentation for details.
The Advisor will monitor telemetry from running applications, detect usage patterns, and propose contract improvements. Not yet implemented.
The primary interface for Specora Core is your LLM coding agent. The workflow:
1. pip install specora-core
2. specora-init my_app # Scaffolds project with CLAUDE.md
3. cd my_app
4. Open LLM (Claude Code, Cursor, Windsurf)
5. Talk to the LLM # It reads CLAUDE.md, calls Python API
6. docker compose up -d # Boots the generated app
The LLM reads CLAUDE.md – the LLM operating manual scaffolded by specora-init.
The LLM never needs the CLI. It calls Python functions directly:
# Validate
from pathlib import Path

from forge.parser.loader import load_all_contracts
from forge.parser.validator import validate_all

errors = validate_all(load_all_contracts(Path("domains/my_domain")))

# Compile
from forge.ir.compiler import Compiler

ir = Compiler(contract_root=Path("domains/my_domain")).compile()

# Generate
from forge.targets.fastapi_prod.generator import FastAPIProductionGenerator

for f in FastAPIProductionGenerator().generate(ir):
    (Path("runtime") / f.path).parent.mkdir(parents=True, exist_ok=True)
    (Path("runtime") / f.path).write_text(f.content)
The docker generator produces a Docker Compose stack with three services:
docker-compose.yml
|
+-- db (PostgreSQL 16)
| Schema auto-applied via init script
| Health check: pg_isready
|
+-- app (FastAPI)
| Waits for db health check
| Error middleware reports to Healer
| Port 8000
|
+-- healer (Healer sidecar)
Receives error reports from app
Classifies, proposes fixes
Port 8083
The app includes error reporting middleware that POSTs unhandled exceptions to the Healer sidecar at http://healer:8083/healer/ingest. The Healer classifies the error, proposes a contract-level fix, and queues it for approval. This closes the feedback loop between running software and contracts.
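The shape of that report can be sketched as a small payload builder. This is a hypothetical sketch, not the actual middleware code: the function name `build_error_report` and the field names are illustrative, and the generated middleware's real schema may differ.

```python
import traceback
from datetime import datetime, timezone

# Hypothetical sketch of the payload an error middleware might POST to
# the Healer sidecar at http://healer:8083/healer/ingest.
# Field names are illustrative, not the actual wire format.
def build_error_report(exc: Exception, path: str, method: str) -> dict:
    return {
        "error_type": type(exc).__name__,
        "message": str(exc),
        "traceback": traceback.format_exc(),
        "request": {"path": path, "method": method},
        "reported_at": datetime.now(timezone.utc).isoformat(),
    }

try:
    raise ValueError("bad ISBN")
except ValueError as exc:
    report = build_error_report(exc, "/books", "POST")
```

The key design point is that the report carries enough context (error type, traceback, request) for the Healer's classifier to assign a tier and trace the failure back to a contract.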
The Forge compiler transforms contracts into code through six stages:
Contracts (.contract.yaml)
|
v
[1. Parser] Load + discover .contract.yaml files, validate envelope
|
v
[2. Validator] Check each contract against its kind-specific meta-schema
|
v
[3. Dep Graph] Build dependency graph, detect cycles, topological sort
|
v
[4. IR Compiler] Transform contracts into Intermediate Representation
|
v
[5. IR Passes] Expand mixins, bind workflows, resolve references, infer tables
|
v
[6. Generators] IR -> code (TypeScript, FastAPI, PostgreSQL, Docker, etc.)
1. Parser (forge/parser/loader.py) – Discovers all .contract.yaml files recursively under a domain directory. Loads YAML, validates the envelope structure (apiVersion, kind, metadata, spec), and computes the Fully Qualified Name (FQN) for each contract. Also loads stdlib contracts from spec/stdlib/.
2. Validator (forge/parser/validator.py) – Validates each contract against its kind-specific meta-schema using the jsonschema library. Meta-schemas are JSON Schema draft 2020-12 documents stored as YAML in spec/meta/. A local registry allows $ref between meta-schemas without network calls.
3. Dependency graph (forge/parser/graph.py) – Builds a directed graph where nodes are contracts and edges come from requires arrays. It detects cycles and produces a topological ordering so each contract is compiled after its dependencies.
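The cycle detection and ordering in this stage can be illustrated with the standard library's graphlib. This is a sketch of the idea, not the actual graph.py implementation, and the FQNs are illustrative:

```python
from graphlib import CycleError, TopologicalSorter

# Toy dependency graph: each contract maps to the set of contracts
# it requires. FQNs here are made up for illustration.
requires = {
    "route/library/books": {"entity/library/book"},
    "entity/library/book": {"mixin/stdlib/timestamped"},
    "mixin/stdlib/timestamped": set(),
}

try:
    # static_order() yields dependencies before their dependents
    # and raises CycleError if the graph contains a cycle.
    order = list(TopologicalSorter(requires).static_order())
except CycleError as err:
    print("dependency cycle:", err.args[1])
else:
    print(order)  # mixin first, then entity, then route
```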
4. IR compiler (forge/ir/compiler.py) – Transforms each contract into its IR representation, following topological order so dependencies are available when needed. Dispatches to kind-specific compilation methods (one per contract kind).
5. IR passes (forge/ir/passes/) – Post-compilation transformations that run in order:
- mixin_expansion.py – Copies mixin fields into entities. Entity fields take precedence on name conflicts.
- table_name_inference.py – Infers PostgreSQL table names by pluralizing entity names.
- state_machine_binding.py – Attaches workflow StateMachineIR to entities that reference them. Adds a state field with valid enum values.
- reference_resolution.py – Validates that all cross-entity references point to existing entities. Infers route base_paths.
6. Generators (forge/targets/) – Generators consume ONLY the DomainIR. They never import the parser, validator, or raw contracts. This is the IR firewall – it ensures generators are target-agnostic and pluggable.
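To give a flavor of the kind of transformation a pass performs, a naive version of table-name inference might look like this. It is a simplified sketch; the real table_name_inference.py pass may handle more irregular forms:

```python
# Simplified sketch of table-name inference: pluralize the entity
# name to get a PostgreSQL table name. The real pass may cover
# additional irregular plurals.
def infer_table_name(entity_name: str) -> str:
    if entity_name.endswith("y") and entity_name[-2] not in "aeiou":
        return entity_name[:-1] + "ies"   # category -> categories
    if entity_name.endswith(("s", "x", "z", "ch", "sh")):
        return entity_name + "es"         # box -> boxes
    return entity_name + "s"              # book -> books
```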
The IR is the center of the architecture. It sits between contracts and generators as a normalized, target-agnostic application model.
+-------------------+
| TypeScript |
+------------>| Generator |--> types.ts
| +-------------------+
+-----------+ +---+---+ +-------------------+
| Contracts |-->| IR |------->| FastAPI |--> routes.py, models.py
| (YAML) | | | | Generator |
+-----------+ +---+---+ +-------------------+
| +-------------------+
+------------>| PostgreSQL |--> schema.sql
| | Generator |
| +-------------------+
| +-------------------+
+------------>| FastAPI-Prod |--> repos, auth, docker
| Generator |
+-------------------+
IR models (forge/ir/model.py) – All models are Pydantic BaseModel subclasses:
| Model | Purpose |
|---|---|
| DomainIR | The complete domain – everything a generator needs |
| EntityIR | Data model with expanded mixin fields, bound state machine |
| FieldIR | Normalized field: type, constraints, reference, computed |
| ReferenceIR | Cross-entity reference (FK + graph edge) |
| StateMachineIR | States, transitions, guards, side effects |
| StateIR | Single state with label, category, terminal flag |
| PageIR | UI specification: route, views, actions, filters |
| RouteIR | API route set: base path, endpoints, global behaviors |
| EndpointIR | Single API endpoint: method, path, validation, auto-fields |
| AgentIR | AI behavior: trigger, threshold, constraints, fallback |
| MixinIR | Reusable field group (pre-expansion) |
| InfraIR | Infrastructure config: auth, deployment, database |
Critical rule: Generators ONLY import forge.ir.model. They never see raw contracts, YAML, or the parser. This means targets are pluggable and can be tested in isolation against a hand-built IR.
Every generator implements BaseGenerator from forge/targets/base.py:
from abc import ABC, abstractmethod

from forge.ir.model import DomainIR

class BaseGenerator(ABC):
    @abstractmethod
    def name(self) -> str:
        """Short lowercase name (e.g., 'typescript', 'fastapi-prod')."""
        ...

    @abstractmethod
    def generate(self, ir: DomainIR) -> list[GeneratedFile]:
        """Generate code files from the IR."""
        ...
Each GeneratedFile has a path (relative output path), content (full file text), and provenance (source contract FQN).
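A toy generator following this contract might look like the sketch below. The dataclasses here are simplified stand-ins for the real forge.targets.base and forge.ir.model types, so the example is self-contained:

```python
from dataclasses import dataclass

# Simplified stand-ins for the real GeneratedFile and EntityIR types;
# the actual models live in forge.targets.base and forge.ir.model.
@dataclass
class GeneratedFile:
    path: str
    content: str
    provenance: str

@dataclass
class EntityIR:
    name: str
    fields: dict  # field name -> TypeScript type, simplified

class ToyTypeScriptGenerator:
    def name(self) -> str:
        return "toy-typescript"

    def generate(self, entities: list[EntityIR]) -> list[GeneratedFile]:
        files = []
        for e in entities:
            body = "\n".join(f"  {k}: {v};" for k, v in e.fields.items())
            files.append(GeneratedFile(
                path=f"{e.name}.ts",
                content=f"export interface {e.name.title()} {{\n{body}\n}}\n",
                provenance=f"entity/example/{e.name}",
            ))
        return files

files = ToyTypeScriptGenerator().generate(
    [EntityIR(name="book", fields={"id": "string", "title": "string"})]
)
```

Note that the generator touches only IR objects and emits plain GeneratedFile records; that separation is what keeps targets swappable.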
Every generated file includes a provenance header:
# ======================================================================
# @generated -- DO NOT EDIT
#
# This file was generated by Specora Forge from contract specifications.
# Any manual changes will be overwritten on the next generation.
#
# Source: entity/library/book
# Generated: 2026-04-07 12:00 UTC
# ======================================================================
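Emitting that header is mechanical. A hedged sketch of such a helper follows; the real implementation lives in forge/targets/base.py and may format the fields differently:

```python
from datetime import datetime, timezone

# Sketch of a provenance-header builder. The real helper in
# forge/targets/base.py may differ in exact layout.
def provenance_header(source_fqn: str) -> str:
    bar = "# " + "=" * 70
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return "\n".join([
        bar,
        "# @generated -- DO NOT EDIT",
        "#",
        "# This file was generated by Specora Forge from contract specifications.",
        "# Any manual changes will be overwritten on the next generation.",
        "#",
        f"# Source: {source_fqn}",
        f"# Generated: {stamp}",
        bar,
    ])

header = provenance_header("entity/library/book")
```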
| Target | Name | What it generates |
|---|---|---|
| TypeScript | typescript | Interfaces with typed fields and JSDoc |
| FastAPI (basic) | fastapi | Route handlers with inline Pydantic models |
| PostgreSQL | postgres | CREATE TABLE DDL with indexes |
| FastAPI (production) | fastapi-prod | Config, models, repositories, routes, app, auth |
| Docker | docker | Dockerfile, docker-compose.yml, .env.example, requirements.txt |
| Tests | tests | Black-box pytest tests from route contracts |
| Next.js | nextjs | Frontend pages, components, API client |
| Migrations | migrations | Incremental SQL migration files |
Aliases:
| Alias | Expands to |
|---|---|
| prod (default) | fastapi-prod + postgres + docker + tests + nextjs |
To add a new target: create forge/targets/{name}/, subclass BaseGenerator (importing only forge.ir.model), and register it in _get_generators() in forge/cli/main.py.
The fastapi-prod generator produces a clean layered architecture using the repository pattern:
backend/
config.py # 12-factor env config
models.py # Pydantic Create/Update/Response models
app.py # FastAPI application with CORS, routers, health check
routes_{entity}.py # Route handlers -- call repository, not SQL
auth/
interface.py # Abstract AuthProvider
jwt_provider.py # JWT implementation (bcrypt + python-jose)
middleware.py # require_auth, require_role dependencies
repositories/
base.py # Abstract interfaces + factory functions
memory.py # In-memory dict adapters (dev/test)
postgres.py # PostgreSQL adapters (asyncpg)
For each entity, the generator creates an abstract base class:
from abc import ABC

class BookRepository(ABC):
    async def list(self, limit, offset, filters) -> tuple[list[dict], int]: ...
    async def get(self, id: str) -> dict | None: ...
    async def create(self, data: dict) -> dict: ...
    async def update(self, id: str, data: dict) -> dict | None: ...
    async def delete(self, id: str) -> bool: ...
    async def transition(self, id: str, new_state: str) -> dict | None: ...  # if state machine
The get_{entity}_repo() factory reads DATABASE_BACKEND from config:
def get_book_repo() -> BookRepository:
    from backend.config import DATABASE_BACKEND

    if DATABASE_BACKEND == "postgres":
        from backend.repositories.postgres import PostgresBookRepository
        return PostgresBookRepository()

    from backend.repositories.memory import MemoryBookRepository
    return MemoryBookRepository()
Set DATABASE_BACKEND=memory for development/testing (no persistence).
Set DATABASE_BACKEND=postgres for production (asyncpg connection pool).
The self-healing loop is the end-to-end story that makes Specora Core more than a code generator. It closes the feedback loop between running software and contracts:
[1. Error Occurs]
Runtime exception, validation failure, test failure, compilation error
|
v
[2. Healer Intake]
Error is ingested via CLI (spc healer fix), HTTP API, or manual submission
|
v
[3. Classification]
Classifier assigns: error_type, tier (1/2/3), priority (critical/high/medium/low)
|
v
[4. Proposal]
Tier 1: Deterministic fix (normalize_contract)
Tier 2-3: LLM-powered structural fix (reads contract + diff history)
|
v
[5. Approval]
Tier 1: Auto-applied (confidence = 1.0)
Tier 2-3: Queued for human approval
|
v
[6. Application]
Fix is applied to the contract YAML file
Diff is recorded in .forge/diffs/
|
v
[7. Regeneration]
Forge recompiles and regenerates code from the updated contract
|
v
[8. Notification]
Console log, JSONL file, optional webhook POST
This means bugs get fixed at the specification level, not in generated code. The contract gets smarter with every fix, and the diff history teaches the Healer what patterns of fixes work.
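As an example of what a Tier 1 deterministic fix looks like, consider naming normalization. The sketch below is a simplified stand-in; the real normalize_contract() covers more patterns:

```python
import re

# Sketch of a Tier 1 deterministic fix: normalize contract field
# names to snake_case. The real normalize_contract() handles more
# cases than this.
def to_snake_case(name: str) -> str:
    name = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return name.replace("-", "_").lower()

def normalize_field_names(spec: dict) -> dict:
    fields = spec.get("fields", {})
    return {**spec, "fields": {to_snake_case(k): v for k, v in fields.items()}}

fixed = normalize_field_names({"fields": {"publishedAt": {"type": "datetime"}}})
```

Fixes like this need no LLM call and no judgment, which is why they can be auto-applied with confidence 1.0.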
Every contract mutation is tracked by the diff system (forge/diff/):
Contract Change
|
v
+---------------+
| Tracker | compute_diff(before, after)
| (deepdiff) | -> list[FieldChange]
+-------+-------+
|
v
+---------------+
| DiffStore | save(ContractDiff)
| (.forge/) | -> JSON file + index
+-------+-------+
|
v
+----------------------+
| LLM Context | format_for_llm(fqn, n=10)
| (for Healer/Advisor)| -> structured text
+----------------------+
Each diff records:
| Field | Type | Description |
|---|---|---|
| id | string | Unique diff identifier |
| fqn | string | Contract FQN |
| timestamp | datetime | When the change occurred |
| origin | enum | human, healer, advisor, factory |
| origin_detail | string | Additional context (ticket ID, user, etc.) |
| reason | string | Why the change was made |
| changes | list[FieldChange] | JSONPath-level changes |
| before_hash / after_hash | string | Content hashes |
| before_snapshot / after_snapshot | dict | Full contract before/after |
Change types: added, removed, modified, type_changed.
Storage: .forge/diffs/ directory, file-based JSON indexed by FQN.
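The structural comparison itself can be sketched with a plain recursive walk. The real tracker uses the deepdiff library, so this stand-in is only for intuition; the change-record shape is illustrative:

```python
# Simplified stand-in for the diff tracker. The real implementation
# uses deepdiff and emits FieldChange records; the dict shape here
# is illustrative. Change types mirror the four in the doc:
# added, removed, modified, type_changed.
def compute_diff(before: dict, after: dict, path: str = "$") -> list[dict]:
    changes = []
    for key in sorted(set(before) | set(after)):
        p = f"{path}.{key}"
        if key not in before:
            changes.append({"path": p, "change": "added", "new": after[key]})
        elif key not in after:
            changes.append({"path": p, "change": "removed", "old": before[key]})
        elif isinstance(before[key], dict) and isinstance(after[key], dict):
            changes.extend(compute_diff(before[key], after[key], p))
        elif before[key] != after[key]:
            kind = "modified" if type(before[key]) is type(after[key]) else "type_changed"
            changes.append({"path": p, "change": kind,
                            "old": before[key], "new": after[key]})
    return changes

diff = compute_diff(
    {"spec": {"fields": {"title": {"type": "string"}}}},
    {"spec": {"fields": {"title": {"type": "text"}, "isbn": {"type": "string"}}}},
)
```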
specora-core/
|
+-- spec/ # CONTRACT LANGUAGE DEFINITION
| +-- meta/ # Meta-schemas (one per contract kind)
| | +-- envelope.meta.yaml # Common structure: apiVersion, kind, metadata
| | +-- entity.meta.yaml # Entity contract validation
| | +-- workflow.meta.yaml # Workflow (state machine) validation
| | +-- page.meta.yaml # Page (UI spec) validation
| | +-- route.meta.yaml # Route (API behavior) validation
| | +-- agent.meta.yaml # Agent (AI behavior) validation
| | +-- mixin.meta.yaml # Mixin (reusable fields) validation
| | +-- infra.meta.yaml # Infrastructure validation
| +-- stdlib/ # Standard library contracts
| +-- mixins/ # timestamped, identifiable, auditable, taggable
| +-- workflows/ # crud_lifecycle, approval, ticket
|
+-- forge/ # TIER 1: THE COMPILER ENGINE
| +-- parser/ # Load, validate, resolve contracts
| | +-- loader.py # Discover + load .contract.yaml files
| | +-- validator.py # Validate against meta-schemas
| | +-- graph.py # Dependency graph, cycle detection, topo sort
| +-- ir/ # Intermediate Representation
| | +-- model.py # IR data models (DomainIR, EntityIR, etc.)
| | +-- compiler.py # Contract -> IR transformation
| | +-- passes/ # Post-compilation passes
| | +-- mixin_expansion.py
| | +-- table_name_inference.py
| | +-- state_machine_binding.py
| | +-- reference_resolution.py
| +-- targets/ # Code generators (IR -> code)
| | +-- base.py # BaseGenerator interface + provenance headers
| | +-- typescript/ # TypeScript interfaces
| | +-- fastapi/ # Basic FastAPI routes
| | +-- postgres/ # PostgreSQL DDL
| | +-- fastapi_prod/ # Production FastAPI (repos, auth, docker, tests)
| | +-- generator.py # Orchestrator (3 generators: fastapi-prod, docker, tests)
| | +-- gen_config.py # 12-factor config module
| | +-- gen_models.py # Pydantic Create/Update/Response models
| | +-- gen_repositories.py # Abstract + Memory + Postgres adapters
| | +-- gen_routes.py # Route handlers calling repositories
| | +-- gen_app.py # FastAPI app with middleware stack
| | +-- gen_auth.py # Auth interface + JWT provider + middleware
| | +-- gen_docker.py # Dockerfile, compose, .env.example, requirements.txt
| | +-- gen_tests.py # Black-box pytest tests (stub)
| +-- diff/ # Contract diff tracking
| | +-- models.py # ContractDiff, FieldChange, DiffOrigin
| | +-- tracker.py # Compute structural diffs
| | +-- store.py # Persist + query diffs
| +-- cli/ # CLI commands
| +-- main.py # Click-based CLI entry point
|
+-- factory/ # TIER 2: LLM-POWERED AUTHORING
|
+-- healer/ # TIER 3: SELF-HEALING PIPELINE
| +-- models.py # HealerTicket, HealerProposal, enums
| +-- queue.py # SQLite-backed priority queue
| +-- pipeline.py # Pipeline orchestrator
| +-- notifier.py # Console, webhook, file notifications
| +-- monitor.py # Metrics and success rate tracking
| +-- watcher.py # File system watcher
| +-- applier.py # Apply proposals to contract files
| +-- analyzer/ # Error classification
| | +-- classifier.py # Tier assignment, error typing
| +-- proposer/ # Fix proposal strategies
| | +-- deterministic.py # Tier 1: normalize_contract()
| | +-- llm_proposer.py # Tier 2-3: LLM-powered structural fixes
| +-- api/ # HTTP service
| | +-- server.py # FastAPI endpoints for remote healing
| +-- cli/ # CLI commands
| +-- commands.py # fix, status, tickets, show, approve, reject, serve, history
|
+-- extractor/ # TIER 4: REVERSE-ENGINEERING
| +-- models.py # ExtractedEntity, ExtractedRoute, AnalysisReport
| +-- scanner.py # Pass 1: File discovery and classification
| +-- analyzers/ # Pass 2: Language-specific extraction
| | +-- python_models.py # Pydantic, SQLAlchemy, dataclass extraction
| | +-- typescript_types.py # TypeScript interface/type extraction
| | +-- routes.py # FastAPI/Express route extraction
| +-- cross_ref.py # Pass 3: Relationship resolution, workflow detection
| +-- synthesizer.py # Pass 4: Build AnalysisReport
| +-- reporter.py # Interactive accept/skip per entity
| +-- emitter.py # Write .contract.yaml files
| +-- cli/ # CLI commands
| +-- commands.py # spc extract
|
+-- engine/ # SHARED LLM INFRASTRUCTURE
| +-- config.py # Provider auto-detection from environment
| +-- registry.py # Model capabilities catalog (15 models)
|
+-- advisor/ # TIER 5: PROACTIVE EVOLUTION (planned)
|
+-- domains/ # USER'S DOMAIN CONTRACTS (the input)
| +-- library/ # Example: Library domain
| +-- entities/ # book, author, patron
| +-- workflows/ # book_lifecycle
| +-- pages/ # books
| +-- routes/ # books API
|
+-- runtime/ # GENERATED CODE (the output)
| +-- backend/ # Generated FastAPI routes + models
| +-- frontend/ # Generated frontend configs
| +-- database/ # Generated SQL DDL
|
+-- tests/ # Test suite
+-- docs/ # Documentation
+-- pyproject.toml # Package config
+-- .env.example # All environment variables
Contracts are the source of truth. Code is a derived, disposable artifact. Delete all generated code, keep the contracts, and the engine regenerates everything.
IR is the firewall. Generators see only forge.ir.model. They never touch raw YAML or the parser. This makes targets pluggable and testable in isolation.
Diffs, not replacements. Every contract mutation is tracked with who/what/why context. This feeds the Healer and Advisor with historical context.
Progressive complexity. The stdlib provides simple building blocks. Domains compose them into complex models. The engine handles the mechanical parts; LLMs handle the creative parts.
Meta-schemas enforce correctness. Invalid contracts are caught at compile time, not at runtime. The meta-schema is the law.
Tiered autonomy. Simple fixes are auto-applied. Complex changes require human approval. The system earns trust through transparency.
12-factor generated apps. All configuration comes from environment variables. Database backends are swappable. Auth is optional and contract-driven.