aura-labs.ai

ADR-002: Three-Layer NLP Architecture with Shared Module and Managed Inference

Status: Accepted
Date: 2026-03-09
Decision Makers: Marc Massar
Supersedes: ADR-001 (NLP Distribution)
Decision Log: DEC-024

Context

ADR-001 established a tiered NLP distribution: Scout (20% regex), Core (70% Granite via Replicate), Beacon (10% optional). With the commerce lifecycle now complete — Ed25519 identity (DEC-009), merchant integration hooks (DEC-010), three-layer observability (DEC-011/019), OWASP security remediation (DEC-020/021), and E2E tests (DEC-022) — the platform is ready for production language processing.

Three problems with ADR-001 have become clear:

Regex-only Scout is insufficient at enterprise scale. The Scout SDK will be embedded in diverse purchasing channels (Discord, Slack, Chrome, iOS, MCP, Shopify). Commerce intent expressions are near-infinite: “I need industrial fasteners, grade 8, metric, for automotive assembly” has no reliable regex pattern for category or product characteristics. A deterministic-only approach cannot serve the breadth of channels this platform targets.
Replicate hosting doesn’t meet control requirements. The platform needs managed infrastructure where we control the model instance, not a third-party API wrapper.
Scout and Beacon share NLP concerns but have no shared code. Both need to understand intent structure — Scout for validation, Beacon for interpretation. Duplicating extraction logic across SDKs is wasteful and will diverge over time.

Additionally, the original architecture placed clarification logic at Core (Tier 3 of intent-svc), requiring a round-trip to Core for every incomplete intent. Moving clarification to the Scout layer provides faster user feedback and reduces wasted Core LLM calls.

Decision

Architecture Overview

User → Scout SDK (@aura-labs/nlp) → AURA Core (Granite 8B) → Beacon SDK (@aura-labs/nlp)
         │                                │                        │
         ├─ Completeness gate             ├─ Deep semantic parse   ├─ Domain interpretation
         ├─ Conversational clarification  ├─ Category taxonomy     ├─ Inventory matching
         └─ Hybrid: regex + small model   ├─ Constraint normalize  └─ Structured offer
                                          └─ Confidence scoring

Layer 1 — Shared NLP Package (`@aura-labs/nlp`)

A common module used by both Scout and Beacon SDKs. Single source of truth for intent structure understanding.

Components:

Module	Responsibility
`categories.js`	Defines the eight intent categories (4 Tier 1 + 4 Tier 2) and their detection rules
`completeness.js`	Evaluates intent against tiered categories, returns missing required categories
`conversation.js`	Generates targeted clarification questions for missing categories
`provider.js`	LLM provider abstraction (mock for tests, Together/Fireworks for production)
`activity.js`	Structured event emission following DEC-011 ActivityLogger pattern

Eight Intent Categories (Tiered):

Tier 1 — Always Required:

#	Category (`key`)	Examples	Detection Method
1	Category of goods/services (`what`)	“keyboards”, “cloud hosting”, “legal services”	Model (semantic classification)
2	Quantity / Recurrence (`how_many`)	“50 units”, “monthly”, “a dozen”	Regex (reliable numeric patterns)
3	Product characteristics (`what_kind`)	“ergonomic”, “grade A”, “enterprise-grade”	Model (semantic understanding)
4	Price / Commercial terms (`how_much_cost`)	“under $5000”, “£200-£500 range”, “budget flexible”	Regex (reliable currency patterns)

Tier 2 — Contextually Required (triggered by intent language):

#	Category (`key`)	Examples	Detection Method	Trigger Examples
5	Location / Jurisdiction (`where`)	“deliver to London”, “EU only”	Hybrid (regex + model)	“ship”, “deliver”, “physical product”
6	Timing / Urgency (`when`)	“by Friday”, “ASAP”, “next week”	Regex (temporal patterns)	“by”, “deadline”, “urgent”
7	Use Case / Purpose (`why`)	“for office renovation”, “conference event”	Model	“for a/an/the”, “purpose”, “project”
8	Values / Ethical / Environmental Impact (`values_impact`)	“eco-friendly”, “fair trade”, “B Corp”	Model	“sustainable”, “organic”, “ethical”

Tiered Logic: Tier 1 categories are always required for completeness. Tier 2 categories are only required when the intent text contains trigger patterns for that category. A simple intent like “500 paper clips under $20” requires only Tier 1. An intent mentioning “eco-friendly biodegradable ink” triggers values_impact as an additional requirement.
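
The tiered rule can be sketched as follows. This is a minimal illustration, not the contents of `completeness.js`: the trigger regexes are assumptions built from the trigger examples in the table (plus the "eco-friendly biodegradable" example above), and `requiredCategories` and `missingCategories` are hypothetical names.

```javascript
// Tier 1 keys are always required (keys taken from the category tables).
const TIER1 = ['what', 'how_many', 'what_kind', 'how_much_cost'];

// Tier 2 triggers: illustrative regexes derived from the "Trigger Examples"
// column. The real detection rules may differ (assumption).
const TIER2_TRIGGERS = {
  where: /\b(ship|deliver|physical product)/i,
  when: /\b(by|deadline|urgent)\b/i,
  why: /\b(for (a|an|the)|purpose|project)\b/i,
  values_impact: /\b(sustainable|organic|ethical|eco-friendly|biodegradable)\b/i,
};

// Returns the category keys required for this intent text:
// all of Tier 1, plus every Tier 2 key whose trigger pattern matches.
function requiredCategories(text) {
  const triggered = Object.keys(TIER2_TRIGGERS)
    .filter((key) => TIER2_TRIGGERS[key].test(text));
  return [...TIER1, ...triggered];
}

// `present` is a Set of keys that some detector has already found in the text.
function missingCategories(text, present) {
  return requiredCategories(text).filter((key) => !present.has(key));
}

// Usage: a simple intent only needs Tier 1.
const found = new Set(['what', 'how_many', 'how_much_cost']);
console.log(missingCategories('500 paper clips under $20', found));
// → [ 'what_kind' ]
```

Keeping the trigger table as data rather than branching logic means a new Tier 2 category is one more entry, which fits the "single source of truth" goal of the shared module.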

Design Principle — Presence Detection, Not Semantic Understanding:

The shared module checks whether each category is addressed in the intent, not what the values are. “Did the user mention price?” is a completeness question. “The maximum price is $5000” is a semantic extraction that belongs at Core. This preserves NEUTRAL_BROKER.md Property 1 (Prompt Injection Resistance) — Core retains exclusive authority over semantic interpretation.
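
To make the boundary concrete, here is a sketch of a presence-only price check. The pattern and the function name `mentionsPrice` are assumptions for illustration; the point is that the function returns a boolean and never returns the amount.

```javascript
// Presence check only: answers "did the user mention price?" with a boolean.
// The pattern is an illustrative assumption, not the module's actual rule.
const PRICE_PATTERN =
  /[$£€]\s?\d|\b\d+(?:\.\d+)?\s?(?:usd|gbp|eur|dollars|bucks)\b|\bbudget\b/i;

function mentionsPrice(text) {
  // No capture groups are read and no value is returned:
  // extracting "5000" or "USD" is left to Core.
  return PRICE_PATTERN.test(text);
}

console.log(mentionsPrice('under $5000'));                   // → true
console.log(mentionsPrice('I need 50 ergonomic keyboards')); // → false
```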

Layer 2 — Scout SDK (Completeness Gate + Conversational Interface)

The Scout’s role changes from “basic regex extraction” to “completeness validation with targeted clarification.” The Scout is the quality gate that ensures Core only processes intents with sufficient substance.

Conversational API (channel-agnostic):

// Works identically in Discord, Slack, Chrome, iOS, MCP, CLI
const intentSession = scout.createIntentSession();

// Round 1: User submits initial intent
let result = await intentSession.submit("I need 50 ergonomic keyboards");
// result.complete === false
// result.missing === ['how_much_cost', 'when']
// result.question === "What's your budget or price range?
//                      Also, when do you need this by?"

// Round 2: User provides missing information
result = await intentSession.submit("Under $5000, by end of month");
// result.complete === true
// result.intent === { raw: "...", categories: {...}, confidence: 0.91 }

// Only now does the complete intent go to Core
const session = await scout.intent(result.intent);

Hybrid Detection Strategy:

Regex for quantity, price, and timing (delivery deadlines): these have reliable syntactic patterns
Small local model (Granite 3B or equivalent) for category of goods/services and product characteristics — these require semantic understanding that keyword lists cannot cover
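
A minimal sketch of how the two detection paths might be combined. The regex patterns, the function name `detectPresent`, and the `model.detect(text)` client interface are all assumptions; the ADR does not specify the small-model API.

```javascript
// Regex detectors for categories with reliable syntactic patterns.
// Patterns are illustrative assumptions.
const REGEX_DETECTORS = {
  // A number not preceded by a currency symbol, or a recurrence word.
  how_many: /(?<![$£€\d.,])\d+|\b(?:dozen|monthly)\b/i,
  how_much_cost: /[$£€]\s?\d|\bbudget\b/i,
  when: /\b(?:by|asap|next week|deadline|urgent)\b/i,
};

// `model` is a hypothetical small-model client exposing
// detect(text) -> Promise<string[]> of category keys it judges present.
async function detectPresent(text, model) {
  const present = new Set(
    Object.keys(REGEX_DETECTORS).filter((key) => REGEX_DETECTORS[key].test(text))
  );
  let degraded = false;
  try {
    for (const key of await model.detect(text)) present.add(key);
  } catch (err) {
    // Model unavailable: keep the regex-only result (degraded but functional).
    degraded = true;
  }
  return { present, degraded };
}
```

Returning a `degraded` flag alongside the result lets the caller record that semantic categories such as `what` and `what_kind` were not checked, rather than silently treating them as absent.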

Key Properties:

Channel-agnostic: any client calls submit() and gets back either a question or a complete intent
Conversational: maintains context across multiple submit() calls within an intent session
Non-extractive: the Scout validates presence, it does not parse values. Value extraction is Core’s job
Auditable: every intent session records the clarification loop for traceability
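
The conversational and auditable properties can be sketched with a small session object. This is an assumption-laden illustration, not the Scout SDK: the class name `IntentSession`, the injected `check` and `ask` functions, and the `audit` array are hypothetical; only the `submit()` method name and the `complete`, `missing`, `question`, and `intent.raw` fields come from the example above.

```javascript
// Hypothetical session: accumulates user turns and re-checks completeness
// over the combined text on every submit().
class IntentSession {
  // check(text) -> { missing: string[] } (may be async)
  // ask(missing) -> string, a clarification question
  constructor(check, ask) {
    this.check = check;
    this.ask = ask;
    this.turns = []; // user messages so far (conversation context)
    this.audit = []; // one entry per submit(), for traceability
  }

  async submit(text) {
    this.turns.push(text);
    const raw = this.turns.join(' ');
    const { missing } = await this.check(raw);
    this.audit.push({ text, missing });
    if (missing.length > 0) {
      return { complete: false, missing, question: this.ask(missing) };
    }
    // Presence only: no values are parsed here, the raw text goes to Core.
    return { complete: true, intent: { raw } };
  }
}
```

Because the session only needs a string in and a question or intent out, the same object can sit behind any channel adapter.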

Layer 3 — Core (Deep Semantic Parsing via Managed Granite 8B)

Core receives validated, complete intents from Scouts and performs deep semantic analysis.

Provider Abstraction:

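// Interface: all providers implement this contract.
// (The class name NlpProvider is illustrative.)
class NlpProvider {
  // Resolves to { structured, confidence, ambiguities }
  async parse(intent, context) { throw new Error('parse() not implemented'); }
  // Resolves to { status, latency }
  async healthCheck() { throw new Error('healthCheck() not implemented'); }
}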

// Implementations
// - TogetherProvider (Together.ai managed inference)
// - FireworksProvider (Fireworks managed inference)
// - ReplicateProvider (migration path from ADR-001)
// - LocalProvider (for testing and development)
// - MockProvider (deterministic responses for unit tests)
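
As an illustration of the contract, a deterministic mock might look like the sketch below. The canned-response map keyed by raw intent string, the shape of the default response, and the assumption that `intent` is a plain string are choices made here for illustration; the ADR only fixes the two method names and their return fields.

```javascript
// Deterministic provider for unit tests: the same input always yields
// the same output, and no network call is made.
class MockProvider {
  // `responses` maps a raw intent string to a canned parse result (assumption).
  constructor(responses = {}) {
    this.responses = responses;
  }

  async parse(intent, context) {
    // Unknown intents get a fixed zero-confidence result instead of throwing.
    return this.responses[intent] ?? {
      structured: {},
      confidence: 0,
      ambiguities: ['unrecognized'],
    };
  }

  async healthCheck() {
    return { status: 'ok', latency: 0 };
  }
}
```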

Core Semantic Analysis:

Category taxonomy mapping (e.g., “keyboards” → office_equipment.keyboards)
Synonym resolution (e.g., “cheap” → price-sensitive, “ASAP” → urgent delivery)
Constraint normalization (e.g., “next week” → ISO 8601 date, “50 bucks” → { max: 50, currency: "USD" })
Market context enrichment (category-aware matching against registered Beacon capabilities)
Confidence scoring with granular breakdown per category

Managed Inference Provider:

Primary: Together.ai or Fireworks (decision deferred to implementation; both support Granite)
Model: IBM Granite 8B Instruct (Apache 2.0 licensed)
Fallback: Local regex parser (existing intent-parser.js) when provider unavailable
Test: MockProvider returns deterministic responses for all test scenarios
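
The primary/fallback arrangement could be wired roughly as follows. This is a sketch under assumptions: `parseWithFallback` and the `source` field are hypothetical, and `fallbackParse` stands in for the existing regex parser, whose real signature is not shown in this ADR.

```javascript
// Tries the managed provider first; if it throws (for example because the
// provider is unavailable), falls back to a local synchronous parser.
async function parseWithFallback(provider, fallbackParse, intent, context) {
  try {
    const result = await provider.parse(intent, context);
    return { ...result, source: 'provider' };
  } catch (err) {
    // `source` lets callers and logs see which path produced the parse.
    return { ...fallbackParse(intent), source: 'fallback' };
  }
}
```

Tagging the result with its source is one way to keep the degraded path visible in activity logs rather than indistinguishable from a model parse.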

Layer 4 — Beacon SDK (Domain-Specific Interpretation)

Beacons use the same @aura-labs/nlp module to interpret incoming session intents against their domain expertise.

beacon.onSession(async (session, beacon) => {
  // Shared NLP module interprets intent in Beacon's domain context
  const interpretation = await beacon.interpretIntent(session.intent, {
    domain: beacon.capabilities,
    inventory: beacon.products,
  });

  const matches = beacon.matchProducts(interpretation);
  if (matches.length > 0) {
    await beacon.submitOffer(session.sessionId, buildOffer(matches));
  }
});

Beacon receives both:

session.intent.raw — original natural language (for domain-specific nuance)
session.intent.structured — Core’s semantic parse (for reliable field access)

Beacons must respond with structured offers so that offers can be compared fairly across Beacons. This is unchanged from ADR-001.

Data Flow Specification

Scout → Core Request (Updated from ADR-001)

{
  "raw_intent": "I need 50 ergonomic keyboards under $5000, delivery by end of month",
  "completeness": {
    "categories_present": ["what", "how_many", "what_kind", "how_much_cost", "when"],
    "categories_missing": [],
    "tier2_triggered": ["when"],
    "clarification_rounds": 0,
    "confidence": 0.91
  },
  "scout_version": "2.0.0",
  "nlp_module_version": "1.0.0"
}

Change from ADR-001: Scout no longer sends pre-extracted fields (quantity, max_budget, etc.). It sends the raw intent with a completeness attestation. Core does all value extraction. This preserves the semantic authority boundary.
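
A sketch of how a Scout might assemble this request from a completeness result. The function name `buildCoreRequest` and the input field names (`present`, `missing`, `tier2Triggered`, `rounds`, `confidence`, `versions.scout`, `versions.nlp`) are hypothetical; the output keys follow the JSON above.

```javascript
// Builds the Scout → Core payload: raw text plus a completeness attestation.
// Deliberately has no slot for extracted values such as quantity or budget.
function buildCoreRequest(rawIntent, completeness, versions) {
  return {
    raw_intent: rawIntent,
    completeness: {
      categories_present: [...completeness.present],
      categories_missing: [...completeness.missing],
      tier2_triggered: [...completeness.tier2Triggered],
      clarification_rounds: completeness.rounds,
      confidence: completeness.confidence,
    },
    scout_version: versions.scout,
    nlp_module_version: versions.nlp,
  };
}
```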

Core → Beacon Query (Unchanged)

{
  "session_id": "abc123",
  "raw_intent": "I need 50 ergonomic keyboards under $5000, delivery by end of month",
  "structured": {
    "category": "office_equipment.keyboards",
    "keywords": ["ergonomic", "keyboard"],
    "quantity": 50,
    "max_unit_price": 100,
    "currency": "USD",
    "delivery_by": "2026-02-28"
  },
  "confidence": 0.92
}

Beacon → Core Response (Unchanged)

{
  "offer_id": "offer_xyz",
  "beacon_id": "beacon_456",
  "product": {
    "name": "ErgoKey Pro Wireless",
    "sku": "EKP-2026",
    "unit_price": 89.99,
    "quantity_available": 200
  },
  "total_price": 4499.50,
  "currency": "USD",
  "delivery_estimate": "2026-02-25",
  "confidence": 0.88,
  "interpretation_notes": "Matched 'ergonomic keyboards' to ErgoKey Pro line"
}

Formal Properties Preserved

Property	Source	How Preserved
Prompt Injection Resistance	NEUTRAL_BROKER.md Property 1	Scout does presence detection only; Core retains semantic authority
Identity Abstraction	NEUTRAL_BROKER.md Property 4	Unchanged — Scout identity decoupled from intent
Constraint Redaction	NEUTRAL_BROKER.md Property 5	Unchanged — Beacons see categories, not budgets/deadlines
Ranking Independence	NEUTRAL_BROKER.md Property 7	Unchanged — Core ranks by compatibility, not payment
Ownership Predicates	DEC-021 (BOLA/BFLA)	Unchanged — all queries include `AND agent_id = $N`
Ed25519 Authentication	DEC-009	Unchanged — all requests signed
Idempotency	DEC-018	Unchanged — UUID keys on all mutations

Failure Modes

Scenario	Behavior
Scout small model unavailable	Fall back to regex-only completeness check (degraded but functional)
Scout can’t determine completeness	Send intent to Core with `completeness.confidence < threshold`; Core handles full parse
Core managed inference unavailable	Fall back to local regex parser (existing `intent-parser.js`)
Core managed inference low confidence	Return `needs_clarification` to Scout; Scout asks user to rephrase
Beacon can’t interpret	Respond with null offer or low-confidence match (unchanged)
Shared NLP module version mismatch	Scout/Beacon include `nlp_module_version` in requests for diagnostics
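
The two Scout-side rows of this table can be read as a small decision function. The sketch below is an interpretation, not a specification: the function name, the returned action strings, and the `threshold` parameter (for which this ADR gives no value) are assumptions.

```javascript
// Decides what the Scout does after a completeness check.
// check: { missing: string[], confidence: number }
function nextScoutAction(check, threshold) {
  if (check.confidence < threshold) {
    // Scout cannot determine completeness: Core handles the full parse.
    return 'send_to_core_low_confidence';
  }
  if (check.missing.length > 0) {
    return 'ask_user';
  }
  return 'send_to_core';
}
```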

Implementation Phases

Phase A: Shared NLP Module + Provider Abstraction

@aura-labs/nlp package: 8-category tiered definitions, completeness checker, conversation generator, provider abstraction, activity logger
Provider implementations: MockProvider (tests), TogetherProvider or FireworksProvider (production)
Test suite: completeness detection across diverse commerce scenarios, provider contract tests, conversation flow tests

Phase B: Scout + Core Integration

Scout SDK: createIntentSession() conversational API, integration with @aura-labs/nlp
Core intent-svc: replace Replicate with provider abstraction, wire to managed Granite 8B
Core sessions.js: call intent-svc instead of local regex parser
E2E test: Scout → completeness check → Core → parsed intent → stored in session

Phase C: Beacon Integration + Validation

Beacon SDK: interpretIntent() method using @aura-labs/nlp
Updated beacon matching with LLM-parsed intents
Full lifecycle E2E: Scout intent → Core parse → Beacon interpret → offer → commit

Consequences

Positive

Channel-agnostic intelligence: Any purchasing channel gets NLP for free by using the Scout SDK
Reduced Core load: Only complete intents reach Core; incomplete intents are resolved at the Scout
Faster user feedback: Clarification happens locally, no round-trip to Core
Shared code: Single NLP module eliminates duplication across Scout and Beacon
Provider flexibility: Inference backend swappable without changing business logic
Preserves formal properties: Semantic authority stays at Core; Scout validates presence only

Negative

Scout dependency on model inference: Scouts now need access to a small model (increases SDK requirements)
Shared module coupling: Scout and Beacon SDKs depend on @aura-labs/nlp (version coordination required)
Increased test surface: Three-layer architecture with shared module requires testing at each layer and across layer boundaries

Mitigations

Scout falls back to regex-only if model unavailable (graceful degradation)
Shared module versioned with semver; SDKs pin to compatible ranges
Each phase includes comprehensive test suites at unit, integration, and E2E levels per Feature Readiness Standard
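
For the versioning mitigation, a compatibility check on the reported `nlp_module_version` might look like the sketch below. It approximates caret-range semantics for versions at or above 1.0.0 (same major, not older than the pinned version) and ignores pre-release tags and 0.x rules; the function names are hypothetical, and a production check would more likely use an established semver library.

```javascript
// Parses "MAJOR.MINOR.PATCH" into numbers; pre-release tags are not handled.
function parseVersion(version) {
  const [major, minor, patch] = version.split('.').map(Number);
  return { major, minor, patch };
}

// True when `actual` is at least `pinned` and shares its major version.
function isCompatible(pinned, actual) {
  const p = parseVersion(pinned);
  const a = parseVersion(actual);
  if (a.major !== p.major) return false;
  if (a.minor !== p.minor) return a.minor > p.minor;
  return a.patch >= p.patch;
}

console.log(isCompatible('1.0.0', '1.2.3')); // → true
console.log(isCompatible('1.0.0', '2.0.0')); // → false
```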

Related Decisions

DEC-024 (this decision’s log entry)
ADR-001 (superseded)
DEC-009: Ed25519 Agent Identity (authentication layer unchanged)
DEC-015: Policy Agents as Operator Layer (NLP output must be auditable for policy agents)
DEC-020: Security Acceptance Criteria (NLP changes must satisfy OWASP baseline)

References

NEUTRAL_BROKER.md — Formal security/privacy/neutrality properties
API_SECURITY_BASELINE.md — OWASP API Top 10 requirements
FEATURE_READINESS.md — Acceptance criteria standard
Intent Service — Existing Granite integration (to be refactored)
Intent Parser — Regex fallback (retained)