Cognitive Extension Epic
User Journey
The Cognitive Extension bounded context manages LLM/AI capabilities including prompt templates, completion orchestration, tool execution, and agent session management. It provides context-aware recommendations, generates interactive cognitive artifacts (mind maps, checklists, kanban boards), and enables configurable AI agents with custom prompt templates for diverse use cases from code generation to PET training.
Jobs to be Done & EARS Requirements
Job: Generate AI-Based Code Snippets
User Story: As a Developer, I want the system to generate code snippets based on a business rule defined in the DSL, so that I can rapidly implement features that adhere to enterprise semantics.
EARS Requirement:
- While the system is operational, when a user expresses an intent requiring code generation, the cognitive-extension context shall:
- Extract business rules from SEA-DSL specifications via semantic-core
- Build prompt template with relevant context (project, role, recent activities)
- Execute LLM completion via ExecuteCompletion flow
- Generate code snippets adhering to enterprise semantics
- Apply prompt injection defense policies
- Redact secrets from generated content
- Return contextually relevant code artifacts
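The steps above can be sketched as a small orchestration. This is a minimal illustration, not the real ExecuteCompletion flow: the names (`buildPrompt`, `redactSecrets`, `CompletionContext`) and the redaction patterns are hypothetical.

```typescript
// Hypothetical sketch of the code-generation pipeline steps above.
// All identifiers are illustrative, not the real cognitive-extension API.

interface CompletionContext {
  project: string;
  role: string;
  recentActivities: string[];
}

// Assemble a prompt template with the contextual fields listed above.
function buildPrompt(rule: string, ctx: CompletionContext): string {
  return [
    `Project: ${ctx.project}`,
    `Role: ${ctx.role}`,
    `Recent: ${ctx.recentActivities.join(", ")}`,
    `Business rule: ${rule}`,
    `Generate a code snippet implementing this rule.`,
  ].join("\n");
}

// Redact obvious secret patterns from generated content (illustrative patterns only;
// a real SecretsRedaction policy would cover far more cases).
function redactSecrets(text: string): string {
  return text
    .replace(/(api[_-]?key\s*[:=]\s*)\S+/gi, "$1[REDACTED]")
    .replace(/(bearer\s+)\S+/gi, "$1[REDACTED]");
}
```

Prompt injection defense would sit between `buildPrompt` and the LLM call; it is omitted here because the policy logic (POL-CE-001) is not specified in this epic.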
Job: Provide Cognitive Amplification
User Story: As a Project Manager, when discussing project risks with my team, I want the AI to suggest a pre-filled risk matrix, so that I can quickly document and analyze potential issues.
EARS Requirement:
- While a user is engaged in a task or conversation, when the context indicates an opportunity for cognitive support, the cognitive-extension context shall:
- Analyze current task context from Knowledge Graph
- Identify relevant cognitive artifact type (risk matrix, decision tree, checklist)
- Proactively offer cognitive artifacts enhancing understanding or decision-making
- Pre-fill artifact with relevant information from semantic context
- Allow user to accept, modify, or reject the suggestion
Job: Generate Context-Aware Recommendations
User Story: As an AI Agent, I want to understand the user’s current project, role, and recent activities, so that my recommendations are tailored to their specific needs.
EARS Requirement:
- When the Artifact Engine analyzes the current user and task context, the cognitive-extension context shall:
- Query Knowledge Graph for user profile (project, role, recent activities)
- Retrieve semantic relationships between business entities
- Match context to cognitive artifact templates
- Score recommendations by relevance using semantic similarity
- Return highly relevant and personalized cognitive artifact recommendations
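Relevance scoring by semantic similarity can be sketched as cosine similarity over embedding vectors. This is an assumption about the scoring mechanism: embedding generation and the template schema are out of scope, and the names are illustrative.

```typescript
// Illustrative relevance scoring: cosine similarity over embedding vectors.
// How embeddings are produced (e.g., by the Memory Context) is assumed, not specified.

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // Guard against zero vectors to avoid division by zero.
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

interface ArtifactTemplate { id: string; embedding: number[]; }

// Rank cognitive artifact templates against the current context embedding.
function rankTemplates(contextVec: number[], templates: ArtifactTemplate[]) {
  return templates
    .map(t => ({ id: t.id, score: cosineSimilarity(contextVec, t.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```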
Job: Render Interactive Cognitive Artifacts
User Story: As the CADSL Runtime, I want to receive a CADSL definition, so that I can render a fully interactive cognitive artifact in the user interface.
EARS Requirement:
- When the Artifact Engine receives a request for a cognitive artifact, the cognitive-extension context shall:
- Accept CADSL definition as input
- Parse artifact structure and components
- Generate interactive UI programmatically using CADSL schema
- Support diverse artifact types: mind maps, checklists, kanban boards, decision trees
- Enable user interaction (edit, save, share) within workflow
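A possible shape for a CADSL definition and the parse-then-render dispatch is sketched below. The real CADSL schema is not specified in this epic, so the types and field names here are hypothetical; the outline renderer stands in for programmatic UI generation.

```typescript
// Hypothetical CADSL shapes; the actual schema is defined elsewhere.

type ArtifactType = "mind_map" | "checklist" | "kanban" | "decision_tree";

interface CadslNode { label: string; children?: CadslNode[]; }
interface CadslDefinition { type: ArtifactType; title: string; root: CadslNode; }

// Walk the parsed structure and produce a renderable outline
// (a stand-in for generating interactive UI components).
function renderOutline(node: CadslNode, depth = 0): string[] {
  const lines = ["  ".repeat(depth) + "- " + node.label];
  for (const child of node.children ?? []) {
    lines.push(...renderOutline(child, depth + 1));
  }
  return lines;
}
```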
Job: Configure AI Agents
User Story: As an AI Operations Engineer, I want to update the prompt template for the customer support AI agent, so that it incorporates new product information.
EARS Requirement:
- When an administrator configures AI agents, the cognitive-extension context shall:
- Accept prompt template updates with new context sources
- Update available skills and tool execution permissions
- Associate context sources from Knowledge Graph and Domain Services
- Tailor agent behavior to specific organizational needs
- Apply tool execution allowlists and secret redaction policies
- Enforce RBAC/permission-scope authority boundaries per SDS-031 for privileged configuration changes
Job: Enable Context-Aware AI Behavior
User Story: As the AI Agent (Conversation), I want to receive context from the Knowledge Graph and Domain Services, so that I can answer user questions with accurate and up-to-date enterprise information.
EARS Requirement:
- While interacting with a user, when real-time context is provided by various sources, the cognitive-extension context shall:
- Receive context from Knowledge Graph (semantic relationships)
- Integrate Domain Services context (business rules, policies)
- Maintain agent session state across conversation turns
- Dynamically adapt responses based on real-time context changes
- Provide accurate, contextually relevant answers and actions
Job: Generate Contextual Cognitive Artifacts
User Story: As a knowledge worker, I want the system to generate cognitive artifacts (mind maps, checklists, kanban boards) from context, so that I can accelerate structured thinking.
EARS Requirement:
- When context-driven recommendations are requested, the cognitive-extension context shall:
- Analyze conversational and project context for intent and scope
- Select appropriate artifact types based on task category
- Generate artifacts with source references and rationale
- Provide editable output with version history
- Respect tool allowlists and secrets redaction policies
- Gate privileged artifact generation flows per SDS-031 RBAC/permission-scope authority checks
Job: Enable Interactive Artifact Editing
User Story: As a Knowledge Worker, I want to directly edit the content of a recommended mind map, so that I can customize it to my specific thought process.
EARS Requirement:
- When a user interacts with generated cognitive artifacts, the cognitive-extension context shall:
- Allow seamless editing of artifact content
- Support real-time updates and collaboration
- Enable saving of modified artifacts
- Provide sharing capabilities within workflow
- Maintain version history for artifact evolution
Job: Evaluate Prompt Quality
User Story: As a learner, I want my prompt to be evaluated for intent, structure, and agentic viability, so that I can receive actionable feedback to improve my prompt engineering skills.
EARS Requirement:
- While PET is operational, when a user submits a prompt via POST /v1/judge/evaluate, the cognitive-extension context shall:
- Detect input language and confidence, set context for all subsequent judges to output feedback in that language, and record any fallback/override decisions
- Execute Sub-Judges pipeline:
- Intent Detector: Extract primary and secondary goals using LLM (weight 0.2)
- Structure Evaluator: Check coherence, specificity, vagueness metrics using rule-based + LLM (weight 0.3)
- Agentic Viability Evaluator: Validate tool invocation completeness, parameters, step sequencing using LLM (weight 0.5)
- Constraint Checker: Verify output format (JSON, Markdown), PII constraints, org-specific rules
- Calculate hybrid score (0-100) from component scores
- Generate concrete “Fix” suggestions localized to user’s prompt language
- Compose improvedPrompt by applying suggestions to the original prompt
- Set flags for lesson triggers (e.g., #missing_constraints, #agentic_ambiguity, #vague_intent, #no_examples)
- Return evaluation with evaluationId, score, summary, sections (intent/structure/agentic), suggestions, improvedPrompt, flags, language, languageConfidence, and languageFallback/override status
- Complete evaluation within 5 seconds; begin streaming partial feedback once evaluation time exceeds 3 seconds
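The hybrid score can be illustrated as a weighted sum of the Sub-Judge scores using the weights stated above (intent 0.2, structure 0.3, agentic 0.5), scaled to 0-100. How the Constraint Checker folds into the score is not specified; treating a constraint failure as a cap at 50 is an assumption for illustration only.

```typescript
// Sketch of the hybrid score from the Sub-Judges pipeline above.
// Sub-Judge scores are assumed to be normalized to 0-1.

interface SubJudgeScores { intent: number; structure: number; agentic: number; }

const WEIGHTS = { intent: 0.2, structure: 0.3, agentic: 0.5 };

function hybridScore(s: SubJudgeScores, constraintsPass: boolean): number {
  const raw =
    s.intent * WEIGHTS.intent +
    s.structure * WEIGHTS.structure +
    s.agentic * WEIGHTS.agentic;
  const score = Math.round(raw * 100);
  // Assumed policy: failing the Constraint Checker caps the score at 50.
  return constraintsPass ? score : Math.min(score, 50);
}
```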
Job: Display Prompt Feedback in Dual-Pane UI
User Story: As a learner, I want to see my prompt response and judge feedback side-by-side, so that I can immediately understand what to improve.
EARS Requirement:
- When prompt evaluation completes, the cognitive-extension context shall:
- Display Left Pane with:
- Original user prompt input
- Raw AI response to the prompt
- Display Right Pane (Prompt Judge) with:
- Inferred Intent with score (1-5) and explanation
- Clarity/Specificity score (1-5)
- Agentic Effectiveness score (1-5) for multi-step prompts
- Mental model gaps identified
- Concrete improvement suggestions by type (add_constraint, add_parameter, clarify_intent)
- Overall summary score (0-100)
- Enable “Apply Improvements” button to replace prompt with improvedPrompt suggestion
- Enable manual editing for iterative refinement
- Support streaming feedback when evaluation time >3 seconds and ensure completion within 5 seconds
Job: Trigger Micro-Lessons Based on Prompt Flags
User Story: As a learner, I want the system to suggest targeted lessons when I make specific mistakes, so that I learn concepts in context.
EARS Requirement:
- While evaluating prompts, when specific flags are set, the cognitive-extension context shall:
- Match flags to lesson triggers:
- #missing_constraints → “Defining Output Constraints” lesson (format, length, safety)
- #agentic_ambiguity → “Agentic Prompt Clarity” lesson (tool parameters, error handling)
- #vague_intent → “Sharpening Your Intent” lesson (single vs. multi-goal prompts)
- #no_examples → “Power of Few-Shot” lesson (adding examples to prompts)
- Retrieve human-authored lesson template from Lesson Library
- Apply SecretsRedaction and PrivacyFirstConsent checks to user examples, removing/obfuscating PII, sensitive, or proprietary data, and log consent status (mandatory)
- Populate template with redacted user-specific examples using LLM only after consent and redaction checks pass
- Present lesson suggestion in context of current prompt
- Track lesson completion and prompt improvement correlation
- Never use purely generative lessons—all lessons must be human-templated
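The flag-to-lesson mapping above is a direct lookup. The lesson template IDs below are illustrative; the Lesson Library's actual keys are not specified in this epic.

```typescript
// Trigger flags from the evaluation mapped to human-authored lesson templates.
// Template IDs are hypothetical placeholders.

const LESSON_TRIGGERS: Record<string, string> = {
  "#missing_constraints": "defining-output-constraints",
  "#agentic_ambiguity": "agentic-prompt-clarity",
  "#vague_intent": "sharpening-your-intent",
  "#no_examples": "power-of-few-shot",
};

// Resolve which lessons to suggest for an evaluation's flags,
// silently skipping flags with no registered lesson.
function lessonsForFlags(flags: string[]): string[] {
  return flags
    .map(f => LESSON_TRIGGERS[f])
    .filter((id): id is string => id !== undefined);
}
```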
Job: Award XP and Gamification Progress
User Story: As a learner, I want to earn XP and badges for improving my prompts, so that I stay motivated and track my progress.
EARS Requirement:
- While the user is active, when prompt-related actions occur, the cognitive-extension context shall:
- Calculate XP for actions:
- Submit prompt: 5 XP base
- Improve prompt (score +10): 15 XP
- Complete lesson: 25 XP
- First prompt of day: 10 XP with streak bonus
- Perfect score (100): 50 XP
- Apply Streak Bonus Formula: base_xp × (1 + streak_days × 0.1) (max 2x multiplier)
- Award Skill Badges when criteria met:
- Intent Master (Bronze): 10 prompts with intent score ≥ 4
- Constraint Expert (Silver): 25 prompts with all constraints met
- Agentic Architect (Gold): 50 agentic prompts with score ≥ 80
- Streak Champion (Platinum): 30-day continuous streak
- Update Leaderboards:
- Scope: Org-level (enterprise), Global (opt-in)
- Metrics: Weekly XP, Improvement Rate, Badge Count
- Privacy: Users can opt out at any time
- Display progress, streaks, and badges in user profile
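The XP rules above, including the streak bonus formula base_xp × (1 + streak_days × 0.1) with its 2x cap, work out as follows. The spec does not say which actions the streak bonus applies to; applying it uniformly here is an assumption.

```typescript
// Worked example of the XP table and streak bonus formula above.

const BASE_XP: Record<string, number> = {
  submit_prompt: 5,
  improve_prompt: 15,
  complete_lesson: 25,
  first_prompt_of_day: 10,
  perfect_score: 50,
};

function awardXp(action: string, streakDays: number): number {
  const base = BASE_XP[action] ?? 0;
  // base_xp × (1 + streak_days × 0.1), capped at a 2x multiplier.
  const multiplier = Math.min(1 + streakDays * 0.1, 2);
  return Math.round(base * multiplier);
}
```

For example, completing a lesson on a 5-day streak yields 25 × 1.5 = 37.5, rounded to 38 XP, while a 20-day streak hits the 2x cap.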
Job: Support Enterprise Custom Rubrics
User Story: As an enterprise administrator, I want to define organization-specific prompt evaluation rules, so that PET enforces our best practices.
EARS Requirement:
- When an enterprise configures custom rubrics, the cognitive-extension context shall:
- Accept rubric.yaml configuration with:
- org_id: Organization identifier
- weights: Custom scoring weights (intent, structure, agentic), summing to 1.0 or accompanied by a defined normalization strategy
- priorities: Org-specific priority rules (e.g., no_jailbreaks, clear_tool_use)
- Validate rubric configuration against schema, including weights sum/normalization rules
- Version rubric configurations for audit trail and pin in-flight evaluations to the version they started with
- Authorize rubric load into Judge Service context by Organization ID per SDS-031/POL-CE-008 to prevent cross-tenant access
- Inject org-specific “Best Practice” documents into Judge Service system prompt
- Enable rubric hot-reload without service restart using atomic swaps; rollback if new rubric causes evaluation failures
- Ensure new evaluations use the latest approved rubric version while in-flight evaluations continue on their pinned version
- Return rubric registration confirmation with version ID and validation/authorization results
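The weights validation rule above (sum to 1.0 or apply a defined normalization strategy) can be sketched as follows. Proportional scaling is one possible normalization strategy, chosen here for illustration.

```typescript
// Minimal validation/normalization of rubric.yaml scoring weights.

interface RubricWeights { intent: number; structure: number; agentic: number; }

function normalizeWeights(w: RubricWeights): RubricWeights {
  const sum = w.intent + w.structure + w.agentic;
  // Already valid: weights sum to 1.0 within tolerance.
  if (Math.abs(sum - 1.0) < 1e-6) return w;
  if (sum <= 0) throw new Error("rubric weights must be positive");
  // Assumed normalization strategy: scale proportionally so weights sum to 1.0.
  return { intent: w.intent / sum, structure: w.structure / sum, agentic: w.agentic / sum };
}
```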
Job: Export Learning Data via SCORM/xAPI
User Story: As an enterprise L&D administrator, I want to export PET learning data to our LMS, so that prompt engineering training integrates with our existing learning ecosystem.
EARS Requirement:
- When export is requested, the cognitive-extension context shall:
- Generate SCORM package with:
- User progress data (XP, lessons completed, badges earned)
- Prompt improvement metrics (before/after scores)
- Time spent and session history
- Generate xAPI statements for:
- Prompt submitted (verb: submitted)
- Lesson completed (verb: completed)
- Badge earned (verb: earned)
- Prompt improved (verb: improved, with score delta)
- Package data in standard SCORM 1.2 or 2004 format
- Include metadata: course ID, learner ID, timestamp, score
- Enable scheduled or manual export via API
- Support field-level encryption for stored prompts in on-prem deployments
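An xAPI statement for the verbs listed above might look like the sketch below. The verb IRIs and `homePage` domain are placeholders; real deployments should use registered xAPI verb IRIs and the organization's actor home page.

```typescript
// Sketch of an xAPI statement for PET learning events.
// IRIs and the homePage domain are hypothetical placeholders.

interface XapiStatement {
  actor: { account: { name: string; homePage: string } };
  verb: { id: string; display: Record<string, string> };
  object: { id: string; objectType: string };
  result?: { score: { raw: number } };
  timestamp: string;
}

function xapiStatement(
  learnerId: string,
  verb: string,            // e.g., "submitted", "completed", "earned", "improved"
  activityId: string,
  scoreDelta?: number,     // only set for "improved" statements
): XapiStatement {
  const stmt: XapiStatement = {
    actor: { account: { name: learnerId, homePage: "https://pet.example.com" } },
    verb: { id: `https://pet.example.com/verbs/${verb}`, display: { "en-US": verb } },
    object: { id: activityId, objectType: "Activity" },
    timestamp: new Date().toISOString(),
  };
  if (scoreDelta !== undefined) stmt.result = { score: { raw: scoreDelta } };
  return stmt;
}
```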
Job: Provide Multilingual Prompt Feedback
User Story: As a global learner, I want to receive prompt feedback in my native language, so that I can effectively learn prompt engineering regardless of language.
EARS Requirement:
- While evaluating prompts, when input language is detected, the cognitive-extension context shall:
- Detect input language using Language Detector (e.g., “es”, “fr”, “de”, “zh”) and capture confidence
- If language detection confidence < 70%, default to English, surface a user-visible fallback notice, and provide a manual language override
- Set language context for all subsequent Sub-Judges; Sub-Judges respond in the detected or user-overridden language
- Generate all feedback, suggestions, and improvedPrompt in the detected or user-overridden language
- Localize UI elements for feedback display (score labels, section headers, action buttons)
- Support lesson templates where v1 templates are English-only while feedback, UI localization, and evaluation outputs support multiple languages
- Return detected language, confidence score, and whether fallback/override was applied in the evaluation response
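The fallback and override rules above reduce to a small decision function. The shape of the detector output is assumed; only the 70% threshold and the English default come from the requirement.

```typescript
// Language resolution per the rules above: manual override wins,
// then detection below 70% confidence falls back to English.

interface LanguageDecision {
  language: string;
  confidence: number;
  fallback: boolean;
  overridden: boolean;
}

function resolveLanguage(
  detected: string,
  confidence: number,       // 0-1 from the Language Detector
  userOverride?: string,
): LanguageDecision {
  if (userOverride) {
    return { language: userOverride, confidence, fallback: false, overridden: true };
  }
  if (confidence < 0.7) {
    // Surface a user-visible fallback notice alongside this result.
    return { language: "en", confidence, fallback: true, overridden: false };
  }
  return { language: detected, confidence, fallback: false, overridden: false };
}
```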
Job: Support Desktop App Workflow (Tauri)
User Story: As a prompt engineering professional, I want a native desktop application for specialized workflow, so that I can focus without browser distractions.
EARS Requirement:
- When PET Desktop App launches, the cognitive-extension context shall:
- Initialize Tauri shell wrapping React frontend
- Connect to backend API via localhost or configured endpoint
- Enable specialized desktop features:
- Keyboard shortcuts for prompt submission
- Native notifications for lesson completions
- Offline caching of lesson templates
- Local storage for prompt drafts
- Support on-prem deployment with field-level encryption for prompts
- Validate resource usage (memory, CPU, startup time, bundle size) with benchmarks against a defined Electron baseline and target delta thresholds; publish benchmark results and a CI test harness before claiming lower resource usage
- Sync progress with cloud when online (for enterprise users) with conflict detection and resolution:
- Detect divergent states on sync using primary keys for progress records
- Default resolution: last-write-wins, with explicit merge rules for XP, badges, and lesson completion
- Provide user-facing conflict prompts and an admin override via resolveConflict API
- Acceptance criteria: audit logs for conflict events, retries, and deterministic outcomes
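The merge rules above can be sketched as follows: last-write-wins for plain scalar fields, with the explicit merges for XP, badges, and lesson completion. The field names are illustrative, and taking the maximum XP (so progress never regresses) is an assumption about the "explicit merge rules".

```typescript
// Sketch of sync conflict resolution for progress records.
// Field names and the XP-max rule are assumptions for illustration.

interface Progress {
  updatedAt: number;          // epoch millis; drives last-write-wins
  streakDays: number;         // scalar: resolved last-write-wins
  xp: number;                 // merged: take max so XP never regresses
  badges: string[];           // merged: set union
  lessonsCompleted: string[]; // merged: set union
}

function mergeProgress(local: Progress, remote: Progress): Progress {
  const winner = local.updatedAt >= remote.updatedAt ? local : remote;
  return {
    updatedAt: Math.max(local.updatedAt, remote.updatedAt),
    streakDays: winner.streakDays,
    xp: Math.max(local.xp, remote.xp),
    badges: [...new Set([...local.badges, ...remote.badges])],
    lessonsCompleted: [...new Set([...local.lessonsCompleted, ...remote.lessonsCompleted])],
  };
}
```

A real implementation would also emit the audit log entries and user-facing conflict prompts required above; this sketch covers only the default resolution.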
Job: Maintain Privacy and Encrypt Stored Prompts
User Story: As an enterprise user, I want my prompts to be encrypted at field level, so that my intellectual property is protected even in on-prem deployments.
EARS Requirement:
- While storing prompts, when a prompt is persisted with explicit consent, the cognitive-extension context shall:
- Apply field-level encryption to prompt content before storage using AES-256-GCM (256-bit keys) or a KMS-backed AEAD
- Use encryption keys managed by organization (enterprise) or system (individual)
- Protect metadata by encrypting sensitive fields (user ID) or using pseudonymous identifiers; only non-sensitive metadata (timestamp, evaluation ID) may remain plaintext
- Enable decryption only for authorized users with OAuth2 bearer tokens + RBAC per SDS-031/POL-CE-008 (and mTLS for service-to-service), with KMS IAM policies for key access; log access for SOC2/ISO27001 auditability
- Support key rotation for encrypted fields with re-encryption of existing ciphertexts or a dual-key grace period and explicit rotation workflow (read old/write new, background re-encrypt, retire old key)
- Log all access to encrypted prompts in audit trail
- Never persist prompts without explicit user consent (test mode or evaluation-only)
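Field-level encryption with AES-256-GCM can be sketched with Node's built-in crypto module. This is a minimal round-trip illustration only: a real deployment would source the key from a KMS, manage rotation, and authenticate associated metadata rather than hold raw keys in process.

```typescript
// Minimal AES-256-GCM field encryption sketch using Node's built-in crypto.
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

function encryptField(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, fresh per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ct = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Store iv + auth tag + ciphertext together, base64-encoded.
  return Buffer.concat([iv, tag, ct]).toString("base64");
}

function decryptField(encoded: string, key: Buffer): string {
  const buf = Buffer.from(encoded, "base64");
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28);
  const ct = buf.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // GCM authentication: tampering makes final() throw
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}
```

The dual-key rotation workflow above (read old/write new) would wrap these functions with a key-version prefix on each stored ciphertext.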
Domain Entities Summary
Root Aggregates
- CompletionRequest/Response: LLM completion orchestration with prompt templates and responses
- AgentSession: Maintains conversational state across multiple turns
- CognitiveArtifact: Generated interactive tools (mind maps, checklists, kanban boards, decision trees)
- PromptEvaluation: PET evaluation with evaluationId, score, sections, suggestions, improvedPrompt, and flags
- Lesson: Human-authored template with trigger flags, content structure, and completion tracking
- UserProgress: Gamification state with XP, streaks, badges, and leaderboard rankings
- RubricConfiguration: Enterprise-specific evaluation rules with org_id, weights, priorities, and versioning
Value Objects
- PromptTemplate: Configurable templates for AI agent behavior
- ToolCall: External tool execution with allowlist enforcement
- RecommendationScore: Semantic relevance scoring for artifact suggestions
- SubJudgeResult: Individual evaluator output (intent, structure, agentic, constraint) with score and feedback
- Suggestion: Concrete fix action with type (add_constraint, add_parameter, clarify_intent) and localized text
- LessonTrigger: Flag mapping (e.g., #missing_constraints) to lesson template ID
- Badge: Skill achievement with criteria (tier, count required, score threshold)
- XPCalculation: Experience point calculation with base value, streak multiplier, and bonus logic
Policy Rules
- PromptInjectionDefense (POL-CE-001): Validates and sanitizes prompt inputs
- ToolExecutionAllowlist (POL-CE-002): Enforces approved external tool access
- SecretsRedaction (POL-CE-003): Removes sensitive information from outputs
- HighImpactApproval (POL-CE-004): Requires approval for critical operations
- AuthorityBoundaries (POL-CE-008 / SDS-031): Enforces RBAC, permission-scope checks, and auditable approvals for privileged operations (rubric authorization, key access, agent configuration, artifact generation)
- PrivacyFirstConsent (POL-CE-005): Prompts persist only with explicit user consent (PET invariant)
- JudgingDeterminism (POL-CE-006): Identical inputs with mode=test produce identical evaluation results (PET invariant)
- FieldLevelEncryption (POL-CE-007): Stored prompts encrypted at field level for enterprise privacy (PET invariant)
Integration Points
- Semantic Core Context: Provides SEA-DSL parsing and semantic grounding
- Knowledge Graph Service: Supplies semantic relationships and entity context
- Query Context: Retrieves relevant policies for AI responses
- Memory Context: Provides semantic search for context retrieval
- Domain Services: Business rules and domain-specific context
- PET Judge Service: Evaluates prompts via Sub-Judges pipeline (Intent, Structure, Agentic, Constraint)
- PET Lesson Library: Human-authored templates triggered by prompt flags
- PET Gamification Engine: Calculates XP, awards badges, manages leaderboards
- SCORM/xAPI Export: LMS integration for enterprise learning data portability
- Tauri Desktop Shell: Native desktop application wrapper for specialized workflow