PHANTOM - Development Guide for Claude
Living Machine Learning Framework - Production-grade document intelligence, RAG pipeline, and AI classification system.
Version: 0.0.1 (Pre-Alpha) Python: 3.11+ License: Apache-2.0 Last Updated: 2026-03-26
Table of Contents
- Project Overview
- Current State Assessment
- Architecture Quick Reference
- Development Priorities
- Quality Assessment
- Frontend-Backend Integration
- Key Files Reference
- Testing Strategy
- Common Tasks
- Known Issues & TODOs
Project Overview
What is Phantom?
Phantom is a local-first AI document intelligence framework that processes unstructured documents into actionable intelligence using:
- CORTEX Engine: Semantic chunking, parallel LLM classification, insight extraction
- RAG Pipeline: FAISS vector indexing with sentence-transformers embeddings
- Multi-Interface: CLI (Typer), REST API (FastAPI), Desktop UI (Tauri + SvelteKit)
- Production-Ready: VRAM monitoring, auto-throttling, thread pool concurrency, Prometheus metrics
- Fully Reproducible: Nix flake with locked dependencies
Core Capabilities
- Document Processing: Extract insights from markdown, text, PDFs
- Vector Search: Semantic search over document embeddings (FAISS)
- Classification: Multi-threaded LLM-based document classification
- Sentiment Analysis: NLTK VADER + optional spaCy NER
- RAG Pipeline: Question-answering over knowledge base
- Resource Management: Real-time VRAM/RAM monitoring with auto-throttling
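The auto-throttling idea in the last bullet can be illustrated with a small sketch. This is not Phantom's implementation: the function name `throttled_workers` and both thresholds are hypothetical, and the real logic may weigh VRAM and RAM differently.

```python
def throttled_workers(
    max_workers: int,
    used_fraction: float,
    soft_limit: float = 0.75,
    hard_limit: float = 0.90,
) -> int:
    """Scale the worker count down as memory usage approaches a hard limit."""
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0 and 1")
    if used_fraction <= soft_limit:
        return max_workers  # plenty of headroom: full concurrency
    if used_fraction >= hard_limit:
        return 1  # near exhaustion: serialize the work
    # Linear ramp between the soft and hard limits.
    headroom = (hard_limit - used_fraction) / (hard_limit - soft_limit)
    return max(1, int(max_workers * headroom))
```

A caller would sample memory usage before each batch and size its thread pool from the returned value.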
Tech Stack
- Backend: Python 3.11+, FastAPI, Pydantic 2.0
- Frontend: Tauri 2.0, SvelteKit, Svelte 5, Tailwind CSS
- ML/NLP: sentence-transformers, FAISS, NLTK, scikit-learn, tiktoken
- Inference: llama.cpp (local, OpenAI-compatible API)
- Agent: Rust (Crane build system, multi-crate workspace)
- DevOps: Nix flake, GitHub Actions (8 workflows), pre-commit hooks
- Observability: structlog, Prometheus metrics
Current State Assessment
✅ Production-Ready Components
| Component | Status | Files | Test Coverage |
|---|---|---|---|
| CORTEX Processor | ✅ Complete | core/cortex.py | High |
| Semantic Chunker | ✅ Complete | core/cortex.py | High |
| FAISS Vector Store | ✅ Complete | rag/vectors.py | High |
| Sentiment Engine | ✅ Complete | analysis/sentiment.py | High |
| Embedding Generator | ✅ Complete | core/embeddings.py | Medium |
| LlamaCpp Provider | ✅ Complete | providers/llamacpp.py | Medium |
| FastAPI Server | ✅ Complete | api/app.py | High |
| Prometheus Metrics | ✅ Complete | api/app.py | High |
| Pydantic Schemas | ✅ Complete | All modules | High |
| CI/CD Pipelines | ✅ Complete | .github/workflows/ | N/A |
| Nix Environment | ✅ Complete | flake.nix | N/A |
| CLI Commands | ✅ Complete | cli/main.py | Medium |
| RAG Query API | ✅ Complete | api/app.py | High |
| Document Upload | ✅ Complete | api/app.py | High |
| Vector Indexing API | ✅ Complete | api/app.py | High |
| SSE Streaming Chat | ✅ Complete | api/app.py | Medium |
🟡 Partially Implemented Components
| Component | Status | Missing | Priority |
|---|---|---|---|
| Desktop UI | 🟡 Framework | Component polish, e2e tests | Medium |
| tools vram | 🟡 Partial | Model-specific VRAM estimates | Low |
| Cloud LLM Providers | 🟡 Stub | OpenAI, Anthropic, DeepSeek impl | Medium |
❌ Not Implemented
- Cloud LLM Providers (OpenAI, Anthropic, DeepSeek)
- Kubernetes/Helm packaging
- Full desktop app UI (marked for GTK4 migration)
- Redis semantic cache integration
- Advanced prompt workbench features
Code Quality Metrics
- Total Python LOC: 11,290 (33 source files)
- Test Files: 18 (unit, integration, e2e)
- Test Coverage: 70% minimum (enforced via pytest)
- Linting: Ruff (enforced in CI)
- Type Checking: mypy (enabled, non-strict mode)
- Security Scanning: bandit, pip-audit, cargo-audit
- Documentation: 20+ markdown files
Architecture Quick Reference
Directory Structure
phantom/
├── src/phantom/              # Main Python source (11,290 LOC)
│   ├── core/                 # CORTEX engine, embeddings, chunking
│   ├── rag/                  # Vector stores (FAISS), search
│   ├── analysis/             # Sentiment, SPECTRE, viability
│   ├── pipeline/             # DAG orchestration, classification
│   ├── providers/            # LLM providers (llama.cpp base)
│   ├── cerebro/              # RAG engine + knowledge integration
│   ├── neutron/              # Compliance guardrails (SENTINEL)
│   ├── api/                  # FastAPI REST server + Judge API
│   └── cli/                  # Typer CLI interface
├── tests/                    # 18 test files
│   ├── unit/                 # Unit tests (isolated components)
│   ├── integration/          # API + CLI tests
│   └── e2e/                  # End-to-end pipeline tests
├── cortex-desktop/           # Tauri + SvelteKit + Svelte 5
├── intelagent/               # Rust agent (multi-crate workspace)
│   ├── crates/security/      # Privacy & audit modules
│   ├── crates/governance/    # DAO & rewards systems
│   ├── crates/memory/        # Context & knowledge graphs
│   ├── crates/quality/       # Automated peer review gates
│   └── crates/mcp/           # MCP protocol handlers
├── docs/                     # 20+ markdown documentation files
├── nix/                      # NixOS module definitions
└── .github/workflows/        # CI/CD pipelines (8 workflows)
Data Flow
User Input → CLI/API/Desktop
↓
CORTEX Processor → Semantic Chunker → Embeddings
↓
LLM Classifier (llama.cpp) → Insights Extraction
↓
FAISS Vector Store → RAG Pipeline
↓
Results (JSON + Pydantic validation)
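The stage order in the diagram can be sketched with toy stand-ins. Everything below is hypothetical and uses only the standard library: the `Chunk` class, the paragraph splitter, the two-number "embedding", and the keyword classifier merely mark where the semantic chunker, sentence-transformers, llama.cpp, and FAISS would sit.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    embedding: list[float] = field(default_factory=list)
    label: str = ""


def chunk_text(text: str) -> list[Chunk]:
    """Stand-in for the semantic chunker: one chunk per non-empty paragraph."""
    return [Chunk(p.strip()) for p in text.split("\n\n") if p.strip()]


def embed(chunk: Chunk) -> Chunk:
    """Stand-in for the embedding model: a two-number toy vector."""
    chunk.embedding = [float(len(chunk.text)), float(len(chunk.text.split()))]
    return chunk


def classify(chunk: Chunk) -> Chunk:
    """Stand-in for the LLM classifier: a simple punctuation rule."""
    chunk.label = "question" if chunk.text.endswith("?") else "statement"
    return chunk


def run_pipeline(text: str) -> dict:
    """Run the stages in diagram order and return a JSON-ready summary."""
    index: list[Chunk] = []  # stand-in for the FAISS vector store
    for chunk in chunk_text(text):
        index.append(classify(embed(chunk)))
    return {"chunks": len(index), "labels": [c.label for c in index]}
```

Calling `run_pipeline("Hello world.\n\nIs this indexed?")` walks both paragraphs through chunking, embedding, classification, and indexing in that order.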
API Endpoints (Current)
| Endpoint | Method | Status | Returns |
|---|---|---|---|
| /health | GET | ✅ Complete | Health status |
| /ready | GET | ✅ Complete | Readiness checks |
| /metrics | GET | ✅ Complete | Prometheus metrics |
| /extract | POST | 🟡 TODO | Document insights |
| /upload | POST | 🟡 Partial | File upload |
| /rag/query | GET | 🟡 TODO | RAG query |
| /judge | POST | 🟡 Integration | AI-OS-Agent judgment |
Missing API Endpoints (Needed for Frontend)
| Endpoint | Method | Purpose | Frontend Usage |
|---|---|---|---|
| /process | POST | Document processing | cortex-desktop/src/lib/api.ts:24 |
| /vectors/search | POST | Vector semantic search | cortex-desktop/src/lib/api.ts:64 |
| /vectors/index | POST | Index document to FAISS | cortex-desktop/src/lib/api.ts:81 |
| /api/chat | POST | RAG chat interface | cortex-desktop/src/routes/+page.svelte:145 |
| /api/models | GET | List available models | cortex-desktop/src/routes/+page.svelte:127 |
| /api/prompt/test | POST | Test prompt rendering | cortex-desktop/src/routes/+page.svelte:209 |
| /api/upload | POST | Multi-file upload | cortex-desktop/src/routes/+page.svelte:287 |
Development Priorities
Phase 1: Complete Backend API ✅ COMPLETED (2026-02-05)
Goal: Implement missing API endpoints so frontend can use real data instead of stubs.
Status: All 7 endpoints implemented and tested. See PHASE1_IMPLEMENTATION.md for details.
1.1 Document Processing Endpoint
File: src/phantom/api/app.py
Current:
@app.post("/extract", response_model=ExtractResponse)
async def extract(request: ExtractRequest):
    # TODO: Implement using CortexProcessor
    insights = {"themes": [], "patterns": [], ...}
Needs:
from fastapi import UploadFile
from phantom.core.cortex import CortexProcessor

@app.post("/process")
async def process(file: UploadFile, chunk_strategy: str = "recursive", chunk_size: int = 1024):
    """Process document using CORTEX engine."""
    # chunk_strategy and chunk_size are accepted but not yet passed to the processor.
    content = await file.read()
    processor = CortexProcessor()
    result = processor.extract_insights(content.decode(), filename=file.filename)
    return {
        "filename": file.filename,
        "insights": result.model_dump(),
        "processing_time": ...,
    }
Frontend expects: cortex-desktop/src/lib/api.ts:20-31
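The `processing_time` value is left open in the snippet above. One way to fill it is sketched below, assuming the field is meant to hold elapsed seconds; the helper name `build_process_response` and the injected `extract` callable are hypothetical and stand in for the CortexProcessor call.

```python
import time
from typing import Callable


def build_process_response(
    filename: str,
    content: bytes,
    extract: Callable[[str], dict],
) -> dict:
    """Decode an upload, run an extractor on it, and time the call."""
    start = time.perf_counter()
    # errors="replace" keeps a malformed upload from raising UnicodeDecodeError.
    insights = extract(content.decode("utf-8", errors="replace"))
    elapsed = time.perf_counter() - start
    return {
        "filename": filename,
        "insights": insights,
        "processing_time": round(elapsed, 4),  # seconds (assumed unit)
    }
```

`time.perf_counter()` is used rather than `time.time()` because it is monotonic and intended for measuring short durations.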
1.2 Vector Search Endpoint
File: src/phantom/api/app.py (new endpoint)
Needs:
from phantom.rag.vectors import FAISSVectorStore

@app.post("/vectors/search")
async def vector_search(query: str, top_k: int = 5):
    """Semantic search using FAISS."""
    store = FAISSVectorStore()  # Or singleton instance
    results = await store.search(query, top_k=top_k)
    return {
        "query": query,
        "results": [{"text": r.text, "score": r.score} for r in results],
        "total_results": len(results),
    }
Frontend expects: cortex-desktop/src/lib/api.ts:63-75
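To make the `results` and `total_results` fields concrete, here is a brute-force cosine-similarity search in pure Python. It is only a stand-in for what the FAISS store does with an index, and the function names and the (text, vector) document layout are assumptions made for this sketch.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vec: list[float],
    docs: list[tuple[str, list[float]]],
    top_k: int = 5,
) -> dict:
    """Score every document against the query and keep the top_k best."""
    scored = [(cosine(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    results = [{"text": text, "score": score} for score, text in scored[:top_k]]
    return {"results": results, "total_results": len(results)}
```

Scanning every vector is linear in the collection size, which is the cost a vector index is meant to avoid on larger corpora.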
1.3 Vector Indexing Endpoint
File: src/phantom/api/app.py (new endpoint)
Needs:
@app.post("/vectors/index")
async def vector_index(file: UploadFile):
    """Index document into FAISS vector store."""
    content = await file.read()
    store = FAISSVectorStore()
    # NOTE: semantic_chunker is not defined in this snippet; it is assumed
    # to be a chunker instance available at module level.
    chunks = semantic_chunker.chunk(content.decode())
    count = await store.add_documents(chunks)
    return {"status": "indexed", "chunks_indexed": count}
Frontend expects: cortex-desktop/src/lib/api.ts:77-88
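The chunking step that feeds `chunks_indexed` can be pictured with a fixed-size splitter with overlap. This is deliberately simpler than semantic chunking and is not the project's chunker; the function name and the `overlap` parameter are hypothetical, while the 1024 default mirrors the `chunk_size` default shown in section 1.1.

```python
def chunk_with_overlap(text: str, chunk_size: int = 1024, overlap: int = 128) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last window is not made only of overlap.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of indexing some text twice.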
1.4 RAG Chat Endpoint
File: src/phantom/api/app.py (replace /rag/query)
Needs:
@app.post("/api/chat")
async def rag_chat(
    message: str,
    conversation_id: str,
    history: list[dict],
    context_size: int = 5,
    llm_provider: str = "local",
):
    """RAG-powered chat with context."""
    # 1. Vector search for relevant context
    store = FAISSVectorStore()
    context_chunks = await store.search(message, top_k=context_size)
    # 2. Build prompt with context
    prompt = build_rag_prompt(message, context_chunks, history)
    # 3. Call LLM
    provider = get_llm_provider(llm_provider)
    response = await provider.complete(prompt)
    return {
        "message": {
            "content": response,
            "sources": [{"text": c.text, "score": c.score} for c in context_chunks],
        }
    }
Frontend expects: cortex-desktop/src/routes/+page.svelte:145-186
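The snippet above calls `build_rag_prompt` without showing it. A minimal sketch of such a helper follows, under several assumptions: context chunks are passed as plain strings rather than objects with a `.text` attribute, history entries carry `role` and `content` keys, and the `max_history` parameter and instruction wording are invented for illustration.

```python
def build_rag_prompt(
    message: str,
    context_chunks: list[str],
    history: list[dict],
    max_history: int = 6,
) -> str:
    """Assemble a prompt from numbered context, recent turns, and the new message."""
    context = "\n".join(f"[{i}] {c}" for i, c in enumerate(context_chunks, start=1))
    # Keep only the most recent turns so the prompt stays bounded.
    turns = "\n".join(f"{t['role']}: {t['content']}" for t in history[-max_history:])
    return (
        "Answer using only the context below. Cite sources by number.\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{turns}\n\n"
        f"user: {message}\nassistant:"
    )
```

Numbering the chunks lets the model's citations be mapped back to the `sources` list returned by the endpoint.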
1.5 Models List Endpoint
File: src/phantom/api/app.py (new endpoint)
Needs:
@app.get("/api/models")
async def list_models():
    """List available LLM models by provider."""
    return {
        "local": [
            {"id": "qwen-30b", "name": "Qwen 30B (Local)"},
            {"id": "llama-3-8b", "name": "Llama 3 8B (Local)"},
        ],
        "openai": [],  # Future: OpenAI models
        "anthropic": [],  # Future: Claude models
    }
Frontend expects: cortex-desktop/src/routes/+page.svelte:127-129
Phase 2: System Monitoring Integration (MEDIUM PRIORITY) 🟡
Goal: Expose real host machine metrics to the frontend (CPU, memory, VRAM, temperature).
2.1 Add System Metrics Endpoint
File: src/phantom/api/app.py (new endpoint)
Implementation:
import psutil
# VRAM monitoring (existing code in core/cortex.py:226)
from phantom.core.cortex import get_vram_usage

@app.get("/api/system/metrics")
def system_metrics():
    """Get real-time system resource metrics."""
    # cpu_percent(interval=1) blocks for one second, so this handler is a plain
    # `def`: FastAPI runs sync handlers in a worker thread, not on the event loop.
    cpu_percent = psutil.cpu_percent(interval=1)
    mem = psutil.virtual_memory()
    disk = psutil.disk_usage('/')
    vram = get_vram_usage()
    return {
        "cpu": {
            "percent": cpu_percent,
            "count": psutil.cpu_count(),
        },
        "memory": {
            "total": mem.total,
            "used": mem.used,
            "available": mem.available,
            "percent": mem.percent,
        },
        "disk": {
            "total": disk.total,
            "used": disk.used,
            "free": disk.free,
            "percent": disk.percent,
        },
        "vram": vram,  # If GPU available
    }
Frontend Integration:
- Add system monitor component in cortex-desktop/
- Display real-time metrics in settings or dashboard tab
- Use for resource warnings before heavy operations
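The resource-warning idea in the last bullet could look like the sketch below, written in Python for consistency with the backend even though the check might live in the frontend. The function name and the default limits are hypothetical; the input follows the `percent` fields of the metrics payload shown above.

```python
from typing import Optional


def resource_warnings(metrics: dict, limits: Optional[dict] = None) -> list[str]:
    """Return one warning per resource whose usage meets or exceeds its limit."""
    limits = limits or {"cpu": 90.0, "memory": 85.0, "disk": 95.0}
    warnings = []
    for name, limit in limits.items():
        # Missing sections (for example no disk data) are skipped, not treated as errors.
        percent = metrics.get(name, {}).get("percent")
        if percent is not None and percent >= limit:
            warnings.append(f"{name} usage at {percent:.0f}% (limit {limit:.0f}%)")
    return warnings
```

An empty list means the heavy operation can proceed; a non-empty one can be shown to the user before a large indexing job starts.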
Phase 3: Desktop UI Enhancement (LOW PRIORITY) 🟢
Goal: Complete the Cortex Desktop UI with functional components.
3.1 Current State
Framework: Tauri 2 + SvelteKit + Svelte 5
Status: Infrastructure ready, minimal UI implemented
File: cortex-desktop/src/routes/+page.svelte (1,193 lines)
Implemented Tabs:
- ✅ Chat (RAG conversation)
- ✅ Process (document processing)
- ✅ Search (vector search)
- ✅ Workbench (prompt engineering)
- ✅ Library (saved prompts)
- ✅ Settings (API config)
Issues:
- Backend endpoints missing (Phase 1 priority)
- No real-time metrics display
- No error handling UI
- No processing progress indicators
3.2 Enhancement Tasks
- Add System Monitor Tab
  - Real-time CPU/memory/VRAM charts
  - Historical metrics (last 24h)
  - Resource alerts
- Improve Error Handling
  - Toast notifications for API errors
  - Retry logic with exponential backoff
  - Offline mode detection
- Add Progress Indicators
  - Document processing progress
  - Indexing progress (chunk by chunk)
  - Model loading status
- Enhance Chat UI
  - Markdown rendering for responses
  - Code syntax highlighting
  - Copy-to-clipboard for responses
  - Export conversation history
Quality Assessment
Code Quality: A- (Excellent)
Strengths:
- ✅ Comprehensive Pydantic validation (type safety)
- ✅ Clean separation of concerns (core, rag, analysis, api, cli)
- ✅ Extensive testing infrastructure (70% coverage minimum)
- ✅ Modern Python 3.11+ features (type hints, dataclasses)
- ✅ Production-ready observability (structlog, Prometheus)
- ✅ Strict linting (Ruff) and security scanning (bandit, pip-audit)
Areas for Improvement:
- 🟡 Missing docstrings in some modules (inconsistent)
- 🟡 TODO comments in production code (api/app.py)
- 🟡 No auto-generated API documentation (Sphinx/MkDocs)
- 🟡 Some type annotations use Any (reduce these)
Architecture: A (Very Good)
Strengths:
- ✅ Clear layered architecture (client → application → processing → storage)
- ✅ Provider abstraction for LLM flexibility
- ✅ DAG pipeline for complex workflows
- ✅ Reproducible builds (Nix flake)
- ✅ Comprehensive ADR documentation
Considerations:
- 🟡 Singleton pattern for vector store (consider dependency injection)
- 🟡 No caching layer implemented (Redis planned)
- 🟡 No distributed processing support (single-node only)
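The dependency-injection suggestion in the first consideration can be sketched as follows. `InMemoryStore` is a hypothetical stand-in for the FAISS-backed store, and `get_store` and `index_document` are illustrative names, not existing project functions.

```python
from functools import lru_cache


class InMemoryStore:
    """Hypothetical stand-in for the FAISS-backed vector store."""

    def __init__(self) -> None:
        self.items: list[str] = []

    def add(self, text: str) -> int:
        self.items.append(text)
        return len(self.items)


@lru_cache(maxsize=1)
def get_store() -> InMemoryStore:
    """Build the store once; later calls return the same cached instance."""
    return InMemoryStore()


def index_document(text: str, store: InMemoryStore) -> int:
    """Handler logic receives the store as an argument, so tests can pass a fake."""
    return store.add(text)
```

In a FastAPI handler the same provider could be wired in as `store = Depends(get_store)`, which keeps a single shared instance while leaving the handler testable with a substitute store.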
Testing: B+ (Good)
Strengths:
- ✅ Multi-level testing (unit, integration, e2e)
- ✅ Fixture-based test data (conftest.py)
- ✅ Async test support (pytest-asyncio)
- ✅ Coverage enforcement (70% minimum)
- ✅ CI/CD integration
Gaps:
- 🟡 Frontend tests missing (no Vitest/Playwright setup)
- 🟡 CLI commands untested (stubs not implemented)
- 🟡 Load testing not implemented
- 🟡 No mutation testing
Documentation: B+ (Good)
Strengths:
- ✅ Extensive README (634 lines)
- ✅ Architecture diagrams (Mermaid)
- ✅ Quick start guides
- ✅ ADR records
- ✅ Security policy
Gaps:
- 🟡 No auto-generated API docs
- 🟡 Inconsistent module docstrings
- 🟡 No video tutorials
- 🟡 Missing deployment guides
Frontend-Backend Integration
Current Status
Frontend Architecture:
- Framework: Tauri 2.0 (Rust backend) + SvelteKit + Svelte 5
- API Client: cortex-desktop/src/lib/api.ts
- State Management: Svelte 5 runes ($state, $effect)
- Styling: Tailwind CSS (Catppuccin Mocha theme)
Key Finding: Frontend does NOT use mock data ✅
The frontend is properly designed to call real API endpoints at:
- http://localhost:8000 (CORTEX API)
- http://localhost:8081 (RAG API)
Problem: Backend endpoints are marked as "TODO" (see Phase 1 priorities).
Integration Checklist
- Frontend API client properly structured (api.ts)
- TypeScript interfaces match expected backend responses
- Health check endpoint working (/health)
- Prometheus metrics working (/metrics)
- Document processing endpoint (/process) - TODO
- Vector search endpoint (/vectors/search) - TODO
- Vector indexing endpoint (/vectors/index) - TODO
- RAG chat endpoint (/api/chat) - TODO
- Models list endpoint (/api/models) - TODO
- System metrics endpoint (/api/system/metrics) - TODO
Key Files Reference
Backend Core
| File | Purpose | LOC | Test Coverage |
|---|---|---|---|
| src/phantom/core/cortex.py | CORTEX processor, semantic chunking | 500+ | High |
| src/phantom/core/embeddings.py | sentence-transformers integration | 200+ | Medium |
| src/phantom/rag/vectors.py | FAISS vector store | 300+ | High |
| src/phantom/analysis/sentiment.py | NLTK VADER sentiment | 250+ | High |
| src/phantom/api/app.py | FastAPI server | 190 | High |
| src/phantom/cli/main.py | Typer CLI | 150+ | Low (stubs) |
| src/phantom/providers/llamacpp.py | llama.cpp provider | 200+ | Medium |
Frontend Core
| File | Purpose | LOC |
|---|---|---|
| cortex-desktop/src/routes/+page.svelte | Main UI component | 1,193 |
| cortex-desktop/src/lib/api.ts | API client | 89 |
| cortex-desktop/src-tauri/src/main.rs | Tauri Rust backend | ~100 |
Configuration
| File | Purpose |
|---|---|
| flake.nix | Nix environment definition |
| pyproject.toml | Python package config |
| pytest.ini | Test configuration |
| .pre-commit-config.yaml | Pre-commit hooks |
| justfile | Task runner (50+ commands) |
Documentation
| File | Purpose |
|---|---|
| README.md | Main project documentation (634 lines) |
| CORTEX_V2_ARCHITECTURE.md | Architecture deep-dive |
| VRAM_CALCULATOR.md | Hardware planning guide |
| ROADMAP.md | 4-phase development plan |
| CONTRIBUTING.md | Development workflow |
Testing Strategy
Running Tests
# Enter Nix development environment
nix develop
# Run all tests
just test
# Run with coverage report
just test-cov
# Run specific test levels
just test-unit # Unit tests only
just test-integration # Integration tests
just test-e2e # End-to-end tests
# Run tests by marker
pytest -m "not slow" # Skip slow tests
pytest -m "integration" # Integration only
Test Structure
tests/
├── conftest.py                  # Shared fixtures
├── test_imports.py              # Critical import validation
├── unit/
│   ├── test_cortex.py           # CORTEX processor
│   ├── test_cortex_chunker.py   # Semantic chunking
│   ├── test_sentiment.py        # Sentiment analysis
│   ├── test_vector_store.py     # FAISS operations
│   └── test_pipeline.py         # DAG pipeline
├── integration/
│   ├── test_api.py              # FastAPI endpoints
│   └── test_cli.py              # CLI commands
└── e2e/
    └── test_full_pipeline.py    # End-to-end flows
Coverage Requirements
- Minimum: 70% (enforced via pytest.ini)
- Target: 80%+
- Exclusions: */tests/*, */__pycache__/*
Adding New Tests
- Unit Tests: Test isolated functions/classes
- Integration Tests: Test API endpoints
- E2E Tests: Test full workflows
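As an illustration of the unit-test level, the sketch below tests one isolated function using pytest's plain-`assert` style. The function `split_paragraphs` is hypothetical and defined inline so the example is self-contained; a real test would import the unit under test from src/phantom instead.

```python
def split_paragraphs(text: str) -> list[str]:
    """Hypothetical unit under test: split on blank lines and drop empty blocks."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def test_split_paragraphs_drops_empty_blocks():
    # Four newlines produce an empty block between "a" and "b".
    assert split_paragraphs("a\n\n\n\nb") == ["a", "b"]


def test_split_paragraphs_empty_input():
    assert split_paragraphs("") == []
```

pytest collects functions whose names start with `test_`, so a file like this placed under tests/unit/ would be picked up without extra registration.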
Common Tasks
Starting the Development Environment
# Enter Nix shell with all dependencies
nix develop
# Or use direnv (auto-load on cd)
echo "use flake" > .envrc
direnv allow
Running the API Server
# Development server (auto-reload)
just run-api
# Or manually with uvicorn
python -m uvicorn phantom.api.app:app --reload --host 127.0.0.1 --port 8000
Running the Desktop App
cd cortex-desktop
npm install # First time only
npm run tauri dev # Start Tauri dev server
Code Quality Checks
# Lint Python code
just lint
# Format Python code
just format
# Type check
just typecheck
# Security scan
just security-scan
# All checks at once
just quality
Database/Vector Store Management
# Initialize FAISS index (if needed)
python -c "from phantom.rag.vectors import FAISSVectorStore; store = FAISSVectorStore(); store.save('data/index.faiss')"
# Check index stats
python -c "from phantom.rag.vectors import FAISSVectorStore; store = FAISSVectorStore.load('data/index.faiss'); print(f'Vectors: {store.ntotal}')"
Git Workflow
# Pre-commit hooks installed automatically in Nix shell
git add .
git commit -m "feat: implement /process endpoint" # Hooks run automatically
# Manual hook run
pre-commit run --all-files
Known Issues & TODOs
Critical (Fix Immediately)
- Missing API Endpoints (src/phantom/api/app.py)
  - /process - Document processing
  - /vectors/search - Vector search
  - /vectors/index - Document indexing
  - /api/chat - RAG chat
  - /api/models - Model listing
  - /api/prompt/test - Prompt testing
- CLI Not Functional (src/phantom/cli/main.py)
  - phantom extract - Stub only
  - phantom analyze - Stub only
  - phantom classify - Stub only
  - phantom scan - Stub only
High Priority
- No Frontend Tests
  - Set up Vitest for unit tests
  - Set up Playwright for e2e tests
  - Test API client (api.ts)
- Documentation Gaps
  - Auto-generate API docs (Sphinx/MkDocs)
  - Add module docstrings to all files
  - Create deployment guide
Medium Priority
- Error Handling
  - Add proper exception hierarchy
  - Improve API error responses (consistent format)
  - Add retry logic for transient failures
- Observability
  - Add distributed tracing (OpenTelemetry)
  - Add request ID propagation
  - Add structured error logging
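The "retry logic for transient failures" item could take the shape of the sketch below, which retries with exponential backoff. The `TransientError` class, the `retry` helper, and its defaults are hypothetical; which exceptions count as transient in Phantom is not specified here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on TransientError with delays of base_delay * 2**n."""
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so tests can record the delays instead of waiting; any other exception type propagates immediately without a retry.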
Low Priority
- Cloud Providers
  - OpenAI integration
  - Anthropic integration
  - DeepSeek integration
- Advanced Features
  - Redis semantic cache
  - Kubernetes/Helm charts
  - Multi-node distributed processing
Development Philosophy
Key Principles
- Local-First: No cloud dependencies for core functionality
- Type Safety: Pydantic everywhere, strong typing
- Reproducibility: Nix flake, locked dependencies
- Production-Ready: Metrics, logging, health checks from day one
- Test-Driven: 70% minimum coverage, multi-level testing
- Clean Architecture: Layered, modular, dependency injection
Code Style
- Python: Follow Ruff rules (pycodestyle, pyflakes, isort, flake8-bugbear)
- TypeScript: Follow Prettier + ESLint
- Rust: Follow rustfmt + clippy
- Commits: Conventional Commits format (feat:, fix:, docs:)
- Line Length: 100 characters (Python), 120 (TypeScript)
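The Conventional Commits rule can be checked mechanically. The sketch below is a simplified validator for the subject line only; the list of accepted types beyond `feat`, `fix`, and `docs` is an assumption borrowed from common practice, and the project's actual hook configuration may differ.

```python
import re

# type, optional (scope), optional "!" for breaking changes, then ": description"
_COMMIT_RE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([\w./-]+\))?(!)?: \S.*"
)


def is_conventional(subject: str) -> bool:
    """Return True when a commit subject matches the simplified pattern above."""
    return _COMMIT_RE.match(subject) is not None
```

For example, `is_conventional("feat: implement /process endpoint")` passes, while a subject without a type prefix does not.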
Adding New Features
- Write Tests First (TDD when possible)
- Implement Minimal Viable Feature
- Add Type Annotations & Pydantic Models
- Add Docstrings (Google style)
- Run Quality Checks (just quality)
- Update Documentation (README, CLAUDE.md)
- Commit with Conventional Commits
Additional Resources
Documentation Files
- README.md - Main project documentation
- CORTEX_V2_ARCHITECTURE.md - Architecture deep-dive
- CORTEX_V2_QUICKSTART.md - Getting started guide
- VRAM_CALCULATOR.md - Hardware planning
- NIX_PYTHON_GUIDELINES.md - Nix + Python best practices
- ROADMAP.md - Development phases
- CONTRIBUTING.md - Contribution guidelines
- SECURITY.md - Security policy
External Links
- Repository: https://github.com/kernelcore/phantom
- CI/CD: https://github.com/kernelcore/phantom/actions
- NixOS Packages: https://search.nixos.org/packages
- FastAPI Docs: https://fastapi.tiangolo.com/
- SvelteKit Docs: https://kit.svelte.dev/
Quick Start Checklist
For New Developers
- Clone repository
- Install Nix (if not installed): curl -L https://nixos.org/nix/install | sh
- Enter dev environment: nix develop
- Run tests: just test
- Start API server: just run-api
- Read CONTRIBUTING.md
- Check current issues: Phase 1 priorities above
For Claude
- Read this file fully
- Check Phase 1 priorities (Backend API endpoints)
- Verify test coverage before implementing features
- Follow code style (Ruff, type hints, docstrings)
- Update this file when adding new features
- Run just quality before committing
Last Updated: 2026-02-05 Maintainer: kernelcore <kernelcore@voidnix.dev> Version: 2.0.0 (Beta)
This document is auto-updated. For questions, check CONTRIBUTING.md or open an issue.