SecureLLM MCP Server
Enterprise-Grade Model Context Protocol Server for Intelligent Development Workflows
Overview
SecureLLM MCP is a production-ready Model Context Protocol (MCP) server that turns AI assistants into development partners. It combines semantic caching, reasoning systems, and a broad tool suite to support NixOS and systems programming workflows.
Key Capabilities
- Semantic Intelligence: 50-70% cost reduction through embedding-based query caching
- Hybrid Reasoning: Context inference, multi-step planning, and causal impact analysis
- Production-Ready: Circuit breakers, retry logic, structured logging, and Prometheus metrics
- NixOS First-Class: Deep integration with the Nix ecosystem, including package debugging, flake management, and build optimization
- Emergency Framework: Laptop thermal protection during intensive builds
- Knowledge Management: Persistent learning with SQLite + FTS5 full-text search
- Security-Focused: SOPS secrets management, OAuth integration, sandboxed execution
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                     MCP CLIENT (Claude, Cline)                      │
└────────────────────────────┬────────────────────────────────────────┘
                             │ stdio/HTTP
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                      SecureLLM MCP Server Core                      │
│    ┌────────────────┐   ┌────────────────┐   ┌────────────────┐     │
│    │ Semantic       │   │ Smart Rate     │   │ Knowledge      │     │
│    │ Cache          │   │ Limiter        │   │ Database       │     │
│    │ (Embeddings)   │   │ (Circuit       │   │ (SQLite +      │     │
│    │                │   │  Breaker)      │   │  FTS5)         │     │
│    └────────────────┘   └────────────────┘   └────────────────┘     │
└─────────────────────────────────────────────────────────────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        ▼                    ▼                    ▼
┌──────────────┐   ┌──────────────────┐   ┌──────────────────┐
│ Reasoning    │   │ Development      │   │ Infrastructure   │
│ Systems      │   │ Tools            │   │ Management       │
│              │   │                  │   │                  │
│ • Context    │   │ • Nix Package    │   │ • SSH Remote     │
│   Inference  │   │   Debugger       │   │   Execution      │
│ • Multi-Step │   │ • Build Analyzer │   │ • System Health  │
│   Planner    │   │ • Flake Ops      │   │   Monitoring     │
│ • Causal     │   │ • Web Search     │   │ • Emergency      │
│   Analysis   │   │ • Browser Auto   │   │   Framework      │
│ • Adaptive   │   │ • Research Agent │   │ • Backup Manager │
│   Learning   │   │ • Code Analysis  │   │ • Log Analysis   │
└──────────────┘   └──────────────────┘   └──────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                      Observability & Security                       │
│  ┌────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐  │
│  │ Prometheus │   │ Structured │   │ OAuth/     │   │ Sandboxed  │  │
│  │ Metrics    │   │ Logging    │   │ GitHub     │   │ Execution  │  │
│  └────────────┘   └────────────┘   └────────────┘   └────────────┘  │
└─────────────────────────────────────────────────────────────────────┘
Features
🧠 Intelligent Caching Layer
Semantic Cache - Industry-first embedding-based caching for MCP servers:
- Semantic Similarity Detection: Understands that "check system temperature" and "verify thermal status" are equivalent queries
- Cost Optimization: 50-70% reduction in tool execution costs
- Automatic Expiration: TTL-based cache invalidation with periodic cleanup
- Performance Metrics: Real-time hit/miss rates, token savings, similarity scores
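The similarity matching and TTL expiry described above can be sketched as follows. This is a minimal in-memory illustration, not the project's implementation: the `SemanticCache` class, its method names, and the toy three-dimensional embeddings are assumptions, and the defaults simply mirror the `SEMANTIC_CACHE_THRESHOLD` and `SEMANTIC_CACHE_TTL` example values from the environment configuration.

```typescript
// Hypothetical in-memory semantic cache; names and shapes are illustrative only.
interface CacheEntry {
  embedding: number[];
  response: string;
  expiresAt: number; // epoch milliseconds
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class SemanticCache {
  private entries: CacheEntry[] = [];
  private readonly threshold: number;
  private readonly ttlSeconds: number;

  constructor(threshold = 0.85, ttlSeconds = 3600) {
    this.threshold = threshold;
    this.ttlSeconds = ttlSeconds;
  }

  set(embedding: number[], response: string, now = Date.now()): void {
    this.entries.push({ embedding, response, expiresAt: now + this.ttlSeconds * 1000 });
  }

  // Returns the most similar unexpired response at or above the threshold,
  // or undefined on a cache miss.
  get(embedding: number[], now = Date.now()): string | undefined {
    this.entries = this.entries.filter((e) => e.expiresAt > now); // TTL cleanup
    let best: CacheEntry | undefined;
    let bestScore = this.threshold;
    for (const entry of this.entries) {
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best?.response;
  }
}
```

In a real deployment the embeddings would come from an embedding model (the README mentions an optional llama.cpp server); a linear scan like this is only reasonable for small caches.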
// Queries like these hit the same cache:
"What's the current CPU temperature?"
"Check thermal status of the system"
"Show me processor heat levels"
🎯 Smart Rate Limiting
Production-grade request management with circuit breaker pattern:
- Per-Provider Queuing: FIFO request queues with configurable limits
- Circuit Breaker: Automatic failure detection and recovery
- Exponential Backoff: Intelligent retry with jitter
- Metrics Collection: Request latency percentiles (p50, p95, p99), error categorization, queue depths
- Prometheus Export: HTTP metrics endpoint for observability
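As an illustration of the exponential backoff with jitter listed above, here is a small sketch using the common "full jitter" variant. The option names (`baseMs`, `maxMs`, `random`) and the `retryWithBackoff` helper are hypothetical; the project's own retry strategy may differ.

```typescript
// Illustrative only: option names and defaults are assumptions.
interface BackoffOptions {
  baseMs: number;
  maxMs: number;
  random?: () => number; // injectable so delays can be made deterministic in tests
}

// Full jitter: a uniform delay in [0, min(maxMs, baseMs * 2^attempt)).
function backoffDelay(attempt: number, opts: BackoffOptions): number {
  const random = opts.random ?? Math.random;
  const ceiling = Math.min(opts.maxMs, opts.baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retries an async operation, sleeping for a jittered delay between attempts.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  opts: BackoffOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelay(attempt, opts);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Jitter spreads retries from many clients over time, which avoids synchronized retry bursts against a provider that is already struggling.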
🗄️ Knowledge Management System
Persistent learning infrastructure with advanced search:
- SQLite + FTS5: Full-text search with Porter stemming and Unicode support
- Session Management: Contextual conversation tracking across interactions
- Structured Storage: Typed entries (insights, decisions, code, references)
- Priority System: High/medium/low classification for relevance ranking
- Project Watcher: Automatic file system monitoring and knowledge extraction
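To make the typed entries and priority ranking concrete, the sketch below models entries in plain TypeScript and ranks matches by term overlap, then by priority. It is a simplified stand-in: the real server uses SQLite FTS5 for matching, and the field names, `PRIORITY_WEIGHT` values, and `searchKnowledge` function here are assumptions for illustration.

```typescript
type EntryType = "insight" | "decision" | "code" | "reference";
type Priority = "high" | "medium" | "low";

interface KnowledgeEntry {
  id: number;
  entryType: EntryType;
  content: string;
  tags: string[];
  priority: Priority;
}

// Hypothetical weights used to break ties between equally relevant entries.
const PRIORITY_WEIGHT: Record<Priority, number> = { high: 3, medium: 2, low: 1 };

// Naive stand-in for FTS5: counts how many query terms occur in content or tags.
function matchScore(entry: KnowledgeEntry, query: string): number {
  const haystack = (entry.content + " " + entry.tags.join(" ")).toLowerCase();
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term.length > 0 && haystack.includes(term)).length;
}

// Returns matching entries, best match first; priority breaks ties.
function searchKnowledge(entries: KnowledgeEntry[], query: string, limit = 5): KnowledgeEntry[] {
  return entries
    .map((entry) => ({ entry, score: matchScore(entry, query) }))
    .filter((r) => r.score > 0)
    .sort(
      (a, b) =>
        b.score - a.score ||
        PRIORITY_WEIGHT[b.entry.priority] - PRIORITY_WEIGHT[a.entry.priority],
    )
    .slice(0, limit)
    .map((r) => r.entry);
}
```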
🔧 NixOS Development Tools
Comprehensive tooling for NixOS ecosystem:
- Package Debugger: Diagnose and fix Nix package build failures
- Flake Operations: Build, update, and manage Nix flakes
- Build Analyzer: Performance profiling and optimization recommendations
- Hash Calculator: Automatic SHA256 calculation for fetchurl/fetchFromGitHub
- Configuration Generator: Smart Nix expression generation
🛡️ Emergency Framework
Laptop protection during intensive operations:
- Thermal Monitoring: Real-time CPU/GPU temperature tracking
- Rebuild Safety Checks: Pre-build thermal validation
- Automatic Throttling: Force cooldown when temperature exceeds thresholds
- Forensic Analysis: Post-build thermal profiling with detailed reports
- War Room Mode: Live monitoring during critical operations
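A pre-build thermal check of the kind described above might reduce to a threshold comparison like the one below. The threshold values, type names, and the "fail closed when no sensor data is available" policy are assumptions made for this sketch, not the framework's documented behavior.

```typescript
// Placeholder limits: the project's actual thresholds are not documented here.
interface ThermalThresholds {
  warnC: number;
  criticalC: number;
}

interface ThermalReading {
  sensor: string;
  celsius: number;
}

type ThermalVerdict = "safe" | "throttle" | "abort";

interface ThermalAssessment {
  verdict: ThermalVerdict;
  hottest: ThermalReading | undefined;
}

// Classifies the hottest sensor reading against the thresholds.
function assessThermals(readings: ThermalReading[], limits: ThermalThresholds): ThermalAssessment {
  let hottest: ThermalReading | undefined;
  for (const reading of readings) {
    if (hottest === undefined || reading.celsius > hottest.celsius) hottest = reading;
  }
  if (hottest === undefined) return { verdict: "abort", hottest }; // no data: fail closed
  if (hottest.celsius >= limits.criticalC) return { verdict: "abort", hottest };
  if (hottest.celsius >= limits.warnC) return { verdict: "throttle", hottest };
  return { verdict: "safe", hottest };
}
```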
Hybrid Reasoning (Beta)
Next-generation AI capabilities currently in development:
- Context Inference Engine: Automatic entity extraction from user input and project state
- Proactive Action Engine: Execute preparatory checks before asking questions
- Multi-Step Planner: Decompose complex tasks into dependency-ordered steps
- Causal Reasoning: Predict change impacts through dependency graph analysis
- Adaptive Learning: Continuous improvement from interaction feedback
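Decomposing a task into dependency-ordered steps is essentially a topological sort. The sketch below uses Kahn's algorithm; the `PlanStep` shape and the step names in the usage note are hypothetical, and the actual planner is still in beta and may work differently.

```typescript
interface PlanStep {
  id: string;
  dependsOn: string[];
}

// Orders steps so that every step appears after all of its dependencies.
// Throws on unknown dependencies and on dependency cycles.
function orderSteps(steps: PlanStep[]): string[] {
  const remaining = new Map<string, number>(); // unmet dependency count per step
  const dependents = new Map<string, string[]>(); // step -> steps waiting on it
  for (const step of steps) remaining.set(step.id, step.dependsOn.length);
  for (const step of steps) {
    for (const dep of step.dependsOn) {
      if (!remaining.has(dep)) throw new Error(`Unknown dependency: ${dep}`);
      const waiting = dependents.get(dep) ?? [];
      waiting.push(step.id);
      dependents.set(dep, waiting);
    }
  }

  const ready = steps.filter((s) => s.dependsOn.length === 0).map((s) => s.id);
  const ordered: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    ordered.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = (remaining.get(next) ?? 0) - 1;
      remaining.set(next, left);
      if (left === 0) ready.push(next);
    }
  }
  if (ordered.length !== steps.length) throw new Error("Dependency cycle detected");
  return ordered;
}
```

For example, a plan where `build` depends on `update-flake` and `thermal-check`, and `test` depends on `build`, is ordered so that both prerequisites run before `build` and `test` runs last.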
Installation
Prerequisites
- Node.js: 22.0+ (native ESM support)
- NixOS: Recommended for full feature set
- SQLite: 3.35+ (for FTS5 support)
- Optional: llama.cpp server for semantic caching embeddings
Quick Start
# Clone repository
git clone https://github.com/marcosfpina/securellm-mcp.git
cd securellm-mcp
# Install dependencies
npm install
# Build
npm run build
# Run server
node build/src/index.js
Environment Configuration
Create a .env file:
# Core Configuration
PROJECT_ROOT=/path/to/your/project
ENABLE_KNOWLEDGE=true
KNOWLEDGE_DB_PATH=~/.local/share/securellm/knowledge.db
# Semantic Cache (Optional)
ENABLE_SEMANTIC_CACHE=true
SEMANTIC_CACHE_THRESHOLD=0.85
SEMANTIC_CACHE_TTL=3600
LLAMA_CPP_URL=http://localhost:8080
# API Keys (loaded via SOPS in production)
ANTHROPIC_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
DEEPSEEK_API_KEY=your_key_here
# Observability
METRICS_PORT=9090
LOG_LEVEL=info
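For readers wiring these variables into their own tooling, the sketch below shows one way to parse and validate them. The `ServerConfig` shape, the `loadConfig` function, and every default value are assumptions that mirror the example values above; they are not taken from the server's source.

```typescript
interface ServerConfig {
  projectRoot: string;
  enableKnowledge: boolean;
  semanticCacheEnabled: boolean;
  semanticCacheThreshold: number;
  semanticCacheTtlSeconds: number;
  llamaCppUrl: string;
  metricsPort: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

// Reads a numeric variable, falling back to a default and rejecting out-of-range values.
function readNumber(env: Env, key: string, fallback: number, min: number, max: number): number {
  const raw = env[key];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || value < min || value > max) {
    throw new Error(`${key} must be a number between ${min} and ${max}, got "${raw}"`);
  }
  return value;
}

// Defaults below are illustrative assumptions based on the example .env values.
function loadConfig(env: Env): ServerConfig {
  return {
    projectRoot: env["PROJECT_ROOT"] ?? ".",
    enableKnowledge: env["ENABLE_KNOWLEDGE"] === "true",
    semanticCacheEnabled: env["ENABLE_SEMANTIC_CACHE"] === "true",
    semanticCacheThreshold: readNumber(env, "SEMANTIC_CACHE_THRESHOLD", 0.85, 0, 1),
    semanticCacheTtlSeconds: readNumber(env, "SEMANTIC_CACHE_TTL", 3600, 1, 86400 * 365),
    llamaCppUrl: env["LLAMA_CPP_URL"] ?? "http://localhost:8080",
    metricsPort: readNumber(env, "METRICS_PORT", 9090, 1, 65535),
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}
```

In Node.js this would typically be called as `loadConfig(process.env)` after the .env file has been loaded.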
MCP Client Integration
Claude Desktop
// ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "securellm": {
      "command": "node",
      "args": ["/path/to/securellm-mcp/build/src/index.js"],
      "env": {
        "PROJECT_ROOT": "/your/project/path"
      }
    }
  }
}
Cline (VSCodium/VSCode)
// ~/.config/VSCodium/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json
{
  "mcpServers": {
    "securellm": {
      "command": "node",
      "args": ["/path/to/securellm-mcp/build/src/index.js"],
      "env": {
        "PROJECT_ROOT": "${workspaceFolder}"
      }
    }
  }
}
Usage Examples
Package Debugging
// Diagnose why a Nix package won't build
await mcp.call("package_diagnose", {
  package_path: "./pkgs/custom-app/default.nix",
  package_type: "js",
  build_test: true
});

// Download package from GitHub with automatic hash calculation
await mcp.call("package_download", {
  package_name: "awesome-tool",
  package_type: "tar",
  source: {
    type: "github_release",
    github: {
      repo: "owner/awesome-tool",
      tag: "v1.2.3",
      asset_pattern: "*.tar.gz"
    }
  }
});
Emergency Framework
// Check if it's safe to rebuild
await mcp.call("rebuild_safety_check");

// Monitor thermals during build
await mcp.call("thermal_warroom", {
  duration: 120 // Monitor for 2 minutes
});

// Get forensic analysis after thermal event
await mcp.call("thermal_forensics", {
  duration: 180,
  skip_rebuild: false
});
Knowledge Management
// Create development session
const session = await mcp.call("create_session", {
  summary: "Implementing new authentication module"
});

// Save insights during development
await mcp.call("save_knowledge", {
  session_id: session.id,
  entry_type: "decision",
  content: "Using JWT tokens instead of sessions for API auth",
  tags: ["auth", "api", "jwt"],
  priority: "high"
});

// Search past decisions
const results = await mcp.call("search_knowledge", {
  query: "authentication jwt",
  entry_type: "decision",
  limit: 5
});
System Health Monitoring
// Comprehensive health check
await mcp.call("system_health_check", {
  detailed: true
});

// Analyze system logs
await mcp.call("system_log_analyzer", {
  service: "sshd",
  since: "1 hour ago",
  level: "error"
});

// Service management
await mcp.call("system_service_manager", {
  action: "restart",
  service: "nginx"
});
Research & Analysis
// Deep research on technical topics
await mcp.call("research_agent", {
  topic: "Rust async runtime comparison",
  depth: "comprehensive",
  sources: ["github", "reddit", "documentation"]
});

// Analyze codebase complexity
await mcp.call("analyze_complexity", {
  directory: "./src",
  include_patterns: ["**/*.ts"],
  metrics: ["cyclomatic", "cognitive", "maintainability"]
});

// Find potentially dead code
await mcp.call("find_dead_code", {
  directory: "./src",
  extensions: [".ts", ".js"]
});
Resources
The server exposes several MCP resources for querying system state:
- config://current - Current SecureLLM configuration
- logs://audit - Recent audit log entries
- metrics://usage - Provider usage statistics
- metrics://prometheus - Prometheus-format metrics
- metrics://semantic-cache - Cache performance stats
- docs://api - API documentation
// Query cache performance
const stats = await mcp.read("metrics://semantic-cache");
console.log(`Hit rate: ${stats.hitRate}%`);
console.log(`Tokens saved: ${stats.tokensSaved}`);
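The examples in this README call `mcp.call(...)` and `mcp.read(...)` without defining `mcp`. One plausible reading is a thin convenience wrapper around an MCP client. The sketch below assumes exactly that: `McpClientLike` is a locally declared interface modeled on the `callTool` and `readResource` methods of the MCP TypeScript SDK client, and `createMcpHelper` is a hypothetical name.

```typescript
// Minimal structural view of an MCP client, declared locally so the sketch
// stays self-contained. Any object with these two methods will work.
interface McpClientLike {
  callTool(request: { name: string; arguments?: Record<string, unknown> }): Promise<unknown>;
  readResource(request: { uri: string }): Promise<unknown>;
}

interface McpHelper {
  call(name: string, args?: Record<string, unknown>): Promise<unknown>;
  read(uri: string): Promise<unknown>;
}

// Hypothetical wrapper providing the `mcp.call` / `mcp.read` shape used in the examples.
function createMcpHelper(client: McpClientLike): McpHelper {
  return {
    call: (name, args = {}) => client.callTool({ name, arguments: args }),
    read: (uri) => client.readResource({ uri }),
  };
}
```

With a connected client, `const mcp = createMcpHelper(client);` yields the object used throughout the examples. This sketch returns whatever the client returns; the examples read fields such as `session.id` and `stats.hitRate` directly, which suggests the real helper also unwraps the protocol's result envelope, a step omitted here.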
Performance
Benchmarks
- Semantic Cache Lookup: < 10ms (in-memory embedding comparison)
- Knowledge DB Search: < 50ms (FTS5 indexed queries)
- Rate Limiter Overhead: < 5ms per request
- Circuit Breaker Decision: < 1ms
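A circuit breaker decision can stay well under a millisecond because it usually amounts to a constant-time state check. The sketch below shows a generic closed/open/half-open breaker; the class name, default threshold, and cooldown are assumptions, not the values used by this server.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold = 5, cooldownMs = 30000) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // O(1) decision: open circuits reject requests until the cooldown has elapsed,
  // after which a trial request is let through (half-open).
  allowRequest(now = Date.now()): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failed trial request, or too many consecutive failures, opens the circuit.
  recordFailure(now = Date.now()): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = now;
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```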
Scalability
- Memory Footprint: ~512MB base + 256MB per active reasoning session
- Database Size: ~100MB per 10,000 knowledge entries
- Concurrent Requests: 100+ simultaneous tool calls (per-provider queuing)
- Cache Storage: ~1KB per cached response
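The approximate figures above translate into a simple capacity estimate. The function below only restates that arithmetic; the `estimateFootprint` name and its return shape are illustrative, and the results inherit the roughness of the quoted numbers.

```typescript
interface FootprintEstimate {
  memoryMb: number;
  databaseMb: number;
  cacheMb: number;
}

// Applies the README's rough figures: ~512 MB base + 256 MB per reasoning session,
// ~100 MB per 10,000 knowledge entries, ~1 KB per cached response.
function estimateFootprint(
  activeSessions: number,
  knowledgeEntries: number,
  cachedResponses: number,
): FootprintEstimate {
  return {
    memoryMb: 512 + 256 * activeSessions,
    databaseMb: (knowledgeEntries / 10000) * 100,
    cacheMb: cachedResponses / 1024,
  };
}
```

For example, 4 active reasoning sessions, 25,000 knowledge entries, and 2,048 cached responses work out to roughly 1,536 MB of memory, 250 MB of database, and 2 MB of cache storage.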
Security
Secrets Management
- SOPS Integration: Encrypted secrets stored in secrets.yaml
- Environment Variables: Runtime API key injection
- No Hardcoded Credentials: All sensitive data externalized
Sandboxed Execution
- Tool Whitelisting: Configurable allowed commands
- Path Restrictions: Sandboxed file system access
- Network Isolation: Optional network policy enforcement
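Tool whitelisting can be illustrated with a conservative allowlist check. This is a sketch under stated assumptions: the `isCommandAllowed` function and the example allowlist are hypothetical, and rejecting shell metacharacters and explicit paths is one possible policy rather than the server's documented one.

```typescript
// Returns true only when the command's executable is a bare name on the allowlist
// and the command line contains no shell metacharacters.
function isCommandAllowed(commandLine: string, allowed: ReadonlySet<string>): boolean {
  const tokens = commandLine.trim().split(/\s+/);
  const executable = tokens[0] ?? "";
  if (executable === "") return false;
  if (/[;&|`$<>]/.test(commandLine)) return false; // no chaining, substitution, or redirection
  if (executable.includes("/")) return false; // bare command names only, no explicit paths
  return allowed.has(executable);
}

// Hypothetical allowlist for illustration.
const exampleAllowlist: ReadonlySet<string> = new Set(["nix", "git", "systemctl"]);
```

A check like this is only a first filter; it complements, rather than replaces, path restrictions and process-level sandboxing.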
Audit Trail
- Structured Logging: All actions logged with context
- Knowledge DB Audit: Complete interaction history
- Metrics Retention: 30-day historical performance data
Development
Project Structure
securellm-mcp/
├── src/
│   ├── index.ts                    # MCP server entry point
│   ├── knowledge/
│   │   └── database.ts             # SQLite + FTS5 implementation
│   ├── middleware/
│   │   ├── semantic-cache.ts       # Embedding-based caching
│   │   ├── rate-limiter.ts         # Smart rate limiting
│   │   ├── circuit-breaker.ts      # Failure detection
│   │   ├── retry-strategy.ts       # Exponential backoff
│   │   └── metrics-collector.ts    # Performance tracking
│   ├── reasoning/
│   │   ├── context-manager.ts      # Context inference
│   │   ├── multi-step-planner.ts   # Task decomposition
│   │   └── proactive-executor.ts   # Pre-action execution
│   ├── tools/
│   │   ├── package-diagnose.ts     # Nix package debugging
│   │   ├── emergency/              # Thermal protection
│   │   ├── laptop-defense/         # System safety
│   │   ├── system/                 # Health monitoring
│   │   ├── ssh/                    # Remote execution
│   │   ├── browser/                # Web automation
│   │   └── nix/                    # Nix ecosystem tools
│   ├── types/
│   │   ├── knowledge.ts            # Knowledge DB schemas
│   │   ├── semantic-cache.ts       # Cache type definitions
│   │   └── middleware/             # Middleware types
│   └── utils/
│       ├── logger.ts               # Pino structured logging
│       ├── project-detection.ts    # Auto project root detection
│       └── host-detection.ts       # NixOS hostname resolution
├── docs/                           # Architecture documentation
├── tests/                          # Integration tests
└── build/                          # Compiled output
Building from Source
# Development mode with watch
npm run watch
# Production build
npm run build
# Run tests
npm test
# Type checking
npx tsc --noEmit
Contributing
- Architecture Changes: Review docs/HYBRID-REASONING-ARCHITECTURE.md
- Code Style: Follow existing TypeScript patterns, use Zod for validation
- Testing: Add integration tests for new tools
- Documentation: Update README and inline JSDoc comments
Roadmap
Phase 1: Core Infrastructure ✅
- MCP server implementation
- Knowledge database (SQLite + FTS5)
- Smart rate limiter with circuit breaker
- Semantic cache with embeddings
- Nix package debugging tools
- Emergency framework
- Prometheus metrics
Phase 2: Reasoning Systems 🚧
- Context inference engine
- Proactive action executor
- Multi-step task planner
- Causal dependency analyzer
- Adaptive learning system
Phase 3: Advanced Tools 🚧
- SSH remote execution suite
- Browser automation tools
- Sensitive data handling
- File organization system
- Advanced code analysis
Phase 4: Enterprise Features
- Multi-user support
- Role-based access control
- Distributed caching
- Horizontal scaling
- SaaS deployment
Monitoring & Observability
Prometheus Metrics
Expose metrics on an HTTP endpoint:
# Start metrics server
export METRICS_PORT=9090
node build/src/index.js
# Query metrics
curl http://localhost:9090/metrics
Available metrics:
- mcp_rate_limiter_requests_total{provider="deepseek"}
- mcp_rate_limiter_request_duration_seconds{provider="openai"}
- mcp_circuit_breaker_state{provider="anthropic"}
- mcp_semantic_cache_hits_total
- mcp_semantic_cache_tokens_saved_total
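Each metric is served as a line in the Prometheus text exposition format: a metric name, an optional label set in braces, and a value. The sketch below renders such lines; the `Sample` type and function names are illustrative, and the values in the usage note are made up.

```typescript
interface Sample {
  name: string;
  labels?: Record<string, string>;
  value: number;
}

// Escapes backslashes, double quotes, and newlines in label values,
// as the text exposition format requires.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Renders one sample as `name{label="value",...} value`.
function renderSample(sample: Sample): string {
  const labels = sample.labels ?? {};
  const pairs = Object.keys(labels).map((key) => `${key}="${escapeLabelValue(labels[key] ?? "")}"`);
  const labelPart = pairs.length > 0 ? `{${pairs.join(",")}}` : "";
  return `${sample.name}${labelPart} ${sample.value}`;
}

function renderSamples(samples: Sample[]): string {
  return samples.map(renderSample).join("\n") + "\n";
}
```

For instance, a counter named `mcp_rate_limiter_requests_total` with label `provider="deepseek"` and an illustrative value of 42 renders as `mcp_rate_limiter_requests_total{provider="deepseek"} 42`.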
Structured Logging
Pino-based JSON logging:
{
"level": "info",
"time": 1704196800000,
"msg": "Semantic cache hit",
"similarity": 0.92,
"toolName": "thermal_check",
"tokensSaved": 150
}
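Because each log record is a single JSON object, cache effectiveness can be summarized straight from the log stream. The sketch below assumes one JSON record per line with the field names shown in the example above; `summarizeCacheHits` is a hypothetical helper, not part of the server.

```typescript
interface CacheHitSummary {
  hits: number;
  tokensSaved: number;
  meanSimilarity: number;
}

// Aggregates "Semantic cache hit" records from newline-delimited JSON log lines.
// Malformed lines and unrelated messages are skipped.
function summarizeCacheHits(lines: string[]): CacheHitSummary {
  let hits = 0;
  let tokensSaved = 0;
  let similarityTotal = 0;
  for (const line of lines) {
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // not valid JSON
    }
    if (typeof record !== "object" || record === null) continue;
    const fields = record as Record<string, unknown>;
    if (fields["msg"] !== "Semantic cache hit") continue;
    hits += 1;
    const saved = fields["tokensSaved"];
    const similarity = fields["similarity"];
    if (typeof saved === "number") tokensSaved += saved;
    if (typeof similarity === "number") similarityTotal += similarity;
  }
  return { hits, tokensSaved, meanSimilarity: hits > 0 ? similarityTotal / hits : 0 };
}
```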
Troubleshooting
Common Issues
1. Semantic cache not working
# Verify llama.cpp server is running
curl http://localhost:8080/health
# Check cache database exists
ls -lh ~/.local/share/securellm/semantic_cache.db
# Enable debug logging
export LOG_LEVEL=debug
2. Rate limiter throttling requests
# Check current queue status
# (use rate_limiter_status tool via MCP)
# Adjust rate limits in config
# See src/config/rate-limits.ts
3. Knowledge DB corruption
# Backup and rebuild
cp ~/.local/share/securellm/knowledge.db{,.backup}
rm ~/.local/share/securellm/knowledge.db
# Restart server (will recreate schema)
License
MIT License - See LICENSE file
Acknowledgments
Built with:
- Model Context Protocol SDK - MCP protocol implementation
- better-sqlite3 - High-performance SQLite bindings
- Pino - Fast structured logging
- Zod - TypeScript schema validation
Inspired by:
- NixOS community's declarative infrastructure philosophy
- The MCP ecosystem's vision for AI-native tooling
- Production systems engineering best practices
Contact
- Author: marcosfpina
- Project: github.com/marcosfpina/securellm-mcp
- Issues: GitHub Issues
Built for developers who demand production-grade tooling.