SecureLLM Bridge - AI Assistant Guide
Project: SecureLLM Bridge
Version: 0.1.0
Last Updated: 2025-11-06
Maintainer: kernelcore
Executive Summary
Project Overview
SecureLLM Bridge is a secure, production-ready proxy for Large Language Model APIs with enterprise-grade security features. Built in Rust, it provides:
- Unified API Interface: Single consistent interface for multiple LLM providers
- Enterprise Security: TLS mutual authentication, rate limiting, audit logging, sandboxing
- Provider Support: DeepSeek, OpenAI, Anthropic, Ollama with extensible architecture
- Zero-Trust Design: Every request validated, logged, and rate-limited
- Local ML Integration: Ready for ml-offload-api integration for local inference
Current State
Status: ✅ Core functionality complete, tested with DeepSeek API
Build System: Nix flakes + Cargo workspace
Architecture: 5 crates (core, security, providers, cli, desktop)
Security Level: Production-ready with comprehensive hardening
Goals
- Primary: Provide secure proxy for LLM API access
- Secondary: Integrate with ml-offload-api for local model fallback
- Tertiary: Desktop GUI for non-technical users
- Future: Multi-tenant support, advanced observability
Architecture Overview
Workspace Structure
SecureLLM Bridge/
├── crates/
│   ├── core/        # Core types, traits, unified interface
│   ├── security/    # TLS, rate limiting, audit logs, sandboxing
│   ├── providers/   # LLM provider implementations
│   ├── cli/         # Command-line interface
│   └── desktop/     # GUI application (WIP)
├── mcp-server/      # MCP server for IDE integration
├── .claude/         # AI assistant infrastructure
├── nix/             # Nix build configurations
└── config.toml      # Runtime configuration
Crate Responsibilities
1. crates/core/ - Foundation
- Purpose: Core abstractions and unified interface
- Key Components:
  - LLMProvider trait: Unified interface for all providers
  - Message, ChatRequest, ChatResponse types
  - ProviderConfig for provider-specific settings
  - Error handling with anyhow
- Dependencies: Minimal (serde, anyhow, tokio)
2. crates/security/ - Security Layer
- Purpose: Enterprise-grade security features
- Key Components:
  - TLS: Mutual authentication with rustls, client certificates
  - Rate Limiting: Token-bucket algorithm, per-provider limits
  - Audit Logging: Structured JSON logs, rotation, retention
  - Sandboxing: Process isolation, resource limits
  - Secrets Management: secrecy crate for sensitive data
- Security Standards: OWASP compliant, defense-in-depth
- Dependencies: rustls, tokio, secrecy, serde_json
3. crates/providers/ - LLM Integrations
- Purpose: Provider-specific implementations
- Supported Providers:
  - DeepSeek: ✅ Tested and working (api.deepseek.com)
  - OpenAI: ✅ Implementation complete (GPT-4, GPT-3.5)
  - Anthropic: ✅ Implementation complete (Claude models)
  - Ollama: ✅ Local inference support (localhost:11434)
  - ML-Offload-API: 🚧 Planned integration (port 9000)
- Features:
- Automatic retry with exponential backoff
- Request/response transformation
- Provider-specific error handling
- Cost tracking (tokens, API calls)
- Dependencies: reqwest, serde_json, tokio
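The "automatic retry with exponential backoff" feature listed above can be illustrated with a small, dependency-free sketch. This is not the crate's actual implementation: the names `backoff_delay` and `retry_with_backoff`, the doubling factor, and the injected `sleep` callback are all assumptions made for illustration, and the real provider code is async rather than blocking.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`, capped at `max`.
/// Hypothetical helper; the doubling factor is an assumption.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // checked_pow and checked_mul guard against overflow for large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Calls `op` until it succeeds or `max_attempts` calls have failed.
/// `op` receives the 0-based attempt number and is always called at least once.
/// `sleep` is injected so callers can pass `std::thread::sleep` in production
/// and a recording closure in tests.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut sleep: impl FnMut(Duration),
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

With a 100 ms base and a 1 s cap, the waits in this sketch grow as 100 ms, 200 ms, 400 ms, 800 ms and then stay at 1 s. A production version would usually add jitter and retry only on errors known to be transient.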
4. crates/cli/ - Command-Line Interface
- Purpose: CLI for testing and automation
- Commands:
  - securellm test <provider> - Test provider connectivity
  - securellm chat <provider> - Interactive chat session
  - securellm config validate - Validate configuration
  - securellm security audit - Run security audit
- Features: Interactive REPL, streaming responses, configuration management
- Dependencies: clap, tokio, serde
5. crates/desktop/ - GUI Application
- Purpose: User-friendly desktop interface
- Status: 🚧 Work in progress
- Planned Features:
- Multi-provider chat interface
- Configuration wizard
- Security dashboard
- Usage analytics
- Technology: TBD (Tauri, egui, or Dioxus)
Security Architecture
Defense in Depth
┌─────────────────────────────────────────┐
│ 1. TLS Mutual Authentication            │
│    - Client certificates required       │
│    - Server certificate validation      │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│ 2. Rate Limiting                        │
│    - Per-provider token buckets         │
│    - Request rate limits                │
│    - Burst protection                   │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│ 3. Input Validation & Sanitization      │
│    - Schema validation                  │
│    - Prompt injection protection        │
│    - Content filtering                  │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│ 4. Audit Logging                        │
│    - All requests logged                │
│    - Structured JSON format             │
│    - Tamper-proof logs                  │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│ 5. Sandboxing                           │
│    - Process isolation                  │
│    - Resource limits (CPU, memory)      │
│    - Network restrictions               │
└─────────────────────────────────────────┘
TLS Configuration
Certificates:
- Server certificate: /etc/securellm/certs/server.crt
- Server key: /etc/securellm/certs/server.key
- Client CA: /etc/securellm/certs/client-ca.crt
Configuration (config.toml):
[security.tls]
enabled = true
cert_path = "/etc/securellm/certs/server.crt"
key_path = "/etc/securellm/certs/server.key"
client_ca_path = "/etc/securellm/certs/client-ca.crt"
require_client_cert = true
Rate Limiting
Algorithm: Token bucket with refill
Configuration:
[security.rate_limit]
enabled = true
requests_per_minute = 60
burst_size = 10
per_provider = true
Limits by Provider:
- DeepSeek: 60 req/min, 10 burst
- OpenAI: 3500 req/min (API tier dependent)
- Anthropic: 50 req/min, 5 burst
- Ollama: Unlimited (local)
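To make the token-bucket algorithm concrete, here is a minimal sketch that maps `requests_per_minute` to the refill rate and `burst_size` to the bucket capacity. The `TokenBucket` type and its methods are illustrative assumptions, not the API of the security crate, and elapsed time is passed in explicitly so the behaviour is deterministic.

```rust
use std::time::Duration;

/// Token bucket: holds up to `capacity` tokens and refills at a steady rate.
/// Hypothetical type for illustration only.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
}

impl TokenBucket {
    /// `requests_per_minute` sets the refill rate; `burst_size` sets the capacity.
    fn new(requests_per_minute: u32, burst_size: u32) -> Self {
        let capacity = f64::from(burst_size);
        TokenBucket {
            capacity,
            tokens: capacity, // start full so an initial burst is allowed
            refill_per_sec: f64::from(requests_per_minute) / 60.0,
        }
    }

    /// Adds the tokens earned over `elapsed`, never exceeding the capacity.
    fn refill(&mut self, elapsed: Duration) {
        let earned = elapsed.as_secs_f64() * self.refill_per_sec;
        self.tokens = (self.tokens + earned).min(self.capacity);
    }

    /// Takes one token if available; returns false when the request should be rejected.
    fn try_acquire(&mut self) -> bool {
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With the DeepSeek settings above (60 req/min, burst 10), this bucket admits ten back-to-back requests, then one further request per second of waiting. With `per_provider = true`, one such bucket would be kept per provider, for example in a map keyed by provider name.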
Audit Logging
Format: Structured JSON
Fields: timestamp, request_id, user_id, provider, model, prompt_tokens, completion_tokens, total_cost, duration_ms, status
Rotation: Daily with 90-day retention
Storage: /var/log/securellm/audit.log
Example Log Entry:
{
  "timestamp": "2025-11-06T01:54:32Z",
  "request_id": "req_abc123",
  "user_id": "user_001",
  "provider": "deepseek",
  "model": "deepseek-chat",
  "prompt_tokens": 126,
  "completion_tokens": 748,
  "total_cost": 0.000437,
  "duration_ms": 738,
  "status": "success"
}
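As a rough illustration of how such an entry could be produced, the sketch below renders the same fields as one JSON line per request. The `AuditEntry` struct and the hand-written `escape_json` helper are assumptions made to keep the example dependency-free; since serde_json is already a dependency of the security crate, the real code most likely serializes with that instead.

```rust
/// One audit record, mirroring the fields of the example entry above.
/// Hypothetical type; the real struct is not shown in this guide.
struct AuditEntry<'a> {
    timestamp: &'a str,
    request_id: &'a str,
    user_id: &'a str,
    provider: &'a str,
    model: &'a str,
    prompt_tokens: u32,
    completion_tokens: u32,
    total_cost: f64,
    duration_ms: u64,
    status: &'a str,
}

/// Escapes the characters JSON forbids inside a string literal.
fn escape_json(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for c in raw.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl AuditEntry<'_> {
    /// Renders the entry as a single JSON line, ready to append to the audit log.
    fn to_json_line(&self) -> String {
        format!(
            concat!(
                "{{\"timestamp\":\"{}\",\"request_id\":\"{}\",\"user_id\":\"{}\",",
                "\"provider\":\"{}\",\"model\":\"{}\",\"prompt_tokens\":{},",
                "\"completion_tokens\":{},\"total_cost\":{},\"duration_ms\":{},",
                "\"status\":\"{}\"}}"
            ),
            escape_json(self.timestamp),
            escape_json(self.request_id),
            escape_json(self.user_id),
            escape_json(self.provider),
            escape_json(self.model),
            self.prompt_tokens,
            self.completion_tokens,
            self.total_cost,
            self.duration_ms,
            escape_json(self.status),
        )
    }
}
```

Writing one entry per line keeps the log easy to rotate and to process with line-oriented tools. Escaping user-controlled fields matters here because an unescaped quote or newline in `user_id` could otherwise forge or split log entries.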
Provider Integration Guide
Adding a New Provider
- Create Provider Module (crates/providers/src/newprovider.rs):
use crate::core::{LLMProvider, ChatRequest, ChatResponse, ProviderError};

pub struct NewProvider {
    api_key: String,
    base_url: String,
}

#[async_trait::async_trait]
impl LLMProvider for NewProvider {
    async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, ProviderError> {
        // Implementation: send `request` to the provider API and map the reply
        todo!()
    }
}
- Add to Provider Registry (crates/providers/src/lib.rs):
pub mod newprovider;
pub use newprovider::NewProvider;
- Add Configuration (config.toml):
[providers.newprovider]
enabled = true
api_key = "${NEW_PROVIDER_API_KEY}"
base_url = "https://api.newprovider.com"
- Implement Tests:
#[tokio::test]
async fn test_newprovider() {
    let provider = NewProvider::new(config);
    let response = provider.chat(test_request).await.unwrap();
    assert!(!response.content.is_empty());
}
Provider Testing
Test Script: basic_usage.sh
#!/bin/bash
export DEEPSEEK_API_KEY="your-key"
cargo run --bin securellm -- test deepseek
Expected Output:
Testing DeepSeek provider...
Request sent in 738ms
Response: 874 tokens
Status: ✅ Success
ML-Offload-API Integration Plan
Integration Architecture
┌────────────────────────┐
│    SecureLLM Bridge    │
│     (This Project)     │
│                        │
│  ┌──────────────────┐  │
│  │ Cloud Providers  │  │
│  │ - DeepSeek       │  │
│  │ - OpenAI         │  │
│  │ - Anthropic      │  │
│  └──────────────────┘  │
│                        │
│  ┌──────────────────┐  │
│  │ Local Provider   │  │
│  │ (NEW)            │  │
│  │                  │  │
│  │ ┌────────────┐   │  │
│  │ │ ML-Offload │───┼──┼──► Port 9000
│  │ │ API        │   │  │
│  │ └────────────┘   │  │
│  │                  │  │
│  │ - VRAM check     │  │
│  │ - Model mgmt     │  │
│  │ - llama.cpp      │  │
│  └──────────────────┘  │
└────────────────────────┘
Integration Steps (Phase 1)
Week 1: Research & Design
- Analyze ml-offload-api endpoints
- Design LocalProvider implementation
- Define fallback strategy (cloud → local)
- Plan VRAM-aware routing
Week 2: Implementation
- Create crates/providers/src/local.rs
- Implement OpenAI-compatible client
- Add VRAM monitoring integration
- Implement model availability checks
Week 3: Testing & Integration
- Unit tests for LocalProvider
- Integration tests with ml-offload-api
- Load testing and performance tuning
- Documentation and examples
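The cloud → local fallback strategy planned for Week 1 could take roughly the following shape. This is a sketch under stated assumptions: the `Provider` trait here is a simplified, synchronous stand-in for the project's async `LLMProvider`, and `chat_with_fallback` and `StubProvider` are hypothetical names introduced only for this example.

```rust
/// Simplified, synchronous stand-in for the project's async provider trait.
trait Provider {
    fn name(&self) -> &str;
    fn chat(&self, prompt: &str) -> Result<String, String>;
}

/// Tries each provider in order and returns the first successful reply together
/// with the name of the provider that produced it. If every provider fails,
/// the collected error messages are returned instead.
fn chat_with_fallback(
    providers: &[&dyn Provider],
    prompt: &str,
) -> Result<(String, String), Vec<String>> {
    let mut failures = Vec::new();
    for provider in providers {
        match provider.chat(prompt) {
            Ok(reply) => return Ok((provider.name().to_string(), reply)),
            Err(err) => failures.push(format!("{}: {}", provider.name(), err)),
        }
    }
    Err(failures)
}

/// Test double that either always succeeds or always fails.
struct StubProvider {
    name: &'static str,
    healthy: bool,
}

impl Provider for StubProvider {
    fn name(&self) -> &str {
        self.name
    }

    fn chat(&self, prompt: &str) -> Result<String, String> {
        if self.healthy {
            Ok(format!("{} says: {}", self.name, prompt))
        } else {
            Err("unavailable".to_string())
        }
    }
}
```

Returning the provider name alongside the reply lets the audit log record which backend actually served the request. Keeping the per-provider errors makes a total failure diagnosable rather than reporting only the last error.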
LocalProvider Design
pub struct LocalProvider {
    client: reqwest::Client,
    base_url: String,        // http://localhost:9000
    vram_threshold_mb: u64,  // Minimum VRAM for inference
}

impl LocalProvider {
    async fn check_vram(&self) -> Result<VramState, ProviderError> {
        // GET /health/vram
        todo!()
    }

    async fn get_available_models(&self) -> Result<Vec<ModelInfo>, ProviderError> {
        // GET /v1/models
        todo!()
    }

    async fn select_model(&self, request: &ChatRequest) -> Result<String, ProviderError> {
        // Intelligent model selection based on:
        // - Request complexity
        // - Available VRAM
        // - Model capabilities
        todo!()
    }
}

#[async_trait::async_trait]
impl LLMProvider for LocalProvider {
    async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, ProviderError> {
        // Check VRAM availability
        let vram = self.check_vram().await?;
        if vram.free_mb < self.vram_threshold_mb {
            return Err(ProviderError::InsufficientResources);
        }

        // Select appropriate model
        // (this design does not yet show how the selection is applied to the outgoing request)
        let model = self.select_model(&request).await?;

        // POST /v1/chat/completions
        let response = self.client
            .post(format!("{}/v1/chat/completions", self.base_url))
            .json(&request)
            .send()
            .await?;

        // Transform response
        Ok(response.json().await?)
    }
}
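The `select_model` step is left open in the design above. One simple policy, sketched here purely as an assumption, is to pick the largest model whose memory requirement fits into the free VRAM after keeping some headroom. The `ModelInfo` fields and the `reserve_mb` parameter are hypothetical; the design does not show the real `ModelInfo` definition, and this version ignores request complexity and model capabilities.

```rust
/// Hypothetical model descriptor; the real `ModelInfo` fields are not shown in this guide.
#[derive(Debug, Clone, PartialEq)]
struct ModelInfo {
    name: String,
    required_vram_mb: u64,
}

/// Picks the largest model that fits into `free_mb` while keeping `reserve_mb`
/// as headroom. Returns `None` when no model fits, which a caller could map
/// to `ProviderError::InsufficientResources`.
fn select_model(models: &[ModelInfo], free_mb: u64, reserve_mb: u64) -> Option<&ModelInfo> {
    // saturating_sub avoids underflow when the reserve exceeds the free memory.
    let budget = free_mb.saturating_sub(reserve_mb);
    models
        .iter()
        .filter(|m| m.required_vram_mb <= budget)
        .max_by_key(|m| m.required_vram_mb)
}
```

Using the VRAM requirement as a proxy for model quality is a deliberate simplification. A fuller policy would weigh the request itself, for example by preferring a smaller model for short prompts to reduce latency.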
Development Guide
Prerequisites
- Nix: 2.18+ with flakes enabled
- Rust: 1.70+ (via Nix devShell)
- System: Linux (tested on NixOS)
Initial Setup
# Clone repository
git clone /path/to/securellm-bridge
cd securellm-bridge
# Enter Nix development shell
nix develop
# Build all crates
cargo build
# Run tests
cargo test
Development Workflow
- Make Changes: Edit Rust code
- Format: cargo fmt
- Lint: cargo clippy
- Test: cargo test
- Build: cargo build --release
Testing Providers
DeepSeek:
export DEEPSEEK_API_KEY="your-key-here"
./basic_usage.sh
Ollama (requires local Ollama server):
ollama serve # Start Ollama server
cargo run --bin securellm -- test ollama
Configuration Management
Development (config.toml):
[providers.deepseek]
enabled = true
api_key = "${DEEPSEEK_API_KEY}"
base_url = "https://api.deepseek.com"
model = "deepseek-chat"
[security.tls]
enabled = false # Disable for local dev
[security.rate_limit]
enabled = true
requests_per_minute = 60
Production (config.production.toml):
[security.tls]
enabled = true
cert_path = "/etc/securellm/certs/server.crt"
key_path = "/etc/securellm/certs/server.key"
client_ca_path = "/etc/securellm/certs/client-ca.crt"
require_client_cert = true
[security.audit]
enabled = true
log_path = "/var/log/securellm/audit.log"
rotation = "daily"
retention_days = 90
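The development configuration references secrets through placeholders such as `${DEEPSEEK_API_KEY}`. How the project resolves them is not shown in this guide, so the following is only a sketch of one possible approach: a hypothetical `expand_placeholders` function that substitutes each `${NAME}` using a caller-supplied lookup and fails loudly when a variable is missing.

```rust
/// Replaces every `${NAME}` in `input` with the value returned by `lookup`.
/// Returns an error message when a variable is unset or a placeholder is
/// not closed. Hypothetical helper for illustration.
fn expand_placeholders(
    input: &str,
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<String, String> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy the literal text in front of the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated placeholder in {:?}", input))?;
        let name = &after[..end];
        let value = lookup(name).ok_or_else(|| format!("{} is not set", name))?;
        out.push_str(&value);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

In an application the lookup would typically be `|name| std::env::var(name).ok()`. Failing on an unset variable, instead of silently substituting an empty string, surfaces a missing API key at startup rather than as an authentication error on the first request.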
Docker Deployment
# Build Docker image
docker build -t securellm-bridge:latest -f Dockerfile .
# Run container
docker run -d \
--name securellm-bridge \
-p 8443:8443 \
-v /etc/securellm:/etc/securellm:ro \
-v /var/log/securellm:/var/log/securellm \
-e DEEPSEEK_API_KEY="${DEEPSEEK_API_KEY}" \
securellm-bridge:latest
NixOS Deployment
# /etc/nixos/configuration.nix
{
  services.securellm-bridge = {
    enable = true;
    port = 8443;
    configFile = "/etc/securellm/config.toml";
    tlsCertFile = "/etc/securellm/certs/server.crt";
    tlsKeyFile = "/etc/securellm/certs/server.key";
  };
}
Best Practices
Code Style
- Formatting: Use rustfmt with default settings
- Linting: Address all clippy warnings
- Naming:
  - Types: PascalCase
  - Functions/methods: snake_case
  - Constants: SCREAMING_SNAKE_CASE
- Error Handling: Use Result<T, E> everywhere, never panic! in library code
- Async: Use tokio runtime, avoid blocking operations
Security Guidelines
- Secrets: Never hardcode secrets, use environment variables or secrets management
- Validation: Validate all external inputs (API responses, user input, config files)
- Logging: Log security events, sanitize logs (no secrets in logs)
- Dependencies: Regular security audits with cargo audit
- Updates: Keep dependencies updated, monitor CVEs
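The "no secrets in logs" guideline can be enforced mechanically before a line is written. The sketch below is an assumption-laden illustration, not the project's sanitizer: the `redact` function and the `[REDACTED]` marker are hypothetical, and it only removes secrets whose exact values are known to the process (for example API keys loaded at startup).

```rust
/// Replaces every occurrence of each known secret in `line` with a fixed marker.
/// Empty secrets are skipped, since replacing "" would insert the marker
/// between every character.
fn redact(line: &str, secrets: &[&str]) -> String {
    let mut clean = line.to_string();
    for secret in secrets {
        if !secret.is_empty() {
            clean = clean.replace(*secret, "[REDACTED]");
        }
    }
    clean
}
```

This complements, rather than replaces, keeping secrets in wrapper types such as those from the secrecy crate, which make accidental formatting of a secret harder in the first place.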
Testing Strategy
Unit Tests: Test individual components in isolation
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_rate_limiter() {
        let limiter = RateLimiter::new(60, 10);
        assert!(limiter.check_limit().is_ok());
    }
}
Integration Tests: Test provider integrations
#[tokio::test]
async fn test_deepseek_integration() {
    let provider = DeepSeekProvider::new(test_config());
    let response = provider.chat(test_request()).await;
    assert!(response.is_ok());
}
Security Tests: Validate security features
#[tokio::test]
async fn test_rate_limit_enforcement() {
    // Exceed rate limit and verify rejection
}
Git Workflow
- Branches:
  - main: Stable, production-ready
  - develop: Integration branch
  - feature/*: New features
  - fix/*: Bug fixes
- Commits: Use conventional commits (feat:, fix:, docs:, test:)
- PRs: Require tests, documentation, and review
- Versioning: Semantic versioning (major.minor.patch)
MCP Server Integration
Overview
The MCP (Model Context Protocol) server provides IDE integration for SecureLLM Bridge development. It exposes tools and resources that Cline (Claude Code) can use for:
- Testing providers
- Security auditing
- Build automation
- Configuration validation
Available Tools
- provider_test: Test LLM provider connectivity
- security_audit: Run security checks
- rate_limit_check: Check rate limit status
- build_and_test: Build and test project
- provider_config_validate: Validate provider configuration
- crypto_key_generate: Generate TLS certificates
Available Resources
- config://current: Current configuration state
- logs://audit: Audit log access
- metrics://usage: Provider usage metrics
- docs://api: API documentation
Configuration
Add to Claude Desktop config (~/.config/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "securellm-bridge": {
      "command": "node",
      "args": [
        "/home/kernelcore/Downloads/ClaudeSkills/Security-Architect/mcp-server/build/index.js"
      ],
      "env": {
        "PROJECT_ROOT": "/home/kernelcore/Downloads/ClaudeSkills/Security-Architect"
      }
    }
  }
}
Usage in Cline
// Test DeepSeek provider
await use_mcp_tool({
  server_name: "securellm-bridge",
  tool_name: "provider_test",
  arguments: {
    provider: "deepseek",
    prompt: "Hello, world!",
    model: "deepseek-chat"
  }
});

// Run security audit
await use_mcp_tool({
  server_name: "securellm-bridge",
  tool_name: "security_audit",
  arguments: {
    config_file: "./config.toml"
  }
});
Troubleshooting
Build Issues
Error: error: linking with cc failed
Solution: Ensure all system dependencies are installed
nix develop # Nix will provide all dependencies
Error: cannot find crate secrecy
Solution: Clean and rebuild
cargo clean
cargo build
Provider Issues
Error: DeepSeek API authentication failed
Solution: Check API key is set correctly
echo $DEEPSEEK_API_KEY # Verify key is set
export DEEPSEEK_API_KEY="sk-your-key-here"
Error: Rate limit exceeded
Solution: Wait for rate limit reset or adjust configuration
[security.rate_limit]
requests_per_minute = 30 # Reduce rate
TLS Issues
Error: TLS handshake failed
Solution: Verify certificate paths and validity
openssl x509 -in /etc/securellm/certs/server.crt -text -noout
Runtime Issues
Error: VRAM insufficient for inference
Solution:
- Check VRAM availability: nvidia-smi
- Reduce model size or batch size
- Use cloud provider fallback
Roadmap
Phase 1: Foundation (Complete ✅)
- Core architecture and traits
- Security module (TLS, rate limiting, audit)
- DeepSeek provider integration
- OpenAI provider integration
- Anthropic provider integration
- Ollama provider integration
- CLI interface
- Docker support
Phase 2: ML-Offload Integration (In Progress 🚧)
- LocalProvider implementation
- VRAM-aware routing
- Model availability checks
- Cloud → Local fallback
- Performance optimization
- Integration tests
Phase 3: Advanced Features (Planned 📋)
- Desktop GUI application
- Multi-tenant support
- Advanced observability (Prometheus, Grafana)
- Cost optimization engine
- Prompt caching
- Response streaming
Phase 4: Enterprise Features (Future 🔮)
- Kubernetes operator
- Multi-region deployment
- Advanced RBAC
- Compliance reporting (SOC2, HIPAA)
- Plugin system
- GraphQL API
Contributing
Code Contributions
- Fork repository
- Create feature branch (git checkout -b feature/amazing-feature)
- Make changes and test thoroughly
- Commit with conventional commits (git commit -m 'feat: add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open Pull Request
Documentation Contributions
- Improve this CLAUDE.md
- Add inline code documentation
- Create tutorials and guides
- Report issues and suggest improvements
Testing Contributions
- Add unit tests
- Create integration tests
- Perform security testing
- Load/performance testing
Support & Resources
Documentation
- This file: CLAUDE.md
- API docs: cargo doc --open
- Examples: examples/ directory
Community
- Issues: File via /reportbug in Cline
- Discussions: Project discussions
- Maintainer: kernelcore
Related Projects
- ml-offload-api: /etc/nixos/modules/ml/offload/
- NixOS Configuration: /etc/nixos/
Appendix
Environment Variables
| Variable | Purpose | Example |
|---|---|---|
| DEEPSEEK_API_KEY | DeepSeek API authentication | sk-... |
| OPENAI_API_KEY | OpenAI API authentication | sk-... |
| ANTHROPIC_API_KEY | Anthropic API authentication | sk-ant-... |
| OLLAMA_BASE_URL | Ollama server URL | http://localhost:11434 |
| CONFIG_PATH | Configuration file path | /etc/securellm/config.toml |
| LOG_LEVEL | Logging verbosity | debug, info, warn, error |
Configuration Reference
See config.toml for complete configuration options.
Performance Metrics
Typical Response Times:
- DeepSeek: 500-1000ms
- OpenAI GPT-4: 1000-3000ms
- Anthropic Claude: 800-2000ms
- Ollama (local): 200-500ms
Resource Usage:
- Memory: ~50MB base + ~200MB per active connection
- CPU: Minimal (<1% idle, 5-10% under load)
- Network: Depends on model and usage
Last Updated: 2025-11-06
Version: 1.0.0
Maintained By: kernelcore