Neoland Architecture
Document Version: 2.0
Last Updated: 2026-01-18
This document provides a technical deep-dive into Neoland's architecture, design patterns, and integration strategies.
Table of Contents
- System Overview
- Module Architecture
- LLM Inference Pipeline
- Data Flow
- Integration Mechanisms
- Design Patterns
- Future: Neutron Integration
System Overview
Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| CLI | clap 4.5 | Argument parsing |
| TUI | ratatui 0.28 | Terminal rendering |
| Input | crossterm 0.28 | Cross-platform terminal I/O |
| RPC | tonic 0.12 + prost 0.13 | gRPC server/client |
| REST | axum 0.7 | HTTP API |
| HTTP Client | reqwest 0.12 | External API calls |
| Async Runtime | tokio 1.40 | async/await executor |
| Inference | candle 0.8 | Pure Rust ML framework |
| Build | Nix Flakes | Reproducible builds |
Deployment Models
- Standalone: Single binary with embedded inference
- Client-Server: TUI client + gRPC server
- Distributed: Client → ml-offload-api → multi-backend
Module Architecture
Server Module (src/server/mod.rs)
Responsibilities:
- gRPC service implementation (LlamaService)
- REST API endpoints (OpenAI-compatible)
- Concurrent server management (tokio::select!)
- State management (Arc<Mutex<>>)
Key Components:
pub struct AppState {
    engine: Arc<Mutex<Option<LocalEngine>>>, // Lazy-loaded inference
    vector_store: Arc<Mutex<VectorStore>>,   // RAG embeddings
}

#[tonic::async_trait]
impl LlamaService for MyLlamaService {
    type ChatStreamStream = ReceiverStream<Result<ChatResponse, Status>>;

    async fn chat_stream(&self, req: Request<ChatRequest>)
        -> Result<Response<Self::ChatStreamStream>, Status>;
}
Design Pattern: Repository pattern for state isolation.
TUI Module (src/tui/)
Architecture: Model-View-Controller (MVC)
app.rs (Model)
├── AppState - Core state
├── ChatMessage - Message data
└── QueryConfig - Inference params
ui.rs (View)
├── render() - Main render loop
├── render_header() - Status bar
├── render_chat() - Message list
├── render_sidebar() - Config panel
└── render_input() - Input field
events.rs (Controller)
├── handle_events() - Input routing
└── AppEvent enum - Action commands
mod.rs (Coordinator)
├── run_client() - Event loop
└── send_message_to_server() - API logic
State Management:
pub struct AppState {
    messages: Vec<ChatMessage>,
    input_buffer: String,
    config: QueryConfig,
    sidebar_visible: bool,
    server_url: String,
    ml_api_url: String, // Configurable endpoint
    scroll_offset: usize,
    is_thinking: bool,  // UI feedback flag
}
ML Offload Module (src/ml_offload/)
Purpose: Client for external ML orchestration service.
OpenAI Compatibility Layer:
pub struct ChatCompletionRequest {
    model: String,             // "auto" for backend selection
    messages: Vec<ChatMessage>,
    temperature: Option<f32>,
    max_tokens: Option<u32>,
    stream: Option<bool>,      // SSE support
}
HTTP Client:
impl MLOffloadClient {
    pub async fn chat_completion(&self, req: ChatCompletionRequest)
        -> Result<ChatCompletionResponse>;
    pub async fn health(&self) -> Result<HealthStatus>;
    pub async fn list_models(&self) -> Result<ModelsListResponse>;
}
LLM Module (src/llm/)
Purpose: SecureLLM integration for audited external providers.
Security Proxy:
pub struct SecureLLMProxy {
    // Future: Provider pooling, circuit breaker
}

impl SecureLLMProxy {
    pub async fn send_secure(&self, message: String) -> Result<String> {
        // 1. Build securellm_core::Request
        // 2. Apply rate limiting
        // 3. Log request (audit trail)
        // 4. Send to provider
        // 5. Validate response
    }
}
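The five steps listed in send_secure can be illustrated with a small synchronous, std-only sketch. Everything here is an assumption made for illustration: the `Provider` trait, `SketchProxy`, `ProxyError`, the fixed request budget standing in for rate limiting, and the "non-empty reply" validation rule are hypothetical and are not the securellm_core API.

```rust
/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum ProxyError {
    RateLimited,
    Provider(String),
    InvalidResponse,
}

/// Stand-in for an external provider (assumption, not a securellm_core type).
pub trait Provider {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Test double that echoes the prompt back.
pub struct EchoProvider;

impl Provider for EchoProvider {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {}", prompt))
    }
}

pub struct SketchProxy<P: Provider> {
    provider: P,
    remaining_budget: u32,      // crude rate limit: requests still allowed
    pub audit_log: Vec<String>, // in-memory audit trail
}

impl<P: Provider> SketchProxy<P> {
    pub fn new(provider: P, budget: u32) -> Self {
        SketchProxy { provider, remaining_budget: budget, audit_log: Vec::new() }
    }

    pub fn send_secure(&mut self, message: &str) -> Result<String, ProxyError> {
        // 1. Build the request (here: just the trimmed prompt).
        let prompt = message.trim();
        // 2. Apply rate limiting.
        if self.remaining_budget == 0 {
            return Err(ProxyError::RateLimited);
        }
        self.remaining_budget -= 1;
        // 3. Log the request before it leaves the process.
        self.audit_log.push(format!("request bytes={}", prompt.len()));
        // 4. Send to the provider.
        let reply = self.provider.complete(prompt).map_err(ProxyError::Provider)?;
        // 5. Validate the response.
        if reply.trim().is_empty() {
            return Err(ProxyError::InvalidResponse);
        }
        Ok(reply)
    }
}
```

Rejecting the request at step 2 before anything is logged or sent keeps rate-limited calls out of the audit trail in this sketch; a real proxy might prefer to log rejections as well.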
LLM Inference Pipeline
Fallback Chain Implementation
async fn send_message_to_server(app: &mut AppState) -> Result<()> {
    let message = app.input_buffer.clone();
    app.add_user_message(&message);

    // `request` and `grpc_request` are built from `message` and `app.config`;
    // their construction and the rendering of responses are elided here.

    // === LAYER 1: ml-offload-api ===
    let ml_client = MLOffloadClient::new(app.ml_api_url.clone());
    if ml_client.health().await.is_ok() {
        match ml_client.chat_completion(request).await {
            Ok(_response) => return Ok(()),
            Err(e) => log_warning(e),
        }
    }

    // === LAYER 2: gRPC Internal ===
    // A failed connection must fall through (no `?` on connect); otherwise
    // Layer 3 is never reached when the gRPC server is down.
    match LlamaServiceClient::connect(app.server_url.clone()).await {
        Ok(mut grpc_client) => match grpc_client.chat_stream(grpc_request).await {
            Ok(_stream) => return Ok(()),
            Err(e) => log_error(e),
        },
        Err(e) => log_error(e),
    }

    // === LAYER 3: SecureLLM Proxy ===
    let proxy = SecureLLMProxy::new();
    let response = proxy.send_secure(message).await?;
    app.add_assistant_message(&response);
    Ok(())
}
Rationale: Prioritize speed (GPU) → reliability (local) → coverage (external).
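The ordering in the rationale amounts to "try each layer in priority order and return the first success". A minimal synchronous sketch of that idea follows; the `Backend` trait, `first_success`, and the `Up`/`Down` test doubles are hypothetical names introduced for illustration and do not appear in the Neoland codebase.

```rust
/// One inference layer in the chain (hypothetical abstraction).
pub trait Backend {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Try each layer in order. Returns the name of the first layer that
/// succeeds together with its reply, or every collected error if all fail.
pub fn first_success(
    layers: &[&dyn Backend],
    prompt: &str,
) -> Result<(String, String), Vec<String>> {
    let mut errors = Vec::new();
    for layer in layers {
        match layer.complete(prompt) {
            Ok(reply) => return Ok((layer.name().to_string(), reply)),
            // Record the failure and fall through to the next layer.
            Err(e) => errors.push(format!("{}: {}", layer.name(), e)),
        }
    }
    Err(errors)
}

/// Test double: a layer that is always unreachable.
pub struct Down(pub &'static str);

/// Test double: a layer that always answers.
pub struct Up(pub &'static str);

impl Backend for Down {
    fn name(&self) -> &str {
        self.0
    }
    fn complete(&self, _prompt: &str) -> Result<String, String> {
        Err("unreachable".to_string())
    }
}

impl Backend for Up {
    fn name(&self) -> &str {
        self.0
    }
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("{} answered: {}", self.0, prompt))
    }
}
```

Collecting the per-layer errors instead of discarding them makes it possible to show the user why every layer failed, which the logging calls in the original function only do as a side effect.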
Data Flow
Client-Initiated Chat
sequenceDiagram
participant User
participant TUI
participant MLOffload
participant gRPC
participant SecureLLM
User->>TUI: Type message + Ctrl+Enter
TUI->>TUI: Set is_thinking=true
TUI->>MLOffload: POST /v1/chat/completions
alt MLOffload Available
MLOffload-->>TUI: Stream tokens (SSE)
TUI->>User: Display response
else MLOffload Down
TUI->>gRPC: chat_stream(ChatRequest)
alt gRPC Available
gRPC-->>TUI: Stream<ChatResponse>
TUI->>User: Display response
else gRPC Down
TUI->>SecureLLM: send_secure()
SecureLLM-->>TUI: String response
TUI->>User: Display response
end
end
TUI->>TUI: Set is_thinking=false
gRPC Protocol Buffer
message ChatRequest {
  string prompt = 1;
  string model_id = 2;
  bool use_local = 3;
  optional float temperature = 4;
  optional int32 max_tokens = 5;
  // ... 11 additional sampling params
}

message ChatResponse {
  string content = 1;
  bool is_command = 2;
  optional ResponseMetadata metadata = 3;
}
Integration Mechanisms
Hyprland IPC
Implementation: External crate (hyprland-ipc).
NixOS Declaration:
wayland.windowManager.hyprland.extraConfig = ''
  # Neoland scratchpad
  windowrulev2 = float, class:^(neoland-client)$
  windowrulev2 = size 1400 900, class:^(neoland-client)$
  windowrulev2 = center, class:^(neoland-client)$
  windowrulev2 = opacity 0.95, class:^(neoland-client)$
  windowrulev2 = workspace special:neoland, class:^(neoland-client)$
  bind = SUPER, N, togglespecialworkspace, neoland
'';
Runtime IPC:
use hyprland_ipc::HyprlandClient;
let client = HyprlandClient::new()?;
client.dispatch("togglespecialworkspace neoland").await?;
Agent Hub Integration
Launcher Script (agent-hub.nix):
"σ° Neoland - AI Agent")
alacritty --class="neoland-client" \
-e neoland client \
--ml-api-url http://${llamaCppApi.host}:${toString llamaCppApi.port} &
;;
Status Check:
pgrep -f "neoland (server|client)" > /dev/null && echo "🟢 Running" || echo "🔴 Stopped"
Design Patterns
1. Builder Pattern (CLI)
#[derive(Parser)]
#[command(name = "neoland")]
pub struct Cli {
    #[command(subcommand)]
    pub command: Commands,

    #[arg(long, global = true, default_value = "info")]
    pub log_level: String,
}
2. State Machine (TUI Events)
pub enum AppEvent {
    Quit,
    SendMessage,
    ClearChat,
    ApplyPreset(String),
    ToggleSidebar,
    ScrollUp,
    ScrollDown,
}
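match event {
    AppEvent::SendMessage => {
        app.is_thinking = true;
        terminal.draw(|f| render(f, &app))?;
        // Reset the flag before propagating any error so the UI is never
        // left stuck in the "thinking" state.
        let result = send_message_to_server(&mut app).await;
        app.is_thinking = false;
        result?;
    }
    // ...
}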
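One way to keep this dispatch testable without a terminal is to move the state changes into a pure function that returns what the event loop should do next. The sketch below assumes a trimmed-down `UiState`, an `Effect` enum, and specific scroll bounds; all three are hypothetical and only mirror the field and variant names shown in this document.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum AppEvent {
    Quit,
    SendMessage,
    ClearChat,
    ApplyPreset(String),
    ToggleSidebar,
    ScrollUp,
    ScrollDown,
}

/// Reduced stand-in for the TUI state (assumption for this sketch).
#[derive(Debug, Default, PartialEq)]
pub struct UiState {
    pub messages: Vec<String>,
    pub input_buffer: String,
    pub preset: Option<String>,
    pub sidebar_visible: bool,
    pub scroll_offset: usize,
}

/// What the event loop should do after the state update.
#[derive(Debug, PartialEq)]
pub enum Effect {
    None,
    Send(String),
    Exit,
}

pub fn update(state: &mut UiState, event: AppEvent) -> Effect {
    match event {
        AppEvent::Quit => Effect::Exit,
        AppEvent::SendMessage => {
            // Always clear the input; only non-blank text is sent.
            let text = std::mem::take(&mut state.input_buffer);
            if text.trim().is_empty() {
                Effect::None
            } else {
                state.messages.push(text.clone());
                Effect::Send(text)
            }
        }
        AppEvent::ClearChat => {
            state.messages.clear();
            state.scroll_offset = 0;
            Effect::None
        }
        AppEvent::ApplyPreset(name) => {
            state.preset = Some(name);
            Effect::None
        }
        AppEvent::ToggleSidebar => {
            state.sidebar_visible = !state.sidebar_visible;
            Effect::None
        }
        AppEvent::ScrollUp => {
            // saturating_sub avoids underflow at the top of the list.
            state.scroll_offset = state.scroll_offset.saturating_sub(1);
            Effect::None
        }
        AppEvent::ScrollDown => {
            // Clamp to the index of the last message.
            let max = state.messages.len().saturating_sub(1);
            state.scroll_offset = (state.scroll_offset + 1).min(max);
            Effect::None
        }
    }
}
```

With this split, the async call to the server happens only when the loop receives `Effect::Send`, so the transition logic can be unit-tested on its own.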
3. Repository Pattern (Server State)
pub struct AppState {
    engine: Arc<Mutex<Option<LocalEngine>>>, // Lazy init
    vector_store: Arc<Mutex<VectorStore>>,   // Shared state
}

// Thread-safe access
let mut engine_guard = state.engine.lock().unwrap();
if engine_guard.is_none() {
    *engine_guard = Some(LocalEngine::new()?);
}
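The lazy-initialization idiom above can be exercised with std alone. In this sketch `Engine`, `with_engine`, and `count_loads` are hypothetical stand-ins (the real `LocalEngine` is not modelled), and a poisoned lock is turned into an error rather than a panic, which is a design choice of the sketch and not necessarily what the server does.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Placeholder for an expensive-to-construct engine (hypothetical).
pub struct Engine {
    pub model: String,
}

pub type SharedEngine = Arc<Mutex<Option<Engine>>>;

/// Run `f` against the engine, constructing it with `load` on first use.
pub fn with_engine<T>(
    slot: &SharedEngine,
    load: impl FnOnce() -> Result<Engine, String>,
    f: impl FnOnce(&Engine) -> T,
) -> Result<T, String> {
    // Map a poisoned lock to an error instead of panicking via unwrap().
    let mut guard = slot
        .lock()
        .map_err(|_| "engine lock poisoned".to_string())?;
    if guard.is_none() {
        *guard = Some(load()?);
    }
    // The slot is guaranteed to be Some at this point.
    Ok(f(guard.as_ref().expect("engine initialised above")))
}

/// Spawn `n` threads that all request the engine and report how many
/// times the loader actually ran.
pub fn count_loads(n: usize) -> usize {
    let slot: SharedEngine = Arc::new(Mutex::new(None));
    let loads = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let (slot, loads) = (Arc::clone(&slot), Arc::clone(&loads));
            thread::spawn(move || {
                with_engine(
                    &slot,
                    || {
                        loads.fetch_add(1, Ordering::SeqCst);
                        Ok(Engine { model: "demo".to_string() })
                    },
                    |engine| engine.model.len(),
                )
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker panicked").expect("load failed");
    }
    loads.load(Ordering::SeqCst)
}
```

Because the check and the assignment happen under the same lock, only the first caller pays the load cost. The trade-off is that the lock is held for the whole load, so concurrent callers block until it finishes.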
4. Adapter Pattern (ML Offload)
OpenAI-compatible interface adapts to internal data structures:
// External format
pub struct ChatMessage {
    pub role: String, // "user" | "assistant" | "system"
    pub content: String,
}

// Internal format
pub struct ChatMessage {
    pub role: MessageRole,        // enum
    pub content: String,
    pub timestamp: DateTime<Utc>, // Added context
}
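The adaptation between the two formats can be expressed with the standard conversion traits. This is a sketch under assumptions: the external type is placed in a hypothetical `wire` module to avoid the name clash, `MessageRole` is assumed to have exactly the three variants named in the comment above, and `std::time::SystemTime` stands in for `DateTime<Utc>` so the example needs no external crates.

```rust
use std::convert::TryFrom;
use std::time::SystemTime;

pub mod wire {
    /// External (OpenAI-style) message: the role is a free-form string.
    #[derive(Debug, Clone, PartialEq)]
    pub struct ChatMessage {
        pub role: String,
        pub content: String,
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MessageRole {
    User,
    Assistant,
    System,
}

/// Internal message with a typed role and a timestamp.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: MessageRole,
    pub content: String,
    pub timestamp: SystemTime, // stand-in for DateTime<Utc>
}

/// External -> internal: fallible, because the role string may be unknown.
impl TryFrom<wire::ChatMessage> for ChatMessage {
    type Error = String;

    fn try_from(msg: wire::ChatMessage) -> Result<Self, Self::Error> {
        let role = match msg.role.as_str() {
            "user" => MessageRole::User,
            "assistant" => MessageRole::Assistant,
            "system" => MessageRole::System,
            other => return Err(format!("unknown role: {}", other)),
        };
        Ok(ChatMessage {
            role,
            content: msg.content,
            timestamp: SystemTime::now(),
        })
    }
}

/// Internal -> external: infallible, the timestamp is simply dropped.
impl<'a> From<&'a ChatMessage> for wire::ChatMessage {
    fn from(msg: &'a ChatMessage) -> Self {
        let role = match msg.role {
            MessageRole::User => "user",
            MessageRole::Assistant => "assistant",
            MessageRole::System => "system",
        };
        wire::ChatMessage {
            role: role.to_string(),
            content: msg.content.clone(),
        }
    }
}
```

Using `TryFrom` in one direction and `From` in the other makes the asymmetry explicit: unknown roles coming from outside are rejected at the boundary, while internal messages can always be serialized.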
Future: Neutron Integration
Vision
Neutron is a distributed trust layer for AI systems, ensuring:
- Data Provenance: Cryptographic proof of training data origins
- Model Integrity: Tamper-proof audit logs and version control
- Inference Accountability: Zero-knowledge proofs for sensitive queries
Proposed Architecture
// src/neutron/mod.rs
pub mod provenance;
pub mod consensus;
pub mod zkp;

pub struct NeutronClient {
    attestation_service: AttestationClient,
    consensus_pool: ConsensusNode,
    zkp_engine: ZKPVerifier,
}

impl NeutronClient {
    /// Verify training data lineage
    pub async fn verify_provenance(&self, model_id: &str)
        -> Result<ProvenanceChain>;

    /// Submit inference to distributed audit log
    pub async fn audit_inference(&self, request: &Request, response: &Response)
        -> Result<AuditReceipt>;

    /// Generate ZKP for sensitive inference
    pub async fn zkp_inference(&self, private_input: &str)
        -> Result<(PublicOutput, Proof)>;
}
Integration Points
1. ml-offload-api Attestation
// Before forwarding request to backend
let attestation = neutron_client
    .verify_backend(&backend_url)
    .await?;
if !attestation.is_valid() {
    return Err("Untrusted backend");
}
2. SecureLLM Audit Trail
// After external provider response
let receipt = neutron_client
    .audit_inference(&request, &response)
    .await?;
// Store immutable audit log
save_to_blockchain(receipt);
3. Vector Store Source Verification
impl VectorStore {
    pub async fn add_document_verified(
        &mut self,
        content: &str,
        neutron_client: &NeutronClient,
    ) -> Result<()> {
        // Verify document source
        let provenance = neutron_client
            .verify_document_source(content)
            .await?;
        self.add_document(content, &provenance.serialize())?;
        Ok(())
    }
}
Threat Model
Mitigated Risks:
- 🛡️ Poisoned training data injection
- 🛡️ Model weight tampering
- 🛡️ Inference result manipulation
- 🛡️ Unauthorized data access
Implementation Timeline:
- Q2 2026: Provenance module (Phase 1)
- Q3 2026: Consensus layer (Phase 2)
- Q4 2026: ZKP engine (Phase 3)
Dependencies
# Future Cargo.toml additions
neutron-core = { path = "../neutron/crates/core" }
zkp-stark = "0.3" # Zero-knowledge proofs
blockchain-client = "1.0" # Audit log persistence
attestation-service = "0.5" # Remote attestation
Performance Optimization
Current Bottlenecks
- Inference: CPU-bound (Qwen 1.8B @ 5 tok/s)
  - Solution: ml-offload-api with GPU backend
- Startup: Model loading (~2s cold start)
  - Solution: Lazy initialization in server
- Memory: VectorStore grows unbounded
  - Future: LRU eviction policy
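The LRU eviction policy mentioned for the vector store could look roughly like the following. This is a deliberately simple sketch: `BoundedStore`, its string keys, and the `Vec<f32>` embeddings are assumptions, and recency is tracked with a `VecDeque` scan that costs O(n) per access, which a production cache would replace with a linked structure.

```rust
use std::collections::{HashMap, VecDeque};

/// Capacity-bounded embedding store with least-recently-used eviction.
pub struct BoundedStore {
    capacity: usize,
    entries: HashMap<String, Vec<f32>>,
    order: VecDeque<String>, // front = least recently used
}

impl BoundedStore {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        BoundedStore {
            capacity,
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Move `key` to the most-recently-used position.
    fn touch(&mut self, key: &str) {
        if let Some(pos) = self.order.iter().position(|k| k.as_str() == key) {
            self.order.remove(pos);
        }
        self.order.push_back(key.to_string());
    }

    /// Insert or overwrite an entry; returns the evicted key, if any.
    pub fn insert(&mut self, key: &str, embedding: Vec<f32>) -> Option<String> {
        let is_new = self.entries.insert(key.to_string(), embedding).is_none();
        self.touch(key);
        if is_new && self.entries.len() > self.capacity {
            // The new key sits at the back, so the front is the true LRU.
            let victim = self.order.pop_front().expect("order tracks every entry");
            self.entries.remove(&victim);
            return Some(victim);
        }
        None
    }

    /// Look up an entry and mark it as recently used.
    pub fn get(&mut self, key: &str) -> Option<&Vec<f32>> {
        if self.entries.contains_key(key) {
            self.touch(key);
        }
        self.entries.get(key)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Returning the evicted key from `insert` lets the caller drop any associated document text or persist it elsewhere before it is lost.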
Benchmarks
# TUI rendering latency
hyperfine --warmup 3 './target/release/neoland client'
# Mean: 42ms ± 5ms
# gRPC throughput
ghz --insecure --proto proto/llamachat.proto \
--call llamachat.LlamaService/ChatStream \
-d '{"prompt":"test"}' \
--total 1000 \
localhost:50051
# RPS: ~150 (single-threaded server)
Security Considerations
Current Measures
- Input Validation: All CLI args parsed and validated via clap
- Command Filtering: Whitelist for [[CMD:]] execution
- TLS: gRPC supports TLS (disabled in dev)
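Command filtering of this kind can be sketched as two small functions: one that extracts marker payloads from model output and one that checks them against an allowlist. The marker layout `[[CMD:<program> <args>]]`, the function names, and the set of rejected shell metacharacters are assumptions made for illustration; the document only states that a whitelist exists.

```rust
/// Extract the payload of every `[[CMD:...]]` marker found in `text`.
/// An unterminated marker at the end of the text is ignored.
pub fn extract_commands(text: &str) -> Vec<&str> {
    const OPEN: &str = "[[CMD:";
    const CLOSE: &str = "]]";
    let mut found = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find(OPEN) {
        let after = &rest[start + OPEN.len()..];
        match after.find(CLOSE) {
            Some(end) => {
                found.push(after[..end].trim());
                rest = &after[end + CLOSE.len()..];
            }
            None => break,
        }
    }
    found
}

/// Allow a command only if its first word is on the allowlist and the
/// whole command contains no shell metacharacters.
pub fn is_allowed(command: &str, allowlist: &[&str]) -> bool {
    let program = match command.split_whitespace().next() {
        Some(p) => p,
        None => return false, // empty command
    };
    let has_meta = command.chars().any(|c| ";|&$`<>\n".contains(c));
    allowlist.contains(&program) && !has_meta
}
```

Checking for metacharacters in addition to the program name matters because an allowlisted program followed by `;` or `|` could otherwise chain an arbitrary second command if the string is ever passed to a shell.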
Future Hardening
- Neutron Integration: Cryptographic verification
- Rate Limiting: Per-user quotas
- Sandboxing: Isolate inference in separate process
References
Maintained by: VoidNxSEC Architecture Team
Review Cycle: Quarterly