🧠 CEREBRO — Enterprise Knowledge Extraction & Distributed RAG Platform

CEREBRO — Enterprise Knowledge Extraction & Distributed RAG Platform

Repository: VoidNxSEC/cerebro
Language: Python 3.13+
Runtime: Nix (hermetic)
LLMs: Azure OpenAI, GCP Vertex AI, Llama.cpp

Visão Geral

Cerebro é uma plataforma standalone e hermética de inteligência sobre código e conhecimento. Combina nativamente:

Análise estática profunda via AST (TreeSitter)
Workflows de RAG local
Interfaces de terminal (TUI) ricas
Adapters cloud de nível de produção (Azure, GCP)

Construída rigidamente em torno de reprodutibilidade Nix-first, garantindo que o runtime que roda na máquina do dev escale sem fricção para clusters Kubernetes e ambientes enterprise.

Quickstart (1 minuto)

💡 A forma mais rápida de começar é pelo shell hermético Nix. Sem virtualenvs. Sem pacotes globais escondidos.

# 1. Entrar no ambiente de desenvolvimento hermético
nix develop

# 2. Setup interativo (LLM/Azure)
cerebro setup

# 3. Analisar código e consultar inteligência
cerebro knowledge analyze . "General codebase review"
cerebro rag ingest ./data/analyzed/all_artifacts.jsonl
cerebro rag query "Explain the architecture of the Core Services"

Arquitetura

Cerebro separa preocupações de forma agressiva: a lógica local roda 100% livre de cloud, mas o plug-in para escalabilidade enterprise (Azure, Elasticsearch) é nativo.

graph TD
	classDef default fill:#1e1e1e,stroke:#4a4a4a,stroke-width:2px,color:#fff;
	classDef adapter fill:#0078d4,stroke:#fff,stroke-width:2px,color:#fff;

	subgraph "Cerebro Unified Surface"
		CLI["Interactive CLI (Rich)"]
		CDASH["React Dashboard (Port 18321)"]
		CTUI["Textual TUI"]
	end

	subgraph "Core Intelligence Engine"
		AST["AST Static Analysis (TreeSitter)"]
		Metrics["Repo Metrics (Zero-token)"]
		RAG["Rigorous RAG Engine"]
	end

	subgraph "Pluggable Storage (Vector Stores)"
		Chroma["Local ChromaDB"]
		ES["Elasticsearch (Hybrid RRF)"]
		PG["PGVector"]
	end

	subgraph "Pluggable Compute (LLMs)"
		Llama["Llama.cpp (Local)"]
		Azure["Azure OpenAI (Enterprise)"]:::adapter
		GCP["GCP Vertex AI"]
	end

	CLI --> AST
	CDASH --> RAG
	CTUI --> Metrics
	AST --> RAG
	Metrics --> RAG
	RAG --> Chroma
	RAG --> ES
	RAG --> PG
	RAG --> Llama
	RAG --> Azure
	RAG --> GCP

Superfície de Comandos

DX nativo. Dentro do shell Nix, use chelp a qualquer momento.

Comando	Descrição
`cerebro setup`	Wizard interativo de `.env` mapeando variáveis de conexão.
`cerebro knowledge`	Parse seguro de codebases, extraindo syntax trees e dependências.
`cerebro rag`	Query, ingest e fusion search usando Reciprocal Rank Fusion (RRF).
`cerebro ops`	Health checks compatíveis com K8s (liveness/readiness).
`cdash` / `ctui`	Dashboards gráfico (browser) e de terminal.

Multi-cloud e Enterprise

Azure (Adapter nativo)

Azure OpenAI é priorizado para workloads em escala. Basta configurar CEREBRO_LLM_PROVIDER=azure via cerebro setup, mapeando credenciais Entra ID ou API tokens sob AzureOpenAIProvider.

Kubernetes

Definições .yaml production-grade em /kubernetes. Split limpo entre:

Backend RAG API: porta 8000
Dashboard analítico: porta 18321

Deploy: kubectl apply -k kubernetes/

Diretrizes Internas

Sempre use Nix. Nunca pip install. Para novos pacotes, wrapping em flake.nix sob as diretivas poetry2nix.
A arquitetura exige interfaces abstratas. Ao introduzir AWS ou Anthropic, plugar via src/cerebro/interfaces/llm.py.

Stack Técnica

Linguagem: Python 3.13+
Runtime: Nix Flakes (hermético, reprodutível)
Parsing: TreeSitter (AST multi-linguagem)
Vector stores: ChromaDB (local), Elasticsearch (hybrid RRF), PGVector
LLM providers: Llama.cpp, Azure OpenAI, GCP Vertex AI
Frontends: CLI (Rich), TUI (Textual), Dashboard React
Orquestração: Kubernetes
Licença: Proprietária

Visão Geral​

Quickstart (1 minuto)​

Arquitetura​

Superfície de Comandos​

Multi-cloud e Enterprise​

Azure (Adapter nativo)​

Kubernetes​

Diretrizes Internas​

Stack Técnica​