
AI Second Brain — System Architecture

Version: 1.0.0
Date: 2026-03-05
Status: Design Document


Table of Contents

  1. Overview
  2. Core Components
  3. Data Flow
  4. Database Schema
  5. API Design
  6. Agent Architecture
  7. Ingestion Pipeline
  8. Infrastructure
  9. Design Principles

Overview

The AI Second Brain is a fully self-hosted, offline-capable knowledge management system that treats a Markdown vault (Obsidian/Logseq compatible) as the single source of truth. All AI capabilities—embeddings, retrieval, generation, and autonomous agents—run locally.

┌─────────────────────────────────────────────────────────────────────┐
│                         AI SECOND BRAIN                             │
│                                                                     │
│  ┌──────────┐    ┌────────────┐    ┌──────────┐    ┌────────────┐  │
│  │  EDITOR  │───▶│ INGESTION  │───▶│ STORAGE  │───▶│    API     │  │
│  │  LAYER   │    │  PIPELINE  │    │  LAYER   │    │   LAYER    │  │
│  └──────────┘    └────────────┘    └──────────┘    └────────────┘  │
│       │                                                  │          │
│  Markdown Vault                                     ┌────▼───────┐  │
│  (Obsidian/Logseq)                                  │  AI LAYER  │  │
│                                                     │  (Ollama)  │  │
│                                                     └────────────┘  │
│                                                          │          │
│                                                     ┌────▼───────┐  │
│                                                     │ INTERFACE  │  │
│                                                     │   LAYER    │  │
│                                                     └────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

Core Components

1. Editor Layer

  • Vault directory: ./vault/ — plain Markdown files, fully compatible with Obsidian and Logseq
  • Format: CommonMark + YAML frontmatter + [[WikiLinks]]
  • Source of truth: All knowledge lives here; the database is a derived index
  • Sync: File-system watching via watchdog triggers the ingestion pipeline

2. Storage Layer

  • PostgreSQL 16 with pgvector extension
  • Stores: document metadata, text chunks, embeddings (768-dim by default; dimension depends on the embedding model), extracted entities, wikilink relations
  • Vector index: IVFFlat or HNSW for ANN search

3. Processing Layer (Ingestion Pipeline)

  • File watcher monitors ./vault/**/*.md
  • Parser: frontmatter extraction (YAML), Markdown-to-text, WikiLink graph extraction
  • Chunker: 500-800 token sliding window with 10% overlap
  • Embeddings: Ollama (nomic-embed-text) or sentence-transformers (offline fallback)
  • Idempotent: SHA-256 content hashing prevents redundant re-indexing

4. API Layer

  • FastAPI service exposing REST endpoints
  • Retrieval: hybrid search (vector similarity + full-text BM25-style)
  • Reranking: optional cross-encoder via sentence-transformers
  • Async throughout; connection pooling with asyncpg

5. AI Layer

  • Ollama sidecar providing local LLM inference (Mistral, Llama 3, Phi-3, etc.)
  • Embedding model: nomic-embed-text (768-dim)
  • Chat/generation model: configurable (default: mistral)
  • Agents use LangChain/LlamaIndex or direct Ollama API calls

6. Agent Layer

  • Long-running Python workers
  • Agents: Ingestion, Knowledge Linking, Tagging, Summarization, Maintenance
  • Message queue: Redis-backed job queue (ARQ) or simple PostgreSQL-backed queue
  • Scheduled via cron-style configuration

7. Interface Layer

  • Next.js (React) web application
  • Pages: Search, Chat, Document Viewer, Graph View (knowledge graph), Tag Browser
  • API client calls the FastAPI backend
  • Served as a Docker container (Node.js)

Data Flow

Ingestion Flow

Markdown File (vault/)
       │
       ▼
   File Watcher (watchdog)
       │
       ▼
   Parse & Validate
   ├── Extract YAML frontmatter (title, tags, date, aliases)
   ├── Extract WikiLinks [[target]]
   └── Convert Markdown → plain text
       │
       ▼
   Content Hash (SHA-256)
   └── Skip if unchanged (idempotent)
       │
       ▼
   Chunker (500-800 tokens, 10% overlap)
       │
       ▼
   Embedding Generation (Ollama nomic-embed-text)
       │
       ▼
   Store in PostgreSQL
   ├── documents table (metadata + full text)
   ├── chunks table (chunk text + embedding vector)
   ├── entities table (extracted NER if enabled)
   └── relations table (WikiLink graph edges)
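
The Parse & Validate step can be sketched with stdlib regexes; a real parser would hand the frontmatter block to a YAML library, and the wikilink pattern below is a simplified assumption that drops aliases and heading anchors:

```python
import re

# [[Target]], [[Target|alias]], [[Target#Heading]] -> capture only "Target"
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def split_frontmatter(source: str) -> tuple[str, str]:
    """Split a '---'-delimited YAML frontmatter block from the Markdown body."""
    m = re.match(r"^---\n(.*?)\n---\n?", source, re.DOTALL)
    if m:
        return m.group(1), source[m.end():]
    return "", source

def extract_wikilinks(body: str) -> list[str]:
    """Return the link targets of all [[WikiLinks]] in document order."""
    return [target.strip() for target in WIKILINK_RE.findall(body)]
```

The extracted targets become rows in the relations table, with `target_doc_id` left NULL until the target note exists.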

Retrieval (RAG) Flow

User Query
    │
    ▼
Query Embedding (Ollama)
    │
    ▼
Hybrid Search
├── Vector similarity (pgvector cosine distance)
└── Full-text search (PostgreSQL tsvector)
    │
    ▼
Reranker (optional cross-encoder)
    │
    ▼
Context Assembly (top-k chunks + metadata)
    │
    ▼
LLM Generation (Ollama)
    │
    ▼
Response + Citations
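
One common way to merge the two ranked lists from the hybrid search step is reciprocal rank fusion; the sketch below assumes each search arm returns chunk ids in rank order and is an illustration, not the service's actual scoring:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked id lists; ids ranked highly in either list rise to the top."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking):
            # Each list contributes 1/(k + rank + 1); k damps the head of the list.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

A chunk found by both the vector arm and the full-text arm accumulates score from each list, so agreement between the two retrievers is rewarded without having to normalize cosine distances against BM25-style scores.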

Database Schema

Tables

documents

CREATE TABLE documents (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    path        TEXT NOT NULL UNIQUE,        -- relative path in vault
    title       TEXT,
    content     TEXT NOT NULL,               -- full markdown source
    content_hash TEXT NOT NULL,              -- SHA-256 for change detection
    frontmatter JSONB DEFAULT '{}',          -- parsed YAML frontmatter
    tags        TEXT[] DEFAULT '{}',
    aliases     TEXT[] DEFAULT '{}',
    word_count  INTEGER,
    created_at  TIMESTAMPTZ DEFAULT now(),
    updated_at  TIMESTAMPTZ DEFAULT now(),
    indexed_at  TIMESTAMPTZ,
    fts_vector  TSVECTOR                     -- full-text search index
);
CREATE INDEX idx_documents_path ON documents(path);
CREATE INDEX idx_documents_tags ON documents USING GIN(tags);
CREATE INDEX idx_documents_fts ON documents USING GIN(fts_vector);
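
The schema declares fts_vector but does not show how it is populated; one option (an assumption, not part of the schema above) is a trigger that rebuilds the vector from title and content on every write:

```sql
CREATE OR REPLACE FUNCTION documents_fts_update() RETURNS trigger AS $$
BEGIN
    -- Rebuild the full-text vector whenever title or content changes.
    NEW.fts_vector := to_tsvector('english',
        coalesce(NEW.title, '') || ' ' || coalesce(NEW.content, ''));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_documents_fts
    BEFORE INSERT OR UPDATE OF title, content ON documents
    FOR EACH ROW EXECUTE FUNCTION documents_fts_update();
```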

chunks

CREATE TABLE chunks (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    document_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
    chunk_index INTEGER NOT NULL,
    content     TEXT NOT NULL,
    token_count INTEGER,
    embedding   VECTOR(768),                 -- nomic-embed-text dimension
    metadata    JSONB DEFAULT '{}',
    created_at  TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX idx_chunks_document_id ON chunks(document_id);
CREATE INDEX idx_chunks_embedding ON chunks USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

entities

CREATE TABLE entities (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    document_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
    name        TEXT NOT NULL,
    entity_type TEXT NOT NULL,               -- PERSON, ORG, CONCEPT, etc.
    context     TEXT,
    created_at  TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX idx_entities_document_id ON entities(document_id);
CREATE INDEX idx_entities_name ON entities(name);
CREATE INDEX idx_entities_type ON entities(entity_type);

relations

CREATE TABLE relations (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    source_doc_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
    target_path   TEXT NOT NULL,             -- may not exist yet (forward links)
    target_doc_id UUID REFERENCES documents(id) ON DELETE SET NULL,
    relation_type TEXT DEFAULT 'wikilink',   -- wikilink, tag, explicit
    context       TEXT,                      -- surrounding text
    created_at    TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX idx_relations_source ON relations(source_doc_id);
CREATE INDEX idx_relations_target ON relations(target_doc_id);
CREATE INDEX idx_relations_target_path ON relations(target_path);

agent_jobs

CREATE TABLE agent_jobs (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    agent_type  TEXT NOT NULL,               -- ingestion, linking, tagging, etc.
    status      TEXT DEFAULT 'pending',      -- pending, running, done, failed
    payload     JSONB DEFAULT '{}',
    result      JSONB,
    error       TEXT,
    created_at  TIMESTAMPTZ DEFAULT now(),
    started_at  TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    retry_count INTEGER DEFAULT 0
);
CREATE INDEX idx_agent_jobs_status ON agent_jobs(status);
CREATE INDEX idx_agent_jobs_type ON agent_jobs(agent_type);

agent_logs

CREATE TABLE agent_logs (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    job_id      UUID REFERENCES agent_jobs(id) ON DELETE SET NULL,
    agent_type  TEXT NOT NULL,
    level       TEXT DEFAULT 'info',
    message     TEXT NOT NULL,
    metadata    JSONB DEFAULT '{}',
    created_at  TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX idx_agent_logs_job_id ON agent_logs(job_id);
CREATE INDEX idx_agent_logs_created ON agent_logs(created_at DESC);

API Design

Base URL: http://localhost:8000/api/v1

Method  Endpoint          Description
POST    /search           Hybrid vector + full-text search
POST    /chat             RAG chat with streaming response
GET     /document/{id}    Get document by ID
GET     /document/path    Get document by vault path
POST    /index            Manually trigger indexing of a file
POST    /reindex          Full vault reindex
GET     /related/{id}     Get related documents by embedding similarity
GET     /tags             List all tags with counts
GET     /graph            WikiLink graph (nodes + edges)
GET     /health           Health check
GET     /stats            System statistics

Request/Response Shapes

POST /search

// Request
{
  "query": "machine learning concepts",
  "limit": 10,
  "threshold": 0.7,
  "tags": ["ml", "ai"],
  "hybrid": true
}

// Response
{
  "results": [
    {
      "document_id": "uuid",
      "chunk_id": "uuid",
      "title": "Introduction to ML",
      "path": "notes/ml-intro.md",
      "content": "chunk text...",
      "score": 0.92,
      "tags": ["ml", "ai"],
      "highlight": "...matched text..."
    }
  ],
  "total": 42,
  "query_time_ms": 23
}

POST /chat

// Request (response is streamed via SSE)
{
  "message": "What do I know about transformers?",
  "conversation_id": "optional-uuid",
  "context_limit": 5
}

// Response (Server-Sent Events)
data: {"token": "Transformers", "type": "token"}
data: {"token": " are", "type": "token"}
data: {"sources": [...], "type": "sources"}
data: {"type": "done"}
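
A client consuming this stream splits on `data:` lines, concatenates token events, and keeps the sources payload for citations; a minimal parsing sketch (the shape of the sources array is an assumption):

```python
import json

def parse_sse_events(stream_lines: list[str]) -> tuple[str, list]:
    """Assemble the streamed answer text and collect the sources payload."""
    answer_parts: list[str] = []
    sources: list = []
    for line in stream_lines:
        if not line.startswith("data: "):
            continue  # skip SSE comments and keep-alive blank lines
        event = json.loads(line[len("data: "):])
        if event.get("type") == "token":
            answer_parts.append(event["token"])
        elif event.get("type") == "sources":
            sources = event["sources"]
        elif event.get("type") == "done":
            break
    return "".join(answer_parts), sources
```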

Agent Architecture

All agents inherit from a common BaseAgent class:

BaseAgent
├── IngestionAgent      — watches vault, triggers indexing
├── LinkingAgent        — discovers and creates knowledge links
├── TaggingAgent        — auto-tags documents using LLM
├── SummarizationAgent  — generates/updates document summaries
└── MaintenanceAgent    — detects orphans, broken links, stale content

Agent Lifecycle

  1. Agent starts, reads config from environment
  2. Polls agent_jobs table (or subscribes to PostgreSQL NOTIFY)
  3. Claims job atomically (UPDATE ... WHERE status='pending' RETURNING *)
  4. Executes job with retry logic (exponential backoff, max 3 retries)
  5. Writes result / error back to agent_jobs
  6. Logs to agent_logs
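
Steps 3 and 4 can be sketched as follows; the claim query is illustrative (FOR UPDATE SKIP LOCKED is the usual PostgreSQL idiom for letting multiple workers claim jobs without blocking each other), and the column names follow the agent_jobs table above:

```python
# Illustrative atomic-claim query for step 3; a worker runs this in one
# transaction so two workers can never claim the same pending job.
CLAIM_SQL = """
UPDATE agent_jobs
   SET status = 'running', started_at = now()
 WHERE id = (SELECT id FROM agent_jobs
              WHERE status = 'pending'
              ORDER BY created_at
              LIMIT 1
              FOR UPDATE SKIP LOCKED)
RETURNING *;
"""

def backoff_seconds(retry_count: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff for step 4: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** retry_count))
```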

Scheduling

  • IngestionAgent: event-driven (file watcher) + fallback poll every 30s
  • LinkingAgent: runs after every ingestion batch
  • TaggingAgent: runs on new/modified documents without tags
  • SummarizationAgent: runs on documents >1000 words without summary
  • MaintenanceAgent: scheduled daily at midnight

Ingestion Pipeline

services/ingestion-worker/
├── watcher.py          — watchdog file system monitor
├── parser.py           — frontmatter + markdown + wikilink parser
├── chunker.py          — token-aware sliding window chunker
├── embedder.py         — Ollama / sentence-transformers embeddings
├── indexer.py          — PostgreSQL upsert logic
└── pipeline.py         — orchestrates the full ingestion flow

Chunking Strategy

  • Method: Sliding window, 500-800 tokens, 10% overlap
  • Splitter: Prefer semantic boundaries (paragraphs, headings) over hard token cuts
  • Metadata preserved: document_id, chunk_index, source heading path
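
A minimal chunker sketch, assuming tokens have already been produced by a tokenizer (whitespace splitting would do for a demo) and using 650 tokens as a representative window size from the 500-800 range; semantic-boundary snapping is omitted:

```python
def sliding_window_chunks(
    tokens: list[str], size: int = 650, overlap_pct: float = 0.10
) -> list[list[str]]:
    """Fixed-size windows where consecutive chunks share ~overlap_pct tokens."""
    step = max(1, int(size * (1 - overlap_pct)))  # advance by 90% of the window
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # last window already reached the end of the document
    return chunks
```

Each emitted chunk would then be stored with its `chunk_index` and the heading path it fell under, so citations can point back into the source note.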

Embedding Strategy

  • Primary: Ollama nomic-embed-text (768-dim, fully offline)
  • Fallback: sentence-transformers/all-MiniLM-L6-v2 (384-dim, local model)
  • Batching: 32 chunks per embedding request for efficiency
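
The batching step can be sketched as a simple generator; posting each batch to the embedding endpoint is omitted:

```python
from typing import Iterator

def batched(items: list[str], batch_size: int = 32) -> Iterator[list[str]]:
    """Yield chunk texts in fixed-size batches for the embedding backend."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```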

Infrastructure

Docker Services

Service           Image                   Port   Description
postgres          pgvector/pgvector:pg16  5432   PostgreSQL + pgvector
ollama            ollama/ollama:latest    11434  Local LLM inference
rag-api           local/rag-api           8000   FastAPI retrieval service
ingestion-worker  local/ingestion-worker  -      Vault watcher + indexer
agents            local/agents            -      Background AI agents
web-ui            local/web-ui            3000   Next.js frontend
redis             redis:7-alpine          6379   Job queue + caching

Volume Mounts

  • ./vault:/vault:rw — shared across all services needing vault access
  • postgres_data:/var/lib/postgresql/data — persistent database
  • ollama_data:/root/.ollama — pulled LLM models

Network

  • Internal Docker network second-brain-net
  • External ports: 3000 (UI), 8000 (API), 11434 (Ollama)

Design Principles

  1. Vault is source of truth — database is always a derived index, fully rebuildable
  2. Offline-first — zero external API calls required; all AI runs locally via Ollama
  3. Idempotent ingestion — SHA-256 hashing ensures files are not re-indexed unless changed
  4. No vendor lock-in — all components are open source and self-hosted
  5. Modular — each service can be replaced independently (swap Ollama for another runtime)
  6. Graceful degradation — system works without agents running; agents enhance, not gate
  7. Markdown compatibility — vault works as a standalone Obsidian/Logseq vault at all times
