Self-hosted knowledge management system with:
- RAG API (FastAPI + pgvector)
- Markdown vault (Obsidian/Logseq compatible)
- Autonomous AI agents (ingestion, tagging, linking, summarization, maintenance)
- Web UI (Next.js)
- Docker Compose deployment
- Ollama integration for local LLM inference

Built by Copilot CLI, reviewed by Clawd.

main
commit 626b04aa4e
@ -0,0 +1,61 @@
# =============================================================================
# AI Second Brain — Environment Configuration
# Copy this file to .env and adjust values for your setup.
# =============================================================================

# ---------------------------------------------------------------------------
# Database
# ---------------------------------------------------------------------------
POSTGRES_PASSWORD=brain
POSTGRES_PORT=5432

# ---------------------------------------------------------------------------
# Ollama (local LLM)
# ---------------------------------------------------------------------------
OLLAMA_PORT=11434
EMBEDDING_MODEL=nomic-embed-text
CHAT_MODEL=mistral

# ---------------------------------------------------------------------------
# RAG API
# ---------------------------------------------------------------------------
API_PORT=8000
LOG_LEVEL=INFO

# Retrieval defaults
SEARCH_TOP_K=10
SEARCH_THRESHOLD=0.65
RERANK_ENABLED=false

# Embedding provider: ollama | sentence_transformers
EMBEDDING_PROVIDER=ollama
EMBEDDING_DIMENSIONS=768

# CORS — comma-separated origins allowed to access the API
CORS_ORIGINS=http://localhost:3000

# ---------------------------------------------------------------------------
# Web UI
# ---------------------------------------------------------------------------
UI_PORT=3000

# ---------------------------------------------------------------------------
# Ingestion Worker
# ---------------------------------------------------------------------------
VAULT_PATH=/vault
CHUNK_SIZE=700
CHUNK_OVERLAP=70
POLL_INTERVAL=30

# ---------------------------------------------------------------------------
# AI Agents
# ---------------------------------------------------------------------------
INGESTION_POLL=15
LINKING_POLL=60
TAGGING_POLL=120
SUMMARIZATION_POLL=300
MAINTENANCE_POLL=3600

# Enable auto-tagging and summarization by agents
AUTO_TAG=true
AUTO_SUMMARIZE=true
@ -0,0 +1,37 @@
# Environment
.env
.env.local
*.env

# Python
__pycache__/
*.py[cod]
*$py.class
.venv/
venv/
*.egg-info/

# Node
node_modules/
.next/
out/
build/
dist/

# Data
*.db
*.sqlite
*.log

# IDE
.vscode/
.idea/

# Docker volumes (local)
postgres_data/
redis_data/
ollama_data/

# OS
.DS_Store
Thumbs.db
@ -0,0 +1,41 @@
# AI Second Brain

A fully self-hosted, offline-capable knowledge management system with AI-powered retrieval, autonomous agents, and a Markdown-first philosophy.

## Quick Start

```bash
cp .env.example .env
# edit .env as needed
docker compose up -d
# wait for ollama to pull models
docker compose exec ollama ollama pull nomic-embed-text
docker compose exec ollama ollama pull mistral
# open the UI
open http://localhost:3000
```

## Architecture

See [docs/architecture.md](docs/architecture.md) for the full design.

## Components

| Service    | Port  | Description                 |
|------------|-------|-----------------------------|
| Web UI     | 3000  | Next.js knowledge interface |
| RAG API    | 8000  | FastAPI retrieval service   |
| Ollama     | 11434 | Local LLM inference         |
| PostgreSQL | 5432  | Vector + relational store   |
| Redis      | 6379  | Job queue                   |

## Documentation

- [Architecture](docs/architecture.md)
- [Setup Guide](docs/setup.md)
- [API Reference](docs/api.md)
- [Agents Guide](docs/agents.md)

## License

MIT
@ -0,0 +1,197 @@
services:

  # ---------------------------------------------------------------------------
  # PostgreSQL with pgvector
  # ---------------------------------------------------------------------------
  postgres:
    image: pgvector/pgvector:pg16
    container_name: second-brain-postgres
    restart: unless-stopped
    environment:
      POSTGRES_DB: second_brain
      POSTGRES_USER: brain
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-brain}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./infra/database/schema.sql:/docker-entrypoint-initdb.d/01_schema.sql:ro
    ports:
      - "${POSTGRES_PORT:-5432}:5432"
    networks:
      - brain-net
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U brain -d second_brain"]
      interval: 10s
      timeout: 5s
      retries: 5

  # ---------------------------------------------------------------------------
  # Redis (job queue)
  # ---------------------------------------------------------------------------
  redis:
    image: redis:7-alpine
    container_name: second-brain-redis
    restart: unless-stopped
    volumes:
      - redis_data:/data
    networks:
      - brain-net
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  # ---------------------------------------------------------------------------
  # Ollama (local LLM inference)
  # ---------------------------------------------------------------------------
  ollama:
    image: ollama/ollama:latest
    container_name: second-brain-ollama
    restart: unless-stopped
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "${OLLAMA_PORT:-11434}:11434"
    networks:
      - brain-net
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 60s

  # ---------------------------------------------------------------------------
  # Ollama model bootstrap (pulls required models on first start)
  # ---------------------------------------------------------------------------
  ollama-bootstrap:
    image: ollama/ollama:latest
    container_name: second-brain-ollama-bootstrap
    depends_on:
      ollama:
        condition: service_healthy
    volumes:
      - ollama_data:/root/.ollama
    networks:
      - brain-net
    entrypoint: ["/bin/sh", "-c"]
    command:
      - |
        OLLAMA_HOST=ollama:11434 ollama pull ${EMBEDDING_MODEL:-nomic-embed-text}
        OLLAMA_HOST=ollama:11434 ollama pull ${CHAT_MODEL:-mistral}
    restart: "no"

  # ---------------------------------------------------------------------------
  # RAG API (FastAPI)
  # ---------------------------------------------------------------------------
  rag-api:
    build:
      context: ./services/rag-api
      dockerfile: Dockerfile
    container_name: second-brain-rag-api
    restart: unless-stopped
    env_file:
      - .env
    environment:
      DATABASE_URL: postgresql://brain:${POSTGRES_PASSWORD:-brain}@postgres:5432/second_brain
      OLLAMA_URL: http://ollama:11434
    depends_on:
      postgres:
        condition: service_healthy
      ollama:
        condition: service_healthy
    ports:
      - "${API_PORT:-8000}:8000"
    networks:
      - brain-net
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/v1/health"]
      interval: 15s
      timeout: 5s
      retries: 5
      start_period: 30s

  # ---------------------------------------------------------------------------
  # Ingestion Worker
  # ---------------------------------------------------------------------------
  ingestion-worker:
    build:
      context: ./services/ingestion-worker
      dockerfile: Dockerfile
    container_name: second-brain-ingestion
    restart: unless-stopped
    env_file:
      - .env
    environment:
      DATABASE_URL: postgresql://brain:${POSTGRES_PASSWORD:-brain}@postgres:5432/second_brain
      OLLAMA_URL: http://ollama:11434
      VAULT_PATH: /vault
    volumes:
      - ./vault:/vault:ro
    depends_on:
      postgres:
        condition: service_healthy
      ollama:
        condition: service_healthy
    networks:
      - brain-net

  # ---------------------------------------------------------------------------
  # AI Agents
  # ---------------------------------------------------------------------------
  agents:
    build:
      # Context is services/ (not services/agents) so the image can copy the
      # sibling ingestion-worker sources — see services/agents/Dockerfile.
      context: ./services
      dockerfile: agents/Dockerfile
    container_name: second-brain-agents
    restart: unless-stopped
    env_file:
      - .env
    environment:
      DATABASE_URL: postgresql://brain:${POSTGRES_PASSWORD:-brain}@postgres:5432/second_brain
      OLLAMA_URL: http://ollama:11434
      VAULT_PATH: /vault
    volumes:
      - ./vault:/vault:ro
    depends_on:
      postgres:
        condition: service_healthy
      rag-api:
        condition: service_healthy
    networks:
      - brain-net

  # ---------------------------------------------------------------------------
  # Web UI (Next.js)
  # ---------------------------------------------------------------------------
  web-ui:
    build:
      context: ./services/web-ui
      dockerfile: Dockerfile
    container_name: second-brain-ui
    restart: unless-stopped
    environment:
      NEXT_PUBLIC_API_URL: http://localhost:${API_PORT:-8000}
    depends_on:
      rag-api:
        condition: service_healthy
    ports:
      - "${UI_PORT:-3000}:3000"
    networks:
      - brain-net

volumes:
  postgres_data:
  redis_data:
  ollama_data:

networks:
  brain-net:
    driver: bridge
@ -0,0 +1,178 @@
# API Reference

Base URL: `http://localhost:8000/api/v1`

Interactive docs: `http://localhost:8000/docs` (Swagger UI)

---

## Authentication

No authentication is required by default (local-only deployment). Add a reverse proxy with auth for production.

---

## Endpoints

### POST `/search`

Hybrid vector + full-text search across the knowledge base.

**Request:**
```json
{
  "query": "machine learning transformers",
  "limit": 10,
  "threshold": 0.65,
  "tags": ["ml", "ai"],
  "hybrid": true
}
```

**Response:**
```json
{
  "results": [
    {
      "document_id": "uuid",
      "chunk_id": "uuid",
      "title": "Introduction to Transformers",
      "path": "resources/ml/transformers.md",
      "content": "...chunk text...",
      "score": 0.923,
      "tags": ["ml", "transformers"],
      "highlight": "...bolded match..."
    }
  ],
  "total": 5,
  "query_time_ms": 18.4
}
```
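The request and response shapes above can be exercised from Python with nothing beyond the standard library. A minimal client sketch — the `build_search_payload` / `search` helper names and their defaults are illustrative, not part of the API:

```python
import json
import urllib.request

API_BASE = "http://localhost:8000/api/v1"  # matches the Base URL above


def build_search_payload(query, limit=10, threshold=0.65, tags=None, hybrid=True):
    """Assemble a /search request body using the documented fields."""
    return {
        "query": query,
        "limit": limit,
        "threshold": threshold,
        "tags": tags or [],
        "hybrid": hybrid,
    }


def search(query, **kwargs):
    """POST the payload to /search and return the parsed JSON response."""
    payload = build_search_payload(query, **kwargs)
    req = urllib.request.Request(
        f"{API_BASE}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the stack running, `search("machine learning transformers", tags=["ml"])` should return the `results` / `total` / `query_time_ms` structure shown above.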

---

### POST `/chat`

RAG chat with a streaming Server-Sent Events response.

**Request:**
```json
{
  "message": "What do I know about neural networks?",
  "context_limit": 5,
  "stream": true
}
```

**Response (SSE stream):**
```
data: {"type":"sources","sources":[{"title":"Neural Nets","path":"...","score":0.91}]}

data: {"type":"token","token":"Neural"}

data: {"type":"token","token":" networks"}

data: {"type":"done"}
```
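A client consumes this stream by splitting on `data:` lines and decoding each JSON event. A minimal parser sketch — event names follow the example stream above; the helper names are illustrative:

```python
import json


def parse_sse_events(raw: str) -> list[dict]:
    """Decode an SSE body like the one above into a list of event dicts."""
    events = []
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("data:"):
            events.append(json.loads(line[len("data:"):].strip()))
    return events


def collect_answer(raw: str) -> str:
    """Concatenate 'token' events into the final answer text."""
    return "".join(
        e["token"] for e in parse_sse_events(raw) if e.get("type") == "token"
    )
```

In a real client you would feed lines into `parse_sse_events` incrementally as they arrive, rendering `sources` first and appending each `token` until a `done` event.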

---

### GET `/document/{id}`

Get a document by UUID.

**Response:** Full document object including content, frontmatter, and tags.

---

### GET `/document/path/{path}`

Get a document by its vault-relative path (e.g., `resources/ml/intro.md`).

---

### GET `/document/{id}/related`

Get related documents ordered by semantic similarity.

**Query params:** `limit` (default: 5)

---

### POST `/index`

Queue a specific file for indexing.

**Request:**
```json
{ "path": "notes/new-note.md" }
```

---

### POST `/index/reindex`

Queue a full vault re-index.

**Request:**
```json
{ "force": false }
```

Set `force: true` to reindex even unchanged files.

---

### GET `/tags`

List all tags with document counts.

**Response:**
```json
[
  {"tag": "machine-learning", "count": 42},
  {"tag": "python", "count": 38}
]
```

---

### GET `/graph`

Get the knowledge graph (nodes = documents, edges = links).

**Query params:** `limit` (default: 200)

---

### GET `/stats`

System statistics.

**Response:**
```json
{
  "total_documents": 1234,
  "total_chunks": 8765,
  "total_relations": 3210,
  "total_tags": 87,
  "last_indexed": "2026-03-05T19:00:00Z",
  "embedding_model": "nomic-embed-text",
  "chat_model": "mistral"
}
```

---

### GET `/health`

Health check.

**Response:**
```json
{
  "status": "ok",
  "database": "ok",
  "ollama": "ok",
  "version": "1.0.0"
}
```
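Deploy tooling can gate on this endpoint before declaring the stack up. A small readiness-poll sketch in Python — the exponential-backoff schedule is an illustrative choice (the repo's `scripts/start.sh` uses a fixed 5-second interval):

```python
import json
import time
import urllib.request


def backoff_schedule(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff delays in seconds, capped, one per retry."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


def wait_for_healthy(url: str = "http://localhost:8000/api/v1/health",
                     attempts: int = 6) -> bool:
    """Poll /health until it reports status 'ok' or retries are exhausted."""
    for delay in backoff_schedule(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if json.load(resp).get("status") == "ok":
                    return True
        except OSError:
            pass  # API not reachable yet; wait and retry
        time.sleep(delay)
    return False
```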
@ -0,0 +1,20 @@
#!/bin/bash
# Applies database migrations in order.
# Usage: ./migrate.sh [up|down]

set -euo pipefail

DB_URL="${DATABASE_URL:-postgresql://brain:brain@localhost:5432/second_brain}"
MIGRATIONS_DIR="$(dirname "$0")/migrations"

ACTION="${1:-up}"

if [ "$ACTION" = "up" ]; then
  echo "Applying schema..."
  psql "$DB_URL" -f "$(dirname "$0")/schema.sql"
  echo "Schema applied."
elif [ "$ACTION" = "down" ]; then
  echo "Dropping schema..."
  psql "$DB_URL" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
  echo "Schema dropped."
else
  echo "Unknown action: $ACTION (expected 'up' or 'down')" >&2
  exit 1
fi
@ -0,0 +1,195 @@
-- AI Second Brain — PostgreSQL Schema
-- Requires: PostgreSQL 14+ with pgvector extension

-- Enable extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm; -- for fuzzy text search

-- ---------------------------------------------------------------------------
-- DOCUMENTS
-- Represents a single Markdown file in the vault.
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS documents (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    path          TEXT NOT NULL UNIQUE,         -- relative path within vault
    title         TEXT,
    content       TEXT NOT NULL,                -- full raw markdown
    content_hash  TEXT NOT NULL,                -- SHA-256 for change detection
    frontmatter   JSONB NOT NULL DEFAULT '{}',
    tags          TEXT[] NOT NULL DEFAULT '{}',
    aliases       TEXT[] NOT NULL DEFAULT '{}',
    word_count    INTEGER,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    indexed_at    TIMESTAMPTZ,
    fts_vector    TSVECTOR                      -- auto-maintained below
);

CREATE INDEX IF NOT EXISTS idx_documents_path ON documents (path);
CREATE INDEX IF NOT EXISTS idx_documents_tags ON documents USING GIN (tags);
CREATE INDEX IF NOT EXISTS idx_documents_aliases ON documents USING GIN (aliases);
CREATE INDEX IF NOT EXISTS idx_documents_fts ON documents USING GIN (fts_vector);
CREATE INDEX IF NOT EXISTS idx_documents_frontmatter ON documents USING GIN (frontmatter);
CREATE INDEX IF NOT EXISTS idx_documents_updated ON documents (updated_at DESC);

-- Auto-update fts_vector on insert/update
CREATE OR REPLACE FUNCTION documents_fts_trigger()
RETURNS TRIGGER AS $$
BEGIN
    NEW.fts_vector :=
        setweight(to_tsvector('english', coalesce(NEW.title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(array_to_string(NEW.tags, ' '), '')), 'B') ||
        setweight(to_tsvector('english', coalesce(NEW.content, '')), 'C');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS trig_documents_fts ON documents;
CREATE TRIGGER trig_documents_fts
    BEFORE INSERT OR UPDATE ON documents
    FOR EACH ROW EXECUTE FUNCTION documents_fts_trigger();

-- Auto-update updated_at timestamp
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS trig_documents_updated_at ON documents;
CREATE TRIGGER trig_documents_updated_at
    BEFORE UPDATE ON documents
    FOR EACH ROW EXECUTE FUNCTION set_updated_at();

-- ---------------------------------------------------------------------------
-- CHUNKS
-- Sliding-window text chunks from documents, each with an embedding vector.
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS chunks (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    document_id  UUID NOT NULL REFERENCES documents (id) ON DELETE CASCADE,
    chunk_index  INTEGER NOT NULL,
    content      TEXT NOT NULL,
    token_count  INTEGER,
    embedding    VECTOR(768),                  -- nomic-embed-text dimension
    metadata     JSONB NOT NULL DEFAULT '{}',  -- heading path, page, etc.
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (document_id, chunk_index)
);

CREATE INDEX IF NOT EXISTS idx_chunks_document_id ON chunks (document_id);

-- HNSW index — fast approximate nearest-neighbour search
-- Requires pgvector >= 0.5.0. Falls back to IVFFlat if unavailable.
CREATE INDEX IF NOT EXISTS idx_chunks_embedding_hnsw
    ON chunks USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- ---------------------------------------------------------------------------
-- ENTITIES
-- Named entities extracted from documents (optional NER layer).
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS entities (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    document_id  UUID NOT NULL REFERENCES documents (id) ON DELETE CASCADE,
    name         TEXT NOT NULL,
    entity_type  TEXT NOT NULL,                -- PERSON, ORG, CONCEPT, PLACE, etc.
    context      TEXT,                         -- surrounding sentence
    confidence   FLOAT,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_entities_document_id ON entities (document_id);
CREATE INDEX IF NOT EXISTS idx_entities_name ON entities (name);
CREATE INDEX IF NOT EXISTS idx_entities_type ON entities (entity_type);
CREATE INDEX IF NOT EXISTS idx_entities_name_trgm ON entities USING GIN (name gin_trgm_ops);

-- ---------------------------------------------------------------------------
-- RELATIONS
-- WikiLink / explicit relations between documents.
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS relations (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    source_doc_id  UUID NOT NULL REFERENCES documents (id) ON DELETE CASCADE,
    target_path    TEXT NOT NULL,              -- raw link target (may be unresolved)
    target_doc_id  UUID REFERENCES documents (id) ON DELETE SET NULL,
    relation_type  TEXT NOT NULL DEFAULT 'wikilink',  -- wikilink | tag | explicit | ai-inferred
    label          TEXT,                       -- optional human label for the edge
    context        TEXT,                       -- surrounding text of the link
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_relations_source ON relations (source_doc_id);
CREATE INDEX IF NOT EXISTS idx_relations_target_id ON relations (target_doc_id);
CREATE INDEX IF NOT EXISTS idx_relations_target_path ON relations (target_path);
CREATE INDEX IF NOT EXISTS idx_relations_type ON relations (relation_type);

-- ---------------------------------------------------------------------------
-- AGENT JOBS
-- Persistent job queue consumed by AI agents.
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS agent_jobs (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    agent_type     TEXT NOT NULL,              -- ingestion | linking | tagging | summarization | maintenance
    status         TEXT NOT NULL DEFAULT 'pending',  -- pending | running | done | failed | cancelled
    priority       INTEGER NOT NULL DEFAULT 5, -- 1 (highest) .. 10 (lowest)
    payload        JSONB NOT NULL DEFAULT '{}',
    result         JSONB,
    error          TEXT,
    retry_count    INTEGER NOT NULL DEFAULT 0,
    max_retries    INTEGER NOT NULL DEFAULT 3,
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    started_at     TIMESTAMPTZ,
    completed_at   TIMESTAMPTZ,
    scheduled_for  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_agent_jobs_status ON agent_jobs (status);
CREATE INDEX IF NOT EXISTS idx_agent_jobs_type ON agent_jobs (agent_type);
CREATE INDEX IF NOT EXISTS idx_agent_jobs_scheduled ON agent_jobs (scheduled_for ASC)
    WHERE status = 'pending';

-- ---------------------------------------------------------------------------
-- AGENT LOGS
-- Structured log entries written by agents.
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS agent_logs (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    job_id      UUID REFERENCES agent_jobs (id) ON DELETE SET NULL,
    agent_type  TEXT NOT NULL,
    level       TEXT NOT NULL DEFAULT 'info',  -- debug | info | warning | error
    message     TEXT NOT NULL,
    metadata    JSONB NOT NULL DEFAULT '{}',
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_agent_logs_job_id ON agent_logs (job_id);
CREATE INDEX IF NOT EXISTS idx_agent_logs_created ON agent_logs (created_at DESC);
CREATE INDEX IF NOT EXISTS idx_agent_logs_level ON agent_logs (level);

-- ---------------------------------------------------------------------------
-- SYSTEM CONFIG
-- Runtime key-value configuration, editable by agents and admins.
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS system_config (
    key         TEXT PRIMARY KEY,
    value       JSONB NOT NULL,
    description TEXT,
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Seed default configuration
INSERT INTO system_config (key, value, description) VALUES
    ('embedding_model', '"nomic-embed-text"', 'Ollama model for embeddings'),
    ('chat_model', '"mistral"', 'Ollama model for chat/generation'),
    ('chunk_size', '700', 'Target tokens per chunk'),
    ('chunk_overlap', '70', 'Overlap tokens between chunks'),
    ('search_top_k', '10', 'Default number of search results'),
    ('search_threshold', '0.65', 'Minimum cosine similarity score'),
    ('rerank_enabled', 'false', 'Enable cross-encoder reranking'),
    ('auto_tag', 'true', 'Auto-tag documents via LLM'),
    ('auto_summarize', 'true', 'Auto-summarize long documents')
ON CONFLICT (key) DO NOTHING;
@ -0,0 +1,29 @@
#!/usr/bin/env bash
# scripts/health.sh — Check health of all services.

set -euo pipefail

API_URL="${API_URL:-http://localhost:8000}"
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

check() {
  local name="$1"
  local url="$2"
  if curl -sf "$url" > /dev/null 2>&1; then
    echo "  ✓ $name"
  else
    echo "  ✗ $name (unreachable: $url)"
  fi
}

echo "🩺 Second Brain — Health Check"
echo ""
check "RAG API" "$API_URL/api/v1/health"
check "Ollama" "$OLLAMA_URL/api/tags"

echo ""
echo "Detailed API health:"
curl -sf "$API_URL/api/v1/health" | python3 -m json.tool 2>/dev/null || echo "(API unavailable)"
echo ""
echo "Stats:"
curl -sf "$API_URL/api/v1/stats" | python3 -m json.tool 2>/dev/null || echo "(API unavailable)"
@ -0,0 +1,16 @@
#!/usr/bin/env bash
# scripts/reindex.sh — Trigger a full vault reindex via the API.

set -euo pipefail

API_URL="${API_URL:-http://localhost:8000}"
FORCE="${1:-false}"

echo "🔄 Triggering vault reindex (force=$FORCE)..."

RESPONSE=$(curl -sf -X POST "$API_URL/api/v1/index/reindex" \
  -H "Content-Type: application/json" \
  -d "{\"force\": $FORCE}")

echo "$RESPONSE" | python3 -m json.tool 2>/dev/null || echo "$RESPONSE"
echo "✓ Reindex job queued."
@ -0,0 +1,51 @@
#!/usr/bin/env bash
# scripts/start.sh — Bootstrap and start the Second Brain stack.

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
ROOT="$SCRIPT_DIR/.."

cd "$ROOT"

echo "🧠 AI Second Brain — startup"

# Ensure .env exists
if [ ! -f .env ]; then
  echo "  → Creating .env from .env.example"
  cp .env.example .env
  echo "  ⚠️  Edit .env before production use (set POSTGRES_PASSWORD etc.)"
fi

# Ensure vault directory exists
mkdir -p vault

echo "  → Starting Docker services..."
docker compose up -d --build

echo "  → Waiting for services to be healthy..."
sleep 5

# Poll health endpoint
MAX_ATTEMPTS=30
ATTEMPT=0
until curl -sf http://localhost:8000/api/v1/health > /dev/null 2>&1; do
  ATTEMPT=$((ATTEMPT + 1))
  if [ "$ATTEMPT" -ge "$MAX_ATTEMPTS" ]; then
    echo "  ✗ API did not become healthy after ${MAX_ATTEMPTS} attempts."
    echo "    Check logs with: docker compose logs rag-api"
    exit 1
  fi
  echo "  ... waiting for API (${ATTEMPT}/${MAX_ATTEMPTS})"
  sleep 5
done

echo ""
echo "  ✓ Second Brain is running!"
echo ""
echo "  🌐 Web UI:   http://localhost:$(grep UI_PORT .env | cut -d= -f2 || echo 3000)"
echo "  🔌 RAG API:  http://localhost:$(grep API_PORT .env | cut -d= -f2 || echo 8000)"
echo "  📖 API Docs: http://localhost:$(grep API_PORT .env | cut -d= -f2 || echo 8000)/docs"
echo "  🤖 Ollama:   http://localhost:$(grep OLLAMA_PORT .env | cut -d= -f2 || echo 11434)"
echo ""
echo "  Run 'docker compose logs -f' to follow logs."
@ -0,0 +1,24 @@
FROM python:3.12-slim

# NOTE: build with the services/ directory as the context (docker compose sets
# `context: ./services`); Docker's COPY cannot reference paths outside the
# build context, and this image copies the sibling ingestion-worker sources.

WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Install ingestion worker deps first (agents depend on ingestion modules)
COPY ingestion-worker/requirements.txt /tmp/ingestion-requirements.txt
RUN pip install --no-cache-dir -r /tmp/ingestion-requirements.txt

COPY agents/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy ingestion worker source (agents reuse parser, chunker, embedder, indexer, pipeline)
COPY ingestion-worker /app/ingestion-worker

COPY agents/ .

ENV PYTHONUNBUFFERED=1
ENV PYTHONPATH=/app:/app/ingestion-worker

CMD ["python", "main.py"]
@ -0,0 +1,190 @@
"""
base_agent.py — Abstract base class for all AI agents.
"""

from __future__ import annotations

import asyncio
import json
import logging
import time
from abc import ABC, abstractmethod
from typing import Any, Optional

import asyncpg

logger = logging.getLogger(__name__)


class BaseAgent(ABC):
    """
    All agents inherit from this class.

    Responsibilities:
    - Poll the agent_jobs table for work
    - Claim jobs atomically
    - Execute with exponential-backoff retries
    - Log results / errors to agent_logs
    """

    agent_type: str  # Must be set by subclass

    def __init__(self, pool: asyncpg.Pool, settings: Any) -> None:
        self.pool = pool
        self.settings = settings
        self._log = logging.getLogger(f'agent.{self.agent_type}')

    # ------------------------------------------------------------------
    # Public interface
    # ------------------------------------------------------------------

    async def run_forever(self, poll_interval: int = 10) -> None:
        """Poll for jobs indefinitely."""
        self._log.info('Agent started (poll_interval=%ds)', poll_interval)
        while True:
            try:
                job = await self._claim_job()
                if job:
                    await self._execute(job)
                else:
                    await asyncio.sleep(poll_interval)
            except asyncio.CancelledError:
                self._log.info('Agent shutting down')
                return
            except Exception as exc:
                self._log.error('Unexpected error in agent loop: %s', exc, exc_info=True)
                await asyncio.sleep(poll_interval)

    async def enqueue(self, payload: dict, priority: int = 5, delay_seconds: int = 0) -> str:
        """Create a new job for this agent."""
        import uuid
        from datetime import datetime, timezone, timedelta

        job_id = str(uuid.uuid4())
        scheduled = datetime.now(timezone.utc)
        if delay_seconds:
            scheduled += timedelta(seconds=delay_seconds)

        async with self.pool.acquire() as conn:
            await conn.execute(
                """
                INSERT INTO agent_jobs (id, agent_type, priority, payload, scheduled_for)
                VALUES ($1::uuid, $2, $3, $4::jsonb, $5)
                """,
                job_id, self.agent_type, priority, json.dumps(payload), scheduled,
            )
        return job_id

    # ------------------------------------------------------------------
    # Abstract
    # ------------------------------------------------------------------

    @abstractmethod
    async def process(self, job_id: str, payload: dict) -> dict:
        """Process a single job. Return result dict."""
        ...

    # ------------------------------------------------------------------
    # Internal helpers
    # ------------------------------------------------------------------

    async def _claim_job(self) -> Optional[asyncpg.Record]:
        """Atomically claim the next pending job for this agent type."""
        async with self.pool.acquire() as conn:
            row = await conn.fetchrow(
                """
                UPDATE agent_jobs
                SET status = 'running', started_at = now()
                WHERE id = (
                    SELECT id FROM agent_jobs
                    WHERE agent_type = $1
                      AND status = 'pending'
                      AND scheduled_for <= now()
                      AND retry_count < max_retries
                    ORDER BY priority ASC, scheduled_for ASC
                    LIMIT 1
                    FOR UPDATE SKIP LOCKED
                )
                RETURNING *
                """,
                self.agent_type,
            )
            return row

    async def _execute(self, job: asyncpg.Record) -> None:
        job_id = str(job['id'])
        # asyncpg returns jsonb as a str unless a codec is registered
        raw_payload = job['payload']
        payload = json.loads(raw_payload) if isinstance(raw_payload, str) else dict(raw_payload or {})
        self._log.info('Processing job %s', job_id)
        start = time.monotonic()

        try:
            result = await self.process(job_id, payload)
            elapsed = time.monotonic() - start
            async with self.pool.acquire() as conn:
                await conn.execute(
                    """
                    UPDATE agent_jobs
                    SET status = 'done', result = $2::jsonb, completed_at = now()
                    WHERE id = $1::uuid
                    """,
                    job_id, json.dumps(result or {}),
                )
            await self._log_event(job_id, 'info', f'Job done in {elapsed:.2f}s', result or {})

        except Exception as exc:
            elapsed = time.monotonic() - start
            err_msg = str(exc)
            self._log.error('Job %s failed: %s', job_id, err_msg, exc_info=True)

            async with self.pool.acquire() as conn:
                row = await conn.fetchrow(
                    'SELECT retry_count, max_retries FROM agent_jobs WHERE id = $1::uuid', job_id
                )
                retries = (row['retry_count'] or 0) + 1
                max_retries = row['max_retries'] or 3

                if retries < max_retries:
                    # Re-queue with exponential backoff
                    backoff = 2 ** retries
                    await conn.execute(
                        """
                        UPDATE agent_jobs
                        SET status = 'pending',
                            retry_count = $2,
                            error = $3,
                            scheduled_for = now() + ($4 || ' seconds')::interval
                        WHERE id = $1::uuid
                        """,
                        job_id, retries, err_msg, str(backoff),
                    )
                    await self._log_event(job_id, 'warning',
                                          f'Retry {retries}/{max_retries} in {backoff}s', {})
                else:
                    await conn.execute(
                        """
                        UPDATE agent_jobs
                        SET status = 'failed', error = $2, completed_at = now()
                        WHERE id = $1::uuid
                        """,
                        job_id, err_msg,
                    )
                    await self._log_event(job_id, 'error', f'Job permanently failed: {err_msg}', {})

    async def _log_event(
        self,
        job_id: Optional[str],
        level: str,
        message: str,
        metadata: dict,
    ) -> None:
        try:
            async with self.pool.acquire() as conn:
                await conn.execute(
                    """
                    INSERT INTO agent_logs (job_id, agent_type, level, message, metadata)
                    VALUES ($1::uuid, $2, $3, $4, $5::jsonb)
                    """,
                    job_id, self.agent_type, level, message, json.dumps(metadata),
                )
        except Exception as log_exc:
            self._log.warning('Failed to write agent log: %s', log_exc)
@ -0,0 +1,46 @@
"""
ingestion/agent.py — Ingestion Agent: indexes new/changed files from the vault.
"""

from __future__ import annotations

import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent.parent / 'ingestion-worker'))

from base_agent import BaseAgent
from pipeline import ingest_file
from settings import Settings as IngestionSettings


class IngestionAgent(BaseAgent):
    agent_type = 'ingestion'

    async def process(self, job_id: str, payload: dict) -> dict:
        settings = IngestionSettings()
        vault_root = Path(settings.vault_path)

        if payload.get('reindex_all'):
            md_files = list(vault_root.rglob('*.md'))
            indexed = 0
            skipped = 0
            for fp in md_files:
                async with self.pool.acquire() as conn:
                    result = await ingest_file(fp, settings, conn)
                if result:
                    indexed += 1
                else:
                    skipped += 1
            return {'indexed': indexed, 'skipped': skipped, 'total': len(md_files)}

        elif payload.get('path'):
            file_path = vault_root / payload['path']
            async with self.pool.acquire() as conn:
                result = await ingest_file(file_path, settings, conn)
            return {'indexed': 1 if result else 0, 'path': payload['path']}

        return {'message': 'No action specified'}
@ -0,0 +1,83 @@
"""
linking/agent.py — Knowledge Linking Agent: infers and creates AI-powered document links.
"""

from __future__ import annotations

import logging

from base_agent import BaseAgent

logger = logging.getLogger('agent.linking')


class LinkingAgent(BaseAgent):
    agent_type = 'linking'

    async def process(self, job_id: str, payload: dict) -> dict:
        """
        For each document without AI-inferred links:
        1. Find top-5 semantically similar documents (vector search).
        2. Insert 'ai-inferred' relations.
        """
        async with self.pool.acquire() as conn:
            # Documents that have chunks but no ai-inferred relations
            docs = await conn.fetch(
                """
                SELECT DISTINCT d.id::text, d.title, d.path
                FROM documents d
                JOIN chunks c ON c.document_id = d.id
                WHERE NOT EXISTS (
                    SELECT 1 FROM relations r
                    WHERE r.source_doc_id = d.id AND r.relation_type = 'ai-inferred'
                )
                LIMIT 50
                """
            )

            linked = 0
            for doc in docs:
                doc_id = doc['id']

                # Find similar docs via average chunk embedding
                similar = await conn.fetch(
                    """
                    WITH doc_avg AS (
                        SELECT AVG(embedding) AS avg_emb
                        FROM chunks WHERE document_id = $1::uuid
                    )
                    SELECT d2.id::text AS target_id, d2.path AS target_path,
                           1 - (AVG(c2.embedding) <=> (SELECT avg_emb FROM doc_avg)) AS score
                    FROM chunks c2
                    JOIN documents d2 ON d2.id = c2.document_id
                    WHERE c2.document_id != $1::uuid
                    GROUP BY d2.id, d2.path
                    HAVING 1 - (AVG(c2.embedding) <=> (SELECT avg_emb FROM doc_avg)) > 0.75
                    ORDER BY score DESC
                    LIMIT 5
                    """,
                    doc_id,
                )

                if not similar:
                    continue

                records = [
                    (doc_id, row['target_path'], row['target_id'], 'ai-inferred')
                    for row in similar
                ]
                await conn.executemany(
                    """
                    INSERT INTO relations (source_doc_id, target_path, target_doc_id, relation_type)
                    VALUES ($1::uuid, $2, $3::uuid, $4)
                    ON CONFLICT DO NOTHING
                    """,
                    records,
                )
                linked += len(similar)

        return {'documents_processed': len(docs), 'links_created': linked}
@ -0,0 +1,92 @@
"""
main.py — Agent worker entry point. Runs all agents concurrently.
"""

from __future__ import annotations

import asyncio
import logging
import sys
from pathlib import Path

import asyncpg
from pydantic_settings import BaseSettings, SettingsConfigDict


class AgentSettings(BaseSettings):
    model_config = SettingsConfigDict(env_file='.env', extra='ignore')
    database_url: str = 'postgresql://brain:brain@postgres:5432/second_brain'
    ollama_url: str = 'http://ollama:11434'
    chat_model: str = 'mistral'
    log_level: str = 'INFO'
    ingestion_poll: int = 15
    linking_poll: int = 30
    tagging_poll: int = 60
    summarization_poll: int = 120
    maintenance_poll: int = 3600


def setup_logging(level: str) -> None:
    logging.basicConfig(
        level=getattr(logging, level.upper(), logging.INFO),
        format='%(asctime)s [%(levelname)s] %(name)s: %(message)s',
        datefmt='%Y-%m-%dT%H:%M:%S',
        stream=sys.stdout,
    )


async def main() -> None:
    settings = AgentSettings()
    setup_logging(settings.log_level)
    logger = logging.getLogger('agents')
    logger.info('Starting agent workers...')

    # Add parent dirs to path for cross-service imports
    sys.path.insert(0, str(Path(__file__).parent))
    sys.path.insert(0, str(Path(__file__).parent.parent / 'ingestion-worker'))

    pool = await asyncpg.create_pool(settings.database_url, min_size=2, max_size=10)

    # Import agents after path setup
    from ingestion.agent import IngestionAgent
    from linking.agent import LinkingAgent
    from tagging.agent import TaggingAgent
    from summarization.agent import SummarizationAgent
    from maintenance.agent import MaintenanceAgent

    agent_tasks = [
        asyncio.create_task(
            IngestionAgent(pool, settings).run_forever(settings.ingestion_poll)
        ),
        asyncio.create_task(
            LinkingAgent(pool, settings).run_forever(settings.linking_poll)
        ),
        asyncio.create_task(
            TaggingAgent(pool, settings).run_forever(settings.tagging_poll)
        ),
        asyncio.create_task(
            SummarizationAgent(pool, settings).run_forever(settings.summarization_poll)
        ),
        asyncio.create_task(
            MaintenanceAgent(pool, settings).run_forever(settings.maintenance_poll)
        ),
    ]

    logger.info('All agents running.')

    try:
        await asyncio.gather(*agent_tasks)
    except asyncio.CancelledError:
        pass
    finally:
        for task in agent_tasks:
            task.cancel()
        await pool.close()
        logger.info('Agent workers stopped.')


if __name__ == '__main__':
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        pass
@ -0,0 +1,78 @@
"""
maintenance/agent.py — Maintenance Agent: detects broken links, orphaned documents, stale content.
"""

from __future__ import annotations

import logging
from datetime import datetime, timezone, timedelta

from base_agent import BaseAgent

logger = logging.getLogger('agent.maintenance')


class MaintenanceAgent(BaseAgent):
    agent_type = 'maintenance'

    async def process(self, job_id: str, payload: dict) -> dict:
        report = {}

        async with self.pool.acquire() as conn:
            # 1. Broken WikiLinks (target_doc_id is NULL but target_path exists)
            broken_links = await conn.fetchval(
                """
                SELECT COUNT(*) FROM relations
                WHERE relation_type = 'wikilink' AND target_doc_id IS NULL
                """
            )
            report['broken_wikilinks'] = broken_links

            # 2. Orphaned documents (no incoming links and no outgoing links)
            orphans = await conn.fetch(
                """
                SELECT d.id::text, d.title, d.path
                FROM documents d
                WHERE NOT EXISTS (
                    SELECT 1 FROM relations r WHERE r.target_doc_id = d.id
                )
                AND NOT EXISTS (
                    SELECT 1 FROM relations r WHERE r.source_doc_id = d.id
                )
                LIMIT 20
                """
            )
            report['orphaned_documents'] = len(orphans)
            report['orphan_paths'] = [r['path'] for r in orphans]

            # 3. Documents not re-indexed in >7 days
            stale_cutoff = datetime.now(timezone.utc) - timedelta(days=7)
            stale_count = await conn.fetchval(
                'SELECT COUNT(*) FROM documents WHERE indexed_at < $1 OR indexed_at IS NULL',
                stale_cutoff,
            )
            report['stale_documents'] = stale_count

            # 4. Documents with chunks but no embeddings
            missing_embeddings = await conn.fetchval(
                'SELECT COUNT(*) FROM chunks WHERE embedding IS NULL'
            )
            report['chunks_missing_embeddings'] = missing_embeddings

            # 5. Resolve previously broken WikiLinks that now have matching docs
            resolved = await conn.execute(
                """
                UPDATE relations r
                SET target_doc_id = d.id
                FROM documents d
                WHERE r.target_doc_id IS NULL
                  AND r.relation_type = 'wikilink'
                  AND (d.path LIKE '%' || r.target_path || '%'
                       OR d.title = r.target_path
                       OR r.target_path = ANY(d.aliases))
                """
            )
            # conn.execute returns a status string like 'UPDATE 3'
            report['wikilinks_resolved'] = int(resolved.split()[-1])

        logger.info('Maintenance report: %s', report)
        return report
@ -0,0 +1,4 @@
asyncpg>=0.29.0
pydantic-settings>=2.2.0
httpx>=0.27.0
pgvector>=0.2.5
@ -0,0 +1,80 @@
"""
summarization/agent.py — Summarization Agent: generates summaries for long documents.
"""

from __future__ import annotations

import json
import logging

import httpx

from base_agent import BaseAgent

logger = logging.getLogger('agent.summarization')

SUMMARY_PROMPT = """You are a knowledge management assistant.
Write a concise 2-4 sentence summary of the following document.
The summary should capture the main ideas and be useful for quick reference.
Respond with only the summary, no preamble.

Title: {title}

Content:
{content}

Summary:"""


class SummarizationAgent(BaseAgent):
    agent_type = 'summarization'

    async def process(self, job_id: str, payload: dict) -> dict:
        ollama_url = self.settings.ollama_url
        model = self.settings.chat_model

        async with self.pool.acquire() as conn:
            # Long documents that don't have a summary in frontmatter
            docs = await conn.fetch(
                """
                SELECT id::text, title, content, frontmatter
                FROM documents
                WHERE word_count > 500
                  AND (frontmatter->>'summary' IS NULL OR frontmatter->>'summary' = '')
                LIMIT 10
                """
            )

            summarized = 0
            for doc in docs:
                doc_id = doc['id']
                title = doc['title'] or ''
                content = (doc['content'] or '')[:4000]

                try:
                    summary = await self._generate_summary(title, content, ollama_url, model)
                    if summary:
                        # asyncpg returns jsonb as a str unless a codec is registered
                        raw_fm = doc['frontmatter']
                        fm = json.loads(raw_fm) if isinstance(raw_fm, str) else dict(raw_fm or {})
                        fm['summary'] = summary
                        await conn.execute(
                            'UPDATE documents SET frontmatter = $2::jsonb WHERE id = $1::uuid',
                            doc_id, json.dumps(fm),
                        )
                        summarized += 1
                        logger.debug('Summarized: %s', title)
                except Exception as exc:
                    logger.warning('Failed to summarize %s: %s', doc_id, exc)

        return {'documents_summarized': summarized}

    async def _generate_summary(
        self, title: str, content: str, ollama_url: str, model: str
    ) -> str:
        prompt = SUMMARY_PROMPT.format(title=title, content=content)
        async with httpx.AsyncClient(timeout=60.0) as client:
            resp = await client.post(
                f'{ollama_url.rstrip("/")}/api/generate',
                json={'model': model, 'prompt': prompt, 'stream': False},
            )
            resp.raise_for_status()
            return resp.json().get('response', '').strip()
@ -0,0 +1,87 @@
"""
tagging/agent.py — Tagging Agent: auto-tags documents using the LLM.
"""

from __future__ import annotations

import json
import logging
import re

import httpx

from base_agent import BaseAgent

logger = logging.getLogger('agent.tagging')

TAG_PROMPT = """You are a knowledge management assistant.
Given the following document, suggest 3-7 relevant tags.
Tags should be lowercase, hyphen-separated, single-concept keywords.
Respond ONLY with a JSON array of strings. Example: ["machine-learning", "python", "transformers"]

Document title: {title}

Document content (excerpt):
{excerpt}

Tags:"""


class TaggingAgent(BaseAgent):
    agent_type = 'tagging'

    async def process(self, job_id: str, payload: dict) -> dict:
        ollama_url = self.settings.ollama_url
        model = self.settings.chat_model

        async with self.pool.acquire() as conn:
            # Documents without tags (or with empty tags array)
            docs = await conn.fetch(
                """
                SELECT id::text, title, content
                FROM documents
                WHERE array_length(tags, 1) IS NULL OR array_length(tags, 1) = 0
                LIMIT 20
                """
            )

            tagged = 0
            for doc in docs:
                doc_id = doc['id']
                title = doc['title'] or ''
                excerpt = (doc['content'] or '')[:2000]

                try:
                    suggested_tags = await self._suggest_tags(
                        title, excerpt, ollama_url, model
                    )
                    if suggested_tags:
                        await conn.execute(
                            'UPDATE documents SET tags = $2 WHERE id = $1::uuid',
                            doc_id, suggested_tags,
                        )
                        tagged += 1
                        logger.debug('Tagged %s with %s', title, suggested_tags)
                except Exception as exc:
                    logger.warning('Failed to tag document %s: %s', doc_id, exc)

        return {'documents_tagged': tagged}

    async def _suggest_tags(
        self, title: str, excerpt: str, ollama_url: str, model: str
    ) -> list[str]:
        prompt = TAG_PROMPT.format(title=title, excerpt=excerpt)
        async with httpx.AsyncClient(timeout=30.0) as client:
            resp = await client.post(
                f'{ollama_url.rstrip("/")}/api/generate',
                json={'model': model, 'prompt': prompt, 'stream': False},
            )
            resp.raise_for_status()
            raw = resp.json().get('response', '').strip()

        # Extract JSON array from response
        match = re.search(r'\[.*?\]', raw, re.DOTALL)
        if match:
            tags = json.loads(match.group())
            return [str(t).lower().strip() for t in tags if t]
        return []
@ -0,0 +1,17 @@
FROM python:3.12-slim

WORKDIR /app

# System deps
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential libpq-dev curl \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

ENV PYTHONUNBUFFERED=1

CMD ["python", "main.py"]
@ -0,0 +1,119 @@
"""
embedder.py — Embedding generation via Ollama or sentence-transformers fallback.
"""

from __future__ import annotations

import logging
import time
from typing import Any

import httpx

logger = logging.getLogger(__name__)

# Dimensionality per model
_MODEL_DIMS: dict[str, int] = {
    'nomic-embed-text': 768,
    'all-minilm-l6-v2': 384,
    'mxbai-embed-large': 1024,
}


class OllamaEmbedder:
    """Generate embeddings via the Ollama /api/embed endpoint."""

    def __init__(
        self,
        base_url: str = 'http://ollama:11434',
        model: str = 'nomic-embed-text',
        timeout: float = 60.0,
        batch_size: int = 32,
    ) -> None:
        self.base_url = base_url.rstrip('/')
        self.model = model
        self.timeout = timeout
        self.batch_size = batch_size
        self.dimensions = _MODEL_DIMS.get(model, 768)

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """Embed a list of texts, returning a list of float vectors."""
        all_embeddings: list[list[float]] = []

        for i in range(0, len(texts), self.batch_size):
            batch = texts[i : i + self.batch_size]
            embeddings = self._call_ollama(batch)
            all_embeddings.extend(embeddings)

        return all_embeddings

    def embed_single(self, text: str) -> list[float]:
        return self.embed_batch([text])[0]

    def _call_ollama(self, texts: list[str], retries: int = 3) -> list[list[float]]:
        url = f'{self.base_url}/api/embed'
        payload: dict[str, Any] = {'model': self.model, 'input': texts}

        for attempt in range(1, retries + 1):
            try:
                with httpx.Client(timeout=self.timeout) as client:
                    resp = client.post(url, json=payload)
                    resp.raise_for_status()
                    data = resp.json()
                    return data['embeddings']
            except (httpx.HTTPError, KeyError) as exc:
                logger.warning('Ollama embed attempt %d/%d failed: %s', attempt, retries, exc)
                if attempt < retries:
                    time.sleep(2 ** attempt)  # exponential backoff
                else:
                    raise


class SentenceTransformerEmbedder:
    """Local fallback embedder using sentence-transformers."""

    def __init__(
        self,
        model_name: str = 'all-MiniLM-L6-v2',
        batch_size: int = 32,
    ) -> None:
        # Lazy import so the module loads even if not installed
        try:
            from sentence_transformers import SentenceTransformer  # type: ignore
        except ImportError as exc:
            raise ImportError(
                'sentence-transformers is required for the local fallback embedder. '
                'Install it with: pip install sentence-transformers'
            ) from exc

        logger.info('Loading sentence-transformer model: %s', model_name)
        self._model = SentenceTransformer(model_name)
        self.batch_size = batch_size
        self.dimensions = self._model.get_sentence_embedding_dimension()

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        vectors = self._model.encode(
            texts,
            batch_size=self.batch_size,
            show_progress_bar=False,
            normalize_embeddings=True,
        )
        return [v.tolist() for v in vectors]

    def embed_single(self, text: str) -> list[float]:
        return self.embed_batch([text])[0]


def get_embedder(
    provider: str = 'ollama',
    ollama_url: str = 'http://ollama:11434',
    model: str = 'nomic-embed-text',
) -> OllamaEmbedder | SentenceTransformerEmbedder:
    """Factory function returning the configured embedder."""
    if provider == 'ollama':
        return OllamaEmbedder(base_url=ollama_url, model=model)
    elif provider == 'sentence_transformers':
        return SentenceTransformerEmbedder(model_name=model)
    else:
        raise ValueError(f'Unknown embedding provider: {provider!r}')
@ -0,0 +1,142 @@
"""
indexer.py — Upserts parsed documents and embeddings into PostgreSQL.
"""

from __future__ import annotations

import hashlib
import json
import logging

import asyncpg

from chunker import Chunk
from parser import ParsedDocument

logger = logging.getLogger(__name__)


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


async def upsert_document(
    conn: asyncpg.Connection,
    doc: ParsedDocument,
    chunks: list[Chunk],
    embeddings: list[list[float]],
) -> str:
    """
    Upsert a document and its chunks atomically.

    Returns the document UUID.
    """
    content_hash = sha256(doc.content_raw)

    async with conn.transaction():
        # ---- Upsert document ----
        row = await conn.fetchrow(
            """
            INSERT INTO documents
                (path, title, content, content_hash, frontmatter, tags, aliases, word_count, indexed_at)
            VALUES ($1, $2, $3, $4, $5::jsonb, $6, $7, $8, now())
            ON CONFLICT (path) DO UPDATE SET
                title = EXCLUDED.title,
                content = EXCLUDED.content,
                content_hash = EXCLUDED.content_hash,
                frontmatter = EXCLUDED.frontmatter,
                tags = EXCLUDED.tags,
                aliases = EXCLUDED.aliases,
                word_count = EXCLUDED.word_count,
                indexed_at = now()
            RETURNING id, (xmax = 0) AS inserted
            """,
            doc.path,
            doc.title,
            doc.content_raw,
            content_hash,
            json.dumps(doc.frontmatter),
            doc.tags,
            doc.aliases,
            doc.word_count,
        )
        doc_id: str = str(row['id'])
        is_new = row['inserted']
        logger.info('%s document %s (%s)', 'Inserted' if is_new else 'Updated', doc.path, doc_id)

        # ---- Delete stale chunks ----
        await conn.execute('DELETE FROM chunks WHERE document_id = $1', row['id'])

        # ---- Insert chunks + embeddings ----
        chunk_records = []
        for chunk, embedding in zip(chunks, embeddings):
            chunk_records.append((
                row['id'],
                chunk.chunk_index,
                chunk.content,
                chunk.token_count,
                # pgvector text format: asyncpg cannot encode a plain Python
                # list for the ::vector parameter unless a codec is registered
                '[' + ','.join(str(x) for x in embedding) + ']',
                json.dumps(chunk.metadata),
            ))

        await conn.executemany(
            """
            INSERT INTO chunks (document_id, chunk_index, content, token_count, embedding, metadata)
            VALUES ($1, $2, $3, $4, $5::vector, $6::jsonb)
            """,
            chunk_records,
        )
        logger.debug('Upserted %d chunks for document %s', len(chunk_records), doc.path)

        # ---- Upsert relations (WikiLinks) ----
        await conn.execute(
            'DELETE FROM relations WHERE source_doc_id = $1 AND relation_type = $2',
            row['id'],
            'wikilink',
        )
        if doc.wikilinks:
            relation_records = [
                (row['id'], link, 'wikilink')
                for link in doc.wikilinks
            ]
            await conn.executemany(
                """
                INSERT INTO relations (source_doc_id, target_path, relation_type)
                VALUES ($1, $2, $3)
                """,
                relation_records,
            )
            # Resolve targets that already exist in the vault
            await conn.execute(
                """
                UPDATE relations r
                SET target_doc_id = d.id
                FROM documents d
                WHERE r.source_doc_id = $1
                  AND r.relation_type = 'wikilink'
                  AND (d.path LIKE '%' || r.target_path || '%'
                       OR d.title = r.target_path
                       OR r.target_path = ANY(d.aliases))
                """,
                row['id'],
            )

    return doc_id
||||||
|
|
||||||
|
async def document_needs_reindex(conn: asyncpg.Connection, path: str, content_hash: str) -> bool:
|
||||||
|
"""Return True if the document is new or its content hash has changed."""
|
||||||
|
row = await conn.fetchrow(
|
||||||
|
'SELECT content_hash FROM documents WHERE path = $1',
|
||||||
|
path,
|
||||||
|
)
|
||||||
|
if row is None:
|
||||||
|
return True
|
||||||
|
return row['content_hash'] != content_hash
|
||||||
|
|
||||||
|
|
||||||
|
async def delete_document(conn: asyncpg.Connection, path: str) -> None:
|
||||||
|
"""Remove a document and its cascaded chunks/relations."""
|
||||||
|
result = await conn.execute('DELETE FROM documents WHERE path = $1', path)
|
||||||
|
logger.info('Deleted document %s (%s)', path, result)
|
||||||
@ -0,0 +1,33 @@
"""
main.py — Ingestion worker entry point.
"""

from __future__ import annotations

import asyncio
import logging
import sys

from settings import Settings
from watcher import run_watcher


def setup_logging(level: str) -> None:
    logging.basicConfig(
        level=getattr(logging, level.upper(), logging.INFO),
        format='%(asctime)s [%(levelname)s] %(name)s: %(message)s',
        datefmt='%Y-%m-%dT%H:%M:%S',
        stream=sys.stdout,
    )


if __name__ == '__main__':
    settings = Settings()
    setup_logging(settings.log_level)
    logger = logging.getLogger('ingestion-worker')
    logger.info('Starting ingestion worker (vault=%s)', settings.vault_path)

    try:
        asyncio.run(run_watcher(settings))
    except KeyboardInterrupt:
        logger.info('Ingestion worker stopped.')
@ -0,0 +1,134 @@
"""
parser.py — Markdown vault document parser.

Extracts:
- YAML frontmatter (title, tags, aliases, date, custom fields)
- Plain text content (Markdown stripped)
- WikiLinks [[target|alias]] and #tags
- Word count
"""

from __future__ import annotations

import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any

import frontmatter  # python-frontmatter


# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------

@dataclass
class ParsedDocument:
    path: str
    title: str
    content_raw: str        # original markdown
    content_text: str       # plain text (markdown stripped)
    frontmatter: dict[str, Any]
    tags: list[str]
    aliases: list[str]
    wikilinks: list[str]    # resolved link targets
    word_count: int


# ---------------------------------------------------------------------------
# Regexes
# ---------------------------------------------------------------------------

_WIKILINK_RE = re.compile(r'\[\[([^\[\]|]+)(?:\|[^\[\]]+)?\]\]')
_INLINE_TAG_RE = re.compile(r'(?<!\w)#([\w/-]+)')
_HEADING_RE = re.compile(r'^#{1,6}\s+', re.MULTILINE)
_MARKDOWN_LINK_RE = re.compile(r'!?\[([^\]]*)\]\([^\)]*\)')
_CODE_BLOCK_RE = re.compile(r'```[\s\S]*?```|`[^`]+`', re.MULTILINE)
_HTML_RE = re.compile(r'<[^>]+>')
_HORIZONTAL_RULE_RE = re.compile(r'^[-*_]{3,}\s*$', re.MULTILINE)


def _strip_markdown(text: str) -> str:
    """Convert Markdown to plain text (lightweight, no external deps)."""
    # Remove code blocks first (preserve whitespace context)
    text = _CODE_BLOCK_RE.sub(' ', text)
    # Remove heading marker characters
    text = _HEADING_RE.sub('', text)
    # Replace Markdown links with their label
    text = _MARKDOWN_LINK_RE.sub(r'\1', text)
    # Replace WikiLinks with their display text (or target)
    text = _WIKILINK_RE.sub(lambda m: m.group(1).split('/')[-1], text)
    # Remove HTML tags
    text = _HTML_RE.sub(' ', text)
    # Remove horizontal rules
    text = _HORIZONTAL_RULE_RE.sub('', text)
    # Normalise whitespace
    text = re.sub(r'\n{3,}', '\n\n', text)
    return text.strip()


# ---------------------------------------------------------------------------
# Parser
# ---------------------------------------------------------------------------

def parse_document(file_path: Path, vault_root: Path) -> ParsedDocument:
    """
    Parse a single Markdown file and return a ``ParsedDocument``.

    Args:
        file_path: Absolute path to the Markdown file.
        vault_root: Absolute path to the vault root (used to compute relative path).
    """
    raw_text = file_path.read_text(encoding='utf-8', errors='replace')
    relative_path = str(file_path.relative_to(vault_root))

    # Parse frontmatter + body
    post = frontmatter.loads(raw_text)
    fm: dict[str, Any] = dict(post.metadata)
    body: str = post.content

    # ---- Title ----
    title: str = fm.get('title', '')
    if not title:
        # Fall back to first H1 heading
        h1 = re.search(r'^#\s+(.+)$', body, re.MULTILINE)
        if h1:
            title = h1.group(1).strip()
        else:
            title = file_path.stem

    # ---- Tags ----
    fm_tags: list[str] = _normalise_list(fm.get('tags', []))
    inline_tags: list[str] = _INLINE_TAG_RE.findall(body)
    tags = list(dict.fromkeys([t.lower().lstrip('#') for t in fm_tags + inline_tags]))

    # ---- Aliases ----
    aliases = _normalise_list(fm.get('aliases', []))

    # ---- WikiLinks ----
    wikilinks = list(dict.fromkeys(_WIKILINK_RE.findall(body)))

    # ---- Plain text ----
    content_text = _strip_markdown(body)
    word_count = len(content_text.split())

    return ParsedDocument(
        path=relative_path,
        title=title,
        content_raw=raw_text,
        content_text=content_text,
        frontmatter=fm,
        tags=tags,
        aliases=aliases,
        wikilinks=wikilinks,
        word_count=word_count,
    )


def _normalise_list(value: Any) -> list[str]:
    """Accept str, list[str], or None and return list[str]."""
    if not value:
        return []
    if isinstance(value, str):
        return [value]
    return [str(v) for v in value]
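The extraction regexes above can be exercised standalone. The patterns below are copies of `_WIKILINK_RE` and `_INLINE_TAG_RE` so the sketch is self-contained:

```python
import re

# Copies of parser.py's patterns.
WIKILINK_RE = re.compile(r'\[\[([^\[\]|]+)(?:\|[^\[\]]+)?\]\]')
INLINE_TAG_RE = re.compile(r'(?<!\w)#([\w/-]+)')

body = 'See [[projects/Second Brain|the brain]] and [[Inbox]]. #ai #notes/daily'
# dict.fromkeys de-duplicates while preserving order, as parse_document does.
links = list(dict.fromkeys(WIKILINK_RE.findall(body)))
tags = [t.lower() for t in INLINE_TAG_RE.findall(body)]
print(links)  # ['projects/Second Brain', 'Inbox']
print(tags)   # ['ai', 'notes/daily']
```

Note how the optional `(?:\|[^\[\]]+)?` group consumes the display alias, so only the link target is captured.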
@ -0,0 +1,91 @@
"""
pipeline.py — Orchestrates the full ingestion flow for a single file.
"""

from __future__ import annotations

import hashlib
import logging
from pathlib import Path

import asyncpg

from chunker import chunk_document
from embedder import get_embedder
from indexer import document_needs_reindex, upsert_document
from parser import parse_document
from settings import Settings

logger = logging.getLogger(__name__)


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


async def ingest_file(
    file_path: Path,
    settings: Settings,
    conn: asyncpg.Connection,
) -> bool:
    """
    Full ingestion pipeline for a single Markdown file.

    Returns True if the file was (re)indexed, False if skipped.
    """
    vault_root = Path(settings.vault_path)

    if not file_path.exists():
        logger.warning('File not found, skipping: %s', file_path)
        return False

    if file_path.suffix.lower() != '.md':
        return False

    raw_text = file_path.read_text(encoding='utf-8', errors='replace')
    content_hash = _sha256(raw_text)
    relative_path = str(file_path.relative_to(vault_root))

    # Idempotency check
    if not await document_needs_reindex(conn, relative_path, content_hash):
        logger.debug('Skipping unchanged file: %s', relative_path)
        return False

    logger.info('Ingesting %s', relative_path)

    # Parse
    doc = parse_document(file_path, vault_root)

    # Chunk
    chunks = chunk_document(
        doc.content_text,
        target_tokens=settings.chunk_size,
        overlap_tokens=settings.chunk_overlap,
    )

    if not chunks:
        logger.warning('No chunks generated for %s', relative_path)
        return False

    # Embed
    embedder = get_embedder(
        provider=settings.embedding_provider,
        ollama_url=settings.ollama_url,
        model=settings.embedding_model,
    )
    texts = [c.content for c in chunks]
    embeddings = embedder.embed_batch(texts)

    # Validate embedding dimension consistency
    if embeddings and len(embeddings[0]) != embedder.dimensions:
        logger.error(
            'Embedding dimension mismatch: expected %d, got %d',
            embedder.dimensions,
            len(embeddings[0]),
        )
        raise ValueError('Embedding dimension mismatch')

    # Store
    doc_id = await upsert_document(conn, doc, chunks, embeddings)
    logger.info('Indexed %s → %s (%d chunks)', relative_path, doc_id, len(chunks))
    return True
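The idempotency check rests on SHA-256 of the raw file contents: a byte-identical file hashes to the same digest, so `document_needs_reindex` can skip it, while any edit changes the digest. A quick illustration:

```python
import hashlib


def sha256(text: str) -> str:
    # Same helper as pipeline.py's _sha256 / indexer.py's sha256.
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


unchanged = sha256('# Note\nHello') == sha256('# Note\nHello')
edited = sha256('# Note\nHello') == sha256('# Note\nHello, edited')
print(unchanged, edited)  # True False
```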
@ -0,0 +1,10 @@
watchdog>=4.0.0
asyncpg>=0.29.0
pgvector>=0.2.5
pydantic-settings>=2.2.0
httpx>=0.27.0
python-frontmatter>=1.1.0
markdown-it-py>=3.0.0
tiktoken>=0.7.0
sentence-transformers>=3.0.0
numpy>=1.26.0
@ -0,0 +1,33 @@
"""
settings.py — Configuration for the ingestion worker, loaded from environment variables.
"""

from __future__ import annotations

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file='.env', extra='ignore')

    # Database
    database_url: str = 'postgresql://brain:brain@postgres:5432/second_brain'

    # Vault
    vault_path: str = '/vault'

    # Ollama
    ollama_url: str = 'http://ollama:11434'

    # Embedding
    embedding_provider: str = 'ollama'  # ollama | sentence_transformers
    embedding_model: str = 'nomic-embed-text'

    # Chunking
    chunk_size: int = 700
    chunk_overlap: int = 70

    # Worker behaviour
    poll_interval: int = 30  # seconds between fallback polls
    batch_size: int = 20     # files per ingestion batch
    log_level: str = 'INFO'
@ -0,0 +1,118 @@
"""
watcher.py — File system watcher that triggers ingestion on vault changes.
"""

from __future__ import annotations

import asyncio
import logging
import time
from pathlib import Path
from queue import Empty, Queue

import asyncpg
from pgvector.asyncpg import register_vector
from watchdog.events import FileSystemEvent, FileSystemEventHandler
from watchdog.observers import Observer

from indexer import delete_document
from pipeline import ingest_file
from settings import Settings

logger = logging.getLogger(__name__)


class VaultEventHandler(FileSystemEventHandler):
    """Enqueues changed/created Markdown file paths."""

    def __init__(self, queue: Queue) -> None:
        super().__init__()
        self._queue = queue

    def on_created(self, event: FileSystemEvent) -> None:
        self._enqueue(event)

    def on_modified(self, event: FileSystemEvent) -> None:
        self._enqueue(event)

    def on_deleted(self, event: FileSystemEvent) -> None:
        if not event.is_directory and str(event.src_path).endswith('.md'):
            self._queue.put(('delete', event.src_path))

    def _enqueue(self, event: FileSystemEvent) -> None:
        if not event.is_directory and str(event.src_path).endswith('.md'):
            self._queue.put(('upsert', event.src_path))


async def process_queue(
    queue: Queue,
    settings: Settings,
    pool: asyncpg.Pool,
) -> None:
    """Drain the event queue and process each file."""
    pending: set[tuple[str, str]] = set()
    DEBOUNCE_SECONDS = 2.0

    while True:
        # Collect all queued events (debounce rapid saves)
        deadline = time.monotonic() + DEBOUNCE_SECONDS
        while time.monotonic() < deadline:
            try:
                action, path = queue.get_nowait()
                pending.add((action, path))
            except Empty:
                await asyncio.sleep(0.1)

        for action, path in list(pending):
            try:
                async with pool.acquire() as conn:
                    if action == 'upsert':
                        await ingest_file(Path(path), settings, conn)
                    elif action == 'delete':
                        relative = str(Path(path).relative_to(Path(settings.vault_path)))
                        await delete_document(conn, relative)
            except Exception as exc:
                logger.error('Error processing %s %s: %s', action, path, exc, exc_info=True)

        pending.clear()


async def initial_scan(settings: Settings, pool: asyncpg.Pool) -> None:
    """Index all Markdown files in the vault at startup."""
    vault_path = Path(settings.vault_path)
    md_files = list(vault_path.rglob('*.md'))
    logger.info('Initial scan: found %d Markdown files', len(md_files))

    for i, file_path in enumerate(md_files):
        try:
            async with pool.acquire() as conn:
                indexed = await ingest_file(file_path, settings, conn)
                if indexed:
                    logger.info('[%d/%d] Indexed %s', i + 1, len(md_files), file_path.name)
        except Exception as exc:
            logger.error('Failed to index %s: %s', file_path, exc, exc_info=True)

    logger.info('Initial scan complete.')


async def _init_connection(conn: asyncpg.Connection) -> None:
    # Register the pgvector codec so list[float] embeddings bind to ::vector params.
    await register_vector(conn)


async def run_watcher(settings: Settings) -> None:
    """Entry point: start file watcher + initial scan."""
    pool = await asyncpg.create_pool(
        settings.database_url, min_size=2, max_size=10, init=_init_connection
    )

    await initial_scan(settings, pool)

    event_queue: Queue = Queue()
    handler = VaultEventHandler(event_queue)
    observer = Observer()
    observer.schedule(handler, settings.vault_path, recursive=True)
    observer.start()
    logger.info('Watching vault at %s', settings.vault_path)

    try:
        await process_queue(event_queue, settings, pool)
    except asyncio.CancelledError:
        pass
    finally:
        observer.stop()
        observer.join()
        await pool.close()
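The debounce loop in `process_queue` collects events for a fixed window and deduplicates them in a set, so an editor that fires several `modified` events per save still triggers only one ingestion. A synchronous sketch of that drain step (window shortened for illustration):

```python
import time
from queue import Empty, Queue


def drain(queue: Queue, debounce_seconds: float = 0.2) -> set[tuple[str, str]]:
    # Collect everything queued during one debounce window; the set
    # collapses repeated (action, path) pairs from rapid editor saves.
    pending: set[tuple[str, str]] = set()
    deadline = time.monotonic() + debounce_seconds
    while time.monotonic() < deadline:
        try:
            pending.add(queue.get_nowait())
        except Empty:
            time.sleep(0.01)
    return pending


q: Queue = Queue()
for _ in range(3):                       # three rapid saves of the same file
    q.put(('upsert', 'notes/today.md'))
q.put(('delete', 'notes/old.md'))
events = sorted(drain(q))
print(events)  # [('delete', 'notes/old.md'), ('upsert', 'notes/today.md')]
```

The worker uses `await asyncio.sleep(...)` instead of `time.sleep(...)` in the same spot so the event loop stays responsive.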
@ -0,0 +1,18 @@
FROM python:3.12-slim

WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential libpq-dev curl \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

ENV PYTHONUNBUFFERED=1

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
@ -0,0 +1,35 @@
"""
database.py — Async PostgreSQL connection pool (asyncpg).
"""

from __future__ import annotations

import asyncpg
from pgvector.asyncpg import register_vector

from core.settings import Settings

_pool: asyncpg.Pool | None = None


async def _init_connection(conn: asyncpg.Connection) -> None:
    # Register the pgvector codec so embedding parameters bind to ::vector columns.
    await register_vector(conn)


async def create_pool(settings: Settings) -> asyncpg.Pool:
    global _pool
    _pool = await asyncpg.create_pool(
        settings.database_url,
        min_size=settings.db_pool_min,
        max_size=settings.db_pool_max,
        command_timeout=60,
        init=_init_connection,
    )
    return _pool


async def get_pool() -> asyncpg.Pool:
    if _pool is None:
        raise RuntimeError('Database pool not initialised')
    return _pool


async def close_pool() -> None:
    global _pool
    if _pool:
        await _pool.close()
        _pool = None
@ -0,0 +1,39 @@
"""
settings.py — RAG API configuration loaded from environment variables.
"""

from __future__ import annotations

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file='.env', extra='ignore')

    # App
    app_title: str = 'Second Brain RAG API'
    app_version: str = '1.0.0'
    log_level: str = 'INFO'

    # Database
    database_url: str = 'postgresql://brain:brain@postgres:5432/second_brain'
    db_pool_min: int = 2
    db_pool_max: int = 20

    # Ollama
    ollama_url: str = 'http://ollama:11434'
    embedding_model: str = 'nomic-embed-text'
    chat_model: str = 'mistral'
    embedding_dimensions: int = 768

    # Search defaults
    search_top_k: int = 10
    search_threshold: float = 0.65
    rerank_enabled: bool = False

    # CORS (comma-separated origins)
    cors_origins: str = 'http://localhost:3000'

    @property
    def cors_origins_list(self) -> list[str]:
        return [o.strip() for o in self.cors_origins.split(',') if o.strip()]
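The `cors_origins_list` property turns the comma-separated env value (matching `CORS_ORIGINS` in `.env.example`) into the list `CORSMiddleware` expects, trimming whitespace and dropping empty entries. The same parsing as a standalone function:

```python
def parse_origins(cors_origins: str) -> list[str]:
    # Same logic as Settings.cors_origins_list: split, strip, drop empties.
    return [o.strip() for o in cors_origins.split(',') if o.strip()]


origins = parse_origins('http://localhost:3000, https://brain.example ,')
print(origins)  # ['http://localhost:3000', 'https://brain.example']
```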
@ -0,0 +1,59 @@
"""
main.py — FastAPI application entry point for the RAG API.
"""

from __future__ import annotations

import logging
import sys

from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from core.database import create_pool, close_pool
from core.settings import Settings
from routers import search, chat, documents, index, meta

# Global settings instance (imported by routers via dependency)
app_settings = Settings()


def setup_logging(level: str) -> None:
    logging.basicConfig(
        level=getattr(logging, level.upper(), logging.INFO),
        format='%(asctime)s [%(levelname)s] %(name)s: %(message)s',
        datefmt='%Y-%m-%dT%H:%M:%S',
        stream=sys.stdout,
    )


@asynccontextmanager
async def lifespan(app: FastAPI):
    setup_logging(app_settings.log_level)
    logging.getLogger('rag-api').info('Starting RAG API v%s', app_settings.app_version)
    await create_pool(app_settings)
    yield
    await close_pool()


app = FastAPI(
    title=app_settings.app_title,
    version=app_settings.app_version,
    lifespan=lifespan,
)

app.add_middleware(
    CORSMiddleware,
    allow_origins=app_settings.cors_origins_list,
    allow_credentials=True,
    allow_methods=['*'],
    allow_headers=['*'],
)

# Register routers
app.include_router(search.router, prefix='/api/v1')
app.include_router(chat.router, prefix='/api/v1')
app.include_router(documents.router, prefix='/api/v1')
app.include_router(index.router, prefix='/api/v1')
app.include_router(meta.router, prefix='/api/v1')
@ -0,0 +1,31 @@
"""
models/requests.py — Pydantic request schemas for the RAG API.
"""

from __future__ import annotations

from typing import Optional
from pydantic import BaseModel, Field


class SearchRequest(BaseModel):
    query: str = Field(..., min_length=1, max_length=2000)
    limit: int = Field(default=10, ge=1, le=50)
    threshold: float = Field(default=0.65, ge=0.0, le=1.0)
    tags: Optional[list[str]] = None
    hybrid: bool = True


class ChatRequest(BaseModel):
    message: str = Field(..., min_length=1, max_length=4000)
    conversation_id: Optional[str] = None
    context_limit: int = Field(default=5, ge=1, le=20)
    stream: bool = True


class IndexRequest(BaseModel):
    path: str = Field(..., description='Relative path of file within the vault')


class ReindexRequest(BaseModel):
    force: bool = False  # If True, reindex even unchanged files
@ -0,0 +1,96 @@
"""
models/responses.py — Pydantic response schemas for the RAG API.
"""

from __future__ import annotations

from datetime import datetime
from typing import Any, Optional
from pydantic import BaseModel


class ChunkResult(BaseModel):
    document_id: str
    chunk_id: str
    title: str
    path: str
    content: str
    score: float
    tags: list[str]
    highlight: Optional[str] = None


class SearchResponse(BaseModel):
    results: list[ChunkResult]
    total: int
    query_time_ms: float


class DocumentResponse(BaseModel):
    id: str
    path: str
    title: str
    content: str
    frontmatter: dict[str, Any]
    tags: list[str]
    aliases: list[str]
    word_count: Optional[int]
    created_at: datetime
    updated_at: datetime
    indexed_at: Optional[datetime]


class RelatedDocument(BaseModel):
    document_id: str
    title: str
    path: str
    score: float
    tags: list[str]


class GraphNode(BaseModel):
    id: str
    title: str
    path: str
    tags: list[str]
    word_count: Optional[int]


class GraphEdge(BaseModel):
    source: str
    target: str
    relation_type: str
    label: Optional[str]


class GraphResponse(BaseModel):
    nodes: list[GraphNode]
    edges: list[GraphEdge]


class TagCount(BaseModel):
    tag: str
    count: int


class StatsResponse(BaseModel):
    total_documents: int
    total_chunks: int
    total_relations: int
    total_tags: int
    last_indexed: Optional[datetime]
    embedding_model: str
    chat_model: str


class HealthResponse(BaseModel):
    status: str
    database: str
    ollama: str
    version: str


class JobResponse(BaseModel):
    job_id: str
    status: str
    message: str
@ -0,0 +1,9 @@
fastapi>=0.111.0
uvicorn[standard]>=0.29.0
asyncpg>=0.29.0
pgvector>=0.2.5
pydantic>=2.7.0
pydantic-settings>=2.2.0
httpx>=0.27.0
python-multipart>=0.0.9
sse-starlette>=2.1.0
@ -0,0 +1,52 @@
"""
routers/chat.py — /chat endpoint with SSE streaming.
"""

from __future__ import annotations

from fastapi import APIRouter, Depends
from fastapi.responses import StreamingResponse

from core.database import get_pool
from core.settings import Settings
from models.requests import ChatRequest
from services.chat import stream_chat
from services.embedder import EmbedService
from services.retriever import hybrid_search

router = APIRouter(prefix='/chat', tags=['chat'])


def _get_settings() -> Settings:
    # Imported lazily to avoid a circular import with main.py.
    from main import app_settings
    return app_settings


@router.post('')
async def chat(req: ChatRequest, settings: Settings = Depends(_get_settings)):
    pool = await get_pool()
    embedder = EmbedService(settings.ollama_url, settings.embedding_model)
    embedding = await embedder.embed(req.message)

    async with pool.acquire() as conn:
        context_chunks, _ = await hybrid_search(
            conn=conn,
            query=req.message,
            embedding=embedding,
            limit=req.context_limit,
            threshold=settings.search_threshold,
        )

    return StreamingResponse(
        stream_chat(
            message=req.message,
            context_chunks=context_chunks,
            ollama_url=settings.ollama_url,
            model=settings.chat_model,
        ),
        media_type='text/event-stream',
        headers={
            'Cache-Control': 'no-cache',
            'X-Accel-Buffering': 'no',
        },
    )
@ -0,0 +1,67 @@
"""
routers/documents.py — Document CRUD and graph endpoints.
"""

from __future__ import annotations

import json
from typing import Optional

import asyncpg
from fastapi import APIRouter, Depends, HTTPException

from core.database import get_pool
from core.settings import Settings
from models.responses import DocumentResponse, GraphResponse, GraphNode, GraphEdge, RelatedDocument, TagCount
from services.retriever import get_related

router = APIRouter(prefix='/document', tags=['documents'])


def _get_settings() -> Settings:
    # Imported lazily to avoid a circular import with main.py.
    from main import app_settings
    return app_settings


@router.get('/{document_id}', response_model=DocumentResponse)
async def get_document(document_id: str):
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            'SELECT * FROM documents WHERE id = $1::uuid', document_id
        )
        if not row:
            raise HTTPException(status_code=404, detail='Document not found')
        return _row_to_doc(row)


@router.get('/path/{path:path}', response_model=DocumentResponse)
async def get_document_by_path(path: str):
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow('SELECT * FROM documents WHERE path = $1', path)
        if not row:
            raise HTTPException(status_code=404, detail='Document not found')
        return _row_to_doc(row)


@router.get('/{document_id}/related', response_model=list[RelatedDocument])
async def related_documents(document_id: str, limit: int = 5):
    pool = await get_pool()
    async with pool.acquire() as conn:
        related = await get_related(conn, document_id, limit=limit)
        return [RelatedDocument(**r) for r in related]


def _row_to_doc(row: asyncpg.Record) -> DocumentResponse:
    return DocumentResponse(
        id=str(row['id']),
        path=row['path'],
        title=row['title'] or '',
        content=row['content'],
        # asyncpg returns jsonb as a str by default, so decode before building the model.
        frontmatter=json.loads(row['frontmatter']) if row['frontmatter'] else {},
        tags=list(row['tags'] or []),
        aliases=list(row['aliases'] or []),
        word_count=row['word_count'],
        created_at=row['created_at'],
        updated_at=row['updated_at'],
        indexed_at=row['indexed_at'],
    )
@ -0,0 +1,49 @@
"""
routers/index.py — /index and /reindex endpoints.
"""

from __future__ import annotations

import json
import uuid

from fastapi import APIRouter, BackgroundTasks, Depends

from core.database import get_pool
from core.settings import Settings
from models.requests import IndexRequest, ReindexRequest
from models.responses import JobResponse

router = APIRouter(prefix='/index', tags=['indexing'])


def _get_settings() -> Settings:
    from main import app_settings
    return app_settings


async def _enqueue_job(agent_type: str, payload: dict, pool) -> str:
    job_id = str(uuid.uuid4())
    async with pool.acquire() as conn:
        await conn.execute(
            """
            INSERT INTO agent_jobs (id, agent_type, payload)
            VALUES ($1::uuid, $2, $3::jsonb)
            """,
            job_id,
            agent_type,
            json.dumps(payload),
        )
    return job_id


@router.post('', response_model=JobResponse)
async def index_file(req: IndexRequest, settings: Settings = Depends(_get_settings)):
    pool = await get_pool()
    job_id = await _enqueue_job('ingestion', {'path': req.path, 'force': True}, pool)
    return JobResponse(job_id=job_id, status='pending', message=f'Indexing {req.path}')


@router.post('/reindex', response_model=JobResponse)
async def reindex_vault(req: ReindexRequest, settings: Settings = Depends(_get_settings)):
    pool = await get_pool()
    job_id = await _enqueue_job('ingestion', {'reindex_all': True, 'force': req.force}, pool)
    return JobResponse(job_id=job_id, status='pending', message='Full vault reindex queued')
@ -0,0 +1,129 @@
"""
routers/meta.py — /health, /stats, /tags, /graph endpoints.
"""

from __future__ import annotations

import httpx
from fastapi import APIRouter

from core.database import get_pool
from core.settings import Settings
from models.responses import HealthResponse, StatsResponse, TagCount, GraphResponse, GraphNode, GraphEdge

router = APIRouter(tags=['meta'])


def _get_settings() -> Settings:
    from main import app_settings
    return app_settings


@router.get('/health', response_model=HealthResponse)
async def health():
    settings = _get_settings()
    db_status = 'ok'
    ollama_status = 'ok'

    try:
        pool = await get_pool()
        async with pool.acquire() as conn:
            await conn.fetchval('SELECT 1')
    except Exception:
        db_status = 'error'

    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            resp = await client.get(f'{settings.ollama_url}/api/tags')
            if resp.status_code != 200:
                ollama_status = 'error'
    except Exception:
        ollama_status = 'unavailable'

    overall = 'ok' if db_status == 'ok' else 'degraded'
    return HealthResponse(
        status=overall,
        database=db_status,
        ollama=ollama_status,
        version=settings.app_version,
    )


@router.get('/stats', response_model=StatsResponse)
async def stats():
    settings = _get_settings()
    pool = await get_pool()
    async with pool.acquire() as conn:
        docs = await conn.fetchval('SELECT COUNT(*) FROM documents')
        chunks = await conn.fetchval('SELECT COUNT(*) FROM chunks')
        relations = await conn.fetchval('SELECT COUNT(*) FROM relations')
        tags_count = await conn.fetchval(
            "SELECT COUNT(DISTINCT tag) FROM documents, unnest(tags) AS tag"
        )
        last_indexed = await conn.fetchval(
            'SELECT MAX(indexed_at) FROM documents'
        )
    return StatsResponse(
        total_documents=docs or 0,
        total_chunks=chunks or 0,
        total_relations=relations or 0,
        total_tags=tags_count or 0,
        last_indexed=last_indexed,
        embedding_model=settings.embedding_model,
        chat_model=settings.chat_model,
    )


@router.get('/tags', response_model=list[TagCount])
async def list_tags():
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            SELECT tag, COUNT(*) AS count
            FROM documents, unnest(tags) AS tag
            GROUP BY tag
            ORDER BY count DESC, tag
            """
        )
    return [TagCount(tag=row['tag'], count=row['count']) for row in rows]


@router.get('/graph', response_model=GraphResponse)
async def knowledge_graph(limit: int = 200):
    pool = await get_pool()
    async with pool.acquire() as conn:
        doc_rows = await conn.fetch(
            'SELECT id, title, path, tags, word_count FROM documents LIMIT $1',
            limit,
        )
        rel_rows = await conn.fetch(
            """
            SELECT r.source_doc_id::text, r.target_doc_id::text, r.relation_type, r.label
            FROM relations r
            WHERE r.target_doc_id IS NOT NULL
            LIMIT $1
            """,
            limit * 3,
        )

    nodes = [
        GraphNode(
            id=str(row['id']),
            title=row['title'] or '',
            path=row['path'],
            tags=list(row['tags'] or []),
            word_count=row['word_count'],
        )
        for row in doc_rows
    ]
    edges = [
        GraphEdge(
            source=row['source_doc_id'],
            target=row['target_doc_id'],
            relation_type=row['relation_type'],
            label=row['label'],
        )
        for row in rel_rows
    ]
    return GraphResponse(nodes=nodes, edges=edges)
@ -0,0 +1,43 @@
"""
routers/search.py — /search endpoint.
"""

from __future__ import annotations

from fastapi import APIRouter, Depends

from core.database import get_pool
from core.settings import Settings
from models.requests import SearchRequest
from models.responses import SearchResponse
from services.embedder import EmbedService
from services.retriever import hybrid_search

router = APIRouter(prefix='/search', tags=['search'])


def _get_settings() -> Settings:
    from main import app_settings
    return app_settings


@router.post('', response_model=SearchResponse)
async def search(req: SearchRequest, settings: Settings = Depends(_get_settings)):
    pool = await get_pool()
    embedder = EmbedService(settings.ollama_url, settings.embedding_model)
    embedding = await embedder.embed(req.query)

    async with pool.acquire() as conn:
        results, elapsed = await hybrid_search(
            conn=conn,
            query=req.query,
            embedding=embedding,
            limit=req.limit,
            threshold=req.threshold,
            tags=req.tags,
        )

    return SearchResponse(results=results, total=len(results), query_time_ms=elapsed)
@ -0,0 +1,87 @@
"""
services/chat.py — RAG chat: retrieves context, streams LLM response.
"""

from __future__ import annotations

import json
import logging
from typing import AsyncIterator

import httpx

from models.responses import ChunkResult

logger = logging.getLogger(__name__)

SYSTEM_PROMPT = """You are a knowledgeable assistant with access to the user's personal knowledge base (Second Brain).
Answer questions based on the provided context documents.
Always cite which documents you drew information from using the format [Document Title].
If the context doesn't contain enough information, say so honestly rather than fabricating answers.
Be concise and precise."""


async def stream_chat(
    message: str,
    context_chunks: list[ChunkResult],
    ollama_url: str,
    model: str,
) -> AsyncIterator[str]:
    """
    Stream a chat response via Ollama using the retrieved context.

    Yields Server-Sent Events (SSE) formatted strings.
    """
    # Build context block
    context_parts = []
    for i, chunk in enumerate(context_chunks, 1):
        context_parts.append(
            f'[{i}] **{chunk.title}** (path: {chunk.path})\n{chunk.content}'
        )
    context_text = '\n\n---\n\n'.join(context_parts)

    prompt = f"""Context from knowledge base:

{context_text}

---

User question: {message}

Answer based on the above context:"""

    url = f'{ollama_url.rstrip("/")}/api/chat'
    payload = {
        'model': model,
        'stream': True,
        'messages': [
            {'role': 'system', 'content': SYSTEM_PROMPT},
            {'role': 'user', 'content': prompt},
        ],
    }

    # Yield sources first
    sources = [
        {'title': c.title, 'path': c.path, 'score': c.score}
        for c in context_chunks
    ]
    yield f'data: {json.dumps({"type": "sources", "sources": sources})}\n\n'

    # Stream tokens
    async with httpx.AsyncClient(timeout=120.0) as client:
        async with client.stream('POST', url, json=payload) as resp:
            resp.raise_for_status()
            async for line in resp.aiter_lines():
                if not line.strip():
                    continue
                try:
                    chunk_data = json.loads(line)
                    token = chunk_data.get('message', {}).get('content', '')
                    if token:
                        yield f'data: {json.dumps({"type": "token", "token": token})}\n\n'
                    if chunk_data.get('done', False):
                        break
                except json.JSONDecodeError:
                    continue

    yield f'data: {json.dumps({"type": "done"})}\n\n'
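For reference, a consumer of `stream_chat`'s SSE output decodes it roughly like this. This is an illustrative sketch, not code from the repo: the parser function is mine, but the `sources`/`token`/`done` event shapes match the payloads yielded above.

```python
import json

def parse_sse_events(lines):
    """Decode 'data: {...}' SSE lines into event dicts, skipping blank lines."""
    events = []
    for line in lines:
        line = line.strip()
        if not line.startswith('data: '):
            continue
        events.append(json.loads(line[len('data: '):]))
    return events

# Example stream, shaped like stream_chat's output
raw = [
    'data: {"type": "sources", "sources": [{"title": "Note", "path": "a.md", "score": 0.42}]}',
    '',
    'data: {"type": "token", "token": "Hel"}',
    'data: {"type": "token", "token": "lo"}',
    'data: {"type": "done"}',
]
events = parse_sse_events(raw)
answer = ''.join(e['token'] for e in events if e['type'] == 'token')
# answer == 'Hello'
```

The web UI's `streamChat` helper does the same thing on the client: render the `sources` event first, append each `token` to the last assistant message, and stop on `done`.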
@ -0,0 +1,31 @@
"""
services/embedder.py — Thin async wrapper around Ollama embeddings for the API.
"""

from __future__ import annotations

import logging

import httpx

logger = logging.getLogger(__name__)


class EmbedService:
    def __init__(self, ollama_url: str, model: str, timeout: float = 30.0) -> None:
        self._url = f'{ollama_url.rstrip("/")}/api/embed'
        self._model = model
        self._timeout = timeout

    async def embed(self, text: str) -> list[float]:
        return (await self.embed_batch([text]))[0]

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        async with httpx.AsyncClient(timeout=self._timeout) as client:
            resp = await client.post(
                self._url,
                json={'model': self._model, 'input': texts},
            )
            resp.raise_for_status()
            return resp.json()['embeddings']
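The vectors this service returns are compared downstream with pgvector's `<=>` (cosine distance) operator, where `1 - distance` is the similarity score used in the retriever. A minimal pure-Python sketch of that scoring, for intuition only (not part of the repo):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Same quantity as `1 - (embedding <=> query)` in the SQL:
    cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical directions score 1.0; orthogonal directions score 0.0
print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 4))  # 1.0
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 4))  # 0.0
```

This is why `SEARCH_THRESHOLD=0.65` in the env file is a similarity floor: chunks whose cosine similarity to the query embedding falls below it are excluded.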
@ -0,0 +1,160 @@
"""
services/retriever.py — Hybrid vector + full-text search against PostgreSQL.
"""

from __future__ import annotations

import logging
import time
from typing import Optional

import asyncpg

from models.responses import ChunkResult

logger = logging.getLogger(__name__)


async def hybrid_search(
    conn: asyncpg.Connection,
    query: str,
    embedding: list[float],
    limit: int = 10,
    threshold: float = 0.65,
    tags: Optional[list[str]] = None,
) -> tuple[list[ChunkResult], float]:
    """
    Hybrid search: vector similarity + full-text search, merged by RRF.

    Returns (results, query_time_ms).
    """
    start = time.monotonic()

    tag_filter = ''
    params: list = [embedding, query, limit * 2, threshold]

    if tags:
        tag_filter = 'AND d.tags && $5'
        params.append(tags)

    # Combined RRF (Reciprocal Rank Fusion) of vector and FTS results
    sql = f"""
        WITH vector_results AS (
            SELECT
                c.id AS chunk_id,
                c.document_id,
                c.content,
                c.chunk_index,
                1 - (c.embedding <=> $1::vector) AS vector_score,
                ROW_NUMBER() OVER (ORDER BY c.embedding <=> $1::vector) AS vector_rank
            FROM chunks c
            JOIN documents d ON d.id = c.document_id
            WHERE 1 - (c.embedding <=> $1::vector) >= $4
            {tag_filter}
            ORDER BY c.embedding <=> $1::vector
            LIMIT $3
        ),
        fts_results AS (
            SELECT
                c.id AS chunk_id,
                c.document_id,
                c.content,
                c.chunk_index,
                ts_rank_cd(d.fts_vector, plainto_tsquery('english', $2)) AS fts_score,
                ROW_NUMBER() OVER (
                    ORDER BY ts_rank_cd(d.fts_vector, plainto_tsquery('english', $2)) DESC
                ) AS fts_rank
            FROM chunks c
            JOIN documents d ON d.id = c.document_id
            WHERE d.fts_vector @@ plainto_tsquery('english', $2)
            {tag_filter}
            ORDER BY fts_score DESC
            LIMIT $3
        ),
        merged AS (
            SELECT
                COALESCE(v.chunk_id, f.chunk_id) AS chunk_id,
                COALESCE(v.document_id, f.document_id) AS document_id,
                COALESCE(v.content, f.content) AS content,
                (COALESCE(1.0 / (60 + v.vector_rank), 0) +
                 COALESCE(1.0 / (60 + f.fts_rank), 0)) AS rrf_score,
                COALESCE(v.vector_score, 0) AS vector_score
            FROM vector_results v
            FULL OUTER JOIN fts_results f ON v.chunk_id = f.chunk_id
        )
        SELECT
            m.chunk_id::text,
            m.document_id::text,
            m.content,
            m.rrf_score,
            m.vector_score,
            d.title,
            d.path,
            d.tags,
            ts_headline('english', m.content, plainto_tsquery('english', $2),
                        'MaxWords=20, MinWords=10, ShortWord=3') AS highlight
        FROM merged m
        JOIN documents d ON d.id = m.document_id
        ORDER BY m.rrf_score DESC
        LIMIT $3
    """

    rows = await conn.fetch(sql, *params)
    elapsed_ms = (time.monotonic() - start) * 1000

    results = [
        ChunkResult(
            chunk_id=str(row['chunk_id']),
            document_id=str(row['document_id']),
            title=row['title'] or '',
            path=row['path'],
            content=row['content'],
            score=round(float(row['rrf_score']), 4),
            tags=list(row['tags'] or []),
            highlight=row['highlight'],
        )
        for row in rows
    ]

    return results[:limit], round(elapsed_ms, 2)


async def get_related(
    conn: asyncpg.Connection,
    document_id: str,
    limit: int = 5,
) -> list[dict]:
    """Find documents related to the given document via average chunk embedding."""
    rows = await conn.fetch(
        """
        WITH doc_embedding AS (
            SELECT AVG(embedding) AS avg_emb
            FROM chunks
            WHERE document_id = $1::uuid
        )
        SELECT
            d.id::text,
            d.title,
            d.path,
            d.tags,
            1 - (AVG(c.embedding) <=> (SELECT avg_emb FROM doc_embedding)) AS score
        FROM chunks c
        JOIN documents d ON d.id = c.document_id
        WHERE c.document_id != $1::uuid
        GROUP BY d.id, d.title, d.path, d.tags
        ORDER BY score DESC
        LIMIT $2
        """,
        document_id,
        limit,
    )
    return [
        {
            'document_id': row['id'],
            'title': row['title'] or '',
            'path': row['path'],
            'tags': list(row['tags'] or []),
            'score': round(float(row['score']), 4),
        }
        for row in rows
    ]
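The RRF merge performed by the SQL in `hybrid_search` can be sketched in plain Python. This is a toy illustration of how the two ranked lists combine, using the same constant 60 as the query; the function name and sample data are mine, not from the repo:

```python
def rrf_merge(vector_ranked: list[str], fts_ranked: list[str], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: score(id) = sum over lists of 1 / (k + rank),
    where rank is 1-based within each list. Items in both lists score higher."""
    scores: dict[str, float] = {}
    for ranked in (vector_ranked, fts_ranked):
        for rank, chunk_id in enumerate(ranked, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first, mirroring ORDER BY m.rrf_score DESC
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# 'b' appears near the top of both lists, so it wins overall
merged = rrf_merge(['a', 'b', 'c'], ['b', 'd'])
# merged[0][0] == 'b'
```

The `k = 60` damping constant keeps a single first-place rank from dominating: the gap between `1/61` and `1/62` is small, so only chunks ranked well by both retrieval paths pull clearly ahead.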
@ -0,0 +1,31 @@
FROM node:20-alpine AS base

FROM base AS deps
WORKDIR /app
COPY package.json package-lock.json* ./
RUN npm ci || npm install

FROM base AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
ENV NEXT_TELEMETRY_DISABLED=1
RUN npm run build

FROM base AS runner
WORKDIR /app
ENV NODE_ENV=production
ENV NEXT_TELEMETRY_DISABLED=1

RUN addgroup --system --gid 1001 nodejs && \
    adduser --system --uid 1001 nextjs

COPY --from=builder /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static

USER nextjs
EXPOSE 3000
ENV PORT=3000

CMD ["node", "server.js"]
@ -0,0 +1,136 @@
'use client';

import { streamChat } from '@/lib/api';
import { useState, useRef, useEffect, useCallback } from 'react';
import { Send, Loader2, Bot, User, BookOpen } from 'lucide-react';

interface Message {
  role: 'user' | 'assistant';
  content: string;
  sources?: { title: string; path: string; score: number }[];
}

export default function ChatPage() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [streaming, setStreaming] = useState(false);
  const cancelRef = useRef<(() => void) | null>(null);
  const bottomRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    bottomRef.current?.scrollIntoView({ behavior: 'smooth' });
  }, [messages]);

  const sendMessage = useCallback(async () => {
    const text = input.trim();
    if (!text || streaming) return;
    setInput('');

    const userMsg: Message = { role: 'user', content: text };
    setMessages((prev) => [...prev, userMsg]);
    setStreaming(true);

    const assistantMsg: Message = { role: 'assistant', content: '', sources: [] };
    setMessages((prev) => [...prev, assistantMsg]);

    cancelRef.current = streamChat(
      text,
      5,
      (token) => {
        setMessages((prev) => {
          const next = [...prev];
          const last = next[next.length - 1];
          next[next.length - 1] = { ...last, content: last.content + token };
          return next;
        });
      },
      (sources) => {
        setMessages((prev) => {
          const next = [...prev];
          next[next.length - 1] = { ...next[next.length - 1], sources };
          return next;
        });
      },
      () => setStreaming(false),
    );
  }, [input, streaming]);

  const handleKeyDown = (e: React.KeyboardEvent) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      sendMessage();
    }
  };

  return (
    <div className="max-w-3xl mx-auto flex flex-col h-[calc(100vh-5rem)]">
      <h1 className="text-2xl font-bold text-slate-900 mb-4">AI Chat</h1>

      <div className="flex-1 overflow-y-auto space-y-4 pr-1 mb-4">
        {messages.length === 0 && (
          <div className="text-center text-slate-400 py-16">
            <Bot size={48} className="mx-auto mb-3 opacity-30" />
            <p>Ask anything about your knowledge base</p>
          </div>
        )}

        {messages.map((msg, i) => (
          <div key={i} className={`flex gap-3 ${msg.role === 'user' ? 'justify-end' : 'justify-start'}`}>
            {msg.role === 'assistant' && (
              <div className="w-8 h-8 rounded-full bg-brain-600 flex items-center justify-center shrink-0 mt-1">
                <Bot size={16} className="text-white" />
              </div>
            )}
            <div className={`max-w-[80%] ${msg.role === 'user' ? 'order-1' : ''}`}>
              <div className={`rounded-2xl px-4 py-3 text-sm leading-relaxed whitespace-pre-wrap
                ${msg.role === 'user'
                  ? 'bg-brain-600 text-white rounded-tr-none'
                  : 'bg-white border border-slate-200 text-slate-800 rounded-tl-none'
                }`}>
                {msg.content}
                {msg.role === 'assistant' && streaming && i === messages.length - 1 && (
                  <span className="inline-block w-1.5 h-4 bg-brain-500 animate-pulse ml-0.5 rounded" />
                )}
              </div>
              {msg.sources && msg.sources.length > 0 && (
                <div className="mt-2 flex flex-wrap gap-1">
                  {msg.sources.map((src, si) => (
                    <span key={si} className="text-xs bg-slate-100 text-slate-600 px-2 py-0.5 rounded-full flex items-center gap-1">
                      <BookOpen size={10} />
                      {src.title}
                    </span>
                  ))}
                </div>
              )}
            </div>
            {msg.role === 'user' && (
              <div className="w-8 h-8 rounded-full bg-slate-200 flex items-center justify-center shrink-0 mt-1">
                <User size={16} className="text-slate-600" />
              </div>
            )}
          </div>
        ))}
        <div ref={bottomRef} />
      </div>

      <div className="flex gap-2">
        <textarea
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={handleKeyDown}
          disabled={streaming}
          placeholder="Ask your second brain... (Enter to send, Shift+Enter for newline)"
          rows={2}
          className="flex-1 px-4 py-2.5 border border-slate-300 rounded-xl focus:outline-none focus:ring-2 focus:ring-brain-500 resize-none text-sm disabled:opacity-60"
        />
        <button
          onClick={sendMessage}
          disabled={streaming || !input.trim()}
          className="px-4 bg-brain-600 text-white rounded-xl hover:bg-brain-700 disabled:opacity-50 disabled:cursor-not-allowed flex items-center"
        >
          {streaming ? <Loader2 size={20} className="animate-spin" /> : <Send size={20} />}
        </button>
      </div>
    </div>
  );
}
@ -0,0 +1,87 @@
'use client';

import { getDocument, Document } from '@/lib/api';
import { useEffect, useState } from 'react';
import { useParams } from 'next/navigation';
import { FileText, Tag, ArrowLeft, Loader2, Calendar } from 'lucide-react';
import Link from 'next/link';

export default function DocumentPage() {
  const { id } = useParams<{ id: string }>();
  const [doc, setDoc] = useState<Document | null>(null);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState('');

  useEffect(() => {
    getDocument(id)
      .then(setDoc)
      .catch(() => setError('Document not found'))
      .finally(() => setLoading(false));
  }, [id]);

  if (loading) return (
    <div className="flex justify-center items-center h-40">
      <Loader2 className="animate-spin text-brain-500" size={32} />
    </div>
  );

  if (error || !doc) return (
    <div className="text-red-600 p-4">{error || 'Document not found'}</div>
  );

  return (
    <div className="max-w-3xl mx-auto">
      <Link href="/search" className="inline-flex items-center gap-1 text-sm text-slate-500 hover:text-brain-600 mb-4">
        <ArrowLeft size={14} /> Back to Search
      </Link>

      <div className="bg-white rounded-xl border border-slate-200 p-8">
        <div className="flex items-start gap-3 mb-4">
          <FileText className="text-brain-500 mt-1 shrink-0" size={24} />
          <div>
            <h1 className="text-2xl font-bold text-slate-900">{doc.title}</h1>
            <p className="text-sm text-slate-400 mt-1">{doc.path}</p>
          </div>
        </div>

        <div className="flex flex-wrap gap-4 text-sm text-slate-500 mb-6 pb-6 border-b border-slate-100">
          {doc.word_count && (
            <span>{doc.word_count.toLocaleString()} words</span>
          )}
          {doc.indexed_at && (
            <span className="flex items-center gap-1">
              <Calendar size={13} />
              Indexed {new Date(doc.indexed_at).toLocaleDateString()}
            </span>
          )}
        </div>

        {doc.tags.length > 0 && (
          <div className="flex flex-wrap gap-2 mb-6">
            {doc.tags.map((tag) => (
              <Link
                key={tag}
                href={`/tags?tag=${encodeURIComponent(tag)}`}
                className="text-xs bg-brain-50 text-brain-700 px-2.5 py-1 rounded-full flex items-center gap-1 hover:bg-brain-100"
              >
                <Tag size={10} />
                {tag}
              </Link>
            ))}
          </div>
        )}

        {doc.frontmatter?.summary && (
          <div className="mb-6 p-4 bg-slate-50 rounded-lg border border-slate-100">
            <p className="text-sm font-medium text-slate-600 mb-1">Summary</p>
            <p className="text-sm text-slate-700">{doc.frontmatter.summary as string}</p>
          </div>
        )}

        <div className="prose max-w-none text-slate-700 whitespace-pre-wrap font-mono text-sm leading-relaxed bg-slate-50 rounded-lg p-5 overflow-x-auto">
          {doc.content}
        </div>
      </div>
    </div>
  );
}
@ -0,0 +1,26 @@
@tailwind base;
@tailwind components;
@tailwind utilities;

:root {
  --foreground: #0f172a;
  --background: #f8fafc;
}

body {
  color: var(--foreground);
  background: var(--background);
  font-family: system-ui, -apple-system, sans-serif;
}

/* Markdown content rendering */
.prose h1 { @apply text-2xl font-bold mb-3 mt-6; }
.prose h2 { @apply text-xl font-semibold mb-2 mt-5; }
.prose h3 { @apply text-lg font-semibold mb-2 mt-4; }
.prose p { @apply mb-3 leading-relaxed; }
.prose ul { @apply list-disc list-inside mb-3; }
.prose ol { @apply list-decimal list-inside mb-3; }
.prose code { @apply bg-slate-100 px-1 rounded text-sm font-mono; }
.prose pre { @apply bg-slate-900 text-slate-100 p-4 rounded-lg overflow-x-auto mb-3; }
.prose blockquote { @apply border-l-4 border-brain-500 pl-4 italic text-slate-600 mb-3; }
.prose a { @apply text-brain-600 underline hover:text-brain-700; }
@ -0,0 +1,21 @@
import type { Metadata } from 'next';
import './globals.css';
import Sidebar from '@/components/layout/Sidebar';

export const metadata: Metadata = {
  title: 'Second Brain',
  description: 'Your AI-powered personal knowledge base',
};

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html lang="en">
      <body className="flex h-screen overflow-hidden bg-slate-50">
        <Sidebar />
        <main className="flex-1 overflow-y-auto p-6">
          {children}
        </main>
      </body>
    </html>
  );
}
@ -0,0 +1,67 @@
'use client';

import { getStats } from '@/lib/api';
import { useEffect, useState } from 'react';
import { Brain, FileText, Layers, Tag, Link } from 'lucide-react';

interface Stats {
  total_documents: number;
  total_chunks: number;
  total_relations: number;
  total_tags: number;
  last_indexed: string | null;
  embedding_model: string;
  chat_model: string;
}

export default function HomePage() {
  const [stats, setStats] = useState<Stats | null>(null);

  useEffect(() => {
    getStats().then(setStats).catch(console.error);
  }, []);

  const cards = stats ? [
    { label: 'Documents', value: stats.total_documents, icon: FileText, color: 'text-blue-600' },
    { label: 'Chunks', value: stats.total_chunks, icon: Layers, color: 'text-purple-600' },
    { label: 'Links', value: stats.total_relations, icon: Link, color: 'text-green-600' },
    { label: 'Tags', value: stats.total_tags, icon: Tag, color: 'text-orange-600' },
  ] : [];

  return (
    <div className="max-w-4xl mx-auto">
      <div className="flex items-center gap-3 mb-8">
        <Brain className="text-brain-600" size={36} />
        <div>
          <h1 className="text-3xl font-bold text-slate-900">Second Brain</h1>
          <p className="text-slate-500">Your AI-powered knowledge base</p>
        </div>
      </div>

      {stats && (
        <>
          <div className="grid grid-cols-2 md:grid-cols-4 gap-4 mb-8">
            {cards.map((card) => (
              <div key={card.label} className="bg-white rounded-xl shadow-sm border border-slate-200 p-5">
                <card.icon className={`${card.color} mb-2`} size={24} />
                <p className="text-2xl font-bold text-slate-900">{card.value.toLocaleString()}</p>
                <p className="text-sm text-slate-500">{card.label}</p>
              </div>
            ))}
          </div>
          <div className="bg-white rounded-xl border border-slate-200 p-5 text-sm text-slate-600">
            <p><span className="font-medium">Embedding model:</span> {stats.embedding_model}</p>
            <p><span className="font-medium">Chat model:</span> {stats.chat_model}</p>
            {stats.last_indexed && (
              <p><span className="font-medium">Last indexed:</span> {new Date(stats.last_indexed).toLocaleString()}</p>
            )}
          </div>
        </>
      )}

      {!stats && (
        <div className="text-slate-400 animate-pulse">Loading stats...</div>
      )}
    </div>
  );
}
@ -0,0 +1,108 @@
'use client';

import { search, SearchResult } from '@/lib/api';
import { useState, useCallback } from 'react';
import { Search, Loader2, FileText, Tag } from 'lucide-react';
import Link from 'next/link';

export default function SearchPage() {
  const [query, setQuery] = useState('');
  const [results, setResults] = useState<SearchResult[]>([]);
  const [loading, setLoading] = useState(false);
  const [queryTime, setQueryTime] = useState<number | null>(null);
  const [error, setError] = useState('');

  const handleSearch = useCallback(async (e: React.FormEvent) => {
    e.preventDefault();
    if (!query.trim()) return;
    setLoading(true);
    setError('');
    try {
      const res = await search(query.trim());
      setResults(res.results);
      setQueryTime(res.query_time_ms);
    } catch {
      setError('Search failed. Is the API running?');
    } finally {
      setLoading(false);
    }
  }, [query]);

  return (
    <div className="max-w-3xl mx-auto">
      <h1 className="text-2xl font-bold text-slate-900 mb-6">Search</h1>

      <form onSubmit={handleSearch} className="flex gap-2 mb-6">
        <div className="flex-1 relative">
          <Search className="absolute left-3 top-1/2 -translate-y-1/2 text-slate-400" size={18} />
          <input
            type="text"
            value={query}
            onChange={(e) => setQuery(e.target.value)}
            placeholder="Search your knowledge base..."
            className="w-full pl-10 pr-4 py-2.5 border border-slate-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-brain-500 focus:border-transparent"
          />
        </div>
        <button
          type="submit"
          disabled={loading || !query.trim()}
          className="px-5 py-2.5 bg-brain-600 text-white rounded-lg font-medium hover:bg-brain-700 disabled:opacity-50 disabled:cursor-not-allowed flex items-center gap-2"
        >
          {loading && <Loader2 size={16} className="animate-spin" />}
          Search
        </button>
      </form>

      {error && (
        <div className="mb-4 p-3 bg-red-50 text-red-700 rounded-lg border border-red-200">{error}</div>
      )}

      {queryTime !== null && results.length > 0 && (
        <p className="text-sm text-slate-400 mb-4">
          {results.length} results in {queryTime}ms
        </p>
      )}

      <div className="space-y-4">
        {results.map((result) => (
          <Link
            key={result.chunk_id}
            href={`/documents/${result.document_id}`}
            className="block bg-white border border-slate-200 rounded-xl p-5 hover:border-brain-400 hover:shadow-sm transition-all"
          >
            <div className="flex items-start justify-between gap-3 mb-2">
              <div className="flex items-center gap-2">
                <FileText size={16} className="text-brain-500 shrink-0 mt-0.5" />
                <h3 className="font-semibold text-slate-900">{result.title}</h3>
              </div>
              <span className="text-xs text-slate-400 shrink-0 bg-slate-100 px-2 py-0.5 rounded-full">
                {(result.score * 100).toFixed(0)}%
              </span>
            </div>

            {result.highlight && (
              <p
                className="text-sm text-slate-600 mb-3 line-clamp-3"
                dangerouslySetInnerHTML={{ __html: result.highlight }}
              />
            )}

            <div className="flex items-center gap-2 flex-wrap">
              <span className="text-xs text-slate-400">{result.path}</span>
              {result.tags.slice(0, 4).map((tag) => (
                <span key={tag} className="text-xs bg-brain-50 text-brain-700 px-2 py-0.5 rounded-full flex items-center gap-1">
                  <Tag size={10} />
                  {tag}
                </span>
              ))}
            </div>
          </Link>
        ))}

        {results.length === 0 && queryTime !== null && !loading && (
          <p className="text-center text-slate-400 py-12">No results found for &quot;{query}&quot;</p>
        )}
      </div>
    </div>
  );
}
@ -0,0 +1,60 @@
'use client';

import { getTags, TagCount } from '@/lib/api';
import { useEffect, useState } from 'react';
import { Tag, Search } from 'lucide-react';
import Link from 'next/link';

export default function TagsPage() {
  const [tags, setTags] = useState<TagCount[]>([]);
  const [filter, setFilter] = useState('');

  useEffect(() => {
    getTags().then(setTags).catch(console.error);
  }, []);

  // Match case-insensitively on both sides so tags stored with capitals still match.
  const filtered = filter
    ? tags.filter((t) => t.tag.toLowerCase().includes(filter.toLowerCase()))
    : tags;

  // Assumes the API returns tags sorted by count, descending.
  const maxCount = tags[0]?.count ?? 1;

  return (
    <div className="max-w-3xl mx-auto">
      <h1 className="text-2xl font-bold text-slate-900 mb-6">Tags</h1>

      <div className="relative mb-6">
        <Search className="absolute left-3 top-1/2 -translate-y-1/2 text-slate-400" size={16} />
        <input
          type="text"
          value={filter}
          onChange={(e) => setFilter(e.target.value)}
          placeholder="Filter tags..."
          className="w-full pl-9 pr-4 py-2 border border-slate-300 rounded-lg text-sm focus:outline-none focus:ring-2 focus:ring-brain-500"
        />
      </div>

      <div className="flex flex-wrap gap-2">
        {filtered.map(({ tag, count }) => {
          // Scale font size from 0.75rem up to 1.5rem by relative frequency.
          const size = 0.75 + (count / maxCount) * 0.75;
          return (
            <Link
              key={tag}
              href={`/search?q=${encodeURIComponent(tag)}`}
              className="inline-flex items-center gap-1.5 bg-white border border-slate-200 rounded-full px-3 py-1.5 hover:border-brain-400 hover:bg-brain-50 transition-colors"
              style={{ fontSize: `${size}rem` }}
            >
              <Tag size={12} className="text-brain-500" />
              <span className="text-slate-700">{tag}</span>
              <span className="text-xs text-slate-400 bg-slate-100 px-1.5 rounded-full">{count}</span>
            </Link>
          );
        })}
      </div>

      {filtered.length === 0 && (
        <p className="text-slate-400 text-center py-12">No tags found</p>
      )}
    </div>
  );
}
@ -0,0 +1,49 @@
'use client';

import Link from 'next/link';
import { usePathname } from 'next/navigation';
import { Brain, Search, MessageSquare, Tag, GitGraph, Home } from 'lucide-react';
import clsx from 'clsx';

const NAV_ITEMS = [
  { href: '/', label: 'Home', icon: Home },
  { href: '/search', label: 'Search', icon: Search },
  { href: '/chat', label: 'Chat', icon: MessageSquare },
  { href: '/tags', label: 'Tags', icon: Tag },
  { href: '/graph', label: 'Graph', icon: GitGraph },
];

export default function Sidebar() {
  const pathname = usePathname();

  return (
    <aside className="w-56 shrink-0 bg-slate-900 text-slate-300 flex flex-col h-screen">
      <div className="flex items-center gap-2.5 px-5 py-5 border-b border-slate-700">
        <Brain size={22} className="text-brain-400" />
        <span className="font-bold text-white text-lg">Second Brain</span>
      </div>

      <nav className="flex-1 py-4 space-y-1 px-2">
        {NAV_ITEMS.map(({ href, label, icon: Icon }) => (
          <Link
            key={href}
            href={href}
            className={clsx(
              'flex items-center gap-3 px-3 py-2 rounded-lg text-sm font-medium transition-colors',
              pathname === href
                ? 'bg-brain-700 text-white'
                : 'hover:bg-slate-800 hover:text-white',
            )}
          >
            <Icon size={16} />
            {label}
          </Link>
        ))}
      </nav>

      <div className="px-4 py-4 border-t border-slate-700 text-xs text-slate-500">
        AI Second Brain v1.0.0
      </div>
    </aside>
  );
}
@ -0,0 +1,138 @@
/**
 * lib/api.ts — API client for the RAG backend.
 */

const API_BASE = process.env.NEXT_PUBLIC_API_URL || 'http://localhost:8000';

export interface SearchResult {
  document_id: string;
  chunk_id: string;
  title: string;
  path: string;
  content: string;
  score: number;
  tags: string[];
  highlight?: string;
}

export interface SearchResponse {
  results: SearchResult[];
  total: number;
  query_time_ms: number;
}

export interface Document {
  id: string;
  path: string;
  title: string;
  content: string;
  frontmatter: Record<string, unknown>;
  tags: string[];
  aliases: string[];
  word_count: number | null;
  created_at: string;
  updated_at: string;
  indexed_at: string | null;
}

export interface StatsResponse {
  total_documents: number;
  total_chunks: number;
  total_relations: number;
  total_tags: number;
  last_indexed: string | null;
  embedding_model: string;
  chat_model: string;
}

export interface TagCount {
  tag: string;
  count: number;
}

export interface GraphData {
  nodes: { id: string; title: string; path: string; tags: string[]; word_count: number | null }[];
  edges: { source: string; target: string; relation_type: string; label: string | null }[];
}

export async function search(query: string, tags?: string[], limit = 10): Promise<SearchResponse> {
  const res = await fetch(`${API_BASE}/api/v1/search`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, tags, limit, hybrid: true }),
  });
  if (!res.ok) throw new Error(`Search failed: ${res.status}`);
  return res.json();
}

export async function getDocument(id: string): Promise<Document> {
  const res = await fetch(`${API_BASE}/api/v1/document/${id}`);
  if (!res.ok) throw new Error(`Document not found: ${res.status}`);
  return res.json();
}

export async function getStats(): Promise<StatsResponse> {
  const res = await fetch(`${API_BASE}/api/v1/stats`);
  if (!res.ok) throw new Error('Stats fetch failed');
  return res.json();
}

export async function getTags(): Promise<TagCount[]> {
  const res = await fetch(`${API_BASE}/api/v1/tags`);
  if (!res.ok) throw new Error('Tags fetch failed');
  return res.json();
}

export async function getGraph(limit = 150): Promise<GraphData> {
  const res = await fetch(`${API_BASE}/api/v1/graph?limit=${limit}`);
  if (!res.ok) throw new Error('Graph fetch failed');
  return res.json();
}

/**
 * Streams a chat completion over Server-Sent Events.
 * Returns a cancel function that aborts the in-flight request.
 */
export function streamChat(
  message: string,
  contextLimit = 5,
  onToken: (token: string) => void,
  onSources: (sources: { title: string; path: string; score: number }[]) => void,
  onDone: () => void,
): () => void {
  const controller = new AbortController();

  (async () => {
    try {
      const res = await fetch(`${API_BASE}/api/v1/chat`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message, context_limit: contextLimit, stream: true }),
        signal: controller.signal,
      });

      if (!res.ok || !res.body) return;
      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        buffer += decoder.decode(value, { stream: true });
        // SSE frames are newline-delimited; keep any partial line in the buffer.
        const lines = buffer.split('\n');
        buffer = lines.pop() ?? '';

        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;
          try {
            const data = JSON.parse(line.slice(6));
            if (data.type === 'token') onToken(data.token);
            else if (data.type === 'sources') onSources(data.sources);
            else if (data.type === 'done') onDone();
          } catch {
            // Skip malformed SSE frames.
          }
        }
      }
    } catch (err: unknown) {
      if ((err as Error)?.name !== 'AbortError') console.error('Stream error:', err);
    }
  })();

  return () => controller.abort();
}
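The SSE framing that `streamChat` consumes — one `data: {json}` line per event, with `token`, `sources`, and `done` event types — can be illustrated with a tiny standalone parser. `parseSseLines` and `ChatEvent` below are illustrative names, not part of `lib/api.ts`:

```typescript
// Event shapes mirror the streamChat handlers: token / sources / done.
type ChatEvent =
  | { type: 'token'; token: string }
  | { type: 'sources'; sources: { title: string; path: string; score: number }[] }
  | { type: 'done' };

// Parse a chunk of SSE text into events: only "data: " lines count,
// and malformed frames are skipped, matching streamChat's behaviour.
export function parseSseLines(chunk: string): ChatEvent[] {
  const events: ChatEvent[] = [];
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data: ')) continue; // skip blanks and comments
    try {
      events.push(JSON.parse(line.slice(6)) as ChatEvent);
    } catch {
      // Ignore frames that are not valid JSON.
    }
  }
  return events;
}
```

In `streamChat` the same logic runs incrementally: the trailing partial line is held back in `buffer` until the next network chunk completes it.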
@ -0,0 +1,9 @@
/** @type {import('next').NextConfig} */
const nextConfig = {
  output: 'standalone',
  env: {
    NEXT_PUBLIC_API_URL: process.env.NEXT_PUBLIC_API_URL || 'http://localhost:8000',
  },
};

module.exports = nextConfig;
@ -0,0 +1,30 @@
{
  "name": "second-brain-ui",
  "version": "1.0.0",
  "private": true,
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint"
  },
  "dependencies": {
    "next": "14.2.3",
    "react": "^18.3.1",
    "react-dom": "^18.3.1",
    "lucide-react": "^0.378.0",
    "clsx": "^2.1.1",
    "swr": "^2.2.5"
  },
  "devDependencies": {
    "@types/node": "^20",
    "@types/react": "^18",
    "@types/react-dom": "^18",
    "typescript": "^5",
    "tailwindcss": "^3.4.3",
    "postcss": "^8",
    "autoprefixer": "^10",
    "eslint": "^8",
    "eslint-config-next": "14.2.3"
  }
}
@ -0,0 +1,6 @@
module.exports = {
  plugins: {
    tailwindcss: {},
    autoprefixer: {},
  },
};
@ -0,0 +1,22 @@
/** @type {import('tailwindcss').Config} */
module.exports = {
  content: [
    './app/**/*.{js,ts,jsx,tsx,mdx}',
    './components/**/*.{js,ts,jsx,tsx,mdx}',
  ],
  theme: {
    extend: {
      colors: {
        brain: {
          50: '#f0f4ff',
          100: '#dde6ff',
          500: '#6366f1',
          600: '#4f46e5',
          700: '#4338ca',
          900: '#1e1b4b',
        },
      },
    },
  },
  plugins: [],
};
@ -0,0 +1,21 @@
{
  "compilerOptions": {
    "target": "ES2017",
    "lib": ["dom", "dom.iterable", "esnext"],
    "allowJs": true,
    "skipLibCheck": true,
    "strict": true,
    "noEmit": true,
    "esModuleInterop": true,
    "module": "esnext",
    "moduleResolution": "bundler",
    "resolveJsonModule": true,
    "isolatedModules": true,
    "jsx": "preserve",
    "incremental": true,
    "plugins": [{ "name": "next" }],
    "paths": { "@/*": ["./*"] }
  },
  "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", ".next/types/**/*.ts"],
  "exclude": ["node_modules"]
}
@ -0,0 +1,52 @@
---
title: Welcome to Your Second Brain
tags: [getting-started, meta, second-brain]
aliases: [home, start-here]
date: 2026-03-05
---

# Welcome to Your Second Brain

This is your AI-powered personal knowledge management system.

## What is a Second Brain?

A **Second Brain** is an external, digital system for capturing, organising, and sharing ideas, insights, and information. The concept was popularised by [[Building a Second Brain]] by Tiago Forte.

By externalising your thinking, you:

- Free up mental RAM for creative work
- Build a personal knowledge base that compounds over time
- Make connections between ideas you might otherwise miss

## How This System Works

Your notes live in this Markdown vault — fully compatible with [[Obsidian]] and [[Logseq]].

The AI layer:

1. **Ingests** every Markdown file automatically
2. **Embeds** content into vector space for semantic search
3. **Links** related documents autonomously
4. **Tags** untagged documents using an LLM
5. **Summarises** long documents for quick reference

## Getting Started

1. Add Markdown files to this vault
2. Use `[[WikiLinks]]` to connect ideas
3. Add YAML frontmatter for structured metadata
4. Search your knowledge at `http://localhost:3000/search`
5. Chat with your notes at `http://localhost:3000/chat`

## Folder Structure

- `daily/` — Daily notes and journals
- `projects/` — Active projects
- `resources/` — Reference material and research
- `areas/` — Ongoing areas of responsibility
- `templates/` — Note templates

## Tips

- Use `#tags` inline or in frontmatter
- `[[WikiLinks]]` create automatic knowledge graph edges
- The AI agents run in the background — check the graph view after a few minutes
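The frontmatter, tag, and wikilink conventions above can be sketched in a single note. The note title, tags, and date below are illustrative examples, not files shipped with the vault:

```markdown
---
title: Example Project Note
tags: [projects, example]
aliases: [demo-note]
date: 2026-03-05
---

# Example Project Note

Related: [[Welcome to Your Second Brain]] and [[Building a Second Brain]].

Inline #tags also work, and every [[WikiLink]] becomes an edge in the graph view.
```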
@ -0,0 +1,20 @@
---
title: Daily Note Template
tags: [template, daily]
date: {{date}}
---

# {{date}}

## Focus

-

## Notes

## Links

-

## Reflections