Alexandria User Guide¶

What is Alexandria?¶

Alexandria is a personal knowledge engine. It collects information from your sources — papers, repos, blogs, notes, conversations — stores it locally, and makes it searchable by you and your AI coding agents via MCP.

Think of it as: your second brain, queryable by AI.

Getting Started¶

1. Install¶

pip install "alexandria-wiki[all]"

This installs the core engine plus PDF and YouTube support. If you only want the basics:

pip install alexandria-wiki

2. Initialize¶

alxia init

This creates ~/.alexandria/ with: - A SQLite database (state.db) - A global workspace with raw/ and wiki/ directories - A config file (config.toml)

Check it worked:

alxia status
alxia doctor

3. Connect to Claude Code¶

# Register Alexandria as an MCP server (one-time)
alxia mcp install claude-code

# Install hooks to auto-capture conversations (one-time)
alxia hooks install claude-code

Restart Claude Code. You now have 20 Alexandria tools available (mcp__alexandria__search, mcp__alexandria__ingest, mcp__alexandria__query, etc.).

To verify:

alxia mcp status
alxia hooks verify claude-code

4. Create a workspace for your project¶

alxia project create ml-research --description "Machine learning papers and notes"
alxia workspace use ml-research

Workspaces isolate knowledge by project. The global workspace is for general knowledge. Each project workspace has its own raw/, wiki/, and source configurations.

Ingesting Knowledge¶

The ingest command handles everything: files, directories, URLs, git repos, and conversations.

From files¶

# Markdown, text
alxia ingest ~/notes/transformer-notes.md

# PDF (arxiv papers, books, slides)
alxia ingest ~/Downloads/attention-is-all-you-need.pdf

# Code files (Python, TypeScript, Rust, Go, Terraform, Ansible, YAML)
# AST extraction produces structured beliefs automatically
alxia ingest ~/project/main.py --topic myproject

From URLs¶

# Web pages
alxia ingest https://lilianweng.github.io/posts/2023-06-23-agent/

# Arxiv papers
alxia ingest https://arxiv.org/abs/2407.09450

# Wikipedia
alxia ingest https://en.wikipedia.org/wiki/Retrieval-augmented_generation

From directories and git repos¶

# Local directory (walks tree, ingests all supported files)
alxia ingest ./my-project --topic myproject

# Git repo URL (shallow clones, then ingests)
alxia ingest https://github.com/owner/repo.git

# GitHub shorthand
alxia ingest owner/repo

From conversations¶

# Ingest a Claude Code session transcript
# Captures the conversation AND ingests all referenced artifacts (papers, repos)
alxia ingest ~/.claude/projects/*/session-id.jsonl --topic research

# Ingest all sessions from a project directory
alxia ingest ~/.claude/projects/-home-me-myproject/ --topic research

From continuous sources¶

These are sources that update over time. Add them once, then alxia sync pulls new content.

# RSS/Atom feeds (blogs, substacks)
alxia source add rss --name simon-willison --feed-url "https://simonwillison.net/atom/everything/"
alxia source add rss --name ai-news --feed-url "https://buttondown.com/ainews/rss"

# Git repositories (tracks commits as events)
alxia source add git-local --name my-project --repo-url ~/workspace/my-project

# GitHub issues, PRs, releases
alxia source add github --name my-repo --owner myorg --repo myrepo --token-ref github-pat

# YouTube transcripts
alxia source add youtube --name lectures --urls "https://youtu.be/video1,https://youtu.be/video2"

# HuggingFace model cards
alxia source add huggingface --name models --repos "meta-llama/Llama-3-8b,mistralai/Mistral-7B"

# Obsidian vault or any folder (auto-discovers file types)
alxia source add folder --name obsidian --path ~/Documents/ObsidianVault

# Zip/tar archives
alxia source add archive --name papers --path ~/Downloads/conference-papers.zip

# Notion pages
alxia source add notion --name team-wiki --token-ref notion-key --page-ids "abc-123,def-456"

# Email newsletters
alxia source add imap --name newsletters --imap-host imap.gmail.com --imap-user me@gmail.com --imap-pass-ref gmail-app-password --from-allowlist "*@substack.com"

# Pull everything
alxia sync

# Check what came in
alxia subscriptions list

Quick capture¶

For quick notes and thoughts:

alxia paste --title "meeting notes" --content "Decided to use SQLite instead of Postgres for simplicity"

Querying Your Knowledge¶

From the CLI¶

# Search across everything (documents, beliefs, events, subscriptions)
alxia query "attention mechanisms"

# Ask why something is believed (traces provenance to sources)
alxia why "transformer"

# JSON output for scripting
alxia query "RAG vs fine-tuning" --json

From Claude Code (via MCP)¶

This is where Alexandria really shines. If you followed step 3 above, Claude Code already has access. Examples of what to ask:

"What do my notes say about transformers?" — Claude calls search and read
"Ingest this paper: https://arxiv.org/abs/2407.09450" — Claude calls ingest
"Ingest my project at ./backend" — Claude calls ingest on the directory
"What changed in my-project this week?" — Claude calls events and git_log
"Why do I believe X?" — Claude calls why to trace belief provenance
"What do you know?" — Claude calls query, Alexandria answers with self-knowledge
"Add a belief: Transformers use self-attention" — Claude calls belief_add

The MCP server exposes 20 tools:

Category	Tools
Navigate	`guide`, `overview`, `list`, `search`, `grep`, `read`, `follow`
History	`history`, `why`, `timeline`, `events`
Git	`git_log`, `git_show`, `git_blame`
Sources	`sources`, `subscriptions`
Write	`ingest`, `query`, `belief_add`, `belief_supersede`

Pinned vs Open mode¶

By default, the MCP server runs in open mode — all workspaces are accessible. For project-specific use:

# Pin to one workspace in the project's .mcp.json
alxia mcp install claude-code --workspace ml-research

Managing Secrets¶

For sources that need authentication (GitHub tokens, Notion keys, IMAP passwords):

# Store a secret (prompted for value)
alxia secrets set github-pat

# Use it in a source via --token-ref
alxia source add github --name repo --owner org --repo name --token-ref github-pat

# List stored secrets (never shows values)
alxia secrets list

# Rotate a secret
alxia secrets rotate github-pat

Secrets are encrypted with AES-256-GCM. The vault passphrase comes from the ALEXANDRIA_VAULT_PASSPHRASE environment variable or your OS keyring.

Wiki Health¶

# Check for stale citations, missing sources, orphaned beliefs
alxia lint

# Deduplicate beliefs and remove orphans (beliefs whose wiki page was deleted)
alxia beliefs cleanup --dry-run    # preview
alxia beliefs cleanup              # apply

# Run quality metrics
alxia eval run

# Generate a weekly digest from recent activity
alxia synthesize --dry-run    # preview
alxia synthesize              # generate

Deduplication¶

Alexandria automatically skips re-ingesting files whose content hasn't changed (content-hash check). When a file IS re-ingested with new content, existing beliefs are superseded and identical beliefs are restored — no duplicates accumulate.

For manual cleanup of accumulated duplicates or orphaned beliefs:

alxia beliefs cleanup -w global

Background Daemon¶

For continuous syncing without manual alxia sync:

# Start (foreground, for testing)
alxia daemon start --foreground

# Start (background)
alxia daemon start

# Check status
alxia daemon status

# Stop
alxia daemon stop

The daemon runs three jobs: - Source syncs every 5 minutes - Subscription polls every hour - Capture queue draining every 60 seconds (auto-processes conversation captures)

Capturing Conversations¶

Alexandria auto-captures your AI coding sessions when hooks are installed:

# Install hooks (one-time, already done if you followed Getting Started)
alxia hooks install claude-code

# Verify hooks are active
alxia hooks verify claude-code

When a Claude Code session ends, the Stop hook fires and: 1. Parses the JSONL transcript 2. Converts to a searchable wiki page 3. Extracts referenced artifacts (papers, repos, URLs) 4. Ingests each artifact through the full pipeline

You can also manually ingest conversations:

# Ingest a specific session
alxia ingest ~/.claude/projects/*/session-id.jsonl --topic research

# Ingest all sessions from a project
alxia ingest ~/.claude/projects/-home-me-myproject/ --topic research

# Or use the capture command directly
alxia capture conversation ~/path/to/session.jsonl --client claude-code
alxia captures    # list captured sessions

Workspaces¶

alxia project create paper-review --description "Reviewing ICML submissions"
alxia workspace use paper-review
alxia workspace list
alxia workspace current

# Switch back
alxia workspace use global

Each workspace is isolated: separate raw/, wiki/, sources, beliefs, and events.

Docker¶

docker run -v ~/.alexandria:/data ghcr.io/epappas/alexandria init
docker run -v ~/.alexandria:/data ghcr.io/epappas/alexandria status
docker run -v ~/.alexandria:/data ghcr.io/epappas/alexandria ingest https://arxiv.org/pdf/2401.12345.pdf

Typical Workflow¶

Setup (once): alxia init && alxia mcp install claude-code && alxia hooks install claude-code
Morning: alxia sync pulls overnight RSS, GitHub activity, git commits
During work: alxia ingest papers, repos, articles, notes as you find them
In Claude Code: ask questions — Claude searches your Alexandria knowledge via MCP tools
After sessions: conversations auto-captured by hooks, referenced papers/repos ingested
Weekly: alxia synthesize generates a digest of what happened
Querying: alxia query "What do I know about X?" for grounded answers from your knowledge base
Maintenance: alxia lint checks wiki health, alxia eval run tracks quality

LLM Configuration¶

Alexandria uses an LLM for query answering, content summarization, and belief extraction during ingest. It auto-detects providers in this order:

Claude Max/Pro subscription (via Claude Code SDK) — no API key needed, works automatically if claude CLI is installed
ANTHROPIC_API_KEY — direct Anthropic API
OPENAI_API_KEY — OpenAI / GPT models
OPENROUTER_API_KEY — OpenRouter (any model)
GOOGLE_API_KEY — Google Gemini
config.toml [llm] section — local models (Ollama, vLLM, SGLang)

Without an LLM, Alexandria still works for ingest (mechanical extraction), search (FTS5), and code analysis (AST). Query and belief extraction require an LLM.