Install Pauhu Sovereign AI

Eight containers. One system. Each container handles a specific function - from search to speech to answer generation. This guide takes you from a bare server to a fully operational sovereign AI.

1. Prerequisites

Hardware

Profile	CPU	RAM	Disk	Containers
Minimum	4 cores (x86_64 or ARM64)	16 GB	100 GB SSD	Core 5 containers
Recommended	8 cores	32 GB	250 GB NVMe	All 8 containers
Full + GPU	8+ cores + NVIDIA GPU	64 GB	500 GB NVMe	All 8 + GPU inference

No GPU required. All models are optimized for CPU inference. GPU accelerates answer generation but is not needed for search, translation, classification, or voice.
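A quick preflight check can tell you which hardware profile a server meets before you load the images. The sketch below is an illustration and not part of the Pauhu distribution: the function name `pauhu_profile` is hypothetical, the thresholds are copied from the table above, and the live readings assume a Linux host. It compares free disk space on `/`, which is stricter than the total disk size the table lists, and it does not detect GPUs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map (cores, RAM in GB, disk in GB) to a profile
# from the hardware table. GPU detection is deliberately left out.
pauhu_profile() {
  local cores=$1 ram_gb=$2 disk_gb=$3
  if [ "$cores" -ge 8 ] && [ "$ram_gb" -ge 32 ] && [ "$disk_gb" -ge 250 ]; then
    echo "recommended"
  elif [ "$cores" -ge 4 ] && [ "$ram_gb" -ge 16 ] && [ "$disk_gb" -ge 100 ]; then
    echo "minimum"
  else
    echo "insufficient"
  fi
}

# Read the live values (Linux only) and classify this host.
if [ -r /proc/meminfo ]; then
  cores=$(nproc)
  # MemTotal is reported in kB and is slightly below the installed amount,
  # so round to the nearest GB instead of truncating.
  ram_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1024 / 1024 + 0.5}' /proc/meminfo)
  # Free space on the root filesystem in GB (POSIX df output, 1 kB blocks).
  disk_gb=$(df -Pk / | awk 'NR == 2 {printf "%d", $4 / 1024 / 1024}')
  echo "${cores} cores, ${ram_gb} GB RAM, ${disk_gb} GB free: $(pauhu_profile "$cores" "$ram_gb" "$disk_gb")"
fi
```

A result of `minimum` corresponds to the core 5 containers; `recommended` corresponds to all 8.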

Software

Ubuntu 22.04 LTS or any Linux with Docker 24+ and Docker Compose v2
Ports 443 (HTTPS) and 8090 (gateway and admin panel) available on your LAN
Pauhu license key (provided with your subscription - email sales@pauhu.eu)

License agreement required. Use of the Pauhu Sovereign AI is governed by the Pauhu LDS Connector End User License Agreement. By deploying these containers you agree to the EULA terms. The containers are provided as binary images - source code is not included and reverse engineering is prohibited (EULA §3.2). Underlying EU institutional data is licensed under CC-BY 4.0 per Data Terms.

Air-gapped deployment: No internet is required after installation. The container images are delivered via SFTP, physical media, or your private container registry. See Sovereign AI §6 for delivery methods.

What is NOT required

No cloud accounts or API keys
No external database servers - databases are embedded in the containers
No configuration management tools (Ansible, Terraform) - Docker Compose handles everything
No DNS - accessible on your LAN by IP address. DNS is optional for HTTPS with your own certificate.

2. Quickstart

From a clean server to a running system in four commands. Allow 15 minutes on recommended hardware (10 minutes on NVMe with pre-bundled delivery).

# 1. Load the container images (from SFTP download or USB delivery)
docker load -i pauhu-sovereign-ai-v1.tar.gz

# 2. Create the data directory
mkdir -p /opt/pauhu/data

# 3. Start the system
docker compose -f docker-compose1.yaml -f docker-compose-pauhu.yaml \
  --profile production --profile pauhu up -d

# 4. Verify the containers are healthy (7 by default; 8 with the optional sovereign-llm profile)
docker compose -f docker-compose1.yaml -f docker-compose-pauhu.yaml ps

Expected output after step 4:

NAME               STATUS    PORTS
pauhu-compass      healthy   8060/tcp
pauhu-answer       healthy   8050/tcp
pauhu-nmt          healthy   8080/tcp
pauhu-specialist   healthy   8070/tcp
pauhu-tts          healthy   8000/tcp
pauhu-gateway      healthy   8090/tcp
pauhu-mcp          healthy   3100/tcp
pauhu-llm-adapter  healthy   8001/tcp   (optional, sovereign-llm profile)

Open http://<your-server>:8090 to access the search interface. Navigate to /pauhu for the admin panel.

First-start index build: On first launch, the search index takes 5–10 minutes to build. During this time, search queries may return empty results. Translation, TTS, and classification are available immediately.
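Because the first-start index build delays the healthy state, a scripted install may want to wait rather than poll by hand. The following is a minimal sketch under stated assumptions: `wait_healthy` and `container_health` are hypothetical helper names, the container names are assumed to match the NAME column in the expected output above, and the images are assumed to define a Docker health check. `docker inspect --format '{{.State.Health.Status}}'` is the standard Docker CLI way to read that status.

```shell
#!/usr/bin/env bash
# Report the Docker health status of one container
# ("starting", "healthy", or "unhealthy").
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null
}

# Hypothetical helper: wait until every named container is healthy.
# Usage: wait_healthy <timeout_seconds> <interval_seconds> <container>...
wait_healthy() {
  local timeout=$1 interval=$2 waited=0 name pending
  shift 2
  while :; do
    pending=""
    for name in "$@"; do
      [ "$(container_health "$name")" = "healthy" ] || pending="$pending $name"
    done
    if [ -z "$pending" ]; then
      echo "all healthy"
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "still not healthy:$pending" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Example (run on the Docker host, allowing 15 minutes):
# wait_healthy 900 10 pauhu-compass pauhu-answer pauhu-nmt pauhu-specialist \
#   pauhu-tts pauhu-gateway pauhu-mcp
```

The function returns non-zero on timeout and names the containers that are still pending, so it can gate later steps in a provisioning script.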

Quick verification

# Search for EU AI Act (via nginx reverse proxy)
curl -s "http://localhost/api/compass?q=artificial+intelligence+regulation" | head -20

# Translate to Finnish
curl -s -X POST http://localhost/api/translate \
  -H "Content-Type: application/json" \
  -d '{"text": "The regulation enters into force", "target": "fi"}'

# Ask a question (grounded answer)
curl -s -X POST http://localhost/api/answer \
  -H "Content-Type: application/json" \
  -d '{"message": "Does the EU AI Act apply to procurement systems?"}'

# Check container health
curl -s http://localhost/health/compass
curl -s http://localhost/health/answer
curl -s http://localhost/health/translate
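The three health calls above can be wrapped so that a script gets a single pass/fail result. This is a sketch only: `check_health` and `fetch` are hypothetical names, and the endpoint paths are the ones shown above. `curl -f` makes HTTP error responses count as failures, which plain `curl -s` does not.

```shell
#!/usr/bin/env bash
# Fetch a URL; fail on HTTP errors (-f) or after 5 seconds (--max-time).
fetch() {
  curl -fsS --max-time 5 "$1" >/dev/null
}

# Hypothetical helper: probe the three health endpoints used above and
# return non-zero if any of them fails.
# Usage: check_health <base_url>
check_health() {
  local base=$1 failed=0 svc
  for svc in compass answer translate; do
    if fetch "$base/health/$svc"; then
      echo "$svc ok"
    else
      echo "$svc FAILED"
      failed=1
    fi
  done
  return "$failed"
}

# Example: check_health http://localhost
```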

3. Architecture diagram

Each of the 8 containers handles a specific function. The diagram below shows how they connect.


  RETRIEVAL (comprehension)                GENERATION (synthesis)
  ┌─────────────────────────┐              ┌─────────────────────────┐
  │                         │              │                         │
  │  pauhu-compass          │              │  pauhu-answer           │
  │  (search engine)        │              │  (answer generation)    │
  │  Semantic search,       │              │  Model-agnostic,        │
  │  EU documents in 26ms   │              │  grounded answers       │
  │                         │              │                         │
  │  pauhu-specialist       │              │  pauhu-nmt              │
  │  (classification)       │              │  (translation)          │
  │  Domain classifiers     │              │  552 language pairs     │
  │                         │              │                         │
  │                         │              │  pauhu-tts              │
  │                         │              │  (speech)               │
  │                         │              │  Voice in 24 languages  │
  └────────────┬────────────┘              └────────────┬────────────┘
               │                                        │
               └──────────┐    ┌────────────────────────┘
                          │    │
                    ══════════════════
                    ║ pauhu-gateway  ║
                    ║   API GATEWAY  ║
                    ║ (request relay)║
                    ══════════════════
                          │
          ┌───────┐       │       ┌──────────┐
          │Docker │       │       │ pauhu-mcp│
          │volumes│  DATA STORE   │ (context)│
          │docs + │       │       │ IATE +   │
          │terms  │       │       │ EUR-Lex  │
          └───────┘       │       └──────────┘
                 ┌────────┴────────┐
                 │  VALIDATION     │
                 │  Safety checks  │
                 │  and sequencing │
                 │                 │
                 └────────┬────────┘
                          │
                 ┌────────┴────────┐
                 │  RUNTIME        │
                 │  Docker runtime │
                 └─────────────────┘

The 8 containers

pauhu-compass

Search engine

Semantic search across EU documents from 20 sources. Returns the exact paragraph in 26 milliseconds. Port 8060

pauhu-answer

Answer generation

Retrieval-augmented answer generation. Model-agnostic - swap the model, keep the grounding. Reads 3–10 retrieved passages and produces a grounded answer with citations. Port 8050

pauhu-nmt

Translation

Translation models in optimized format. 552 language pairs across all 24 EU official languages. CPU-only, no external API calls. Port 8080

pauhu-specialist

Domain classification

Domain specialist models for named entity recognition, regulatory classification, and compliance detection. Port 8070

pauhu-tts

Speech production

Text-to-speech engine in optimized format for all 24 EU official languages. Reads legislation aloud for accessibility compliance. Port 8000

pauhu-gateway

API gateway

Single entry point that routes requests to all services. Validation and safety checks on every request. SHA-256 audit trail. Admin panel at /pauhu. Port 8090

pauhu-mcp

Context server

MCP server with IATE (2.4M terms), EUR-Lex context, eForms BT fields. Powers the pauhu.ai VS Code extension and terminal CLI. Port 3100

pauhu-llm-adapter

Optional - sovereign LLM bridge

Model-agnostic bridge. Connect any LLM - bring your own weights, your own API, your own choice. We provide the grounded context. Port 8001

Container resources

Container	RAM	Disk	CPU	Required?
`pauhu-compass`	2–4 GB	5 GB	2 cores	Yes
`pauhu-answer`	2–6 GB	2 GB	2 cores	Yes
`pauhu-gateway`	0.5 GB	0.1 GB	1 core	Yes
`pauhu-specialist`	1–2 GB	5 GB	1 core	Yes
`pauhu-mcp`	0.25 GB	0.1 GB	0.5 core	Yes
`pauhu-nmt`	2–6 GB	15 GB	1 core	Recommended
`pauhu-tts`	0.5–2 GB	3 GB	1 core	Optional
`pauhu-llm-adapter`	2–8 GB	varies	1+ core	Optional

Minimum viable deployment: the 5 core containers (compass + answer + gateway + specialist + mcp) run on the Minimum hardware profile (4 cores, 16 GB RAM). Add NMT for translation, TTS for voice output, and llm-adapter for a sovereign LLM as needed. Running all 8 containers requires 32 GB.
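The 16 GB and 32 GB figures can be checked against the container resources table by adding up the upper RAM bound of each container. The snippet below is a worked example only; the variable names and the `sum_gb` helper are made up for illustration, and the numbers are copied from the table.

```shell
#!/usr/bin/env bash
# Upper RAM bounds in GB, copied from the container resources table.
core="compass:4 answer:6 gateway:0.5 specialist:2 mcp:0.25"
extra="nmt:6 tts:2 llm-adapter:8"

# Sum the numbers after each "name:" in the arguments.
sum_gb() {
  printf '%s\n' "$@" | awk -F: '{ total += $2 } END { print total }'
}

# The lists are left unquoted on purpose so each entry becomes one argument.
echo "core 5 containers, worst case: $(sum_gb $core) GB"
echo "all 8 containers, worst case:  $(sum_gb $core $extra) GB"
```

The worst-case totals are 12.75 GB and 28.75 GB, which leaves a few gigabytes for the operating system on 16 GB and 32 GB servers respectively.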

4. Model swap guide

All AI models are stored as optimized model files in the data volume. You can swap any model without rebuilding containers.

Where models live

/opt/pauhu/data/models/
├── answer/                  # Answer generation model
│   ├── encoder.model        # Encoder, INT8 quantized
│   ├── decoder.model        # Decoder with cross-attention
│   ├── tokenizer.model      # SentencePiece tokenizer
│   └── manifest.json        # SHA-256 checksums
├── specialist/              # Domain classifiers
│   ├── law.model
│   ├── environment.model
│   ├── procurement.model
│   ├── ... (18 more)
│   └── manifest.json
├── nmt/                     # Translation models (552 pairs)
│   ├── en-fi.model
│   ├── fi-en.model
│   ├── en-de.model
│   ├── ... (549 more)
│   └── manifest.json
└── tts/                     # Voice models (24 languages)
    ├── en.model
    ├── fi.model
    ├── ... (22 more)
    └── manifest.json

Swap a model

# 1. Stop the container that uses the model
docker compose stop pauhu-answer

# 2. Replace the model file
cp /path/to/new/encoder.model /opt/pauhu/data/models/answer/encoder.model
cp /path/to/new/decoder.model /opt/pauhu/data/models/answer/decoder.model

# 3. Update the manifest with new checksums
sha256sum /opt/pauhu/data/models/answer/*.model > /opt/pauhu/data/models/answer/manifest.json

# 4. Restart the container
docker compose start pauhu-answer

# 5. Verify the new model loads
curl -s http://localhost/health/answer | python3 -m json.tool

Swap a domain specialist

# Replace only the law domain model with a retrained version
docker compose stop pauhu-specialist
cp /path/to/law-v2.model /opt/pauhu/data/models/specialist/law.model
# Regenerate the manifest so the old law.model checksum is replaced, not duplicated
sha256sum /opt/pauhu/data/models/specialist/*.model > /opt/pauhu/data/models/specialist/manifest.json
docker compose start pauhu-specialist

Add a new translation pair

# Add Irish (ga) ↔ English (en) model
cp ga-en.model /opt/pauhu/data/models/nmt/ga-en.model
cp en-ga.model /opt/pauhu/data/models/nmt/en-ga.model
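# Add the new files to the manifest before restarting
sha256sum /opt/pauhu/data/models/nmt/*.model > /opt/pauhu/data/models/nmt/manifest.json
docker compose restart pauhu-nmt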

Integrity check: On startup, each container verifies the SHA-256 checksums in its manifest. If a model file does not match its manifest entry, the container will log a warning and refuse to load that model. Always update the manifest after replacing a model file.
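You can run the same kind of comparison on the host before restarting a container. This sketch assumes the manifest is in the plain `sha256sum` output format that the swap steps above write; the check the containers perform internally is not documented here and may differ. `verify_manifest` is a hypothetical helper name, and the demonstration uses throwaway files rather than real models.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check every file listed in a sha256sum-format manifest.
# Returns 0 when all files match; otherwise names the failing files and
# returns non-zero. Requires GNU coreutils sha256sum.
verify_manifest() {
  sha256sum --check --quiet "$1"
}

# Self-contained demonstration on throwaway files (not real models).
demo=$(mktemp -d)
printf 'weights-v1' > "$demo/encoder.model"
sha256sum "$demo"/*.model > "$demo/manifest.json"

verify_manifest "$demo/manifest.json" && echo "manifest matches"

# Replace the file without updating the manifest: the check now fails.
printf 'weights-v2' > "$demo/encoder.model"
verify_manifest "$demo/manifest.json" 2>/dev/null || echo "mismatch detected"
rm -rf "$demo"
```

On a real installation the call would be, for example, `verify_manifest /opt/pauhu/data/models/answer/manifest.json`.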

Bring your own LLM

The answer generation container is model-agnostic - the product is the retrieval-grounding-citation pattern, and the model is swappable. To add a sovereign LLM, enable the pauhu-llm-adapter container:

# Enable the sovereign LLM profile
docker compose -f docker-compose1.yaml -f docker-compose-pauhu.yaml \
  --profile production --profile pauhu --profile sovereign-llm up -d

# Configure in .env before starting the profile:
MODEL_PROVIDER=local              # or: openai-compatible
MODEL_NAME=your-model-name
MODEL_PATH=/models/your-model    # volume-mounted

The gateway routes LLM requests through the same evidence bridge - the LLM receives only verified passages from the compass container, not raw user input. Three adapter patterns: OpenAI-compatible API, local model, or edge inference.
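The `.env` lines above carry inline comments, so it can be worth confirming which value each key actually resolves to. Docker Compose's env-file syntax treats whitespace followed by `#` as the start of a comment for unquoted values; the sketch below mimics only that one rule. `env_get` is a hypothetical helper, not a Pauhu or Docker command, and it does not handle quoted values or escapes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read KEY from an env file, treating whitespace
# followed by "#" as the start of an inline comment (unquoted values only).
env_get() {
  local key=$1 file=$2
  sed -n "s/^${key}=//p" "$file" | tail -n 1 |
    sed -e 's/[[:space:]]\{1,\}#.*$//' -e 's/[[:space:]]*$//'
}

# Demonstration on a throwaway copy of the lines above.
envfile=$(mktemp)
cat > "$envfile" <<'EOF'
MODEL_PROVIDER=local              # or: openai-compatible
MODEL_NAME=your-model-name
MODEL_PATH=/models/your-model    # volume-mounted
EOF

for key in MODEL_PROVIDER MODEL_NAME MODEL_PATH; do
  echo "$key=[$(env_get "$key" "$envfile")]"
done
rm -f "$envfile"
```

The brackets in the output make stray trailing spaces visible. For the authoritative view of what Compose resolves, `docker compose config` prints the fully interpolated configuration.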

5. Admin panel

The admin panel (http://<your-server>/pauhu) is served by pauhu-gateway and provides a web interface for managing your Pauhu installation.

Dashboard

The main dashboard shows real-time status of all 8 containers:

Health status: green (healthy), yellow (degraded), red (down) for each container
Resource usage: CPU, RAM, and disk for each container
Query metrics: queries per minute, average response time, cache hit rate
Data freshness: age of the most recent document in each of the 20 data sources

Query logs

Every query is logged locally with a SHA-256 audit hash. The admin panel lets you:

Search query history by date range, user, or content
View the complete audit trail for any query: which documents were retrieved, which passages were used, and what the answer generation engine produced
Export audit logs in JSON or CSV format for your compliance team

Container management

Start/Stop: start or stop individual containers without affecting others
Restart: restart a container (e.g., after a model swap)
Logs: view real-time logs for any container
Version: see the current model version and SHA-256 checksum for each container

Access control

The admin panel requires authentication. On first launch, it generates a random admin password and prints it to the container logs:

# View the initial admin password
docker compose logs pauhu-gateway | grep "Admin password"

# Change the admin password
curl -X POST http://localhost/pauhu/api/admin/password \
  -H "Authorization: Bearer <current-password>" \
  -H "Content-Type: application/json" \
  -d '{"new_password": "your-secure-password"}'

Restrict access: The admin panel is served at /pauhu on the gateway (port 8090). In production, use your firewall or nginx rules to restrict the /pauhu path to your management network only.

6. Feed subscriptions

The cloud version of Pauhu receives continuous updates from 20 EU institutional sources via automated sync. In a sovereign deployment, you control when and how data updates arrive.

Update methods

Method	How it works	Best for
Automatic (connected)	Server connects to Pauhu's EU update endpoint on a schedule you define. Downloads only new and changed documents since last sync.	Servers with internet access. Recommended for most installations.
Manual (SFTP)	Download a data update package from your Pauhu account. Transfer it to the server via SFTP. Apply with one command.	Restricted networks where outbound connections are controlled.
Air-gapped (physical)	Data update package delivered on encrypted media. Load onto the server via USB. Apply with one command.	Classified environments with no network access.

Configure automatic updates

# In /opt/pauhu/.env
PAUHU_LICENSE_KEY=your-license-key
PAUHU_UPDATE_ENDPOINT=https://pauhu.eu/v1/sovereign
PAUHU_UPDATE_SCHEDULE=0 2 * * 1    # Weekly, Monday at 02:00
PAUHU_UPDATE_SOURCES=all            # Or: eurlex,ted,curia (comma-separated)

The compass container checks for updates on the schedule you define. It downloads only delta packages (new and modified documents), verifies SHA-256 checksums, and applies them to the local databases. The search indexes are rebuilt automatically.
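The schedule value appears to follow the standard five-field cron layout (minute, hour, day of month, month, day of week); that is an assumption based on the example `0 2 * * 1` and its "Weekly, Monday at 02:00" comment. The hypothetical helper below only labels the fields, so you can sanity-check a schedule before putting it in `.env`; it does not validate ranges.

```shell
#!/usr/bin/env bash
# Split a five-field cron expression into labelled fields.
# The unquoted expansion is intentional; "set -f" stops "*" from globbing.
describe_cron() {
  set -f
  set -- $1
  set +f
  [ $# -eq 5 ] || { echo "expected 5 fields, got $#" >&2; return 1; }
  echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
}

describe_cron "0 2 * * 1"
```

For the example schedule this labels minute 0, hour 2, and day-of-week 1, which in cron numbering is Monday.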

Apply a manual update

# Transfer the update package to the server
scp pauhu-update-2026-03.tar.gz admin@your-server:/opt/pauhu/updates/

# Apply the update
docker exec pauhu-compass /update /updates/pauhu-update-2026-03.tar.gz

# Verify the update
docker exec pauhu-compass /health/data-freshness

Subscribe to specific sources

You can subscribe to all 20 sources or a subset. Configure in the admin panel under Settings → Feed Subscriptions, or via the .env file:

Source	Identifier	Update frequency (cloud)
EUR-Lex	`eurlex`	Every 4 hours (weekdays)
TED	`ted`	Every 6 hours
National Law	`lex`	Daily
CURIA	`curia`	Daily
OEIL	`oeil`	Every 4 hours
IATE	`iate`	Daily
ECB	`ecb`	Daily
EMA	`ema`	Daily
EPO*	`epo`	Daily
ECHA	`echa`	Weekly
All 20 sources	`all`	Mixed (see above)

* EPO patent data requires a separate Data Use Agreement with the European Patent Office. Contact sales@pauhu.eu for status.

Delta updates only. The system never re-downloads the entire dataset. After the initial delivery (4.8M documents), updates contain only new and modified documents. A typical weekly update is 50–200 MB.
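For bandwidth or media planning, the weekly figure translates into a yearly transfer volume as follows. This is plain arithmetic on the 50–200 MB range quoted above, assuming one update per week; it says nothing about how much the on-disk databases grow. `yearly_mb` is a name chosen for this example.

```shell
#!/usr/bin/env bash
# Yearly download volume in MB for a given weekly delta size in MB.
yearly_mb() {
  echo $(( $1 * 52 ))
}

echo "low estimate:  $(yearly_mb 50) MB per year"
echo "high estimate: $(yearly_mb 200) MB per year"
```

That is roughly 2.6 to 10.4 GB of update packages per year.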

7. VS Code extension

The Pauhu VS Code extension connects your IDE to the Pauhu server. It provides EU regulatory context, terminology lookup, and compliance checks directly in your editor - useful for policy drafting, legislative analysis, and procurement document preparation.

Install

# From the VS Code marketplace
code --install-extension pauhu.pauhu-eu

# Or from the .vsix file (air-gapped install)
code --install-extension /path/to/pauhu-eu-1.0.0.vsix

Configure

Open VS Code settings (Ctrl+,) and search for pauhu:

{
  "pauhu.serverUrl": "http://your-server",
  "pauhu.apiKey": "",
  "pauhu.language": "en",
  "pauhu.showTerminology": true,
  "pauhu.showClassification": true
}

Setting	Default	Description
`pauhu.serverUrl`	`http://localhost`	URL of your Pauhu server (nginx reverse proxy)
`pauhu.apiKey`	(empty)	API key from the admin panel. Leave empty if your server allows unauthenticated access on LAN.
`pauhu.language`	`en`	Default language for terminology lookup. Any of the 24 EU official language codes.
`pauhu.showTerminology`	`true`	Show IATE terminology annotations inline
`pauhu.showClassification`	`true`	Show domain classification in the status bar

Features

IATE terminology lookup: Hover over a legal term to see its definition and translations in all 24 EU languages, drawn from 2.4 million verified entries
Domain classification: The status bar shows which of the 21 EU policy domains your current document relates to
Search panel: Ctrl+Shift+P → Pauhu: Search EU Sources to search EU documents from inside your editor
Chat panel: Ctrl+Shift+P → Pauhu: Ask a Question to get a grounded answer with citations
Compliance check: Ctrl+Shift+P → Pauhu: Check Compliance to scan your current file for regulatory references and verify they are up to date
Translate selection: Select text, right-click, Pauhu: Translate to... to translate using the translation models running on your server

MCP integration: The VS Code extension also supports the MCP Sovereign Mode protocol. If you use AI coding assistants that support MCP (e.g., Claude Code, GitHub Copilot), Pauhu provides 4 MCP tools for EU regulatory context.

8. Troubleshooting

Container won't start

# Check container logs
docker compose logs pauhu-compass --tail 50

# Check available disk space
df -h /opt/pauhu/data

# Check available memory
free -h

# Verify the container image is loaded
docker images | grep pauhu

Symptom	Cause	Fix
Container exits immediately	Insufficient RAM	Check `docker compose logs <container>` for OOM messages. Increase RAM or stop non-essential containers.
Model integrity check failed	Model file corrupted or manifest mismatch	Re-copy the model file and update the manifest. See Section 4.
Port already in use	Another service on the same port	Edit the port mappings in your Compose files (`docker-compose1.yaml`, `docker-compose-pauhu.yaml`) or stop the conflicting service.
Search returns no results	Search index still building	Wait for `pauhu-compass` to reach "healthy" status. Initial index build takes 5–10 minutes on first start.
Translation timeout	Language pair model not loaded	Check `docker compose logs pauhu-nmt`. Verify the language pair model file exists in `/opt/pauhu/data/models/nmt/`.

Performance tuning

Issue	Tuning
Search is slow (>100ms)	Increase `pauhu-compass` RAM to 4 GB. The search index is loaded into memory on startup - more RAM means more of the index is cached.
Answer generation is slow (>5s)	Answer generation runs on CPU by default. For faster inference, add a GPU and set `PAUHU_ANSWER_DEVICE=cuda` in `.env`. Alternatively, reduce the number of passages with `PAUHU_ANSWER_TOP_K=3` (default: 5).
High disk I/O	Move `/opt/pauhu/data` to NVMe storage. The compass container performs frequent reads during search index lookups.
Memory pressure	Stop TTS if not needed (`docker compose stop pauhu-tts`). Translation models can be restricted to a subset of language pairs by setting `PAUHU_NMT_LANGUAGES=en,fi,de,fr,sv`.
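To estimate how much a language subset reduces the translation footprint, count the directed pairs: each of n languages translates into the other n - 1, giving n × (n - 1) pairs. With all 24 languages that is the 552 pairs quoted earlier. The sketch below assumes that `PAUHU_NMT_LANGUAGES` loads every pair among the listed languages, which this guide does not state explicitly; `pair_count` and `lang_count` are illustrative names.

```shell
#!/usr/bin/env bash
# Count directed translation pairs among n languages: n * (n - 1).
pair_count() {
  echo $(( $1 * ($1 - 1) ))
}

# Count the languages in a comma-separated list such as PAUHU_NMT_LANGUAGES.
lang_count() {
  local IFS=,
  set -- $1
  echo $#
}

echo "all 24 languages: $(pair_count 24) pairs"
subset="en,fi,de,fr,sv"
echo "$subset: $(pair_count "$(lang_count "$subset")") pairs"
```

Under that assumption, the five-language example loads 20 pairs instead of 552.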

Verify data integrity

# Check all model manifests
docker exec pauhu-answer /verify-integrity
docker exec pauhu-specialist /verify-integrity
docker exec pauhu-nmt /verify-integrity
docker exec pauhu-tts /verify-integrity

# Check data integrity (document count, index status)
docker exec pauhu-compass /verify-integrity

# Full system health report
curl -s http://localhost/health/all | python3 -m json.tool

Reset to factory state

This deletes all query logs and custom configurations. Model files and data are preserved.

# Stop all containers
docker compose down

# Remove configuration (keeps models and data)
rm -rf /opt/pauhu/data/config /opt/pauhu/data/logs

# Restart
docker compose up -d

Get help

Technical support: support@pauhu.eu
On-site deployment: Available for EU government customers. Contact sales@pauhu.eu.
Related documentation:

Sovereign AI - architecture overview and procurement specifications
Answer Generation Architecture - technical deep-dive into the answer generation engine
MCP Sovereign Mode - IDE integration protocol reference
Government Procurement Demo - 10-step walkthrough
The Guide vs. the Encyclopedia - why Pauhu exists