API Reference
Working endpoints for IATE terminology, per-product search across 24 EU data sources, SSE chat, translation, document extraction, pricing, and production health checks.
Base URL: https://staging.pauhu.eu (Pauhu® EU API — staging)
Authentication: Authorization: Bearer pk_...
Without a key, requests run in trial mode (3 requests/day, OAuth login required).
Transport security: All API traffic is encrypted with TLS 1.3. Clients supporting post-quantum TLS (Chrome 124+, Firefox 128+) automatically negotiate hybrid X25519Kyber768 key exchange on Cloudflare's edge.
API key storage: Keys are hashed with SHA-256 before storage. Full keys are shown only once at creation time. Rotate keys via the dashboard if compromised.
Rate limiting: Tier-based. Free: 3 requests/day. Paid tiers have per-endpoint limits documented in your dashboard. Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) are included in every response.
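The rate limit headers can be read on the client to decide when to back off. The following is a minimal sketch, not part of the API itself: `readRateLimit` is a hypothetical helper, it accepts any object exposing `get(name)` (such as a fetch `Headers` instance), and it assumes the header values are decimal integers. The unit of `X-RateLimit-Reset` is not stated in this reference, so the value is passed through unchanged.

```javascript
// Hypothetical helper: reads the documented rate limit headers from a
// fetch-style Headers object (anything exposing get(name)).
function readRateLimit(headers) {
  const toInt = (name) => {
    const raw = headers.get(name);
    if (raw === null || raw === undefined) return null;
    const n = Number.parseInt(raw, 10);
    return Number.isNaN(n) ? null : n;
  };
  const limit = toInt('X-RateLimit-Limit');
  const remaining = toInt('X-RateLimit-Remaining');
  return {
    limit,
    remaining,
    // Assumption: the unit of the reset value is undocumented here,
    // so it is returned as a plain integer without interpretation.
    reset: toInt('X-RateLimit-Reset'),
    exhausted: remaining !== null && remaining <= 0,
  };
}
```

With a fetch response, this would be called as `readRateLimit(response.headers)`.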
IATE Terminology LIVE
2,456,445 terms across 24 EU official languages, sourced from IATE (Interactive Terminology for Europe), the EU’s terminology database. All responses are JSON.
Exact term lookup
| Parameter | Type | Required | Description |
|---|---|---|---|
| term | string | Yes | Term to look up |
| lang | string | No | ISO 639-1 language code (e.g. fi, en, de). Searches all languages if omitted. |
curl "https://staging.pauhu.eu/iate/lookup?term=tietosuoja&lang=fi"
Response:
{
"query": "tietosuoja",
"lang": "fi",
"found": true,
"count": 3,
"results": [
{
"concept_id": "1109095",
"term": "tietosuoja",
"lang": "fi",
"reliability": 4,
"domain": "12",
"translations": {
"en": "data protection",
"de": "Datenschutz",
"fr": "protection des donn\u00e9es",
"sv": "dataskydd"
}
}
]
}
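The lookup call can also be assembled in code, which avoids hand-encoding non-ASCII terms. This is a sketch under stated assumptions: `buildLookupUrl` and `firstTranslation` are hypothetical helper names, and the response shape is taken from the sample above.

```javascript
const BASE_URL = 'https://staging.pauhu.eu';

// Hypothetical helper: builds the lookup URL and lets URLSearchParams
// handle percent-encoding of the term.
function buildLookupUrl(term, lang) {
  if (typeof term !== 'string' || term.trim() === '') {
    throw new Error('term is required');
  }
  const url = new URL('/iate/lookup', BASE_URL);
  url.searchParams.set('term', term);
  if (lang) url.searchParams.set('lang', lang); // omit to search all languages
  return url.toString();
}

// Hypothetical helper: returns the translation of the first result
// into targetLang, or null when nothing was found.
function firstTranslation(response, targetLang) {
  if (!response.found || response.results.length === 0) return null;
  return response.results[0].translations[targetLang] ?? null;
}
```

The result of `buildLookupUrl('tietosuoja', 'fi')` can be passed directly to `fetch`.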
Fuzzy search
| Parameter | Type | Required | Description |
|---|---|---|---|
| q | string | Yes | Search query (fuzzy matching) |
| lang | string | No | ISO 639-1 language code. Searches all languages if omitted. |
| limit | integer | No | Max results. Default: 20. Max: 100. |
curl "https://staging.pauhu.eu/iate/search?q=personal%20data&lang=en&limit=5"
Response:
{
"query": "personal data",
"lang": "en",
"count": 5,
"results": [
{
"concept_id": "1441228",
"term": "personal data",
"lang": "en",
"reliability": 4,
"domain": "12",
"score": 0.97
},
{
"concept_id": "3570685",
"term": "personal data processing",
"lang": "en",
"reliability": 3,
"domain": "12",
"score": 0.84
}
]
}
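Because fuzzy results carry both a match `score` and a `reliability` rating, a client will often want to keep only strong matches. A minimal sketch, assuming the response shape shown above; `filterMatches` and its default thresholds are illustrative choices, not API behaviour.

```javascript
// Hypothetical helper: keeps matches at or above both thresholds,
// sorted by descending score. The input response is not modified.
function filterMatches(response, { minScore = 0.9, minReliability = 3 } = {}) {
  return response.results
    .filter((r) => r.score >= minScore && r.reliability >= minReliability)
    .sort((a, b) => b.score - a.score);
}
```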
Entry by concept ID
| Parameter | Type | Required | Description |
|---|---|---|---|
| conceptId | string | Yes | IATE concept identifier (path parameter) |
curl "https://staging.pauhu.eu/iate/entry/1109095"
Response:
{
"concept_id": "1109095",
"domain": "12",
"reliability": 4,
"translations": {
"bg": "...",
"cs": "ochrana dat",
"da": "databeskyttelse",
"de": "Datenschutz",
"el": "...",
"en": "data protection",
"es": "protecci\u00f3n de datos",
"et": "andmekaitse",
"fi": "tietosuoja",
"fr": "protection des donn\u00e9es",
"ga": "cosaint sonra\u00ed",
"hr": "za\u0161tita podataka",
"hu": "adatv\u00e9delem",
"it": "protezione dei dati",
"lt": "duomen\u0173 apsauga",
"lv": "datu aizsardz\u012bba",
"mt": "protezzjoni tad-data",
"nl": "gegevensbescherming",
"pl": "ochrona danych",
"pt": "prote\u00e7\u00e3o de dados",
"ro": "protec\u021bia datelor",
"sk": "ochrana \u00fadajov",
"sl": "varstvo podatkov",
"sv": "dataskydd"
}
}
Language stats
curl "https://staging.pauhu.eu/iate/languages"
Response:
{
"languages": 24,
"codes": ["bg","cs","da","de","el","en","es","et","fi","fr","ga","hr","hu","it","lt","lv","mt","nl","pl","pt","ro","sk","sl","sv"],
"counts": {
"en": 312847,
"fr": 298412,
"de": 287631,
"fi": 184205
}
}
Statistics
curl "https://staging.pauhu.eu/iate/stats"
Response:
{
"total": 2456445,
"languages": 24,
"reliability": "4-star",
"byLanguage": [
{"lang": "en", "count": 312847},
{"lang": "fr", "count": 298412},
{"lang": "de", "count": 287631}
],
"source": "IATE"
}
Quality Dashboard
curl "https://staging.pauhu.eu/iate/quality"
Returns comprehensive quality metrics across all 2.4M terms: language coverage, reliability distribution, and domain breakdown.
Response:
{
"total_terms": 2456445,
"total_languages": 24,
"language_stats": [
{"lang": "en", "total": 312847, "reliable": 287631},
{"lang": "fr", "total": 298412, "reliable": 271003}
],
"top_domains": [
{"domain": "electronics", "count": 45231},
{"domain": "medical", "count": 38992}
],
"source": "IATE — Inter-Active Terminology for Europe"
}
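One way to use the `language_stats` array is to compute a per-language ratio of `reliable` to `total`. This sketch assumes `reliable` counts the terms in that language that meet a reliability threshold, which this reference does not define; `reliableShare` is a hypothetical helper.

```javascript
// Hypothetical helper: maps each language code to reliable / total.
// Assumption: "reliable" is a subset count of "total".
function reliableShare(quality) {
  const shares = {};
  for (const { lang, total, reliable } of quality.language_stats) {
    shares[lang] = total > 0 ? reliable / total : 0;
  }
  return shares;
}
```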
Per-Product Search LIVE
Hybrid BM25 + semantic search across 24 data sources. Each product has a dedicated index. Results are scoped by domain ACL.
List available products
curl "https://staging.pauhu.eu/v1/search"
Response:
{
"products": [
"code", "commission", "consilium", "cordis", "curia", "dataeuropa",
"dpp", "ecb", "echa", "ema", "epo", "europarl", "eurlex",
"eurostat", "iate", "lex", "news", "oeil", "osm", "publications",
"ted", "weather", "whoiswho", "wiki"
],
"count": 24,
"domain": "pauhu.eu",
"search_url": "/v1/search/:product?q=QUERY"
}
Search a product
| Parameter | Type | Required | Description |
|---|---|---|---|
| product | string | Yes | Product identifier (path parameter). One of the 24 products, subject to domain scoping. |
| q | string | Yes | Search query |
| lang | string | No | ISO 639-1 language code. Default: en |
| limit | integer | No | Max results. Default: 10. Max: 50. |
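The parameters above can be combined into a request URL in code. The sketch below is illustrative only: `buildSearchUrl` is a hypothetical helper, it clamps `limit` to the documented maximum of 50, and the lower bound of 1 is an assumption. `URLSearchParams` encodes spaces as `+` and non-ASCII characters as UTF-8 percent escapes.

```javascript
const BASE_URL = 'https://staging.pauhu.eu';
const MAX_LIMIT = 50; // documented maximum for per-product search

// Hypothetical helper: builds a per-product search URL.
function buildSearchUrl(product, query, { lang = 'en', limit = 10 } = {}) {
  if (typeof query !== 'string' || query.trim() === '') {
    throw new Error('q is required');
  }
  const requested = Number.isFinite(limit) ? Math.trunc(limit) : 10;
  // Assumption: values below 1 are not meaningful, so clamp to [1, 50].
  const clamped = Math.min(Math.max(1, requested), MAX_LIMIT);
  const url = new URL(`/v1/search/${encodeURIComponent(product)}`, BASE_URL);
  url.searchParams.set('q', query);
  url.searchParams.set('lang', lang);
  url.searchParams.set('limit', String(clamped));
  return url.toString();
}
```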
Legislative
# EUR-Lex — Is deploying a chatbot regulated under the EU AI Act?
curl "https://staging.pauhu.eu/v1/search/eurlex?q=high-risk+artificial+intelligence+systems+chatbot&lang=en&limit=3"
Response:
{
"product": "eurlex",
"query": "high-risk artificial intelligence systems chatbot",
"results": [
{
"id": "32024R1689",
"title": "Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (AI Act)",
"date": "2024-07-12",
"modality": "obligation",
"in_force": true,
"score": 0.96,
"url": "https://eur-lex.europa.eu/eli/reg/2024/1689/oj"
},
{
"id": "32022R2065",
"title": "Regulation (EU) 2022/2065 on a Single Market For Digital Services (Digital Services Act)",
"date": "2022-10-27",
"modality": "obligation",
"in_force": true,
"score": 0.81,
"url": "https://eur-lex.europa.eu/eli/reg/2022/2065/oj"
}
]
}
Chatbots are not high-risk under Annex III, but Article 50(1) of the AI Act requires disclosure that the user is interacting with an AI system. The DSA adds transparency obligations for platforms.
# European Parliament — What did MEPs say about banning biometric surveillance?
curl "https://staging.pauhu.eu/v1/search/europarl?q=biometric+surveillance+public+spaces+ban&lang=en&limit=3"
{
"product": "europarl",
"results": [{
"id": "P9_TA(2021)0405",
"title": "European Parliament resolution on artificial intelligence in criminal law and its use by the police and judicial authorities",
"date": "2021-10-06",
"modality": "prohibition",
"score": 0.93
}]
}
# Council of the EU — What are the latest conclusions on EU strategic autonomy?
curl "https://staging.pauhu.eu/v1/search/consilium?q=strategic+autonomy+digital+sovereignty&lang=en&limit=3"
{
"product": "consilium",
"results": [{
"id": "ST_14926_2022_INIT",
"title": "Council conclusions on EU approach to digital diplomacy",
"date": "2022-07-18",
"score": 0.88
}]
}
# European Commission — Has the Commission opened a DMA investigation against a gatekeeper?
curl "https://staging.pauhu.eu/v1/search/commission?q=Digital+Markets+Act+gatekeeper+non-compliance&lang=en&limit=3"
{
"product": "commission",
"results": [{
"id": "C_2024_4761",
"title": "Commission Preliminary findings: Apple's App Store rules in breach of the Digital Markets Act",
"date": "2024-06-24",
"modality": "prohibition",
"score": 0.95
}]
}
# Legislative Observatory — Where is the Corporate Sustainability Due Diligence Directive in the legislative process?
curl "https://staging.pauhu.eu/v1/search/oeil?q=corporate+sustainability+due+diligence+supply+chain&lang=en&limit=3"
{
"product": "oeil",
"results": [{
"id": "2022/0051(COD)",
"title": "Corporate Sustainability Due Diligence Directive (CSDDD)",
"stage": "Adopted",
"date": "2024-05-24",
"score": 0.97
}]
}
Judicial
# CURIA — Can a company transfer personal data to the US after Schrems II?
curl "https://staging.pauhu.eu/v1/search/curia?q=transfer+personal+data+third+country+adequacy+standard+contractual+clauses&lang=en&limit=3"
{
"product": "curia",
"results": [{
"id": "C-311/18",
"title": "Data Protection Commissioner v Facebook Ireland Limited and Maximillian Schrems (Schrems II)",
"date": "2020-07-16",
"modality": "prohibition",
"score": 0.98,
"url": "https://curia.europa.eu/juris/liste.jsf?num=C-311/18"
}]
}
The CJEU invalidated the EU-US Privacy Shield. Standard Contractual Clauses remain valid but require supplementary measures. The EU-US Data Privacy Framework (adequacy decision 2023/1795) now provides a new transfer mechanism.
Scientific & Industrial
# ECHA — Is bisphenol A restricted in thermal paper?
curl "https://staging.pauhu.eu/v1/search/echa?q=bisphenol+A+restriction+thermal+paper&lang=en&limit=3"
{
"product": "echa",
"results": [{
"id": "0227-01",
"title": "REACH Annex XVII Entry 66: Bisphenol A — restriction in thermal paper",
"date": "2016-12-12",
"modality": "prohibition",
"in_force": true,
"score": 0.95
}]
}
Yes. Under Commission Regulation (EU) 2016/2235, thermal paper containing 0.02% or more BPA by weight has been prohibited from being placed on the market since 2 January 2020.
# EMA — Is a biosimilar approved for adalimumab in the EU?
curl "https://staging.pauhu.eu/v1/search/ema?q=adalimumab+biosimilar+marketing+authorisation&lang=en&limit=3"
{
"product": "ema",
"results": [{
"id": "EMEA/H/C/004212",
"title": "Imraldi (adalimumab) — biosimilar marketing authorisation",
"date": "2017-08-24",
"modality": "permission",
"score": 0.92
}]
}
# EPO — Has anyone patented a machine translation method for legal texts?
curl "https://staging.pauhu.eu/v1/search/epo?q=machine+translation+legal+documents+neural&lang=en&limit=3"
{
"product": "epo",
"results": [{
"id": "EP3295359B1",
"title": "Method and system for machine translation using neural networks with domain adaptation",
"date": "2020-03-18",
"score": 0.87
}]
}
# CORDIS — What EU-funded projects research federated learning for healthcare?
curl "https://staging.pauhu.eu/v1/search/cordis?q=federated+learning+healthcare+privacy+preserving&lang=en&limit=3"
{
"product": "cordis",
"results": [{
"id": "101120763",
"title": "TRUMPET — TRUstworthy Multi-site Privacy Enhancing Technologies",
"programme": "Horizon Europe",
"date": "2023-01-01",
"score": 0.89
}]
}
# DPP — Does this product need a Digital Product Passport under the Ecodesign regulation?
curl "https://staging.pauhu.eu/v1/search/dpp?q=ecodesign+sustainable+products+battery+textile&lang=en&limit=3"
{
"product": "dpp",
"results": [{
"id": "32024R1781",
"title": "Regulation (EU) 2024/1781 on ecodesign requirements for sustainable products (ESPR)",
"date": "2024-06-28",
"modality": "obligation",
"score": 0.94
}]
}
Batteries (from February 2027), textiles, and electronics are the first product categories requiring Digital Product Passports under the ESPR.
Economic
# ECB — What is the current ECB monetary policy stance on inflation?
curl "https://staging.pauhu.eu/v1/search/ecb?q=monetary+policy+decision+interest+rate+inflation&lang=en&limit=3"
{
"product": "ecb",
"results": [{
"id": "ecb.mp250306",
"title": "Monetary policy decisions — 6 March 2025",
"date": "2025-03-06",
"score": 0.96
}]
}
# Eurostat — What is the EU unemployment rate by country?
curl "https://staging.pauhu.eu/v1/search/eurostat?q=unemployment+rate+EU+member+states+monthly&lang=en&limit=3"
{
"product": "eurostat",
"results": [{
"id": "une_rt_m",
"title": "Unemployment rate by sex and age — monthly data",
"dataset": "une_rt_m",
"date": "2025-12-01",
"score": 0.93
}]
}
# TED — Are there open procurement notices for translation services above the EU threshold?
curl "https://staging.pauhu.eu/v1/search/ted?q=translation+services+language+technology+procurement&lang=en&limit=3"
{
"product": "ted",
"results": [{
"id": "2024/S 142-453287",
"title": "Translation and interpretation services — European Commission, DG Translation",
"buyer": "European Commission",
"date": "2024-07-24",
"value_eur": 45000000,
"score": 0.91
}]
}
Reference
# IATE — What is the official Finnish term for "personal data"?
curl "https://staging.pauhu.eu/v1/search/iate?q=personal+data&lang=fi&limit=3"
{
"product": "iate",
"results": [{
"id": "1109095",
"term_en": "personal data",
"term_fi": "henkilötieto",
"domain": "Information technology and data processing",
"reliability": 4,
"score": 0.99
}]
}
# National law — Does Finland have a national AI strategy with legal obligations?
curl "https://staging.pauhu.eu/v1/search/lex?q=teko%C3%A4ly+kansallinen+strategia&lang=fi&limit=3"
{
"product": "lex",
"results": [{
"id": "HE 28/2024",
"title": "Hallituksen esitys eduskunnalle laiksi tekoälystä",
"country": "FI",
"date": "2024-03-14",
"score": 0.86
}]
}
# Publications Office — Where can I find the consolidated text of the GDPR?
curl "https://staging.pauhu.eu/v1/search/publications?q=general+data+protection+regulation+consolidated&lang=en&limit=3"
{
"product": "publications",
"results": [{
"id": "CELEX:02016R0679-20160504",
"title": "Consolidated text: Regulation (EU) 2016/679 (General Data Protection Regulation)",
"date": "2016-05-04",
"score": 0.98
}]
}
# Data Europa — Is there an open dataset of EU-funded AI research projects?
curl "https://staging.pauhu.eu/v1/search/dataeuropa?q=artificial+intelligence+research+dataset+horizon&lang=en&limit=3"
{
"product": "dataeuropa",
"results": [{
"id": "cordis-h2020-ai-projects",
"title": "AI-related projects funded under Horizon 2020",
"publisher": "European Commission",
"format": "CSV",
"score": 0.88
}]
}
Directory
# Who is Who — Who is the current European Data Protection Supervisor?
curl "https://staging.pauhu.eu/v1/search/whoiswho?q=European+Data+Protection+Supervisor&lang=en&limit=3"
{
"product": "whoiswho",
"results": [{
"id": "edps-supervisor",
"name": "Wojciech Wiewiórowski",
"role": "European Data Protection Supervisor",
"institution": "EDPS",
"score": 0.97
}]
}
# Wiki — What entities are linked to the European Chemicals Agency?
curl "https://staging.pauhu.eu/v1/search/wiki?q=European+Chemicals+Agency+ECHA+Helsinki&lang=en&limit=3"
{
"product": "wiki",
"results": [{
"id": "Q583725",
"label": "European Chemicals Agency",
"description": "Agency of the European Union, based in Helsinki",
"headquarters": "Helsinki, Finland",
"score": 0.99
}]
}
Domain scoping
Each Pauhu domain scopes which products are available. Requesting a product outside your domain’s scope returns 403 Forbidden.
| Domain | Products | Count |
|---|---|---|
| pauhu.ai | All 24 products | 24 |
| pauhu.eu | All 24 products | 24 |
| pauhu.com | All 24 products | 24 |
| pauhu.dev | eurlex, iate, wiki, code | 4 |
| pauhu.io | eurostat, echa, ema, dpp, dataeuropa, osm, weather | 7 |
The domain is determined from the request URL hostname (Cloudflare-routed, trusted). Cross-origin browser requests fall back to the Origin header. Unknown domains receive an empty product set (fail-closed).
# This works — eurlex is available on pauhu.dev
curl "https://staging.pauhu.dev/v1/search/eurlex?q=AI+Act"
# This returns 403 — ted is not available on pauhu.dev
curl "https://staging.pauhu.dev/v1/search/ted?q=translation+services"
# {"error":"Product \"ted\" is not available on this domain"}
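A client can mirror the scoping table to avoid requests that would return 403. This is a sketch only: the server remains the authority, `isProductAllowed` is a hypothetical helper, and matching subdomains such as `staging.pauhu.dev` to `pauhu.dev` by suffix is an assumption based on the examples above.

```javascript
const ALL_PRODUCTS = [
  'code', 'commission', 'consilium', 'cordis', 'curia', 'dataeuropa',
  'dpp', 'ecb', 'echa', 'ema', 'epo', 'europarl', 'eurlex',
  'eurostat', 'iate', 'lex', 'news', 'oeil', 'osm', 'publications',
  'ted', 'weather', 'whoiswho', 'wiki',
];

// Client-side copy of the domain scoping table.
const DOMAIN_PRODUCTS = {
  'pauhu.ai': ALL_PRODUCTS,
  'pauhu.eu': ALL_PRODUCTS,
  'pauhu.com': ALL_PRODUCTS,
  'pauhu.dev': ['eurlex', 'iate', 'wiki', 'code'],
  'pauhu.io': ['eurostat', 'echa', 'ema', 'dpp', 'dataeuropa', 'osm', 'weather'],
};

// Hypothetical helper: true if the product is in scope for the hostname.
function isProductAllowed(hostname, product) {
  const host = hostname.toLowerCase();
  // Assumption: a subdomain inherits the scope of its parent domain.
  const domain = Object.keys(DOMAIN_PRODUCTS).find(
    (d) => host === d || host.endsWith('.' + d)
  );
  if (domain === undefined) return false; // unknown domain: fail closed
  return DOMAIN_PRODUCTS[domain].includes(product);
}
```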
Chat LIVE
Grounded answers from EU institutional sources via Server-Sent Events (SSE). Every response streams paragraph-level citations from source documents.
Send a message
Returns a text/event-stream SSE response. The stream delivers source documents, intent classification, paragraphs, and a grounding status before closing.
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | User query in any of the 24 EU official languages. Max 2,000 characters. |
| language | string | No | Response language (ISO 639-1). Default: detected from query. |
| sources | string[] | No | Restrict search to specific products (e.g. ["eurlex","curia"]). Default: domain-scoped products. |
Authentication
Three methods, checked in priority order:
- JWT (EdDSA Ed25519): Authorization: Bearer eyJ... (unlimited requests per tier)
- API key: Authorization: Bearer pk_... or X-API-Key: pk_... (tier from user record)
- No auth: free tier, 3 requests/day per IP. No key required.
Invalid credentials return 401 immediately (no fallback to free tier).
File upload
Use Content-Type: multipart/form-data to attach a file (max 10 MB):
curl -N -X POST "https://staging.pauhu.eu/v1/chat" \
-F "query=Summarise this regulation" \
-F "file=@regulation.pdf"
Basic example (SSE stream)
curl -N -X POST "https://staging.pauhu.eu/v1/chat" \
-H "Content-Type: application/json" \
-d '{"query":"Is deploying a chatbot regulated under the EU AI Act?"}'
The -N flag disables output buffering so SSE events appear in real time.
SSE event reference
Events arrive in the following order. Parse each data: line as JSON.
| Event | When | Payload |
|---|---|---|
| sources | Always (first event) | Search results: intent, count, paragraphs count, and sources[] with id, title, product, score, has_text |
| intent | Only if intent is translate, code, or app | Classified intent type and metadata (e.g. source_language for translate) |
| paragraphs | Only if matches found | Top paragraphs: text, source, title, score |
| status | Always | Inference mode (retrieval-only), grounding stats: paragraph and source counts |
| done | Always (last event) | Original query, language, timestamp |
Example SSE stream
event: sources
data: {"intent":"search","count":10,"paragraphs":5,"sources":[{"id":"32024R1689","title":"Regulation (EU) 2024/1689 (AI Act)","product":"EURLEX","score":0.96,"has_text":true}]}
event: paragraphs
data: {"count":5,"texts":[{"text":"Article 50(1) requires providers to ensure...","source":"EURLEX","title":"AI Act","score":0.96}]}
event: status
data: {"mode":"retrieval-only","grounding":{"paragraph_count":5,"source_count":10,"products":["EURLEX","IATE"]}}
event: done
data: {"query":"Is deploying a chatbot regulated under the EU AI Act?","language":"en","timestamp":"2026-03-11T10:00:00Z"}
JavaScript client
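const response = await fetch('https://staging.pauhu.eu/v1/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'What are GDPR fines?' })
});
if (!response.ok) throw new Error(`HTTP ${response.status}`);
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  // The last element may be a partial line; keep it for the next chunk
  buffer = lines.pop();
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const event = JSON.parse(line.slice(6));
      console.log(event);
    }
  }
}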
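To act on individual events (for example, rendering sources before paragraphs), each payload needs to be paired with its event name. The sketch below shows one way to do that on already-received text. `parseSseEvents` is a hypothetical helper, and it assumes each event carries a single `data:` line containing complete JSON, as in the example stream.

```javascript
// Hypothetical helper: turns raw SSE text into [{ event, data }] pairs.
function parseSseEvents(text) {
  const events = [];
  let currentEvent = 'message'; // SSE default when no event: line is given
  for (const rawLine of text.split('\n')) {
    const line = rawLine.replace(/\r$/, ''); // tolerate CRLF line endings
    if (line.startsWith('event:')) {
      currentEvent = line.slice(6).trim();
    } else if (line.startsWith('data:')) {
      // Assumption: one data: line per event, holding complete JSON.
      events.push({ event: currentEvent, data: JSON.parse(line.slice(5).trim()) });
    } else if (line === '') {
      currentEvent = 'message'; // a blank line ends the current event block
    }
  }
  return events;
}
```

A caller can then dispatch on `event`, for example collecting `sources` and `paragraphs` and finalising the view when `done` arrives.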
Intent classification
The gateway auto-classifies every query into one of five intent types. Intent determines how results are displayed and which products are prioritised.
| Intent | Trigger | Behaviour |
|---|---|---|
| search | Questions (who, what, when…), quoted terms | Default. Returns ranked source documents. |
| translate | Keywords like “translate”, “käännä”, “übersetze”, “traduire” | Routes to translation engine with IATE terminology. |
| code | Code blocks (```), programming patterns | Prioritises code product, enables syntax highlighting. |
| app | “build”/“create”/“make” + “calculator”/“dashboard”/“chart” | Enables sandbox mode for interactive outputs. |
| chat | Everything else | General conversational fallback. |
Grounding guarantee: Every factual claim is backed by at least one citation with a score (0.0–1.0) indicating how closely the source paragraph supports the claim. If no sources are found, the system returns a disclosure instead of a hallucinated response.
Metering: Browser-native inference (FiD) generates no meter events (free path). When an external provider is used (requires consent), a Stripe meter event is recorded.
Error responses
| Status | Cause |
|---|---|
| 400 | Missing or empty query, or query exceeds 2,000 characters |
| 401 | Invalid JWT or API key |
| 413 | File upload exceeds 10 MB |
| 429 | Daily request limit reached. Retry-After: 86400 |
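The 400 and 413 cases can be caught before a request is sent, which matters on the free tier where each call counts against the daily limit. A minimal sketch with hypothetical names: it assumes "10 MB" means 10 × 1024 × 1024 bytes and that the 2,000-character limit counts Unicode code points, neither of which is specified here.

```javascript
const MAX_QUERY_CHARS = 2000;
const MAX_FILE_BYTES = 10 * 1024 * 1024; // assumption: 10 MB read as 10 MiB

// Hypothetical helper: returns the HTTP status the request would be
// expected to trigger, or null if it passes these local checks.
function precheckChatRequest({ query, fileBytes = 0 }) {
  if (typeof query !== 'string' || query.trim() === '') return 400;
  // Assumption: the limit is counted in code points, not UTF-16 units.
  if (Array.from(query).length > MAX_QUERY_CHARS) return 400;
  if (fileBytes > MAX_FILE_BYTES) return 413;
  return null;
}
```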
Translate LIVE
Neural machine translation across all 24 EU official languages. Domain-adapted with IATE terminology enforcement.
Translate text
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Text to translate. Max 10,000 characters per request. |
| source | string | No | Source language (ISO 639-1). Auto-detected if omitted. |
| target | string | Yes | Target language (ISO 639-1). |
| terminology | boolean | No | Enable IATE terminology enforcement. Default: true. |
| domain | string | No | Domain hint for terminology selection (e.g. "legal", "medical", "finance"). |
| formality | string | No | Formality level: "formal" (default for legal/institutional), "informal". |
curl -X POST "https://staging.pauhu.eu/v1/translate" \
-H "Authorization: Bearer pk_..." \
-H "Content-Type: application/json" \
-d '{
"text": "The controller shall implement appropriate technical and organisational measures.",
"source": "en",
"target": "fi",
"terminology": true,
"domain": "legal"
}'
Response:
{
"translation": "Rekisterinpitäjän on toteutettava asianmukaiset tekniset ja organisatoriset toimenpiteet.",
"source_language": "en",
"target_language": "fi",
"detected_language": "en",
"confidence": 0.94,
"terminology_applied": [
{
"source_term": "controller",
"target_term": "rekisterinpitäjä",
"concept_id": "1403955",
"source": "IATE"
},
{
"source_term": "technical and organisational measures",
"target_term": "tekniset ja organisatoriset toimenpiteet",
"concept_id": "3567891",
"source": "IATE"
}
]
}
Terminology enforcement: When terminology is enabled, the translation engine cross-references IATE (2.4M terms) to ensure institutional terminology is used consistently. The terminology_applied array shows which terms were enforced.
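The `terminology_applied` array lends itself to building a small glossary for review or for highlighting enforced terms in a UI. This sketch assumes the response shape shown above; `glossaryFrom` is a hypothetical helper.

```javascript
// Hypothetical helper: maps each enforced source term to its target
// term and IATE concept ID. Returns an empty Map if none were applied.
function glossaryFrom(response) {
  const glossary = new Map();
  for (const entry of response.terminology_applied ?? []) {
    glossary.set(entry.source_term, {
      target: entry.target_term,
      conceptId: entry.concept_id,
    });
  }
  return glossary;
}
```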
Supported language pairs
curl "https://staging.pauhu.eu/v1/languages"
Response:
{
"languages": 24,
"codes": ["bg","cs","da","de","el","en","es","et","fi","fr","ga","hr","hu","it","lt","lv","mt","nl","pl","pt","ro","sk","sl","sv"],
"pairs": 552,
"note": "All 24×23 = 552 language pairs are supported via pivot translation through English."
}
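The pair count follows from treating each direction as a separate pair: n languages give n × (n − 1) ordered pairs, so 24 languages give 552. A short sketch that enumerates them from the `codes` array; `languagePairs` is a hypothetical helper.

```javascript
// Hypothetical helper: lists every ordered [source, target] pair,
// excluding pairs where source and target are the same language.
function languagePairs(codes) {
  const pairs = [];
  for (const source of codes) {
    for (const target of codes) {
      if (source !== target) pairs.push([source, target]);
    }
  }
  return pairs;
}
```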
Document Extraction LIVE
Server-side document extraction via headless Chrome running on a dedicated EU server in Helsinki. Documents never leave the EU.
Extract text from URL
Extracts readable text from any URL using server-side Chrome. Optionally enriches with IATE terminology and annotation.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to extract text from |
| product | string | No | Product code for annotation (default: USER_DOCUMENT) |
| lang | string | No | ISO 639-1 language code (default: fi) |
| annotate | boolean | No | Enable annotation (topic classification, deontic modality, language detection) |
| terminology | boolean | No | Extract IATE terminology from the document text |
| maxChars | integer | No | Maximum characters to extract. Default: 100,000. |
curl -X POST "https://staging.pauhu.eu/v1/extract" \
-H "Authorization: Bearer pk_..." \
-H "Content-Type: application/json" \
-d '{
"url": "https://eur-lex.europa.eu/eli/reg/2024/1689/oj",
"terminology": true,
"annotate": true,
"lang": "en"
}'
Response:
{
"text": "Regulation (EU) 2024/1689 of the European Parliament...",
"title": "EUR-Lex - 32024R1689 - EN",
"url": "https://eur-lex.europa.eu/eli/reg/2024/1689/oj",
"truncated": false,
"source": "pinchtab",
"terms": [
{"term": "artificial intelligence", "concept_id": "3567984", "reliability": 4},
{"term": "high-risk AI system", "concept_id": "3592341", "reliability": 3}
],
"annotation": {
"language": "en",
"eurovoc_domain": "12",
"deontic_modality": "obligation",
"word_count": 24831,
"provenance_tier": "NATIVE"
}
}
Extract and index
Extracts text, annotates, and writes the result to storage for automatic indexing. The document becomes searchable via the per-product search API.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to extract and index |
| product | string | No | Product code for R2 storage path (default: USER_DOCUMENT) |
| lang | string | No | ISO 639-1 language code (default: fi) |
| maxChars | integer | No | Maximum characters to extract. Default: 100,000. |
curl -X POST "https://staging.pauhu.eu/v1/extract-and-index" \
-H "Authorization: Bearer pk_..." \
-H "Content-Type: application/json" \
-d '{
"url": "https://eur-lex.europa.eu/eli/reg/2024/1689/oj",
"product": "eurlex",
"lang": "en"
}'
Response:
{
"text": "Regulation (EU) 2024/1689...",
"title": "EUR-Lex - 32024R1689 - EN",
"url": "https://eur-lex.europa.eu/eli/reg/2024/1689/oj",
"source": "pinchtab",
"indexed": true,
"r2_key": "pinchtab/eurlex/a1b2c3d4e5f6...stam.json",
"terms": [
{"term": "artificial intelligence", "concept_id": "3567984", "reliability": 4}
],
"annotation": {
"language": "en",
"eurovoc_domain": "12",
"deontic_modality": "obligation",
"word_count": 24831,
"provenance_tier": "NATIVE"
}
}
PDF render
Renders any URL to PDF using server-side Chrome. Returns raw PDF binary. Useful for archiving web pages or generating printable versions of EU documents.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to render as PDF |
curl -X POST "https://staging.pauhu.eu/v1/pdf-render" \
-H "Authorization: Bearer pk_..." \
-H "Content-Type: application/json" \
-d '{"url": "https://eur-lex.europa.eu/eli/reg/2024/1689/oj"}' \
-o document.pdf
Response: Raw PDF binary with headers:
Content-Type: application/pdf
Content-Disposition: inline; filename="document.pdf"
Feature Health LIVE
Production smoke tests that verify search, chat, domain scoping, and system integrity in real time.
Run all tests
Run one feature
| Feature | What it validates |
|---|---|
| health | Gateway responds with status: ok |
| search | Product listing, domain ACL enforcement, cross-domain blocking (403), per-product results with scores |
| chat | Query validation, domain scoping, intent classification, SSE event ordering |
| transparency | EU AI Act Article 50 disclosure present |
| rate-limiting | Tier enforcement, daily counter, 429 on limit exceeded |
curl "https://staging.pauhu.eu/v1/feature-health"
Response:
{
"timestamp": "2026-03-11T10:00:00Z",
"duration_ms": 2450,
"total": 42,
"passed": 42,
"failed": 0,
"results": [
{
"feature": "search",
"case": "eurlex-returns-results",
"invariant": "Search eurlex returns ≥1 match with score",
"verdict": "PASS",
"evidence": "matches=8",
"duration_ms": 450
}
]
}
HTTP status: 200 if all tests pass, 207 Multi-Status if any test failed. Use this in CI/CD pipelines:
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
"https://staging.pauhu.eu/v1/feature-health")
[ "$HTTP_CODE" = "200" ] || exit 1
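When the status code alone is not enough, the JSON body can be reduced to a list of failing cases for a CI log. A minimal sketch, assuming the response shape shown above and treating any `verdict` other than "PASS" as a failure (only "PASS" appears in this reference); `summariseHealth` is a hypothetical helper.

```javascript
// Hypothetical helper: reduces a feature-health report to an overall
// flag plus one readable line per failing case.
function summariseHealth(report) {
  // Assumption: every verdict other than "PASS" counts as a failure.
  const failures = report.results.filter((r) => r.verdict !== 'PASS');
  return {
    ok: report.failed === 0 && failures.length === 0,
    failures: failures.map((r) => `${r.feature}/${r.case}: ${r.evidence}`),
  };
}
```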
Full documentation: Feature Health guide
Pricing API LIVE
Live pricing from Azure Retail Prices API (North Europe, EUR). Two-part tariff: Azure pass-through + Pauhu Data License. See MACC Guide for details.
Two-part tariff
Returns the full two-part tariff schedule across all 5 Pauhu domains. Prices are refreshed from the Azure Retail Prices API on a weekly cron and cached in KV.
curl "https://staging.pauhu.eu/v1/pricing/two-part"
Response:
{
"tariffs": {
"pauhu.eu": {
"azure_pass_through": {"tier": "S1", "monthly_eur": 229},
"data_license": {"monthly_eur": 69},
"total": 298
},
"pauhu.com": {
"azure_pass_through": {"tier": "S2", "monthly_eur": 1654},
"data_license": {"monthly_eur": 0},
"total": 1654
}
},
"updated": "2026-03-03T04:00:00.000Z"
}
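In the sample response each `total` equals the Azure pass-through amount plus the data license amount (229 + 69 = 298). A client that caches tariffs might verify that relationship before displaying prices. This is a sketch under that assumption; `checkTariffTotals` is a hypothetical helper.

```javascript
// Hypothetical helper: returns the domains whose total does not equal
// azure_pass_through.monthly_eur + data_license.monthly_eur.
// Assumption: total is the plain sum of the two parts, as in the sample.
function checkTariffTotals(pricing) {
  const mismatches = [];
  for (const [domain, tariff] of Object.entries(pricing.tariffs)) {
    const expected =
      tariff.azure_pass_through.monthly_eur + tariff.data_license.monthly_eur;
    if (expected !== tariff.total) {
      mismatches.push({ domain, expected, actual: tariff.total });
    }
  }
  return mismatches;
}
```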
Available Products
24 data sources, each with dedicated storage, database, vector index, and queue pipeline.
| Product | Source | Content |
|---|---|---|
| code | GitHub, npm, PyPI, crates.io | Open-source releases and documentation |
| commission | European Commission | Commission documents, decisions, communications |
| consilium | Council of the EU | Council conclusions, meeting outcomes |
| cordis | CORDIS | EU-funded research projects |
| curia | Court of Justice (CJEU) | Case law, opinions, judgments |
| dataeuropa | data.europa.eu | EU Open Data Portal datasets |
| dpp | Digital Product Passport | ESPR product passports |
| ecb | European Central Bank | Monetary policy, financial stability |
| echa | European Chemicals Agency | REACH/CLP substance data |
| ema | European Medicines Agency | Medicinal product authorisations |
| epo | European Patent Office | Patent publications and grants |
| europarl | European Parliament | Legislative proceedings, plenary debates |
| eurlex | EUR-Lex | EU law: regulations, directives, decisions |
| eurostat | Eurostat | Statistical indicators and datasets |
| iate | IATE | EU institutional terminology (2.4M terms) |
| lex | National Law | 27 EU member states + UK national legislation |
| news | EU News Feeds | Press releases, news aggregation |
| oeil | Legislative Observatory | EU legislative procedure tracking |
| osm | OpenStreetMap | Geospatial data for EU member states |
| publications | Publications Office | Official EU publications |
| ted | TED | Public procurement notices |
| weather | EU Weather Services | Meteorological data and forecasts |
| whoiswho | EU Who is Who | EU institutional directory |
| wiki | Wikidata | EU entity knowledge base |
Error responses
All endpoints return standard JSON error responses:
{
"error": "Missing required parameter: term",
"status": 400
}
| Status | Meaning |
|---|---|
| 400 | Missing or invalid parameter |
| 401 | Invalid JWT or API key (no fallback to free tier) |
| 403 | Product not available on this domain (domain ACL) |
| 404 | Unknown product or concept ID not found |
| 413 | File upload exceeds 10 MB |
| 429 | Rate limit exceeded (free: 3/day). Includes Retry-After header. |
| 500 | Internal server error |
Response headers
Content-Type: application/json
X-Pauhu-Tier: live
X-Pauhu-Jurisdiction: EU
Retry-After: 1 (only if rate limit hit)
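A client handling 429 responses can derive its wait time from the Retry-After header. The sketch below is illustrative: `retryDelayMs` is a hypothetical helper, and it accepts both forms that HTTP allows for this header (a number of seconds or an HTTP date), although only the numeric form appears in this reference.

```javascript
// Hypothetical helper: returns the wait in milliseconds for a 429
// response, or null when no usable Retry-After value is present.
function retryDelayMs(status, retryAfter, now = Date.now()) {
  if (status !== 429 || retryAfter === null || retryAfter === undefined) {
    return null;
  }
  const raw = String(retryAfter).trim();
  if (raw === '') return null;
  // Numeric form: delay in whole seconds.
  if (/^\d+$/.test(raw)) return Number(raw) * 1000;
  // Date form: wait until the given time, never a negative duration.
  const date = Date.parse(raw);
  return Number.isNaN(date) ? null : Math.max(0, date - now);
}
```

The optional `now` argument exists so the date form can be tested deterministically.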
Support
Technical: support@pauhu.eu
API keys: Get an API key
Full documentation: Documentation index