GPU Extensions
Bring your own API keys. Pauhu® routes to the best provider. You pay the provider directly.
How It Works
GPU Extensions are a multi-provider API gateway that adds no markup on compute. You provide your own API keys for external providers (OpenAI, Anthropic, Google, Replicate, etc.), and Pauhu handles routing, rate limiting, usage tracking, and a unified API surface. You pay the provider directly for compute; Pauhu charges only for the integration layer via your subscription tier.
Your Application
↓
Pauhu GPU Gateway (Cloudflare Worker, EU jurisdiction)
• JWT authentication
• Tier-based rate limiting
• API key format validation
• Usage tracking (D1)
↓
External Provider (customer-paid)
• OpenAI, Anthropic, Google, Replicate, etc.
• You provide the API key
• You pay the provider directly
↓
Response → Your Application
Policy Boundaries
Extensions operate under strict policy boundaries:
- Authentication: Every request requires a valid Pauhu JWT token. Anonymous access is not permitted.
- Rate limiting: Tier-based limits prevent abuse. Free: 10 generations/month. Starter: 100/month. Professional: 1,000/month. Enterprise: custom.
- API key isolation: Your provider API keys are validated for format but never stored by Pauhu. They are forwarded to the provider in a single request and discarded.
- EU jurisdiction: The gateway worker runs in EU jurisdiction. Requests are routed to the provider from EU edge nodes. Provider data processing is subject to the provider's own terms.
- No data retention: Pauhu does not store prompts, responses, or generated content. Only usage counts and timestamps are recorded for billing.
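The tier-based rate limit above can be pictured as a monthly quota check. The following is a minimal sketch, not Pauhu's implementation: the quota figures come from the limits listed above, while `TIER_LIMITS`, `check_quota`, and its arguments are hypothetical names chosen for illustration. Enterprise is modeled as `None` (no fixed cap) because its limit is custom.

```python
from typing import Optional

# Monthly generation quotas per tier, taken from the limits listed above.
# Enterprise is custom, so it is modeled here as None (no fixed cap).
TIER_LIMITS: dict[str, Optional[int]] = {
    "free": 10,
    "starter": 100,
    "professional": 1000,
    "enterprise": None,
}


def check_quota(tier: str, used_this_month: int) -> bool:
    """Return True if one more generation is allowed for this tier."""
    if tier not in TIER_LIMITS:
        raise ValueError(f"unknown tier: {tier!r}")
    limit = TIER_LIMITS[tier]
    if limit is None:
        return True
    return used_this_month < limit


print(check_quota("free", 9))    # True
print(check_quota("free", 10))   # False
```

In a real gateway the `used_this_month` counter would presumably come from the usage tracking store; here it is passed in directly to keep the sketch self-contained.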
6 Extension Types
1. Large LLMs (70B+ Parameters)
Chat completions with large language models that exceed browser-native capacity. Providers: OpenAI, Anthropic, Google Gemini, Together AI, Groq, Replicate.
POST /gpu/large-llms/chat
{
  "model": "gemini-1.5-pro",
  "messages": [{"role": "user", "content": "Translate to Finnish"}],
  "api_key": "YOUR_GOOGLE_API_KEY",
  "provider": "google"
}
2. Video Generation
Text-to-video and image-to-video generation. Providers: OpenAI (Sora), Replicate, RunwayML, Pika, Fal.ai. Cost: $0.002–$0.20 per second (customer-paid).
3. Image Generation
Text-to-image generation. Providers: OpenAI (DALL-E 3), Replicate, Fal.ai, Together AI. Cost: $0.001–$0.04 per image (customer-paid).
4. Real-time Video
Real-time object detection and video analysis. Providers: Roboflow (YOLOv8), Ultralytics, AWS Rekognition, Replicate. Cost: $0.00001–$0.12 per frame or per minute, depending on the provider (customer-paid).
5. Audio Generation
Music generation and text-to-speech. Providers: Suno (music), ElevenLabs (speech), Replicate, Stability AI, Mubert. Cost: $0.02–$0.50 per generation (customer-paid).
6. 3D Generation
Text-to-3D and image-to-3D model generation with textures, rigging, and LODs. Providers: Trellis (Microsoft), Meshy, Luma AI, Rodin, Stability AI, Replicate. Cost: $0.05–$2 per model (customer-paid).
POST /gpu/3d-generation/generate-3d
{
  "prompt": "A medieval fantasy knight with armor",
  "api_key": "YOUR_RODIN_API_KEY",
  "provider": "rodin",
  "output_format": "glb",
  "with_textures": true,
  "with_pbr": true
}
Pricing
| Tier | Monthly Fee | Generations / Month |
|---|---|---|
| Free | $0 | 10 |
| Starter | $49 | 100 |
| Professional | $199 | 1,000 |
| Enterprise | Custom | Unlimited |
These prices cover the Pauhu integration layer only. You pay the external provider separately for compute (LLM tokens, GPU time, etc.) using your own API key.
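To make the split between the two bills concrete, here is a rough estimate of a month's total spend as the Pauhu tier fee plus provider compute. This is an illustrative sketch only: `TIER_FEES_USD` and `estimate_monthly_cost` are hypothetical names, and the per-generation provider price is an assumption you would replace with your provider's actual rate.

```python
# Monthly fees for the Pauhu integration layer, from the pricing table.
# Enterprise pricing is custom and therefore omitted.
TIER_FEES_USD = {"free": 0, "starter": 49, "professional": 199}


def estimate_monthly_cost(
    tier: str, generations: int, provider_cost_per_generation: float
) -> float:
    """Pauhu subscription fee plus provider compute, in USD."""
    pauhu_fee = TIER_FEES_USD[tier]
    provider_cost = generations * provider_cost_per_generation
    return round(pauhu_fee + provider_cost, 2)


# 100 images on the Starter tier at an assumed provider price of $0.04 per image:
print(estimate_monthly_cost("starter", 100, 0.04))  # 53.0
```

Only the $49 goes to Pauhu in this example; the remaining $4 is billed by the provider against your own API key.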
Authentication
All GPU extension requests require two credentials:
- Pauhu JWT token in the `Authorization: Bearer` header. Identifies your account and subscription tier.
- Provider API key in the request body (`api_key` field). Forwarded to the provider for the actual compute call.
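A request carrying both credentials might be assembled as follows. This is a sketch under stated assumptions rather than an official client: `BASE_URL` is a placeholder host, `build_gpu_request` is a hypothetical helper, and the `Content-Type` header is assumed because the body is JSON. The request is built but not sent, so the snippet runs without network access or real keys.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the real gateway host from the API documentation.
BASE_URL = "https://gateway.example.com"


def build_gpu_request(path: str, pauhu_jwt: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to a GPU extension endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {pauhu_jwt}",  # Pauhu credential
            "Content-Type": "application/json",      # assumed, since the body is JSON
        },
    )


request = build_gpu_request(
    "/gpu/large-llms/chat",
    pauhu_jwt="YOUR_PAUHU_JWT",
    payload={
        "model": "gemini-1.5-pro",
        "messages": [{"role": "user", "content": "Translate to Finnish"}],
        "api_key": "YOUR_GOOGLE_API_KEY",  # provider credential, sent in the body
        "provider": "google",
    },
)
# To send: urllib.request.urlopen(request), with a real host and real credentials.
```

The same helper would cover the other extension types by changing the path and payload, for example `/gpu/3d-generation/generate-3d`.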
Next
- API Documentation — full reference for all Pauhu APIs
- Developer Docs — MCP server, CLI, container setup
- Security — how we protect your data and API keys
- Attributions — open-source licences
© 2026 Pauhu Ltd. All rights reserved.