GPU Extensions

Bring your own API keys. Pauhu® routes to the best provider. You pay the provider directly.

How It Works

GPU Extensions are a multi-provider API gateway that adds no charge on top of provider compute. You provide your own API keys for external providers (OpenAI, Anthropic, Google, Replicate, etc.), and Pauhu handles routing, rate limiting, and usage tracking behind a unified API surface. You pay the provider directly for compute; Pauhu charges only for the integration layer, through your subscription tier.

Your Application
      ↓
Pauhu GPU Gateway (EU jurisdiction)
  • JWT authentication
  • Tier-based rate limiting
  • API key format validation
  • Usage tracking
      ↓
External Provider (customer-paid)
  • OpenAI, Anthropic, Google, Replicate, etc.
  • You provide the API key
  • You pay the provider directly
      ↓
Response → Your Application

Policy Boundaries

Extensions operate under strict policy boundaries:

Authentication: Every request requires a valid Pauhu JWT token. Anonymous access is not permitted.
Rate limiting: Tier-based limits prevent abuse. Free: 10 generations/month. Starter: 100/month. Professional: 1,000/month. Enterprise: custom.
API key isolation: Your provider API keys are validated for format but never stored by Pauhu. They are forwarded to the provider in a single request and discarded.
EU jurisdiction: The gateway runs in EU jurisdiction. Requests are routed to the provider from EU edge nodes. Provider data processing is subject to the provider's own terms.
No data retention: Pauhu does not store prompts, responses, or generated content. Only usage counts and timestamps are recorded for billing.
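
As a rough illustration of the tier limits listed above, the following Python sketch tracks how many generations remain in a month. The function name and the lower-case tier keys are hypothetical, and the gateway enforces its limits server-side; this is only a client-side convenience, not Pauhu's implementation.

```python
from typing import Optional

# Monthly generation limits copied from the rate-limiting policy above.
# None stands for the Enterprise tier, whose limit is custom.
TIER_LIMITS: dict[str, Optional[int]] = {
    "free": 10,
    "starter": 100,
    "professional": 1000,
    "enterprise": None,
}


def remaining_generations(tier: str, used_this_month: int) -> Optional[int]:
    """Return the generations left this month, or None when the limit is custom."""
    limit = TIER_LIMITS[tier.lower()]
    if limit is None:
        return None
    # Never report a negative remainder if usage already exceeds the limit.
    return max(limit - used_this_month, 0)


print(remaining_generations("starter", 42))  # prints 58
```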

6 Extension Types

1. Large LLMs (70B+ Parameters)

Endpoint: /gpu/large-llms/chat

Chat completions with large language models that exceed browser-native capacity. Providers: OpenAI, Anthropic, Google Gemini, Together AI, Groq, Replicate.

POST /gpu/large-llms/chat
Authorization: Bearer YOUR_PAUHU_JWT

{
  "model": "gemini-1.5-pro",
  "messages": [{"role": "user", "content": "Translate to Finnish"}],
  "api_key": "YOUR_GOOGLE_API_KEY",
  "provider": "google"
}
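
The request body above could be assembled in Python roughly as follows. This is a minimal sketch: the helper name `build_chat_body` is hypothetical, and only the four fields shown in the example are assumed to exist.

```python
import json


def build_chat_body(model: str, provider: str, api_key: str, prompt: str) -> str:
    """Serialize a single-turn chat request body in the shape shown above."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "api_key": api_key,  # provider key, sent in the body alongside the prompt
        "provider": provider,
    }
    return json.dumps(body)


payload = build_chat_body(
    "gemini-1.5-pro", "google", "YOUR_GOOGLE_API_KEY", "Translate to Finnish"
)
print(payload)
```

Keeping the provider key as a function argument, rather than a module-level constant, makes it easier to load it from an environment variable or secret store at call time.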

2. Video Generation

Endpoint: /gpu/video-generation/generate-video

Text-to-video and image-to-video generation. Providers: OpenAI (Sora), Replicate, RunwayML, Pika, Fal.ai. Cost: $0.002–$0.20 per second (customer-paid).
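
To get a feel for the quoted price range, the sketch below estimates the provider bill for a clip of a given length. It assumes purely per-second billing with no minimum charge, which may not match every provider; the function name is hypothetical.

```python
# Per-second price range quoted above for video generation, in USD.
MIN_RATE_PER_SECOND = 0.002
MAX_RATE_PER_SECOND = 0.20


def estimate_video_cost(duration_seconds: float) -> tuple[float, float]:
    """Return (low, high) provider cost in USD, assuming per-second billing."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    low = round(duration_seconds * MIN_RATE_PER_SECOND, 4)
    high = round(duration_seconds * MAX_RATE_PER_SECOND, 4)
    return low, high


low, high = estimate_video_cost(10)
print(f"10-second clip: ${low:.2f} to ${high:.2f}")  # $0.02 to $2.00
```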

3. Image Generation

Endpoint: /gpu/image-generation/generate-image

Text-to-image generation. Providers: OpenAI (DALL-E 3), Replicate, Fal.ai, Together AI. Cost: $0.001–$0.04 per image (customer-paid).

4. Real-time Video

Endpoint: /gpu/realtime-video/process-frame

Real-time object detection and video analysis. Providers: Roboflow (YOLOv8), Ultralytics, AWS Rekognition, Replicate. Cost: $0.00001–$0.12 per frame or per minute, depending on the provider (customer-paid).

5. Audio Generation

Endpoint: /gpu/audio-generation/generate-music

Music generation and text-to-speech. Providers: Suno (music), ElevenLabs (speech), Replicate, Stability AI, Mubert. Cost: $0.02–$0.50 per generation (customer-paid).

6. 3D Generation

Endpoint: /gpu/3d-generation/generate-3d

Text-to-3D and image-to-3D model generation with textures, rigging, and LODs. Providers: Trellis (Microsoft), Meshy, Luma AI, Rodin, Stability AI, Replicate. Cost: $0.05–$2 per model (customer-paid).

POST /gpu/3d-generation/generate-3d
Authorization: Bearer YOUR_PAUHU_JWT

{
  "prompt": "A medieval fantasy knight with armor",
  "api_key": "YOUR_RODIN_API_KEY",
  "provider": "rodin",
  "output_format": "glb",
  "with_textures": true,
  "with_pbr": true
}

Pricing

Tier	Monthly Fee	Generations / Month
Included	With subscription	Per tier

Extensions are included with your data feed subscription. See pricing for tier details.

Authentication

All GPU extension requests require two credentials:

Pauhu JWT token in the Authorization: Bearer header. Identifies your account and subscription tier.
Provider API key in the request body (api_key field). Forwarded to the provider for the actual compute call.
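
The two credentials can be combined in a single request along these lines. This is a sketch using only the Python standard library: the base URL is a placeholder rather than the real gateway host, the `Content-Type` header is an assumption based on the JSON bodies shown earlier, and `build_gpu_request` is a hypothetical helper.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the gateway host from the API documentation.
GATEWAY_BASE_URL = "https://gateway.example.invalid"


def build_gpu_request(path: str, pauhu_jwt: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying both credentials."""
    if "api_key" not in body:
        raise ValueError("body must include the provider api_key field")
    return urllib.request.Request(
        url=GATEWAY_BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {pauhu_jwt}",  # Pauhu JWT: account and tier
            "Content-Type": "application/json",  # assumed, since the body is JSON
        },
        method="POST",
    )


request = build_gpu_request(
    "/gpu/3d-generation/generate-3d",
    "YOUR_PAUHU_JWT",
    {
        "prompt": "A medieval fantasy knight with armor",
        "api_key": "YOUR_RODIN_API_KEY",  # provider key travels in the body
        "provider": "rodin",
    },
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without network access or real credentials.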

API Documentation - full reference for all Pauhu APIs
Developer Docs - MCP server, CLI, container setup
Security - how we protect your data and API keys
Attributions - open-source licences