Technical Specification

Porpoise AI Training Pipeline Platform

No-code model fine-tuning with interactive knowledge capture

Technical Summary

Porpoise democratizes AI model training through three intuitive interfaces: an AI Interviewer with video avatars, a visual workflow builder, and a quick-train wizard. Organizations can fine-tune domain-specific models without data science expertise, reducing time-to-model from weeks to hours.

Core Capabilities

  • AI Interviewer: Interactive video avatars conduct knowledge capture interviews with 67% completion rate
  • Multi-Channel Invitations: Reach subject matter experts via Slack, Teams, Email, SMS, and audio calls
  • LoRA/QLoRA Fine-Tuning: Parameter-efficient training for 7B-70B models
  • Multi-Cloud Optimization: Automatic GPU selection across AWS, GCP, Azure for 40% cost savings
  • No-Code Workflows: Visual pipeline builder and 4-step wizard for business users
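The LoRA/QLoRA capability above rests on one idea: instead of updating all entries of a frozen weight matrix W, train two small low-rank factors B and A and apply W' = W + (alpha / r) * B @ A. A minimal pure-Python sketch of that arithmetic (matrix sizes are illustrative, not Porpoise defaults):

```python
# Illustration of the LoRA update rule: W' = W + (alpha / r) * B @ A.
# LoRA trains only B (d x r) and A (r x k), so trainable parameters
# drop from d*k to r*(d + k). All sizes here are toy examples.

def matmul(X, Y):
    """Plain-Python matrix multiply."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_merge(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A)."""
    scale = alpha / r
    BA = matmul(B, A)
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

d, k, r = 4, 4, 1                 # frozen weight is d x k, adapter rank r
W = [[0.0] * k for _ in range(d)]
B = [[1.0] for _ in range(d)]     # d x r
A = [[0.5, 0.5, 0.5, 0.5]]        # r x k
merged = lora_merge(W, A, B, alpha=2, r=r)
# Trainable params: r*(d + k) = 8, versus d*k = 16 for full fine-tuning.
```

At 7B-70B scale the same ratio is why adapter training fits on commodity GPUs: the frozen base weights are never updated, only the tiny factors.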

Training Efficiency

Model size      | Method           | Training cost | Time
7B parameters   | LoRA (3 epochs)  | $0.75         | 1-2 hours
13B parameters  | QLoRA (5 epochs) | $5.10         | 3-5 hours
70B parameters  | LoRA (3 epochs)  | $25.20        | 8-12 hours
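Costs in the table above reduce to GPU-hours times the spot hourly rate. A hedged sketch of that arithmetic (the rate and job shape below are hypothetical placeholders, not quoted prices):

```python
# Rough training-cost arithmetic behind tables like the one above:
# cost = spot hourly rate * number of GPUs * wall-clock hours.
# The rate and duration here are hypothetical examples.

def estimate_cost(hourly_rate_usd, num_gpus, hours):
    """Return estimated job cost in USD, rounded to cents."""
    return round(hourly_rate_usd * num_gpus * hours, 2)

# e.g. a small LoRA job on one spot GPU at $0.50/hr for 1.5 hours:
print(estimate_cost(0.50, 1, 1.5))   # 0.75
```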

Architecture Overview

┌─────────────────────────────────────────────────────────┐
│               PORPOISE TRAINING PLATFORM                │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐             │
│  │    AI    │  │  Visual  │  │  Quick   │             │
│  │Interview │  │ Workflow │  │  Wizard  │             │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘             │
│       │             │             │                     │
│       └─────────────┼─────────────┘                     │
│                     │                                   │
│  ┌──────────┐  ┌────▼─────┐  ┌──────────┐             │
│  │  Data    │  │ Training │  │  Multi   │             │
│  │ Pipeline │  │   Job    │  │  Cloud   │             │
│  └──────────┘  └──────────┘  └──────────┘             │
│                                                         │
└─────────────────────────────────────────────────────────┘
                       ▲
                       │
        ┌──────────────┴──────────────┐
        │    INTEGRATIONS             │
        ├─────────────────────────────┤
        │ • HeyGen (Avatars)          │
        │ • Twilio (SMS/Voice)        │
        │ • Slack/Teams (Echo)        │
        │ • MLflow (Tracking)         │
        └─────────────────────────────┘

Core Components

AI Interviewer

Interactive knowledge capture via video avatars

  • HeyGen avatar integration
  • Echo RAG pre-training
  • Natural conversation flow
  • 67% completion rate

Multi-Channel Invites

Reach experts through preferred channels

  • Slack/Teams via Echo
  • Email with calendar sync
  • SMS via Twilio
  • Audio calls (Phase 2)

Visual Workflow Builder

Drag-and-drop training pipeline configuration

  • Node-based interface
  • Data preprocessing
  • Template sharing
  • Real-time validation
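The core check behind real-time validation in a node-based builder is that the connected pipeline is a DAG, so every run has a defined execution order. A minimal sketch using Kahn's algorithm (node names are hypothetical examples, not Porpoise node types):

```python
# Minimal sketch of pipeline-graph validation: compute a topological
# execution order, or reject the graph if it contains a cycle.
# Node names are hypothetical examples.

def topo_order(nodes, edges):
    """Kahn's algorithm: return an execution order, or None on a cycle."""
    indegree = {n: 0 for n in nodes}
    for src, dst in edges:
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for src, dst in edges:
            if src == n:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return order if len(order) == len(nodes) else None

nodes = ["upload", "clean", "tokenize", "train"]
edges = [("upload", "clean"), ("clean", "tokenize"), ("tokenize", "train")]
print(topo_order(nodes, edges))            # ['upload', 'clean', 'tokenize', 'train']
print(topo_order(["a", "b"], [("a", "b"), ("b", "a")]))   # None: cycle rejected
```

Running this on every edit is what lets the builder flag an invalid pipeline before the user hits "train" rather than at job submission.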

Quick Train Wizard

4-step model training for business users

  • Upload CSV/JSON data
  • Model recommendations
  • Auto-tuned parameters
  • 27-minute median time
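A 4-step wizard like this is essentially a linear state machine: each step must complete before the next unlocks. A minimal sketch (step names and ordering are hypothetical, inferred from the list above):

```python
# Sketch of a linear 4-step wizard: each step unlocks only after the
# previous ones complete. Step names are hypothetical examples.

STEPS = ["upload_data", "pick_model", "review_params", "launch"]

def next_step(completed):
    """Return the first step not yet completed, or None when done."""
    for step in STEPS:
        if step not in completed:
            return step
    return None

done = set()
step = next_step(done)
while step is not None:
    # ...run the step's own validation here before marking it done...
    done.add(step)
    step = next_step(done)
print(len(done))   # 4: every wizard step completed
```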

Training Orchestration

Multi-cloud GPU job scheduling

  • Spot pricing optimization
  • Queue management
  • Auto-scaling GPUs
  • Cost estimation
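Spot-pricing optimization across clouds comes down to picking the cheapest offer that satisfies the job's constraints, chiefly VRAM. A hedged sketch (the GPU specs and prices below are hypothetical, not live quotes):

```python
# Sketch of multi-cloud spot selection: choose the cheapest GPU offer
# that meets the job's VRAM requirement. All specs and prices below
# are hypothetical placeholders, not live market data.

OFFERS = [
    {"cloud": "aws",   "gpu": "A10G", "vram_gb": 24, "spot_usd_hr": 0.55},
    {"cloud": "gcp",   "gpu": "L4",   "vram_gb": 24, "spot_usd_hr": 0.45},
    {"cloud": "azure", "gpu": "A100", "vram_gb": 80, "spot_usd_hr": 1.90},
]

def cheapest_offer(vram_needed_gb):
    """Return the lowest-priced offer with enough VRAM, or None."""
    viable = [o for o in OFFERS if o["vram_gb"] >= vram_needed_gb]
    return min(viable, key=lambda o: o["spot_usd_hr"]) if viable else None

print(cheapest_offer(24)["cloud"])   # gcp  (cheapest 24 GB option)
print(cheapest_offer(48)["gpu"])     # A100 (only offer with >= 48 GB)
```

In a real scheduler the same selection would also weigh queue depth, preemption risk, and data locality, but price-under-constraint is the core of the cost savings.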

Experiment Tracking

Complete training lifecycle management

  • MLflow integration
  • Hyperparameter logging
  • Model comparison
  • Artifact versioning
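The model-comparison step an experiment tracker enables is simple once hyperparameters and metrics are logged per run: rank runs by an eval metric and pick the best. A minimal sketch in plain Python (the run records below are fabricated for illustration; in the platform they would come from MLflow's run store):

```python
# Sketch of experiment comparison: given logged runs (hyperparameters
# plus an eval metric), select the best one. Run data is fabricated
# for illustration only.

runs = [
    {"run_id": "r1", "lr": 2e-4, "rank": 8,  "eval_loss": 1.42},
    {"run_id": "r2", "lr": 1e-4, "rank": 16, "eval_loss": 1.31},
    {"run_id": "r3", "lr": 5e-5, "rank": 16, "eval_loss": 1.38},
]

def best_run(runs, metric="eval_loss"):
    """Return the run with the lowest value of `metric`."""
    return min(runs, key=lambda r: r[metric])

print(best_run(runs)["run_id"])   # r2
```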

Technology Stack

Training

  • PyTorch 2.0+
  • Hugging Face PEFT
  • LoRA/QLoRA adapters
  • DeepSpeed optimization
  • bitsandbytes quantization
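For the QLoRA path, the bitsandbytes quantization listed above is typically wired in through a Hugging Face `BitsAndBytesConfig`. A hedged configuration sketch; the specific values (NF4, bf16 compute, double quantization) are common community defaults, not confirmed Porpoise settings:

```python
# Hedged sketch of a 4-bit QLoRA quantization config using the
# transformers / bitsandbytes APIs named in this stack. The chosen
# values are common defaults, not confirmed Porpoise settings.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store base weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls run in bf16
    bnb_4bit_use_double_quant=True,         # also quantize the quant constants
)
# Passed to model loading, e.g.:
# AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
```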

Infrastructure

  • Kubernetes orchestration
  • AWS/GCP/Azure GPUs
  • Ray for distributed training
  • MLflow tracking server
  • S3/GCS storage

Integrations

  • HeyGen API
  • Twilio SDK
  • Echo RAG platform
  • Slack/Teams OAuth
  • Blue Whale deployment