# Technical Architecture: AudiobookPipeline Web Platform

## Executive Summary

This document outlines the technical architecture for transforming the AudiobookPipeline CLI tool into a full-featured SaaS platform with a web interface, user management, and cloud infrastructure.

**Target Stack:** SolidStart + Turso (SQLite) + S3-compatible storage

---

## Current State Assessment

### Existing Assets

- **CLI Tool**: Mature Python pipeline with 8 stages (parser → analyzer → annotator → voices → segmentation → generation → assembly → validation)
- **TTS Models**: Qwen3-TTS-12Hz-1.7B (VoiceDesign + Base models)
- **Checkpoint System**: Resume capability for long-running jobs
- **Config System**: YAML-based configuration with overrides
- **Output Formats**: WAV + MP3 with loudness normalization

### Gaps to Address

1. No user authentication or multi-tenancy
2. No job queue or async processing
3. No API layer for web clients
4. No usage tracking or billing integration
5. CLI-only UX (no dashboard, history, or file management)

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
│  ┌────────────┐  ┌───────────┐  ┌────────────────────────┐  │
│  │    Web     │  │    CLI    │  │   REST API (public)    │  │
│  │    App     │  │(enhanced) │  │                        │  │
│  │(SolidStart)│  │           │  │ /api/jobs, /api/files  │  │
│  └────────────┘  └───────────┘  └────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                      API Gateway Layer                      │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                 SolidStart API Routes                 │  │
│  │  - Auth middleware (Clerk or custom JWT)              │  │
│  │  - Rate limiting + quota enforcement                  │  │
│  │  - Request validation (Zod)                           │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                        Service Layer                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────┐   │
│  │   Job    │  │   File   │  │   User   │  │  Billing   │   │
│  │ Service  │  │ Service  │  │ Service  │  │  Service   │   │
│  └──────────┘  └──────────┘  └──────────┘  └────────────┘   │
└─────────────────────────────────────────────────────────────┘
                               │
             ┌─────────────────┼─────────────────┐
             ▼                 ▼                 ▼
     ┌───────────────┐  ┌──────────────┐  ┌──────────────┐
     │     Turso     │  │      S3      │  │     GPU      │
     │   (SQLite)    │  │  (Storage)   │  │   Workers    │
     │               │  │              │  │  (TTS Jobs)  │
     │ - Users       │  │ - Uploads    │  │              │
     │ - Jobs        │  │ - Outputs    │  │ - Qwen3-TTS  │
     │ - Usage       │  │ - Models     │  │ - Assembly   │
     │- Subscriptions│  │              │  │              │
     └───────────────┘  └──────────────┘  └──────────────┘
```

---

## Technology Decisions

### Frontend: SolidStart

**Why SolidStart?**

- Lightweight, high-performance alternative to React-based frameworks
- Server-side rendering + static generation out of the box
- Built-in API routes (reduces the need for a separate backend)
- Excellent TypeScript support
- Smaller bundle sizes than Next.js

**Key Packages:**

```json
{
  "@solidjs/start": "^1.0.0",
  "solid-js": "^1.8.0",
  "@solidjs/router": "^0.14.0",
  "zod": "^3.22.0"
}
```

### Database: Turso (SQLite)

**Why Turso?**

- Serverless SQLite built on libSQL
- Edge-compatible (runs anywhere)
- Built-in replication and failover
- Free tier: 1GB storage, 1M reads/day
- Well suited to a SaaS with fewer than 10k users

**Schema Design:**

```sql
-- Users and auth
CREATE TABLE users (
  id TEXT PRIMARY KEY,
  email TEXT UNIQUE NOT NULL,
  stripe_customer_id TEXT,
  subscription_status TEXT DEFAULT 'free',
  credits INTEGER DEFAULT 0,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Processing jobs
CREATE TABLE jobs (
  id TEXT PRIMARY KEY,
  user_id TEXT REFERENCES users(id),
  status TEXT DEFAULT 'pending', -- pending, processing, completed, failed
  input_file_id TEXT,
  output_file_id TEXT,
  progress INTEGER DEFAULT 0,
  error_message TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  completed_at TIMESTAMP
);

-- File metadata (not the files themselves)
CREATE TABLE files (
  id TEXT PRIMARY KEY,
  user_id TEXT REFERENCES users(id),
  filename TEXT NOT NULL,
  s3_key TEXT UNIQUE NOT NULL,
  file_size INTEGER,
  mime_type TEXT,
  purpose TEXT, -- input, output, model
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Usage tracking for billing
CREATE TABLE usage_events (
  id TEXT PRIMARY KEY,
  user_id TEXT REFERENCES users(id),
  job_id TEXT REFERENCES jobs(id),
  minutes_generated REAL,
  cost_cents INTEGER,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

### Storage: S3-Compatible

**Why S3?**

- Industry standard for file storage
- Cheap (~$0.023/GB/month)
- CDN integration (CloudFront)
- Lifecycle policies for cleanup

**Use Cases:**

- User uploads (input ebooks)
- Generated audiobooks (output WAV/MP3)
- Model checkpoints (Qwen3-TTS weights)
- Processing logs

**Directory Structure:**

```
s3://audiobookpipeline-{env}/
├── uploads/{user_id}/{timestamp}_{filename}
├── outputs/{user_id}/{job_id}/
│   ├── audiobook.wav
│   ├── audiobook.mp3
│   └── metadata.json
├── models/
│   ├── qwen3-tts-voicedesign/
│   └── qwen3-tts-base/
└── logs/{date}/{job_id}.log
```

### GPU Workers: Serverless or Containerized

**Option A: AWS Lambda (with GPU via EKS)**

- Pros: Auto-scaling, pay-per-use
- Cons: Complex setup, cold starts

**Option B: RunPod / Lambda Labs**

- Pros: GPU-optimized, simple API
- Cons: Vendor lock-in

**Option C: Self-hosted on EC2 g4dn.xlarge**

- Pros: Full control, predictable pricing (~$0.75/hr)
- Cons: Manual scaling, always-on cost

**Recommendation:** Start with **Option C** (1-2 GPU instances) plus a job queue. Scale to serverless later.

---

## Core Components

### 1. Job Processing Pipeline

```python
# services/job_processor.py
class JobProcessor:
    """Processes audiobook generation jobs."""

    def __init__(self, db, file_service):
        # Both collaborators are injected; their interfaces are defined elsewhere.
        self.db = db
        self.file_service = file_service

    async def process_job(self, job_id: str) -> None:
        job = await self.db.get_job(job_id)

        try:
            # Download input file from S3
            input_path = await self.file_service.download(job.input_file_id)

            # Run pipeline stages with progress updates
            stages = [
                ("parsing", self.parse_ebook),
                ("analyzing", self.analyze_book),
                ("segmenting", self.segment_text),
                ("generating", self.generate_audio),
                ("assembling", self.assemble_audiobook),
            ]

            for stage_name, stage_func in stages:
                await self.update_progress(job_id, stage_name)
                await stage_func(input_path, job.config)

            # Upload output to S3
            output_file_id = await self.file_service.upload(
                job_id=job_id,
                files=["output.wav", "output.mp3"],
            )

            await self.db.complete_job(job_id, output_file_id)

        except Exception as e:
            # Record the failure on the job, then re-raise so the queue sees it
            await self.db.fail_job(job_id, str(e))
            raise
```

### 2. API Routes (SolidStart)

```typescript
// app/routes/api/jobs.ts
import { json } from "@solidjs/router";
import type { APIEvent } from "@solidjs/start/server";
import { z } from "zod";
// requireAuth, db, and jobQueue are application modules defined elsewhere.

const createJobSchema = z.object({
  fileId: z.string(),
  config: z
    .object({
      voices: z.object({
        narrator: z.string().optional(),
      }),
    })
    .optional(),
});

export async function POST(event: APIEvent) {
  const user = await requireAuth(event);
  const body = await event.request.json();

  // Reject malformed payloads with a 400 rather than an unhandled exception
  const parsed = createJobSchema.safeParse(body);
  if (!parsed.success) {
    return json({ error: "Invalid request body" }, { status: 400 });
  }
  const { fileId, config } = parsed.data;

  // Check quota
  const credits = await db.getUserCredits(user.id);
  if (credits < 1) {
    return json({ error: "Insufficient credits" }, { status: 402 });
  }

  // Create job
  const job = await db.createJob({
    userId: user.id,
    inputFileId: fileId,
    config,
  });

  // Queue for processing
  await jobQueue.add("process-audiobook", { jobId: job.id });

  return json({ job });
}
```

### 3. Dashboard UI

```tsx
// app/routes/dashboard.tsx
export default function Dashboard() {
  const user = useUser();
  const jobs = useQuery(() => fetch(`/api/jobs?userId=${user.id}`));

  return (