2026-03-09 09:21:48 -04:00
commit 22e4864b8e
82 changed files with 4587 additions and 0 deletions
--- a/technical-architecture.md
+++ b/technical-architecture.md
@@ -0,0 +1,462 @@
+# Technical Architecture: AudiobookPipeline Web Platform
+
+## Executive Summary
+
+This document outlines the technical architecture for transforming the AudiobookPipeline CLI tool into a full-featured SaaS platform with web interface, user management, and cloud infrastructure.
+
+**Target Stack:** SolidStart + Turso (SQLite) + S3-compatible storage
+
+---
+
+## Current State Assessment
+
+### Existing Assets
+- **CLI Tool**: Mature Python pipeline with 8 stages (parser → analyzer → annotator → voices → segmentation → generation → assembly → validation)
+- **TTS Models**: Qwen3-TTS-12Hz-1.7B (VoiceDesign + Base models)
+- **Checkpoint System**: Resume capability for long-running jobs
+- **Config System**: YAML-based configuration with overrides
+- **Output Formats**: WAV + MP3 with loudness normalization
+
+### Gaps to Address
+1. No user authentication or multi-tenancy
+2. No job queue or async processing
+3. No API layer for web clients
+4. No usage tracking or billing integration
+5. CLI-only UX (no dashboard, history, or file management)
+
+---
+
+## Architecture Overview
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                      Client Layer                           │
+│  ┌───────────┐  ┌───────────┐  ┌─────────────────────────┐  │
+│  │   Web     │  │   CLI     │  │   REST API (public)     │  │
+│  │  App      │  │ (enhanced)│  │                         │  │
+│  │ SolidStart│  │           │  │  /api/jobs, /api/files  │  │
+│  └───────────┘  └───────────┘  └─────────────────────────┘  │
+└─────────────────────────────────────────────────────────────┘
+                            │
+                            ▼
+┌─────────────────────────────────────────────────────────────┐
+│                   API Gateway Layer                         │
+│  ┌──────────────────────────────────────────────────────┐   │
+│  │              SolidStart API Routes                   │   │
+│  │  - Auth middleware (Clerk or custom JWT)            │   │
+│  │  - Rate limiting + quota enforcement                │   │
+│  │  - Request validation (Zod)                         │   │
+│  └──────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────┘
+                            │
+                            ▼
+┌─────────────────────────────────────────────────────────────┐
+│                    Service Layer                            │
+│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────┐  │
+│  │  Job     │  │   File   │  │   User   │  │   Billing  │  │
+│  │ Service  │  │  Service │  │  Service │  │  Service   │  │
+│  └──────────┘  └──────────┘  └──────────┘  └────────────┘  │
+└─────────────────────────────────────────────────────────────┘
+                            │
+              ┌─────────────┼─────────────┐
+              ▼             ▼             ▼
+┌───────────────┐  ┌──────────────┐  ┌──────────────┐
+│   Turso       │  │    S3        │  │   GPU        │
+│   (SQLite)    │  │  (Storage)   │  │  Workers     │
+│               │  │              │  │  (TTS Jobs)  │
+│ - Users       │  │ - Uploads    │  │              │
+│ - Jobs        │  │ - Outputs    │  │ - Qwen3-TTS  │
+│ - Usage       │  │ - Models     │  │ - Assembly   │
+│- Subscriptions│  │              │  │              │
+└───────────────┘  └──────────────┘  └──────────────┘
+```
+
+---
+
+## Technology Decisions
+
+### Frontend: SolidStart
+
+**Why SolidStart?**
+- Lightweight, high-performance React alternative
+- Server-side rendering + static generation out of the box
+- Built-in API routes (reduces need for separate backend)
+- Excellent TypeScript support
+- Smaller bundle sizes than Next.js
+
+**Key Packages:**
+```json
+{
+  "@solidjs/start": "^1.0.0",
+  "solid-js": "^1.8.0",
+  "@solidjs/router": "^0.14.0",
+  "zod": "^3.22.0"
+}
+```
+
+### Database: Turso (SQLite)
+
+**Why Turso?**
+- Serverless SQLite with libSQL
+- Edge-compatible (runs anywhere)
+- Built-in replication and failover
+- Free tier: 1GB storage, 1M reads/day
+- Well suited to an early-stage SaaS (fewer than roughly 10k users)
+
+**Schema Design:**
+```sql
+-- Users and auth
+CREATE TABLE users (
+  id TEXT PRIMARY KEY,
+  email TEXT UNIQUE NOT NULL,
+  stripe_customer_id TEXT,
+  subscription_status TEXT DEFAULT 'free',
+  credits INTEGER DEFAULT 0,
+  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+);
+
+-- Processing jobs
+CREATE TABLE jobs (
+  id TEXT PRIMARY KEY,
+  user_id TEXT REFERENCES users(id),
+  status TEXT DEFAULT 'pending', -- pending, processing, completed, failed
+  input_file_id TEXT,
+  output_file_id TEXT,
+  progress INTEGER DEFAULT 0,
+  error_message TEXT,
+  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+  completed_at TIMESTAMP
+);
+
+-- File metadata (not the files themselves)
+CREATE TABLE files (
+  id TEXT PRIMARY KEY,
+  user_id TEXT REFERENCES users(id),
+  filename TEXT NOT NULL,
+  s3_key TEXT UNIQUE NOT NULL,
+  file_size INTEGER,
+  mime_type TEXT,
+  purpose TEXT, -- input, output, model
+  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+);
+
+-- Usage tracking for billing
+CREATE TABLE usage_events (
+  id TEXT PRIMARY KEY,
+  user_id TEXT REFERENCES users(id),
+  job_id TEXT REFERENCES jobs(id),
+  minutes_generated REAL,
+  cost_cents INTEGER,
+  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+);
+```
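The comment on `jobs.status` implies a small state machine. A minimal sketch of the transitions it might allow, assuming jobs only move forward and that `completed` and `failed` are terminal; the `canTransition` helper and the transition table are illustrative and not part of the existing pipeline:

```typescript
// Status values taken from the comment on jobs.status in the schema above.
type JobStatus = "pending" | "processing" | "completed" | "failed";

// Assumed transition table: jobs only move forward, and terminal states never change.
const ALLOWED_TRANSITIONS: Record<JobStatus, readonly JobStatus[]> = {
  pending: ["processing", "failed"],
  processing: ["completed", "failed"],
  completed: [],
  failed: [],
};

// Hypothetical helper: true when moving from `from` to `to` is permitted.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```

A service could call `canTransition` before each status update so that a late worker callback cannot move a failed job back to `processing`.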
+
+### Storage: S3-Compatible
+
+**Why S3?**
+- Industry standard for file storage
+- Cheap (~$0.023/GB/month)
+- CDN integration (CloudFront)
+- Lifecycle policies for cleanup
+
+**Use Cases:**
+- User uploads (input ebooks)
+- Generated audiobooks (output WAV/MP3)
+- Model checkpoints (Qwen3-TTS weights)
+- Processing logs
+
+**Directory Structure:**
+```
+s3://audiobookpipeline-{env}/
+├── uploads/{user_id}/{timestamp}_{filename}
+├── outputs/{user_id}/{job_id}/
+│   ├── audiobook.wav
+│   ├── audiobook.mp3
+│   └── metadata.json
+├── models/
+│   ├── qwen3-tts-voicedesign/
+│   └── qwen3-tts-base/
+└── logs/{date}/{job_id}.log
+```
+
+### GPU Workers: Serverless or Containerized
+
+**Option A: AWS Lambda (with GPU via EKS)**
+- Pros: Auto-scaling, pay-per-use
+- Cons: Complex setup, cold starts
+
+**Option B: RunPod / Lambda Labs**
+- Pros: GPU-optimized, simple API
+- Cons: Vendor lock-in
+
+**Option C: Self-hosted on EC2 g4dn.xlarge**
+- Pros: Full control, predictable pricing (~$0.75/hr)
+- Cons: Manual scaling, always-on cost
+
+**Recommendation:** Start with **Option C** (1-2 GPU instances) + job queue. Scale to serverless later.
+
+---
+
+## Core Components
+
+### 1. Job Processing Pipeline
+
+```python
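+# services/job_processor.py
+class JobProcessor:
+    """Processes audiobook generation jobs."""
+
+    def __init__(self, db, file_service):
+        # Injected dependencies; their interfaces are assumed by this sketch
+        self.db = db
+        self.file_service = file_service
+
+    async def process_job(self, job_id: str) -> None: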
+        job = await self.db.get_job(job_id)
+        
+        try:
+            # Download input file from S3
+            input_path = await self.file_service.download(job.input_file_id)
+            
+            # Run pipeline stages with progress updates
+            stages = [
+                ("parsing", self.parse_ebook),
+                ("analyzing", self.analyze_book),
+                ("segmenting", self.segment_text),
+                ("generating", self.generate_audio),
+                ("assembling", self.assemble_audiobook),
+            ]
+            
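+            artifact = input_path
+            for stage_name, stage_func in stages:
+                await self.update_progress(job_id, stage_name)
+                # Each stage consumes the previous stage's output (assumed contract)
+                artifact = await stage_func(artifact, job.config)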
+            
+            # Upload output to S3
+            output_file_id = await self.file_service.upload(
+                job_id=job_id,
+                files=["audiobook.wav", "audiobook.mp3"]
+            )
+            
+            await self.db.complete_job(job_id, output_file_id)
+            
+        except Exception as e:
+            await self.db.fail_job(job_id, str(e))
+            raise
+```
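The processor above reports progress by stage name, while `jobs.progress` is an integer column. One way to bridge the two on the API side is a fixed mapping from stage to percentage. This sketch assumes every stage is weighted equally and that a stage maps to the percentage complete when it starts; `progressFor` is a hypothetical helper.

```typescript
// Stage names as reported by the job processor above.
const STAGES = ["parsing", "analyzing", "segmenting", "generating", "assembling"] as const;
type Stage = (typeof STAGES)[number];

// Assumption: equal weight per stage. The value is the percentage complete
// when the stage starts; 100 is reserved for the completed status.
function progressFor(stage: Stage): number {
  const index = STAGES.indexOf(stage);
  return Math.floor((index * 100) / STAGES.length);
}
```

In practice the `generating` stage dominates runtime, so a weighted table would likely track wall-clock time more closely than equal steps.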
+
+### 2. API Routes (SolidStart)
+
+```typescript
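+// src/routes/api/jobs.ts
+import type { APIEvent } from "@solidjs/start/server";
+import { z } from "zod";
+// Assumed project-local modules; names and paths are illustrative
+import { requireAuth } from "~/lib/auth";
+import { db } from "~/lib/db";
+import { jobQueue } from "~/lib/queue";
+
+const createJobSchema = z.object({
+  fileId: z.string(),
+  config: z.object({
+    voices: z.object({
+      narrator: z.string().optional(),
+    }),
+  }).optional(),
+});
+
+export async function POST(event: APIEvent) {
+  const user = await requireAuth(event);
+
+  // Reject malformed bodies with a 400 instead of letting parse() throw
+  const parsed = createJobSchema.safeParse(await event.request.json());
+  if (!parsed.success) {
+    return Response.json({ error: parsed.error.flatten() }, { status: 400 });
+  }
+  const { fileId, config } = parsed.data;
+
+  // Check quota
+  const credits = await db.getUserCredits(user.id);
+  if (credits < 1) {
+    return Response.json({ error: "Insufficient credits" }, { status: 402 });
+  }
+
+  // Create job
+  const job = await db.createJob({
+    userId: user.id,
+    inputFileId: fileId,
+    config,
+  });
+
+  // Queue for processing
+  await jobQueue.add("process-audiobook", { jobId: job.id });
+
+  return Response.json({ job });
+}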
+```
+
+### 3. Dashboard UI
+
+```tsx
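+// src/routes/dashboard.tsx
+import { createResource, Show } from "solid-js";
+// Assumed project-local modules; names and paths are illustrative
+import { useUser } from "~/lib/auth";
+import { StatsCard, UploadButton, JobList } from "~/components";
+
+// The API derives the user from the session, so no userId is sent by the client
+async function fetchJobs() {
+  const res = await fetch("/api/jobs");
+  if (!res.ok) throw new Error(`Failed to load jobs: ${res.status}`);
+  return res.json();
+}
+
+export default function Dashboard() {
+  const user = useUser();
+  const [jobs] = createResource(fetchJobs);
+
+  return (
+    <div class="dashboard">
+      <h1>Audiobook Pipeline</h1>
+
+      <Show when={jobs()} fallback={<p>Loading jobs...</p>}>
+        {(list) => (
+          <>
+            <StatsCard credits={user.credits} booksGenerated={list().length} />
+            <UploadButton />
+            <JobList jobs={list()} />
+          </>
+        )}
+      </Show>
+    </div>
+  );
+}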
+```
+
+---
+
+## Security Considerations
+
+### Authentication
+- **Option 1:** Clerk (fastest to implement, $0-25/mo)
+- **Option 2:** Custom JWT with email magic links
+- **Recommendation:** Clerk for MVP
+
+### Authorization
+- Row-level access control enforced in application queries (SQLite/Turso has no built-in row-level security)
+- S3 pre-signed URLs with expiration
+- API rate limiting per user
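Per-user rate limiting can be sketched as a fixed-window counter. The class below is illustrative and keeps its state in process memory, which assumes a single API instance; a multi-instance deployment would need shared state, for example in the Redis instance already planned for the job queue.

```typescript
// Fixed-window rate limiter keyed by user ID (hypothetical helper, in-memory only).
class RateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(userId: string, now: number = Date.now()): boolean {
    const current = this.windows.get(userId);
    // Start a fresh window on first use or once the previous window has expired.
    if (!current || now - current.start >= this.windowMs) {
      this.windows.set(userId, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

With `new RateLimiter(2, 1000)`, a user's third request inside the same second is rejected, and the counter resets once the window has elapsed.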
+
+### Data Isolation
+- All S3 keys include `user_id` prefix
+- Database queries always filter by `user_id`
+- GPU workers validate job ownership
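The `user_id` prefix rule lends itself to a simple guard that runs before a pre-signed URL is issued. This is a sketch: `userOwnsKey` is a hypothetical helper, and it assumes only the `uploads/` and `outputs/` prefixes are user-scoped, as in the bucket layout described earlier.

```typescript
// Prefixes that carry a user_id segment in the bucket layout.
const USER_SCOPED_PREFIXES = ["uploads", "outputs"] as const;

// True only when `key` sits under one of the given user's own prefixes.
function userOwnsKey(userId: string, key: string): boolean {
  // Reject IDs that could alter the prefix, and keys with parent-directory segments.
  if (userId.length === 0 || userId.includes("/")) return false;
  if (key.split("/").includes("..")) return false;
  return USER_SCOPED_PREFIXES.some((prefix) => key.startsWith(`${prefix}/${userId}/`));
}
```

The trailing slash in the prefix check matters: without it, user `u1` would match keys that belong to `u12`.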
+
+---
+
+## Deployment Architecture
+
+### Development
+```bash
+# Local setup
+npm run dev # SolidStart dev server
+turso dev   # Local SQLite
+minio server ./data  # Local S3-compatible storage
+```
+
+### Production (Vercel + Turso)
+```
+┌─────────────┐     ┌──────────────┐     ┌──────────┐
+│   Vercel    │────▶│    Turso     │     │    S3    │
+│ (SolidStart)│     │  (Database)  │     │(Storage) │
+└─────────────┘     └──────────────┘     └──────────┘
+       │
+       ▼
+┌─────────────┐
+│  GPU Fleet  │
+│  (Workers)  │
+└─────────────┘
+```
+
+### CI/CD Pipeline
+```yaml
+# .github/workflows/deploy.yml
+name: Deploy
+on:
+  push:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - run: npm ci
+      - run: npm test
+      
+  deploy:
+    needs: test
+    runs-on: ubuntu-latest
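+    steps:
+      - uses: actions/checkout@v4
+      # Deploy with the Vercel CLI; assumes VERCEL_ORG_ID and VERCEL_PROJECT_ID are set as well
+      - run: npx vercel deploy --prod --token=${{ secrets.VERCEL_TOKEN }}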
+```
+
+---
+
+## MVP Implementation Plan
+
+### Phase 1: Foundation (Week 1-2)
+- [ ] Set up SolidStart project structure
+- [ ] Integrate Turso database
+- [ ] Implement user auth (Clerk)
+- [ ] Create file upload endpoint (S3)
+- [ ] Build basic dashboard UI
+
+### Phase 2: Pipeline Integration (Week 2-3)
+- [ ] Containerize existing Python pipeline
+- [ ] Set up job queue (BullMQ on Redis)
+- [ ] Implement job processor service
+- [ ] Add progress tracking API
+- [ ] Connect GPU workers
+
+### Phase 3: User Experience (Week 3-4)
+- [ ] Job history UI with status indicators
+- [ ] Audio player for preview/download
+- [ ] Usage dashboard + credit system
+- [ ] Stripe integration for payments
+- [ ] Email notifications on job completion
+
+---
+
+## Cost Analysis
+
+### Infrastructure Costs (Monthly)
+
+| Component | Tier | Cost |
+|-----------|------|------|
+| Vercel | Pro | $20/mo |
+| Turso | Free tier | $0/mo (<1M reads/day) |
+| S3 Storage | 1TB | $23/mo |
+| GPU (g4dn.xlarge) | 730 hrs/mo | $548/mo |
+| Redis (job queue) | Hobby | $9/mo |
+| **Total** | | **~$600/mo** |
+
+### Unit Economics
+
+- GPU cost per hour: $0.75
+- Average book processing time: 2 hours (30k words)
+- Cost per book: ~$1.50 (GPU only)
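+- Price: $39/mo subscription (unlimited, subject to fair use)
+- **Gross margin: ~96% at one book per subscriber per month** (GPU cost only; the margin falls as usage rises and excludes the fixed infrastructure costs above)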
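The arithmetic behind these figures can be made explicit. The sketch below uses only the numbers stated in this section and counts GPU time as the sole variable cost, which is a simplification; the function names are illustrative.

```typescript
const GPU_COST_PER_HOUR = 0.75; // USD, from this section
const HOURS_PER_BOOK = 2;       // average processing time, from this section
const SUBSCRIPTION_PRICE = 39;  // USD per month, from this section

// GPU cost of producing one book.
function gpuCostPerBook(): number {
  return GPU_COST_PER_HOUR * HOURS_PER_BOOK;
}

// Gross margin for one subscriber who generates `booksPerMonth` books (GPU cost only).
function grossMargin(booksPerMonth: number): number {
  const cost = booksPerMonth * gpuCostPerBook();
  return (SUBSCRIPTION_PRICE - cost) / SUBSCRIPTION_PRICE;
}

// Number of books per month at which GPU cost equals the subscription price.
function breakEvenBooks(): number {
  return SUBSCRIPTION_PRICE / gpuCostPerBook();
}
```

At one book per month the margin is about 96%; at ten books it drops to about 62%, and a subscriber reaches break-even at 26 books per month, which is where a fair-use limit would need to sit at the latest.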
+
+---
+
+## Next Steps
+
+1. **Immediate:** Set up SolidStart + Turso scaffolding
+2. **This Week:** Implement auth + file upload
+3. **Next Week:** Containerize Python pipeline + job queue
+4. **Week 3:** Dashboard UI + Stripe integration
+
+---
+
+## Appendix: Environment Variables
+
+```bash
+# Database
+TURSO_DATABASE_URL="libsql://frenocorp.turso.io"
+TURSO_AUTH_TOKEN="..."
+
+# Storage
+AWS_ACCESS_KEY_ID="..."
+AWS_SECRET_ACCESS_KEY="..."
+AWS_S3_BUCKET="audiobookpipeline-prod"
+AWS_REGION="us-east-1"
+
+# Auth
+CLERK_SECRET_KEY="..."
+VITE_CLERK_PUBLISHABLE_KEY="..."  # SolidStart builds with Vite, which exposes only VITE_-prefixed variables to the client
+
+# Billing
+STRIPE_SECRET_KEY="..."
+STRIPE_WEBHOOK_SECRET="..."
+
+# GPU Workers
+GPU_WORKER_ENDPOINT="https://workers.audiobookpipeline.com"
+GPU_API_KEY="..."
+```
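A startup check can catch missing configuration before the first request fails. This is a sketch: `missingEnvVars` is a hypothetical helper, and the list covers only the server-side variables above (the client-side Clerk publishable key is left out on purpose).

```typescript
// Server-side variables listed in this appendix.
const REQUIRED_VARS = [
  "TURSO_DATABASE_URL",
  "TURSO_AUTH_TOKEN",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_S3_BUCKET",
  "AWS_REGION",
  "CLERK_SECRET_KEY",
  "STRIPE_SECRET_KEY",
  "STRIPE_WEBHOOK_SECRET",
  "GPU_WORKER_ENDPOINT",
  "GPU_API_KEY",
] as const;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

A server entry point could call `missingEnvVars(process.env)` once at boot and exit with the returned list in the error message if it is non-empty.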