# Technical Architecture: AudiobookPipeline Web Platform

## Executive Summary

This document outlines the technical architecture for transforming the AudiobookPipeline CLI tool into a full-featured SaaS platform with a web interface, user management, and cloud infrastructure.

**Target Stack:** SolidStart + Turso (SQLite) + S3-compatible storage

---

## Current State Assessment

### Existing Assets

- **CLI Tool**: Mature Python pipeline with 8 stages (parser → analyzer → annotator → voices → segmentation → generation → assembly → validation)
- **TTS Models**: Qwen3-TTS-12Hz-1.7B (VoiceDesign + Base models)
- **Checkpoint System**: Resume capability for long-running jobs
- **Config System**: YAML-based configuration with overrides
- **Output Formats**: WAV + MP3 with loudness normalization

### Gaps to Address

1. No user authentication or multi-tenancy
2. No job queue or async processing
3. No API layer for web clients
4. No usage tracking or billing integration
5. CLI-only UX (no dashboard, history, or file management)

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
│  ┌────────────┐  ┌───────────┐  ┌─────────────────────────┐ │
│  │    Web     │  │    CLI    │  │   REST API (public)     │ │
│  │    App     │  │ (enhanced)│  │                         │ │
│  │(SolidStart)│  │           │  │  /api/jobs, /api/files  │ │
│  └────────────┘  └───────────┘  └─────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      API Gateway Layer                      │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  SolidStart API Routes                               │   │
│  │  - Auth middleware (Clerk or custom JWT)             │   │
│  │  - Rate limiting + quota enforcement                 │   │
│  │  - Request validation (Zod)                          │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                        Service Layer                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────┐  │
│  │   Job    │  │   File   │  │   User   │  │  Billing   │  │
│  │ Service  │  │ Service  │  │ Service  │  │  Service   │  │
│  └──────────┘  └──────────┘  └──────────┘  └────────────┘  │
└─────────────────────────────────────────────────────────────┘
                              │
               ┌──────────────┼──────────────┐
               ▼              ▼              ▼
     ┌─────────────────┐  ┌──────────────┐  ┌──────────────┐
     │      Turso      │  │      S3      │  │     GPU      │
     │    (SQLite)     │  │  (Storage)   │  │   Workers    │
     │                 │  │              │  │  (TTS Jobs)  │
     │ - Users         │  │ - Uploads    │  │              │
     │ - Jobs          │  │ - Outputs    │  │ - Qwen3-TTS  │
     │ - Usage         │  │ - Models     │  │ - Assembly   │
     │ - Subscriptions │  │              │  │              │
     └─────────────────┘  └──────────────┘  └──────────────┘
```

---

## Technology Decisions

### Frontend: SolidStart

**Why SolidStart?**

- Built on Solid, a lightweight, high-performance alternative to React
- Server-side rendering + static generation out of the box
- Built-in API routes (reduces the need for a separate backend)
- Excellent TypeScript support
- Smaller bundle sizes than Next.js

**Key Packages:**

```json
{
  "@solidjs/start": "^1.0.0",
  "solid-js": "^1.8.0",
  "@solidjs/router": "^0.14.0",
  "zod": "^3.22.0"
}
```

### Database: Turso (SQLite)

**Why Turso?**

- Serverless SQLite built on libSQL
- Edge-compatible (runs anywhere)
- Built-in replication and failover
- Free tier: 1GB storage, 1M reads/day
- A good fit for a SaaS with <10k users

**Schema Design:**

```sql
-- Users and auth
CREATE TABLE users (
  id TEXT PRIMARY KEY,
  email TEXT UNIQUE NOT NULL,
  stripe_customer_id TEXT,
  subscription_status TEXT DEFAULT 'free',
  credits INTEGER DEFAULT 0,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Processing jobs
CREATE TABLE jobs (
  id TEXT PRIMARY KEY,
  user_id TEXT REFERENCES users(id),
  status TEXT DEFAULT 'pending',  -- pending, processing, completed, failed
  input_file_id TEXT,
  output_file_id TEXT,
  progress INTEGER DEFAULT 0,
  error_message TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  completed_at TIMESTAMP
);

-- File metadata (not the files themselves)
CREATE TABLE files (
  id TEXT PRIMARY KEY,
  user_id TEXT REFERENCES users(id),
  filename TEXT NOT
NULL,
  s3_key TEXT UNIQUE NOT NULL,
  file_size INTEGER,
  mime_type TEXT,
  purpose TEXT,  -- input, output, model
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Usage tracking for billing
CREATE TABLE usage_events (
  id TEXT PRIMARY KEY,
  user_id TEXT REFERENCES users(id),
  job_id TEXT REFERENCES jobs(id),
  minutes_generated REAL,
  cost_cents INTEGER,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

### Storage: S3-Compatible

**Why S3?**

- Industry standard for file storage
- Cheap (~$0.023/GB/month)
- CDN integration (CloudFront)
- Lifecycle policies for cleanup

**Use Cases:**

- User uploads (input ebooks)
- Generated audiobooks (output WAV/MP3)
- Model checkpoints (Qwen3-TTS weights)
- Processing logs

**Directory Structure:**

```
s3://audiobookpipeline-{env}/
├── uploads/{user_id}/{timestamp}_{filename}
├── outputs/{user_id}/{job_id}/
│   ├── audiobook.wav
│   ├── audiobook.mp3
│   └── metadata.json
├── models/
│   ├── qwen3-tts-voicedesign/
│   └── qwen3-tts-base/
└── logs/{date}/{job_id}.log
```

### GPU Workers: Serverless or Containerized

**Option A: AWS EKS with GPU node groups** (Lambda itself does not offer GPUs)

- Pros: Auto-scaling, pay-per-use
- Cons: Complex setup, cold starts

**Option B: RunPod / Lambda Labs**

- Pros: GPU-optimized, simple API
- Cons: Vendor lock-in

**Option C: Self-hosted on EC2 g4dn.xlarge**

- Pros: Full control, predictable pricing (~$0.75/hr)
- Cons: Manual scaling, always-on cost

**Recommendation:** Start with **Option C** (1-2 GPU instances) + a job queue. Scale to serverless later.

---

## Core Components

### 1. Job Processing Pipeline

```python
# services/job_processor.py
class JobProcessor:
    """Processes audiobook generation jobs."""

    async def process_job(self, job_id: str) -> None:
        job = await self.db.get_job(job_id)
        try:
            # Download the input file from S3
            input_path = await self.file_service.download(job.input_file_id)

            # Run pipeline stages with progress updates
            stages = [
                ("parsing", self.parse_ebook),
                ("analyzing", self.analyze_book),
                ("segmenting", self.segment_text),
                ("generating", self.generate_audio),
                ("assembling", self.assemble_audiobook),
            ]
            for stage_name, stage_func in stages:
                await self.update_progress(job_id, stage_name)
                await stage_func(input_path, job.config)

            # Upload outputs to S3
            output_file_id = await self.file_service.upload(
                job_id=job_id,
                files=["output.wav", "output.mp3"],
            )
            await self.db.complete_job(job_id, output_file_id)
        except Exception as e:
            await self.db.fail_job(job_id, str(e))
            raise
```

### 2. API Routes (SolidStart)

```typescript
// app/routes/api/jobs.ts
import { z } from "zod";
import type { APIEvent } from "@solidjs/start/server";

export async function POST(event: APIEvent) {
  const user = await requireAuth(event);
  const body = await event.request.json();

  const schema = z.object({
    fileId: z.string(),
    config: z
      .object({
        voices: z.object({
          narrator: z.string().optional(),
        }),
      })
      .optional(),
  });
  const { fileId, config } = schema.parse(body);

  // Check quota
  const credits = await db.getUserCredits(user.id);
  if (credits < 1) {
    return Response.json({ error: "Insufficient credits" }, { status: 402 });
  }

  // Create job
  const job = await db.createJob({
    userId: user.id,
    inputFileId: fileId,
    config,
  });

  // Queue for processing
  await jobQueue.add("process-audiobook", { jobId: job.id });

  return Response.json({ job });
}
```

### 3. Dashboard UI

```tsx
// app/routes/dashboard.tsx
import { createResource } from "solid-js";

export default function Dashboard() {
  const user = useUser();
  const [jobs] = createResource(() =>
    fetch(`/api/jobs?userId=${user.id}`).then((res) => res.json())
  );

  return (
    <main>
      <h1>Audiobook Pipeline</h1>
      {/* Job list rendering from jobs() omitted in this sketch */}
    </main>
  );
}
```

---

## Security Considerations

### Authentication

- **Option 1:** Clerk (fastest to implement, $0-25/mo)
- **Option 2:** Custom JWT with email magic links
- **Recommendation:** Clerk for the MVP

### Authorization

- Per-user row filtering in every Turso query
- S3 pre-signed URLs with expiration
- API rate limiting per user

### Data Isolation

- All S3 keys include a `user_id` prefix
- Database queries always filter by `user_id`
- GPU workers validate job ownership

---

## Deployment Architecture

### Development

```bash
# Local setup
npm run dev          # SolidStart dev server
turso dev            # Local SQLite
minio server ./data  # Local S3-compatible storage
```

### Production (Vercel + Turso)

```
┌─────────────┐     ┌──────────────┐     ┌──────────┐
│   Vercel    │────▶│    Turso     │     │    S3    │
│ (SolidStart)│     │  (Database)  │     │(Storage) │
└─────────────┘     └──────────────┘     └──────────┘
       │
       ▼
┌─────────────┐
│  GPU Fleet  │
│  (Workers)  │
└─────────────┘
```

### CI/CD Pipeline

```yaml
# .github/workflows/deploy.yml
name: Deploy
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test

  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: vercel/actions@v2
        with:
          token: ${{ secrets.VERCEL_TOKEN }}
```

---

## MVP Implementation Plan

### Phase 1: Foundation (Weeks 1-2)

- [ ] Set up SolidStart project structure
- [ ] Integrate Turso database
- [ ] Implement user auth (Clerk)
- [ ] Create file upload endpoint (S3)
- [ ] Build basic dashboard UI

### Phase 2: Pipeline Integration (Weeks 2-3)

- [ ] Containerize the existing Python pipeline
- [ ] Set up a job queue (BullMQ on Redis)
- [ ] Implement the job processor service
- [ ] Add progress tracking API
- [ ] Connect GPU workers

### Phase 3: User Experience (Weeks 3-4)

- [ ] Job history UI with status indicators
- [ ] Audio player for preview/download
- [ ] Usage dashboard + credit system
- [ ] Stripe integration for payments
- [ ] Email notifications on job completion

---

## Cost Analysis

### Infrastructure Costs (Monthly)
| Component | Tier | Cost |
|-----------|------|------|
| Vercel | Pro | $20/mo |
| Turso | Free tier | $0/mo (<1M reads/day) |
| S3 Storage | 1TB | $23/mo |
| GPU (g4dn.xlarge) | 730 hrs/mo | $548/mo |
| Redis (job queue) | Hobby | $9/mo |
| **Total** | | **~$600/mo** |

### Unit Economics

- GPU cost per hour: $0.75
- Average book processing time: 2 hours (~30k words)
- Cost per book: ~$1.50 (GPU only)
- Pricing: $39/mo subscription (unlimited, subject to fair use)
- **Gross margin: >95%** ($1.50 in GPU cost against $39 in revenue, assuming roughly one book per subscriber per month)

---

## Next Steps

1. **Immediate:** Set up SolidStart + Turso scaffolding
2. **This Week:** Implement auth + file upload
3. **Next Week:** Containerize the Python pipeline + job queue
4. **Week 3:** Dashboard UI + Stripe integration

---

## Appendix: Environment Variables

```bash
# Database
TURSO_DATABASE_URL="libsql://frenocorp.turso.io"
TURSO_AUTH_TOKEN="..."

# Storage
AWS_ACCESS_KEY_ID="..."
AWS_SECRET_ACCESS_KEY="..."
AWS_S3_BUCKET="audiobookpipeline-prod"
AWS_REGION="us-east-1"

# Auth
CLERK_SECRET_KEY="..."
VITE_CLERK_PUBLISHABLE_KEY="..."

# Billing
STRIPE_SECRET_KEY="..."
STRIPE_WEBHOOK_SECRET="..."

# GPU Workers
GPU_WORKER_ENDPOINT="https://workers.audiobookpipeline.com"
GPU_API_KEY="..."
```
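## Appendix: Key Layout Sketch

The data-isolation rules above (every S3 key carries a `user_id` prefix, and workers validate ownership before touching a key) can be sketched as a small helper. This is a sketch under assumptions: the function names (`buildUploadKey`, `buildOutputKey`, `ownsKey`) and the exact timestamp format are illustrative, not part of the existing codebase; only the key layout itself comes from the Storage section.

```typescript
// storage/keys.ts
// Illustrative helpers mirroring the S3 directory structure:
//   uploads/{user_id}/{timestamp}_{filename}
//   outputs/{user_id}/{job_id}/{artifact}

export function buildUploadKey(
  userId: string,
  filename: string,
  now: Date = new Date()
): string {
  // ISO timestamp with ":" and "." replaced so the key stays path-friendly
  const timestamp = now.toISOString().replace(/[:.]/g, "-");
  return `uploads/${userId}/${timestamp}_${filename}`;
}

export function buildOutputKey(
  userId: string,
  jobId: string,
  artifact: "audiobook.wav" | "audiobook.mp3" | "metadata.json"
): string {
  return `outputs/${userId}/${jobId}/${artifact}`;
}

// Data isolation: a user may only touch keys under their own prefix.
export function ownsKey(userId: string, key: string): boolean {
  return (
    key.startsWith(`uploads/${userId}/`) ||
    key.startsWith(`outputs/${userId}/`)
  );
}
```

Centralizing key construction like this means the `user_id` prefix rule is enforced in one place rather than re-derived in every service.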
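## Appendix: Unit Economics Check

The unit-economics bullets in the Cost Analysis section reduce to one line of arithmetic. The sketch below (the function name is illustrative, not existing code) makes the implicit assumption explicit: the quoted margin treats GPU time as the only cost of goods and assumes roughly one generated book per subscriber per month.

```typescript
// Figures taken from the Cost Analysis section.
const GPU_HOURLY_USD = 0.75; // g4dn.xlarge, per the GPU worker options
const HOURS_PER_BOOK = 2;    // ~30k-word book

// Gross margin (%) for a given per-book revenue, counting GPU time only.
export function grossMarginPct(revenuePerBookUsd: number): number {
  const costPerBook = GPU_HOURLY_USD * HOURS_PER_BOOK; // $1.50 per book
  return (100 * (revenuePerBookUsd - costPerBook)) / revenuePerBookUsd;
}
```

At $39/mo and one book per month, `grossMarginPct(39)` is about 96.2, consistent with the ">95%" claim; at four books per month it drops to about 84.6.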