FrenoCorp/tasks/FRE-17.yaml
2026-03-09 09:21:48 -04:00


---
date: 2026-03-08
day_of_week: Sunday
task_id: FRE-17
title: Add Memory-Efficient Model Loading
status: todo
company_id: FrenoCorp
objective: Implement gradient checkpointing and mixed precision for lower VRAM usage
context: |
  - Qwen3-TTS 1.7B may not fit on low-end GPUs
  - Gradient checkpointing trades extra compute for lower activation memory
  - Mixed precision (FP16) roughly halves memory use
issue_type: enhancement
priority: medium
assignee: Atlas
parent_task: FRE-32
goal_id: MVP_Pipeline_Working
blocking_tasks: []
expected_outcome: |
  - Model runs on GPUs with <8GB VRAM
  - Configurable precision (FP32/FP16/BF16)
  - Graceful degradation when memory is insufficient
acceptance_criteria:
  - FP16 mode reduces memory usage by ~50%
  - Gradient checkpointing option available
  - Clear error when memory is still insufficient
notes:
  - Use torch.cuda.amp for mixed precision
  - Set gradient_checkpointing=True in the model config
links:
  tts_model: /home/mike/code/AudiobookPipeline/src/generation/tts_model.py
---
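The notes and acceptance criteria above could be sketched roughly as follows. This is an illustrative sketch only: the `load_tts_model` helper, the Hugging Face-style `model_cls.from_pretrained(..., torch_dtype=...)` call, and the error wording are assumptions for this ticket, not the actual code in `tts_model.py`.

```python
import torch

# Precision names from the ticket's acceptance criteria, mapped to torch dtypes.
PRECISION_DTYPES = {
    "fp32": torch.float32,
    "fp16": torch.float16,
    "bf16": torch.bfloat16,
}

def resolve_dtype(precision: str) -> torch.dtype:
    """Return the torch dtype for a precision name; raise a clear error otherwise."""
    try:
        return PRECISION_DTYPES[precision.lower()]
    except KeyError:
        raise ValueError(
            f"unsupported precision {precision!r}; expected one of {sorted(PRECISION_DTYPES)}"
        ) from None

def load_tts_model(model_cls, checkpoint, precision="fp16", gradient_checkpointing=True):
    """Hypothetical loader: cast weights to the requested dtype and enable
    gradient checkpointing when the model supports it (HF-style API assumed)."""
    dtype = resolve_dtype(precision)
    try:
        model = model_cls.from_pretrained(checkpoint, torch_dtype=dtype)
        if gradient_checkpointing and hasattr(model, "gradient_checkpointing_enable"):
            # Trades recomputation during backward for lower activation memory.
            model.gradient_checkpointing_enable()
        return model
    except torch.cuda.OutOfMemoryError as exc:
        # Covers the "clear error when memory is still insufficient" criterion.
        raise RuntimeError(
            f"{checkpoint} does not fit in GPU memory even at {precision}; "
            "try fp16/bf16, gradient checkpointing, or a smaller batch"
        ) from exc
```

For training-time mixed precision per the notes, the forward pass would additionally be wrapped in `torch.cuda.amp.autocast()` with losses scaled via `torch.cuda.amp.GradScaler`.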