---
date: 2026-03-08
day_of_week: Sunday
task_id: FRE-17
title: Add Memory-Efficient Model Loading
status: todo
company_id: FrenoCorp
objective: Implement gradient checkpointing and mixed precision for lower VRAM usage
context: |
  - Qwen3-TTS 1.7B may not fit in low-end GPUs
  - Gradient checkpointing trades compute for memory
  - Mixed precision (FP16) reduces memory by half
issue_type: enhancement
priority: medium
assignee: Atlas
parent_task: FRE-32
goal_id: MVP_Pipeline_Working
blocking_tasks: []
expected_outcome: |
  - Model runs on GPUs with <8GB VRAM
  - Configurable precision (FP32/FP16/BF16)
  - Graceful degradation when memory insufficient
acceptance_criteria:
  - FP16 mode reduces memory usage by ~50%
  - Gradient checkpointing option available
  - Clear error when memory still insufficient
notes:
  - Use torch.cuda.amp for mixed precision
  - Set gradient_checkpointing=True in model config
links:
  tts_model: /home/mike/code/AudiobookPipeline/src/generation/tts_model.py
---
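
The "graceful degradation" and "clear error" criteria could be sketched as a small precision-selection helper. This is a minimal, hypothetical sketch, not the actual `tts_model.py` implementation: the function names (`estimate_vram_needed`, `choose_dtype`) and the 1.3x activation-overhead factor are illustrative assumptions.

```python
# Hypothetical sketch of precision fallback for FRE-17.
# Assumption: VRAM need ~= param count x bytes per param x overhead factor.

PRECISION_BYTES = {"fp32": 4, "fp16": 2, "bf16": 2}


def estimate_vram_needed(n_params: int, precision: str, overhead: float = 1.3) -> int:
    """Rough VRAM estimate in bytes: weights x per-param bytes x activation overhead."""
    return int(n_params * PRECISION_BYTES[precision] * overhead)


def choose_dtype(n_params: int, free_vram_bytes: int,
                 preference=("fp32", "fp16", "bf16")) -> str:
    """Return the first precision in `preference` that fits, else raise a clear error."""
    for precision in preference:
        if estimate_vram_needed(n_params, precision) <= free_vram_bytes:
            return precision
    raise MemoryError(
        f"Model ({n_params / 1e9:.1f}B params) does not fit in "
        f"{free_vram_bytes / 2**30:.1f} GiB even at the lowest precision; "
        "enable gradient checkpointing or use a smaller model."
    )


# Example: a 1.7B-param model on an 8 GiB GPU falls back from FP32 to FP16.
print(choose_dtype(1_700_000_000, 8 * 2**30))  # → fp16
```

In the real loader, the returned string would map to a `torch.dtype` (e.g. `torch.float16`) before loading weights, and the `MemoryError` message satisfies the "clear error when memory still insufficient" criterion.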