---
date: 2026-03-08
day_of_week: Sunday
task_id: FRE-9
title: Fix TTS Generation Bug in AudiobookPipeline
status: done
company_id: FrenoCorp
objective: Resolve CUDA/meta tensor error in TTS generation stage to enable working pipeline
context: |
  - Product: AudiobookPipeline using Qwen3-TTS 1.7B VoiceDesign model
  - MVP deadline: April 4, 2026 (4 weeks from today)
  - Pipeline works through segmentation but fails at generation with "Tensor.item() cannot be called on meta tensors" error
  - Intern Pan assigned to this task by CEO
  - Codebase located at /home/mike/code/AudiobookPipeline/
  - TTS model wrapper at /home/mike/code/AudiobookPipeline/src/generation/tts_model.py
  - Batch processor at /home/mike/code/AudiobookPipeline/src/generation/batch_processor.py
issue_type: bug
priority: high
assignee: intern
parent_task: null
goal_id: MVP_Pipeline_Working
blocking_tasks:
  - FRE-10 (MVP Development)
  - FRE-11 (Testing & QA)
expected_outcome: |
  - TTS generation stage completes successfully
  - Full pipeline processes an epub to MP3 without errors
  - Audio output meets quality standards (-23 LUFS, proper sample rate)
  - Mock mode works for testing without GPU
acceptance_criteria:
  - "`make test` passes all tests, including generation tests"
  - CLI can process sample.epub and produce output.mp3
  - No CUDA/meta tensor errors in logs
  - Generation time under 2x baseline (with mock) or reasonable with real model

notes:
  - 'Root cause: device_map="auto" produced meta tensors when no GPU was available'
  - Fix added GPU detection with CPU fallback in tts_model.py:125-146
  - Added validation to reject models loaded on meta device
  - "Fixed test infrastructure: PYTHONPATH in Makefile, renamed duplicate test file"
  - All 669 tests now pass
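# Illustrative sketch of the fix described in the notes above. This is a
# hedged outline under stated assumptions, not the code at
# tts_model.py:125-146; `load_model` and `model_id` are hypothetical names.
#
#   import torch
#
#   # Pick an explicit device instead of relying on device_map="auto".
#   device = "cuda" if torch.cuda.is_available() else "cpu"
#   model = load_model(model_id, device_map=device)
#
#   # Reject a model whose weights were left on the meta device.
#   if any(p.is_meta for p in model.parameters()):
#       raise RuntimeError("TTS model has parameters on the meta device")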

links:
  strategic_plan: /home/mike/code/FrenoCorp/STRATEGIC_PLAN.md
  technical_architecture: /home/mike/code/FrenoCorp/technical-architecture.md
  codebase: /home/mike/code/AudiobookPipeline/