Thoughts

1 thought of type "observation" about "transcription"

ENN Pipeline Session 39 (2026-03-26): Major milestone. Phase 11 (long-form video pipeline) complete. Built 5 new pipeline scripts for NotebookLM content intake through final render. First Grix long-form video rendered and approved by Dave. Multi-shot studio sequence: establishing video with SFX, medium Hedra lip-sync, closeup Hedra lip-sync, hard cut to NotebookLM body, outro medium lip-sync. Key learnings: ElevenLabs generates different pacing per API call (never generate split audio separately, always split one generation with ffmpeg). Hedra mouth freezes on clips over 12-14s with silence gaps. Presigned PUT fails over 150MB (use S3 SDK PutObjectCommand). CRF 23 cuts render size 45% with no visible quality loss. Whisper base + Claude editorial correction beats Whisper large for domain-specific transcription. Phase 12 prep also done: blog schema alignment, 10 blog posts in Postgres as drafts, YouTube publish updated for ENN metadata with long-form support and character rotation, variant rotation system for studio assets (3-4 per character). Remaining: studio assets for 3 characters, TikTok script, end-to-end tests, launch date. 4 commits, 4,779 lines added.

People: Dave