Thoughts

3/18/2026

# UFO UAP RV YT Pipeline - Session 5 Decisions (2026-03-18)

Major decisions made in this session after reviewing all character assets and scripts and researching tool capabilities:

## 1. Custom LoRA Training is Essential (Not Optional)

The pipeline needs characters generated into per-episode scenes/environments programmatically. A fixed portrait library only covers talking-head shots. Custom per-character LoRAs on fal.ai enable Flux to generate any character + scene + outfit combination the script calls for, at ~$0.02/image with locked identity and style.

- Training cost: ~$2 per character per run on fal-ai/flux-lora-fast-training.
- Training time: 30 seconds to a few minutes.
- Stacking: custom character LoRA + HRDFLS style LoRA can be combined at inference time on fal.ai (up to 3 LoRAs).
- Training data: Krea multi-angle base poses (19-25 per character) + Gemini-generated outfit variant images.
- Including outfit variants in the training data prevents the LoRA from baking clothing into character identity. This was confirmed as the correct approach.
- Trigger words defined: reptilian_zeth, grey_kael, lgm_zix, mantis_dr.

## 2. Hedra Replaces D-ID for Talking Head Animation

D-ID was built for photorealistic faces and performs poorly with 2D illustrated characters. Hedra (Character-3 model) is purpose-built for any image style, including 2D art/cartoons.

- Audio-driven: portrait + audio = lip-synced video. Lip-sync rated 9/10 in independent tests.
- Pricing: Professional tier at $60/month gives 12,000 credits. At 540p (3 credits/sec), that's 4,000 seconds (~66 minutes) of avatar video per month, enough for 20-30 shorts/week.
- Node.js library: hedra-node. The API is job-based (submit, poll, download).
- Output: MP4, supports 9:16 for Shorts. 540p is sufficient for mobile viewing.

## 3. How Each Tool Handles Character Content

- Hedra: Portrait + audio → lip-synced talking head video. ONLY tool in the pipeline that does audio-driven lip sync.
- Flux + custom LoRA: Text prompt → still image of character in any scene/outfit. No animation.
- Hailuo S2V-01: Reference image + prompt → short video clip (3-10 sec). Character can move but is NOT lip-synced to audio. May pull 2D art toward photorealism. The LoRA helps indirectly: Flux generates an on-model character-in-scene still → that becomes Hailuo's reference image, reducing drift.
- Luma: Prompt → atmospheric video clips. No character identity preservation.
- Remotion: Assembles all pieces into the final Short with continuous voiceover + word-by-word captions.

## 4. Short Structure (Editing Pattern)

Shorts alternate between talking head clips and b-roll, with voiceover running continuously:

- Hedra talking head (3-5 sec, lip-synced)
- Flux/Hailuo b-roll scene (2-4 sec, voice continues as narration)
- Back to talking head
- More b-roll

This is a standard film editing technique, not a limitation. Cuts keep viewers engaged.

## 5. Krea Is Training Data Source, Not Production Pipeline

Dave manually generated multi-angle base poses for all 4 characters via Krea. Grey also has outfit variants (MIB, Lab, Priest, Pope) generated via Gemini Gems. These images serve as training data for the custom per-character LoRAs. Production image generation will use Flux + custom LoRA + HRDFLS style LoRA.

Character image counts: Grey ~25 base + 33 outfit variants, Mantis ~23, Reptilian ~19, Little Green Man ~19.

## 6. Phase 1.5 Revised Plan (Grey POC)

Order of operations, designed to validate the cheapest unknowns first:

1. Train Grey custom LoRA on fal.ai (~$2)
2. Generate Grey-in-scene stills via Flux + LoRA (cents)
3. Test Hailuo with Flux-generated reference images (~$0.27/clip)
4. Test Hedra with Grey portrait + audio clip (free tier, 300 credits)
5. Set up ElevenLabs, create Grey voice
6. Write sample 30-sec script, generate voiceover
7. Run Whisper for word-level timestamps
8. Full Hedra test with actual voiceover
9. Minimal Remotion composition assembling all pieces
10. Review with Dave → decision gate

Total cost to validate all unknowns: under $10.

## 7. Character Bible Updates Needed

- Add Pope outfit for Grey (new, not in original bible)
- Note Krea as training data source, fal.ai as production generation
- Replace D-ID with Hedra throughout pipeline docs
- Hedra Elements (Jan 2026 launch) may allow registering characters as reusable elements via API (needs verification)

## 8. Open Questions for POC to Answer

- Does Hedra produce good lip-sync from a 2D comic-style Grey portrait?
- Does Hailuo maintain the 2D art style, or does it photorealize Flux-generated reference images?
- Does the custom LoRA + HRDFLS stack produce a consistent Grey across different scenes?
- If Hailuo drifts: fall back to image-to-video mode or Luma, or use Ken Burns on Flux stills
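
The Hedra credit figures in section 2 can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the class and function names are hypothetical, the credit numbers are the ones quoted in these notes (12,000 Professional credits, 300 free-tier credits, 3 credits/sec at 540p), and the 52/12 weeks-per-month average is an assumption. It makes no calls to Hedra.

```python
from dataclasses import dataclass

# Assumption: average number of weeks in a month.
WEEKS_PER_MONTH = 52 / 12


@dataclass(frozen=True)
class CreditBudget:
    """A pool of Hedra credits and the rate at which video consumes them."""

    credits: int             # credits available (figure taken from the notes)
    credits_per_second: int  # burn rate at the chosen resolution (3 at 540p per the notes)

    def video_seconds(self) -> float:
        """Seconds of avatar video this pool of credits can pay for."""
        return self.credits / self.credits_per_second


def max_talking_head_seconds(budget: CreditBudget, shorts_per_week: float) -> float:
    """Talking-head seconds available per Short at a given weekly output."""
    shorts_per_month = shorts_per_week * WEEKS_PER_MONTH
    return budget.video_seconds() / shorts_per_month


# Figures as quoted in the notes; treat them as inputs, not verified pricing.
PROFESSIONAL = CreditBudget(credits=12_000, credits_per_second=3)
FREE_TIER = CreditBudget(credits=300, credits_per_second=3)

if __name__ == "__main__":
    print(f"Professional: {PROFESSIONAL.video_seconds() / 60:.1f} min/month")
    print(f"Free tier: {FREE_TIER.video_seconds():.0f} sec total")
    for weekly in (20, 30):
        per_short = max_talking_head_seconds(PROFESSIONAL, weekly)
        print(f"{weekly} shorts/week -> {per_short:.1f} sec of talking head each")
```

With these inputs the Professional tier covers 4,000 seconds (about 66.7 minutes) per month and the free tier covers 100 seconds. At 30 Shorts per week that leaves roughly 30 seconds of talking head per Short, so the 20-30 shorts/week estimate holds as long as the talking-head segments in each Short stay under that.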

People: Dave, Grey