Building A 2-Host Video Pipeline With AI

AI-assisted draft.

I wanted to move past short vertical videos.

Longer content needs a better format. A single robot voice reading a list is boring. People stop watching.

I built a system to create 10-minute videos with two hosts. They talk, they disagree, and they hand off topics naturally. This rhythm keeps people watching.

I built this from scratch to run inside GitHub Actions. It has to run automatically every time I push a script file.

Here is how the system works:

• Everything starts with a single JSON file.
• This file contains the script, the speakers, and the slide data.
• I use edge-tts for audio. It is free and requires no API keys.
• I use Pillow to turn JSON data into slide images.
• I use ffmpeg to stitch the audio and images into a video.
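To make the stages above concrete, here is a minimal sketch in Python. The JSON field names (`segments`, `speaker`, `text`, `slide`, `title`), the helper names, the slide size, and the ffmpeg encoding flags are all assumptions chosen for illustration. The original post does not publish its schema or its exact commands.

```python
import json
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str      # "A" or "B" (hypothetical labels)
    text: str         # the line this host speaks
    slide_title: str  # text drawn onto the slide image


def load_segments(raw: str) -> list[Segment]:
    """Parse the episode JSON (assumed schema) into ordered segments."""
    data = json.loads(raw)
    return [
        Segment(item["speaker"], item["text"], item["slide"]["title"])
        for item in data["segments"]
    ]


def render_slide(title: str, out_path: str, size: tuple[int, int] = (1280, 720)) -> None:
    """Draw a very plain slide with Pillow, with no browser involved."""
    # Third-party dependency, imported lazily: pip install Pillow
    from PIL import Image, ImageDraw, ImageFont

    img = Image.new("RGB", size, color=(18, 18, 24))
    draw = ImageDraw.Draw(img)
    draw.text((64, 64), title, fill=(255, 255, 255), font=ImageFont.load_default())
    img.save(out_path)


def ffmpeg_segment_cmd(image: str, audio: str, out: str) -> list[str]:
    """Build an ffmpeg command that shows one still image for one audio clip."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image,   # repeat the still image as video frames
        "-i", audio,
        "-c:v", "libx264", "-tune", "stillimage",
        "-c:a", "aac",
        "-pix_fmt", "yuv420p",
        "-shortest",                 # stop encoding when the audio ends
        out,
    ]
```

Each command list could be passed to `subprocess.run(cmd, check=True)`, and the resulting per-segment clips joined in a final ffmpeg step. Building the command as a list rather than a shell string avoids quoting problems with file names.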

Key technical choices:

Two Voices: I map Speaker A to one voice and Speaker B to another. I keep sentences under 25 words. This makes the synthetic voices sound more human.
No Browsers: I do not use Playwright or Chrome to make slides. That takes too long in a CI pipeline. Pillow is much faster for rendering images.
Smart Errors: I check the file size of every audio clip. Sometimes the TTS service returns an empty file. My script catches this before the video render fails.
Fast Rendering: A 10-minute video takes about 5 minutes to render in GitHub Actions. Most of that time is spent waiting for the audio API.
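The two-voice mapping, the 25-word limit, and the file-size check could look roughly like the sketch below. The voice names, the 1 KB threshold, the retry count, and every function name are assumptions for illustration. The original post only says that clips are checked by size, not how failures are handled.

```python
import re
from pathlib import Path

# Assumed voice names; any two distinct edge-tts voices would work.
VOICES = {"A": "en-US-GuyNeural", "B": "en-US-JennyNeural"}
MIN_BYTES = 1024  # assumed threshold for "this clip is not empty"


def voice_for(speaker: str) -> str:
    """Return the TTS voice assigned to a speaker label."""
    return VOICES[speaker]


def long_sentences(text: str, limit: int = 25) -> list[str]:
    """Return sentences that are not under the word limit."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s.split()) >= limit]


def is_valid_clip(path: Path, min_bytes: int = MIN_BYTES) -> bool:
    """True if the audio file exists and is at least min_bytes long."""
    return path.exists() and path.stat().st_size >= min_bytes


async def synthesize(text: str, speaker: str, out: Path, retries: int = 3) -> None:
    """Generate one clip and fail loudly if it keeps coming back empty."""
    # Third-party dependency, imported lazily: pip install edge-tts
    import edge_tts

    for _ in range(retries):  # retrying is an assumption, not from the post
        await edge_tts.Communicate(text, voice_for(speaker)).save(str(out))
        if is_valid_clip(out):
            return
    raise RuntimeError(f"empty audio for speaker {speaker}: {out}")
```

A single clip would be produced with something like `asyncio.run(synthesize("Welcome back.", "A", Path("clip_000.mp3")))`. Raising early means the run stops at the broken clip instead of inside a long ffmpeg job.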

The workflow is simple:

I push a JSON file to a specific folder.
GitHub Actions triggers the render.
The system uploads the video to YouTube via API.
The file moves to an uploaded folder.
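The last step, moving a processed script out of the queue, can be sketched as follows. The folder layout and function names are hypothetical, and the YouTube upload itself is left out because the original post does not describe that call.

```python
import shutil
from pathlib import Path


def pending_scripts(queue_dir: Path) -> list[Path]:
    """List JSON scripts waiting to be rendered, in a stable order."""
    return sorted(queue_dir.glob("*.json"))


def mark_uploaded(script: Path, uploaded_dir: Path) -> Path:
    """Move a processed script into the uploaded folder and return its new path."""
    uploaded_dir.mkdir(parents=True, exist_ok=True)
    dest = uploaded_dir / script.name
    shutil.move(str(script), str(dest))
    return dest
```

Inside a CI run the move only changes the runner's checkout, so it would also need to be committed and pushed back to the repository to persist. Otherwise the same script would be picked up again on the next trigger.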

This setup allows me to produce long-form educational content without manual editing. It turns a script into a finished video automatically.

Source: https://dev.to/morinaga/what-i-learned-building-a-scripted-two-host-video-pipeline-with-edge-tts-and-ffmpeg-41o6
