Usual disclaimer: "technical notes" posts are probably of zero interest to the blog followers and are just meant for Google. If they annoy, tell me and I'll get a wiki or something.
In a past life I wrote FFmpeg filters, which has the interesting side effect of making you think of the FFmpeg filtergraph as sane. Colleagues who detect that won't fail to take advantage, so I ended up tasked with crafting an unholy command line to mix the CloudFlare London tech talk videos.
The inputs are pretty common:
- a camera video of the speaker waving their hands
- a Keynote recording of the slides
- a nice background
The desired output is both streams scaled and placed on top of the background at the opposite corners, with audio from one of them, DEFCON style.
Color background
Here's a first iteration with a black background:
ffmpeg -i slides.mp4 -i speaker.mp4 -filter_complex "
color=size=hd1080:c=black [background];
[0:v] setpts=PTS-STARTPTS, scale=w=960:h=-1 [slides];
[1:v] setpts=PTS-STARTPTS, scale=w=1240:h=-1 [speaker];
[background][speaker] overlay=shortest=1:x=main_w-overlay_w:y=main_h-overlay_h [background+speaker];
[background+speaker][slides] overlay=shortest=1 [mix]
" -map "[mix]" -map "1:a" video.mp4
Let's break it down a bit. There are three parts to the command: inputs, graph and outputs.
-i slides.mp4 -i speaker.mp4 are the inputs. Nice and easy. The ordering is important, as we will refer to slides.mp4 as [0] and speaker.mp4 as [1].
The -filter_complex argument is the graph. It is composed of sources and filters. Each line, separated by ;, takes zero or more input streams, passes them through one or more sources or filters, and defines one or more output streams.
In this graph we first use a color source to generate a [background] stream of the right color and size to work on. Then we take the video streams [0:v] and [1:v], sync them and scale them to the final size we want while keeping the proportions (h=-1), generating the [slides] and [speaker] streams.
A note about that setpts filter: streams can have timestamps that say, for example, that the first frame of the video is meant to show at second 5. This is often the case when you previously cut the video, and since overlay respects that, one stream would start after the other. We fix that by passing the stream through a filter that sets each timestamp to "timestamp minus timestamp of the first frame" (PTS-STARTPTS).
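To make that arithmetic concrete, here is a toy calculation in plain shell. The numbers are made up: a hypothetical stream at 25 fps whose first frame is stamped at second 5, with timestamps counted in milliseconds.

```shell
# Made-up timestamps in milliseconds (an assumed 1/1000 timebase, 25 fps).
# The first frame of this hypothetical stream is stamped at second 5.
startpts=5000
rebased=""
for pts in 5000 5040 5080; do
  # This is the subtraction that setpts=PTS-STARTPTS applies to each frame.
  rebased="$rebased $((pts - startpts))"
done
echo "rebased timestamps:$rebased"
```

After the subtraction the first frame sits at zero, so both streams start together in the overlay.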
Finally we use the overlay filter twice. overlay slaps the second input stream on top of the first at the specified position. The first overlay places the speaker video at the bottom right corner (x=main_w-overlay_w:y=main_h-overlay_h) and the second places the slides at the top left, which is the default position. shortest=1 makes the output terminate as soon as any input terminates. We call the final video result [mix].
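Here is how the bottom-right expressions work out with actual numbers, as a rough sketch. The 4:3 camera source (1440x1080) is an assumption picked to keep the numbers round. Your speaker video will likely differ.

```shell
# Canvas size (hd1080) and the speaker width from the scale filter.
main_w=1920; main_h=1080
overlay_w=1240
# Assumption: a 4:3 source of 1440x1080, so h=-1 works out to 1240*1080/1440.
overlay_h=$((overlay_w * 1080 / 1440))
# The same expressions that are passed to overlay as x= and y=.
x=$((main_w - overlay_w))
y=$((main_h - overlay_h))
echo "speaker is ${overlay_w}x${overlay_h}, placed at x=$x y=$y"
```

With these assumed sizes the speaker lands at x=680, y=150, so its right and bottom edges touch the edges of the canvas.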
Last part, the outputs. Again, argument order is everything: video.mp4 will contain the streams specified by the -map arguments that precede it, so [mix] for video and 1:a for audio.
Picture background
Here's how you can use a picture instead of a black background:
ffmpeg -i slides.mp4 -i speaker.mp4 -i background.jpg -filter_complex "
[2:v] scale=s=hd1080, loop=loop=-1:size=1 [background];
[0:v] setpts=PTS-STARTPTS, scale=w=960:h=-1 [slides];
[1:v] setpts=PTS-STARTPTS, scale=w=1240:h=-1 [speaker];
[background][speaker] overlay=shortest=1:x=main_w-overlay_w:y=main_h-overlay_h [background+speaker];
[background+speaker][slides] overlay=shortest=1 [mix]
" -map "[mix]" -map "1:a" video.mp4
Note that you need to compile FFmpeg from master (brew install --HEAD ffmpeg) for that to work, or you will see a No such filter: 'loop' error. Apparently that's what "general users" are supposed to do anyway:
23:58:35 #ffmpeg <llogan> FiloSottile: your ffmpeg is too old. the filter is newer than the 3.0 branch.
23:59:08 #ffmpeg <llogan> general users are recommended to use a build from git master instead of releases which are mainly for distributors
If you are stuck with a release without the loop filter, here's a workaround: use the loop demuxing option, -loop 1 -i background.jpg, and remove the loop filter, leaving [2:v] scale=s=hd1080 [background];. However, be advised that it'll be about twice as slow, since this way the background gets scaled again for each and every frame.
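Put together, the workaround might look like the sketch below. The wrapper function and its name are hypothetical, there just to keep the filenames as arguments. The flags and filters are the ones from this post.

```shell
# Hypothetical wrapper; arguments: slides, speaker, background picture, output.
mix_with_looped_picture() {
  # -loop 1 applies to the input that follows it (the picture), so the
  # loop filter is gone from the [background] chain.
  ffmpeg -i "$1" -i "$2" -loop 1 -i "$3" -filter_complex "
    [2:v] scale=s=hd1080 [background];
    [0:v] setpts=PTS-STARTPTS, scale=w=960:h=-1 [slides];
    [1:v] setpts=PTS-STARTPTS, scale=w=1240:h=-1 [speaker];
    [background][speaker] overlay=shortest=1:x=main_w-overlay_w:y=main_h-overlay_h [background+speaker];
    [background+speaker][slides] overlay=shortest=1 [mix]
  " -map "[mix]" -map "1:a" "$4"
}
# Usage: mix_with_looped_picture slides.mp4 speaker.mp4 background.jpg video.mp4
```

The looped picture never ends on its own, so it is the shortest=1 on the overlays that makes the output stop when the videos do.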
Customization
The filtergraph doesn't need any change to adapt to different aspect ratios, but here are some things you might want to adapt.
Output size
The main stream is the background, so just change its size by editing size=hd1080 (color background) or scale=s=hd1080 (picture background) in the [background] stream definition. You can use a format like 1920x1080 instead of hd1080.
The two overlaid videos will stick to their corners.
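They keep their pixel widths though, so on a smaller canvas you may want to shrink them too. A possible sketch, assuming a 1280x720 target and the same proportions as 960 and 1240 on a 1920-wide canvas. The variable names are mine, and rounding down to even widths is just a conservative habit.

```shell
out_w=1280; out_h=720   # assumed target size
# Same proportions as 960 and 1240 out of 1920, rounded down to even numbers.
slides_w=$((out_w * 960 / 1920 / 2 * 2))
speaker_w=$((out_w * 1240 / 1920 / 2 * 2))
graph="
color=size=${out_w}x${out_h}:c=black [background];
[0:v] setpts=PTS-STARTPTS, scale=w=${slides_w}:h=-1 [slides];
[1:v] setpts=PTS-STARTPTS, scale=w=${speaker_w}:h=-1 [speaker];
[background][speaker] overlay=shortest=1:x=main_w-overlay_w:y=main_h-overlay_h [background+speaker];
[background+speaker][slides] overlay=shortest=1 [mix]
"
# Print the graph; pass it to ffmpeg with -filter_complex "$graph".
printf '%s\n' "$graph"
```

For 1280x720 this gives widths of 640 and 826.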
Positioning
You can change the positioning of the two videos by messing with the overlay filter parameters. x and y refer to the position of the top-left corner of the overlaid video relative to the top-left of the entire canvas. You can use a bunch of parameters, here are the docs, good luck.
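As one example of what those expressions can do, this sketch pulls both videos away from the edges of the canvas. The 40 pixel margin and the variable names are hypothetical.

```shell
margin=40   # hypothetical gap from the canvas edges, in pixels
# Bottom right, moved up and left by the margin.
speaker_pos="x=main_w-overlay_w-${margin}:y=main_h-overlay_h-${margin}"
# Top left, moved down and right by the margin.
slides_pos="x=${margin}:y=${margin}"
echo "[background][speaker] overlay=shortest=1:${speaker_pos} [background+speaker];"
echo "[background+speaker][slides] overlay=shortest=1:${slides_pos} [mix]"
```

The two printed lines would replace the two overlay lines of the graph.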
To change the size of the videos instead, change the scale parameters of the [slides] or [speaker] stream definitions. If you used the overlay parameters properly, that should not mess with the alignment.
Output options
You can use all your usual output arguments by sticking them just before the target filename, like -c:a libfdk_aac video.mp4 to set the encoder.
Audio source
If you want the audio from the first video instead of the second, just change the second -map into -map "0:a".
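You can also easily add an mp3 as an additional -i after all the others and use -map "3:a" to take the audio from there. That index is for the picture background command, which already has three inputs. With the color background command the mp3 is the third input, so it's -map "2:a".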
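A sketch of the mp3 variant on top of the color background command. The wrapper function is hypothetical, and -shortest is my addition: the track may well be longer than the talk, and that output option should stop the file when the shortest output stream ends.

```shell
# The color-background graph from the first example, unchanged.
graph='
color=size=hd1080:c=black [background];
[0:v] setpts=PTS-STARTPTS, scale=w=960:h=-1 [slides];
[1:v] setpts=PTS-STARTPTS, scale=w=1240:h=-1 [speaker];
[background][speaker] overlay=shortest=1:x=main_w-overlay_w:y=main_h-overlay_h [background+speaker];
[background+speaker][slides] overlay=shortest=1 [mix]
'
# Hypothetical wrapper; arguments: slides, speaker, audio file, output.
mix_with_music() {
  # The audio file is the third input here, so its index is 2.
  ffmpeg -i "$1" -i "$2" -i "$3" -filter_complex "$graph" \
    -map "[mix]" -map "2:a" -shortest "$4"
}
# Usage: mix_with_music slides.mp4 speaker.mp4 music.mp3 video.mp4
```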
Input options
I'm not sure what input options you might need, but they go right before the -i they apply to: ffmpeg [slides.mp4 options] -i slides.mp4 [speaker.mp4 options] -i speaker.mp4
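One input option that might come in handy is -ss, which seeks into the input that follows it. A sketch, with a hypothetical wrapper that takes the number of seconds to skip and the filtergraph as arguments:

```shell
# Hypothetical wrapper; arguments: seconds to skip in speaker.mp4, filtergraph.
mix_skipping_intro() {
  # -ss sits right before the second -i, so it only seeks speaker.mp4.
  # slides.mp4 is still read from its beginning.
  ffmpeg -i slides.mp4 -ss "$1" -i speaker.mp4 -filter_complex "$2" \
    -map "[mix]" -map "1:a" video.mp4
}
# Usage, with $graph holding one of the two-input filtergraphs from this post:
#   mix_skipping_intro 30 "$graph"
```

Since the option applies to the whole input file, the audio mapped from 1:a should be skipped along with the speaker video.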
Syncing
You can adjust the start time of one stream or the other by messing with the setpts argument. For example, to cut the first 20 seconds do setpts=PTS-STARTPTS-20/TB.
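The /TB part converts seconds into timestamp units, since TB is the timebase of the stream. A toy calculation, assuming a 1/90000 timebase (the one MPEG-TS uses; your files may differ):

```shell
# Assumption: a 1/90000 timebase, as used by MPEG-TS; other files will differ.
tb_den=90000
# 20 seconds divided by a timebase of 1/90000 is 20 * 90000 ticks.
cut_ticks=$((20 * tb_den))
echo "20/TB is $cut_ticks ticks"
# The same trick with a plus sign should delay a stream instead, here by 5 seconds.
delayed="setpts=PTS-STARTPTS+5/TB"
echo "$delayed"
```

Keep in mind that setpts only touches the video stream it filters. The audio mapped with -map "1:a" is not shifted along with it.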
Crazy cool effects
If you want to do cooler stuff, like cutting a stream or applying some slick perspective, you're on your own, but it's probably just a matter of picking a filter and chaining it to one of the [slides] or [speaker] stream definitions.
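For instance, chaining a crop filter into the [speaker] definition could look like the sketch below. It assumes a camera video wider than 4:3 and cuts a centered 4:3 window out of it before scaling.

```shell
# Hypothetical variant of the [speaker] chain: crop a centered 4:3 window
# (full input height) before scaling. crop centers the window by default.
speaker_chain='[1:v] setpts=PTS-STARTPTS, crop=w=in_h*4/3:h=in_h, scale=w=1240:h=-1 [speaker];'
echo "$speaker_chain"
```

The printed line would replace the [speaker] line of the graph.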
Have fun, and maybe follow me on Twitter for completely unrelated material. (I swear.)