First: Why and when to use two-pass encoding?

Let’s take a simple example: we target a video bitrate of 6 Mbit/s. This means that on average an 8-second video needs 6 MByte of storage (6 Mbit/s * 8 seconds / 8, because 8 Mbit = 1 MByte). So the encoder has 6 Mbit (or 0.75 MByte) for one second of video. Now let’s say our video has 25 FPS (frames per second). Then the encoder has 0.24 Mbit per frame (6 Mbit / 25 frames). The first frame (I-frame) contains the whole compressed image and the other 24 frames contain only the changes relative to the previous frame. This saves a lot of storage: if there is not much movement in the video, only a few parts of the image change. Imagine a lion sitting on a road: the road doesn’t change, only the lion does, so the encoder can spend its bits on the lion instead of wasting them on the road.
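To double-check the numbers, here is the same calculation as a small shell snippet. It is just the arithmetic from the example above, done in kbit so that integer math is enough; the variable names are made up for this illustration.

```shell
# Bit budget for the example in the text (all values in kbit, integer math).
bitrate_kbit=6000   # target bitrate: 6 Mbit/s
duration_s=8        # clip length in seconds
fps=25              # frames per second

total_kbit=$((bitrate_kbit * duration_s))   # budget for the whole clip
total_kbyte=$((total_kbit / 8))             # 8 kbit = 1 kByte
per_frame_kbit=$((bitrate_kbit / fps))      # even split across the frames of one second

echo "total: ${total_kbit} kbit (${total_kbyte} kByte)"
echo "per frame if split evenly: ${per_frame_kbit} kbit"
```

This prints a total of 48000 kbit (6000 kByte, i.e. 6 MByte) and 240 kbit (0.24 Mbit) per frame.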

So why doesn’t the encoder simply give every frame the same share, as calculated above? It would be simple and would result in a fixed output size. The answer is that this would be very inefficient. The first frame (I-frame) needs more storage (because it is a whole image) and the changes (the other frames) need less storage (because they contain only the deltas). There may also be differences between the individual frames (e.g. there is no movement during the first 10 frames, and a person walks through the picture during the other 15 frames).

For these reasons (and there are even more) the encoder must decide how many bytes each frame gets. But how should it know the information mentioned above? This is where two-pass encoding comes in. The encoder runs twice: the first run collects information and statistics (how many bytes would be needed, how much movement there is in the video, …). The second run does the actual encoding. It uses the statistics from the first run to produce better output quality and to hit the target bitrate more accurately.
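To make the idea more tangible, here is a toy illustration in shell. It is not how x264 really does its rate control; it only shows the principle of "measure first, then distribute". The per-frame cost values are invented numbers (one expensive I-frame, four still frames, five frames with movement), and the budget is 10 frames at 0.24 Mbit each.

```shell
# Toy illustration only: this is NOT x264's real rate control.
# "Pass 1": hypothetical per-frame cost estimates (made-up numbers, arbitrary units).
costs="900 20 20 20 20 120 120 120 120 120"
budget_kbit=2400   # 10 frames at 240 kbit each

# Sum up the costs collected in the "first pass".
total_cost=0
for c in $costs; do
  total_cost=$((total_cost + c))
done

# "Pass 2": give each frame a share of the budget proportional to its cost.
allocation=""
for c in $costs; do
  allocation="$allocation $((budget_kbit * c / total_cost))"
done
allocation=${allocation# }   # drop the leading space

echo "total cost: $total_cost"
echo "kbit per frame: $allocation"
```

With an even split every frame would get 240 kbit. With the statistics, the I-frame gets the biggest share, the still frames get very little, and the total still stays within the budget.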

Now: How to run two-pass encoding using FFmpeg?

We start with a simple video encoding command:

ffmpeg -i input.mp4 -t 00:08 -map 0:v:0 -c:v libx264 -b:v 6000k output.mp4
  • -i input.mp4: the input / source video file
  • -t 00:08: encode only the first 8 seconds (used for our test here; remove it to encode the whole video)
  • -map 0:v:0: select only the video of the input file (0 = first input; v = video stream; second 0 = first video stream)
  • -c:v libx264: use the libx264 encoder (H.264)
  • -b:v 6000k: target video bitrate of 6 Mbit/s

The x264 log shows us that the overall bitrate was 7176 kbit/s (about 7.2 Mbit/s). Hmm, that is far away from the targeted 6 Mbit/s. So we now try a two-pass encoding and compare the results.
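If you don’t want to search the log, you can also estimate the average bitrate from the file size and the duration. The helper below is a rough sketch; avg_kbit is a name I made up, and the result includes container overhead, so it can differ a little from the figure in the x264 log.

```shell
# avg_kbit FILE SECONDS: average bitrate in kbit/s from file size and duration.
# Includes container overhead, so it is only an approximation of the video bitrate.
avg_kbit() {
  size_bytes=$(wc -c < "$1")              # file size in bytes
  echo $((size_bytes * 8 / $2 / 1000))    # bytes -> bit, per second, in kbit
}

# Example: avg_kbit output.mp4 8
# If ffprobe is installed, it can report the container-level bitrate in bit/s:
#   ffprobe -v error -show_entries format=bit_rate -of default=noprint_wrappers=1:nokey=1 output.mp4
```

For our 8-second test clip the call would be avg_kbit output.mp4 8.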

We use the same command as above but add the -pass 1 parameter:

ffmpeg -i input.mp4 -pass 1 -t 00:08 -map 0:v:0 -c:v libx264 -b:v 6000k output.mp4

This run creates new log files in the current working folder (for me they were ffmpeg2pass-0.log and ffmpeg2pass-0.log.mbtree). These files contain the statistics from the first encoding pass. Don’t delete them, because the second pass needs these files.

Now run the command again for the second pass (parameter -pass 2). Because output.mp4 already exists, FFmpeg asks whether to overwrite it; confirm with y:

ffmpeg -i input.mp4 -pass 2 -t 00:08 -map 0:v:0 -c:v libx264 -b:v 6000k output.mp4

Wow! The bitrate is now 5992 kbit/s (5.99 Mbit/s). That’s much better than the 7176 kbit/s of the single-pass encoding.
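The two passes can also be chained in one small script. The following is a sketch, not the only way to do it. It assumes a Unix-like shell, sends the first pass to the null muxer (-f null /dev/null) so that no throwaway video file is written, and adds -y so FFmpeg does not stop to ask about overwriting. The run helper and the DRY_RUN, INPUT, OUTPUT and BITRATE variables are made up for this example. By default the script only prints the commands; set DRY_RUN=0 to really encode.

```shell
# Two-pass sketch for a Unix-like shell (on Windows use NUL instead of /dev/null).
INPUT=${INPUT:-input.mp4}
OUTPUT=${OUTPUT:-output.mp4}
BITRATE=${BITRATE:-6000k}

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Pass 1: only collect statistics; the encoded video itself is discarded.
run ffmpeg -y -i "$INPUT" -pass 1 -t 00:08 -map 0:v:0 -c:v libx264 -b:v "$BITRATE" -f null /dev/null || exit 1

# Pass 2: the real encode, using the statistics files written by pass 1.
run ffmpeg -y -i "$INPUT" -pass 2 -t 00:08 -map 0:v:0 -c:v libx264 -b:v "$BITRATE" "$OUTPUT"
```

Both passes must use the same input, encoder and bitrate, otherwise the statistics from the first pass don’t fit the second one.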