Goal and why HLS?

My goal was to set up my own streaming server that provides a video livestream using the HLS protocol (Apple HTTP Live Streaming).

HLS itself is a great protocol for live streams. You can split audio and video into separate files, offer multiple audio and video streams for different formats and qualities, and the playlists are easy to read. Apple devices (iOS, macOS and tvOS) have built-in support. The data is delivered over an HTTP/HTTPS connection, so it can also be cached on the server side or served through a CDN.

This article contains all my personal experience with HLS in FFmpeg. Some things might have changed and might not be up to date.

Requirements

First of all, a recent version of FFmpeg is required. In the next steps I will use features that are only available since version 4.0, so make sure you use a fresh release.
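The version requirement can be checked from the banner line that ffmpeg -version prints. The following is only a sketch: the function name ffmpeg_is_4_or_newer is made up for this example, and it assumes the banner starts with "ffmpeg version" followed by a release number. Git snapshot builds use a different version string, so the function reports those as "cannot tell".

```shell
# Hypothetical helper, not part of FFmpeg.
# $1: first line of `ffmpeg -version`, e.g. "ffmpeg version 4.4.2 Copyright ..."
# Returns 0 if the major version is >= 4, 1 if it is older,
# and 2 if the version string is not a plain release number.
ffmpeg_is_4_or_newer() {
    version=$(printf '%s\n' "$1" | sed -n 's/^ffmpeg version \([^ ]*\).*/\1/p')
    major=${version%%.*}   # keep everything before the first dot
    major=${major#n}       # some builds prefix the release tag with "n"
    case "$major" in
        ''|*[!0-9]*) return 2 ;;   # git snapshot or unexpected format
    esac
    [ "$major" -ge 4 ]
}
```

It could be called as ffmpeg_is_4_or_newer "$(ffmpeg -version | head -n 1)" before starting the stream.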

To follow this article you should have basic knowledge of audio and video, know the difference between a container format (e.g. TS or MP4) and a codec (e.g. h264, aac, mp3), and know how to use a command line.

Input Source

How do we receive the live stream?

RTMP is a commonly used protocol for this and it’s supported by most streaming clients. So we’re starting with the first parameters of the FFmpeg command:

-listen 1 -i rtmp://martin-riedl.de/stream01

-listen 1 tells FFmpeg to use the RTMP protocol as a server and wait for incoming connections

-i defines the input source. In our case it’s an RTMP source defined by our domain and a stream name. There are a lot of other useful options for the RTMP protocol, e.g. username and password login or a different port. The standard port is 1935.

Link to documentation of the RTMP source: http://ffmpeg.org/ffmpeg-all.html#rtmp
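To try the listener without a real streaming client, a second FFmpeg instance can push a local file to it. This is only a sketch under a few assumptions: the helper name publish_test_stream and the file name sample.mp4 are made up, and the file is assumed to already contain h264 video and aac audio, so that -c copy can pass the streams through to the FLV container that RTMP uses. -re reads the input at its native frame rate, which makes the file behave like a live source.

```shell
# Hypothetical helper: prints the publish command; set RUN=1 to execute it.
# $1: local media file (assumed to hold h264 video and aac audio)
# $2: RTMP URL of the listening FFmpeg instance
publish_test_stream() {
    # Rebuild the positional parameters as the full ffmpeg command line.
    set -- ffmpeg -re -i "$1" -c copy -f flv "$2"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"                    # actually start streaming
    else
        printf '%s\n' "$*"      # dry run: only show the command
    fi
}
```

Calling publish_test_stream sample.mp4 rtmp://martin-riedl.de/stream01 shows the command, and the same call with RUN=1 in front runs it.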

Video Encoding

Now that our server receives the stream, it’s time to start encoding the video. As a first step, I just want a full HD video feed. Later it will be extended to also support lower resolutions. HLS can dynamically switch between different qualities based on the available bandwidth of the client.

-c:v libx264 -crf 21 -preset veryfast

-c:v libx264 sets the video codec (c:v is short for codec:video). Here we use the x264 encoder to get h264 output.

-crf 21 is the video quality. The scale runs from 0 to 51: 51 is the worst quality, 0 is lossless, and the x264 default is 23. Here you need to decide between video quality and file size (lower value = better quality = larger files).

-preset veryfast tells the encoder to prefer fast encoding over better compression. A faster preset encodes more quickly but produces larger files at the same quality. Since we’re producing a livestream here, veryfast or superfast is a good choice. All presets are listed in the FFmpeg wiki article on H.264 encoding. For on-demand content you should choose a slower preset.

All x264 options on FFmpeg are documented under: http://ffmpeg.org/ffmpeg-all.html#libx264_002c-libx264rgb

Audio Encoding

Similar to the video encoding we start with one audio stream.

-c:a aac -b:a 128k -ac 2

-c:a aac sets the audio codec to AAC. Later we will support mp3 as an additional audio stream for backwards compatibility.

-b:a is the audio bitrate. This parameter enforces a constant bitrate for the audio stream. 128k is also the default value for AAC if you don’t specify one.

-ac 2 specifies the number of audio channels. Here we use stereo (2 channels).

Output

We’d like to have HLS as the output format.

-f hls -hls_time 4 -hls_playlist_type event stream.m3u8

-f hls defines the output format HLS

-hls_time 4 slices the video and audio into segments with a duration of 4 seconds. The default value in FFmpeg is 2 seconds. Apple recommends a duration of 6 seconds.

-hls_playlist_type event tells FFmpeg not to remove old segments from the playlist. Usually an HLS livestream playlist contains only the last x segments, and older ones are dropped. With this option nothing is removed. This gives the user the option to go back in the stream (DVR/re-live) or to pause the live stream.

stream.m3u8 is the name of the playlist file. This contains a list of all available segments and is the main file for the player.
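Because an event playlist is never shortened, it grows with every segment. A rough estimate of how many entries to expect is simple arithmetic. The function name segment_count is made up for this example, and the calculation assumes that every segment is exactly as long as the value given to -hls_time, which is an idealisation.

```shell
# Hypothetical helper: estimated number of segments (and playlist entries).
# $1: stream length in seconds
# $2: value passed to -hls_time
segment_count() {
    duration=$1
    seg=$2
    # Integer division rounded up: a partial last segment still counts.
    echo $(( (duration + seg - 1) / seg ))
}

segment_count 3600 4   # one hour at -hls_time 4, prints 900
```

With Apple's recommended 6 seconds the same hour needs fewer, longer segments (segment_count 3600 6).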

First run

Putting all arguments together gives the following command. The order of the parameters is important: first the input options, then the encoding options and at the end the output format options.

./ffmpeg -listen 1 -i rtmp://martin-riedl.de/stream01 \
    -c:v libx264 -crf 21 -preset veryfast \
    -c:a aac -b:a 128k -ac 2 \
    -f hls -hls_time 4 -hls_playlist_type event stream.m3u8

So far, so good. The command produces a live stream. Opening the URL of the stream.m3u8 in Safari (which has built-in support for HLS) shows the live stream.

Having a look into the folder we can see the stream.m3u8 playlist file and the segments. Each segment contains the video and audio data for a short timeframe.

After a look into the stream.m3u8 there is a big surprise: the given target duration of 4 seconds is not used. Instead a TARGETDURATION of 10 is set, and the segment length (the value after #EXTINF:) is not constant either. A constant value is needed for good buffering in the player. We don’t want any interruptions.
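To see these numbers at a glance, the playlist can be summarised with a few lines of awk. This is a minimal sketch: the function name playlist_summary is made up for this example, and it assumes a plain media playlist with one #EXT-X-TARGETDURATION line and one #EXTINF line per segment.

```shell
# Hypothetical helper: prints the target duration, the number of segments
# and the shortest and longest segment length found in a playlist.
# $1: path to the playlist, e.g. stream.m3u8
playlist_summary() {
    LC_ALL=C awk -F: '
        /^#EXT-X-TARGETDURATION:/ { target = $2 }
        /^#EXTINF:/ {
            sub(/,.*/, "", $2)          # drop the comma and optional title
            d = $2 + 0                  # segment length in seconds
            if (n == 0 || d < min) min = d
            if (n == 0 || d > max) max = d
            n++
        }
        END {
            printf "target=%s segments=%d min=%.3f max=%.3f\n", target, n, min, max
        }
    ' "$1"
}
```

Running playlist_summary stream.m3u8 in the output folder prints one line. If min and max differ, the segment lengths are not constant.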

We will try to solve this in the next article.