On the Fly HLS Transcoding with Golang.

I have watched and enjoyed countless hours of video content, both online and offline, over the years. As someone who loves to dig into the tech stack of every software product I use, it seems weird that I had never cared about how video streaming actually works. That changed recently, when I started setting up a local media server to stream some of my personal and all-time family favorites.

I was able to play my favorite content over the network at different resolutions and bitrates depending on my network bandwidth. This especially fascinated me because I had only one original copy of each video. The media server was able to transcode it into different variants depending on external conditions.

This behavior, however awesome, triggered a panic mode in me. If the media server can serve different versions of my original content, I thought, then it must be consuming a lot of storage to pre-transcode and store every variant. I have limited storage on my local server, which hosts much more important stuff than this video content.

To my surprise, I was sweating for no reason. There was no surge in storage usage. This made me curious enough to go under the hood and find out how the media server can stream a wide range of variants of the original content on request and clean everything up when it is no longer needed.

So, now I had something to work on over the weekend. I started reading and exploring a lot of things about media streaming, which eventually led to getting my hands dirty by trying a few things on my own. I will try to share my learning in parts over a series of blog posts.

HLS (HTTP Live Streaming)

HLS is a widely recognized term in the media streaming business. I had come across it even before beginning my exploration, but I wasn’t curious enough to delve deeper than just knowing its full form.

So, what is HLS beyond its name? It’s a streaming protocol specifically designed to deliver video and audio over HTTP, dynamically adjusting the stream quality based on external factors like the user’s network conditions and device capabilities.

Okay, but how does it actually work?
The original video is transcoded into multiple variants, each with a different resolution and bitrate (e.g., 1080p at 5 Mbps, 720p at 3 Mbps). Each variant is then divided into small segments of roughly fixed duration, typically a few seconds each (Apple's authoring guidance suggests around 6 seconds). The segments are traditionally packaged as MPEG-2 Transport Stream (.ts) files, although newer versions of HLS also allow fragmented MP4.
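To make the segmentation step concrete, here is a small Go sketch that only does the arithmetic: given a total duration and a target segment length, it lists where each segment would start and how long it would be. The `Segment` type and the `planSegments` function are names I made up for illustration, and a real segmenter cuts on keyframes, so actual durations tend to drift a little from the target.

```go
package main

import (
	"fmt"
	"time"
)

// Segment describes one slice of the video timeline (hypothetical type).
type Segment struct {
	Index    int
	Start    time.Duration
	Duration time.Duration
}

// planSegments splits a timeline of length total into consecutive segments
// of length target. The final segment is shorter when total is not an exact
// multiple of target. It returns nil if either argument is not positive.
func planSegments(total, target time.Duration) []Segment {
	if total <= 0 || target <= 0 {
		return nil
	}
	var segs []Segment
	for start := time.Duration(0); start < total; start += target {
		d := target
		if remaining := total - start; remaining < d {
			d = remaining
		}
		segs = append(segs, Segment{Index: len(segs), Start: start, Duration: d})
	}
	return segs
}

func main() {
	// A 9-second clip cut into 2-second segments.
	for _, s := range planSegments(9*time.Second, 2*time.Second) {
		fmt.Printf("segment_%d.ts starts at %v and lasts %v\n", s.Index, s.Start, s.Duration)
	}
}
```

For a 9-second clip and a 2-second target this gives five segments, the last one only 1 second long.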

That’s not all. The HLS protocol also includes manifest files (commonly referred to as playlists) that tie all these components together. There’s a master manifest file that lists the available variant streams and a media manifest file for each variant that contains segment details. These manifest files don’t hold any video data; they are plain-text files (with the .m3u8 extension) that hold the paths of the different variants and segments.
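Since the manifests are just text, it helps to see what they roughly look like. The Go sketch below prints a tiny master playlist and a tiny media playlist using standard HLS tags. The `Variant` type, the two function names, and the file paths are my own placeholders, and I am assuming every segment has the same whole-second duration, which real streams do not guarantee.

```go
package main

import (
	"fmt"
	"strings"
)

// Variant is one rendition listed in the master playlist (hypothetical type).
type Variant struct {
	Bandwidth  int    // peak bits per second
	Resolution string // e.g. "1280x720"
	URI        string // path to this variant's media playlist
}

// masterPlaylist renders a master manifest with one entry per variant.
func masterPlaylist(variants []Variant) string {
	var b strings.Builder
	b.WriteString("#EXTM3U\n")
	for _, v := range variants {
		fmt.Fprintf(&b, "#EXT-X-STREAM-INF:BANDWIDTH=%d,RESOLUTION=%s\n%s\n",
			v.Bandwidth, v.Resolution, v.URI)
	}
	return b.String()
}

// mediaPlaylist renders a VOD media manifest for segmentCount segments,
// assuming (for simplicity) that each one lasts exactly segSeconds.
func mediaPlaylist(segmentCount, segSeconds int) string {
	var b strings.Builder
	b.WriteString("#EXTM3U\n#EXT-X-VERSION:3\n")
	fmt.Fprintf(&b, "#EXT-X-TARGETDURATION:%d\n", segSeconds)
	b.WriteString("#EXT-X-MEDIA-SEQUENCE:0\n#EXT-X-PLAYLIST-TYPE:VOD\n")
	for i := 0; i < segmentCount; i++ {
		fmt.Fprintf(&b, "#EXTINF:%.3f,\nsegment_%d.ts\n", float64(segSeconds), i)
	}
	b.WriteString("#EXT-X-ENDLIST\n")
	return b.String()
}

func main() {
	fmt.Print(masterPlaylist([]Variant{
		{Bandwidth: 5000000, Resolution: "1920x1080", URI: "1080p/stream.m3u8"},
		{Bandwidth: 3000000, Resolution: "1280x720", URI: "720p/stream.m3u8"},
	}))
	fmt.Println()
	fmt.Print(mediaPlaylist(3, 2))
}
```

The master playlist only points at other playlists, and each media playlist only points at segment files, which is why the player can switch variants simply by fetching the next segment from a different list.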

In the traditional setup, all this transcoding and segmentation is precomputed on the server. The client video player starts by downloading the master manifest file and then handles much of the heavy lifting. It continuously monitors device capabilities and network bandwidth, dynamically switching between variant streams for subsequent segments as needed for smooth playback.
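The switching logic lives in the player, and real players use fairly elaborate heuristics (buffer level, throughput history, screen size). As a rough mental model only, the Go sketch below picks the highest-bitrate variant that fits within the measured throughput after applying a safety margin. The `Variant` type, `pickVariant`, and the 0.8 safety factor are assumptions of mine, not something taken from any particular player.

```go
package main

import "fmt"

// Variant is one rendition advertised in the master playlist (hypothetical type).
type Variant struct {
	Name      string
	Bandwidth int // bits per second declared for this variant
}

// pickVariant returns the highest-bandwidth variant whose declared bandwidth
// fits within measuredBps*safety. If nothing fits, it falls back to the
// lowest-bandwidth variant. It panics if variants is empty.
func pickVariant(variants []Variant, measuredBps int, safety float64) Variant {
	budget := float64(measuredBps) * safety
	lowest := variants[0]
	var best Variant
	found := false
	for _, v := range variants {
		if v.Bandwidth < lowest.Bandwidth {
			lowest = v
		}
		if float64(v.Bandwidth) <= budget && (!found || v.Bandwidth > best.Bandwidth) {
			best, found = v, true
		}
	}
	if found {
		return best
	}
	return lowest
}

func main() {
	variants := []Variant{
		{Name: "1080p", Bandwidth: 5000000},
		{Name: "720p", Bandwidth: 3000000},
		{Name: "480p", Bandwidth: 1500000},
	}
	// Pretend the player measured these throughputs before each segment request.
	for _, bps := range []int{10000000, 4000000, 1000000} {
		v := pickVariant(variants, bps, 0.8)
		fmt.Printf("measured %d bps, requesting next segment from %s\n", bps, v.Name)
	}
}
```

The safety factor is there because measured throughput fluctuates; picking a variant right at the measured limit would make stalls more likely.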

Despite its widespread adoption, HLS playback isn’t natively supported by most browsers except for Safari (for obvious reasons: HLS was created by Apple). For other browsers, we have to rely on JavaScript libraries such as hls.js to enable HLS streaming.

Now, let’s look at how we can convert an MP4 file to HLS. As discussed, the process involves transcoding, segmenting, and creating manifest files. Thankfully, we don’t need to do all this manually. Tools like FFmpeg make the task much simpler.

FFmpeg for MP4 -> HLS

FFmpeg is a popular open-source command-line tool for transcoding and virtually any other multimedia task you can think of.

For simplicity, we will perform a single-quality HLS conversion using FFmpeg. This means we will not generate a master playlist with different variants of the original video. Instead, we will only generate a media playlist since we are working with a single variant (quality) of the original video—let’s say a 720p HLS stream.

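# scale=-2:720 sets the height to 720 and picks an even width that keeps the aspect ratio
ffmpeg -i original_video.mp4 \
	-vf "scale=-2:720" \
	-c:v h264 \
	-c:a aac \
	-hls_time 2 \
	-hls_playlist_type vod \
	-hls_segment_filename "segment_%d.ts" \
	stream.m3u8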

Running this simple command will transcode the original video to 720p, divide it into 2-second segments, and create a media manifest that contains the paths to each segment.
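Since the end goal is a Go server, here is a minimal sketch of how the same command could be assembled from Go with `os/exec`. The `hlsArgs` helper, the `out` directory, and the parameter names are hypothetical, and the sketch uses `scale=-2:<height>` so that FFmpeg derives an even width from the aspect ratio. It only builds and prints the command; actually running it assumes `ffmpeg` is installed and on the PATH.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
	"strconv"
)

// hlsArgs builds the FFmpeg arguments for a single-variant HLS conversion.
// height is the target video height and segSeconds the target segment length.
func hlsArgs(input, outDir string, height, segSeconds int) []string {
	return []string{
		"-i", input,
		"-vf", "scale=-2:" + strconv.Itoa(height),
		"-c:v", "h264",
		"-c:a", "aac",
		"-hls_time", strconv.Itoa(segSeconds),
		"-hls_playlist_type", "vod",
		"-hls_segment_filename", filepath.Join(outDir, "segment_%d.ts"),
		filepath.Join(outDir, "stream.m3u8"),
	}
}

func main() {
	cmd := exec.Command("ffmpeg", hlsArgs("original_video.mp4", "out", 720, 2)...)
	// Print the command instead of running it, so the sketch works without
	// FFmpeg or an input file. Calling cmd.Run() would start the transcode
	// (the output directory must already exist).
	fmt.Println(cmd.String())
}
```

Passing the arguments as a slice, rather than building one shell string, avoids quoting problems with file names that contain spaces.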

This is a very basic setup to get us started. Of course, there are a myriad of configurations that can be applied to enhance the efficiency of transcoding.

Dynamic HLS Streaming

In most streaming services we use today, the original videos are pre-transcoded and stored in globally distributed data centers for faster delivery. However, in the case of local media streaming, this approach is inefficient. It consumes significant storage space and requires an initial waiting period for all the local media to be transcoded.

For local media servers, it is far more efficient to transcode on the fly, in real time, based on user requests.

Now that I’ve set the stage, we can move on to coding a Golang server to dynamically transcode and stream content upon user request. But that’s for Part II of this series.


 Date: May 21, 2025
 Tags:  Tech Local Media Server Golang Transcoding HLS FFMPEG
