Building video services for Znanstvenik u meni!


Last year, I started a competition in science communication for high school students in Croatia: Znanstvenik u meni! (ZUM, for short)

The competition has two phases:

  1. The online part: where high school students record a video about a scientific topic they find interesting, which is then graded by a panel of scientists and science communicators, and by the public
  2. The live part: where the best students from the online phase compete by performing live, in front of an audience

When we started out, we had a problem: how do we collect and store the videos?

The first year, we used Uppy to upload videos to AWS S3, which we would then serve without transcoding. We settled on this approach because we needed something that could be developed fast and didn’t have time to consider the implications of such a solution.

This resulted in:

  • poor performance for video streaming (as we didn’t transcode the videos – serving huge files over S3)
  • unpredictable bandwidth costs (as we can’t know how many people will watch videos) which could get high at peak times
  • great stability for our website (as most of the bandwidth was handled by S3)
  • simple management and administration (as S3 is mostly straightforward to use)

As the cons clearly outweighed the benefits, we came to the conclusion that using S3 for video storage isn’t our best option and started looking into replacing it for the next year. Here’s how we hacked together a solution in six days of work.

Looking into proprietary solutions

Most cloud providers, including Amazon, do offer video services!

AWS offers them under AWS Media Services which are great, but still don’t offer predictable bandwidth costs, which isn’t ideal for us as a non-profit project on a tight budget. Elastic Transcoder is fairly priced for what it offers and it would cost us about 1 USD to convert a single video into the resolutions that we’d like to be able to serve. This would, however, come at a higher cost if we’d like to transcode to multiple formats/codecs/etc.

Azure also offers media services, which are similarly priced to AWS for transcoding, but can get pricey (for us, at least) when you include the data transfer (or CDN) and streaming endpoint costs.

We also looked into using Cloudflare Stream, which would have been the most affordable solution for us, but unfortunately, we were told that we aren’t able to use it on the Free plan (which has seemingly changed) and that we’d have to upgrade to Enterprise, which was highly above what we could afford.

As we were looking for a service that would allow us predictable and affordable pricing for data transfers and as we had a few developers on our team, we ended up building a system ourselves.

The birth of vmss

We had decent servers at our disposal which we could use to build the video infrastructure that we needed ourselves.

As we failed to find an open source solution for video management that would suit us (maybe we didn’t look hard enough), we started drafting what the system would need to do.

We needed a:

  • secure API for video uploads
  • public API to fetch video metadata
  • video transcoding system

So, one rainy day in Zagreb, a fellow contributor and I went for a cup of coffee and tried to draft how we could make that happen.

We settled on building a PHP service with a PostgreSQL database (in the end, we ended up using MariaDB – but the differences in our use case are subtle) and nginx, all while using ffmpeg to transcode. We gave it the codename vmss: video management and streaming services.

Uploading (NaN%)

Uploading was the first problem we had to solve.

As we didn’t want just anybody uploading anything to our servers, we had to design an authentication method.

Our primary use case is high school students uploading videos through a web app, and we wanted vmss to handle uploads directly (instead of using the web app as a proxy for uploads). So we decided to issue a secret key to every client app (in this particular case, the web app), which the app could use to fetch a nonce from the API on its backend. This nonce would then be used to authenticate users trying to access the upload endpoint and would be invalidated after a user uploaded their video.
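To make the nonce flow concrete, here’s a minimal sketch. The function names, the client-secret table and the in-memory nonce store are all illustrative assumptions – in vmss the nonces would live in the database – but the single-use behavior is the one described above:

```php
<?php
// Hypothetical sketch of the nonce flow; names are illustrative, not the vmss API.

// Secret keys issued to client apps (here: just the web app).
const CLIENT_SECRETS = ['webapp' => 'a-long-random-secret'];

// Stand-in for the database table holding valid nonces.
$nonces = [];

// Called by the client app's backend, authenticated with its secret key.
function issueNonce(array &$nonces, string $clientId, string $secret): ?string {
    if (!isset(CLIENT_SECRETS[$clientId]) ||
        !hash_equals(CLIENT_SECRETS[$clientId], $secret)) {
        return null; // unknown client or bad secret
    }
    $nonce = bin2hex(random_bytes(32));
    $nonces[$nonce] = true;
    return $nonce;
}

// Called by the upload endpoint; the nonce is invalidated on first use.
function consumeNonce(array &$nonces, string $nonce): bool {
    if (empty($nonces[$nonce])) {
        return false; // unknown or already-used nonce
    }
    unset($nonces[$nonce]);
    return true;
}
```

The important property is that a nonce authorizes exactly one upload: a second attempt with the same nonce is rejected.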

After a video is uploaded, vmss stores it on the server, adds a record of it existing to the database and adds it to the transcoding queue.


The fun part! (Kidding)

Writing a service for video transcoding is a problem that seems trivial until you actually have to do it.

As you start to build a transcoding service, a few questions arise:

What do I transcode… to?

As you’re (probably) limited by the resources at your disposal, you want as few transcoding operations as you can possibly get away with, so what do you transcode to? You can’t transcode to every known container and codec, and a lot of them come with a significant amount of legal… annoyances.

As far as video goes, your best bet is WebM with VP8 and Vorbis, but MP4 with H.264 and AAC is also good if you’re willing to deal with MPEG legalese. Currently, vmss by default transcodes to both WebM and MP4 for maximum compatibility, though this is configurable and will probably change in the future.

So, after considering what container and codecs you’ll use, what resolutions do you transcode to?

This highly depends on what your use case is:

  • if you’re targeting users with high speed internet, you could get away with using 720p as your base resolution
  • if you’re targeting users who could have really low internet speeds, you’ll probably need to target 144p

As Croatia has decent-enough internet speeds and as we wanted to transcode fast, we ended up targeting four resolutions: 360p and 480p, which provide a decent streaming experience on low-speed connections, and 720p and 1080p, which provide a decent viewing experience on high-speed connections. As we didn’t have many people uploading in resolutions higher than 1080p, we didn’t feel the need to transcode any higher.

How do I transcode?

This is where the fun starts in PHP!

There’s a pretty good PHP library called PHP-FFMpeg for interfacing with ffmpeg which allows you to deal with videos as PHP objects.

This allows you to write clean, understandable code without having to remember the man pages for ffmpeg.

$video = $ffmpeg->open('uploads/'.$videoRow['originalUploadFile']);

switch ($resolution) {
	case '1080p':
		$dimension = new FFMpeg\Coordinate\Dimension(1920, 1080);
		break;
	case '720p':
		$dimension = new FFMpeg\Coordinate\Dimension(1280, 720);
		break;
	case '480p':
		$dimension = new FFMpeg\Coordinate\Dimension(854, 480);
		break;
	case '360p':
		$dimension = new FFMpeg\Coordinate\Dimension(640, 360);
		break;
}

switch ($format) {
	case 'mp4':
		$exportFormat = new FFMpeg\Format\Video\X264();
		$exportExt = '.mp4';
		break;
	case 'webm':
		$exportFormat = new FFMpeg\Format\Video\WebM();
		$exportExt = '.webm';
		break;
}

// Fit the video inside the target dimensions, preserving the aspect ratio
$video->filters()->resize($dimension, FFMpeg\Filters\Video\ResizeFilter::RESIZEMODE_INSET, true)->synchronize();

$exportFormat->on('progress', function ($video, $exportFormat, $percentage) {
	echo "$percentage% transcoded";
});

$video->save($exportFormat, 'uploads/'.$videoRow['vmssID'].'_'.$format.'_'.$resolution.$exportExt);

However, we decided not to use PHP-FFMpeg because of our misguided attempts to handle queued actions. We started by triggering queue actions with web requests (which allowed us flexibility over queue management), but this had a caveat: PHP web requests have a timeout, and PHP-FFMpeg jobs would regularly exceed it. We needed a way to execute queue actions in the background.

This meant that we ended up using the ffmpeg CLI tools:

$command = 'ffmpeg -y -i {input} -vf scale={dimension} -async 1 -metadata:s:v:0 start_time=0 -f webm -c:v libvpx -b:v 1M -acodec libvorbis {output} -hide_banner';

Even after we fixed our queueing issues, we ended up staying with the ffmpeg CLI tools.
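To show how such a command might get assembled for a given queue action, here’s a small sketch. The function name and the resolution table are illustrative assumptions, not vmss’s actual code:

```php
<?php
// Illustrative helper that fills the {input}/{dimension}/{output} placeholders
// of the ffmpeg command above. The resolution ladder is an assumption.

function buildTranscodeCommand(string $input, string $resolution, string $output): string {
    $dimensions = [
        '1080p' => '1920:1080',
        '720p'  => '1280:720',
        '480p'  => '854:480',
        '360p'  => '640:360',
    ];
    $scale = $dimensions[$resolution];

    // Escape file paths so filenames with spaces or quotes can't break the shell command.
    return sprintf(
        'ffmpeg -y -i %s -vf scale=%s -async 1 -metadata:s:v:0 start_time=0 '
        . '-f webm -c:v libvpx -b:v 1M -acodec libvorbis %s -hide_banner',
        escapeshellarg($input),
        $scale,
        escapeshellarg($output)
    );
}
```

The resulting string would then be handed to something like `exec()` by the queue handler.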

Okay, so how do you actually transcode videos? / Queue handling

After a video is uploaded, the transcode operations that have to be done on it get put into a queue:

A queue row has an action ID, the ID of the video that the action should be executed on, the action itself (for example, convert/webm/1080p) and a status, which can be 0, 1 or 2: 0 marks a queued, unexecuted action, 1 an action that’s currently being handled, and 2 an action that has been handled.

We store the queue in the SQL database, as this allows us to easily interact with it from PHP and as we had no need for a specialized queue like Beanstalkd – the SQL queue is performant enough for our current needs.
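A rough sketch of such an SQL-backed queue, based on the row layout described above. The table and column names are assumptions (not vmss’s actual schema), and SQLite stands in for MariaDB to keep the example self-contained:

```php
<?php
// SQL-backed transcode queue sketch; schema names are assumptions.

$db = new PDO('sqlite::memory:'); // vmss uses MariaDB; SQLite keeps this runnable anywhere
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE queue (
    actionID INTEGER PRIMARY KEY AUTOINCREMENT,
    videoID  INTEGER NOT NULL,
    action   TEXT    NOT NULL,              -- e.g. "convert/webm/1080p"
    status   INTEGER NOT NULL DEFAULT 0     -- 0 queued, 1 being handled, 2 handled
)');

// Enqueue a transcode action for a freshly uploaded video.
function enqueue(PDO $db, int $videoID, string $action): void {
    $db->prepare('INSERT INTO queue (videoID, action) VALUES (?, ?)')
       ->execute([$videoID, $action]);
}

// Claim the oldest queued action and mark it as being handled (status 1).
function claimNext(PDO $db): ?array {
    $row = $db->query('SELECT * FROM queue WHERE status = 0 ORDER BY actionID LIMIT 1')
              ->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        return null; // nothing left to do
    }
    $db->prepare('UPDATE queue SET status = 1 WHERE actionID = ?')
       ->execute([$row['actionID']]);
    return $row;
}

// Mark a finished action as handled (status 2).
function markDone(PDO $db, int $actionID): void {
    $db->prepare('UPDATE queue SET status = 2 WHERE actionID = ?')
       ->execute([$actionID]);
}
```

Claiming a row flips it from 0 to 1 so a concurrent run won’t pick it up twice; a successful transcode then flips it to 2.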

I mentioned earlier that we had issues with handling our queue actions – and that’s correct: our first attempt at queue handling was particularly flawed.

As we had never really built similar services in PHP, our first instinct was to use web requests to trigger an action from the queue. This would – in theory – allow for flexibility in how and when queue handling would be triggered, but it came with a few problems of its own.

First of all, if our queue key leaked, anyone on the web could trigger queue handling. Another caveat was that web requests have a timeout in both PHP and nginx, which would halt our transcoding operations mid-run.

The latter issue had a workaround: we could use bash and execute chained commands like ‘ffmpeg something && curl something’, which allowed us to mark an item as done (i.e. set its status to 2) after it was successfully transcoded.

This is flawed – using hacky solutions to make queueing work isn’t what we hoped to end up with.

After we realized our mistake, we went back and thought about how we wanted to deal with queued actions, coming to the conclusion that we wanted to execute them sequentially in non-peak times and that we’d like to control execution in PHP itself.

This time, instead of using web requests to control execution, we used php-cli, which doesn’t have execution time limits.

We wrote a php-cli script that handled all queued actions sequentially and managed the queue accordingly.
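The shape of that script is roughly the following sketch. The function names are hypothetical, and the executor is passed in as a callable so that the actual ffmpeg invocation can be swapped out (or stubbed):

```php
<?php
// Minimal sketch of a sequential php-cli queue worker; names are hypothetical.

set_time_limit(0); // php-cli has no time limit by default, but be explicit

// Drains the queue one action at a time: claim, execute, mark as handled.
// $claimNext returns the next queued row (or null when the queue is empty),
// $execute runs the transcode (e.g. shells out to ffmpeg),
// $markDone sets the row's status to 2.
function drainQueue(callable $claimNext, callable $execute, callable $markDone): int {
    $handled = 0;
    while (($row = $claimNext()) !== null) {
        $execute($row);
        $markDone($row);
        $handled++;
    }
    return $handled;
}
```

Because the actions run strictly one after another, a single worker never runs more than one ffmpeg process at a time, which keeps the load on the server predictable.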

Now came the question of how and when it should run. Do we write a system service or a daemon that analyzes system usage to minimize the performance impact?

This approach was quickly shot down as we couldn’t afford enough time to actually write that, so we used the simplest solution.

You see, our peak traffic is mostly predictable: not a lot of people browse our websites and use our services very early in the morning, so we ended up setting up a cronjob around 2 AM to run the CLI command.

Using cronjobs means that videos aren’t transcoded immediately after upload, which is what you see happening on sites like Facebook or YouTube. However, as we have enough time between the submission deadline and making videos publicly viewable, this didn’t matter: they would already be transcoded when the first viewers appeared.

Why, once again?

We built a hacky but functional video management system that fit our purposes in six days of work. It’s messy and it has its problems, but it works, and it works well enough for our needs. vmss solved our biggest problem: it made our infrastructure costs fully predictable and affordable.

As developers, we often try to avoid hacky code and technical debt as much as possible, either by using open source libraries or by depending on cloud providers for things like this. But in the end, writing in-house code for the things you’d rather avoid dealing with is a real solution to some problems you’ll certainly encounter. Code sometimes needs to be hacked together, especially when you’re facing a deadline – and I consider that to be perfectly fine, as long as you’re aware of the issues that could arise from what you’ve built and are prepared to deal with them.

That being said, we couldn’t find anything remotely similar to vmss that’s open source – please let me know if you come across something like it.

In either case, we’d love to improve vmss to the point where it’s actually usable for high-scale use cases and – to be honest – for use cases beyond our own.

I’d love to hear what features you’d like to see in vmss if you were to consider using it. If you’re interested in contributing to (or even using) vmss, feel free to take a look at the GitHub repo and email me if you need help.