How Prime Video updates its app for more than 8,000 device types

The switch to WebAssembly increases stability, speed.

January 27, 2022

At Prime Video, we’re delivering content to millions of customers on more than 8,000 device types, such as gaming consoles, TVs, set-top boxes, and USB-powered streaming sticks. When we want to do an update, every one of those devices requires a separate native release, posing a difficult trade-off between updatability and performance.

In the past year, we’ve been using WebAssembly (Wasm), a portable binary instruction format that allows code written in high-level languages to run efficiently on any device, to help resolve that trade-off. Because we are excited to contribute to the Wasm ecosystem, Amazon has joined the Bytecode Alliance, a consortium dedicated to developing secure, efficient, modular, and portable runtime environments built atop standards such as Wasm.

By using Wasm instead of JavaScript for certain elements of the Prime Video app, we’ve reduced the average frame times on a mid-range TV from 28 milliseconds to 18. The worst-case frame times also decreased from 40 milliseconds to 25. And in ongoing work we’re driving the frame time down still further.
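To put these frame times in perspective, the short Rust sketch below converts a frame time into the frame rate it implies and computes the relative reduction between two frame times. The function names are illustrative and are not taken from the Prime Video codebase.

```rust
/// Frame rate (frames per second) implied by a per-frame time in milliseconds.
fn fps_from_frame_time_ms(frame_time_ms: f64) -> f64 {
    1000.0 / frame_time_ms
}

/// Time budget per frame, in milliseconds, needed to sustain a target frame rate.
fn frame_budget_ms(target_fps: f64) -> f64 {
    1000.0 / target_fps
}

/// Relative reduction from an old frame time to a new one, as a percentage.
fn percent_reduction(old_ms: f64, new_ms: f64) -> f64 {
    (old_ms - new_ms) / old_ms * 100.0
}
```

With the figures above, 28 milliseconds corresponds to roughly 35.7 frames per second and 18 milliseconds to roughly 55.6. The average frame time drops by about 36 percent and the worst case by 37.5 percent. A 60-frame-per-second target leaves a budget of about 16.7 milliseconds per frame.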

Division of labor

To enable efficient updates on a wide variety of devices while still maintaining performance, the Prime Video app has two parts: a high-performance engine written in C++ that is stored on-device and an easy-to-update component that is downloaded every time the app launches.

Figure: The original architecture of the Prime Video app, with a layer of C++ code stored on-device and layers of JavaScript code downloaded at run time.

In the figure above, the on-device part is a thin C++ layer that includes a JavaScript virtual machine (VM) and the components required to run the Prime Video application. These components handle input, the media pipeline, and processes such as network access, image decoding, and window event handling.

WebAssembly

Wasm provides a compilation target for programming languages that offer more expressivity than JavaScript does, such as C or Rust. Like JavaScript code, compiled Wasm binaries run on a VM that provides a uniform interface between code and hardware, regardless of device.
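As a minimal illustration of what targeting Wasm looks like from Rust, the sketch below exports a single function with a C ABI so that a Wasm host could look it up by name. The function itself, an easing curve named `ease_out_quad`, is a hypothetical example and not code from the Prime Video app.

```rust
/// Quadratic ease-out curve: maps progress `t` in [0, 1] to an eased value in [0, 1].
///
/// `#[no_mangle]` keeps the symbol name intact and `extern "C"` fixes the calling
/// convention, so the function appears as a named export in the compiled module.
/// On the 2024 edition, the attribute is written `#[unsafe(no_mangle)]` instead.
#[no_mangle]
pub extern "C" fn ease_out_quad(t: f32) -> f32 {
    // Clamp so that out-of-range input cannot produce values outside [0, 1].
    let t = t.clamp(0.0, 1.0);
    1.0 - (1.0 - t) * (1.0 - t)
}
```

Assuming a library crate whose `Cargo.toml` sets `crate-type = ["cdylib"]`, running `cargo build --release --target wasm32-unknown-unknown` would produce a `.wasm` binary that exports `ease_out_quad`. The same source also compiles natively, which is convenient for testing.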

Wasm was initially intended for web browsers, but there are now standalone applications of Wasm VMs outside the browser, such as running Internet-of-things software, game mods, and server-side workloads.

Our Wasm investigations started in August 2020, when we built some prototypes to compare the performance of Wasm VMs and JavaScript VMs in simulations involving the type of work our low-level JavaScript components were doing. In those experiments, code written in Rust and compiled to Wasm was 10 to 25 times as fast as JavaScript.

The switch

The figure above shows the new architecture with a Wasm VM and a JavaScript VM running in separate threads. But how did we transition from the first architecture to the second one without rewriting the app?

The first step was updating the on-device C++ layer to include the Wasm VM, so that it can run either version of a given software component (JavaScript only, or JavaScript and Wasm). This allows us to gradually release the Wasm components to a subset of customers.
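A gradual release of this kind is often implemented by hashing a stable device identifier into a fixed number of buckets and enabling the new variant only for buckets below a rollout percentage. The sketch below assumes that approach purely for illustration. `Variant`, `choose_variant`, and the use of an FNV-1a hash are hypothetical choices and do not describe the actual Prime Video release mechanism.

```rust
/// 64-bit FNV-1a hash: dependency-free and stable across runs and platforms,
/// so a given device always lands in the same bucket.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;
    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// The two versions of a component that the on-device layer can run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Variant {
    JavaScriptOnly,
    JavaScriptAndWasm,
}

/// Picks a variant for a device (hypothetical helper).
/// `wasm_rollout_percent` is the share of devices, 0 to 100, that get the Wasm variant.
fn choose_variant(device_id: &str, wasm_rollout_percent: u8) -> Variant {
    // Map the device into one of 100 buckets, numbered 0 to 99.
    let bucket = (fnv1a_64(device_id.as_bytes()) % 100) as u8;
    if bucket < wasm_rollout_percent {
        Variant::JavaScriptAndWasm
    } else {
        Variant::JavaScriptOnly
    }
}
```

Because the bucket depends only on the identifier, raising the percentage adds devices to the Wasm group without moving any device back to the JavaScript-only variant.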

We also had to modify how the Prime Video application communicates with these components. At a high level, the application works by creating a scene (a representation of what is drawn on screen) that consists of nodes whose implementations are device specific. A host node (e.g., view, image, text) is a data structure that has all the necessary information to update and render a component of the visual scene.
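To make the idea of host nodes and cross-thread communication concrete, here is a minimal sketch. It assumes that application logic sends small commands over a channel to an engine thread that owns the scene. The `HostNode` fields, the `Command` messages, and `run_scene` are all hypothetical; the text does not specify the real data layout or protocol.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical host node kinds, mirroring the examples in the text (view, image, text).
#[derive(Debug, Clone, PartialEq)]
enum HostNode {
    View { children: Vec<u32> },
    Image { url: String },
    Text { content: String },
}

/// Hypothetical messages that application logic sends to the engine.
#[derive(Debug)]
enum Command {
    Create { id: u32, node: HostNode },
    SetText { id: u32, content: String },
}

/// Applies the commands on a separate engine thread and returns the final scene.
fn run_scene(commands: Vec<Command>) -> BTreeMap<u32, HostNode> {
    let (sender, receiver) = mpsc::channel::<Command>();

    // Stand-in for the thread that owns the scene (the Wasm side in the article).
    let engine = thread::spawn(move || {
        let mut scene = BTreeMap::new();
        // The loop ends once the sender is dropped and the channel is drained.
        for command in receiver {
            match command {
                Command::Create { id, node } => {
                    scene.insert(id, node);
                }
                Command::SetText { id, content } => {
                    // Only text nodes carry text; other node kinds ignore the update.
                    if let Some(HostNode::Text { content: current }) = scene.get_mut(&id) {
                        *current = content;
                    }
                }
            }
        }
        scene
    });

    // Stand-in for the application-logic thread (the JavaScript side in the article).
    for command in commands {
        sender.send(command).expect("engine thread should still be running");
    }
    drop(sender); // Closing the channel tells the engine thread to finish.

    engine.join().expect("engine thread panicked")
}
```

Passing messages rather than sharing the scene directly means only one thread ever mutates the nodes, which is one plausible way to keep two VMs on separate threads from contending over the same data.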

Results

As I mentioned, the switch to Rust and Wasm has improved the application’s frame rate stability and speed. To reach our goal of reliable 60-frame-per-second frame generation and improve input latency, we will move more systems to Wasm, such as focus management and layout.

The total memory consumption for the Wasm VM, including the module instance, environment, and the module itself, is at most 7.5 megabytes. By moving these systems to Wasm, we have saved a total of 30 megabytes of JavaScript heap memory. Memory is a scarce resource on most of the devices we deploy on, so this is a welcome reduction.

The binary size of our Wasm module is 150 kilobytes when compressed (750 kilobytes uncompressed, after symbol stripping). The module’s small size, coupled with the fast VM start time, means that the addition of Wasm doesn’t affect the app start-up time.
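The arithmetic behind the memory and size figures can be made explicit with a small sketch. It assumes that the 30-megabyte JavaScript heap saving and the 7.5-megabyte Wasm VM cost are independent figures, which the text does not state outright. The function names are illustrative.

```rust
/// Net memory saved in megabytes, assuming the JavaScript heap saving does not
/// already account for the memory the Wasm VM consumes.
fn net_memory_saved_mb(js_heap_saved_mb: f64, wasm_vm_cost_mb: f64) -> f64 {
    js_heap_saved_mb - wasm_vm_cost_mb
}

/// Ratio of uncompressed size to compressed size for the module.
fn compression_ratio(uncompressed_kb: f64, compressed_kb: f64) -> f64 {
    uncompressed_kb / compressed_kb
}
```

Under that assumption, the net saving is at least 22.5 megabytes, since 7.5 megabytes is the upper bound on the Wasm VM's consumption. The module compresses by a factor of 5, from 750 kilobytes to 150.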

Using Rust has enabled programmers of all experience levels to contribute code without requiring reviewers to carefully scrutinize every line for safety pitfalls. We trust the compiler, and we can focus our code reviews on functionality, not language corner cases.

Another benefit of using Rust is having access to an ecosystem of high-quality libraries. For instance, we built an application that overlays debugger information on an application scene render using egui, a Rust GUI library. Integrating egui with our Wasm renderer took a couple of hours of work and offers us an easy way to gain insights into the engine’s internals.

Figure: An image generated by the renderer, with debugging information overlaid.

Overall, we think that this investment in Rust and WebAssembly has paid off: after a year and 37,000 lines of Rust code, we have significantly improved performance and stability and reduced both CPU consumption and memory utilization.

About the Author

Alexandru Ene

Alexandru Ene is a principal software engineer with Amazon Prime Video.