
I did the Underactuated Robotics course at the beginning of this year (for anyone who hasn't, highly recommended! https://underactuated.mit.edu/). One interesting thing to watch was how Russ opened a notebook, did the computation instantaneously, and then spent anywhere from a few seconds to a minute generating the animated illustrations.

If you have ever Googled how to show animations in a Jupyter notebook, you probably know what I am talking about. There are two documented ways: 1. sending image display messages to the notebook and re-rendering the image again and again. There is no control over the frame rate, and the constant re-rendering in the notebook can cause stutters and flashes. 2. rendering the animation into a video and sending the whole thing to the notebook only once it is done. The animation will be smooth, and you can scrub the video back and forth. However, you have to wait for the rendering, which is exactly what Russ complained about repeatedly during the course.
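For reference, the first approach boils down to something like this minimal sketch, assuming `frames` is a list of PIL images (or anything else `display` can render):

```python
# Approach 1: re-send an image display message for every frame.
import time
from IPython.display import display, clear_output

def show_frames(frames, fps=30):
    for frame in frames:
        clear_output(wait=True)   # replace the previous frame in the output area
        display(frame)            # send a fresh image display message
        time.sleep(1.0 / fps)     # best-effort pacing; the real frame rate is not guaranteed
```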

This became relevant to me because I want to build physics models with the amazing MuJoCo library, and it became a headache to view the UI over KVM from my workstation on the laptop (don't ask why I don't do this on the laptop directly; long story, weird setup). What if I could write the model in a Jupyter notebook and play with it from the notebook directly? After a few days, well, you can see it now:

I first learned about motion JPEG from the TinyPilot project. It is space-inefficient, but otherwise a fast, low-latency video streaming protocol that is easy to implement. The best part: it is supported universally by all browsers, over plain HTTP. The format is dead simple: we reply with the content type "multipart/x-mixed-replace" and a boundary identifier, then just stuff the connection with JPEG images one by one. This worked surprisingly well over the local network. No wonder many cheap webcams, lacking a proper H.264 hardware encoder, use motion JPEG as their streaming protocol too.
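This is not my actual implementation, but a minimal sketch of such an endpoint using only the Python standard library; `next_jpeg_frame()` is a hypothetical callable that returns the latest encoded JPEG bytes:

```python
# Motion-JPEG over HTTP: one multipart response, one part per frame.
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BOUNDARY = b"frameboundary"

class MJPEGHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type",
                         "multipart/x-mixed-replace; boundary=frameboundary")
        self.end_headers()
        try:
            while True:
                jpeg = next_jpeg_frame()  # hypothetical frame source
                self.wfile.write(b"--" + BOUNDARY + b"\r\n")
                self.wfile.write(b"Content-Type: image/jpeg\r\n")
                self.wfile.write(b"Content-Length: %d\r\n\r\n" % len(jpeg))
                self.wfile.write(jpeg + b"\r\n")
        except (BrokenPipeError, ConnectionResetError):
            pass  # the viewer navigated away

# ThreadingHTTPServer(("", 8000), MJPEGHandler).serve_forever()
```

Any `<img>` tag pointed at this endpoint will keep replacing its content with the newest frame, which is all the "streaming" we need.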

After sorting out the streaming protocol, what if we make this interactive? Nowadays, that is as simple as opening a WebSocket and passing a few JSON messages over it.
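Something along these lines, assuming a recent version of the `websockets` library and a hypothetical `apply_to_simulation()` that forwards events to the running simulation:

```python
# Receive interaction events as JSON over a WebSocket.
import asyncio
import json
import websockets

async def handle_events(websocket):
    async for message in websocket:
        event = json.loads(message)   # e.g. {"type": "drag", "dx": 3, "dy": -1}
        apply_to_simulation(event)    # hypothetical: update camera, perturb bodies, ...

async def main():
    async with websockets.serve(handle_events, "0.0.0.0", 8765):
        await asyncio.Future()        # run forever

# asyncio.run(main())
```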

With a day of coding, I can now run MuJoCo’s simulate app inside a web browser, not bad!

To integrate with the notebook, there are a few minor, but important usability issues to sort out.

First, how do we know which host to connect to? Remember, all the code in a Jupyter notebook runs on the server. While the kernel can launch a shadow HTTP server without any problems, we need to know whether the notebook was opened from 127.0.0.1 or from x-1840-bbde.us-west-2.amazonaws.com. Luckily, we can send JavaScript to create the img tag, and thus fetch window.location.hostname directly from the notebook webpage.
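A sketch of that trick; the port and element id here are made up:

```python
# Let the browser, not the kernel, decide which host to connect to.
from IPython.display import Javascript, display

def show_stream(port=8000, element_id="mjpeg-view"):
    display(Javascript(f"""
        var img = document.createElement('img');
        // window.location.hostname is whatever host the notebook page was
        // opened from (127.0.0.1, an EC2 hostname, ...).
        img.src = 'http://' + window.location.hostname + ':{port}/stream';
        img.id = '{element_id}';
        element.append(img);  // `element` is the output area of the current cell
    """))
```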

Second, how do we know when to launch an HTTP server? This could be maintained globally (i.e., only one HTTP server per notebook), but you probably want it to end as soon as the cell execution ends. If you control the notebook kernel implementation like I do, inserting some cell-local variables to help would be a good solution. Otherwise, with IPython, you have to read the current execution count (through IPython.Application.instance().kernel.execution_count) to implement a poor-person's cell-local variable. This guarantees that only one HTTP server is created per cell, and you can decide whether to destroy the HTTP server created by the previous cell or just keep it running.
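A sketch of that poor-person's cell-local variable; `start_mjpeg_server()` is a hypothetical stand-in for whatever server object you use:

```python
# Remember which cell the current server belongs to, and start a fresh one
# whenever a new cell runs.
import IPython

_server = None
_server_cell = None

def server_for_this_cell():
    global _server, _server_cell
    count = IPython.Application.instance().kernel.execution_count
    if _server_cell != count:
        if _server is not None:
            _server.stop()              # or keep it running, your choice
        _server = start_mjpeg_server()  # hypothetical
        _server_cell = count
    return _server
```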

With all these tricks, I can now run the control loop for an RL Gym environment and render the view in the notebook smoothly:
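The loop itself looks roughly like this; I am assuming the Gymnasium flavor of the API here, and `stream.push_frame()` is a hypothetical hook that JPEG-encodes a frame and feeds the motion-JPEG stream above:

```python
# Drive a MuJoCo-backed environment and push rendered frames to the stream.
import gymnasium as gym

env = gym.make("Ant-v4", render_mode="rgb_array")
obs, info = env.reset()
for _ in range(1000):
    action = env.action_space.sample()  # stand-in for a real policy
    obs, reward, terminated, truncated, info = env.step(action)
    stream.push_frame(env.render())     # hypothetical streaming hook
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```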

I am pretty happy with it now. If there is one thing I'd like to improve: at the end of cell execution, I would render all the JPEG frames into a video and embed it into the notebook through a video data attribute. That would keep the notebook self-contained, so it renders properly if you push it to GitHub or anywhere else that renders notebooks offline.
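That fallback could look roughly like this, assuming `imageio` with the ffmpeg plugin and a list of captured RGB frames:

```python
# Encode the captured frames into an MP4 and embed it as a base64 data URI,
# so the notebook stays self-contained.
import base64
import imageio.v2 as imageio
from IPython.display import HTML, display

def embed_frames(frames, fps=30):
    imageio.mimwrite("rollout.mp4", frames, fps=fps)  # needs the ffmpeg plugin
    with open("rollout.mp4", "rb") as f:
        data = base64.b64encode(f.read()).decode()
    display(HTML(
        f'<video controls src="data:video/mp4;base64,{data}"></video>'
    ))
```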
