Once in a while, we ask questions like: what could we do with more computation? When we asked that question around 2010, CUDA came along and Jensen Huang gifted everyone under the sun a GTX 980 or GTX Titan, in search of problems beyond graphics that these wonderful computation workhorses could help solve.
Then, suddenly, we found that with 100x more computation we could solve not only simulation problems (3D graphics, physics emulation, weather forecasting, etc.) but also perception problems. That started the gold rush of deep learning.
Fast-forward to today: as deep learning makes great advances in perception and understanding, the focus has moved from pure computation to interconnects and memory. We can do really interesting things with the computation available today. What could we do if another 100x were available?
To put it more bluntly: I am not interested in making supercomputers in data centers 100x faster. What if a mobile phone, a laptop, or a Raspberry Pi could carry 100x more computation in a similar power envelope? What could we do with that?
To answer that question, we need to turn our eyes from the virtual world back to our physical world. Because dynamics in the physical world are complex, for many years we built machines with ruthless simplifications. We built heavy cranes to balance out the heavy things we were going to lift. Even our most advanced robotic arms often have heavy bases so that the dynamics are affine in the control force. More often than not, humans are in the loop to control these dynamics, as in early airplanes or F1 racing cars.
That’s why the machines we build today mostly have pretty straightforward dynamics. Even with microcontrollers, our jet fighters and quadcopters rely on fairly simple dynamics control. Only recently has Boston Dynamics started to build machines designed with whole-body dynamics in mind, machines whose dynamics are actually sophisticated.
Now, imagine a world where every machine is much more nimble, more life-like, because we no longer need to simplify the system dynamics but can leverage them. To get there, we need much more computation.
To control a dynamical system, we typically need to solve optimization problems with hundreds to thousands of variables. These are not crazy numbers; our computers today can compute the eigenvalues of a matrix with rank in the thousands quite easily. The trick is to do it fast. Active control applied at 1000Hz is much more stable than control applied at 10Hz. That means doing all the numerical integration and matrix inversion in under 1 millisecond. For this, we need to do much more computation in 1 millisecond than we can today.
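To make the latency budget concrete, here is a minimal sketch of that idea: a stand-in control step that solves one dense linear system per tick (a rough proxy for the matrix work a whole-body controller must finish each cycle), timed against the 1 millisecond budget of a 1000Hz loop. The function names, matrix size, and loop structure are all hypothetical illustration, not any particular controller.

```python
import time
import numpy as np

def control_step(A, b):
    # One hypothetical control update: solve a dense linear system,
    # standing in for the numerical integration and matrix inversion
    # a whole-body controller must finish inside a single tick.
    return np.linalg.solve(A, b)

def run_loop(n=500, rate_hz=1000, steps=10):
    # Per-tick time budget at the target control rate (1 ms at 1000 Hz).
    budget = 1.0 / rate_hz
    rng = np.random.default_rng(0)
    # Diagonally-dominant matrix, so the solve is well conditioned.
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    b = rng.standard_normal(n)
    worst = 0.0
    for _ in range(steps):
        t0 = time.perf_counter()
        x = control_step(A, b)
        worst = max(worst, time.perf_counter() - t0)
    # Residual tells us the solve is actually correct, not just fast.
    residual = float(np.linalg.norm(A @ x - b))
    return residual, worst, budget

if __name__ == "__main__":
    residual, worst, budget = run_loop()
    print(f"worst tick: {worst * 1e3:.2f} ms (budget {budget * 1e3:.2f} ms)")
```

On a modern laptop a 500-variable solve may just fit the budget; scale `n` up, or run it on a Raspberry Pi, and the worst-case tick quickly blows past 1 ms, which is exactly the gap the essay is pointing at.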
If we are careful, we will spend our gifted 100x of computation more strategically. We will work on anything that reduces computation latency, such as sharing memory between CPUs and GPUs. We will mainline critical work such as PREEMPT_RT into the Linux kernel. We will reduce the number of ECUs so that whole-body dynamics can be computed on one beefier computer. We will make our software / hardware packages easier to use, so they scale from small robot vacuums to the biggest cranes.
During our first 100x leap, we solved graphics. With the next 100x leap, we solved simulation and perception. Now it is time for another 100x leap, to solve dynamics. I am convinced this is the only way to build machines as efficient as their biological counterparts. And these more dynamic, more life-like machines will be our answer to a sustainable, greener future, where we can build more with less.