I was reminded of this whole area when I came across an OSDI paper on “Neural Adaptive Content-aware Internet Video Delivery“. The summary is that by using neural networks they’re able to improve a quality-of-experience metric by 43% if they keep the bandwidth the same, or alternatively reduce the bandwidth by 17% while preserving the perceived quality. There have also been other papers in a similar vein, such as this one on generative compression [PDF], or adaptive image compression. They all show impressive results, so why don’t we hear more about compression as a machine learning application?
All of these approaches require comparatively large neural networks, and the amount of arithmetic needed scales with the number of pixels. This means large images or video with high frames-per-second can require more computing power than current phones and similar devices have available. Most CPUs can only practically handle tens of billions of arithmetic operations per second, and running ML compression on HD video could easily require ten times that. The good news is that there are hardware solutions, like the Edge TPU amongst others, that offer the promise of much more compute being available in the future. I’m hopeful that we’ll be able to apply these resources to all sorts of compression problems, from video and image, to audio, and even more imaginative approaches.