Skill Level: Any Skill Level


  1. What is Image Optimization?

    Image optimization is the process of storing and transferring images with the smallest possible file size, without compromising overall image quality. Video optimization refers to a set of techniques that improve the user’s video experience, by reducing video start time and using buffering strategies. This process is also designed to reduce the amount of network bandwidth used by video sessions.

    Large, unoptimized images and videos can slow down web pages and lead to a sub-optimal user experience. Website owners, developers and product design teams therefore go to great lengths to reduce page load times, using plugins, scripts or other techniques to shrink file sizes. Beyond traditional methods, the industry is starting to explore new techniques for image and video optimization based on machine learning and artificial intelligence (ML/AI).

  2. Traditional Image Compression Techniques

    Compression is essential to ensure image and video files are loaded quickly and efficiently. During file transmissions and streaming there are network and storage limitations that can impact the overall delivery time and quality of the file. Simply put, when the file is too big, it can take a long time to load. Additionally, heavy files consume a lot of storage and network resources.

    Compression techniques reduce the size of the file, ideally without damaging the quality of the image or video. The most basic compression technique removes redundant data, like duplicated pixels and spatial redundancies, from the file. When this information is removed, it reduces the overall size of the file. 
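
    As a rough illustration of removing duplicated pixels, here is a minimal run-length encoding sketch in Python. It is only one simple way to exploit this kind of redundancy, not the method of any particular image format, and the function names and the sample row are made up for the example.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical pixel values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original pixel row."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out

# Hypothetical row of 8-bit grayscale pixels with long runs of the same value
row = [255, 255, 255, 255, 0, 0, 255, 255]
encoded = rle_encode(row)
print(encoded)  # [(255, 4), (0, 2), (255, 2)]
print(rle_decode(encoded) == row)  # True: nothing was lost
```

    Eight pixel values shrink to three pairs here, and decoding restores the row exactly. On noisy images with few repeated values, this approach saves little or nothing.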

    There are also compression methods, such as Huffman coding, that reduce statistical redundancy through entropy coding: values that occur often are given shorter codes. Later on, a spatial frequency encoding method called the Discrete Cosine Transform (DCT) was introduced. It converts blocks of pixels into frequency coefficients, many of which can be stored coarsely or discarded, significantly reducing file size. This technique is still used today in compression standards such as JPEG.
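
    To make the entropy coding idea more concrete, here is a small Huffman coding sketch in Python. It assumes the input is a flat list of pixel values, and the helper names (huffman_code, encode, decode) are illustrative rather than taken from any library.

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Build a prefix code that gives frequent symbols shorter bit strings."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(data, code):
    return "".join(code[s] for s in data)

def decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Hypothetical pixel data dominated by one value
pixels = [0] * 12 + [128] * 3 + [255]
code = huffman_code(pixels)
bits = encode(pixels, code)
print(len(bits), "bits instead of", 8 * len(pixels))  # 20 bits instead of 128
```

    The most common value gets a one-bit code and the rarer values get two bits each, so the 16 pixels fit into 20 bits. A real codec also has to store the code table, which this sketch leaves out.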

    DCT separates the fine detail in an image, much of which is barely visible to the human eye, into high-frequency coefficients. The compression algorithm stores these coefficients coarsely or drops them altogether, so far less data has to be kept, which can dramatically reduce image size.
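
    The sketch below illustrates the idea on a single row of eight samples: transform with a one-dimensional DCT, divide each coefficient by a step size and round, then reverse the process. It is a simplified illustration, not the JPEG algorithm itself: JPEG works on 8x8 blocks in two dimensions, and the step sizes in STEPS are hypothetical values chosen for the example.

```python
import math

N = 8  # samples per block

def _scale(k):
    # Orthonormal scaling so that idct() exactly inverts dct()
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(block):
    """1-D DCT-II: turn N samples into N frequency coefficients."""
    return [
        _scale(k) * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                        for n, x in enumerate(block))
        for k in range(N)
    ]

def idct(coeffs):
    """Inverse transform: rebuild N samples from N coefficients."""
    return [
        sum(_scale(k) * c * math.cos(math.pi * (n + 0.5) * k / N)
            for k, c in enumerate(coeffs))
        for n in range(N)
    ]

def quantize(coeffs, steps):
    """Divide by the step size and round; this is where detail is lost."""
    return [round(c / q) for c, q in zip(coeffs, steps)]

def dequantize(levels, steps):
    return [level * q for level, q in zip(levels, steps)]

# Hypothetical step sizes: coarser for higher frequencies
STEPS = [4, 6, 8, 12, 16, 24, 32, 48]

row = [52, 55, 61, 66, 70, 73, 76, 78]  # a smooth gradient of brightness values
levels = quantize(dct(row), STEPS)
restored = [round(v) for v in idct(dequantize(levels, STEPS))]
print(levels)    # small integers; high-frequency entries tend toward zero
print(restored)  # close to the original row, but not necessarily identical
```

    The quantized levels are small integers with many zeros among the higher frequencies, which an entropy coder can then store compactly. Larger step sizes give smaller files and a less faithful reconstruction.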

  3. Image Optimization with Machine Learning

    While traditional image optimization algorithms were quite successful, there is a lot of potential in the use of machine learning to improve compression and optimization processes. Here are a few ways machine learning can be used.

    Image Compression and Resolution
    RAISR (Rapid and Accurate Image Super-Resolution) is an algorithm developed and released by Google. RAISR combines traditional upsampling with filters learned from pairs of low- and high-quality images, for the purpose of turning low-resolution images into higher-resolution ones.
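
    For reference, the "traditional upsampling" half of that pipeline can be as simple as bilinear interpolation. The sketch below shows only that step, assuming a grayscale image stored as a list of rows. It is not RAISR: the learned stage that sharpens the upscaled result is omitted, and upscale_bilinear is a name invented for this example.

```python
def upscale_bilinear(img, factor):
    """Upscale a 2-D grayscale image (list of rows) by an integer factor."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h * factor):
        # Map the output row back to source coordinates, clamped to the image
        sy = min(y / factor, h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * factor):
            sx = min(x / factor, w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Blend the four nearest source pixels
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

# Hypothetical 2x2 image doubled to 4x4
small = [[0, 100],
         [100, 200]]
big = upscale_bilinear(small, 2)
for line in big:
    print(line)
```

    Interpolation like this produces a larger but soft-looking image, because it can only average existing pixels. Recovering sharper detail is the job of the learned part of the pipeline.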

    Image Enhancement
    Developed by researchers at MIT and Google, the Deep Bilateral Learning framework uses neural networks to enhance full-HD images in real time. It can perform smart retouching on images, similar to a human retoucher working in Photoshop. The amazing part of this technology is that it is efficient enough to perform real-time inference on a mobile device.

    One possible application is allowing phone users to preview an enhanced image with various improvements or filters applied, even before they take a photo on their phone.

    Image Manipulation and Generation
    Generative adversarial networks (GANs) can be used to generate new features in existing images. A GAN works by pitting two neural networks against each other: one, the generator, tries to create fake versions of an image, and the other, the discriminator, attempts to tell the fake images apart from real images. Over the course of training, the generator gets very good at producing realistic fake images.
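
    The tug-of-war between the two networks is usually expressed as a pair of loss functions. The sketch below computes them from the scores one network assigns to real and fake images, where each score is a probability between 0 and 1 that the image is real. It is a simplified illustration with made-up scores and no actual networks or training loop, and it uses the commonly used "non-saturating" form of the loss for the network that generates images.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: low when real images score high and fakes score low."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating loss: low when fakes are scored as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical scores early in training: fakes are easy to spot
early_real, early_fake = [0.9, 0.8], [0.1, 0.2]
# Hypothetical scores later on: fakes are nearly indistinguishable from real images
late_real, late_fake = [0.55, 0.5], [0.5, 0.45]

print(generator_loss(early_fake) > generator_loss(late_fake))  # True
print(discriminator_loss(early_real, early_fake)
      < discriminator_loss(late_real, late_fake))              # True
```

    As the fakes improve, the loss for the image-generating network falls while the loss for the network judging them rises. Training alternates between updating each network to lower its own loss.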

    GANs can be used to modify images in surprising ways, for example, removing rain, changing a scene shot during the day to nighttime, or adding or removing people and other objects. These capabilities can be used for a variety of applications, from film production to fun consumer applications to medical image processing.

    Another variation on the GAN concept is super-resolution generative adversarial networks (SR-GAN). These are neural networks that can take a low-resolution image and “magically” increase the resolution, by predicting and adding realistic details to the image.

  4. Conclusion

    AI has tremendous potential to optimize images and video – beyond the adequate performance of traditional algorithms. With the growing demand for rich media streamed over mobile devices, this will be critical for powering richer online experiences. Just like the Internet transitioned from text-based content to image-rich web pages to dynamic and video content, AI will power the next transition to new types of content, such as augmented reality and 360-degree video.
