Author: Awais Farooq

  • CutMix data augmentation for image classification

    CutMix is a data augmentation technique that addresses the information loss and inefficiency present in regional dropout strategies. Instead of removing pixels and filling them with black or grey pixels or Gaussian noise, you replace the removed regions with a patch from another image, while the ground truth labels are mixed proportionally to…

  • Zero-DCE for low-light image enhancement

    Zero-Reference Deep Curve Estimation (Zero-DCE) formulates low-light image enhancement as the task of estimating an image-specific tonal curve with a deep neural network. In this example, we train a lightweight deep network, DCE-Net, to estimate pixel-wise and high-order tonal curves for dynamic range adjustment of a given image. Zero-DCE takes a low-light image as input and produces high-order tonal…

  • Enhanced Deep Residual Networks for single-image super-resolution

    In this example, we implement Enhanced Deep Residual Networks for Single Image Super-Resolution (EDSR) by Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. The EDSR architecture is based on the SRResNet architecture and consists of multiple residual blocks. It uses constant scaling layers instead of batch normalization layers to produce consistent results…

  • Image Super-Resolution using an Efficient Sub-Pixel CNN

    ESPCN (Efficient Sub-Pixel CNN), proposed by Shi, 2016, is a model that reconstructs a high-resolution version of an image given a low-resolution version. It leverages efficient “sub-pixel convolution” layers, which learn an array of image upscaling filters. In this code example, we will implement the model from the paper and train it on a small dataset, BSDS500.…

  • Low-light image enhancement using MIRNet

    With the goal of recovering high-quality image content from its degraded version, image restoration has numerous applications, such as in photography, security, medical imaging, and remote sensing. In this example, we implement the MIRNet model for low-light image enhancement, a fully convolutional architecture that learns an enriched set of features that combines contextual information from multiple scales,…

  • Convolutional autoencoder for image denoising

    This example demonstrates how to implement a deep convolutional autoencoder for image denoising, mapping noisy digit images from the MNIST dataset to clean digit images. This implementation is based on an original blog post titled Building Autoencoders in Keras by François Chollet. To build the autoencoder, we are going to use the Functional API…

  • Handwriting recognition

    This example shows how the Captcha OCR example can be extended to the IAM Dataset, which has variable-length ground-truth targets. Each sample in the dataset is an image of some handwritten text, and its corresponding target is the string present in the image. The IAM Dataset is widely used across many OCR benchmarks, so we hope…

  • OCR model for reading Captchas

    This example demonstrates a simple OCR model built with the Functional API. Apart from combining CNN and RNN, it also illustrates how you can instantiate a new layer and use it as an “Endpoint layer” for implementing CTC loss. For a detailed guide to layer subclassing, please check out this page in the developer guides.…

  • Point cloud segmentation with PointNet

    A “point cloud” is an important type of data structure for storing geometric shape data. Due to its irregular format, it’s often transformed into regular 3D voxel grids or collections of images before being used in deep learning applications, a step which makes the data unnecessarily large. The PointNet family of models solves this…

  • 3D volumetric rendering with NeRF

    In this example, we present a minimal implementation of the research paper NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis by Ben Mildenhall et al. The authors have proposed an ingenious way to synthesize novel views of a scene by modelling the volumetric scene function through a neural network. To help you understand this intuitively, let’s start with…