Author: Awais Farooq

  • Image Captioning

    Setup: Download the dataset. We will be using the Flickr8K dataset for this tutorial. This dataset comprises over 8,000 images, each paired with five different captions. Preparing the dataset: Vectorizing the text data. We’ll use the TextVectorization layer to vectorize the text data, that is to say, to turn the original strings into integer sequences…
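
    As a rough illustration of that vectorization step, here is a minimal sketch of the TextVectorization layer turning caption strings into integer sequences. The sample captions, vocabulary size, and sequence length below are placeholders, not values taken from the tutorial.

    ```python
    import tensorflow as tf
    from tensorflow.keras.layers import TextVectorization

    # Illustrative caption strings; the tutorial itself reads these from Flickr8K.
    captions = [
        "a dog runs across the grass",
        "two children play near a fountain",
    ]

    # Hypothetical settings chosen only for this sketch.
    vectorizer = TextVectorization(
        max_tokens=10000,           # cap the vocabulary size
        output_mode="int",          # emit integer token indices
        output_sequence_length=25,  # pad/truncate every caption to a fixed length
    )
    vectorizer.adapt(captions)  # build the vocabulary from the caption corpus

    print(vectorizer(captions))  # integer sequences of shape (2, 25)
    ```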

  • RandAugment for Image Classification for Improved Robustness

    Data augmentation is a very useful technique that can help improve the translational invariance of convolutional neural networks (CNNs). RandAugment is a stochastic data augmentation routine for vision data and was proposed in RandAugment: Practical automated data augmentation with a reduced search space. It is composed of strong augmentation transforms like color jitters, Gaussian blurs,…
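
    The sketch below conveys the general shape of such a routine: for each batch, sample a couple of transforms uniformly at random from a pool and apply them in sequence. It is not the tutorial's implementation, and the pool, magnitudes, and helper names are illustrative; a real RandAugment setup uses a much larger transform pool and a shared magnitude parameter.

    ```python
    import random
    import tensorflow as tf

    # A small, illustrative pool of tf.image transforms with fixed magnitudes.
    def random_brightness(images):
        return tf.image.random_brightness(images, max_delta=0.2)

    def random_contrast(images):
        return tf.image.random_contrast(images, lower=0.8, upper=1.2)

    def random_saturation(images):
        return tf.image.random_saturation(images, lower=0.8, upper=1.2)

    TRANSFORM_POOL = [random_brightness, random_contrast, random_saturation]

    def rand_augment(images, num_ops=2):
        # Sample `num_ops` transforms uniformly (with replacement) and chain them.
        # Note: inside a tf.data pipeline you would want TF-native randomness
        # rather than Python's random module.
        for op in random.choices(TRANSFORM_POOL, k=num_ops):
            images = op(images)
        return tf.clip_by_value(images, 0.0, 1.0)

    # Usage: a float32 batch with values in [0, 1].
    images = tf.random.uniform((4, 224, 224, 3))
    augmented = rand_augment(images)
    ```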

  • MixUp augmentation for image classification

    Introduction mixup is a domain-agnostic data augmentation technique proposed in mixup: Beyond Empirical Risk Minimization by Zhang et al. It’s implemented with the following formulas: x̃ = λ·xᵢ + (1 − λ)·xⱼ and ỹ = λ·yᵢ + (1 − λ)·yⱼ, where (xᵢ, yᵢ) and (xⱼ, yⱼ) are two examples drawn from the training data. (Note that the lambda values are within the [0, 1] range and are sampled from the Beta distribution.) The technique is quite systematically named. We are literally mixing up the features and their corresponding labels.…
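
    A minimal sketch of that mixing step is shown below, assuming one-hot labels and, for brevity, a single lambda shared across the whole batch; the alpha value and function name are illustrative rather than the tutorial's code.

    ```python
    import numpy as np
    import tensorflow as tf

    def mixup(images, labels, alpha=0.2):
        # Draw lambda from Beta(alpha, alpha); alpha=0.2 is a placeholder choice.
        lam = float(np.random.beta(alpha, alpha))
        # Pair every sample with a randomly chosen partner from the same batch.
        indices = tf.random.shuffle(tf.range(tf.shape(images)[0]))
        mixed_images = lam * images + (1.0 - lam) * tf.gather(images, indices)
        mixed_labels = lam * labels + (1.0 - lam) * tf.gather(labels, indices)
        return mixed_images, mixed_labels

    # Usage: labels must be one-hot so they can be mixed into soft targets.
    images = tf.random.uniform((8, 32, 32, 3))
    labels = tf.one_hot(tf.random.uniform((8,), maxval=10, dtype=tf.int32), depth=10)
    mixed_images, mixed_labels = mixup(images, labels)
    ```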

  • CutMix data augmentation for image classification

    Introduction CutMix is a data augmentation technique that addresses the issue of information loss and inefficiency present in regional dropout strategies. Instead of removing pixels and filling them with black or grey pixels or Gaussian noise, you replace the removed regions with a patch from another image, while the ground truth labels are mixed proportionally to…
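
    A simplified sketch of the idea follows, assuming one-hot labels and, for brevity, a single box shared across the batch; the helper name and alpha value are illustrative, not the tutorial's code.

    ```python
    import numpy as np
    import tensorflow as tf

    def cutmix(images, labels, alpha=1.0):
        batch_size, height, width = images.shape[0], images.shape[1], images.shape[2]
        lam = np.random.beta(alpha, alpha)

        # Sample a box whose area is roughly (1 - lambda) of the image.
        cut_ratio = np.sqrt(1.0 - lam)
        cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
        cy, cx = np.random.randint(height), np.random.randint(width)
        y1, y2 = np.clip(cy - cut_h // 2, 0, height), np.clip(cy + cut_h // 2, 0, height)
        x1, x2 = np.clip(cx - cut_w // 2, 0, width), np.clip(cx + cut_w // 2, 0, width)

        # Paste the box from a shuffled partner image into each image.
        indices = np.random.permutation(batch_size)
        patched = images.numpy().copy()
        patched[:, y1:y2, x1:x2, :] = patched[indices, y1:y2, x1:x2, :]

        # Recompute lambda from the exact pasted area and mix the labels accordingly.
        lam = float(1.0 - (y2 - y1) * (x2 - x1) / (height * width))
        mixed_labels = lam * labels + (1.0 - lam) * tf.gather(labels, indices)
        return tf.convert_to_tensor(patched), mixed_labels

    images = tf.random.uniform((8, 32, 32, 3))
    labels = tf.one_hot(tf.random.uniform((8,), maxval=10, dtype=tf.int32), depth=10)
    mixed_images, mixed_labels = cutmix(images, labels)
    ```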

  • Zero-DCE for low-light image enhancement

    Introduction Zero-Reference Deep Curve Estimation or Zero-DCE formulates low-light image enhancement as the task of estimating an image-specific tonal curve with a deep neural network. In this example, we train a lightweight deep network, DCE-Net, to estimate pixel-wise and high-order tonal curves for dynamic range adjustment of a given image. Zero-DCE takes a low-light image as input and produces high-order tonal…
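
    The sketch below applies the light-enhancement curve LE(x) = x + α · x · (1 − x) iteratively, which is the core operation in the Zero-DCE formulation; the alpha maps are random placeholders standing in for the per-pixel curve parameters that a trained DCE-Net would predict.

    ```python
    import tensorflow as tf

    def apply_curves(image, alpha_maps):
        # image: (H, W, 3) in [0, 1]; each alpha map: (H, W, 3) with values in [-1, 1].
        x = image
        for alpha in alpha_maps:
            x = x + alpha * x * (1.0 - x)  # one curve-adjustment iteration
        return x

    low_light = tf.random.uniform((256, 256, 3))
    # Eight iterations here, matching the number of curve maps used in Zero-DCE.
    alphas = [tf.random.uniform((256, 256, 3), minval=-1.0, maxval=1.0) for _ in range(8)]
    enhanced = apply_curves(low_light, alphas)
    ```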

  • Enhanced Deep Residual Networks for single-image super-resolution

    Introduction In this example, we implement Enhanced Deep Residual Networks for Single Image Super-Resolution (EDSR) by Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. The EDSR architecture is based on the SRResNet architecture and consists of multiple residual blocks. It uses constant scaling layers instead of batch normalization layers to produce consistent results…
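
    A sketch of such a residual block in Keras is shown below, with batch normalization omitted and a constant scaling applied to the residual branch before the skip connection; the filter count and scaling factor are illustrative.

    ```python
    from tensorflow import keras
    from tensorflow.keras import layers

    def residual_block(inputs, filters=64, scaling=0.1):
        # Two convolutions, no batch normalization.
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(inputs)
        x = layers.Conv2D(filters, 3, padding="same")(x)
        # Constant residual scaling in place of normalization layers.
        x = layers.Rescaling(scaling)(x)
        return layers.Add()([inputs, x])

    # Usage: stack a few blocks on top of an initial feature-extraction convolution.
    inputs = keras.Input(shape=(None, None, 64))
    outputs = residual_block(residual_block(inputs))
    block_stack = keras.Model(inputs, outputs)
    ```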

  • Image Super-Resolution using an Efficient Sub-Pixel CNN

    Introduction ESPCN (Efficient Sub-Pixel CNN), proposed by Shi et al. (2016), is a model that reconstructs a high-resolution version of an image given a low-resolution version. It leverages efficient “sub-pixel convolution” layers, which learn an array of image upscaling filters. In this code example, we will implement the model from the paper and train it on a small dataset, BSDS500.…
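
    The sketch below shows the sub-pixel upscaling path: a final convolution produces channels × upscale_factor² feature maps, and depth_to_space (also known as pixel shuffle) rearranges them into a larger image. The layer widths and the single-channel input are illustrative, not necessarily the paper's exact configuration.

    ```python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    upscale_factor = 3
    channels = 1  # a single (e.g. luminance) channel, as is common in super-resolution setups

    inputs = keras.Input(shape=(None, None, channels))
    x = layers.Conv2D(64, 5, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(channels * upscale_factor**2, 3, padding="same")(x)
    # Rearrange the channel dimension into an upscaled spatial grid.
    outputs = layers.Lambda(lambda t: tf.nn.depth_to_space(t, upscale_factor))(x)
    model = keras.Model(inputs, outputs)
    ```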

  • Low-light image enhancement using MIRNet

    Introduction With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in photography, security, medical imaging, and remote sensing. In this example, we implement the MIRNet model for low-light image enhancement, a fully-convolutional architecture that learns an enriched set of features that combines contextual information from multiple scales,…
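
    To give a flavour of the multi-scale idea only, the sketch below computes features at three resolutions in parallel and fuses them. MIRNet's actual multi-scale residual block is considerably more elaborate (selective kernel fusion, attention, and so on), so treat this purely as an illustration; all layer sizes are placeholders.

    ```python
    from tensorflow import keras
    from tensorflow.keras import layers

    def multi_scale_features(inputs, filters=64):
        # Full-resolution branch.
        full = layers.Conv2D(filters, 3, padding="same", activation="relu")(inputs)

        # Half-resolution branch: downsample, convolve, upsample back.
        half = layers.AveragePooling2D(2)(inputs)
        half = layers.Conv2D(filters, 3, padding="same", activation="relu")(half)
        half = layers.UpSampling2D(2)(half)

        # Quarter-resolution branch.
        quarter = layers.AveragePooling2D(4)(inputs)
        quarter = layers.Conv2D(filters, 3, padding="same", activation="relu")(quarter)
        quarter = layers.UpSampling2D(4)(quarter)

        # Fuse the three scales with a 1x1 convolution.
        fused = layers.Concatenate()([full, half, quarter])
        return layers.Conv2D(filters, 1, padding="same")(fused)

    inputs = keras.Input(shape=(128, 128, 3))
    model = keras.Model(inputs, multi_scale_features(inputs))
    ```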

  • Convolutional autoencoder for image denoising

    Introduction This example demonstrates how to implement a deep convolutional autoencoder for image denoising, mapping noisy digit images from the MNIST dataset to clean digit images. This implementation is based on an original blog post titled Building Autoencoders in Keras by François Chollet. Setup. Prepare the data. Build the autoencoder: We are going to use the Functional API…
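
    A compact version of such a denoising autoencoder, built with the Functional API and trained on (noisy, clean) MNIST pairs, might look as follows; the layer sizes and noise level are illustrative rather than the example's exact settings.

    ```python
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    inputs = keras.Input(shape=(28, 28, 1))

    # Encoder: strided convolutions downsample 28x28 -> 7x7.
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)

    # Decoder: transposed convolutions upsample back to 28x28.
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)

    autoencoder = keras.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

    # Build (noisy, clean) training pairs from MNIST.
    (x_train, _), _ = keras.datasets.mnist.load_data()
    x_train = x_train.astype("float32")[..., None] / 255.0
    noisy = np.clip(x_train + 0.4 * np.random.normal(size=x_train.shape), 0.0, 1.0).astype("float32")
    autoencoder.fit(noisy, x_train, epochs=1, batch_size=128)
    ```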

  • Handwriting recognition

    Introduction This example shows how the Captcha OCR example can be extended to the IAM Dataset, which has variable-length ground-truth targets. Each sample in the dataset is an image of some handwritten text, and its corresponding target is the string present in the image. The IAM Dataset is widely used across many OCR benchmarks, so we hope…
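
    The key ingredient for handling variable-length targets is a CTC loss between the model's per-timestep character predictions and padded label sequences. The small custom layer below registers that loss in the style of the Captcha OCR example the text refers to; the class name and shape handling are a sketch, not the tutorial's exact code.

    ```python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    class CTCLayer(layers.Layer):
        """Adds the CTC loss between padded labels and per-timestep predictions."""

        def call(self, labels, predictions):
            # labels: (batch, max_label_len) padded integer sequences
            # predictions: (batch, timesteps, vocab_size + 1) softmax outputs
            batch_len = tf.cast(tf.shape(labels)[0], dtype="int64")
            input_len = tf.cast(tf.shape(predictions)[1], dtype="int64")
            label_len = tf.cast(tf.shape(labels)[1], dtype="int64")

            input_length = input_len * tf.ones(shape=(batch_len, 1), dtype="int64")
            label_length = label_len * tf.ones(shape=(batch_len, 1), dtype="int64")

            # Register the loss on the layer; at inference only the predictions matter.
            loss = keras.backend.ctc_batch_cost(labels, predictions, input_length, label_length)
            self.add_loss(loss)
            return predictions
    ```

    In the full model, the predictions would come from a recurrent head ending in a softmax over the character set plus a CTC blank token.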