Author: Awais Farooq

  • The base Layer class

    This is the class from which all layers inherit. A layer is a callable object that takes as input one or more tensors and that outputs one or more tensors. It involves computation, defined in the call() method, and state (weight variables). State can be created in various places. Layers are recursively composable: if you assign a Layer instance as an attribute…
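    For illustration, a minimal custom-layer sketch: state (weights) is created with add_weight() in build(), computation lives in call(), and assigning Layer instances as attributes composes them recursively. The Linear/MLPBlock names, unit counts, and initializers below are illustrative choices, not taken from the source.

    ```python
    import keras
    from keras import ops


    class Linear(keras.layers.Layer):
        """A Dense-like layer: state in build(), computation in call()."""

        def __init__(self, units=32):
            super().__init__()
            self.units = units

        def build(self, input_shape):
            # State: weight variables, created lazily once the input shape is known.
            self.w = self.add_weight(
                shape=(input_shape[-1], self.units),
                initializer="glorot_uniform",
                trainable=True,
            )
            self.b = self.add_weight(
                shape=(self.units,), initializer="zeros", trainable=True
            )

        def call(self, inputs):
            # Computation: the layer is callable, tensors in, tensors out.
            return ops.matmul(inputs, self.w) + self.b


    class MLPBlock(keras.layers.Layer):
        def __init__(self):
            super().__init__()
            # Recursive composition: the outer layer tracks the weights
            # created by the inner Layer instances assigned as attributes.
            self.dense_1 = Linear(32)
            self.dense_2 = Linear(10)

        def call(self, inputs):
            return self.dense_2(ops.relu(self.dense_1(inputs)))
    ```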

  • Model training APIs

    compile method: configures the model for training. fit method: trains the model for a fixed number of epochs (dataset iterations). Unpacking behavior for iterator-like inputs: a common pattern is to pass an iterator-like object, such as a tf.data.Dataset or a keras.utils.PyDataset, to fit(), which will in fact yield not only features (x) but optionally targets (y) and sample weights…
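    A minimal sketch of the two calls; the layer sizes and the random NumPy arrays standing in for a real dataset are illustrative.

    ```python
    import numpy as np
    import keras

    model = keras.Sequential(
        [
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(10, activation="softmax"),
        ]
    )

    # compile() configures the model for training.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # fit() trains the model for a fixed number of epochs (dataset iterations).
    x = np.random.random((256, 32)).astype("float32")
    y = np.random.randint(0, 10, size=(256,))
    model.fit(x, y, batch_size=32, epochs=3, validation_split=0.2)
    ```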

  • The Sequential class

    Sequential groups a linear stack of layers into a Model. The add method adds a layer instance on top of the layer stack; the pop method removes the last layer in the model.
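    A short sketch of the stack-like interface; the layer sizes are arbitrary.

    ```python
    import keras

    model = keras.Sequential(name="mlp")

    # add() pushes a layer instance on top of the layer stack.
    model.add(keras.layers.Dense(32, activation="relu"))
    model.add(keras.layers.Dense(16, activation="relu"))
    model.add(keras.layers.Dense(10))

    # pop() removes the last layer in the model.
    model.pop()
    print(len(model.layers))  # 2
    ```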

  • The Model class

    A model grouping layers into an object with training/inference features. There are three ways to instantiate a Model. With the “Functional API”, you start from Input, chain layer calls to specify the model’s forward pass, and finally create your model from inputs and outputs. Note: only dicts, lists, and tuples of input tensors are supported…
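    A minimal Functional API sketch; the 784/64/10 shapes are illustrative.

    ```python
    import keras

    # Start from Input, chain layer calls to define the forward pass,
    # then create the Model from inputs and outputs.
    inputs = keras.Input(shape=(784,))
    x = keras.layers.Dense(64, activation="relu")(inputs)
    outputs = keras.layers.Dense(10, activation="softmax")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    model.summary()
    ```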

  • Migrating Keras 2 code to multi-backend Keras 3

    Setup: first, let’s install keras-nightly. This example uses the TensorFlow backend (os.environ["KERAS_BACKEND"] = "tensorflow"). After you’ve migrated your code, you can change the "tensorflow" string to "jax" or "torch" and click “Restart runtime” in Colab, and your code will run on the JAX or PyTorch backend. Going from Keras 2 to Keras 3 with the TensorFlow backend: first, replace your imports. Next,…
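    The backend selection and import change reduce to something like the following sketch; the backend must be set before keras is imported.

    ```python
    import os

    # Pick the backend before importing Keras 3.
    os.environ["KERAS_BACKEND"] = "tensorflow"  # later: "jax" or "torch"

    # Keras 2 style imports:
    #   from tensorflow import keras
    #   from tensorflow.keras import layers
    # Keras 3 style imports:
    import keras
    from keras import layers
    ```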

  • Distributed training with Keras 3

    Introduction: the Keras distribution API is a new interface designed to facilitate distributed deep learning across a variety of backends like JAX, TensorFlow, and PyTorch. This powerful API introduces a suite of tools enabling data and model parallelism, allowing for efficient scaling of deep learning models on multiple accelerators and hosts. Whether leveraging the power…
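    A minimal data-parallel sketch with the distribution API, assuming a backend that supports it (JAX at minimum); the placeholder model is illustrative.

    ```python
    import keras

    # List the accelerators visible to the backend.
    devices = keras.distribution.list_devices()

    # Data parallelism: model weights are replicated on every device and
    # each device processes a different slice of the global batch.
    data_parallel = keras.distribution.DataParallel(devices=devices)
    keras.distribution.set_distribution(data_parallel)

    # Models created after this point are trained under the distribution.
    model = keras.Sequential([keras.layers.Dense(10)])
    ```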

  • Multi-GPU distributed training with PyTorch

    Introduction: there are generally two ways to distribute computation across multiple devices. Data parallelism, where a single model gets replicated on multiple devices or multiple machines; each of them processes different batches of data, and then they merge their results. There are many variants of this setup that differ in how the different model replicas merge…
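    A rough sketch of the data-parallel setup with the torch backend: one process per GPU, each wrapping the torch.nn.Module-compatible Keras model in DistributedDataParallel. The port, shapes, and the elided training loop are placeholders, not the guide's full recipe.

    ```python
    import os
    import torch

    os.environ["KERAS_BACKEND"] = "torch"
    import keras


    def train(rank, world_size):
        # One process per GPU; all processes join the same process group.
        os.environ["MASTER_ADDR"] = "localhost"
        os.environ["MASTER_PORT"] = "12355"  # placeholder port
        torch.distributed.init_process_group("nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(rank)

        # With the torch backend, a Keras model behaves like a torch.nn.Module,
        # so it can be wrapped in DistributedDataParallel (data parallelism).
        model = keras.Sequential(
            [keras.Input(shape=(32,)), keras.layers.Dense(10)]
        ).to(rank)
        ddp_model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[rank])
        # ... custom torch training loop over a DistributedSampler-backed
        # DataLoader, calling ddp_model(batch) in the forward pass ...


    if __name__ == "__main__":
        world_size = torch.cuda.device_count()
        torch.multiprocessing.spawn(train, args=(world_size,), nprocs=world_size)
    ```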

  • Multi-GPU distributed training with TensorFlow

    Introduction: there are generally two ways to distribute computation across multiple devices. Data parallelism, where a single model gets replicated on multiple devices or multiple machines; each of them processes different batches of data, and then they merge their results. There are many variants of this setup that differ in how the different model replicas merge…
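    A minimal single-host, multi-GPU sketch using tf.distribute.MirroredStrategy; the model and the commented-out dataset are placeholders.

    ```python
    import tensorflow as tf
    import keras

    # Single-host data parallelism: one model replica per GPU, with gradient
    # updates merged across replicas at each step.
    strategy = tf.distribute.MirroredStrategy()
    print("Number of replicas:", strategy.num_replicas_in_sync)

    with strategy.scope():
        # Model building and compiling must happen inside the strategy scope.
        model = keras.Sequential(
            [keras.Input(shape=(32,)), keras.layers.Dense(10)]
        )
        model.compile(optimizer="adam", loss="mse")

    # model.fit(dataset, epochs=2)  # fit() then distributes each batch
    ```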

  • Multi-GPU distributed training with JAX

    Introduction: there are generally two ways to distribute computation across multiple devices. Data parallelism, where a single model gets replicated on multiple devices or multiple machines; each of them processes different batches of data, and then they merge their results. There are many variants of this setup that differ in how the different model replicas merge…
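    A sketch of the core data-parallel setup with jax.sharding: one mesh axis for the batch, data sharded along it, model state replicated. Shapes and names are illustrative, and the jit-compiled train step is omitted.

    ```python
    import jax
    import numpy as np

    # Data parallelism with jax.sharding: model state is replicated on every
    # device, while each device receives a different slice of the batch.
    devices = jax.devices()
    mesh = jax.sharding.Mesh(np.array(devices), axis_names=("batch",))

    # Shard arrays along the "batch" mesh axis; replicate everything else.
    data_sharding = jax.sharding.NamedSharding(
        mesh, jax.sharding.PartitionSpec("batch")
    )
    replicated_sharding = jax.sharding.NamedSharding(
        mesh, jax.sharding.PartitionSpec()
    )

    batch = np.zeros((64, 32), dtype="float32")
    sharded_batch = jax.device_put(batch, data_sharding)
    # A jax.jit-compiled train step then runs on all devices in parallel,
    # with model variables placed under replicated_sharding.
    ```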

  • Transfer learning & fine-tuning

    Transfer learning consists of taking features learned on one problem and leveraging them on a new, similar problem. For instance, features from a model that has learned to identify raccoons may be useful to kick-start a model meant to identify tanukis. Transfer learning is usually done for tasks where your dataset has too little…
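    A condensed sketch of the typical workflow: freeze a pretrained base, train a new head, then optionally unfreeze for fine-tuning at a low learning rate. The Xception base, 150x150 inputs, and binary head are illustrative choices.

    ```python
    import keras

    # Stage 1: reuse features learned on the original problem.
    base_model = keras.applications.Xception(
        weights="imagenet",          # pretrained features
        input_shape=(150, 150, 3),
        include_top=False,           # drop the original classifier head
    )
    base_model.trainable = False     # freeze the learned features

    inputs = keras.Input(shape=(150, 150, 3))
    x = base_model(inputs, training=False)  # keep BatchNorm in inference mode
    x = keras.layers.GlobalAveragePooling2D()(x)
    outputs = keras.layers.Dense(1)(x)      # new head for the new task
    model = keras.Model(inputs, outputs)

    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss=keras.losses.BinaryCrossentropy(from_logits=True),
    )
    # model.fit(new_dataset, epochs=5)

    # Stage 2 (fine-tuning): unfreeze the base and retrain end to end
    # with a very low learning rate.
    base_model.trainable = True
    model.compile(
        optimizer=keras.optimizers.Adam(1e-5),
        loss=keras.losses.BinaryCrossentropy(from_logits=True),
    )
    ```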