Author: Awais Farooq
-
Accuracy metrics
Accuracy class Calculates how often predictions equal labels. This metric creates two local variables, total and count, that are used to compute the frequency with which y_pred matches y_true. This frequency is ultimately returned as binary accuracy: an idempotent operation that simply divides total by count. If sample_weight is None, weights default to 1. Use a sample_weight of 0 to mask values. BinaryAccuracy class Calculates how often predictions match binary…
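A minimal sketch of the stateful usage pattern these metric classes share, assuming the standard keras.metrics API; the label and prediction values are illustrative only:

```python
import keras

# Standalone usage: state (total, count) accumulates across update_state() calls.
m = keras.metrics.BinaryAccuracy()
m.update_state([[1], [1], [0], [0]], [[0.98], [0.6], [0.2], [0.7]])
print(float(m.result()))  # fraction of predictions matching labels so far

# A sample_weight of 0 masks the corresponding entry out of the computation.
m.reset_state()
m.update_state(
    [[1], [1], [0], [0]],
    [[0.98], [0.6], [0.2], [0.7]],
    sample_weight=[1, 1, 1, 0],
)
print(float(m.result()))

# Usage with the compile() API: pass the metric (or its string name) to metrics=[...].
# model.compile(optimizer="sgd", loss="binary_crossentropy",
#               metrics=[keras.metrics.BinaryAccuracy()])
```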
-
Base Metric class
Metric class Encapsulates metric logic and state. Custom metrics are written by subclassing it and implementing the state-update and result logic; an example subclass implementation is sketched below.
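A minimal subclass sketch in the spirit of the pattern described above, assuming the Keras 3 keras.ops and add_weight APIs; the metric name and logic are illustrative:

```python
import keras
from keras import ops


class BinaryTruePositives(keras.metrics.Metric):
    """Counts true positives for binary (0/1) labels and predictions."""

    def __init__(self, name="binary_true_positives", **kwargs):
        super().__init__(name=name, **kwargs)
        # Metric state lives in variables created via add_weight().
        self.true_positives = self.add_weight(name="tp", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = ops.cast(y_true, "bool")
        y_pred = ops.cast(y_pred, "bool")
        values = ops.cast(ops.logical_and(y_true, y_pred), self.dtype)
        if sample_weight is not None:
            values = values * ops.cast(sample_weight, self.dtype)
        self.true_positives.assign_add(ops.sum(values))

    def result(self):
        return self.true_positives

    def reset_state(self):
        self.true_positives.assign(0.0)
```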
-
Loss Scale Optimizer
LossScaleOptimizer class An optimizer that dynamically scales the loss to prevent underflow. Loss scaling is a technique to prevent numeric underflow in intermediate gradients when float16 is used. To prevent underflow, the loss is multiplied (or “scaled”) by a certain factor called the “loss scale”, which causes intermediate gradients to be scaled by the loss scale…
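A short sketch of how the wrapper is typically used together with a float16 mixed-precision policy, assuming the Keras 3 location keras.optimizers.LossScaleOptimizer (in tf.keras the equivalent wrapper lives under tf.keras.mixed_precision):

```python
import keras

# Run compute in float16 so loss scaling actually has something to protect.
keras.mixed_precision.set_global_policy("mixed_float16")

inner = keras.optimizers.SGD(learning_rate=0.01)
# The wrapper multiplies the loss by a dynamic scale factor before the backward
# pass and divides the gradients by the same factor before the inner optimizer
# applies them, so small float16 gradients do not underflow to zero.
optimizer = keras.optimizers.LossScaleOptimizer(inner)

# model.compile(optimizer=optimizer, loss="categorical_crossentropy")
```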
-
Lion
Lion class Optimizer that implements the Lion algorithm. The Lion optimizer is a stochastic-gradient-descent method that uses the sign operator to control the magnitude of the update, unlike other adaptive optimizers such as Adam that rely on second-order moments. This makes Lion more memory-efficient, as it only keeps track of the momentum. According to the authors…
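A NumPy sketch of a single Lion update step, following the published algorithm; the helper and its argument names (lr, beta_1, beta_2, wd) are illustrative and not part of the Keras API:

```python
import numpy as np

def lion_step(w, m, grad, lr=1e-4, beta_1=0.9, beta_2=0.99, wd=0.0):
    # The update direction uses only the sign of an interpolated momentum, so
    # the per-coordinate step size is controlled by the learning rate alone.
    update = np.sign(beta_1 * m + (1.0 - beta_1) * grad)
    w = w - lr * (update + wd * w)          # decoupled weight decay
    # Momentum is the only per-parameter state Lion keeps (no second moment).
    m = beta_2 * m + (1.0 - beta_2) * grad
    return w, m
```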
-
Ftrl
Ftrl class Optimizer that implements the FTRL algorithm. “Follow The Regularized Leader” (FTRL) is an optimization algorithm developed at Google for click-through rate prediction in the early 2010s. It is most suitable for shallow models with large and sparse feature spaces. The algorithm is described by McMahan et al., 2013. The Keras version has support for both…
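A hedged usage sketch, assuming the argument names exposed by keras.optimizers.Ftrl (learning_rate_power and the L1/L2 regularization strengths); the values are illustrative:

```python
import keras

# FTRL is typically paired with explicit L1/L2 regularization, which is part
# of what makes it attractive for large, sparse, shallow (e.g. CTR) models.
optimizer = keras.optimizers.Ftrl(
    learning_rate=0.05,
    learning_rate_power=-0.5,
    l1_regularization_strength=0.01,
    l2_regularization_strength=0.001,
)

# model.compile(optimizer=optimizer, loss="binary_crossentropy",
#               metrics=["accuracy"])
```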
-
Nadam
Nadam class Optimizer that implements the Nadam algorithm. Much like Adam is essentially RMSprop with momentum, Nadam is Adam with Nesterov momentum.
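A minimal usage sketch; the hyperparameter names mirror Adam's (learning_rate, beta_1, beta_2), which is assumed here to match the Keras signature:

```python
import keras

# Nadam takes Adam-style hyperparameters; the difference is the Nesterov-style
# lookahead applied to the first-moment estimate.
optimizer = keras.optimizers.Nadam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)

# model.compile(optimizer=optimizer, loss="sparse_categorical_crossentropy")
```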
-
Adafactor
Adafactor class Optimizer that implements the Adafactor algorithm. Adafactor is commonly used in NLP tasks, and has the advantage of taking less memory because it only saves partial information of previous gradients. The default argument setup is based on the original paper (see reference). When gradients are of dimension > 2, Adafactor optimizer will delete the…
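A brief usage sketch, assuming keras.optimizers.Adafactor is available (it is in recent Keras releases); the learning rate is illustrative:

```python
import keras

# Adafactor keeps factored (row/column) statistics for large weight matrices
# instead of a full second-moment tensor, which is where the memory saving
# described above comes from.
optimizer = keras.optimizers.Adafactor(learning_rate=0.001)

# model.compile(optimizer=optimizer, loss="sparse_categorical_crossentropy")
```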
-
Adamax
Adamax class Optimizer that implements the Adamax algorithm. Adamax, a variant of Adam based on the infinity norm, is a first-order gradient-based optimization method. Due to its ability to adjust the learning rate based on data characteristics, it is well suited to learning time-variant processes, e.g., speech data with dynamically changing noise conditions. Default parameters follow those…
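A short usage sketch; the defaults shown follow the Adam paper's recommendations and are assumed to match the Keras signature:

```python
import keras

# Adamax replaces Adam's second-moment estimate with an infinity-norm bound,
# so the effective step for each parameter is normalized by the largest
# recent gradient magnitude rather than a running mean of squares.
optimizer = keras.optimizers.Adamax(learning_rate=0.001, beta_1=0.9, beta_2=0.999)

# model.compile(optimizer=optimizer, loss="mse")
```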
-
Adagrad
Adagrad class Optimizer that implements the Adagrad algorithm. Adagrad is an optimizer with parameter-specific learning rates, which are adapted relative to how frequently a parameter gets updated during training. The more updates a parameter receives, the smaller the updates.
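A NumPy sketch of the Adagrad update, illustrating the per-parameter learning rate; the helper name and epsilon value are illustrative:

```python
import numpy as np

def adagrad_step(w, accumulator, grad, lr=0.01, epsilon=1e-7):
    # Each parameter accumulates the sum of its own squared gradients...
    accumulator = accumulator + grad ** 2
    # ...so frequently (or strongly) updated parameters get a smaller step.
    w = w - lr * grad / (np.sqrt(accumulator) + epsilon)
    return w, accumulator
```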
-
Adadelta
Adadelta class Optimizer that implements the Adadelta algorithm. Adadelta optimization is a stochastic gradient descent method that is based on an adaptive learning rate per dimension to address two drawbacks: the continual decay of learning rates throughout training, and the need for a manually selected global learning rate. Adadelta is a more robust extension of Adagrad that adapts learning rates based on a moving window of gradient updates, instead of accumulating all past gradients. This…
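A NumPy sketch of one Adadelta step showing the two running (decayed) accumulators that replace Adagrad's ever-growing sum; the names (rho, epsilon) are illustrative, and Keras additionally multiplies the step by a learning_rate:

```python
import numpy as np

def adadelta_step(w, acc_grad, acc_delta, grad, rho=0.95, epsilon=1e-7):
    # Decayed average of squared gradients (the "moving window").
    acc_grad = rho * acc_grad + (1.0 - rho) * grad ** 2
    # Step is scaled by the ratio of past-update RMS to gradient RMS.
    delta = -np.sqrt(acc_delta + epsilon) / np.sqrt(acc_grad + epsilon) * grad
    # Decayed average of squared updates.
    acc_delta = rho * acc_delta + (1.0 - rho) * delta ** 2
    w = w + delta
    return w, acc_grad, acc_delta
```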