Author: Awais Farooq
-
What’s the difference between Model methods predict() and __call__()?
Let’s answer with an extract from Deep Learning with Python, Second Edition: Both y = model.predict(x) and y = model(x) (where x is an array of input data) mean “run the model on x and retrieve the output y.” Yet they aren’t exactly the same thing. predict() loops over the data in batches (in fact, you can specify the batch size via predict(x, batch_size=64)), and it extracts…
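The contrast can be illustrated with a small sketch (the model and data here are made up for illustration):

```python
import numpy as np
from tensorflow import keras

# A minimal model, just for demonstration.
model = keras.Sequential([keras.Input(shape=(4,)), keras.layers.Dense(2)])
x = np.random.random((8, 4)).astype("float32")

# predict() loops over the data in batches and returns a NumPy array.
y_predict = model.predict(x, batch_size=4, verbose=0)

# __call__() runs the model in a single pass and returns a tensor,
# which you can differentiate through (e.g. inside a GradientTape).
y_call = model(x)

print(type(y_predict).__name__)  # ndarray
print(y_call.shape)
```

For large arrays that don't fit in memory as one batch, `predict()` is the safer choice; `model(x)` is what you want inside a custom training loop.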
-
What if I need to customize what fit() does?
You have two options: 1) Subclass the Model class and override the train_step() (and test_step()) methods. This is a better option if you want to use custom update rules but still want to leverage the functionality provided by fit(), such as callbacks, efficient step fusing, etc. Note that this pattern does not prevent you from building models with the Functional API,…
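Option 1 can be sketched as follows, assuming a recent TF/Keras version where Model.compute_loss() is available; the class name, model, and data are illustrative:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Sketch of option 1: subclass Model and override train_step().
# fit() still provides callbacks, epochs, and progress reporting.
class CustomModel(keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            # compute_loss() applies the loss passed to compile().
            loss = self.compute_loss(y=y, y_pred=y_pred)
        grads = tape.gradient(loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        return {"loss": loss}

# The pattern works with the Functional API too:
inputs = keras.Input(shape=(4,))
outputs = keras.layers.Dense(1)(inputs)
model = CustomModel(inputs, outputs)
model.compile(optimizer="adam", loss="mse")

x = np.random.random((16, 4)).astype("float32")
y = np.random.random((16, 1)).astype("float32")
history = model.fit(x, y, epochs=1, verbose=0)
```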
-
What’s the recommended way to monitor my metrics when training with fit()?
Loss values and metric values are reported via the default progress bar displayed by calls to fit(). However, staring at changing ASCII numbers in a console is not an optimal metric-monitoring experience. We recommend the use of TensorBoard, which will display nice-looking graphs of your training and validation metrics, regularly updated during training, which you can access…
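Hooking TensorBoard into a training run is a matter of passing a callback to fit(); the log directory below is an arbitrary choice:

```python
from tensorflow import keras

# Logs metrics to ./logs; TensorBoard reads them live during training.
tensorboard_cb = keras.callbacks.TensorBoard(log_dir="./logs")

# Then pass it to fit(), e.g.:
# model.fit(x_train, y_train, epochs=10, callbacks=[tensorboard_cb])
# and launch the UI with:  tensorboard --logdir ./logs
```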
-
In fit(), is the data shuffled during training?
If you pass your data as NumPy arrays and if the shuffle argument in model.fit() is set to True (which is the default), the training data will be globally randomly shuffled at each epoch. If you pass your data as a tf.data.Dataset object and if the shuffle argument in model.fit() is set to True, the dataset will be locally shuffled (buffered shuffling). When using tf.data.Dataset objects, prefer shuffling your data…
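The "local" (buffered) flavor of shuffling can be seen with a toy tf.data pipeline; the buffer size here is deliberately smaller than the dataset to show that the shuffle is only approximate:

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(10)

# Buffered shuffling: elements are drawn at random from a buffer of
# `buffer_size` items, so ordering is only locally randomized unless
# the buffer covers the full dataset.
shuffled = dataset.shuffle(buffer_size=4, reshuffle_each_iteration=True)
result = list(shuffled.as_numpy_iterator())
print(result)
```

Every element still appears exactly once per epoch; only the ordering changes.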
-
In fit(), how is the validation split computed?
If you set the validation_split argument in model.fit() to e.g. 0.1, then the validation data used will be the last 10% of the data. If you set it to 0.25, it will be the last 25% of the data, etc. Note that the data isn’t shuffled before extracting the validation split, so the validation is literally just the last x% of samples in…
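The split rule can be illustrated with plain NumPy (the array and fraction are made up; this mimics what fit() does rather than calling it):

```python
import numpy as np

x = np.arange(8)          # pretend these are 8 training samples
split = 0.25              # validation_split=0.25

# fit() takes the LAST fraction of samples, with no prior shuffling.
num_val = int(len(x) * split)
x_train, x_val = x[:-num_val], x[-num_val:]
print(x_train)  # [0 1 2 3 4 5]
print(x_val)    # [6 7]
```

This is why you should shuffle your data yourself before using validation_split if it is ordered (e.g. sorted by class).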
-
What’s the difference between the training argument in call() and the trainable attribute?
training is a boolean argument in call that determines whether the call should be run in inference mode or training mode. For example, in training mode, a Dropout layer applies random dropout and rescales the output; in inference mode, the same layer does nothing. trainable is a boolean layer attribute that determines whether the trainable weights of the layer should be…
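The Dropout behavior described above can be demonstrated directly (the input is illustrative; with rate=0.5, surviving units are rescaled by 1/(1-0.5) = 2):

```python
import numpy as np
from tensorflow import keras

layer = keras.layers.Dropout(0.5)
x = np.ones((1, 4), dtype="float32")

# Inference mode: Dropout is a no-op.
print(np.asarray(layer(x, training=False)))  # [[1. 1. 1. 1.]]

# Training mode: random units are zeroed, the rest scaled by 2.
print(np.asarray(layer(x, training=True)))   # e.g. [[2. 0. 2. 0.]]
```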
-
How can I freeze layers and do fine-tuning?
Setting the trainable attribute: all layers & models have a layer.trainable boolean attribute, which can be set to True or False. When it is set to False, the layer.trainable_weights attribute is empty. Setting the trainable attribute on a layer recursively sets it on all children layers (contents of self.layers). 1) When training with fit(): to do fine-tuning with fit(), you…
-
How can I interrupt training when the validation loss isn’t decreasing anymore?
You can use an EarlyStopping callback:
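A minimal sketch (the monitored metric and patience value are typical choices, not requirements):

```python
from tensorflow import keras

# Stop training once val_loss has not improved for `patience` epochs.
early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=2,
    restore_best_weights=True,  # roll back to the best weights seen
)

# Pass it to fit() along with validation data, e.g.:
# model.fit(x, y, validation_split=0.2, epochs=100,
#           callbacks=[early_stopping])
```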
-
How can I ensure my training run can recover from program interruptions?
To ensure the ability to recover from an interrupted training run at any time (fault tolerance), you should use a keras.callbacks.BackupAndRestore callback that regularly saves your training progress, including the epoch number and weights, to disk, and loads it the next time you call Model.fit().
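In code, this amounts to one callback; the backup directory below is an arbitrary choice:

```python
from tensorflow import keras

# Periodically saves training state (epoch number + weights) to
# backup_dir; a later fit() call restores it and resumes training.
backup_cb = keras.callbacks.BackupAndRestore(backup_dir="./backup")

# model.fit(x, y, epochs=50, callbacks=[backup_cb])
```

If training completes normally, the backup files are deleted at the end of fit().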
-
Why is my training loss much higher than my testing loss?
A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time. They are reflected in the training-time loss but not in the test-time loss. In addition, the training loss that Keras displays is the average of the losses for each batch…