If you pass your data as NumPy arrays and if the shuffle argument in model.fit() is set to True (which is the default), the training data will be globally randomly shuffled at each epoch.
If you pass your data as a tf.data.Dataset object and if the shuffle argument in model.fit() is set to True, the dataset will be locally shuffled (buffered shuffling).
When using tf.data.Dataset objects, prefer shuffling your data beforehand (e.g. by calling dataset = dataset.shuffle(buffer_size)) so as to be in control of the buffer size.
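To see why the buffer size matters, here is a minimal pure-Python sketch of how buffered (local) shuffling behaves. The helper buffered_shuffle is a hypothetical name used only for illustration; it is not TensorFlow's implementation, just a small model of the same idea: fill a buffer of buffer_size elements, then repeatedly emit a random element from the buffer and replace it with the next input element.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a locally shuffled order using a fixed-size buffer.

    Illustrative helper (hypothetical, not part of TensorFlow or Keras).
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit one random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


data = list(range(20))

# Small buffer: each element can only move a limited distance forward.
small = list(buffered_shuffle(data, buffer_size=3, seed=0))

# Buffer as large as the data: equivalent to a global shuffle.
full = list(buffered_shuffle(data, buffer_size=len(data), seed=0))

# Buffer of size 1: the original order is preserved (no shuffling at all).
unshuffled = list(buffered_shuffle(data, buffer_size=1, seed=0))

print(small)
print(full)
print(unshuffled)
```

In this sketch, an element can never be emitted more than buffer_size - 1 positions earlier than where it sat in the input, so a small buffer on sorted or grouped data leaves most of the original ordering intact. A buffer at least as large as the dataset gives a full shuffle, at the cost of holding that many elements in memory.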
Validation data is never shuffled.