In fit(), is the data shuffled during training?

If you pass your data as NumPy arrays and the shuffle argument in model.fit() is set to True (the default), the entire training set is randomly shuffled at each epoch (a global shuffle).

If you pass your data as a tf.data.Dataset object and the shuffle argument in model.fit() is set to True, the dataset will be shuffled locally (buffered shuffling): elements are randomized only within a fixed-size buffer rather than across the whole dataset.
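To make the difference between global and buffered shuffling concrete, here is a minimal pure-Python sketch of the buffered idea. It is a simplified model for illustration only, not the tf.data implementation; the function name buffered_shuffle and its seed parameter are hypothetical.

```python
import itertools
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a locally shuffled order using a fixed-size buffer.

    Simplified illustration of buffered shuffling (hypothetical helper,
    not part of Keras or tf.data).
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    source = iter(items)
    # Fill the buffer with the first `buffer_size` elements.
    buffer = list(itertools.islice(source, buffer_size))
    for item in source:
        # Emit a random element from the buffer, then put the next
        # incoming element in its slot.
        idx = rng.randrange(len(buffer))
        yield buffer[idx]
        buffer[idx] = item
    # The input is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


if __name__ == "__main__":
    # Small buffer: each element can only move a limited distance forward.
    print(list(buffered_shuffle(range(20), buffer_size=3, seed=0)))
    # Buffer as large as the data: equivalent to a global shuffle.
    print(list(buffered_shuffle(range(20), buffer_size=20, seed=0)))
```

In this sketch, the element emitted at output position i always comes from the first i + buffer_size inputs, so a small buffer keeps the output close to the original order, and a buffer of size 1 leaves the order unchanged.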

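When using tf.data.Dataset objects, prefer shuffling your data beforehand (e.g. by calling dataset = dataset.shuffle(buffer_size)) so that you control the buffer size. A buffer size at least as large as the dataset gives a full shuffle; smaller buffers use less memory but shuffle less thoroughly.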
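A short sketch of shuffling a dataset yourself before handing it to fit() might look like the following. The toy data, the buffer size of 10, and the batch size of 4 are assumptions chosen for illustration.

```python
import tensorflow as tf

# Toy data: ten integer "samples" (illustrative only).
features = tf.range(10)

dataset = tf.data.Dataset.from_tensor_slices(features)

# Shuffle before batching so individual samples, not whole batches,
# are reordered. buffer_size equals the dataset size here, which
# gives a full shuffle; reshuffle_each_iteration=True (the default)
# produces a new order every time the dataset is iterated.
dataset = dataset.shuffle(buffer_size=10, reshuffle_each_iteration=True)
dataset = dataset.batch(4)

# The resulting dataset can then be passed to model.fit(dataset, ...).
```

Shuffling before batch() is a deliberate choice in this sketch: shuffling after batching would only change the order of the batches, not their contents.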

Validation data is never shuffled.

