How can I obtain reproducible results using Keras during development?

There are four sources of randomness to consider:

  1. Keras itself (e.g. keras.random ops or random layers from keras.layers).
  2. The current Keras backend (e.g. JAX, TensorFlow, or PyTorch).
  3. The Python runtime.
  4. The CUDA runtime. When running on a GPU, some operations have non-deterministic outputs, because GPUs run many operations in parallel and the order of execution is not always guaranteed. Because floats have limited precision, even adding several numbers together may give slightly different results depending on the order in which you add them.
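
The order-dependence of floating-point addition mentioned in item 4 can be seen on a CPU as well. The following is a minimal sketch in plain Python; the variable names are illustrative only:

```python
# Floating-point addition is not associative: grouping the same three
# numbers differently changes the rounding at each intermediate step.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)

print(left_first)                 # 0.6000000000000001
print(right_first)                # 0.6
print(left_first == right_first)  # False
```

A parallel reduction on a GPU may group the terms differently from run to run, which is how the same inputs can produce slightly different sums.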

To make both Keras and the current backend framework deterministic, use this:

import keras
keras.utils.set_random_seed(1337)
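
One way to check that the seed takes effect is to build the same layer twice after seeding and compare the initial weights. This is a rough sketch, assuming Keras and one backend are installed; the helper name initial_kernel and the layer sizes are hypothetical choices made for illustration:

```python
import numpy as np
import keras


def initial_kernel(seed):
    """Seed everything, then return the freshly initialized Dense kernel."""
    keras.utils.set_random_seed(seed)
    layer = keras.layers.Dense(4)
    layer.build((None, 3))          # creates the weights: kernel, then bias
    return layer.get_weights()[0]   # kernel as a NumPy array of shape (3, 4)


first = initial_kernel(1337)
second = initial_kernel(1337)

# With the same seed, both builds should start from identical weights.
print(np.array_equal(first, second))
```

The seed must be set before any layer is created; weights initialized before the call are not affected by it.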

To make Python deterministic, you need to set the PYTHONHASHSEED environment variable to 0 before the program starts (not within the program itself, because the interpreter reads the variable only at startup). This is necessary in Python 3.2.3 onwards to have reproducible behavior for certain hash-based operations (e.g., the item order in a set, or in a dict before Python 3.7; see Python’s documentation).
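
Because the variable has to be in place before the interpreter starts, the effect is easiest to observe by launching fresh interpreters. The sketch below uses only the standard library; the helper name string_hash_in_new_interpreter and the string 'keras' are arbitrary choices for illustration:

```python
import os
import subprocess
import sys


def string_hash_in_new_interpreter(hash_seed):
    """Start a fresh Python process with PYTHONHASHSEED set and
    return the hash it computes for a fixed string."""
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('keras'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# With the seed fixed to 0, separate processes agree on the hash.
first = string_hash_in_new_interpreter("0")
second = string_hash_in_new_interpreter("0")
print(first == second)
```

In day-to-day use the variable would normally be set in the shell or the launcher that starts the training script, rather than through subprocess as done here.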

To make the CUDA runtime deterministic: if using the TensorFlow backend, call tf.config.experimental.enable_op_determinism(). Note that this will have a performance cost. The steps for other backends vary, so check the documentation of your backend framework directly.

