Category: 05. Optimizers

  • RMSprop

    RMSprop class: Optimizer that implements the RMSprop algorithm. The gist of RMSprop is to maintain a moving (discounted) average of the square of gradients, and to divide the gradient by the root of this average. This implementation of RMSprop uses plain momentum, not Nesterov momentum. The centered version additionally maintains a moving average of the gradients, and uses that average to estimate the variance.
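
    A minimal Python sketch of that gist, assuming the plain (uncentered, momentum-free) variant; the function and variable names here are illustrative, not the Keras API:

        import numpy as np

        def rmsprop_step(w, g, avg_sq, lr=0.001, rho=0.9, eps=1e-7):
            # Maintain a moving (discounted) average of the square of gradients.
            avg_sq = rho * avg_sq + (1.0 - rho) * g ** 2
            # Divide the gradient by the root of this average.
            w = w - lr * g / (np.sqrt(avg_sq) + eps)
            return w, avg_sq

        # Toy usage: minimize f(w) = w**2, whose gradient is 2*w.
        w, avg_sq = 10.0, 0.0
        for _ in range(100):
            w, avg_sq = rmsprop_step(w, 2.0 * w, avg_sq)

    Dividing by the running root-mean-square of recent gradients rescales each step, so coordinates with persistently large gradients take smaller effective steps.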

  • SGD

    SGD class: Gradient descent (with momentum) optimizer. Update rule for parameter w with gradient g when momentum is 0:

        w = w - learning_rate * g

    Update rule when momentum is larger than 0:

        velocity = momentum * velocity - learning_rate * g
        w = w + velocity

    When nesterov=True, this rule becomes:

        velocity = momentum * velocity - learning_rate * g
        w = w + momentum * velocity - learning_rate * g
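
    A minimal Python sketch combining the three rules above; the function name and signature are illustrative, not the Keras API:

        def sgd_step(w, g, velocity, learning_rate=0.01, momentum=0.0, nesterov=False):
            # Plain gradient descent when momentum is 0.
            if momentum == 0.0:
                return w - learning_rate * g, velocity
            # Momentum accumulates an exponentially decaying velocity of past gradients.
            velocity = momentum * velocity - learning_rate * g
            if nesterov:
                # Nesterov looks ahead along the velocity before the gradient correction.
                return w + momentum * velocity - learning_rate * g, velocity
            return w + velocity, velocity

        # Toy usage: one step on f(w) = w**2 (gradient 2*w) with momentum.
        w, velocity = sgd_step(w=10.0, g=20.0, velocity=0.0, momentum=0.9)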