mx.opt.adadelta
Description
Create an AdaDelta optimizer with the given parameters.
AdaDelta is described in Zeiler, M. D. (2012). ADADELTA: An adaptive learning rate method. http://arxiv.org/abs/1212.5701
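For reference, the update rule from the cited paper can be written as below. Here $g_t$ is the gradient at step $t$, $x_t$ is the parameter being optimized, and $\rho$ and $\epsilon$ correspond to the rho and epsilon arguments. The paper's rule does not cover weight decay, gradient rescaling, or clipping, so those are left out of this summary.

```latex
\begin{aligned}
E[g^2]_t &= \rho \, E[g^2]_{t-1} + (1 - \rho) \, g_t^2 \\
\Delta x_t &= - \frac{\sqrt{E[\Delta x^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}} \, g_t \\
E[\Delta x^2]_t &= \rho \, E[\Delta x^2]_{t-1} + (1 - \rho) \, \Delta x_t^2 \\
x_{t+1} &= x_t + \Delta x_t
\end{aligned}
```

Both running averages start at zero, so $\epsilon$ is what keeps the first step from being zero.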
Usage
mx.opt.adadelta(
  rho = 0.9,
  epsilon = 1e-05,
  wd = 0,
  rescale.grad = 1,
  clip_gradient = -1
)
Arguments
Argument | Description |
---|---|
rho | float, default=0.90. Decay rate for the running averages of both the squared gradients and the squared delta x. |
epsilon | float, default=1e-5. The constant epsilon as described in the paper. |
wd | float, default=0.0. L2 regularization coefficient added to all the weights. |
rescale.grad | float, default=1. Rescaling factor applied to the gradient. |
clip_gradient | float, default=-1 (no clipping if < 0). Clip the gradient to the range [-clip_gradient, clip_gradient]. |
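To make the role of each argument concrete, here is a minimal sketch in base R of one AdaDelta step on a plain numeric vector. The function adadelta_step and the state list are hypothetical names made up for this illustration and are not part of mxnet. The order of rescaling and clipping, and the choice to add weight decay to the gradient as an L2 term, are assumptions; the library's own update may combine them differently.

```r
# Hypothetical helper (not part of mxnet): one AdaDelta step on a numeric vector.
adadelta_step <- function(weight, grad, state,
                          rho = 0.9, epsilon = 1e-05, wd = 0,
                          rescale.grad = 1, clip_gradient = -1) {
  # Rescale the raw gradient, then clip only when clip_gradient is non-negative.
  grad <- grad * rescale.grad
  if (clip_gradient >= 0) {
    grad <- pmin(pmax(grad, -clip_gradient), clip_gradient)
  }
  # Assumption: weight decay enters as an L2 term added to the gradient.
  grad <- grad + wd * weight
  # Running average of squared gradients.
  sq_grad <- rho * state$sq_grad + (1 - rho) * grad^2
  # Step size is the ratio of the two RMS terms times the gradient.
  delta <- sqrt(state$sq_delta + epsilon) / sqrt(sq_grad + epsilon) * grad
  # Running average of squared updates.
  sq_delta <- rho * state$sq_delta + (1 - rho) * delta^2
  list(weight = weight - delta,
       state = list(sq_grad = sq_grad, sq_delta = sq_delta))
}

# Usage: take 100 steps on f(w) = sum(w^2), whose gradient is 2 * w.
w <- c(1, -2)
state <- list(sq_grad = c(0, 0), sq_delta = c(0, 0))
for (i in 1:100) {
  out <- adadelta_step(w, 2 * w, state)
  w <- out$weight
  state <- out$state
}
print(w)
```

With the default clip_gradient = -1 the clipping branch is skipped entirely, which matches the "no clipping if < 0" behavior described in the table.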