This package is currently experimental and may change in the near future.
The autograd package enables automatic
differentiation of NDArray operations.
In machine learning applications,
autograd is often used to calculate the gradients
of loss functions with respect to parameters.
Record vs Pause
autograd records computation history on the fly to calculate gradients later.
This is only enabled inside a
with autograd.record(): block.
A with autograd.pause() block can be used inside a
with autograd.record() block to temporarily disable recording.
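For example, a minimal sketch of pausing inside a recording block
(assuming import mxnet as mx, as in the example further below):
>>> x = mx.nd.array([1,2,3])
>>> x.attach_grad()
>>> with mx.autograd.record():
...     y = x * 2
...     with mx.autograd.pause():
...         print(y.asnumpy())  # not recorded: safe for logging or inspection
[ 2. 4. 6.]
>>> y.backward()
>>> print(x.grad)
[ 2. 2. 2.]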
To compute gradient with respect to an NDArray
x, first call x.attach_grad()
to allocate space for the gradient. Then, start a
with autograd.record() block,
and do some computation. Finally, call
backward() on the result:
>>> x = mx.nd.array([1,2,3,4])
>>> x.attach_grad()
>>> with mx.autograd.record():
...     y = x * x + 1
>>> y.backward()
>>> print(x.grad)
[ 2. 4. 6. 8.]
Train Mode and Predict Mode
Some operators (Dropout, BatchNorm, etc.) behave differently
when training and when making predictions.
This can be controlled with train_mode and predict_mode scopes.
By default, MXNet is in predict_mode.
A with autograd.record() block by default turns on train_mode
(equivalent to with autograd.record(train_mode=True)).
To compute a gradient in prediction mode (as when generating adversarial examples),
call record with train_mode=False and then call
backward(train_mode=False).
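As a minimal sketch, Dropout is used here because its behavior depends on
the mode; in prediction mode it acts as the identity, so the gradient that
flows back is all ones:
>>> x = mx.nd.array([1,2,3])
>>> x.attach_grad()
>>> with mx.autograd.record(train_mode=False):
...     y = mx.nd.Dropout(x)
>>> y.backward(train_mode=False)
>>> print(x.grad)
[ 1. 1. 1.]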
Although training usually coincides with recording,
this isn’t always the case.
To control training vs predict_mode without changing
recording vs not recording,
use a with autograd.train_mode(): or
with autograd.predict_mode(): block.
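For instance, a minimal sketch using is_recording() and is_training() to
confirm that a predict_mode block changes the mode without stopping recording:
>>> with mx.autograd.record():
...     print(mx.autograd.is_recording(), mx.autograd.is_training())
...     with mx.autograd.predict_mode():
...         print(mx.autograd.is_recording(), mx.autograd.is_training())
True True
True False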
Detailed tutorials are available in Part 1 of the MXNet gluon book.