Warning

This package is currently experimental and may change in the near future.

## Overview

The autograd package enables automatic differentiation of NDArray operations. In machine learning applications, autograd is often used to calculate the gradients of loss functions with respect to parameters.

### Record vs Pause

autograd records computation history on the fly so that gradients can be calculated later. Recording is only enabled inside a with autograd.record(): block. A with autograd.pause(): block can be used inside a record() block to temporarily disable recording.

To compute the gradient with respect to an NDArray x, first call x.attach_grad() to allocate space for the gradient. Then open a with autograd.record(): block and do some computation inside it. Finally, call backward() on the result; the gradient is then available as x.grad:

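>>> import mxnet as mx
>>> x = mx.nd.array([1,2,3,4])
>>> x.attach_grad()  # allocate space for the gradient of x
>>> with mx.autograd.record():  # record the computation of y
...     y = x * x + 1
>>> y.backward()  # dy/dx = 2 * x
>>> print(x.grad)
[ 2.  4.  6.  8.]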



## Train Mode and Predict Mode

Some operators (Dropout, BatchNorm, etc.) behave differently during training than when making predictions. This behavior can be controlled with the train_mode and predict_mode scopes.

By default, MXNet is in predict_mode. A with autograd.record(): block turns on train_mode by default (it is equivalent to with autograd.record(train_mode=True):). To compute a gradient in prediction mode (as when generating adversarial examples), call record with train_mode=False and then call backward(train_mode=False).

Although training usually coincides with recording, this isn’t always the case. To switch between train_mode and predict_mode without changing whether computation is recorded, use a with autograd.train_mode(): or with autograd.predict_mode(): block.

Detailed tutorials are available in Part 1 of the MXNet Gluon book.

