This tutorial introduces basic usage of the C++ package through MNIST, the classic database of handwritten digits.

The rest of this tutorial assumes that the working directory is /path/to/mxnet/cpp-package/example.

Load Data

Before writing any code, we need to fetch the MNIST data. You can either use the script /path/to/mxnet/cpp-package/example/, or download the MNIST data yourself from LeCun’s website and decompress it into the data/mnist_data folder.
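After decompressing, a quick sanity check can confirm that the files are intact. The sketch below parses the header of an IDX file, the binary format the MNIST files use: a big-endian 32-bit magic number whose third byte is the element type and whose last byte is the number of dimensions, followed by one big-endian 32-bit size per dimension. The names IdxHeader, ReadBigEndian32, ParseIdxHeader, and ParseIdxHeaderFromBytes are hypothetical helpers introduced here for illustration; they are not part of the C++ package.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Header of an IDX file: the magic number plus one size per dimension.
// (Hypothetical helper type, not part of MXNet.)
struct IdxHeader {
  std::uint32_t magic;
  std::vector<std::uint32_t> dims;
};

// Reads one 32-bit big-endian unsigned integer from the stream.
inline std::uint32_t ReadBigEndian32(std::istream& in) {
  unsigned char bytes[4];
  if (!in.read(reinterpret_cast<char*>(bytes), 4)) {
    throw std::runtime_error("unexpected end of IDX header");
  }
  return (std::uint32_t{bytes[0]} << 24) | (std::uint32_t{bytes[1]} << 16) |
         (std::uint32_t{bytes[2]} << 8) | std::uint32_t{bytes[3]};
}

// Parses the magic number and the dimension sizes that follow it.
inline IdxHeader ParseIdxHeader(std::istream& in) {
  IdxHeader header;
  header.magic = ReadBigEndian32(in);
  // The two high bytes must be zero and the type byte must be 0x08
  // (unsigned byte), which is what the MNIST files contain.
  if ((header.magic >> 8) != 0x08) {
    throw std::runtime_error("not an unsigned-byte IDX file");
  }
  const std::uint32_t num_dims = header.magic & 0xFF;
  for (std::uint32_t i = 0; i < num_dims; ++i) {
    header.dims.push_back(ReadBigEndian32(in));
  }
  assert(header.dims.size() == num_dims);
  return header;
}

// Convenience wrapper for header bytes that are already in memory.
inline IdxHeader ParseIdxHeaderFromBytes(const std::string& bytes) {
  std::istringstream in(bytes);
  return ParseIdxHeader(in);
}
```

For an image file such as train-images-idx3-ubyte, opened with std::ifstream in std::ios::binary mode and passed to ParseIdxHeader, the header should report three dimensions (image count, rows, columns). A label file should report a single dimension.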

Apart from linking against the MXNet shared library, the C++ package is header-only, so all you need to do is include the header files. Among them, op.h is special because it is generated dynamically when the C++ package is built. Note that you need to copy the shared library ( on Linux and macOS, libmxnet.dll on Windows) from /path/to/mxnet/lib to the working directory. We do not recommend using pre-built binaries: MXNet is under heavy development, so the operator definitions in op.h may be incompatible with a pre-built version.

To use the functionality provided by the C++ package, we first include the general header file MxNetCpp.h and specify the namespaces.

#include "mxnet-cpp/MxNetCpp.h"

using namespace std;
using namespace mxnet::cpp;

Next, we can use the data iterator to load the MNIST data (split into a training set and a validation set). The digits in MNIST are 2-dimensional arrays, so we set flat to true to flatten the data.

// batch_size is assumed to be an int defined earlier in the program.
// Setting "flat" to 1 flattens each 2-dimensional image into a 1-dimensional array.
auto train_iter = MXDataIter("MNISTIter")
    .SetParam("image", "./data/mnist_data/train-images-idx3-ubyte")
    .SetParam("label", "./data/mnist_data/train-labels-idx1-ubyte")
    .SetParam("batch_size", batch_size)
    .SetParam("flat", 1)
    .CreateDataIter();
auto val_iter = MXDataIter("MNISTIter")
    .SetParam("image", "./data/mnist_data/t10k-images-idx3-ubyte")
    .SetParam("label", "./data/mnist_data/t10k-labels-idx1-ubyte")
    .SetParam("batch_size", batch_size)
    .SetParam("flat", 1)
    .CreateDataIter();
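
To make the effect of flattening concrete, here is a minimal standalone sketch that turns a 28x28 image into a vector of 784 values. It uses only the standard library; the Image alias and the FlattenRowMajor function are hypothetical names introduced for illustration, and the row-major ordering is an assumption made for this sketch rather than a documented property of the data iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A 2-dimensional image stored as a vector of rows (illustrative type).
using Image = std::vector<std::vector<float>>;

// Concatenates the rows of the image into a single 1-dimensional vector,
// so that pixel (r, c) ends up at index r * cols + c.
inline std::vector<float> FlattenRowMajor(const Image& image) {
  std::vector<float> flat;
  if (image.empty()) {
    return flat;
  }
  const std::size_t cols = image.front().size();
  flat.reserve(image.size() * cols);
  for (const auto& row : image) {
    assert(row.size() == cols);  // every row must have the same length
    flat.insert(flat.end(), row.begin(), row.end());
  }
  return flat;
}
```

With a 28x28 input, the result has 784 elements, which is why a fully connected model can consume each flattened digit as a single feature vector.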

The data have now been loaded. With the help of the C++ package, we can construct various models to identify the digits.

GPU Support

It’s worth noting that changing the context from Context::cpu() to Context::gpu() is not enough. The data read by the data iterator are stored in host (CPU) memory, so we cannot assign them directly to parameters that live on the GPU. To bridge this gap, NDArray provides data synchronization between GPU and CPU. We will illustrate this by making the MLP code run on the GPU.

In the previous code, the data are used as follows:

args["X"] =;
args["label"] = data_batch.label;

This is problematic if the other parameters are created in the GPU context. We can use NDArray::CopyTo to solve this problem.

// Data provided by DataIter are stored in memory, should be copied to GPU first.
["X"]);
// CopyTo is imperative, need to wait for it to complete.

By replacing the former code with the latter, we port the code to the GPU. You can find the complete code in mlp_gpu.cpp. Compilation is similar to the CPU version. Note that the shared library must be built with GPU support enabled.