# Key-Value Store

# MXNet.mx.KVStoreType.

KVStore(kv_type = :local)


For single machine training, there are two commonly used types:

• local: Copies all gradients to CPU memory and updates weights there.
• device: Aggregates gradients and updates weights on GPU(s). With this setting, the KVStore also attempts to use GPU peer-to-peer communication, potentially accelerating the communication.

For distributed training, KVStore also supports a number of types:

• dist_sync: Behaves similarly to local, with one major difference: with dist_sync, batch size now means the batch size used on each machine. So if there are n machines, each using batch size b, then dist_sync behaves like local with batch size n * b.
• dist_device_sync: Identical to dist_sync with the difference similar to device vs local.
• dist_async: Performs asynchronous updates. The weights are updated whenever gradients are received from any machine. No two updates happen on the same weight at the same time. However, the order is not guaranteed.
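The dist_sync batch-size arithmetic above can be sketched in plain Julia (the helper name `effective_batch_size` is hypothetical, not part of MXNet.jl):

```julia
# Hypothetical helper: under dist_sync, each of n machines uses a per-machine
# batch size b, so the store behaves like :local with a global batch of n * b.
effective_batch_size(n_machines::Int, b::Int) = n_machines * b

effective_batch_size(4, 32)  # 4 machines × batch 32 ⇒ behaves like local with batch 128
```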

# MXNet.mx.barrierMethod.

barrier(kv::KVStore)


Invokes global barrier among all worker nodes.

For example, assume there are n machines. We would like machine 0 to first initialize the values and then have all the workers pull the initialized values. Before pulling, we can invoke barrier(kv) to guarantee that the initialization has finished.
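The init-then-barrier pattern can be modeled with plain Julia tasks, without MXNet (a minimal sketch: `init_done` plays the role of `barrier(kv)` here, and the ranks are simulated as local tasks):

```julia
# Sketch of the barrier pattern: "worker" 0 initializes a shared value, and
# every worker waits at a barrier-like event before reading it.
init_done = Base.Event()
shared = Ref{Int}(0)

workers = map(0:3) do rank
    @async begin
        if rank == 0
            shared[] = 42        # rank 0 initializes the value
            notify(init_done)    # ≈ rank 0 reaching the barrier after init!
        end
        wait(init_done)          # ≈ barrier(kv): block until init has finished
        shared[]                 # every worker now reads the initialized value
    end
end

fetch.(workers)  # all tasks return 42
```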

# MXNet.mx.init!Method.

init!(kv::KVStore, key::Int, val::NDArray)
init!(kv::KVStore, keys, vals)


Initializes a single or a sequence of key-value pairs into the store.

For each key, one must call init! before calling push! or pull!. When multiple workers invoke init! for the same key, only the value supplied by the worker with rank 0 is used. This function returns after the data has been initialized successfully.
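The "only rank 0's value is used" semantics can be modeled with a plain Dict (a toy sketch, not MXNet.jl code; `toy_init!` is a hypothetical name):

```julia
# Toy model of init!'s semantics: for each key, only the first
# initialization — the one from the rank-0 worker — takes effect.
store = Dict{Int,Vector{Float64}}()

toy_init!(store, key, val) = get!(store, key, val)  # keeps the existing value if present

toy_init!(store, 42, [1.0, 1.0])   # rank 0 initializes key 42
toy_init!(store, 42, [9.0, 9.0])   # a later init! from another rank is ignored

store[42]  # → [1.0, 1.0]
```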

julia> kv = KVStore(:local)
mx.KVStore @ local

julia> init!(kv, 42, mx.rand(2, 3))


# MXNet.mx.pull!Method.

pull!(kv::KVStore, key,  out;  priority = 0)
pull!(kv::KVStore, key,  outs; priority = 0)
pull!(kv::KVStore, keys, outs; priority = 0)


Pulls a single value or a sequence of values from the store.

This function returns immediately after adding an operator to the engine. Subsequent attempts to read from the out variable will block until the pull operation completes.

pull! is executed asynchronously, after all previous pull! calls and the last push! call on the same key(s) have finished.

The returned values are guaranteed to be the latest values in the store.

See push! for more examples.

# MXNet.mx.setoptimizer!Method.

setoptimizer!(kv::KVStore, opt)


Registers an optimizer with the kvstore.

When using a single machine, this function updates the local optimizer. When using multiple machines and this operation is invoked from a worker node, it will serialize the optimizer and send it to all servers. The function returns after all servers have been updated.

julia> kv = KVStore()
mx.KVStore @ local

julia> W = mx.zeros(2, 3)  # 2×3 weight matrix
2×3 mx.NDArray{Float32,2} @ CPU0:
0.0  0.0  0.0
0.0  0.0  0.0

julia> init!(kv, 42, W)

julia> setoptimizer!(kv, SGD(η = .2))  # SGD with .2 as learning rate

julia> ∇W = mx.ones(2, 3)  # assume it's the gradient
2×3 mx.NDArray{Float32,2} @ CPU0:
1.0  1.0  1.0
1.0  1.0  1.0

julia> push!(kv, 42, ∇W)

julia> pull!(kv, 42, W)  # fetch weight and write back to W

julia> W
2×3 mx.NDArray{Float32,2} @ CPU0:
-0.2  -0.2  -0.2
-0.2  -0.2  -0.2


# MXNet.mx.setupdater!Method.

setupdater!(kv, updater)


Sets a push! updater into the store.

This function only changes the local store. When running on multiple machines, one must use setoptimizer!.

julia> update(key, val, orig) = mx.@inplace orig += val .* .2
update (generic function with 1 method)

julia> kv = KVStore(:local)
mx.KVStore @ local

julia> mx.setupdater!(kv, update)

julia> init!(kv, 42, mx.ones(2, 3))

julia> push!(kv, 42, mx.ones(2, 3))

julia> x = NDArray(undef, 2, 3);

julia> pull!(kv, 42, x)

julia> x
2×3 mx.NDArray{Float32,2} @ CPU0:
1.2  1.2  1.2
1.2  1.2  1.2


# Base.push!Method.

push!(kv::KVStore, key,  val;  priority = 0)
push!(kv::KVStore, key,  vals; priority = 0)
push!(kv::KVStore, keys, vals; priority = 0)


Pushes a single or a sequence of key-value pairs into the store.

This function returns immediately after adding an operator to the engine. The actual operation is executed asynchronously. If there are consecutive pushes to the same key, there is no guarantee on the order in which they are executed, and the execution of a push! does not guarantee that all previous pushes have finished. There is no synchronization between workers by default; one can use barrier() to sync all workers.

push! and pull! a single NDArray:

julia> kv = KVStore(:local)
mx.KVStore @ local

julia> x = NDArray(undef, 2, 3);

julia> init!(kv, 3, x)

julia> push!(kv, 3, mx.ones(2, 3) * 8)

julia> pull!(kv, 3, x)

julia> x
2×3 mx.NDArray{Float32,2} @ CPU0:
8.0  8.0  8.0
8.0  8.0  8.0


Aggregate values and push!:

julia> vals = [mx.ones((2, 3), gpu(0)) * 3, mx.ones((2, 3), gpu(1)) * 4];

julia> push!(kv, 3, vals)

julia> pull!(kv, 3, x)

julia> x
2×3 mx.NDArray{Float32,2} @ CPU0:
7.0  7.0  7.0
7.0  7.0  7.0


push! a list of keys to a single device:

julia> keys = [4, 5];

julia> init!(kv, keys, [NDArray(undef, 2, 3), NDArray(undef, 2, 3)])

julia> push!(kv, keys, [x, x])

julia> y, z = NDArray(undef, 2, 3), NDArray(undef, 2, 3);

julia> pull!(kv, keys, [y, z])