Key-Value Store
#
MXNet.mx.KVStore
— Type.
KVStore(kv_type = :local)
For single machine training, there are two commonly used types:

`local`: Copies all gradients to CPU memory and updates weights there.

`device`: Aggregates gradients and updates weights on GPU(s). With this setting, the `KVStore` also attempts to use GPU peer-to-peer communication, potentially accelerating the communication.
For distributed training, `KVStore` also supports a number of types:
`dist_sync`: Behaves similarly to `local`, but with one major difference: with `dist_sync`, batch size now means the batch size used on each machine. So if there are `n` machines and we use batch size `b`, then `dist_sync` behaves like `local` with batch size `n * b`.

`dist_device_sync`: Identical to `dist_sync`, with the difference similar to `device` vs. `local`.

`dist_async`: Performs asynchronous updates. The weights are updated whenever gradients are received from any machine. No two updates happen on the same weight at the same time; however, the order is not guaranteed.
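The type is chosen at construction time. A minimal sketch of selecting among these (the `dist_*` types additionally assume scheduler and server processes have been launched, so they are shown commented out):

```julia
using MXNet

kv = KVStore(:local)         # single machine: aggregate on CPU
# kv = KVStore(:device)      # single machine: aggregate on GPU(s)
# kv = KVStore(:dist_sync)   # distributed: synchronous, per-machine batch size
# kv = KVStore(:dist_async)  # distributed: asynchronous updates
```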
#
MXNet.mx.barrier
— Method.
barrier(kv::KVStore)
Invokes global barrier among all worker nodes.
For example, assume there are `n` machines. We would like machine `0` to first `init!` the values and then have all the workers `pull!` the initialized values. Before pulling, we can invoke `barrier(kv)` to guarantee that the initialization is finished.
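As a sketch of that pattern (assuming a distributed launch; with a `:local` store the barrier is effectively a no-op since there is only one worker):

```julia
using MXNet

kv = KVStore(:dist_sync)      # assumes scheduler/server processes are running
init!(kv, 42, mx.ones(2, 3))  # only the value from the rank-0 worker is kept
barrier(kv)                   # block every worker until all reach this point
out = NDArray(undef, 2, 3)
pull!(kv, 42, out)            # safe: initialization has finished everywhere
```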
#
MXNet.mx.init!
— Method.
init!(kv::KVStore, key::Int, val::NDArray)
init!(kv::KVStore, keys, vals)
Initializes a single key-value pair or a sequence of key-value pairs into the store.

For each key, one must `init!` it before calling `push!` or `pull!`. When multiple workers invoke `init!` for the same key, only the value supplied by the worker with rank `0` is used. This function returns after the data has been initialized successfully.
julia> kv = KVStore(:local)
mx.KVStore @ local
julia> init!(kv, 42, mx.rand(2, 3))
#
MXNet.mx.pull!
— Method.
pull!(kv::KVStore, key::Int, out::NDArray)
pull!(kv::KVStore, keys, outs)
Pulls a single value or a sequence of values from the store.

This function returns immediately after adding an operator to the engine. Subsequent attempts to read the `out` variable will block until the pull operation completes.

`pull!` is executed asynchronously after all previous `pull!` calls, and after the last `push!` call for the same input key(s), have finished. The returned values are guaranteed to be the latest values in the store.

See `push!` for more examples.
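A minimal single-machine sketch of the single-key form:

```julia
using MXNet

kv = KVStore(:local)
init!(kv, 42, mx.ones(2, 3))
out = NDArray(undef, 2, 3)
pull!(kv, 42, out)   # returns immediately; reads of `out` block until done
copy(out)            # materialize as a Julia Array: all entries are 1.0f0
```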
#
MXNet.mx.setoptimizer!
— Method.
setoptimizer!(kv::KVStore, opt)
Registers an optimizer with the kvstore.
When using a single machine, this function updates the local optimizer. When using multiple machines and this operation is invoked from a worker node, it serializes the optimizer and sends it to all server nodes. The function returns after all servers have been updated.
julia> kv = KVStore()
mx.KVStore @ local
julia> W = mx.zeros(2, 3) # 2×3 weight matrix
2×3 mx.NDArray{Float32,2} @ CPU0:
0.0 0.0 0.0
0.0 0.0 0.0
julia> init!(kv, 42, W)
julia> setoptimizer!(kv, SGD(η = .2)) # SGD with .2 as learning rate
julia> ∇W = mx.ones(2, 3) # assume it's the gradient
2×3 mx.NDArray{Float32,2} @ CPU0:
1.0 1.0 1.0
1.0 1.0 1.0
julia> push!(kv, 42, ∇W)
julia> pull!(kv, 42, W) # fetch weight and write back to `W`
julia> W
2×3 mx.NDArray{Float32,2} @ CPU0:
-0.2 -0.2 -0.2
-0.2 -0.2 -0.2
#
MXNet.mx.setupdater!
— Method.
setupdater!(kv, updater)
Sets a push!
updater into the store.
This function only changes the local store. When running on multiple machines, one must use `setoptimizer!`.
julia> update(key, val, orig) = mx.@inplace orig += val .* .2
update (generic function with 1 method)
julia> kv = KVStore(:local)
mx.KVStore @ local
julia> mx.setupdater!(kv, update)
julia> init!(kv, 42, mx.ones(2, 3))
julia> push!(kv, 42, mx.ones(2, 3))
julia> x = NDArray(undef, 2, 3);
julia> pull!(kv, 42, x)
julia> x
2×3 mx.NDArray{Float32,2} @ CPU0:
1.2 1.2 1.2
1.2 1.2 1.2
#
Base.push!
— Method.
push!(kv::KVStore, key, val; priority = 0)
push!(kv::KVStore, key, vals; priority = 0)
push!(kv::KVStore, keys, vals; priority = 0)
Pushes a single or a sequence of key-value pairs into the store.
This function returns immediately after adding an operator to the engine. The actual operation is executed asynchronously. If there are consecutive pushes to the same key, there is no guarantee on their ordering, and the execution of a push does not guarantee that all previous pushes have finished. There is no synchronization between workers by default; one can use `barrier(kv)` to sync all workers.
`push!` and `pull!` a single `NDArray`:
julia> kv = KVStore(:local)
mx.KVStore @ local
julia> x = NDArray(undef, 2, 3);
julia> init!(kv, 3, x)
julia> push!(kv, 3, mx.ones(2, 3) * 8)
julia> pull!(kv, 3, x)
julia> x
2×3 mx.NDArray{Float32,2} @ CPU0:
8.0 8.0 8.0
8.0 8.0 8.0
Aggregate values and `push!`:
julia> vals = [mx.ones((2, 3), gpu(0)) * 3, mx.ones((2, 3), gpu(1)) * 4];
julia> push!(kv, 3, vals)
julia> pull!(kv, 3, x)
julia> x
2×3 mx.NDArray{Float32,2} @ CPU0:
7.0 7.0 7.0
7.0 7.0 7.0
`push!` a list of keys to a single device:
julia> keys = [4, 5];
julia> init!(kv, keys, [NDArray(undef, 2, 3), NDArray(undef, 2, 3)])
julia> push!(kv, keys, [x, x])
julia> y, z = NDArray(undef, 2, 3), NDArray(undef, 2, 3);
julia> pull!(kv, keys, [y, z])