mxnet.initializer

Weight initializer.

Classes

Bilinear()

Initializes weights for upsampling layers.

Constant(value)

Initializes the weights to a given value.

InitDesc

Descriptor for the initialization pattern.

Initializer(**kwargs)

The base class of an initializer.

LSTMBias([forget_bias])

Initializes all biases of an LSTMCell to 0.0, except for the forget gate whose bias is set to a custom value.

Load(param[, default_init, verbose])

Initializes variables by loading data from file or dict.

MSRAPrelu([factor_type, slope])

Initializes the weight according to an MSRA paper.

Mixed(patterns, initializers)

Initialize parameters using multiple initializers.

Normal([sigma])

Initializes weights with random values sampled from a normal distribution with a mean of zero and standard deviation of sigma.

One()

Initializes weights to one.

Orthogonal([scale, rand_type])

Initializes the weight as an orthogonal matrix.

RNNFused(mode, num_layers, state_size[, …])

Initializes the fused RNN parameter, with the bias part set to 0.0 and the weights initialized with random values uniformly sampled from a given range.

Uniform([scale])

Initializes weights with random values uniformly sampled from a given range.

Xavier([rnd_type, factor_type, magnitude])

Returns an initializer performing “Xavier” initialization for weights.

Zero()

Initializes weights to zero.

Functions

register(klass)

Registers a custom initializer.

class Bilinear[source]

Bases: mxnet.initializer.Initializer

Initializes weights for upsampling layers.
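
Example

A minimal sketch, assuming a Gluon Conv2DTranspose upsampling layer (the channel count and kernel shape here are illustrative, not prescribed by this initializer):

>>> # Initialize a 2x upsampling layer with bilinear interpolation weights
...
>>> layer = mx.gluon.nn.Conv2DTranspose(channels=3, kernel_size=4, strides=2,
...                                     padding=1, weight_initializer=mx.init.Bilinear())
>>> layer.initialize()

In practice the bilinear weights are often kept fixed during training, e.g. by setting the weight parameter's lr_mult to 0.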

class Constant(value)[source]

Bases: mxnet.initializer.Initializer

Initializes the weights to a given value. The value passed in can be a scalar or an NDArray that matches the shape of the parameter to be set.

Parameters

value (float, NDArray) – Value to set.
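
Example

A minimal usage sketch ('block' is assumed to be an mxnet.gluon.Block, as in the other examples on this page):

>>> # Initialize all parameters of 'block' to the constant value 0.5
...
>>> init = mx.init.Constant(0.5)
>>> block.initialize(init)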

Methods

dumps()

Saves the initializer to a string

dumps()[source]

Saves the initializer to a string

Returns

JSON formatted string that describes the initializer.

Return type

str

Examples

>>> # Create initializer and retrieve its parameters
...
>>> init = mx.init.Normal(0.5)
>>> init.dumps()
'["normal", {"sigma": 0.5}]'
>>> init = mx.init.Xavier(factor_type="in", magnitude=2.34)
>>> init.dumps()
'["xavier", {"rnd_type": "uniform", "magnitude": 2.34, "factor_type": "in"}]'
class InitDesc[source]

Bases: str

Descriptor for the initialization pattern.

Parameters
  • name (str) – Name of variable.

  • attrs (dict of str to str) – Attributes of this variable taken from Symbol.attr_dict.

  • global_init (Initializer) – Global initializer to fall back to.
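
Example

A brief sketch of what an InitDesc carries: it behaves as a str holding the variable name, with the constructor arguments attached as attributes (the attribute access and names shown here are illustrative):

>>> desc = mx.init.InitDesc('fc1_weight', attrs={'__lr_mult__': '2.0'})
>>> isinstance(desc, str)
True
>>> desc.attrs
{'__lr_mult__': '2.0'}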

class Initializer(**kwargs)[source]

Bases: object

The base class of an initializer.

Methods

dumps()

Saves the initializer to a string

set_verbosity([verbose, print_func])

Switch on/off verbose mode

dumps()[source]

Saves the initializer to a string

Returns

JSON formatted string that describes the initializer.

Return type

str

Examples

>>> # Create initializer and retrieve its parameters
...
>>> init = mx.init.Normal(0.5)
>>> init.dumps()
'["normal", {"sigma": 0.5}]'
>>> init = mx.init.Xavier(factor_type="in", magnitude=2.34)
>>> init.dumps()
'["xavier", {"rnd_type": "uniform", "magnitude": 2.34, "factor_type": "in"}]'
set_verbosity(verbose=False, print_func=None)[source]

Switch on/off verbose mode

Parameters
  • verbose (bool) – Switch verbose mode on or off.

  • print_func (function) – A function that computes statistics of initialized arrays. Takes an NDArray and returns a str. Defaults to the mean absolute value, str((abs(x)/size(x)).asscalar()).
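
Example

A minimal sketch: log a summary statistic for each array as it is initialized (the lambda below mirrors the documented default, the mean absolute value):

>>> init = mx.init.Uniform(0.1)
>>> init.set_verbosity(verbose=True,
...                    print_func=lambda arr: str(mx.nd.mean(mx.nd.abs(arr)).asscalar()))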

class LSTMBias(forget_bias=1.0)[source]

Bases: mxnet.initializer.Initializer

Initializes all biases of an LSTMCell to 0.0, except for the forget gate whose bias is set to a custom value.

Parameters

forget_bias (float, default 1.0) – Bias for the forget gate. Jozefowicz et al. (2015) recommend setting this to 1.0.
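
Example

A minimal sketch with a Gluon LSTMCell (the hidden size is illustrative):

>>> cell = mx.gluon.rnn.LSTMCell(hidden_size=100,
...                              i2h_bias_initializer=mx.init.LSTMBias(forget_bias=1.0),
...                              h2h_bias_initializer=mx.init.LSTMBias(forget_bias=1.0))
>>> cell.initialize()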

class Load(param, default_init=None, verbose=False)[source]

Bases: object

Initializes variables by loading data from file or dict.

Note: Load drops the arg: or aux: prefix from names and initializes the variables whose remaining names match.

Parameters
  • param (str or dict of str to NDArray) – Parameter file or dict mapping name to NDArray.

  • default_init (Initializer) – Default initializer when name is not found in param.

  • verbose (bool) – Flag for enabling logging of source when initializing.
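
Example

A minimal sketch: initialize from a previously saved parameter file, falling back to Uniform for any name not found in the file ('model.params' is an illustrative path):

>>> init = mx.initializer.Load('model.params',
...                            default_init=mx.init.Uniform(0.07),
...                            verbose=True)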

class MSRAPrelu(factor_type='avg', slope=0.25)[source]

Bases: mxnet.initializer.Xavier

Initializes the weight according to an MSRA paper.

This initializer implements Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, available at https://arxiv.org/abs/1502.01852.

This initializer is designed for layers with ReLU activations; it makes some changes to the Xavier method to account for the rectifier.

Parameters
  • factor_type (str, optional) – Can be 'avg', 'in', or 'out'.

  • slope (float, optional) – Initial slope of any PReLU (or similar) nonlinearities.
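
Example

A minimal usage sketch ('block' is assumed to be an mxnet.gluon.Block, as in the other examples on this page):

>>> init = mx.init.MSRAPrelu(factor_type='in', slope=0.25)
>>> block.initialize(init)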

class Mixed(patterns, initializers)[source]

Bases: object

Initialize parameters using multiple initializers.

Parameters
  • patterns (list of str) – List of regular expressions matching parameter names.

  • initializers (list of Initializer) – List of initializers corresponding to patterns.

Example

>>> # Given 'block', an instance of 'mxnet.gluon.Block', initialize biases to zero
... # and every other parameter to random values with uniform distribution.
...
>>> init = mx.initializer.Mixed(['bias', '.*'], [mx.init.Zero(), mx.init.Uniform(0.1)])
>>> block.initialize(init)
>>>
>>> for name, param in block.collect_params().items():
...     print(name)
...     print(param.data().asnumpy())
...
fullyconnected1_weight
[[ 0.0097627   0.01856892  0.04303787]]
fullyconnected1_bias
[ 0.]
class Normal(sigma=0.01)[source]

Bases: mxnet.initializer.Initializer

Initializes weights with random values sampled from a normal distribution with a mean of zero and standard deviation of sigma.

Parameters

sigma (float, optional) – Standard deviation of the normal distribution. Default standard deviation is 0.01.

Example

>>> # Given 'block', an instance of 'mxnet.gluon.Block', initialize weights
>>> # to random values sampled from a normal distribution.
...
>>> init = mx.init.Normal(0.5)
>>> block.initialize(init)
>>> for name, param in block.collect_params().items():
...     print(name)
...     print(param.data().asnumpy())
...
fullyconnected0_weight
[[-0.3214761  -0.12660924  0.53789419]]
class One[source]

Bases: mxnet.initializer.Initializer

Initializes weights to one.

Example

>>> # Given 'block', an instance of 'mxnet.gluon.Block', initialize weights to one.
...
>>> init = mx.initializer.One()
>>> block.initialize(init)
>>> for name, param in block.collect_params().items():
...     print(name)
...     print(param.data().asnumpy())
...
fullyconnected0_weight
[[ 1.  1.  1.]]
class Orthogonal(scale=1.414, rand_type='uniform')[source]

Bases: mxnet.initializer.Initializer

Initializes the weight as an orthogonal matrix.

This initializer implements Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, available at https://arxiv.org/abs/1312.6120.

Parameters
  • scale (float, optional) – Scaling factor of the weight.

  • rand_type (str, optional) – Use 'uniform' or 'normal' random numbers to initialize the weight.
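
Example

A minimal usage sketch ('block' is assumed to be an mxnet.gluon.Block, as in the other examples on this page):

>>> init = mx.init.Orthogonal(scale=1.0, rand_type='normal')
>>> block.initialize(init)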

class RNNFused(mode, num_layers, state_size, bidirectional=False, projection_size=None, i2h_weight_initializer=None, h2h_weight_initializer=None, i2h_bias_initializer=None, h2h_bias_initializer=None, h2r_weight_initializer=None)[source]

Bases: mxnet.initializer.Initializer

Initializes the fused RNN parameter, with the bias part set to 0.0 and the weights initialized with random values uniformly sampled from a given range.

Parameters
  • mode ({'gru', 'lstm', 'rnn_relu', 'rnn_tanh'}, required) – The type of RNN to compute.

  • num_layers (int (non-negative), required) – Number of stacked layers.

  • state_size (int (non-negative), required) – Size of the state for each layer.

  • bidirectional (boolean, optional, default=0) – Whether to use bidirectional recurrent layers.

  • projection_size (int or None, optional, default='None') – Size of the projection layer.

  • scale (float, optional) – The bound on the range of the generated random values for weights. Values are generated from the range [-scale, scale]. Default scale is 0.07.
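
Example

A minimal sketch constructing the initializer from its documented signature (the layer count and state size are illustrative):

>>> init = mx.init.RNNFused('lstm', num_layers=2, state_size=100,
...                         i2h_weight_initializer=mx.init.Xavier(),
...                         h2h_weight_initializer=mx.init.Xavier())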

class Uniform(scale=0.07)[source]

Bases: mxnet.initializer.Initializer

Initializes weights with random values uniformly sampled from a given range.

Parameters

scale (float, optional) – The bound on the range of the generated random values. Values are generated from the range [-scale, scale]. Default scale is 0.07.

Example

>>> # Given 'block', an instance of 'mxnet.gluon.Block', initialize weights
>>> # to random values uniformly sampled between -0.1 and 0.1.
...
>>> init = mx.init.Uniform(0.1)
>>> block.initialize(init)
>>> for name, param in block.collect_params().items():
...     print(name)
...     print(param.data().asnumpy())
...
fullyconnected0_weight
[[ 0.01360891 -0.02144304  0.08511933]]
class Xavier(rnd_type='uniform', factor_type='avg', magnitude=3)[source]

Bases: mxnet.initializer.Initializer

Returns an initializer performing “Xavier” initialization for weights.

This initializer is designed to keep the scale of gradients roughly the same in all layers.

By default, rnd_type is 'uniform' and factor_type is 'avg', so the initializer fills the weights with random numbers in the range \([-c, c]\), where \(c = \sqrt{\frac{3.}{0.5 * (n_{in} + n_{out})}}\). \(n_{in}\) is the number of neurons feeding into the weights, and \(n_{out}\) is the number of neurons the result is fed to.
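
For example, with the defaults and a layer that has \(n_{in} = 256\) inputs and \(n_{out} = 128\) outputs, \(c = \sqrt{\frac{3.}{0.5 * (256 + 128)}} = \sqrt{3/192} = 0.125\), so the weights are drawn uniformly from \([-0.125, 0.125]\).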

If rnd_type is 'uniform' and factor_type is 'in', then \(c = \sqrt{\frac{3.}{n_{in}}}\). Similarly, when factor_type is 'out', \(c = \sqrt{\frac{3.}{n_{out}}}\).

If rnd_type is 'gaussian' and factor_type is 'avg', the initializer fills the weights with numbers drawn from a normal distribution with a standard deviation of \(\sqrt{\frac{3.}{0.5 * (n_{in} + n_{out})}}\).

Parameters
  • rnd_type (str, optional) – Random generator type, can be 'gaussian' or 'uniform'.

  • factor_type (str, optional) – Can be 'avg', 'in', or 'out'.

  • magnitude (float, optional) – Scale of the random numbers.
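
Example

A minimal usage sketch ('block' is assumed to be an mxnet.gluon.Block, as in the other examples on this page):

>>> init = mx.init.Xavier(rnd_type='gaussian', factor_type='in', magnitude=2)
>>> block.initialize(init)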

class Zero[source]

Bases: mxnet.initializer.Initializer

Initializes weights to zero.

Example

>>> # Given 'block', an instance of 'mxnet.gluon.Block', initialize weights to zero.
...
>>> init = mx.initializer.Zero()
>>> block.initialize(init)
>>> for name, param in block.collect_params().items():
...     print(name)
...     print(param.data().asnumpy())
...
fullyconnected0_weight
[[ 0.  0.  0.]]
register(klass)[source]

Registers a custom initializer.

Custom initializers can be created by extending mx.init.Initializer and implementing the required functions like _init_weight and _init_bias. The created initializer must be registered using mx.init.register before it can be called by name.

Parameters

klass (class) – A subclass of mx.init.Initializer that needs to be registered as a custom initializer.

Example

>>> # Create and register a custom initializer that
... # initializes weights to 0.1 and biases to 1.
...
>>> @mx.init.register
... @alias('myinit')
... class CustomInit(mx.init.Initializer):
...   def __init__(self):
...     super(CustomInit, self).__init__()
...   def _init_weight(self, _, arr):
...     arr[:] = 0.1
...   def _init_bias(self, _, arr):
...     arr[:] = 1
...
>>> # block is an instance of 'mxnet.gluon.Block'
...
>>> block.initialize(CustomInit())