gluon.utils

Utilities for data parallelism, gradient clipping, and file download.

Functions

split_data(data, num_slice[, batch_axis, …])

Splits an NDArray into num_slice slices along batch_axis.

split_and_load(data, ctx_list[, batch_axis, …])

Splits an NDArray into len(ctx_list) slices along batch_axis and loads each slice to one context in ctx_list.

clip_global_norm(arrays, max_norm[, …])

Rescales NDArrays so that their global 2-norm (the 2-norm of all arrays taken together) does not exceed max_norm.

check_sha1(filename, sha1_hash)

Check whether the sha1 hash of the file content matches the expected hash.

download(url[, path, overwrite, sha1_hash, …])

Downloads the file at a given URL.

replace_file(src, dst)

Implements an atomic os.replace on Linux and OS X.

split_data(data, num_slice, batch_axis=0, even_split=True)[source]

Splits an NDArray into num_slice slices along batch_axis. Usually used for data parallelism, where each slice is sent to one device (e.g. a GPU).

Parameters
  • data (NDArray) – A batch of data.

  • num_slice (int) – Number of desired slices.

  • batch_axis (int, default 0) – The axis along which to slice.

  • even_split (bool, default True) – Whether to force all slices to have the same number of elements. If True, an error will be raised when num_slice does not evenly divide data.shape[batch_axis].

Returns

The slices of data. The return value is a list even if num_slice is 1.

Return type

list of NDArray
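
The slicing rule can be illustrated without MXNet. The sketch below is a plain-Python stand-in that works on a list along axis 0 only. The helper name `split_list` is hypothetical, and the way the remainder is handled when `even_split=False` (the last slice absorbs it) is an assumption that may differ between library versions.

```python
def split_list(data, num_slice, even_split=True):
    """Split a list into num_slice contiguous slices (illustration only)."""
    size = len(data)
    if num_slice <= 0 or size < num_slice:
        raise ValueError("need 1 <= num_slice <= len(data)")
    if even_split and size % num_slice != 0:
        # Mirrors the documented behaviour: uneven division is an error
        # when even_split is True.
        raise ValueError(
            "%d items cannot be evenly split into %d slices" % (size, num_slice)
        )
    step = size // num_slice
    # The first num_slice - 1 slices each hold `step` items.
    slices = [data[i * step:(i + 1) * step] for i in range(num_slice - 1)]
    # Assumption: the last slice takes whatever is left over.
    slices.append(data[(num_slice - 1) * step:])
    return slices
```

As with the documented function, the result is always a list, so `split_list(batch, 1)` returns a one-element list rather than the batch itself.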

split_and_load(data, ctx_list, batch_axis=0, even_split=True)[source]

Splits an NDArray into len(ctx_list) slices along batch_axis and loads each slice to one context in ctx_list.

Parameters
  • data (NDArray or ndarray) – A batch of data.

  • ctx_list (list of Context) – A list of Contexts.

  • batch_axis (int, default 0) – The axis along which to slice.

  • even_split (bool, default True) – Whether to force all slices to have the same number of elements.

Returns

One slice per context in ctx_list, each loaded onto the corresponding context.

Return type

list of NDArrays or ndarrays

clip_global_norm(arrays, max_norm, check_isfinite=True)[source]

Rescales NDArrays so that their global 2-norm (the 2-norm of all arrays taken together) does not exceed max_norm.

Parameters
  • arrays (list of NDArray) – The arrays to rescale.

  • max_norm (float) – Upper bound on the global 2-norm.

  • check_isfinite (bool, default True) – If True, check that the total_norm is finite (not nan or inf). This requires a blocking .asscalar() call.

Returns

Total norm. Return type is NDArray of shape (1,) if check_isfinite is False. Otherwise a float is returned.

Return type

NDArray or float
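
To make the rescaling concrete, here is a minimal plain-Python sketch of global-norm clipping on lists of floats. It is not the library's implementation: the name `clip_global_norm_lists`, the small epsilon in the denominator, and the choice to warn and leave the arrays untouched when the norm is not finite are all assumptions made for illustration.

```python
import math
import warnings


def clip_global_norm_lists(arrays, max_norm, check_isfinite=True):
    """Rescale lists of floats in place so their joint 2-norm is <= max_norm."""
    # Global norm: square root of the sum of squares over every element
    # of every array, as if all arrays were concatenated.
    total_norm = math.sqrt(sum(x * x for arr in arrays for x in arr))
    if check_isfinite and not math.isfinite(total_norm):
        # Assumption: warn and skip clipping when the norm is nan or inf.
        warnings.warn("nan or inf detected; arrays left unchanged")
        return total_norm
    # Assumption: a small epsilon guards against division by zero.
    scale = max_norm / (total_norm + 1e-8)
    if scale < 1.0:
        for arr in arrays:
            for i, x in enumerate(arr):
                arr[i] = x * scale
    return total_norm
```

For example, two one-element arrays holding 3.0 and 4.0 have a global norm of 5.0; clipping to `max_norm=1.0` scales them to roughly 0.6 and 0.8. Arrays whose global norm is already below `max_norm` are left unchanged.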

check_sha1(filename, sha1_hash)[source]

Check whether the sha1 hash of the file content matches the expected hash.

Parameters
  • filename (str) – Path to the file.

  • sha1_hash (str) – Expected sha1 hash in hexadecimal digits.

Returns

Whether the file content matches the expected hash.

Return type

bool
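
A check of this kind can be written with the standard library alone. The sketch below is an illustration under assumptions, not the library's code: the function name `sha1_matches` and the `chunk_size` parameter are hypothetical.

```python
import hashlib


def sha1_matches(filename, sha1_hash, chunk_size=1 << 20):
    """Return True if the SHA-1 of the file's content equals sha1_hash."""
    sha1 = hashlib.sha1()
    with open(filename, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    # hexdigest() returns lowercase hexadecimal digits, so the expected
    # hash must be given in lowercase for this comparison to succeed.
    return sha1.hexdigest() == sha1_hash
```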

download(url, path=None, overwrite=False, sha1_hash=None, retries=5, verify_ssl=True)[source]

Downloads the file at a given URL.

Parameters
  • url (str) – URL to download.

  • path (str, optional) – Destination path to store the downloaded file. By default, the file is stored in the current directory under the same name as in the URL.

  • overwrite (bool, optional) – Whether to overwrite the destination file if it already exists.

  • sha1_hash (str, optional) – Expected sha1 hash in hexadecimal digits. An existing file is ignored when a hash is specified but does not match.

  • retries (int, default 5) – The number of times to attempt the download in case of failure or non-200 return codes.

  • verify_ssl (bool, default True) – Verify SSL certificates.

Returns

The file path of the downloaded file.

Return type

str
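
The interaction between `overwrite`, an existing file, and `sha1_hash` can be summarized as a single decision. The following sketch shows one reading of the parameter descriptions above; the helper names `file_sha1` and `needs_download` are hypothetical, and no network access is performed.

```python
import hashlib
import os


def file_sha1(path):
    """Return the SHA-1 hex digest of a file's content."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha1.update(chunk)
    return sha1.hexdigest()


def needs_download(path, overwrite=False, sha1_hash=None):
    """Decide whether the file at `path` has to be fetched (again)."""
    if overwrite or not os.path.exists(path):
        return True
    # An existing file is not trusted when a hash is given but differs.
    if sha1_hash is not None and file_sha1(path) != sha1_hash:
        return True
    return False
```

Under this reading, passing `sha1_hash` makes repeated calls cheap: a file that is already present and intact is reused, while a corrupted or partial file is fetched again.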

replace_file(src, dst)[source]

Implements an atomic os.replace on Linux and OS X.

Parameters
  • src (str) – Source file path.

  • dst (str) – Destination file path.
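
A common reason to want an atomic replace is to avoid leaving a half-written file at the destination. The sketch below shows that pattern using only the standard library's `os.replace`; it does not call this module, and the helper name `atomic_write` is hypothetical.

```python
import os
import tempfile


def atomic_write(dst, data):
    """Write bytes to dst so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(dst))
    # Create the temporary file in the destination directory so that the
    # final rename stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # os.replace overwrites dst if it exists; on POSIX the rename
        # is atomic.
        os.replace(tmp_path, dst)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```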