algorithms.util.action_perturbator

This module defines several classes to perturb (explore) actions.

An ActionPerturbator takes a continuous action, i.e., a vector of parameters, and returns a vector with the same shape but usually different values. The difference between the original vector and the returned one is called the perturbation. It is often implemented as random noise and allows the action space to be explored.

We have implemented four perturbators:

  • Epsilon applies uniform random noise to all dimensions.

  • Gaussian applies Gaussian random noise to all dimensions.

  • MultiDim applies uniform noise to each dimension with a given probability.

  • Identity performs no perturbation.
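As a rough illustration of the first two strategies and the identity case, here is a minimal NumPy sketch. It is not the module's actual code: the function names, the additive form of the noise, and the symmetric range [-noise, noise] for the uniform case are assumptions made for this example. Only the idea that the result keeps the shape of the input comes from the description above.

```python
import numpy as np

# Hypothetical helpers for illustration; not the module's real API.

def epsilon_perturb(action, noise, rng):
    """Add uniform noise to every dimension (range [-noise, noise) is assumed)."""
    return action + rng.uniform(-noise, noise, size=action.shape)

def gaussian_perturb(action, noise, rng):
    """Add zero-mean Gaussian noise; `noise` is assumed to be the standard deviation."""
    return action + rng.normal(0.0, noise, size=action.shape)

def identity_perturb(action):
    """Return a copy of the action with no perturbation."""
    return action.copy()

rng = np.random.default_rng(seed=0)
action = np.array([0.2, -0.5, 1.0])

perturbed = epsilon_perturb(action, 0.1, rng)
perturbation = perturbed - action  # the difference described above
gaussian_result = gaussian_perturb(action, 0.1, rng)
unchanged = identity_perturb(action)
```

In this sketch the perturbation is simply `perturbed - action`, so each of its entries stays within the assumed noise bound for the uniform case.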

Classes

ActionPerturbator()

EpsilonActionPerturbator(noise)

Implements simple ε (uniform) noise.

GaussianActionPerturbator(noise)

Implements Gaussian noise with standard deviation σ.

IdentityAction()

Returns the same action (no perturbation).

MultiDimActionPerturbator(noise, probability)

Custom algorithm that applies noise to each dimension with a given probability.
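The per-dimension behaviour of the MultiDim perturbator could be sketched as below. This is a hypothetical re-implementation, not the module's code: the class names carry a `Sketch` suffix to make that clear, and the `perturb` method, the `seed` argument, the independent draw per dimension, and the uniform range [-noise, noise] are all assumptions. Only the `(noise, probability)` constructor arguments come from the listing above.

```python
import numpy as np

class ActionPerturbatorSketch:
    """Hypothetical base interface: subclasses map an action to a perturbed action."""

    def perturb(self, action):
        raise NotImplementedError

class MultiDimPerturbatorSketch(ActionPerturbatorSketch):
    """Adds uniform noise to each dimension with probability `probability`."""

    def __init__(self, noise, probability, seed=None):
        self.noise = noise
        self.probability = probability
        # `seed` is an addition for reproducibility in this sketch only.
        self.rng = np.random.default_rng(seed)

    def perturb(self, action):
        action = np.asarray(action, dtype=float)
        # Decide, separately for each dimension, whether it receives noise.
        mask = self.rng.random(action.shape) < self.probability
        delta = self.rng.uniform(-self.noise, self.noise, size=action.shape)
        # Unselected dimensions keep their original value.
        return action + mask * delta

action = np.array([0.2, -0.5, 1.0])
never = MultiDimPerturbatorSketch(noise=0.1, probability=0.0, seed=0).perturb(action)
always = MultiDimPerturbatorSketch(noise=0.1, probability=1.0, seed=0).perturb(action)
```

With `probability=0.0` the sketch behaves like the identity case, and with `probability=1.0` every dimension receives uniform noise.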