algorithms.util.action_perturbator
This module defines several classes to perturb (explore) actions.
An ActionPerturbator takes a continuous action, i.e., a vector of parameters, and returns a vector with the same shape but different values. The difference between the original vector and the resulting one is called the perturbation. It is often implemented as random noise, and it allows the agent to explore the action space.
We have implemented 4 perturbators:
Epsilon applies uniform random noise to all dimensions.
Gaussian applies Gaussian random noise to all dimensions.
MultiDim applies uniform noise to each dimension with a given probability, so only some dimensions may be perturbed.
Identity performs no perturbation.
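The four behaviours described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the class names are the short names used in the list, and the `perturb` method, the `epsilon`, `sigma`, `proba`, and `seed` parameters, and the use of plain Python lists are all assumptions made for the example. The real classes may operate on NumPy arrays and may clip the result to the action bounds.

```python
import random
from typing import List, Optional, Sequence


class ActionPerturbator:
    """Hypothetical base class: maps an action to a same-shaped action."""

    def __init__(self, seed: Optional[int] = None) -> None:
        # Each perturbator owns its random generator (assumption).
        self.rng = random.Random(seed)

    def perturb(self, action: Sequence[float]) -> List[float]:
        raise NotImplementedError


class Identity(ActionPerturbator):
    """Returns a copy of the action, without any perturbation."""

    def perturb(self, action: Sequence[float]) -> List[float]:
        return list(action)


class Epsilon(ActionPerturbator):
    """Adds uniform noise drawn from [-epsilon, +epsilon] to every dimension."""

    def __init__(self, epsilon: float = 0.05, seed: Optional[int] = None) -> None:
        super().__init__(seed)
        self.epsilon = epsilon

    def perturb(self, action: Sequence[float]) -> List[float]:
        return [a + self.rng.uniform(-self.epsilon, self.epsilon) for a in action]


class Gaussian(ActionPerturbator):
    """Adds zero-mean Gaussian noise (standard deviation sigma) to every dimension."""

    def __init__(self, sigma: float = 0.05, seed: Optional[int] = None) -> None:
        super().__init__(seed)
        self.sigma = sigma

    def perturb(self, action: Sequence[float]) -> List[float]:
        return [a + self.rng.gauss(0.0, self.sigma) for a in action]


class MultiDim(ActionPerturbator):
    """Adds uniform noise to each dimension independently, with probability proba."""

    def __init__(self, proba: float = 0.5, epsilon: float = 0.05,
                 seed: Optional[int] = None) -> None:
        super().__init__(seed)
        self.proba = proba
        self.epsilon = epsilon

    def perturb(self, action: Sequence[float]) -> List[float]:
        result = []
        for a in action:
            # Decide per dimension whether this value gets noise.
            if self.rng.random() < self.proba:
                a = a + self.rng.uniform(-self.epsilon, self.epsilon)
            result.append(a)
        return result


if __name__ == "__main__":
    action = [0.2, 0.5, 0.8]
    perturbators = (Identity(), Epsilon(0.1, seed=0),
                    Gaussian(0.1, seed=0), MultiDim(0.5, 0.1, seed=0))
    for p in perturbators:
        perturbed = p.perturb(action)
        # The perturbation is the element-wise difference with the original.
        perturbation = [b - a for a, b in zip(action, perturbed)]
        print(type(p).__name__, perturbation)
```

In this sketch every perturbator preserves the length of the input, and Identity yields a perturbation of zero in every dimension, which makes it a convenient baseline when comparing exploration strategies.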
Classes
Epsilon | Implements a simple ε noise.
Gaussian | Implements a Gaussian (σ) noise.
Identity | Returns the same action (no perturbation).
MultiDim | Custom algorithm, with a probability to noise each dimension.