Source code for algorithms.util.action_perturbator

"""
This module defines several classes to perturb (explore) actions.

An ActionPerturbator takes a continuous action, i.e., a vector of parameters,
and returns a vector with the same shape but different values.
The difference between the original vector and the resulting one is called
the perturbation. It is often implemented as random noise and allows exploring
the action space.

We have implemented 4 perturbators:
- Epsilon applies a uniform random noise to all dimensions.
- Gaussian applies a Gaussian random noise to all dimensions.
- MultiDim has a probability to apply a uniform noise to each dimension.
- Identity performs no perturbation.
"""

import numpy as np
    def perturb(self, action, clip=True):
        """
        Add noise to the proposed action.

        :param action: A 1D vector of values representing the action.
        :type action: np.ndarray

        :param clip: Controls whether to clip the noised values in [0,1].

        :return: A vector of noised values, with the same shape as `action`.
        :rtype: np.ndarray
        """
        raise NotImplementedError()
    def perturb(self, action, clip=True):
        # Draw a die for each dimension to apply (or not) a random noise
        for k in range(len(action)):
            if np.random.random() < self.proba:
                # Apply noise to this dimension
                noise = np.random.uniform(-self.noise, self.noise)
                action[k] = action[k] + noise
                if clip:
                    action[k] = np.clip(action[k], 0.0, 1.0)
        return action
class IdentityAction(ActionPerturbator):
    """Returns the same action (no perturbation)."""
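The per-dimension scheme used by the MultiDim `perturb` method above can also be written without an explicit loop. The following is a minimal sketch under stated assumptions, not the module's own code: `multidim_perturb` is a hypothetical standalone function, its `proba` and `noise` parameters stand in for `self.proba` and `self.noise`, and the `rng` argument is an added convenience for reproducibility. Unlike the loop version, it returns a new array rather than modifying `action` in place.

```python
import numpy as np


def multidim_perturb(action, proba, noise, clip=True, rng=None):
    """Hypothetical helper (not part of the module above).

    Each dimension is perturbed with probability `proba` by a uniform
    noise drawn from [-noise, noise]. The input array is left untouched.
    """
    rng = np.random.default_rng() if rng is None else rng
    action = np.asarray(action, dtype=float)

    # Boolean mask: True where the dimension receives noise.
    mask = rng.random(action.shape) < proba
    deltas = rng.uniform(-noise, noise, size=action.shape)

    # np.where builds a new array, so the caller's array is not modified.
    perturbed = np.where(mask, action + deltas, action)
    if clip:
        # As in the loop version, only perturbed dimensions are clipped.
        perturbed = np.where(mask, np.clip(perturbed, 0.0, 1.0), perturbed)
    return perturbed


original = np.array([0.2, 0.5, 0.9])
result = multidim_perturb(
    original, proba=0.5, noise=0.1, rng=np.random.default_rng(0)
)
```

Returning a copy is a design choice of this sketch: it keeps the proposed action available for comparison with the perturbed one, at the cost of one extra allocation per call.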