smartgrid.rewards.reward.Reward
- class smartgrid.rewards.reward.Reward(name: str | None = None)
- Bases: ABC
- The Reward function is responsible for computing a reward for each agent.
- The reward is a signal telling the agent how well it performed with respect to the objective(s) specified by the reward function.
- Reward functions should judge the agent’s behaviour, based on its actions and/or the consequences of those actions on the world (state).
- The actual reward function is defined in calculate(); a simple function could be used instead, but using classes allows for easier extension and lets complex computations rely on attributes.
- A reward function is identified by its name (by default, the class name); this name is particularly useful when multiple reward functions are used (multi-objective reinforcement learning).
- Methods
- __init__([name])
- calculate(world, agent): Compute the reward for a specific Agent at the current time step.
- is_activated(world, agent): Determines whether the reward function should produce a reward.
- reset(): Reset the reward function.
- Attributes
- name: Uniquely identifying, human-readable name for this reward function.
- abstract calculate(world: World, agent: Agent) → float
- Compute the reward for a specific Agent at the current time step.
- Parameters:
- world – The World, used to get the current state and determine consequences of the agent’s action. 
- agent – The Agent that is rewarded, used to access particular information about the agent (personal state) and its action. 
 
- Returns:
- A reward, i.e., a single value describing how well the agent performed. The higher the reward, the better its action was. Typically, a value in [0,1] but any range can be used. 
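
As a rough illustration of the contract described above, the sketch below subclasses a minimal stand-in for Reward and implements calculate() so that it returns a value in [0,1]. The stand-in base class only mirrors the documented interface and is not the library's code; ToyAgent, its need and consumed fields, and ComfortReward are hypothetical names invented for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Reward(ABC):
    """Stand-in mirroring the documented interface (NOT the library's code)."""

    def __init__(self, name=None):
        # Documented default: the class name is used when no name is given.
        self.name = name if name is not None else type(self).__name__

    @abstractmethod
    def calculate(self, world, agent):
        """Return a single number; higher means the agent did better."""

    def is_activated(self, world, agent):
        return True

    def reset(self):
        pass


@dataclass
class ToyAgent:
    # Hypothetical fields, chosen only for this illustration.
    need: float
    consumed: float


class ComfortReward(Reward):
    """Hypothetical reward: fraction of the agent's need that was covered."""

    def calculate(self, world, agent):
        if agent.need <= 0:
            return 1.0  # Nothing was needed, so the agent is fully satisfied.
        ratio = agent.consumed / agent.need
        # Clamp to [0, 1], the typical range mentioned in the documentation.
        return max(0.0, min(ratio, 1.0))


reward = ComfortReward()
# The world is unused by this toy reward, so None is passed for brevity.
half = reward.calculate(world=None, agent=ToyAgent(need=10.0, consumed=5.0))
full = reward.calculate(world=None, agent=ToyAgent(need=10.0, consumed=20.0))
```

Because calculate() is abstract, the stand-in base class cannot be instantiated directly; only subclasses that define it can. Passing name="comfort" to the constructor would replace the default name, which can help tell reward functions apart in a multi-objective setting.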
 
 - is_activated(world: World, agent: Agent) → bool
- Determines whether the reward function should produce a reward.
- This function can be used to enable or disable the reward function at will, allowing for a variety of use cases (changing the reward function over time, using different reward functions for different agents, etc.).
- By default, it returns True, so that subclasses are not forced to define this function. To specify when this reward function should be activated, two ways are possible:
- Wrap the Reward object in a constraint class, e.g., TimeConstrainedReward.
- Override this method in the subclass to implement the desired activation mechanism. 
 - Parameters:
- world – The World in which the reward function may be activated. 
- agent – The Agent that should (potentially) be rewarded by this reward function. 
 
- Returns:
- A boolean indicating whether the reward function should produce a reward at this moment (for this state of the world and this learning agent).
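
The second option above (overriding is_activated() in a subclass) might look like the following sketch. It again relies on a stand-in base class that mirrors the documented interface rather than on the library itself. The current_step attribute on the world, the LateStartReward class, and the reward_or_none helper are all assumptions made for illustration; in particular, how the library treats a deactivated reward is not stated here, so returning None is only one possible convention.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace


class Reward(ABC):
    """Stand-in mirroring the documented interface (NOT the library's code)."""

    def __init__(self, name=None):
        self.name = name if name is not None else type(self).__name__

    @abstractmethod
    def calculate(self, world, agent):
        """Return a single number describing how well the agent performed."""

    def is_activated(self, world, agent):
        # Documented default: always activated.
        return True


class LateStartReward(Reward):
    """Hypothetical reward that only becomes active after a given step."""

    def __init__(self, start_step, name=None):
        super().__init__(name)
        self.start_step = start_step

    def calculate(self, world, agent):
        return 1.0  # Constant placeholder value, to keep the example short.

    def is_activated(self, world, agent):
        # Assumption: the world exposes its time step as `current_step`.
        return world.current_step >= self.start_step


def reward_or_none(reward, world, agent):
    """Hypothetical caller: compute the reward only when it is activated."""
    if reward.is_activated(world, agent):
        return reward.calculate(world, agent)
    return None


late_start = LateStartReward(start_step=5)
early = reward_or_none(late_start, SimpleNamespace(current_step=3), agent=None)
late = reward_or_none(late_start, SimpleNamespace(current_step=7), agent=None)
```

Overriding keeps the activation rule next to the reward it governs, whereas wrapping in a constraint class lets the same rule be reused across several reward functions.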