smartgrid.rewards.numeric.per_agent.adaptability.AdaptabilityThreePerAgent
- class smartgrid.rewards.numeric.per_agent.adaptability.AdaptabilityThreePerAgent
Bases: Reward
Equity when t < 2000, (Equity + OverConsumption) / 2 when 2000 <= t < 6000, (Equity + OverConsumption + Comfort) / 3 otherwise.
This reward function changes its definition after time steps t = 2000 and t = 6000. When t < 2000, it behaves exactly as the EquityPerAgent reward function. When 2000 <= t < 6000, it returns the average of EquityPerAgent and OverConsumptionPerAgent. Finally, when t >= 6000, it returns the average of EquityPerAgent, OverConsumptionPerAgent, and Comfort.
Thus, the targeted objectives increase in the second and third phases: the previous objectives are kept, and a new one is added. This makes the reward function useful for evaluating whether agents are able to adapt their behaviour by taking new objectives into account in addition to the previous ones.
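The phase schedule can be pictured as a simple dispatch on the current time step. The following is a minimal standalone sketch, not the library's implementation; `equity`, `over_consumption`, and `comfort` stand in for the values that EquityPerAgent, OverConsumptionPerAgent, and Comfort would compute:

```python
def adaptability_three(t, equity, over_consumption, comfort):
    """Sketch of the three-phase schedule described above."""
    if t < 2000:
        # Phase 1: equity is the only objective.
        return equity
    elif t < 6000:
        # Phase 2: equity is kept, over-consumption is added; average both.
        return (equity + over_consumption) / 2
    else:
        # Phase 3: all three objectives, averaged with equal weights.
        return (equity + over_consumption + comfort) / 3
```

For example, `adaptability_three(2500, 0.8, 0.6, 0.9)` yields 0.7: the comfort term is ignored until t reaches 6000.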
Methods

__init__()
calculate(world, agent): Compute the reward for a specific Agent at the current time step.
reset(): Reset the reward function.
Attributes

name: Uniquely identifying, human-readable name for this reward function.
- calculate(world, agent)
Compute the reward for a specific Agent at the current time step.
- Parameters:
world – The World, used to get the current state and determine consequences of the agent’s action.
agent – The Agent that is rewarded, used to access particular information about the agent (personal state) and its action.
- Returns:
A reward, i.e., a single value describing how well the agent performed. The higher the reward, the better its action was. Typically a value in [0, 1], but any range can be used.
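As a usage sketch: the import path and the calculate(world, agent) / reset() signatures come from this page, while the idea that the environment provides a World and its collection of agents is an assumption about the surrounding code:

```python
from smartgrid.rewards.numeric.per_agent.adaptability import AdaptabilityThreePerAgent

reward_fn = AdaptabilityThreePerAgent()
reward_fn.reset()  # a no-op for this class, but part of the Reward contract

def step_rewards(world, agents):
    # One scalar reward per agent at the current time step.
    return {agent: reward_fn.calculate(world, agent) for agent in agents}
```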
- reset()
Reset the reward function.
This function must be overridden by reward functions that use a state, so that the state is reset along with the environment. By default, it does nothing, as most reward functions do not use a state.
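To illustrate why stateful reward functions must override reset(), here is a hypothetical reward that is not part of the library; it accumulates a history across steps, and the `agent.state.comfort` access is an assumption about the agent's personal state:

```python
class MovingBaselineReward:
    """Hypothetical stateful reward: scores improvement over a running mean."""

    def __init__(self):
        self._history = []

    def calculate(self, world, agent):
        comfort = agent.state.comfort  # assumed attribute of the agent's state
        self._history.append(comfort)
        # Reward the agent for beating its own running average.
        return comfort - sum(self._history) / len(self._history)

    def reset(self):
        # Without this override, the history would leak across episodes.
        self._history = []
```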