smartgrid.rewards.numeric.per_agent.multi_objective_sum.MultiObjectiveSumPerAgent¶
- class smartgrid.rewards.numeric.per_agent.multi_objective_sum.MultiObjectiveSumPerAgent(coefficients=None)[source]¶
Bases: Reward
Weighted sum of OverConsumptionPerAgent and Comfort.
The reward is equal to 0.2 * comfort + 0.8 * overconsumption, where comfort refers to the reward of Comfort, and overconsumption refers to the reward of OverConsumptionPerAgent.
The coefficients (0.2 and 0.8) can be configured in the constructor. Note that, in this case, the sum of coefficients should be equal to 1, in order to have a weighted average, but this is not strictly mandatory.
Methods
__init__([coefficients])
calculate(world, agent): Compute the reward for a specific Agent at the current time step.
reset(): Reset the reward function.
Attributes
Uniquely identifying, human-readable name for this reward function.
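The weighted sum described above can be sketched as follows. The helper below is a hypothetical stand-in, not part of the library: the real class obtains the two component values from the Comfort and OverConsumptionPerAgent reward functions at each time step.

```python
# Hypothetical sketch of the weighted-sum logic behind
# MultiObjectiveSumPerAgent; in the real class, `comfort` and
# `overconsumption` come from the Comfort and OverConsumptionPerAgent
# reward functions, respectively.
def weighted_sum(comfort, overconsumption, coefficients=(0.2, 0.8)):
    """Combine the two reward components using the given coefficients."""
    comfort_coef, overconsumption_coef = coefficients
    return comfort_coef * comfort + overconsumption_coef * overconsumption
```

With the default coefficients, weighted_sum(1.0, 0.5) yields 0.2 * 1.0 + 0.8 * 0.5 = 0.6 (up to floating-point rounding). Because the coefficients sum to 1, the result is a weighted average, and it stays in [0, 1] whenever both components do.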
- calculate(world, agent) → float[source]¶
Compute the reward for a specific Agent at the current time step.
- Parameters:
world – The World, used to get the current state and determine consequences of the agent’s action.
agent – The Agent that is rewarded, used to access particular information about the agent (personal state) and its action.
- Returns:
A reward, i.e., a single value describing how well the agent performed. The higher the reward, the better its action was. Typically a value in [0, 1], but any range can be used.
- reset()¶
Reset the reward function.
This function must be overridden by reward functions that use a state, so that the state is reset with the environment. By default, does nothing, as most reward functions do not use a state.
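By way of illustration, a hypothetical stateful reward function (not part of the library) could override reset() like this, so that its internal state is cleared together with the environment:

```python
# Hypothetical example of a stateful reward function. The base class's
# reset() is a no-op, but a reward that keeps internal state must
# override it so the state is cleared when the environment resets.
class CumulativeReward:
    def __init__(self):
        self.total = 0.0

    def calculate(self, world, agent):
        # Assumed per-step reward of 1.0, for illustration only; a real
        # reward function would derive this from `world` and `agent`.
        step_reward = 1.0
        self.total += step_reward
        return self.total

    def reset(self):
        # Clear the accumulated state alongside the environment reset.
        self.total = 0.0
```

Without the reset() override, the accumulated total would leak across episodes and distort the rewards of every episode after the first.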