smartgrid.rewards.numeric.per_agent.adaptability.AdaptabilityOnePerAgent

class smartgrid.rewards.numeric.per_agent.adaptability.AdaptabilityOnePerAgent

Bases: Reward

Equity when t<3000, MultiObjectiveSum otherwise.

This reward function changes its definition at time step t = 3000. While t < 3000, it behaves exactly like the EquityPerAgent reward function. Once t >= 3000, it behaves like the MultiObjectiveSumPerAgent reward function, which is a weighted average of the Comfort and OverConsumptionPerAgent reward functions.

Thus, the targeted objectives are completely different in the two phases (equity vs comfort+overconsumption). This makes the reward function useful for evaluating whether agents are able to “completely” change their behaviour.
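The two-phase behaviour can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the library's implementation: `PhaseSwitchReward`, `StubWorld`, and its `current_step` attribute are hypothetical names, and the two phase rewards are constant stand-in callables rather than the real EquityPerAgent and MultiObjectiveSumPerAgent classes.

```python
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical stand-in for the World; the attribute name is an assumption.
@dataclass
class StubWorld:
    current_step: int


RewardFn = Callable[[Any, Any], float]


class PhaseSwitchReward:
    """Delegate to one reward before `switch_step`, to another from it on."""

    def __init__(self, early: RewardFn, late: RewardFn, switch_step: int = 3000):
        self.early = early
        self.late = late
        self.switch_step = switch_step

    def calculate(self, world: Any, agent: Any) -> float:
        # t < switch_step: first objective; t >= switch_step: second objective.
        if world.current_step < self.switch_step:
            return self.early(world, agent)
        return self.late(world, agent)


# Constant stand-in rewards, used only to make the switch visible.
reward = PhaseSwitchReward(early=lambda w, a: 0.25, late=lambda w, a: 0.75)
before = reward.calculate(StubWorld(current_step=2999), agent=None)
after = reward.calculate(StubWorld(current_step=3000), agent=None)
```

In this sketch, `before` comes from the first-phase stand-in and `after` from the second-phase one, because step 3000 already belongs to the second phase.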

__init__()

Methods

`__init__`()
`calculate`(world, agent)	Compute the reward for a specific Agent at the current time step.
`reset`()	Reset the reward function.

Attributes

name

Uniquely identifying, human-readable name for this reward function.

calculate(world, agent) → float

Compute the reward for a specific Agent at the current time step.

Parameters:

world – The World, used to get the current state and determine consequences of the agent’s action.
agent – The Agent that is rewarded, used to access particular information about the agent (personal state) and its action.

Returns:

A reward, i.e., a single value describing how well the agent performed. The higher the reward, the better its action was. Typically a value in [0, 1], but any range can be used.

name: str

Uniquely identifying, human-readable name for this reward function.

reset()

Reset the reward function.

This method must be overridden by reward functions that use a state, so that the state is reset together with the environment. By default, it does nothing, as most reward functions do not use a state.
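To illustrate when overriding reset matters, the sketch below defines a stateful reward. `BaseReward` is a hypothetical stand-in that only mimics the two documented methods of the Reward base class, and `RunningMeanReward` is an invented example that is not part of the library.

```python
class BaseReward:
    """Hypothetical stand-in mimicking the documented calculate/reset methods."""

    def calculate(self, world, agent) -> float:
        raise NotImplementedError

    def reset(self) -> None:
        # Default behaviour: stateless, so there is nothing to reset.
        pass


class RunningMeanReward(BaseReward):
    """Invented example: returns the running mean of a per-step score."""

    def __init__(self, score_fn):
        self.score_fn = score_fn  # assumed callable: (world, agent) -> float
        self.total = 0.0
        self.count = 0

    def calculate(self, world, agent) -> float:
        self.total += self.score_fn(world, agent)
        self.count += 1
        return self.total / self.count

    def reset(self) -> None:
        # Clear the accumulated state so that a new episode starts fresh.
        self.total = 0.0
        self.count = 0


scores = iter([1.0, 0.0, 0.25])
reward = RunningMeanReward(lambda world, agent: next(scores))
first = reward.calculate(None, None)   # mean of [1.0]
second = reward.calculate(None, None)  # mean of [1.0, 0.0]
reward.reset()
third = reward.calculate(None, None)   # mean of [0.25] only
```

Without the call to `reset()`, the third value would still average in the scores from the previous episode.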