smartgrid.rewards.argumentation.supply_security.SupplySecurity¶
- class smartgrid.rewards.argumentation.supply_security.SupplySecurity(j_transform=None)[source]¶
Bases: Reward
Reward based on the security of supply moral value, promoting well-being.
Methods

__init__([j_transform])

calculate(world, agent)
    Compute the reward for a specific Agent at the current time step.

is_activated(world, agent)
    Determines whether the reward function should produce a reward.

reset()
    Reset the reward function.
Attributes

HIGH_GLOBAL

LOW_GLOBAL

decision
    Hardcoded decision to use for the judgment.

name
    Uniquely identifying, human-readable name for this reward function.

judge
    The JudgingAgent that represents the implemented moral value.
- calculate(world, agent)[source]¶
Compute the reward for a specific Agent at the current time step.
- Parameters:
world – The World, used to get the current state and determine consequences of the agent’s action.
agent – The Agent that is rewarded, used to access particular information about the agent (personal state) and its action.
- Returns:
A reward, i.e., a single value describing how well the agent performed. The higher the reward, the better its action was. Typically a value in [0, 1], but any range can be used.
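As a rough sketch of how a caller might use this method, the stand-in `World`, `Agent`, and reward classes below are simplified assumptions for illustration, not the library's actual classes:

```python
# Minimal stand-ins for illustration only; the real smartgrid classes
# are richer than these (assumption).
class World:
    def __init__(self, comfort):
        self.comfort = comfort  # hypothetical global state in [0, 1]

class Agent:
    def __init__(self, name):
        self.name = name

class SupplySecuritySketch:
    """Toy reward: returns the world's comfort level, clipped to [0, 1]."""
    def calculate(self, world, agent):
        # A reward in [0, 1]: higher means the agent acted better.
        return max(0.0, min(1.0, world.comfort))

reward_fn = SupplySecuritySketch()
agent = Agent("household-1")
print(reward_fn.calculate(World(comfort=0.8), agent))  # → 0.8
```

The actual `SupplySecurity` reward delegates this computation to its judging agent rather than reading a single attribute; this sketch only shows the call shape and the expected value range.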
- decision: str = 'moral'¶
Hardcoded decision to use for the judgment.
The AFDM framework that we use supports multiple decisions; in the current version, we only need to decide whether the action was moral or not. Thus, we use a hardcoded identifier for the decision: it must be the same when creating ajar.argument.Argument objects and when performing the judgment!
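To illustrate why the identifier must match, here is a toy model of the idea. This is NOT the ajar API; the dict-based arguments and the `judge` helper are invented stand-ins:

```python
# Toy illustration of the decision-identifier constraint (assumption;
# the real ajar.argument.Argument and judgment are more elaborate).
DECISION = 'moral'

# Arguments are created for a specific decision...
arguments = [
    {'decision': DECISION, 'supports': True},
    {'decision': DECISION, 'supports': False},
    {'decision': 'legal', 'supports': True},  # different id: ignored below
]

def judge(decision, arguments):
    """Count only arguments created for the *same* decision identifier."""
    relevant = [a for a in arguments if a['decision'] == decision]
    pros = sum(a['supports'] for a in relevant)
    return pros / len(relevant) if relevant else 0.0

print(judge(DECISION, arguments))  # → 0.5
```

If the identifier used when creating arguments differed from the one used at judgment time, no argument would be considered relevant and the judgment would be vacuous, which is why the class hardcodes a single string.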
- is_activated(world: World, agent: Agent) → bool¶
Determines whether the reward function should produce a reward.
This function can be used to enable/disable the reward function at will, allowing for a variety of use cases (changing the reward function over time, using different reward functions for different agents, etc.).
By default, it returns True to avoid forcing the definition of this function. To specify when this reward function should be activated, two ways are possible:
- Wrap the Reward object in a constraint class, e.g., TimeConstrainedReward.
- Override this method in the subclass to implement the desired activation mechanism.
- Parameters:
world – The World in which the reward function may be activated.
agent – The Agent that should (potentially) be rewarded by this reward function.
- Returns:
A boolean indicating whether the reward function should produce a reward at this moment (for this state of the world and this learning agent).
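A minimal sketch of the second option, overriding is_activated in a subclass. The `Reward` base and `World` here are simplified stand-ins, and the `current_step` attribute is an assumption, not necessarily the library's actual field name:

```python
# Sketch of overriding is_activated; the Reward base below is a minimal
# stand-in for the actual smartgrid class (assumption).
class Reward:
    def is_activated(self, world, agent):
        return True  # default: always active

    def calculate(self, world, agent):
        return 1.0

class LateReward(Reward):
    """Hypothetical reward that only activates from time step 10 onwards."""
    def is_activated(self, world, agent):
        return world.current_step >= 10

class World:
    def __init__(self, current_step):
        self.current_step = current_step  # assumed attribute name

reward = LateReward()
print(reward.is_activated(World(5), agent=None))   # → False
print(reward.is_activated(World(42), agent=None))  # → True
```

The alternative, wrapping the reward in a constraint class such as TimeConstrainedReward, achieves a similar effect without subclassing.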
- judge: JudgingAgent¶
The JudgingAgent that represents the implemented moral value.
It contains the argumentation graph (ajar.afdm.AFDM) that stores the arguments and attacks between them, as well as the transform function, and performs the judgment over a given situation.
- reset()¶
Reset the reward function.
This function must be overridden by reward functions that use a state, so that the state is reset with the environment. By default, it does nothing, as most reward functions do not use a state.
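A sketch of a stateful reward function that must override reset(). The base class and the `score` attribute on `World` are invented stand-ins for illustration:

```python
# Sketch of a stateful reward; the Reward base is a minimal stand-in
# for the actual smartgrid class (assumption).
class Reward:
    def reset(self):
        pass  # default: nothing to reset

class RunningAverageReward(Reward):
    """Hypothetical reward tracking a running average of raw scores."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def calculate(self, world, agent):
        raw = world.score  # assumed attribute for illustration
        self.total += raw
        self.count += 1
        return self.total / self.count

    def reset(self):
        # Clear internal state so a new episode starts fresh.
        self.total = 0.0
        self.count = 0

class World:
    def __init__(self, score):
        self.score = score

r = RunningAverageReward()
r.calculate(World(1.0), None)
r.calculate(World(0.0), None)  # running average is now 0.5
r.reset()
print(r.count)  # → 0
```

Without the reset() override, the running average would leak across episodes when the environment restarts, which is exactly the failure mode this hook exists to prevent.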