smartgrid.observation.local_observation.LocalObservation¶
- class smartgrid.observation.local_observation.LocalObservation(personal_storage: float, comfort: float, payoff: float)[source]¶
Bases: BaseObservation
Observations that are local (individual) to a single Agent.
Observations cannot be modified once created, to limit potential bugs. Local observations are not shared with other agents, and contain the following measures:
- personal_storage
  The amount of energy currently available in the Agent's personal battery. This amount is represented as a ratio between 0 (empty) and 1 (full), w.r.t. the Agent's battery capacity. See Agent.storage_ratio for details.
- comfort
  This represents the degree to which the agent satisfied its need by consuming energy. Intuitively, the closer an agent's consumption is to its need, the closer the comfort will be to 1. Conversely, if an agent does not consume, its comfort will tend towards 0. Comfort is computed through the Agent's comfort function; we describe several examples in the comfort module, which rely on generalized logistic curves (similar to a sigmoid).
- payoff
  The agent's current amount of money. Money can be won by selling energy from the personal battery to the national grid, or lost by buying energy from the national grid for the personal battery. The payoff observation is interpolated from the agent's real payoff and the payoff range to obtain a value between 0 (a loss) and 1 (a win), with 0.5 being the neutral value (neither win nor loss). A worked sketch of this interpolation follows this list.
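A minimal sketch of these three measures, assuming the class is importable from the module path above; the payoff bounds used are an arbitrary illustration, not the environment's actual configuration:

    from smartgrid.observation.local_observation import LocalObservation

    # Each measure is a ratio in [0, 1]; a payoff of 0.5 is neutral.
    obs = LocalObservation(personal_storage=0.8, comfort=0.65, payoff=0.5)

    # The payoff interpolation described above, with hypothetical bounds
    # (the real range comes from the simulation's configuration):
    payoff_min, payoff_max = -10_000.0, 10_000.0
    real_payoff = 2_500.0
    interpolated = (real_payoff - payoff_min) / (payoff_max - payoff_min)
    assert interpolated == 0.625  # a moderate win; 0.5 would be neutral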
Methods
__init__(personal_storage, comfort, payoff)
asdict()
    Return the Observation as a dictionary.
compute(world, agent)
    Return local observations for a single agent.
fields()
    Returns the names of fields that compose an Observation.
reset()
    Reset the LocalObservation class.
space(world, agent)
    Describe the space in which Observations take their values.
Attributes
personal_storage
    The ratio of energy available in the agent's personal storage, over capacity.
comfort
    The agent's comfort, a value in [0,1] based on its consumption and need.
payoff
    The agent's current payoff, expressed as a ratio in [0,1] based on maximal and minimal allowed values.
- asdict() → Dict[str, Any]¶
Return the Observation as a dictionary.
Fields can be excluded by setting the include custom property in the field's metadata to False, such as: my_field: Any = field(metadata={'include': False}). A self-contained sketch of this filtering follows the parameter list.
- Parameters:
  self – An instance of Observation.
- Returns:
  The observation represented as a dictionary, with the fields' names as keys and the fields' values as values, in the order of definition.
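The filtering behaviour can be reproduced with a plain frozen dataclass, as in this sketch (not the library's actual implementation; the debug_info field is hypothetical):

    from dataclasses import dataclass, field, fields
    from typing import Any, Dict

    @dataclass(frozen=True)
    class ExampleObservation:
        personal_storage: float
        comfort: float
        payoff: float
        # Hypothetical extra field, excluded from the dictionary form:
        debug_info: Any = field(default=None, metadata={'include': False})

        def asdict(self) -> Dict[str, Any]:
            # Keep only the fields whose 'include' metadata is not False.
            return {f.name: getattr(self, f.name)
                    for f in fields(self)
                    if f.metadata.get('include', True)}

    obs = ExampleObservation(personal_storage=0.8, comfort=0.65, payoff=0.5)
    print(obs.asdict())
    # {'personal_storage': 0.8, 'comfort': 0.65, 'payoff': 0.5}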
- classmethod compute(world: World, agent: Agent) → Self[source]¶
Return local observations for a single agent.
This function extracts the relevant measures from an Agent. Most of the computing has already been done in the Agent.update() and Agent.handle_action() methods. A short usage sketch is given after the parameter list.
- Parameters:
world – The World in which the Agent is contained, for any data that may be stored outside the agent.
agent – The Agent for which we want to compute the local observations.
- Return type:
  LocalObservation
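A hedged usage sketch; world and agent are assumed to come from a running simulation, whose construction is out of scope here:

    from smartgrid.observation.local_observation import LocalObservation

    def observe(world, agent):
        # `world` and `agent` come from the running simulation (types World
        # and Agent); most measures were already computed by Agent.update()
        # and Agent.handle_action(), so this call mainly packages them.
        return LocalObservation.compute(world, agent)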
- classmethod fields() → Tuple[str]¶
Returns the names of fields that compose an Observation.
Fields can be excluded by setting the include custom property in the field's metadata to False, such as: my_field: Any = field(metadata={'include': False}).
- Parameters:
  cls – Either the class itself, or an instance of the class; this method supports both. In other words, it can be used as Observation.fields(), or obs = Observation(...); obs.fields() (both styles are shown in the sketch below).
- Returns:
  The fields' names as a tuple, in their order of definition.
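A short sketch of both call styles; the exact tuple shown assumes the fields are defined in the order of the constructor signature:

    from smartgrid.observation.local_observation import LocalObservation

    names = LocalObservation.fields()
    print(names)  # presumably ('personal_storage', 'comfort', 'payoff')

    obs = LocalObservation(personal_storage=0.8, comfort=0.65, payoff=0.5)
    assert obs.fields() == names  # the instance call gives the same result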
- payoff: float¶
The agent's current payoff, expressed as a ratio in [0,1] based on maximal and minimal allowed values.
- personal_storage: float¶
The ratio of energy available in the agent’s personal storage, over capacity.
- classmethod reset()[source]¶
Reset the LocalObservation class.
This method currently does nothing but is implemented to mirror the behaviour of GlobalObservation, and to allow extensions to use complex mechanisms that require a reset.
- classmethod space(world: World, agent: Agent) → Space¶
Describe the space in which Observations take their values.
This method is useful if an algorithm has assumptions or requirements on the observation space. For example, values can be interpolated by knowing their original domain.
We currently use ratios in [0, 1] for each metric of observations. This makes it easier for learning algorithms (it avoids perceiving a given dimension as more important than another because of an extended range). It also means that the world and agent parameters do not influence the space (they could be None).
In the future, we could use the true ranges from the agent's AgentProfile and let users convert these observations to [0, 1] when necessary. This would provide more useful information, e.g., the actual battery storage in [0, capacity] rather than a ratio, or the actual hour in [0, 23] rather than a value (h % 24) / 24, which is hard to interpret for human users.
- Parameters:
  world – The World (does not currently influence the space; may be None).
  agent – The Agent (does not currently influence the space; may be None).
- Return type:
  Space
- Returns:
A gym Box, whose low field indicates the minimum value of each element of the observation vector. Similarly, the high field indicates the maximum value of each element, such that each element i of the vector is contained between low[i] and high[i]. The Box's shape is the number of fields. An equivalent Box is sketched below.
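An equivalent space can be built and inspected directly, as in this sketch (it assumes a Gym-style Box and a float32 dtype, neither of which is guaranteed by this page):

    import numpy as np
    from gymnasium.spaces import Box  # or `gym.spaces.Box` in older setups

    n_fields = 3  # personal_storage, comfort, payoff
    equivalent = Box(low=0.0, high=1.0, shape=(n_fields,), dtype=np.float32)

    # Each element i of an observation vector lies between low[i] and high[i]:
    print(equivalent.low)   # [0. 0. 0.]
    print(equivalent.high)  # [1. 1. 1.]
    print(equivalent.contains(np.array([0.8, 0.65, 0.5], dtype=np.float32)))  # True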