smartgrid.make_env.make_basic_smartgrid
- smartgrid.make_env.make_basic_smartgrid(rewards: List[Reward] | None = None, max_step: int = 10000) → SmartGrid
Defines a “basic” scenario; this helper method instantiates a SmartGrid env.
This method limits the available parameters, and hence the possible customization, in order to simplify the creation of an environment. The basic environment is configured with:
- 20 agents with the Household profile, 5 agents with the Office profile, and 1 agent with the School profile.
- A smartgrid.util.RandomEnergyGenerator, which provides, at each time step, between 80% and 120% of the agents’ total energy need.
- The smartgrid.rewards.numeric.differentiated.AdaptabilityThree reward function, whose definition changes at t=2000 and t=6000 (to force agents to adapt).
- The default smartgrid.observation.ObservationManager, to determine observations from the current state of the environment.
Users who desire full control over the environment creation, e.g., to experiment with various scenarios, should instead manually create the environment “from scratch”, as explained in the documentation. They may take inspiration from this method’s content to do so.
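As a minimal usage sketch (only make_basic_smartgrid and the reset() requirement are taken from this page; everything else is standard Python):

```python
from smartgrid.make_env import make_basic_smartgrid

# Create the pre-configured "basic" environment: 26 agents, a
# RandomEnergyGenerator, and the AdaptabilityThree reward function.
env = make_basic_smartgrid()

# As per the PettingZoo framework, the env must be reset before use.
env.reset()
```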
- Parameters:
  - rewards – The list of reward functions to use (see the smartgrid.rewards package for a list of available reward functions). Traditionally, most users will want to use a single reward function (single-objective reinforcement learning), but this environment supports multi-objective reinforcement learning if desired; see the sketch after this list. By default, the AdaptabilityThree reward function is used.
  - max_step – The maximum number of steps to simulate in the environment. By default, a maximum of 10_000 steps is allowed; the environment can still be used after this amount, but it will raise a warning. This is mainly used to control the interaction loop automatically, through the terminated and truncated values.
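For example, a sketch that passes an explicit rewards list and a custom max_step (instantiating AdaptabilityThree with no constructor arguments is an assumption, not confirmed by this page):

```python
from smartgrid.make_env import make_basic_smartgrid
from smartgrid.rewards.numeric.differentiated import AdaptabilityThree

# A single-element list keeps the setting single-objective; passing several
# reward functions would enable multi-objective reinforcement learning.
env = make_basic_smartgrid(
    rewards=[AdaptabilityThree()],  # no-argument constructor: an assumption
    max_step=5_000,
)
```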
- Returns:
  An instance of a SmartGrid env. This instance must be, as per the PettingZoo framework, reset() before it can be used. The instance is wrapped in a RewardAggregator in order to produce single-objective rewards. To directly access the underlying env, use the unwrapped property.
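A sketch of a complete interaction loop follows; the env.agents attribute, the action_space(agent) accessor, and the dict-valued step() returns are assumed from PettingZoo’s parallel API and are not confirmed by this page:

```python
from smartgrid.make_env import make_basic_smartgrid

env = make_basic_smartgrid(max_step=10_000)
env.reset()

done = False
while not done:
    # Sample one action per agent (PettingZoo parallel conventions, assumed).
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    obs, rewards, terminated, truncated, infos = env.step(actions)
    # max_step ends the loop through the truncated values.
    done = all(terminated.values()) or all(truncated.values())

# The returned env is wrapped in a RewardAggregator; use `unwrapped`
# to access the underlying SmartGrid env directly.
raw_env = env.unwrapped
```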