smartgrid.make_env.make_basic_smartgrid
- smartgrid.make_env.make_basic_smartgrid(rewards: List[Reward] | None = None, max_step: int = 10000) → SmartGrid
Defines a “basic” scenario; this helper method instantiates a SmartGrid env.
This method limits the available parameters, and hence the possible customization, in order to simplify the creation of an environment. The basic environment is configured with:
- 20 agents with the Household profile, 5 agents with the Office profile, and 1 agent with the School profile.
- A smartgrid.util.RandomEnergyGenerator, which provides, at each time step, between 80% and 120% of the agents’ total energy need.
- The smartgrid.rewards.numeric.differentiated.AdaptabilityThree reward function, whose definition changes at t=2000 and t=6000 (to force agents to adapt).
- The default smartgrid.observation.ObservationManager, to determine observations from the current state of the environment.
Users who desire full control over the environment creation, e.g., to experiment with various scenarios, should instead manually create the environment “from scratch”, as explained in the documentation. They may take inspiration from this method’s content to do so.
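As a minimal usage sketch (only make_basic_smartgrid and the reset() requirement are taken from this page; everything else is standard Python):

```python
from smartgrid.make_env import make_basic_smartgrid

# Create the pre-configured "basic" environment: 26 agents, a
# RandomEnergyGenerator, and the AdaptabilityThree reward function.
env = make_basic_smartgrid()

# As per the PettingZoo framework, the env must be reset before use.
env.reset()
```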
- Parameters:
  - rewards – The list of reward functions to use (see the smartgrid.rewards package for a list of available reward functions). Traditionally, most users will want to use a single reward function (single-objective reinforcement learning), but this environment supports multi-objective reinforcement learning if desired; see the sketch after this list. By default, the AdaptabilityThree reward function is used.
  - max_step – The maximum number of steps to simulate in the environment. By default, a maximum of 10_000 steps is allowed; the environment can still be used after this amount, but it will raise a warning. This is mainly used to control the interaction loop automatically, through the terminated and truncated values.
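For example, a sketch that passes an explicit rewards list and a custom max_step (instantiating AdaptabilityThree with no constructor arguments is an assumption, not confirmed by this page):

```python
from smartgrid.make_env import make_basic_smartgrid
from smartgrid.rewards.numeric.differentiated import AdaptabilityThree

# A single-element list keeps the setting single-objective; passing several
# reward functions would enable multi-objective reinforcement learning.
env = make_basic_smartgrid(
    rewards=[AdaptabilityThree()],  # no-argument constructor: an assumption
    max_step=5_000,
)
```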
- Returns:
  An instance of a SmartGrid env. This instance must be, as per the PettingZoo framework, reset() before it can be used. The instance is wrapped in a RewardAggregator in order to produce single-objective rewards. To directly access the underlying env, use the unwrapped property.
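A sketch of a complete interaction loop follows; the env.agents attribute, the action_space(agent) accessor, and the dict-valued step() returns are assumed from PettingZoo’s parallel API and are not confirmed by this page:

```python
from smartgrid.make_env import make_basic_smartgrid

env = make_basic_smartgrid(max_step=10_000)
env.reset()

done = False
while not done:
    # Sample one action per agent (PettingZoo parallel conventions, assumed).
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    obs, rewards, terminated, truncated, infos = env.step(actions)
    # max_step ends the loop through the truncated values.
    done = all(terminated.values()) or all(truncated.values())

# The returned env is wrapped in a RewardAggregator; use `unwrapped`
# to access the underlying SmartGrid env directly.
raw_env = env.unwrapped
```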