Commit 7c453bf

[Feature] Differentiable VMAS (#80)
* amend * amend * amend * amend * amend * amend * amend
1 parent 32e4295 commit 7c453bf

7 files changed

Lines changed: 151 additions & 360 deletions

README.md

Lines changed: 9 additions & 7 deletions
@@ -18,8 +18,8 @@
 
 This repository contains the code for the Vectorized Multi-Agent Simulator (VMAS).
 
-VMAS is a vectorized framework designed for efficient MARL benchmarking.
-It is comprised of a vectorized 2D physics engine written in PyTorch and a set of challenging multi-robot scenarios.
+VMAS is a vectorized differentiable simulator designed for efficient MARL benchmarking.
+It is comprised of a fully-differentiable vectorized 2D physics engine written in PyTorch and a set of challenging multi-robot scenarios.
 Scenario creation is made simple and modular to incentivize contributions.
 VMAS simulates agents and landmarks of different shapes and supports rotations, elastic collisions, joints, and custom gravity.
 Holonomic motion models are used for the agents to simplify simulation. Custom sensors such as LIDARs are available and the simulator supports inter-agent communication.
@@ -99,17 +99,17 @@ Watch the talk at DARS 2022 about VMAS.
 ### Install
 
 To install the simulator, you can use pip to get the latest release:
-```
+```bash
 pip install vmas
 ```
 If you want to install the current master version (more up to date than latest release), you can do:
-```
+```bash
 git clone https://github.com/proroklab/VectorizedMultiAgentSimulator.git
 cd VectorizedMultiAgentSimulator
 pip install -e .
 ```
 By default, vmas has only the core requirements. Here are some optional packages you may want to install:
-```
+```bash
 # Training
 pip install "ray[rllib]"==2.1.0 # We support versions "ray[rllib]<=2.2,>=1.13"
 pip install torchrl
@@ -132,7 +132,7 @@ The function arguments are explained in the documentation. The function returns
 object with the OpenAI gym interface:
 
 Here is an example:
-```
+```python
 env = vmas.make_env(
     scenario="waterfall", # can be scenario name or BaseScenario class
     num_envs=32,
@@ -143,6 +143,7 @@ Here is an example:
     seed=None, # Seed of the environment
     dict_spaces=False, # By default tuple spaces are used with each element in the tuple being an agent.
     # If dict_spaces=True, the spaces will become Dict with each key being the agent's name
+    grad_enabled=False, # If grad_enabled the simulator is differentiable and gradients can flow from output to input
     **kwargs # Additional arguments you want to pass to the scenario initialization
 )
 ```
@@ -245,7 +246,8 @@ customizable. Examples are: drag, friction, gravity, simulation timestep, non-di
 - **Agent actions**: Agents' physical actions are 2D forces for holonomic motion. Agent rotation can also be controlled through a torque action (activated by setting `agent.action.u_rot_range` at agent creation time). Agents can also be equipped with continuous or discrete communication actions.
 - **Action preprocessing**: By implementing the `process_action` function of a scenario, you can modify the agents' actions before they are passed to the simulator. This is used in `controllers` (where we provide different types of controllers to use) and `dynamics` (where we provide custom robot dynamic models).
 - **Controllers**: Controllers are components that can be appended to the neural network policy or replace it completely. We provide a `VelocityController` which can be used to treat input actions as velocities (instead of default vmas input forces). This PID controller takes velocities and outputs the forces which are fed to the simulator. See the `vel_control` debug scenario for an example.
-- **Dynamic models**: VMAS simulates holonomic dynamics models by default. Custom dynamic constraints can be enforced in an action preprocessing step. Implementations now include `DiffDriveDynamics` for differential drive robots and `KinematicBicycleDynamics` for kinematic bicycle model. See `diff_drive` and `kinematic_bicycle` debug scenarios for examples.
+- **Dynamic models**: VMAS simulates holonomic dynamics models by default. Custom dynamics can be chosen at agent creation time. Implementations now include `DiffDriveDynamics` for differential drive robots and `KinematicBicycleDynamics` for kinematic bicycle model. See `diff_drive` and `kinematic_bicycle` debug scenarios for examples.
+- **Differentiable**: By setting `grad_enabled=True` when creating an environment, the simulator will be differentiable, allowing gradients flowing through any of its function.
 
 ## Creating a new scenario
 
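
Reviewer note on the `grad_enabled` README text: what "gradients can flow from output to input" means in practice can be shown with a toy example. The sketch below is not VMAS code. `step` and `rollout` are hypothetical helpers with made-up drag, mass, and timestep values, and the `grad_enabled` argument only mimics the spirit of the new flag by using `torch.set_grad_enabled`.

```python
import torch


def step(pos, vel, force, dt=0.1, drag=0.25, mass=1.0):
    """One Euler step for a batch of holonomic point agents (toy dynamics)."""
    vel = vel * (1.0 - drag) + (force / mass) * dt
    pos = pos + vel * dt
    return pos, vel


def rollout(forces, grad_enabled):
    """Roll the toy dynamics forward; `forces` has shape (steps, num_envs, 2)."""
    num_envs = forces.shape[1]
    pos = torch.zeros(num_envs, 2)
    vel = torch.zeros(num_envs, 2)
    # Assumption: mimic the flag by building the autograd graph only on request.
    with torch.set_grad_enabled(grad_enabled):
        for force in forces:
            pos, vel = step(pos, vel, force)
    return pos


goal = torch.tensor([1.0, 0.5])
forces = torch.zeros(10, 4, 2, requires_grad=True)

# Differentiable rollout: the loss on the final positions can be
# backpropagated to the force applied at every step.
final_pos = rollout(forces, grad_enabled=True)
loss = ((final_pos - goal) ** 2).sum()
loss.backward()
grad = forces.grad

# With tracking disabled, the output carries no graph back to the inputs.
detached = rollout(forces, grad_enabled=False)
```

After `loss.backward()`, `forces.grad` has the same shape as `forces`, so the actions could be optimized directly by gradient descent. With tracking disabled, `detached.requires_grad` is `False` and no such gradient is available.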

vmas/make_env.py

Lines changed: 3 additions & 0 deletions
@@ -24,6 +24,7 @@ def make_env(
     dict_spaces: bool = False,
     multidiscrete_actions: bool = False,
     clamp_actions: bool = False,
+    grad_enabled: bool = False,
     **kwargs,
 ):
     """
@@ -43,6 +44,7 @@ def make_env(
             action spaces of an agent.
         clamp_actions: Weather to clamp input actions to the range instead of throwing
             an error when continuous_actions is True and actions are out of bounds
+        grad_enabled: (bool): Whether the simulator will keep track of gradients in the output. Default is ``False``.
         **kwargs ():
 
     Returns:
@@ -64,6 +66,7 @@ def make_env(
         dict_spaces=dict_spaces,
         multidiscrete_actions=multidiscrete_actions,
         clamp_actions=clamp_actions,
+        grad_enabled=grad_enabled,
         **kwargs,
     )
 

vmas/scenarios/navigation.py

Lines changed: 2 additions & 3 deletions
@@ -1,4 +1,4 @@
-# Copyright (c) 2022-2023.
+# Copyright (c) 2022-2024.
 # ProrokLab (https://www.proroklab.org/)
 # All rights reserved.
 import typing
@@ -14,7 +14,6 @@
 from vmas.simulator.sensors import Lidar
 from vmas.simulator.utils import Color, ScenarioUtils, X, Y
 
-
 if typing.TYPE_CHECKING:
     from vmas.simulator.rendering import Geom
 
@@ -286,7 +285,7 @@ def extra_render(self, env_index: int = 0) -> "List[Geom]":
 
 
 class HeuristicPolicy(BaseHeuristicPolicy):
-    def __init__(self, clf_epsilon = 0.2, clf_slack = 100.0, *args, **kwargs):
+    def __init__(self, clf_epsilon=0.2, clf_slack=100.0, *args, **kwargs):
         super().__init__(*args, **kwargs)
         self.clf_epsilon = clf_epsilon  # Exponential CLF convergence rate
         self.clf_slack = clf_slack  # weights on CLF-QP slack variable
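
Reviewer note on the two constructor arguments: the inline comments describe them as the exponential convergence rate and the slack weight of a CLF-QP. For orientation, a common textbook form of such a program is sketched below. The symbols ($V$, $f$, $g$, $u$, $\delta$) and the exact objective are assumptions, since this hunk does not show the scenario's actual formulation. Here $\varepsilon$ would correspond to `clf_epsilon` and $p$ to `clf_slack`.

```latex
\begin{aligned}
\min_{u,\,\delta} \quad & \lVert u \rVert^2 + p\,\delta^2 \\
\text{s.t.} \quad & L_f V(x) + L_g V(x)\,u + \varepsilon\,V(x) \le \delta
\end{aligned}
```

In this form, a larger $\varepsilon$ demands faster decay of the Lyapunov function $V$, and a larger $p$ penalizes violating that decay constraint more heavily.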
