[Feature Request] Vectorized/multi-agent environments compatibility issues #777
Comments
For the last one (_reset should know the batch size) we could just pass an empty TensorDict instance. Wdyt?
There might be use cases where only some of the dimensions of the vector have to be reset. For example, the done flag can state that only some simulations in the vector have to be reset. This is why methods such as
We have that in ParallelEnv through a "reset_workers" key IIRC.
Exactly, a key like that can be used in the If this key is not present the default could be reset all dims |
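To make the partial-reset idea from this thread concrete, here is a minimal pure-Python sketch. It is not torchrl code: `reset_envs`, `reset_mask` and `initial_state` are hypothetical names, and plain lists stand in for batched tensors. It only illustrates the proposed behaviour: reset the flagged entries, and reset everything when no mask is given.

```python
from typing import List, Optional


def reset_envs(
    states: List[int],
    reset_mask: Optional[List[bool]] = None,
    initial_state: int = 0,
) -> List[int]:
    """Reset only the entries flagged in `reset_mask` (hypothetical helper).

    If no mask is passed, every entry is reset, mirroring the
    "default could be reset all dims" suggestion above.
    """
    if reset_mask is None:
        # Default: reset every simulation in the batch.
        reset_mask = [True] * len(states)
    if len(reset_mask) != len(states):
        raise ValueError("reset_mask must have one flag per simulation")
    # Keep the current state wherever the flag is False.
    return [initial_state if flag else s for s, flag in zip(states, reset_mask)]


states = [3, 5, 7]
partially_reset = reset_envs(states, reset_mask=[False, True, False])
fully_reset = reset_envs(states)
```

In a real environment the mask would presumably be derived from the done flag and carried in the tensordict passed to the reset call, but that wiring is an assumption here.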
Motivation
Vectorized environments are environments that perform simulations using batches. This can be useful to benefit from parallel computation on GPUs. These environments have their own batch_sizes, which can be used for different reasons.
For example:
(n_vectorized_envs, obs_size)
(n_vectorized_envs, n_agents, obs_size)
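As a rough illustration of how these shapes are composed, the sketch below treats the batch size as a tuple of leading dimensions followed by the feature size. The helper names (`observation_shape`, `wrap_in_workers`) and the concrete numbers are made up for the example, and the assumption that an outer wrapper prepends its own dimension to the batch size is mine, not something stated above.

```python
from typing import Tuple


def observation_shape(batch_size: Tuple[int, ...], obs_size: int) -> Tuple[int, ...]:
    """Full observation shape: batch dimensions first, feature dimension last."""
    return (*batch_size, obs_size)


def wrap_in_workers(n_workers: int, batch_size: Tuple[int, ...]) -> Tuple[int, ...]:
    """Assumed behaviour of an outer wrapper: prepend one dimension per worker."""
    return (n_workers, *batch_size)


n_vectorized_envs, n_agents, obs_size = 32, 4, 10

# Vectorized, single agent: (n_vectorized_envs, obs_size)
single_agent = observation_shape((n_vectorized_envs,), obs_size)

# Vectorized, multi-agent: (n_vectorized_envs, n_agents, obs_size)
multi_agent = observation_shape((n_vectorized_envs, n_agents), obs_size)

# Running the multi-agent env in 2 workers adds one more leading dimension.
wrapped_batch = wrap_in_workers(2, (n_vectorized_envs, n_agents))
wrapped_obs = observation_shape(wrapped_batch, obs_size)
```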
Currently, the torchrl environment infrastructure has some issues with environments that have non-empty batch sizes or a batch dimension for agents.
Ideally, we would like to use vectorized environments freely in torchrl and leverage its features such as ParallelEnv and Collectors on top of such environments. This would create tensordicts with many dimensions in the batch_size, for example:
I created this issue to list and organize all the issues that need to be addressed in order to generalize to BaseEnvs with general batch sizes in torchrl:
Issues
Stacking tensordicts of heterogeneous shapes and NestedTensors compatibility (#766)(PR)
When some of the dimensions of the vectorized environment are heterogeneous (agents with different observation and action spaces that still share the other batch dimensions), we need to carry this heterogeneous data in a suitable data structure.
NestedTensors provide a natural candidate for this task. Here is a list of the operations that need to be supported by NestedTensors in order to enable this feature:
[[a, b], [a, c]] into a single one of shape [[[a, b], [a, c]], [[a, b], [a, c]]])
Heterogeneous CompositeSpec (#766)(PR #829)
Bug on how ParallelEnv sets the batch_size (#773)(PR #774)
Bug on using sorted() on CompositeSpec keys (#775)(PR #787)
Handling of the done flag when it has arbitrary dimensions (#776)(PR #788)
The _reset() method needs to be able to know which dimensions and indexes to reset (#790)(PR #800)
Collectors crash with environments with non-empty batch_size (#807)(PR #828)
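For the heterogeneous stacking item above, the shape notation can be illustrated with a small pure-Python sketch. It works on shape descriptors only, not on real NestedTensors: `stack_nested_shapes` is a hypothetical helper, the values chosen for a, b and c are arbitrary, and requiring every stacked item to share the same heterogeneous layout is an assumption taken from the example in the list.

```python
from typing import List, Sequence, Tuple

Shape = Tuple[int, ...]


def stack_nested_shapes(items: Sequence[List[Shape]]) -> List[List[Shape]]:
    """Stack nested shape descriptors along a new leading dimension.

    Each item lists the per-component shapes of one nested tensor,
    e.g. [(a, b), (a, c)]. Assumption: all items share the same layout.
    """
    if not items:
        raise ValueError("need at least one item to stack")
    first = items[0]
    for item in items[1:]:
        if item != first:
            raise ValueError("all items must share the same nested layout")
    # The new leading dimension is simply the list of stacked items.
    return [list(item) for item in items]


a, b, c = 2, 3, 5
per_env = [(a, b), (a, c)]  # two agents with different trailing sizes
stacked = stack_nested_shapes([per_env, per_env])
```

Here `stacked` has the layout [[[a, b], [a, c]], [[a, b], [a, c]]] from the list item, with one outer entry per stacked environment.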