[BugFix] Fix non-full TensorStorage indexing #1730

Merged: vmoens merged 2 commits into main from fix-buffer-len on Dec 4, 2023

@vmoens vmoens commented Dec 4, 2023

Fixes #1729
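
The PR body does not spell out the failure mode, so the following is only a minimal sketch of one plausible reading of the title: a storage that pre-allocates `max_size` slots but has only written some of them should resolve indices against the filled prefix, not against the whole buffer. `PreallocatedStorage`, `get_naive`, and `get` are hypothetical names for illustration; this is not torchrl's `TensorStorage` and not the actual patch.

```python
class PreallocatedStorage:
    """Hypothetical fixed-capacity storage (illustration only)."""

    def __init__(self, max_size: int):
        self._data = [None] * max_size  # pre-allocated, initially unwritten slots
        self._len = 0                   # number of slots actually written

    def extend(self, values):
        values = list(values)
        if self._len + len(values) > len(self._data):
            raise ValueError("storage capacity exceeded")
        self._data[self._len:self._len + len(values)] = values
        self._len += len(values)

    def __len__(self):
        return self._len

    def get_naive(self, index):
        # Indexes the whole pre-allocated buffer, unwritten slots included.
        return self._data[index]

    def get(self, index):
        # Restrict to the filled prefix first, then apply the index.
        return self._data[:self._len][index]


storage = PreallocatedStorage(max_size=10)
storage.extend([1, 2, 3])

naive_last = storage.get_naive(-1)          # lands on an unwritten slot
fixed_last = storage.get(-1)                # last item actually written
naive_all = storage.get_naive(slice(None))  # all 10 slots
fixed_all = storage.get(slice(None))        # only the 3 written items
```

Under these assumptions, negative indices and open-ended slices differ between the two accessors whenever the storage is not full, and agree once it is.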

pytorch-bot bot commented Dec 4, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1730

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit e7185ea with merge base 3f2ecfc:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Dec 4, 2023
@vmoens vmoens marked this pull request as ready for review December 4, 2023 08:59
@vmoens vmoens added the bug label Dec 4, 2023

github-actions bot commented Dec 4, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: $\large\color{#35bf28}20$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 63.0765ms 62.7159ms 15.9449 Ops/s 14.5577 Ops/s $\textbf{\color{#35bf28}+9.53\%}$
test_sync 39.4908ms 34.7402ms 28.7851 Ops/s 27.7345 Ops/s $\color{#35bf28}+3.79\%$
test_async 58.5218ms 32.0911ms 31.1613 Ops/s 30.0563 Ops/s $\color{#35bf28}+3.68\%$
test_simple 0.4951s 0.4357s 2.2950 Ops/s 2.2181 Ops/s $\color{#35bf28}+3.47\%$
test_transformed 0.6507s 0.6013s 1.6631 Ops/s 1.5861 Ops/s $\color{#35bf28}+4.86\%$
test_serial 1.3856s 1.3361s 0.7485 Ops/s 0.7398 Ops/s $\color{#35bf28}+1.17\%$
test_parallel 1.3359s 1.2700s 0.7874 Ops/s 0.7545 Ops/s $\color{#35bf28}+4.36\%$
test_step_mdp_speed[True-True-True-True-True] 0.1820ms 22.7777μs 43.9027 KOps/s 42.1164 KOps/s $\color{#35bf28}+4.24\%$
test_step_mdp_speed[True-True-True-True-False] 46.9580μs 14.0078μs 71.3887 KOps/s 69.1936 KOps/s $\color{#35bf28}+3.17\%$
test_step_mdp_speed[True-True-True-False-True] 37.8410μs 13.8852μs 72.0189 KOps/s 69.2562 KOps/s $\color{#35bf28}+3.99\%$
test_step_mdp_speed[True-True-True-False-False] 36.0970μs 8.4338μs 118.5699 KOps/s 114.1077 KOps/s $\color{#35bf28}+3.91\%$
test_step_mdp_speed[True-True-False-True-True] 67.7460μs 24.2516μs 41.2345 KOps/s 39.3236 KOps/s $\color{#35bf28}+4.86\%$
test_step_mdp_speed[True-True-False-True-False] 80.6410μs 15.1914μs 65.8265 KOps/s 62.8130 KOps/s $\color{#35bf28}+4.80\%$
test_step_mdp_speed[True-True-False-False-True] 42.1390μs 15.2474μs 65.5848 KOps/s 61.4202 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_step_mdp_speed[True-True-False-False-False] 43.1910μs 9.8814μs 101.2007 KOps/s 98.1792 KOps/s $\color{#35bf28}+3.08\%$
test_step_mdp_speed[True-False-True-True-True] 53.6410μs 25.6156μs 39.0387 KOps/s 36.9390 KOps/s $\textbf{\color{#35bf28}+5.68\%}$
test_step_mdp_speed[True-False-True-True-False] 60.9140μs 16.4552μs 60.7712 KOps/s 57.9187 KOps/s $\color{#35bf28}+4.92\%$
test_step_mdp_speed[True-False-True-False-True] 41.5380μs 15.0050μs 66.6443 KOps/s 62.3038 KOps/s $\textbf{\color{#35bf28}+6.97\%}$
test_step_mdp_speed[True-False-True-False-False] 34.8850μs 9.7398μs 102.6715 KOps/s 99.1189 KOps/s $\color{#35bf28}+3.58\%$
test_step_mdp_speed[True-False-False-True-True] 54.2120μs 26.8794μs 37.2032 KOps/s 36.1487 KOps/s $\color{#35bf28}+2.92\%$
test_step_mdp_speed[True-False-False-True-False] 48.4610μs 17.7836μs 56.2315 KOps/s 54.2362 KOps/s $\color{#35bf28}+3.68\%$
test_step_mdp_speed[True-False-False-False-True] 43.5410μs 16.2196μs 61.6537 KOps/s 58.6386 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_step_mdp_speed[True-False-False-False-False] 47.8100μs 10.9793μs 91.0805 KOps/s 88.4874 KOps/s $\color{#35bf28}+2.93\%$
test_step_mdp_speed[False-True-True-True-True] 63.3880μs 25.5329μs 39.1652 KOps/s 37.3448 KOps/s $\color{#35bf28}+4.87\%$
test_step_mdp_speed[False-True-True-True-False] 51.7970μs 16.4743μs 60.7006 KOps/s 57.3181 KOps/s $\textbf{\color{#35bf28}+5.90\%}$
test_step_mdp_speed[False-True-True-False-True] 47.8600μs 17.5470μs 56.9898 KOps/s 54.1406 KOps/s $\textbf{\color{#35bf28}+5.26\%}$
test_step_mdp_speed[False-True-True-False-False] 41.2870μs 10.9429μs 91.3835 KOps/s 86.9406 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_step_mdp_speed[False-True-False-True-True] 60.2930μs 26.9712μs 37.0766 KOps/s 35.5638 KOps/s $\color{#35bf28}+4.25\%$
test_step_mdp_speed[False-True-False-True-False] 70.0410μs 17.5072μs 57.1194 KOps/s 54.4121 KOps/s $\color{#35bf28}+4.98\%$
test_step_mdp_speed[False-True-False-False-True] 51.9580μs 18.6000μs 53.7633 KOps/s 51.6673 KOps/s $\color{#35bf28}+4.06\%$
test_step_mdp_speed[False-True-False-False-False] 39.0240μs 12.1136μs 82.5516 KOps/s 78.3659 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_step_mdp_speed[False-False-True-True-True] 70.3010μs 28.1825μs 35.4830 KOps/s 34.2518 KOps/s $\color{#35bf28}+3.59\%$
test_step_mdp_speed[False-False-True-True-False] 45.6950μs 18.8548μs 53.0368 KOps/s 49.7425 KOps/s $\textbf{\color{#35bf28}+6.62\%}$
test_step_mdp_speed[False-False-True-False-True] 69.6010μs 18.6647μs 53.5770 KOps/s 50.9333 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_step_mdp_speed[False-False-True-False-False] 47.8000μs 12.1678μs 82.1842 KOps/s 77.0926 KOps/s $\textbf{\color{#35bf28}+6.60\%}$
test_step_mdp_speed[False-False-False-True-True] 68.2780μs 29.1377μs 34.3198 KOps/s 32.6988 KOps/s $\color{#35bf28}+4.96\%$
test_step_mdp_speed[False-False-False-True-False] 52.7090μs 19.9517μs 50.1211 KOps/s 47.3669 KOps/s $\textbf{\color{#35bf28}+5.81\%}$
test_step_mdp_speed[False-False-False-False-True] 66.5540μs 19.4836μs 51.3251 KOps/s 48.1672 KOps/s $\textbf{\color{#35bf28}+6.56\%}$
test_step_mdp_speed[False-False-False-False-False] 52.4380μs 13.2549μs 75.4437 KOps/s 72.8202 KOps/s $\color{#35bf28}+3.60\%$
test_values[generalized_advantage_estimate-True-True] 13.8884ms 12.0777ms 82.7975 Ops/s 81.5401 Ops/s $\color{#35bf28}+1.54\%$
test_values[vec_generalized_advantage_estimate-True-True] 34.4984ms 27.0537ms 36.9635 Ops/s 37.2554 Ops/s $\color{#d91a1a}-0.78\%$
test_values[td0_return_estimate-False-False] 0.2624ms 0.1821ms 5.4903 KOps/s 5.2296 KOps/s $\color{#35bf28}+4.99\%$
test_values[td1_return_estimate-False-False] 29.0286ms 25.7003ms 38.9100 Ops/s 38.0875 Ops/s $\color{#35bf28}+2.16\%$
test_values[vec_td1_return_estimate-False-False] 35.1153ms 27.0400ms 36.9822 Ops/s 37.0105 Ops/s $\color{#d91a1a}-0.08\%$
test_values[td_lambda_return_estimate-True-False] 39.3461ms 35.8667ms 27.8810 Ops/s 27.3360 Ops/s $\color{#35bf28}+1.99\%$
test_values[vec_td_lambda_return_estimate-True-False] 29.7598ms 26.9875ms 37.0543 Ops/s 36.7774 Ops/s $\color{#35bf28}+0.75\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.6699ms 8.0671ms 123.9607 Ops/s 122.8571 Ops/s $\color{#35bf28}+0.90\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 10.2080ms 1.9279ms 518.6966 Ops/s 537.2974 Ops/s $\color{#d91a1a}-3.46\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.9329ms 0.4346ms 2.3009 KOps/s 2.2268 KOps/s $\color{#35bf28}+3.33\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.5832ms 41.9191ms 23.8555 Ops/s 24.0415 Ops/s $\color{#d91a1a}-0.77\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 14.2950ms 2.5761ms 388.1824 Ops/s 380.8575 Ops/s $\color{#35bf28}+1.92\%$
test_dqn_speed 9.8079ms 1.6331ms 612.3369 Ops/s 594.1852 Ops/s $\color{#35bf28}+3.05\%$
test_ddpg_speed 71.2263ms 3.8577ms 259.2195 Ops/s 268.4188 Ops/s $\color{#d91a1a}-3.43\%$
test_sac_speed 20.5992ms 10.3371ms 96.7385 Ops/s 94.4894 Ops/s $\color{#35bf28}+2.38\%$
test_redq_speed 28.1397ms 19.5305ms 51.2019 Ops/s 49.8548 Ops/s $\color{#35bf28}+2.70\%$
test_redq_deprec_speed 24.0428ms 15.6032ms 64.0896 Ops/s 61.6659 Ops/s $\color{#35bf28}+3.93\%$
test_td3_speed 20.4095ms 10.8063ms 92.5383 Ops/s 92.2641 Ops/s $\color{#35bf28}+0.30\%$
test_cql_speed 48.1028ms 39.8325ms 25.1052 Ops/s 25.4156 Ops/s $\color{#d91a1a}-1.22\%$
test_a2c_speed 19.1046ms 9.4116ms 106.2522 Ops/s 118.8773 Ops/s $\textbf{\color{#d91a1a}-10.62\%}$
test_ppo_speed 17.8085ms 9.3178ms 107.3211 Ops/s 104.5874 Ops/s $\color{#35bf28}+2.61\%$
test_reinforce_speed 16.3640ms 7.9716ms 125.4448 Ops/s 136.4411 Ops/s $\textbf{\color{#d91a1a}-8.06\%}$
test_iql_speed 40.1942ms 35.9148ms 27.8436 Ops/s 25.1169 Ops/s $\textbf{\color{#35bf28}+10.86\%}$
test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.3538ms 1.9761ms 506.0437 Ops/s 502.2963 Ops/s $\color{#35bf28}+0.75\%$
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 4.0801ms 2.1393ms 467.4330 Ops/s 469.3691 Ops/s $\color{#d91a1a}-0.41\%$
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3.2409ms 2.1527ms 464.5332 Ops/s 466.9623 Ops/s $\color{#d91a1a}-0.52\%$
test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.2850ms 2.0104ms 497.4044 Ops/s 490.8601 Ops/s $\color{#35bf28}+1.33\%$
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 4.0234ms 2.1529ms 464.4804 Ops/s 457.9699 Ops/s $\color{#35bf28}+1.42\%$
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3.2733ms 2.1124ms 473.4018 Ops/s 441.1758 Ops/s $\textbf{\color{#35bf28}+7.30\%}$
test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.7834ms 1.9894ms 502.6672 Ops/s 479.1786 Ops/s $\color{#35bf28}+4.90\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 4.3548ms 2.1193ms 471.8471 Ops/s 456.1630 Ops/s $\color{#35bf28}+3.44\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 3.9645ms 2.1368ms 467.9822 Ops/s 449.3190 Ops/s $\color{#35bf28}+4.15\%$
test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.8649ms 1.9666ms 508.4811 Ops/s 474.8271 Ops/s $\textbf{\color{#35bf28}+7.09\%}$
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.3831ms 2.1389ms 467.5281 Ops/s 448.9442 Ops/s $\color{#35bf28}+4.14\%$
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.4158ms 2.1624ms 462.4452 Ops/s 447.7233 Ops/s $\color{#35bf28}+3.29\%$
test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.5222ms 1.9838ms 504.0851 Ops/s 474.6371 Ops/s $\textbf{\color{#35bf28}+6.20\%}$
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.1122s 2.4370ms 410.3397 Ops/s 449.8016 Ops/s $\textbf{\color{#d91a1a}-8.77\%}$
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.0112ms 2.1225ms 471.1508 Ops/s 447.0627 Ops/s $\textbf{\color{#35bf28}+5.39\%}$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.7439ms 1.9991ms 500.2231 Ops/s 478.7203 Ops/s $\color{#35bf28}+4.49\%$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.1143s 2.3663ms 422.5926 Ops/s 474.7698 Ops/s $\textbf{\color{#d91a1a}-10.99\%}$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 3.5472ms 2.1104ms 473.8445 Ops/s 466.6356 Ops/s $\color{#35bf28}+1.54\%$
test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1869s 18.3514ms 54.4918 Ops/s 54.9259 Ops/s $\color{#d91a1a}-0.79\%$
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 0.1189s 16.7683ms 59.6363 Ops/s 58.4530 Ops/s $\color{#35bf28}+2.02\%$
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 0.1239s 17.2224ms 58.0640 Ops/s 57.8410 Ops/s $\color{#35bf28}+0.39\%$
test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1207s 17.3825ms 57.5290 Ops/s 58.9312 Ops/s $\color{#d91a1a}-2.38\%$
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.1132s 16.7818ms 59.5884 Ops/s 51.7953 Ops/s $\textbf{\color{#35bf28}+15.05\%}$
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 0.1213s 19.4010ms 51.5439 Ops/s 57.8207 Ops/s $\textbf{\color{#d91a1a}-10.86\%}$
test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1164s 16.9749ms 58.9103 Ops/s 57.8676 Ops/s $\color{#35bf28}+1.80\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 0.1196s 17.1722ms 58.2335 Ops/s 57.6165 Ops/s $\color{#35bf28}+1.07\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 0.1205s 17.1275ms 58.3856 Ops/s 59.0988 Ops/s $\color{#d91a1a}-1.21\%$
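
The Change column above appears consistent with the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below recomputes it for two rows copied from the table; the formula is inferred from the numbers, not taken from the benchmark bot's source, and `percent_change` is a hypothetical helper.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR against repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) in Ops/s, copied from the CPU table above.
rows = {
    "test_single": (15.9449, 14.5577),
    "test_a2c_speed": (106.2522, 118.8773),
}

changes = {name: percent_change(pr, head) for name, (pr, head) in rows.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

Both rows reproduce the tabulated values (+9.53% and -10.62%) to two decimals. Differences of a few percent in either direction are within the run-to-run noise one would expect from shared CI runners, so the bolded rows deserve more attention than the rest.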

github-actions bot commented Dec 4, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1246s 0.1241s 8.0574 Ops/s 8.2438 Ops/s $\color{#d91a1a}-2.26\%$
test_sync 0.1034s 0.1029s 9.7184 Ops/s 9.7607 Ops/s $\color{#d91a1a}-0.43\%$
test_async 0.2671s 0.1001s 9.9922 Ops/s 9.9664 Ops/s $\color{#35bf28}+0.26\%$
test_single_pixels 0.1339s 0.1338s 7.4756 Ops/s 6.8648 Ops/s $\textbf{\color{#35bf28}+8.90\%}$
test_sync_pixels 97.8031ms 95.3772ms 10.4847 Ops/s 10.3141 Ops/s $\color{#35bf28}+1.65\%$
test_async_pixels 0.2488s 92.3169ms 10.8323 Ops/s 10.9344 Ops/s $\color{#d91a1a}-0.93\%$
test_simple 0.9849s 0.9275s 1.0781 Ops/s 1.1335 Ops/s $\color{#d91a1a}-4.88\%$
test_transformed 1.2305s 1.1567s 0.8645 Ops/s 0.8807 Ops/s $\color{#d91a1a}-1.83\%$
test_serial 2.5702s 2.5260s 0.3959 Ops/s 0.4093 Ops/s $\color{#d91a1a}-3.27\%$
test_parallel 2.6192s 2.5304s 0.3952 Ops/s 0.4014 Ops/s $\color{#d91a1a}-1.54\%$
test_step_mdp_speed[True-True-True-True-True] 0.1049ms 34.8320μs 28.7093 KOps/s 28.8768 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-True-True-True-False] 40.0700μs 20.8464μs 47.9699 KOps/s 49.6185 KOps/s $\color{#d91a1a}-3.32\%$
test_step_mdp_speed[True-True-True-False-True] 49.6300μs 21.0500μs 47.5060 KOps/s 48.7968 KOps/s $\color{#d91a1a}-2.65\%$
test_step_mdp_speed[True-True-True-False-False] 26.6500μs 12.1604μs 82.2340 KOps/s 83.8089 KOps/s $\color{#d91a1a}-1.88\%$
test_step_mdp_speed[True-True-False-True-True] 52.8510μs 36.4417μs 27.4411 KOps/s 27.8361 KOps/s $\color{#d91a1a}-1.42\%$
test_step_mdp_speed[True-True-False-True-False] 45.1400μs 22.4641μs 44.5156 KOps/s 45.7299 KOps/s $\color{#d91a1a}-2.66\%$
test_step_mdp_speed[True-True-False-False-True] 49.8300μs 22.5394μs 44.3667 KOps/s 46.3775 KOps/s $\color{#d91a1a}-4.34\%$
test_step_mdp_speed[True-True-False-False-False] 41.5010μs 14.0452μs 71.1985 KOps/s 73.9069 KOps/s $\color{#d91a1a}-3.66\%$
test_step_mdp_speed[True-False-True-True-True] 92.7810μs 38.9622μs 25.6659 KOps/s 26.5136 KOps/s $\color{#d91a1a}-3.20\%$
test_step_mdp_speed[True-False-True-True-False] 71.5100μs 24.8561μs 40.2315 KOps/s 42.1898 KOps/s $\color{#d91a1a}-4.64\%$
test_step_mdp_speed[True-False-True-False-True] 47.7500μs 23.0098μs 43.4598 KOps/s 45.5730 KOps/s $\color{#d91a1a}-4.64\%$
test_step_mdp_speed[True-False-True-False-False] 47.7000μs 14.0245μs 71.3039 KOps/s 73.6083 KOps/s $\color{#d91a1a}-3.13\%$
test_step_mdp_speed[True-False-False-True-True] 66.9010μs 39.9618μs 25.0239 KOps/s 25.4867 KOps/s $\color{#d91a1a}-1.82\%$
test_step_mdp_speed[True-False-False-True-False] 51.2410μs 26.3365μs 37.9701 KOps/s 39.3713 KOps/s $\color{#d91a1a}-3.56\%$
test_step_mdp_speed[True-False-False-False-True] 47.7500μs 24.1207μs 41.4582 KOps/s 42.7098 KOps/s $\color{#d91a1a}-2.93\%$
test_step_mdp_speed[True-False-False-False-False] 35.3400μs 15.8227μs 63.2005 KOps/s 64.9858 KOps/s $\color{#d91a1a}-2.75\%$
test_step_mdp_speed[False-True-True-True-True] 82.7610μs 38.4463μs 26.0103 KOps/s 26.2966 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[False-True-True-True-False] 47.8200μs 24.5997μs 40.6509 KOps/s 41.8399 KOps/s $\color{#d91a1a}-2.84\%$
test_step_mdp_speed[False-True-True-False-True] 72.4200μs 27.2759μs 36.6624 KOps/s 38.1819 KOps/s $\color{#d91a1a}-3.98\%$
test_step_mdp_speed[False-True-True-False-False] 41.6500μs 15.9834μs 62.5649 KOps/s 64.6384 KOps/s $\color{#d91a1a}-3.21\%$
test_step_mdp_speed[False-True-False-True-True] 73.7810μs 40.5305μs 24.6728 KOps/s 25.4177 KOps/s $\color{#d91a1a}-2.93\%$
test_step_mdp_speed[False-True-False-True-False] 46.9310μs 26.7629μs 37.3652 KOps/s 38.1053 KOps/s $\color{#d91a1a}-1.94\%$
test_step_mdp_speed[False-True-False-False-True] 48.7210μs 28.9070μs 34.5937 KOps/s 35.9514 KOps/s $\color{#d91a1a}-3.78\%$
test_step_mdp_speed[False-True-False-False-False] 47.8800μs 17.6805μs 56.5594 KOps/s 56.7215 KOps/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[False-False-True-True-True] 68.6400μs 42.2318μs 23.6788 KOps/s 24.0662 KOps/s $\color{#d91a1a}-1.61\%$
test_step_mdp_speed[False-False-True-True-False] 51.9510μs 28.3186μs 35.3125 KOps/s 36.4980 KOps/s $\color{#d91a1a}-3.25\%$
test_step_mdp_speed[False-False-True-False-True] 48.6610μs 29.0453μs 34.4290 KOps/s 35.3839 KOps/s $\color{#d91a1a}-2.70\%$
test_step_mdp_speed[False-False-True-False-False] 35.3900μs 17.8427μs 56.0452 KOps/s 58.3523 KOps/s $\color{#d91a1a}-3.95\%$
test_step_mdp_speed[False-False-False-True-True] 67.4610μs 43.6586μs 22.9050 KOps/s 23.9502 KOps/s $\color{#d91a1a}-4.36\%$
test_step_mdp_speed[False-False-False-True-False] 88.6010μs 30.4188μs 32.8745 KOps/s 33.9310 KOps/s $\color{#d91a1a}-3.11\%$
test_step_mdp_speed[False-False-False-False-True] 59.4000μs 29.9306μs 33.4106 KOps/s 34.4182 KOps/s $\color{#d91a1a}-2.93\%$
test_step_mdp_speed[False-False-False-False-False] 44.7510μs 19.5267μs 51.2118 KOps/s 53.3536 KOps/s $\color{#d91a1a}-4.01\%$
test_values[generalized_advantage_estimate-True-True] 27.6619ms 27.0432ms 36.9779 Ops/s 38.5409 Ops/s $\color{#d91a1a}-4.06\%$
test_values[vec_generalized_advantage_estimate-True-True] 86.7742ms 3.3151ms 301.6544 Ops/s 302.7924 Ops/s $\color{#d91a1a}-0.38\%$
test_values[td0_return_estimate-False-False] 0.1042ms 68.8091μs 14.5330 KOps/s 15.0000 KOps/s $\color{#d91a1a}-3.11\%$
test_values[td1_return_estimate-False-False] 62.6175ms 59.8386ms 16.7116 Ops/s 17.9212 Ops/s $\textbf{\color{#d91a1a}-6.75\%}$
test_values[vec_td1_return_estimate-False-False] 2.1397ms 1.7800ms 561.8031 Ops/s 572.5005 Ops/s $\color{#d91a1a}-1.87\%$
test_values[td_lambda_return_estimate-True-False] 0.1002s 98.7090ms 10.1308 Ops/s 11.1887 Ops/s $\textbf{\color{#d91a1a}-9.46\%}$
test_values[vec_td_lambda_return_estimate-True-False] 2.0876ms 1.7493ms 571.6680 Ops/s 574.8102 Ops/s $\color{#d91a1a}-0.55\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 27.5966ms 26.7664ms 37.3602 Ops/s 40.5709 Ops/s $\textbf{\color{#d91a1a}-7.91\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9589ms 0.7522ms 1.3295 KOps/s 1.3613 KOps/s $\color{#d91a1a}-2.34\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7705ms 0.7014ms 1.4258 KOps/s 1.4508 KOps/s $\color{#d91a1a}-1.72\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6045ms 1.5104ms 662.0960 Ops/s 672.9552 Ops/s $\color{#d91a1a}-1.61\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.9900ms 0.7331ms 1.3641 KOps/s 1.3909 KOps/s $\color{#d91a1a}-1.93\%$
test_dqn_speed 7.7457ms 1.5067ms 663.6880 Ops/s 678.6054 Ops/s $\color{#d91a1a}-2.20\%$
test_ddpg_speed 4.9502ms 3.4123ms 293.0592 Ops/s 300.9290 Ops/s $\color{#d91a1a}-2.62\%$
test_sac_speed 96.2614ms 10.2829ms 97.2487 Ops/s 107.4227 Ops/s $\textbf{\color{#d91a1a}-9.47\%}$
test_redq_speed 17.2936ms 16.9051ms 59.1536 Ops/s 60.3703 Ops/s $\color{#d91a1a}-2.02\%$
test_redq_deprec_speed 14.2551ms 13.0887ms 76.4018 Ops/s 76.6130 Ops/s $\color{#d91a1a}-0.28\%$
test_td3_speed 19.6205ms 9.7142ms 102.9425 Ops/s 105.1175 Ops/s $\color{#d91a1a}-2.07\%$
test_cql_speed 35.8099ms 32.9382ms 30.3599 Ops/s 27.9281 Ops/s $\textbf{\color{#35bf28}+8.71\%}$
test_a2c_speed 8.6589ms 7.2498ms 137.9349 Ops/s 138.3235 Ops/s $\color{#d91a1a}-0.28\%$
test_ppo_speed 8.9269ms 7.5575ms 132.3186 Ops/s 133.1047 Ops/s $\color{#d91a1a}-0.59\%$
test_reinforce_speed 7.6861ms 6.3048ms 158.6085 Ops/s 160.5845 Ops/s $\color{#d91a1a}-1.23\%$
test_iql_speed 28.4410ms 27.0913ms 36.9122 Ops/s 36.3459 Ops/s $\color{#35bf28}+1.56\%$
test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.2406ms 2.5476ms 392.5314 Ops/s 404.1517 Ops/s $\color{#d91a1a}-2.88\%$
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 4.2354ms 2.7301ms 366.2932 Ops/s 327.2441 Ops/s $\textbf{\color{#35bf28}+11.93\%}$
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.3929ms 2.7301ms 366.2929 Ops/s 375.1621 Ops/s $\color{#d91a1a}-2.36\%$
test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.1915ms 2.5376ms 394.0747 Ops/s 402.6426 Ops/s $\color{#d91a1a}-2.13\%$
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 4.4054ms 2.7317ms 366.0672 Ops/s 375.2155 Ops/s $\color{#d91a1a}-2.44\%$
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3.8511ms 2.7091ms 369.1267 Ops/s 375.7782 Ops/s $\color{#d91a1a}-1.77\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.7695ms 2.5520ms 391.8431 Ops/s 402.3783 Ops/s $\color{#d91a1a}-2.62\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.8325ms 2.7199ms 367.6573 Ops/s 374.4111 Ops/s $\color{#d91a1a}-1.80\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 3.7769ms 2.7300ms 366.3049 Ops/s 375.0594 Ops/s $\color{#d91a1a}-2.33\%$
test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.9586ms 2.5379ms 394.0203 Ops/s 401.0669 Ops/s $\color{#d91a1a}-1.76\%$
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.9010ms 2.7097ms 369.0388 Ops/s 376.0903 Ops/s $\color{#d91a1a}-1.87\%$
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.0341ms 2.7338ms 365.7881 Ops/s 374.0346 Ops/s $\color{#d91a1a}-2.20\%$
test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.8023ms 2.5251ms 396.0299 Ops/s 401.4215 Ops/s $\color{#d91a1a}-1.34\%$
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 4.3179ms 2.7255ms 366.9082 Ops/s 375.4196 Ops/s $\color{#d91a1a}-2.27\%$
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3.7825ms 2.7171ms 368.0343 Ops/s 375.1655 Ops/s $\color{#d91a1a}-1.90\%$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.1214ms 2.5462ms 392.7465 Ops/s 398.5689 Ops/s $\color{#d91a1a}-1.46\%$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.6569ms 2.7077ms 369.3199 Ops/s 372.4728 Ops/s $\color{#d91a1a}-0.85\%$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 4.0558ms 2.7148ms 368.3484 Ops/s 373.1185 Ops/s $\color{#d91a1a}-1.28\%$
test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.2083s 19.9178ms 50.2064 Ops/s 51.5522 Ops/s $\color{#d91a1a}-2.61\%$
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 0.1314s 18.1877ms 54.9824 Ops/s 56.4261 Ops/s $\color{#d91a1a}-2.56\%$
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 0.1303s 18.1021ms 55.2421 Ops/s 65.4430 Ops/s $\textbf{\color{#d91a1a}-15.59\%}$
test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1308s 18.0332ms 55.4534 Ops/s 50.0844 Ops/s $\textbf{\color{#35bf28}+10.72\%}$
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.1294s 15.8762ms 62.9873 Ops/s 65.4923 Ops/s $\color{#d91a1a}-3.82\%$
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 0.1291s 18.2128ms 54.9064 Ops/s 56.3839 Ops/s $\color{#d91a1a}-2.62\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1290s 18.1999ms 54.9453 Ops/s 56.4062 Ops/s $\color{#d91a1a}-2.59\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 0.1290s 18.1626ms 55.0581 Ops/s 56.4406 Ops/s $\color{#d91a1a}-2.45\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 0.1310s 18.0712ms 55.3366 Ops/s 56.8930 Ops/s $\color{#d91a1a}-2.74\%$

@vmoens vmoens merged commit f6188e3 into main Dec 4, 2023
58 of 61 checks passed
@vmoens vmoens deleted the fix-buffer-len branch December 4, 2023 12:10
Development

Successfully merging this pull request may close these issues.

[BUG] Using LazyTensorStorage messes up the indexing of the replay-buffer