
[BugFix] Fix safe probabilistic backward by removing in-place modif #2755

Merged: 2 commits, Feb 4, 2025


@vmoens (Contributor) commented Feb 4, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 4, 2025
ghstack-source-id: 811e138999c191b34c8869b9623874c289738daf
Pull Request resolved: #2755

pytorch-bot bot commented Feb 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2755

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Feb 4, 2025. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)

github-actions bot commented Feb 4, 2025

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 4. Worsened: 2.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.5255s | 0.4447s | 2.2489 Ops/s | 2.2035 Ops/s | +2.06% |
| test_transformed | 0.9812s | 0.9032s | 1.1072 Ops/s | 1.0956 Ops/s | +1.07% |
| test_serial | 1.4519s | 1.3761s | 0.7267 Ops/s | 0.7179 Ops/s | +1.23% |
| test_parallel | 1.2761s | 1.1988s | 0.8342 Ops/s | 0.8246 Ops/s | +1.16% |
| test_step_mdp_speed[True-True-True-True-True] | 0.2104ms | 29.9108μs | 33.4328 KOps/s | 33.0618 KOps/s | +1.12% |
| test_step_mdp_speed[True-True-True-True-False] | 44.3340μs | 17.6725μs | 56.5850 KOps/s | 55.7284 KOps/s | +1.54% |
| test_step_mdp_speed[True-True-True-False-True] | 61.4260μs | 17.0339μs | 58.7063 KOps/s | 58.3094 KOps/s | +0.68% |
| test_step_mdp_speed[True-True-True-False-False] | 41.4780μs | 10.0284μs | 99.7172 KOps/s | 97.5659 KOps/s | +2.20% |
| test_step_mdp_speed[True-True-False-True-True] | 78.5680μs | 32.4991μs | 30.7701 KOps/s | 30.7232 KOps/s | +0.15% |
| test_step_mdp_speed[True-True-False-True-False] | 52.9700μs | 19.5749μs | 51.0858 KOps/s | 50.7015 KOps/s | +0.76% |
| test_step_mdp_speed[True-True-False-False-True] | 46.4070μs | 18.7509μs | 53.3307 KOps/s | 52.5806 KOps/s | +1.43% |
| test_step_mdp_speed[True-True-False-False-False] | 43.5720μs | 11.8237μs | 84.5760 KOps/s | 83.9281 KOps/s | +0.77% |
| test_step_mdp_speed[True-False-True-True-True] | 85.0010μs | 33.6462μs | 29.7210 KOps/s | 29.7087 KOps/s | +0.04% |
| test_step_mdp_speed[True-False-True-True-False] | 67.4070μs | 21.4881μs | 46.5374 KOps/s | 46.2213 KOps/s | +0.68% |
| test_step_mdp_speed[True-False-True-False-True] | 54.0720μs | 18.5138μs | 54.0137 KOps/s | 52.1133 KOps/s | +3.65% |
| test_step_mdp_speed[True-False-True-False-False] | 56.1750μs | 11.8758μs | 84.2052 KOps/s | 83.5502 KOps/s | +0.78% |
| test_step_mdp_speed[True-False-False-True-True] | 76.9750μs | 35.0807μs | 28.5057 KOps/s | 28.0010 KOps/s | +1.80% |
| test_step_mdp_speed[True-False-False-True-False] | 79.5520μs | 22.9402μs | 43.5916 KOps/s | 42.8415 KOps/s | +1.75% |
| test_step_mdp_speed[True-False-False-False-True] | 70.3100μs | 20.3333μs | 49.1803 KOps/s | 48.4322 KOps/s | +1.54% |
| test_step_mdp_speed[True-False-False-False-False] | 49.2030μs | 13.5311μs | 73.9037 KOps/s | 72.4814 KOps/s | +1.96% |
| test_step_mdp_speed[False-True-True-True-True] | 77.2650μs | 33.6632μs | 29.7061 KOps/s | 29.1895 KOps/s | +1.77% |
| test_step_mdp_speed[False-True-True-True-False] | 98.3450μs | 21.4154μs | 46.6954 KOps/s | 45.6964 KOps/s | +2.19% |
| test_step_mdp_speed[False-True-True-False-True] | 53.8220μs | 21.1927μs | 47.1861 KOps/s | 46.1110 KOps/s | +2.33% |
| test_step_mdp_speed[False-True-True-False-False] | 35.5870μs | 13.2369μs | 75.5464 KOps/s | 74.5845 KOps/s | +1.29% |
| test_step_mdp_speed[False-True-False-True-True] | 95.7710μs | 35.2800μs | 28.3447 KOps/s | 27.8321 KOps/s | +1.84% |
| test_step_mdp_speed[False-True-False-True-False] | 56.4370μs | 23.1238μs | 43.2456 KOps/s | 42.9234 KOps/s | +0.75% |
| test_step_mdp_speed[False-True-False-False-True] | 2.5674ms | 23.1168μs | 43.2585 KOps/s | 42.9071 KOps/s | +0.82% |
| test_step_mdp_speed[False-True-False-False-False] | 42.4290μs | 14.7874μs | 67.6252 KOps/s | 65.3338 KOps/s | +3.51% |
| test_step_mdp_speed[False-False-True-True-True] | 75.5730μs | 36.6878μs | 27.2570 KOps/s | 26.2866 KOps/s | +3.69% |
| test_step_mdp_speed[False-False-True-True-False] | 67.5280μs | 24.7209μs | 40.4516 KOps/s | 40.0797 KOps/s | +0.93% |
| test_step_mdp_speed[False-False-True-False-True] | 56.1060μs | 22.9269μs | 43.6169 KOps/s | 43.0568 KOps/s | +1.30% |
| test_step_mdp_speed[False-False-True-False-False] | 49.1820μs | 14.8904μs | 67.1572 KOps/s | 65.9657 KOps/s | +1.81% |
| test_step_mdp_speed[False-False-False-True-True] | 85.0100μs | 38.4638μs | 25.9985 KOps/s | 25.5744 KOps/s | +1.66% |
| test_step_mdp_speed[False-False-False-True-False] | 72.8170μs | 26.4680μs | 37.7814 KOps/s | 37.1677 KOps/s | +1.65% |
| test_step_mdp_speed[False-False-False-False-True] | 74.7870μs | 24.1850μs | 41.3480 KOps/s | 40.1130 KOps/s | +3.08% |
| test_step_mdp_speed[False-False-False-False-False] | 49.1970μs | 16.4465μs | 60.8030 KOps/s | 59.4890 KOps/s | +2.21% |
| test_values[generalized_advantage_estimate-True-True] | 9.9933ms | 9.5538ms | 104.6708 Ops/s | 102.2934 Ops/s | +2.32% |
| test_values[vec_generalized_advantage_estimate-True-True] | 28.0266ms | 23.7306ms | 42.1397 Ops/s | 42.1087 Ops/s | +0.07% |
| test_values[td0_return_estimate-False-False] | 0.2183ms | 0.1730ms | 5.7819 KOps/s | 5.6033 KOps/s | +3.19% |
| test_values[td1_return_estimate-False-False] | 27.7404ms | 23.8013ms | 42.0145 Ops/s | 41.7251 Ops/s | +0.69% |
| test_values[vec_td1_return_estimate-False-False] | 26.5612ms | 23.9684ms | 41.7216 Ops/s | 41.8479 Ops/s | -0.30% |
| test_values[td_lambda_return_estimate-True-False] | 36.4562ms | 34.2894ms | 29.1635 Ops/s | 28.9789 Ops/s | +0.64% |
| test_values[vec_td_lambda_return_estimate-True-False] | 25.5969ms | 23.8514ms | 41.9263 Ops/s | 41.7145 Ops/s | +0.51% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 10.1451ms | 8.5038ms | 117.5947 Ops/s | 116.5222 Ops/s | +0.92% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.3194ms | 1.9160ms | 521.9104 Ops/s | 515.4118 Ops/s | +1.26% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5054ms | 0.3602ms | 2.7761 KOps/s | 2.7022 KOps/s | +2.73% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 44.9637ms | 40.8847ms | 24.4590 Ops/s | 24.7872 Ops/s | -1.32% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.2654ms | 3.4233ms | 292.1167 Ops/s | 291.1025 Ops/s | +0.35% |
| test_dqn_speed[False-None] | 1.8873ms | 1.3892ms | 719.8133 Ops/s | 704.9594 Ops/s | +2.11% |
| test_dqn_speed[False-backward] | 1.8964ms | 1.8638ms | 536.5387 Ops/s | 529.9945 Ops/s | +1.23% |
| test_dqn_speed[True-None] | 0.8001ms | 0.4850ms | 2.0619 KOps/s | 2.0576 KOps/s | +0.21% |
| test_dqn_speed[True-backward] | 0.9478ms | 0.8916ms | 1.1216 KOps/s | 809.7112 Ops/s | **+38.52%** |
| test_dqn_speed[reduce-overhead-None] | 0.5827ms | 0.4746ms | 2.1072 KOps/s | 2.0804 KOps/s | +1.29% |
| test_dqn_speed[reduce-overhead-backward] | 0.9272ms | 0.8965ms | 1.1154 KOps/s | 1.1046 KOps/s | +0.98% |
| test_ddpg_speed[False-None] | 3.1304ms | 2.8726ms | 348.1158 Ops/s | 351.1313 Ops/s | -0.86% |
| test_ddpg_speed[False-backward] | 4.2726ms | 4.0280ms | 248.2651 Ops/s | 251.8005 Ops/s | -1.40% |
| test_ddpg_speed[True-None] | 1.4679ms | 1.2034ms | 830.9947 Ops/s | 820.8912 Ops/s | +1.23% |
| test_ddpg_speed[True-backward] | 2.1518ms | 2.0858ms | 479.4283 Ops/s | 472.9926 Ops/s | +1.36% |
| test_ddpg_speed[reduce-overhead-None] | 1.6700ms | 1.1996ms | 833.6360 Ops/s | 805.8960 Ops/s | +3.44% |
| test_ddpg_speed[reduce-overhead-backward] | 2.1389ms | 2.0862ms | 479.3502 Ops/s | 470.4980 Ops/s | +1.88% |
| test_sac_speed[False-None] | 8.8741ms | 7.9223ms | 126.2259 Ops/s | 125.7954 Ops/s | +0.34% |
| test_sac_speed[False-backward] | 12.7627ms | 10.6764ms | 93.6642 Ops/s | 94.3048 Ops/s | -0.68% |
| test_sac_speed[True-None] | 2.3584ms | 2.0687ms | 483.4004 Ops/s | 482.0601 Ops/s | +0.28% |
| test_sac_speed[True-backward] | 3.8018ms | 3.7387ms | 267.4719 Ops/s | 267.0713 Ops/s | +0.15% |
| test_sac_speed[reduce-overhead-None] | 2.6388ms | 2.0800ms | 480.7659 Ops/s | 477.5660 Ops/s | +0.67% |
| test_sac_speed[reduce-overhead-backward] | 3.8491ms | 3.7544ms | 266.3549 Ops/s | 262.3918 Ops/s | +1.51% |
| test_redq_speed[False-None] | 14.6492ms | 12.5551ms | 79.6487 Ops/s | 79.3821 Ops/s | +0.34% |
| test_redq_speed[False-backward] | 23.9763ms | 21.6901ms | 46.1040 Ops/s | 46.2192 Ops/s | -0.25% |
| test_redq_speed[True-None] | 5.7283ms | 4.7222ms | 211.7652 Ops/s | 211.5349 Ops/s | +0.11% |
| test_redq_speed[True-backward] | 12.2467ms | 11.8254ms | 84.5637 Ops/s | 80.3044 Ops/s | **+5.30%** |
| test_redq_speed[reduce-overhead-None] | 5.8546ms | 4.7523ms | 210.4235 Ops/s | 211.4250 Ops/s | -0.47% |
| test_redq_speed[reduce-overhead-backward] | 13.3710ms | 11.7791ms | 84.8962 Ops/s | 82.9670 Ops/s | +2.33% |
| test_redq_deprec_speed[False-None] | 13.9812ms | 12.5289ms | 79.8156 Ops/s | 78.0187 Ops/s | +2.30% |
| test_redq_deprec_speed[False-backward] | 20.3723ms | 17.9374ms | 55.7493 Ops/s | 54.1551 Ops/s | +2.94% |
| test_redq_deprec_speed[True-None] | 4.2200ms | 3.7751ms | 264.8925 Ops/s | 258.0511 Ops/s | +2.65% |
| test_redq_deprec_speed[True-backward] | 8.1442ms | 8.0248ms | 124.6138 Ops/s | 123.2420 Ops/s | +1.11% |
| test_redq_deprec_speed[reduce-overhead-None] | 4.5526ms | 3.7775ms | 264.7225 Ops/s | 260.6117 Ops/s | +1.58% |
| test_redq_deprec_speed[reduce-overhead-backward] | 9.3762ms | 8.1537ms | 122.6433 Ops/s | 122.4183 Ops/s | +0.18% |
| test_td3_speed[False-None] | 8.4186ms | 8.0001ms | 124.9978 Ops/s | 124.6548 Ops/s | +0.28% |
| test_td3_speed[False-backward] | 11.4866ms | 10.4477ms | 95.7150 Ops/s | 95.8216 Ops/s | -0.11% |
| test_td3_speed[True-None] | 1.9809ms | 1.7878ms | 559.3442 Ops/s | 555.3671 Ops/s | +0.72% |
| test_td3_speed[True-backward] | 4.3268ms | 3.4302ms | 291.5242 Ops/s | 294.3218 Ops/s | -0.95% |
| test_td3_speed[reduce-overhead-None] | 1.8831ms | 1.7695ms | 565.1365 Ops/s | 546.6022 Ops/s | +3.39% |
| test_td3_speed[reduce-overhead-backward] | 4.2345ms | 3.4184ms | 292.5304 Ops/s | 292.5208 Ops/s | +0.00% |
| test_cql_speed[False-None] | 37.4799ms | 35.8121ms | 27.9236 Ops/s | 26.9009 Ops/s | +3.80% |
| test_cql_speed[False-backward] | 51.1261ms | 45.9555ms | 21.7602 Ops/s | 21.3864 Ops/s | +1.75% |
| test_cql_speed[True-None] | 16.9085ms | 15.4450ms | 64.7459 Ops/s | 63.4370 Ops/s | +2.06% |
| test_cql_speed[True-backward] | 24.2432ms | 22.0298ms | 45.3931 Ops/s | 45.3183 Ops/s | +0.17% |
| test_cql_speed[reduce-overhead-None] | 17.2997ms | 15.7327ms | 63.5619 Ops/s | 63.2708 Ops/s | +0.46% |
| test_cql_speed[reduce-overhead-backward] | 24.1743ms | 21.8847ms | 45.6940 Ops/s | 45.0402 Ops/s | +1.45% |
| test_a2c_speed[False-None] | 8.0333ms | 7.1129ms | 140.5899 Ops/s | 139.2673 Ops/s | +0.95% |
| test_a2c_speed[False-backward] | 16.5395ms | 13.9796ms | 71.5326 Ops/s | 70.1356 Ops/s | +1.99% |
| test_a2c_speed[True-None] | 4.2413ms | 3.6580ms | 273.3769 Ops/s | 270.4710 Ops/s | +1.07% |
| test_a2c_speed[True-backward] | 10.2850ms | 9.9408ms | 100.5953 Ops/s | 99.3165 Ops/s | +1.29% |
| test_a2c_speed[reduce-overhead-None] | 4.2184ms | 3.6806ms | 271.6946 Ops/s | 268.6384 Ops/s | +1.14% |
| test_a2c_speed[reduce-overhead-backward] | 10.3024ms | 9.9588ms | 100.4137 Ops/s | 99.5501 Ops/s | +0.87% |
| test_ppo_speed[False-None] | 8.0958ms | 7.4216ms | 134.7418 Ops/s | 134.9443 Ops/s | -0.15% |
| test_ppo_speed[False-backward] | 16.2659ms | 14.6174ms | 68.4114 Ops/s | 68.7327 Ops/s | -0.47% |
| test_ppo_speed[True-None] | 4.8345ms | 4.0553ms | 246.5913 Ops/s | 246.0904 Ops/s | +0.20% |
| test_ppo_speed[True-backward] | 10.2161ms | 9.8354ms | 101.6731 Ops/s | 101.6915 Ops/s | -0.02% |
| test_ppo_speed[reduce-overhead-None] | 4.7054ms | 4.0556ms | 246.5756 Ops/s | 246.5045 Ops/s | +0.03% |
| test_ppo_speed[reduce-overhead-backward] | 11.1062ms | 9.8563ms | 101.4581 Ops/s | 101.8269 Ops/s | -0.36% |
| test_reinforce_speed[False-None] | 7.2493ms | 6.4915ms | 154.0470 Ops/s | 154.3674 Ops/s | -0.21% |
| test_reinforce_speed[False-backward] | 10.0012ms | 9.7182ms | 102.8993 Ops/s | 102.9680 Ops/s | -0.07% |
| test_reinforce_speed[True-None] | 3.3742ms | 2.9916ms | 334.2709 Ops/s | 330.4856 Ops/s | +1.15% |
| test_reinforce_speed[True-backward] | 9.4914ms | 8.8282ms | 113.2731 Ops/s | 113.5075 Ops/s | -0.21% |
| test_reinforce_speed[reduce-overhead-None] | 3.5509ms | 3.0239ms | 330.7001 Ops/s | 332.7847 Ops/s | -0.63% |
| test_reinforce_speed[reduce-overhead-backward] | 9.2558ms | 8.8526ms | 112.9610 Ops/s | 112.7396 Ops/s | +0.20% |
| test_iql_speed[False-None] | 34.5179ms | 31.9658ms | 31.2834 Ops/s | 30.5099 Ops/s | +2.54% |
| test_iql_speed[False-backward] | 47.4206ms | 44.6524ms | 22.3952 Ops/s | 22.5381 Ops/s | -0.63% |
| test_iql_speed[True-None] | 11.9601ms | 10.8414ms | 92.2392 Ops/s | 92.2245 Ops/s | +0.02% |
| test_iql_speed[True-backward] | 22.6727ms | 21.4278ms | 46.6684 Ops/s | 46.3883 Ops/s | +0.60% |
| test_iql_speed[reduce-overhead-None] | 12.3333ms | 10.9498ms | 91.3260 Ops/s | 91.1275 Ops/s | +0.22% |
| test_iql_speed[reduce-overhead-backward] | 22.6702ms | 21.3370ms | 46.8670 Ops/s | 46.3472 Ops/s | +1.12% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.8034ms | 4.6991ms | 212.8073 Ops/s | 208.4996 Ops/s | +2.07% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8723ms | 0.5298ms | 1.8874 KOps/s | 1.8690 KOps/s | +0.99% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.8764ms | 0.5068ms | 1.9732 KOps/s | 1.9767 KOps/s | -0.18% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.3591ms | 4.4501ms | 224.7154 Ops/s | 221.3065 Ops/s | +1.54% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.5912ms | 0.5190ms | 1.9266 KOps/s | 1.9109 KOps/s | +0.82% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8429ms | 0.4978ms | 2.0090 KOps/s | 2.0002 KOps/s | +0.44% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.2302ms | 1.6959ms | 589.6582 Ops/s | 585.4170 Ops/s | +0.72% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.5053ms | 1.6059ms | 622.7122 Ops/s | 619.4332 Ops/s | +0.53% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.3887ms | 4.6346ms | 215.7672 Ops/s | 211.9783 Ops/s | +1.79% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.1882ms | 0.6721ms | 1.4878 KOps/s | 1.4981 KOps/s | -0.69% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0352ms | 0.6359ms | 1.5726 KOps/s | 1.5523 KOps/s | +1.30% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.1483ms | 4.5001ms | 222.2167 Ops/s | 217.7837 Ops/s | +2.04% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.0486ms | 0.5323ms | 1.8786 KOps/s | 1.9000 KOps/s | -1.12% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6894ms | 0.4991ms | 2.0035 KOps/s | 1.9573 KOps/s | +2.36% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.3069ms | 4.4982ms | 222.3128 Ops/s | 221.8638 Ops/s | +0.20% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.0997ms | 0.5299ms | 1.8871 KOps/s | 1.9297 KOps/s | -2.21% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7023ms | 0.4950ms | 2.0203 KOps/s | 1.9894 KOps/s | +1.56% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.9781ms | 4.5676ms | 218.9325 Ops/s | 212.8215 Ops/s | +2.87% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.3602ms | 0.6661ms | 1.5012 KOps/s | 1.4997 KOps/s | +0.10% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8855ms | 0.6429ms | 1.5555 KOps/s | 1.5745 KOps/s | -1.20% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 5.4976ms | 4.2119ms | 237.4220 Ops/s | 246.4269 Ops/s | -3.65% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 16.5456ms | 2.4865ms | 402.1765 Ops/s | 425.8886 Ops/s | **-5.57%** |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 5.8364ms | 1.3992ms | 714.7089 Ops/s | 720.2299 Ops/s | -0.77% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 5.5708ms | 4.2974ms | 232.6973 Ops/s | 248.2509 Ops/s | **-6.27%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 7.1458ms | 2.3383ms | 427.6696 Ops/s | 432.9330 Ops/s | -1.22% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.9242ms | 1.2249ms | 816.3854 Ops/s | 766.5540 Ops/s | **+6.50%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.4150s | 12.6803ms | 78.8624 Ops/s | 36.6282 Ops/s | **+115.30%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 8.6262ms | 2.5345ms | 394.5548 Ops/s | 390.3830 Ops/s | +1.07% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.1041ms | 1.4314ms | 698.6340 Ops/s | 698.1918 Ops/s | +0.06% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 12.0447ms | 11.7781ms | 84.9031 Ops/s | 84.3517 Ops/s | +0.65% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 15.6297ms | 14.0898ms | 70.9733 Ops/s | 69.9151 Ops/s | +1.51% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 22.5189ms | 20.7026ms | 48.3031 Ops/s | 47.8233 Ops/s | +1.00% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 19.4307ms | 14.6196ms | 68.4012 Ops/s | 66.8252 Ops/s | +2.36% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 23.0030ms | 20.4935ms | 48.7959 Ops/s | 48.2719 Ops/s | +1.09% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 17.1027ms | 15.6686ms | 63.8220 Ops/s | 62.6995 Ops/s | +1.79% |

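The Change column in the CPU results above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces a few of the reported values under that assumption. The helper names (`parse_ops`, `relative_change`, `is_flagged`) are hypothetical and not part of the benchmark tooling, and the 5% threshold is only inferred from the rows that are emphasized in the table and counted as Improved or Worsened.

```python
def parse_ops(text: str) -> float:
    """Convert a throughput string such as '1.1216 KOps/s' to operations per second."""
    value, unit = text.split()
    # Only the two units that occur in the table are handled here.
    scale = {"Ops/s": 1.0, "KOps/s": 1_000.0}[unit]
    return float(value) * scale


def relative_change(ops: str, ops_head: str) -> float:
    """Percentage change of the PR throughput relative to the repo HEAD throughput."""
    return (parse_ops(ops) / parse_ops(ops_head) - 1.0) * 100.0


def is_flagged(change_pct: float, threshold: float = 5.0) -> bool:
    """Assumption: a row counts as improved or worsened when |change| >= threshold."""
    return abs(change_pct) >= threshold


if __name__ == "__main__":
    # (Ops, Ops on Repo HEAD) pairs copied from the CPU table.
    rows = {
        "test_simple": ("2.2489 Ops/s", "2.2035 Ops/s"),
        "test_dqn_speed[True-backward]": ("1.1216 KOps/s", "809.7112 Ops/s"),
        "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (
            "232.6973 Ops/s",
            "248.2509 Ops/s",
        ),
    }
    for name, (ops, head) in rows.items():
        change = relative_change(ops, head)
        print(f"{name}: {change:+.2f}% flagged={is_flagged(change)}")
```

With these inputs the computed changes round to +2.06%, +38.52% and -6.27%, matching the table, and only the last two cross the assumed 5% threshold.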

github-actions bot commented Feb 4, 2025

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 16. Worsened: 6.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.8323s | 0.7468s | 1.3390 Ops/s | 1.3277 Ops/s | +0.85% |
| test_transformed | 1.3120s | 1.3110s | 0.7628 Ops/s | 0.7336 Ops/s | +3.98% |
| test_serial | 2.1562s | 2.1552s | 0.4640 Ops/s | 0.4570 Ops/s | +1.53% |
| test_parallel | 1.8612s | 1.8387s | 0.5439 Ops/s | 0.5361 Ops/s | +1.45% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1901ms | 40.7692μs | 24.5283 KOps/s | 25.5086 KOps/s | -3.84% |
| test_step_mdp_speed[True-True-True-True-False] | 51.5400μs | 23.5904μs | 42.3901 KOps/s | 43.2176 KOps/s | -1.91% |
| test_step_mdp_speed[True-True-True-False-True] | 51.3910μs | 22.5444μs | 44.3568 KOps/s | 45.3093 KOps/s | -2.10% |
| test_step_mdp_speed[True-True-True-False-False] | 47.7310μs | 13.1050μs | 76.3068 KOps/s | 77.7515 KOps/s | -1.86% |
| test_step_mdp_speed[True-True-False-True-True] | 79.5210μs | 43.5642μs | 22.9546 KOps/s | 23.6756 KOps/s | -3.05% |
| test_step_mdp_speed[True-True-False-True-False] | 64.9100μs | 25.9688μs | 38.5077 KOps/s | 39.4405 KOps/s | -2.37% |
| test_step_mdp_speed[True-True-False-False-True] | 60.9710μs | 25.1950μs | 39.6904 KOps/s | 40.8254 KOps/s | -2.78% |
| test_step_mdp_speed[True-True-False-False-False] | 79.8510μs | 15.2321μs | 65.6509 KOps/s | 65.5058 KOps/s | +0.22% |
| test_step_mdp_speed[True-False-True-True-True] | 78.6610μs | 45.2628μs | 22.0932 KOps/s | 22.3691 KOps/s | -1.23% |
| test_step_mdp_speed[True-False-True-True-False] | 59.7210μs | 28.3407μs | 35.2849 KOps/s | 35.7555 KOps/s | -1.32% |
| test_step_mdp_speed[True-False-True-False-True] | 52.9110μs | 25.2730μs | 39.5679 KOps/s | 40.0704 KOps/s | -1.25% |
| test_step_mdp_speed[True-False-True-False-False] | 44.6410μs | 15.5258μs | 64.4090 KOps/s | 65.3844 KOps/s | -1.49% |
| test_step_mdp_speed[True-False-False-True-True] | 80.3810μs | 47.2022μs | 21.1855 KOps/s | 21.1897 KOps/s | -0.02% |
| test_step_mdp_speed[True-False-False-True-False] | 59.3910μs | 30.0831μs | 33.2412 KOps/s | 33.1261 KOps/s | +0.35% |
| test_step_mdp_speed[True-False-False-False-True] | 60.8000μs | 26.8709μs | 37.2150 KOps/s | 37.3774 KOps/s | -0.43% |
| test_step_mdp_speed[True-False-False-False-False] | 50.4610μs | 17.4297μs | 57.3734 KOps/s | 57.2564 KOps/s | +0.20% |
| test_step_mdp_speed[False-True-True-True-True] | 81.8310μs | 44.5451μs | 22.4492 KOps/s | 22.3478 KOps/s | +0.45% |
| test_step_mdp_speed[False-True-True-True-False] | 64.3600μs | 27.9989μs | 35.7157 KOps/s | 35.7441 KOps/s | -0.08% |
| test_step_mdp_speed[False-True-True-False-True] | 2.5620ms | 28.9488μs | 34.5438 KOps/s | 35.3225 KOps/s | -2.20% |
| test_step_mdp_speed[False-True-True-False-False] | 44.7500μs | 17.2582μs | 57.9435 KOps/s | 58.8938 KOps/s | -1.61% |
| test_step_mdp_speed[False-True-False-True-True] | 79.9110μs | 46.9245μs | 21.3108 KOps/s | 21.1822 KOps/s | +0.61% |
| test_step_mdp_speed[False-True-False-True-False] | 60.7800μs | 30.4244μs | 32.8684 KOps/s | 33.2728 KOps/s | -1.22% |
| test_step_mdp_speed[False-True-False-False-True] | 73.7610μs | 31.1394μs | 32.1136 KOps/s | 32.3297 KOps/s | -0.67% |
| test_step_mdp_speed[False-True-False-False-False] | 45.0400μs | 19.2905μs | 51.8391 KOps/s | 52.2328 KOps/s | -0.75% |
| test_step_mdp_speed[False-False-True-True-True] | 93.3010μs | 50.4253μs | 19.8313 KOps/s | 20.3501 KOps/s | -2.55% |
| test_step_mdp_speed[False-False-True-True-False] | 61.6010μs | 32.8216μs | 30.4678 KOps/s | 30.5859 KOps/s | -0.39% |
| test_step_mdp_speed[False-False-True-False-True] | 54.8900μs | 31.3485μs | 31.8995 KOps/s | 32.8886 KOps/s | -3.01% |
| test_step_mdp_speed[False-False-True-False-False] | 53.1410μs | 19.4007μs | 51.5446 KOps/s | 51.8750 KOps/s | -0.64% |
| test_step_mdp_speed[False-False-False-True-True] | 85.5410μs | 51.7629μs | 19.3189 KOps/s | 19.7241 KOps/s | -2.05% |
| test_step_mdp_speed[False-False-False-True-False] | 62.2600μs | 35.4362μs | 28.2197 KOps/s | 28.5769 KOps/s | -1.25% |
| test_step_mdp_speed[False-False-False-False-True] | 61.8910μs | 32.8815μs | 30.4123 KOps/s | 31.3527 KOps/s | -3.00% |
| test_step_mdp_speed[False-False-False-False-False] | 51.7610μs | 21.6393μs | 46.2121 KOps/s | 46.9510 KOps/s | -1.57% |
| test_values[generalized_advantage_estimate-True-True] | 25.9657ms | 25.5696ms | 39.1090 Ops/s | 40.2952 Ops/s | -2.94% |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1112s | 3.1299ms | 319.4986 Ops/s | 326.7605 Ops/s | -2.22% |
| test_values[td0_return_estimate-False-False] | 0.1076ms | 81.3694μs | 12.2896 KOps/s | 12.4679 KOps/s | -1.43% |
| test_values[td1_return_estimate-False-False] | 57.4062ms | 57.0077ms | 17.5415 Ops/s | 17.8868 Ops/s | -1.93% |
| test_values[vec_td1_return_estimate-False-False] | 1.3971ms | 1.0955ms | 912.8125 Ops/s | 907.0817 Ops/s | +0.63% |
| test_values[td_lambda_return_estimate-True-False] | 89.9691ms | 89.5759ms | 11.1637 Ops/s | 11.1409 Ops/s | +0.20% |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.4329ms | 1.1011ms | 908.1957 Ops/s | 909.8722 Ops/s | -0.18% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 25.6239ms | 25.5206ms | 39.1840 Ops/s | 40.0295 Ops/s | -2.11% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0396ms | 0.7626ms | 1.3112 KOps/s | 1.2964 KOps/s | +1.14% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7647ms | 0.6835ms | 1.4630 KOps/s | 1.4516 KOps/s | +0.78% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5586ms | 1.4948ms | 669.0024 Ops/s | 666.9877 Ops/s | +0.30% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7381ms | 0.6995ms | 1.4295 KOps/s | 1.4228 KOps/s | +0.47% |
| test_dqn_speed[False-None] | 6.8892ms | 1.5259ms | 655.3320 Ops/s | 641.2075 Ops/s | +2.20% |
| test_dqn_speed[False-backward] | 2.2030ms | 2.1582ms | 463.3492 Ops/s | 463.2111 Ops/s | +0.03% |
| test_dqn_speed[True-None] | 0.6173ms | 0.5560ms | 1.7985 KOps/s | 1.7782 KOps/s | +1.14% |
| test_dqn_speed[True-backward] | 1.3178ms | 1.2389ms | 807.1961 Ops/s | 807.7705 Ops/s | -0.07% |
| test_dqn_speed[reduce-overhead-None] | 0.6730ms | 0.5916ms | 1.6904 KOps/s | 1.6927 KOps/s | -0.13% |
| test_dqn_speed[reduce-overhead-backward] | 1.1633ms | 1.0839ms | 922.5682 Ops/s | 928.0469 Ops/s | -0.59% |
| test_ddpg_speed[False-None] | 3.2207ms | 2.9125ms | 343.3442 Ops/s | 342.1525 Ops/s | +0.35% |
| test_ddpg_speed[False-backward] | 4.6454ms | 4.3082ms | 232.1181 Ops/s | 229.6784 Ops/s | +1.06% |
| test_ddpg_speed[True-None] | 1.4296ms | 1.3458ms | 743.0281 Ops/s | 744.3034 Ops/s | -0.17% |
| test_ddpg_speed[True-backward] | 2.6219ms | 2.5693ms | 389.2099 Ops/s | 409.9300 Ops/s | **-5.05%** |
| test_ddpg_speed[reduce-overhead-None] | 1.4347ms | 1.3606ms | 734.9798 Ops/s | 734.6568 Ops/s | +0.04% |
| test_ddpg_speed[reduce-overhead-backward] | 2.1150ms | 2.0410ms | 489.9671 Ops/s | 526.1490 Ops/s | **-6.88%** |
| test_sac_speed[False-None] | 8.5758ms | 8.1374ms | 122.8892 Ops/s | 121.5916 Ops/s | +1.07% |
| test_sac_speed[False-backward] | 11.8285ms | 11.3491ms | 88.1125 Ops/s | 89.6034 Ops/s | -1.66% |
| test_sac_speed[True-None] | 2.0187ms | 1.8472ms | 541.3655 Ops/s | 542.4644 Ops/s | -0.20% |
| test_sac_speed[True-backward] | 3.7735ms | 3.7179ms | 268.9678 Ops/s | 266.7108 Ops/s | +0.85% |
| test_sac_speed[reduce-overhead-None] | 21.6266ms | 12.0964ms | 82.6694 Ops/s | 81.4288 Ops/s | +1.52% |
| test_sac_speed[reduce-overhead-backward] | 1.8301ms | 1.7927ms | 557.8103 Ops/s | 602.7297 Ops/s | **-7.45%** |
| test_redq_speed[False-None] | 8.0729ms | 7.5958ms | 131.6511 Ops/s | 129.3581 Ops/s | +1.77% |
| test_redq_speed[False-backward] | 12.1346ms | 11.8490ms | 84.3953 Ops/s | 86.1221 Ops/s | -2.01% |
| test_redq_speed[True-None] | 2.3778ms | 2.3181ms | 431.3826 Ops/s | 421.4511 Ops/s | +2.36% |
| test_redq_speed[True-backward] | 4.5928ms | 4.1862ms | 238.8802 Ops/s | 246.3685 Ops/s | -3.04% |
| test_redq_speed[reduce-overhead-None] | 2.4394ms | 2.3422ms | 426.9569 Ops/s | 423.5761 Ops/s | +0.80% |
| test_redq_speed[reduce-overhead-backward] | 4.6042ms | 4.2080ms | 237.6439 Ops/s | 243.7673 Ops/s | -2.51% |
| test_redq_deprec_speed[False-None] | 9.4912ms | 9.1661ms | 109.0980 Ops/s | 106.9599 Ops/s | +2.00% |
| test_redq_deprec_speed[False-backward] | 12.9972ms | 12.4573ms | 80.2742 Ops/s | 81.3868 Ops/s | -1.37% |
| test_redq_deprec_speed[True-None] | 2.8191ms | 2.6617ms | 375.6929 Ops/s | 372.4681 Ops/s | +0.87% |
| test_redq_deprec_speed[True-backward] | 4.9569ms | 4.4980ms | 222.3205 Ops/s | 227.5759 Ops/s | -2.31% |
| test_redq_deprec_speed[reduce-overhead-None] | 2.7855ms | 2.6571ms | 376.3570 Ops/s | 372.9561 Ops/s | +0.91% |
| test_redq_deprec_speed[reduce-overhead-backward] | 4.5478ms | 4.4856ms | 222.9338 Ops/s | 227.3537 Ops/s | -1.94% |
| test_td3_speed[False-None] | 8.1080ms | 8.0460ms | 124.2855 Ops/s | 122.3680 Ops/s | +1.57% |
| test_td3_speed[False-backward] | 11.4116ms | 10.6360ms | 94.0204 Ops/s | 95.4188 Ops/s | -1.47% |
| test_td3_speed[True-None] | 1.6593ms | 1.6322ms | 612.6640 Ops/s | 588.9845 Ops/s | +4.02% |
| test_td3_speed[True-backward] | 3.6780ms | 3.3333ms | 300.0065 Ops/s | 313.8801 Ops/s | -4.42% |
| test_td3_speed[reduce-overhead-None] | 54.2811ms | 26.2408ms | 38.1086 Ops/s | 36.4740 Ops/s | +4.48% |
| test_td3_speed[reduce-overhead-backward] | 1.5540ms | 1.5009ms | 666.2810 Ops/s | 720.2203 Ops/s | **-7.49%** |
| test_cql_speed[False-None] | 17.4292ms | 16.9417ms | 59.0259 Ops/s | 58.5520 Ops/s | +0.81% |
| test_cql_speed[False-backward] | 23.0464ms | 22.5752ms | 44.2964 Ops/s | 44.8301 Ops/s | -1.19% |
| test_cql_speed[True-None] | 3.3108ms | 3.2689ms | 305.9167 Ops/s | 311.7893 Ops/s | -1.88% |
| test_cql_speed[True-backward] | 5.9685ms | 5.5297ms | 180.8405 Ops/s | 182.8845 Ops/s | -1.12% |
| test_cql_speed[reduce-overhead-None] | 21.0723ms | 13.1234ms | 76.1995 Ops/s | 59.1506 Ops/s | **+28.82%** |
| test_cql_speed[reduce-overhead-backward] | 1.9415ms | 1.8269ms | 547.3693 Ops/s | 520.5789 Ops/s | **+5.15%** |
| test_a2c_speed[False-None] | 3.3770ms | 3.2248ms | 310.0976 Ops/s | 310.0433 Ops/s | +0.02% |
| test_a2c_speed[False-backward] | 6.9133ms | 6.2230ms | 160.6948 Ops/s | 158.9332 Ops/s | +1.11% |
| test_a2c_speed[True-None] | 1.4863ms | 1.3507ms | 740.3837 Ops/s | 754.4621 Ops/s | -1.87% |
| test_a2c_speed[True-backward] | 2.9396ms | 2.8991ms | 344.9342 Ops/s | 325.4443 Ops/s | **+5.99%** |
| test_a2c_speed[reduce-overhead-None] | 16.0906ms | 9.0491ms | 110.5080 Ops/s | 112.8471 Ops/s | -2.07% |
| test_a2c_speed[reduce-overhead-backward] | 1.5389ms | 1.4621ms | 683.9305 Ops/s | 617.8533 Ops/s | **+10.69%** |
| test_ppo_speed[False-None] | 3.8359ms | 3.7411ms | 267.3016 Ops/s | 268.4104 Ops/s | -0.41% |
| test_ppo_speed[False-backward] | 7.3684ms | 6.8980ms | 144.9703 Ops/s | 141.1658 Ops/s | +2.70% |
| test_ppo_speed[True-None] | 1.4839ms | 1.4140ms | 707.2270 Ops/s | 711.4749 Ops/s | -0.60% |
| test_ppo_speed[True-backward] | 3.1776ms | 3.0593ms | 326.8680 Ops/s | 326.7704 Ops/s | +0.03% |
| test_ppo_speed[reduce-overhead-None] | 1.0340ms | 0.9635ms | 1.0379 KOps/s | 1.0443 KOps/s | -0.61% |
| test_ppo_speed[reduce-overhead-backward] | 1.4680ms | 1.4205ms | 703.9965 Ops/s | 686.7150 Ops/s | +2.52% |
| test_reinforce_speed[False-None] | 2.4532ms | 2.2996ms | 434.8505 Ops/s | 431.9245 Ops/s | +0.68% |
| test_reinforce_speed[False-backward] | 3.9882ms | 3.3305ms | 300.2530 Ops/s | 300.9263 Ops/s | -0.22% |
| test_reinforce_speed[True-None] | 1.3551ms | 1.2971ms | 770.9455 Ops/s | 771.7593 Ops/s | -0.11% |
| test_reinforce_speed[True-backward] | 3.0420ms | 2.9326ms | 340.9937 Ops/s | 346.5744 Ops/s | -1.61% |
| test_reinforce_speed[reduce-overhead-None] | 18.4000ms | 10.1633ms | 98.3929 Ops/s | 100.1730 Ops/s | -1.78% |
| test_reinforce_speed[reduce-overhead-backward] | 1.5522ms | 1.4773ms | 676.9178 Ops/s | 658.8311 Ops/s | +2.75% |
| test_iql_speed[False-None] | 9.7517ms | 9.2534ms | 108.0684 Ops/s | 107.0041 Ops/s | +0.99% |
| test_iql_speed[False-backward] | 13.5586ms | 12.9371ms | 77.2973 Ops/s | 77.3345 Ops/s | -0.05% |
| test_iql_speed[True-None] | 2.2926ms | 2.2320ms | 448.0228 Ops/s | 436.6456 Ops/s | +2.61% |
| test_iql_speed[True-backward] | 5.2208ms | 4.7659ms | 209.8228 Ops/s | 206.2758 Ops/s | +1.72% |
| test_iql_speed[reduce-overhead-None] | 19.1238ms | 11.2825ms | 88.6331 Ops/s | 88.6134 Ops/s | +0.02% |
| test_iql_speed[reduce-overhead-backward] | 1.9554ms | 1.9034ms | 525.3868 Ops/s | 504.5765 Ops/s | +4.12% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.9287ms | 6.3343ms | 157.8705 Ops/s | 155.0219 Ops/s | +1.84% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.5731ms | 0.2658ms | 3.7618 KOps/s | 3.1623 KOps/s | **+18.96%** |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5021ms | 0.2452ms | 4.0780 KOps/s | 3.3533 KOps/s | **+21.61%** |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.3543ms | 6.0944ms | 164.0861 Ops/s | 162.7491 Ops/s | +0.82% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.0386ms | 0.2664ms | 3.7536 KOps/s | 3.6305 KOps/s | +3.39% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5615ms | 0.2681ms | 3.7296 KOps/s | 3.8868 KOps/s | -4.04% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6559ms | 1.2600ms | 793.6225 Ops/s | 755.7918 Ops/s | **+5.01%** |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4386ms 1.2040ms 830.5705 Ops/s 807.2213 Ops/s $\color{#35bf28}+2.89\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7823ms 6.2607ms 159.7269 Ops/s 156.4786 Ops/s $\color{#35bf28}+2.08\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0760ms 0.4083ms 2.4494 KOps/s 2.2519 KOps/s $\textbf{\color{#35bf28}+8.77\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6312ms 0.3846ms 2.5999 KOps/s 2.2970 KOps/s $\textbf{\color{#35bf28}+13.19\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2390ms 6.1207ms 163.3809 Ops/s 161.7188 Ops/s $\color{#35bf28}+1.03\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0421ms 0.3440ms 2.9074 KOps/s 3.0567 KOps/s $\color{#d91a1a}-4.89\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5269ms 0.3106ms 3.2191 KOps/s 3.2169 KOps/s $\color{#35bf28}+0.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 9.5212ms 6.0933ms 164.1142 Ops/s 159.8252 Ops/s $\color{#35bf28}+2.68\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5019ms 0.3459ms 2.8911 KOps/s 3.3240 KOps/s $\textbf{\color{#d91a1a}-13.02\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5798ms 0.3219ms 3.1064 KOps/s 3.4944 KOps/s $\textbf{\color{#d91a1a}-11.10\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4228ms 6.2393ms 160.2744 Ops/s 156.3127 Ops/s $\color{#35bf28}+2.53\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7454ms 0.4103ms 2.4370 KOps/s 2.0386 KOps/s $\textbf{\color{#35bf28}+19.54\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5732ms 0.3871ms 2.5832 KOps/s 2.1255 KOps/s $\textbf{\color{#35bf28}+21.53\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1051ms 5.4818ms 182.4212 Ops/s 177.7297 Ops/s $\color{#35bf28}+2.64\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.8805ms 1.8071ms 553.3825 Ops/s 432.4884 Ops/s $\textbf{\color{#35bf28}+27.95\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.9374ms 1.2880ms 776.4149 Ops/s 791.8176 Ops/s $\color{#d91a1a}-1.95\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.3209ms 5.6099ms 178.2549 Ops/s 180.2165 Ops/s $\color{#d91a1a}-1.09\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.6529ms 2.0523ms 487.2490 Ops/s 430.0232 Ops/s $\textbf{\color{#35bf28}+13.31\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.2096ms 1.2230ms 817.6396 Ops/s 841.2088 Ops/s $\color{#d91a1a}-2.80\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4912s 15.4498ms 64.7259 Ops/s 31.5056 Ops/s $\textbf{\color{#35bf28}+105.44\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.3258ms 2.2251ms 449.4137 Ops/s 445.4273 Ops/s $\color{#35bf28}+0.89\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.0191ms 1.3469ms 742.4705 Ops/s 733.6291 Ops/s $\color{#35bf28}+1.21\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.9154ms 12.6542ms 79.0254 Ops/s 72.3979 Ops/s $\textbf{\color{#35bf28}+9.15\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.6557ms 16.9198ms 59.1022 Ops/s 58.1572 Ops/s $\color{#35bf28}+1.62\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.2492ms 17.6684ms 56.5981 Ops/s 55.6824 Ops/s $\color{#35bf28}+1.64\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.9537ms 17.2152ms 58.0883 Ops/s 58.1064 Ops/s $\color{#d91a1a}-0.03\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.3803ms 17.6863ms 56.5410 Ops/s 53.3515 Ops/s $\textbf{\color{#35bf28}+5.98\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.8738ms 18.6536ms 53.6089 Ops/s 52.7834 Ops/s $\color{#35bf28}+1.56\%$

vmoens added a commit that referenced this pull request Feb 4, 2025
ghstack-source-id: 574eb1f9b662c1eb5be25e97020e11b3fadf625e
Pull Request resolved: #2755
@vmoens vmoens merged commit 8a8280f into gh/vmoens/85/base Feb 4, 2025
1 check passed
@vmoens vmoens deleted the gh/vmoens/85/head branch February 4, 2025 12:53