[BugFix] NonTensor should not convert anything to numpy #2771

Open
wants to merge 3 commits into base: gh/vmoens/88/base

vmoens (Contributor) commented Feb 7, 2025

[ghstack-poisoned]

pytorch-bot bot commented Feb 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2771

Note: Links to docs will display an error until the docs builds have been completed.

❌ 10 New Failures, 3 Unrelated Failures

As of commit 901c999 with merge base 75f113f:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Feb 7, 2025
ghstack-source-id: 1ea6daf2e1253a5db5ef163b85ff8810b84fd19e
Pull Request resolved: #2771
@facebook-github-bot added the `CLA Signed` label Feb 7, 2025. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 7, 2025
ghstack-source-id: 6006ff9b7edde96f785a599e30945c1c2d53fa97
Pull Request resolved: #2771
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 7, 2025
ghstack-source-id: 10a19e08499c21b6fbbe55133619a18624d01678
Pull Request resolved: #2771

github-actions bot commented Feb 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}11$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.6056s 0.5093s 1.9636 Ops/s 2.1958 Ops/s $\textbf{\color{#d91a1a}-10.58\%}$
test_transformed 1.0997s 0.9978s 1.0022 Ops/s 1.0656 Ops/s $\textbf{\color{#d91a1a}-5.95\%}$
test_serial 1.6220s 1.5149s 0.6601 Ops/s 0.7202 Ops/s $\textbf{\color{#d91a1a}-8.34\%}$
test_parallel 1.4026s 1.3004s 0.7690 Ops/s 0.8067 Ops/s $\color{#d91a1a}-4.67\%$
test_step_mdp_speed[True-True-True-True-True] 0.2050ms 30.2204μs 33.0902 KOps/s 33.5397 KOps/s $\color{#d91a1a}-1.34\%$
test_step_mdp_speed[True-True-True-True-False] 58.6100μs 18.1475μs 55.1041 KOps/s 56.6608 KOps/s $\color{#d91a1a}-2.75\%$
test_step_mdp_speed[True-True-True-False-True] 70.0520μs 16.9892μs 58.8609 KOps/s 59.5809 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[True-True-True-False-False] 37.1400μs 10.0866μs 99.1411 KOps/s 100.0379 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[True-True-False-True-True] 63.2890μs 32.4005μs 30.8637 KOps/s 31.6583 KOps/s $\color{#d91a1a}-2.51\%$
test_step_mdp_speed[True-True-False-True-False] 49.6140μs 20.2220μs 49.4512 KOps/s 50.7676 KOps/s $\color{#d91a1a}-2.59\%$
test_step_mdp_speed[True-True-False-False-True] 50.1840μs 19.0009μs 52.6292 KOps/s 52.8663 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[True-True-False-False-False] 50.4350μs 11.9501μs 83.6815 KOps/s 84.3591 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[True-False-True-True-True] 76.5740μs 34.4135μs 29.0584 KOps/s 29.4973 KOps/s $\color{#d91a1a}-1.49\%$
test_step_mdp_speed[True-False-True-True-False] 57.7890μs 21.9434μs 45.5719 KOps/s 46.7172 KOps/s $\color{#d91a1a}-2.45\%$
test_step_mdp_speed[True-False-True-False-True] 58.7810μs 18.9287μs 52.8297 KOps/s 53.6147 KOps/s $\color{#d91a1a}-1.46\%$
test_step_mdp_speed[True-False-True-False-False] 46.6470μs 11.9714μs 83.5325 KOps/s 85.3114 KOps/s $\color{#d91a1a}-2.09\%$
test_step_mdp_speed[True-False-False-True-True] 77.3460μs 36.2605μs 27.5783 KOps/s 28.3331 KOps/s $\color{#d91a1a}-2.66\%$
test_step_mdp_speed[True-False-False-True-False] 71.4140μs 23.7354μs 42.1311 KOps/s 43.0185 KOps/s $\color{#d91a1a}-2.06\%$
test_step_mdp_speed[True-False-False-False-True] 59.8420μs 21.2208μs 47.1237 KOps/s 49.1415 KOps/s $\color{#d91a1a}-4.11\%$
test_step_mdp_speed[True-False-False-False-False] 45.8570μs 13.8462μs 72.2218 KOps/s 73.4190 KOps/s $\color{#d91a1a}-1.63\%$
test_step_mdp_speed[False-True-True-True-True] 78.3670μs 34.5581μs 28.9368 KOps/s 29.8153 KOps/s $\color{#d91a1a}-2.95\%$
test_step_mdp_speed[False-True-True-True-False] 0.6197ms 22.7217μs 44.0107 KOps/s 46.4578 KOps/s $\textbf{\color{#d91a1a}-5.27\%}$
test_step_mdp_speed[False-True-True-False-True] 52.6590μs 21.7454μs 45.9868 KOps/s 46.3694 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[False-True-True-False-False] 55.8650μs 13.2383μs 75.5385 KOps/s 76.4591 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[False-True-False-True-True] 76.6740μs 35.6183μs 28.0755 KOps/s 28.3985 KOps/s $\color{#d91a1a}-1.14\%$
test_step_mdp_speed[False-True-False-True-False] 57.3580μs 23.6142μs 42.3474 KOps/s 42.7477 KOps/s $\color{#d91a1a}-0.94\%$
test_step_mdp_speed[False-True-False-False-True] 2.3523ms 23.5374μs 42.4856 KOps/s 42.6667 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[False-True-False-False-False] 45.1650μs 15.2856μs 65.4209 KOps/s 66.6069 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[False-False-True-True-True] 0.3333ms 37.8186μs 26.4420 KOps/s 26.7792 KOps/s $\color{#d91a1a}-1.26\%$
test_step_mdp_speed[False-False-True-True-False] 58.0790μs 25.3729μs 39.4122 KOps/s 39.6895 KOps/s $\color{#d91a1a}-0.70\%$
test_step_mdp_speed[False-False-True-False-True] 52.5180μs 23.3460μs 42.8340 KOps/s 42.2571 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[False-False-True-False-False] 43.3910μs 15.1460μs 66.0242 KOps/s 66.5204 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[False-False-False-True-True] 77.6560μs 38.7813μs 25.7856 KOps/s 25.7912 KOps/s $\color{#d91a1a}-0.02\%$
test_step_mdp_speed[False-False-False-True-False] 81.1730μs 27.1552μs 36.8253 KOps/s 37.0484 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[False-False-False-False-True] 55.6650μs 24.9146μs 40.1371 KOps/s 40.1423 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[False-False-False-False-False] 46.1160μs 16.7202μs 59.8079 KOps/s 59.9094 KOps/s $\color{#d91a1a}-0.17\%$
test_values[generalized_advantage_estimate-True-True] 10.1672ms 9.6773ms 103.3348 Ops/s 103.0280 Ops/s $\color{#35bf28}+0.30\%$
test_values[vec_generalized_advantage_estimate-True-True] 28.4987ms 26.2206ms 38.1379 Ops/s 40.3223 Ops/s $\textbf{\color{#d91a1a}-5.42\%}$
test_values[td0_return_estimate-False-False] 0.2459ms 0.1794ms 5.5752 KOps/s 5.5995 KOps/s $\color{#d91a1a}-0.43\%$
test_values[td1_return_estimate-False-False] 27.5514ms 23.9852ms 41.6923 Ops/s 40.5774 Ops/s $\color{#35bf28}+2.75\%$
test_values[vec_td1_return_estimate-False-False] 28.8484ms 26.2628ms 38.0766 Ops/s 40.1385 Ops/s $\textbf{\color{#d91a1a}-5.14\%}$
test_values[td_lambda_return_estimate-True-False] 36.1381ms 34.4403ms 29.0358 Ops/s 27.7091 Ops/s $\color{#35bf28}+4.79\%$
test_values[vec_td_lambda_return_estimate-True-False] 28.3036ms 26.3466ms 37.9556 Ops/s 38.8592 Ops/s $\color{#d91a1a}-2.33\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 11.5149ms 8.4836ms 117.8738 Ops/s 117.4341 Ops/s $\color{#35bf28}+0.37\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2097ms 1.9300ms 518.1321 Ops/s 530.8351 Ops/s $\color{#d91a1a}-2.39\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4358ms 0.3587ms 2.7877 KOps/s 2.7040 KOps/s $\color{#35bf28}+3.10\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 54.1924ms 46.0218ms 21.7289 Ops/s 21.6914 Ops/s $\color{#35bf28}+0.17\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.4643ms 3.4384ms 290.8334 Ops/s 290.7443 Ops/s $\color{#35bf28}+0.03\%$
test_dqn_speed[False-None] 5.9120ms 1.3943ms 717.2103 Ops/s 693.4013 Ops/s $\color{#35bf28}+3.43\%$
test_dqn_speed[False-backward] 1.9877ms 1.8662ms 535.8526 Ops/s 517.1731 Ops/s $\color{#35bf28}+3.61\%$
test_dqn_speed[True-None] 0.6910ms 0.4763ms 2.0993 KOps/s 1.9520 KOps/s $\textbf{\color{#35bf28}+7.54\%}$
test_dqn_speed[True-backward] 0.9567ms 0.8943ms 1.1182 KOps/s 987.5358 Ops/s $\textbf{\color{#35bf28}+13.23\%}$
test_dqn_speed[reduce-overhead-None] 0.7548ms 0.4779ms 2.0924 KOps/s 2.0266 KOps/s $\color{#35bf28}+3.25\%$
test_dqn_speed[reduce-overhead-backward] 0.9559ms 0.8915ms 1.1217 KOps/s 1.1008 KOps/s $\color{#35bf28}+1.90\%$
test_ddpg_speed[False-None] 3.3064ms 2.8996ms 344.8712 Ops/s 323.2023 Ops/s $\textbf{\color{#35bf28}+6.70\%}$
test_ddpg_speed[False-backward] 4.2601ms 4.0856ms 244.7622 Ops/s 239.6434 Ops/s $\color{#35bf28}+2.14\%$
test_ddpg_speed[True-None] 1.4129ms 1.2387ms 807.2996 Ops/s 803.4733 Ops/s $\color{#35bf28}+0.48\%$
test_ddpg_speed[True-backward] 2.2207ms 2.1306ms 469.3559 Ops/s 465.7077 Ops/s $\color{#35bf28}+0.78\%$
test_ddpg_speed[reduce-overhead-None] 1.4270ms 1.2366ms 808.6701 Ops/s 794.9026 Ops/s $\color{#35bf28}+1.73\%$
test_ddpg_speed[reduce-overhead-backward] 2.8699ms 2.1741ms 459.9527 Ops/s 454.5865 Ops/s $\color{#35bf28}+1.18\%$
test_sac_speed[False-None] 9.2049ms 8.0025ms 124.9611 Ops/s 122.2809 Ops/s $\color{#35bf28}+2.19\%$
test_sac_speed[False-backward] 11.9038ms 11.1152ms 89.9669 Ops/s 91.9734 Ops/s $\color{#d91a1a}-2.18\%$
test_sac_speed[True-None] 2.3125ms 2.0906ms 478.3316 Ops/s 468.3547 Ops/s $\color{#35bf28}+2.13\%$
test_sac_speed[True-backward] 5.0303ms 4.0820ms 244.9769 Ops/s 259.2037 Ops/s $\textbf{\color{#d91a1a}-5.49\%}$
test_sac_speed[reduce-overhead-None] 3.2405ms 2.1812ms 458.4528 Ops/s 456.8481 Ops/s $\color{#35bf28}+0.35\%$
test_sac_speed[reduce-overhead-backward] 3.9413ms 3.7662ms 265.5171 Ops/s 259.8172 Ops/s $\color{#35bf28}+2.19\%$
test_redq_speed[False-None] 14.0328ms 13.2360ms 75.5516 Ops/s 73.7450 Ops/s $\color{#35bf28}+2.45\%$
test_redq_speed[False-backward] 24.5749ms 22.4224ms 44.5982 Ops/s 43.4153 Ops/s $\color{#35bf28}+2.72\%$
test_redq_speed[True-None] 6.6985ms 5.1500ms 194.1733 Ops/s 191.2903 Ops/s $\color{#35bf28}+1.51\%$
test_redq_speed[True-backward] 14.1291ms 13.0068ms 76.8830 Ops/s 74.7744 Ops/s $\color{#35bf28}+2.82\%$
test_redq_speed[reduce-overhead-None] 6.1798ms 5.1371ms 194.6623 Ops/s 189.9775 Ops/s $\color{#35bf28}+2.47\%$
test_redq_speed[reduce-overhead-backward] 14.2146ms 13.2093ms 75.7040 Ops/s 74.1012 Ops/s $\color{#35bf28}+2.16\%$
test_redq_deprec_speed[False-None] 16.8014ms 14.2250ms 70.2989 Ops/s 73.4995 Ops/s $\color{#d91a1a}-4.35\%$
test_redq_deprec_speed[False-backward] 21.9668ms 19.9887ms 50.0282 Ops/s 51.7793 Ops/s $\color{#d91a1a}-3.38\%$
test_redq_deprec_speed[True-None] 4.8883ms 4.0875ms 244.6484 Ops/s 258.9875 Ops/s $\textbf{\color{#d91a1a}-5.54\%}$
test_redq_deprec_speed[True-backward] 9.8073ms 8.7613ms 114.1380 Ops/s 109.2639 Ops/s $\color{#35bf28}+4.46\%$
test_redq_deprec_speed[reduce-overhead-None] 4.7905ms 4.0114ms 249.2912 Ops/s 252.5087 Ops/s $\color{#d91a1a}-1.27\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.6468ms 8.8052ms 113.5687 Ops/s 111.7890 Ops/s $\color{#35bf28}+1.59\%$
test_td3_speed[False-None] 8.7861ms 8.0354ms 124.4492 Ops/s 120.9170 Ops/s $\color{#35bf28}+2.92\%$
test_td3_speed[False-backward] 11.3991ms 10.9397ms 91.4101 Ops/s 93.7334 Ops/s $\color{#d91a1a}-2.48\%$
test_td3_speed[True-None] 1.9829ms 1.7829ms 560.8836 Ops/s 541.5121 Ops/s $\color{#35bf28}+3.58\%$
test_td3_speed[True-backward] 4.0619ms 3.3840ms 295.5044 Ops/s 288.0153 Ops/s $\color{#35bf28}+2.60\%$
test_td3_speed[reduce-overhead-None] 1.9640ms 1.7726ms 564.1479 Ops/s 532.6747 Ops/s $\textbf{\color{#35bf28}+5.91\%}$
test_td3_speed[reduce-overhead-backward] 4.2388ms 3.6527ms 273.7726 Ops/s 289.3555 Ops/s $\textbf{\color{#d91a1a}-5.39\%}$
test_cql_speed[False-None] 42.6477ms 37.5048ms 26.6633 Ops/s 26.4844 Ops/s $\color{#35bf28}+0.68\%$
test_cql_speed[False-backward] 51.9938ms 48.3371ms 20.6880 Ops/s 20.9227 Ops/s $\color{#d91a1a}-1.12\%$
test_cql_speed[True-None] 17.0408ms 16.0907ms 62.1479 Ops/s 59.6691 Ops/s $\color{#35bf28}+4.15\%$
test_cql_speed[True-backward] 23.7465ms 22.8436ms 43.7760 Ops/s 42.2955 Ops/s $\color{#35bf28}+3.50\%$
test_cql_speed[reduce-overhead-None] 17.3431ms 16.2725ms 61.4533 Ops/s 60.3023 Ops/s $\color{#35bf28}+1.91\%$
test_cql_speed[reduce-overhead-backward] 25.8938ms 23.4045ms 42.7268 Ops/s 41.9846 Ops/s $\color{#35bf28}+1.77\%$
test_a2c_speed[False-None] 9.1277ms 7.4493ms 134.2401 Ops/s 133.1638 Ops/s $\color{#35bf28}+0.81\%$
test_a2c_speed[False-backward] 15.8500ms 14.8223ms 67.4661 Ops/s 66.0653 Ops/s $\color{#35bf28}+2.12\%$
test_a2c_speed[True-None] 4.5570ms 3.6693ms 272.5328 Ops/s 263.8427 Ops/s $\color{#35bf28}+3.29\%$
test_a2c_speed[True-backward] 11.0053ms 10.2373ms 97.6821 Ops/s 90.4390 Ops/s $\textbf{\color{#35bf28}+8.01\%}$
test_a2c_speed[reduce-overhead-None] 4.5507ms 3.7073ms 269.7404 Ops/s 253.8582 Ops/s $\textbf{\color{#35bf28}+6.26\%}$
test_a2c_speed[reduce-overhead-backward] 10.9014ms 10.2281ms 97.7694 Ops/s 96.7387 Ops/s $\color{#35bf28}+1.07\%$
test_ppo_speed[False-None] 8.6539ms 7.4397ms 134.4143 Ops/s 130.4223 Ops/s $\color{#35bf28}+3.06\%$
test_ppo_speed[False-backward] 21.3275ms 15.7870ms 63.3434 Ops/s 65.2789 Ops/s $\color{#d91a1a}-2.97\%$
test_ppo_speed[True-None] 4.8309ms 4.0645ms 246.0302 Ops/s 241.7474 Ops/s $\color{#35bf28}+1.77\%$
test_ppo_speed[True-backward] 11.2061ms 10.4583ms 95.6176 Ops/s 96.8446 Ops/s $\color{#d91a1a}-1.27\%$
test_ppo_speed[reduce-overhead-None] 4.4032ms 4.0962ms 244.1311 Ops/s 240.7357 Ops/s $\color{#35bf28}+1.41\%$
test_ppo_speed[reduce-overhead-backward] 11.0212ms 10.2755ms 97.3191 Ops/s 98.8217 Ops/s $\color{#d91a1a}-1.52\%$
test_reinforce_speed[False-None] 7.2030ms 6.5467ms 152.7483 Ops/s 150.1616 Ops/s $\color{#35bf28}+1.72\%$
test_reinforce_speed[False-backward] 10.6322ms 10.0549ms 99.4544 Ops/s 100.4575 Ops/s $\color{#d91a1a}-1.00\%$
test_reinforce_speed[True-None] 3.5646ms 3.0779ms 324.8916 Ops/s 318.6564 Ops/s $\color{#35bf28}+1.96\%$
test_reinforce_speed[True-backward] 9.9169ms 9.2265ms 108.3834 Ops/s 107.7524 Ops/s $\color{#35bf28}+0.59\%$
test_reinforce_speed[reduce-overhead-None] 3.7981ms 3.0272ms 330.3407 Ops/s 321.9323 Ops/s $\color{#35bf28}+2.61\%$
test_reinforce_speed[reduce-overhead-backward] 9.7570ms 9.1968ms 108.7329 Ops/s 106.8165 Ops/s $\color{#35bf28}+1.79\%$
test_iql_speed[False-None] 40.5450ms 33.1120ms 30.2006 Ops/s 30.0915 Ops/s $\color{#35bf28}+0.36\%$
test_iql_speed[False-backward] 48.6861ms 46.5219ms 21.4953 Ops/s 21.2752 Ops/s $\color{#35bf28}+1.03\%$
test_iql_speed[True-None] 14.8569ms 11.7201ms 85.3233 Ops/s 85.0365 Ops/s $\color{#35bf28}+0.34\%$
test_iql_speed[True-backward] 24.1864ms 23.0618ms 43.3617 Ops/s 42.7052 Ops/s $\color{#35bf28}+1.54\%$
test_iql_speed[reduce-overhead-None] 12.7418ms 11.6293ms 85.9894 Ops/s 84.6279 Ops/s $\color{#35bf28}+1.61\%$
test_iql_speed[reduce-overhead-backward] 24.0320ms 22.5851ms 44.2770 Ops/s 42.8770 Ops/s $\color{#35bf28}+3.27\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.5832ms 4.9579ms 201.7003 Ops/s 195.5986 Ops/s $\color{#35bf28}+3.12\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7933ms 0.5300ms 1.8867 KOps/s 1.8839 KOps/s $\color{#35bf28}+0.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8367ms 0.5118ms 1.9540 KOps/s 1.9936 KOps/s $\color{#d91a1a}-1.99\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3554ms 4.9423ms 202.3333 Ops/s 205.7912 Ops/s $\color{#d91a1a}-1.68\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.2379ms 0.5277ms 1.8950 KOps/s 1.8154 KOps/s $\color{#35bf28}+4.39\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8501ms 0.4978ms 2.0090 KOps/s 2.0079 KOps/s $\color{#35bf28}+0.06\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9757ms 1.6442ms 608.2057 Ops/s 594.3829 Ops/s $\color{#35bf28}+2.33\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3269ms 1.5603ms 640.8872 Ops/s 627.5702 Ops/s $\color{#35bf28}+2.12\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.4109ms 5.0315ms 198.7466 Ops/s 197.4796 Ops/s $\color{#35bf28}+0.64\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.4961ms 0.6703ms 1.4920 KOps/s 1.4969 KOps/s $\color{#d91a1a}-0.33\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9507ms 0.6427ms 1.5559 KOps/s 1.5548 KOps/s $\color{#35bf28}+0.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.8224ms 5.0723ms 197.1496 Ops/s 203.0856 Ops/s $\color{#d91a1a}-2.92\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1551ms 0.5322ms 1.8789 KOps/s 1.8724 KOps/s $\color{#35bf28}+0.34\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8120ms 0.5198ms 1.9238 KOps/s 1.9483 KOps/s $\color{#d91a1a}-1.26\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.4850ms 5.0799ms 196.8545 Ops/s 202.8331 Ops/s $\color{#d91a1a}-2.95\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1583ms 0.5280ms 1.8939 KOps/s 1.9112 KOps/s $\color{#d91a1a}-0.91\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7534ms 0.5119ms 1.9536 KOps/s 1.9922 KOps/s $\color{#d91a1a}-1.94\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.3700ms 5.1362ms 194.6965 Ops/s 199.8361 Ops/s $\color{#d91a1a}-2.57\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0683ms 0.6695ms 1.4936 KOps/s 1.4700 KOps/s $\color{#35bf28}+1.61\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0137ms 0.6361ms 1.5722 KOps/s 1.5414 KOps/s $\color{#35bf28}+1.99\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.5486ms 4.2645ms 234.4931 Ops/s 231.7022 Ops/s $\color{#35bf28}+1.20\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.7299ms 2.3276ms 429.6286 Ops/s 423.5709 Ops/s $\color{#35bf28}+1.43\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 4.2058ms 1.2763ms 783.5178 Ops/s 757.5642 Ops/s $\color{#35bf28}+3.43\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.3493ms 4.3883ms 227.8769 Ops/s 240.9274 Ops/s $\textbf{\color{#d91a1a}-5.42\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.6496ms 2.4669ms 405.3592 Ops/s 415.3994 Ops/s $\color{#d91a1a}-2.42\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.4084ms 1.3276ms 753.2331 Ops/s 743.6393 Ops/s $\color{#35bf28}+1.29\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5362s 15.3013ms 65.3539 Ops/s 231.0124 Ops/s $\textbf{\color{#d91a1a}-71.71\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.8504ms 2.6341ms 379.6371 Ops/s 377.4070 Ops/s $\color{#35bf28}+0.59\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.4770ms 1.5704ms 636.7923 Ops/s 658.1029 Ops/s $\color{#d91a1a}-3.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.2147ms 11.8375ms 84.4771 Ops/s 80.4782 Ops/s $\color{#35bf28}+4.97\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.6981ms 14.4836ms 69.0434 Ops/s 70.0759 Ops/s $\color{#d91a1a}-1.47\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.5589ms 20.7558ms 48.1794 Ops/s 46.2111 Ops/s $\color{#35bf28}+4.26\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.3635ms 14.3324ms 69.7722 Ops/s 67.5141 Ops/s $\color{#35bf28}+3.34\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.0479ms 20.6133ms 48.5123 Ops/s 47.0649 Ops/s $\color{#35bf28}+3.08\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.9910ms 15.6767ms 63.7890 Ops/s 59.8131 Ops/s $\textbf{\color{#35bf28}+6.65\%}$
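
The Change column above appears to be the relative difference between the Ops and Ops on Repo HEAD columns. The bolded rows appear to be those that move by more than 5% in either direction, which would match the reported counts of 7 improved and 11 worsened. The following is a minimal Python sketch of that reading. The helper names (`parse_ops`, `percent_change`, `classify`) and the 5% threshold are assumptions inferred from the table, not taken from the benchmark tooling itself.

```python
# Multipliers for the unit suffixes used in the "Ops" columns of the table.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text):
    """Convert a cell such as '1.1182 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * _UNITS[unit]


def percent_change(ops_pr, ops_head):
    """Relative throughput change of the PR against repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change, threshold=5.0):
    """Label a row. The 5% threshold is an assumption inferred from the table."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# Three rows copied from the CPU table: (Ops, Ops on Repo HEAD).
rows = {
    "test_dqn_speed[True-backward]": ("1.1182 KOps/s", "987.5358 Ops/s"),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (
        "65.3539 Ops/s",
        "231.0124 Ops/s",
    ),
    "test_parallel": ("0.7690 Ops/s", "0.8067 Ops/s"),
}

for name, (pr, head) in rows.items():
    change = percent_change(parse_ops(pr), parse_ops(head))
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the three sample rows reproduce the table's +13.23%, -71.71%, and -4.67%, with only the first two crossing the threshold. Note that the units must be normalized first, since a row can mix `KOps/s` and `Ops/s`.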


github-actions bot commented Feb 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}23$. Worsened: $\large\color{#d91a1a}11$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.8679s 0.7726s 1.2943 Ops/s 1.2503 Ops/s $\color{#35bf28}+3.52\%$
test_transformed 1.3883s 1.3081s 0.7645 Ops/s 0.7265 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_serial 2.2545s 2.1703s 0.4608 Ops/s 0.4416 Ops/s $\color{#35bf28}+4.33\%$
test_parallel 2.0302s 1.9018s 0.5258 Ops/s 0.5256 Ops/s $\color{#35bf28}+0.03\%$
test_step_mdp_speed[True-True-True-True-True] 0.1845ms 38.3140μs 26.1001 KOps/s 25.3821 KOps/s $\color{#35bf28}+2.83\%$
test_step_mdp_speed[True-True-True-True-False] 54.8710μs 22.2284μs 44.9875 KOps/s 43.4466 KOps/s $\color{#35bf28}+3.55\%$
test_step_mdp_speed[True-True-True-False-True] 89.7710μs 20.9686μs 47.6903 KOps/s 45.7023 KOps/s $\color{#35bf28}+4.35\%$
test_step_mdp_speed[True-True-True-False-False] 48.8000μs 12.3615μs 80.8960 KOps/s 78.3814 KOps/s $\color{#35bf28}+3.21\%$
test_step_mdp_speed[True-True-False-True-True] 96.1810μs 41.1610μs 24.2949 KOps/s 23.9619 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[True-True-False-True-False] 62.0010μs 24.2371μs 41.2591 KOps/s 39.4907 KOps/s $\color{#35bf28}+4.48\%$
test_step_mdp_speed[True-True-False-False-True] 67.2010μs 23.7256μs 42.1486 KOps/s 40.7275 KOps/s $\color{#35bf28}+3.49\%$
test_step_mdp_speed[True-True-False-False-False] 54.8410μs 14.6005μs 68.4906 KOps/s 65.6610 KOps/s $\color{#35bf28}+4.31\%$
test_step_mdp_speed[True-False-True-True-True] 84.1010μs 42.7486μs 23.3926 KOps/s 22.6300 KOps/s $\color{#35bf28}+3.37\%$
test_step_mdp_speed[True-False-True-True-False] 82.0410μs 26.5967μs 37.5987 KOps/s 35.9667 KOps/s $\color{#35bf28}+4.54\%$
test_step_mdp_speed[True-False-True-False-True] 49.6310μs 23.6068μs 42.3606 KOps/s 41.1195 KOps/s $\color{#35bf28}+3.02\%$
test_step_mdp_speed[True-False-True-False-False] 42.9100μs 14.7075μs 67.9927 KOps/s 65.5915 KOps/s $\color{#35bf28}+3.66\%$
test_step_mdp_speed[True-False-False-True-True] 81.9310μs 45.5584μs 21.9499 KOps/s 21.7027 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[True-False-False-True-False] 67.8710μs 28.9435μs 34.5501 KOps/s 33.9833 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[True-False-False-False-True] 69.1210μs 25.4955μs 39.2226 KOps/s 37.3033 KOps/s $\textbf{\color{#35bf28}+5.15\%}$
test_step_mdp_speed[True-False-False-False-False] 55.7610μs 16.6689μs 59.9920 KOps/s 57.5640 KOps/s $\color{#35bf28}+4.22\%$
test_step_mdp_speed[False-True-True-True-True] 0.1319ms 41.1150μs 24.3220 KOps/s 22.7325 KOps/s $\textbf{\color{#35bf28}+6.99\%}$
test_step_mdp_speed[False-True-True-True-False] 57.9200μs 26.9064μs 37.1658 KOps/s 35.9665 KOps/s $\color{#35bf28}+3.33\%$
test_step_mdp_speed[False-True-True-False-True] 60.8410μs 27.7637μs 36.0183 KOps/s 35.7152 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[False-True-True-False-False] 48.2410μs 16.3593μs 61.1272 KOps/s 59.4725 KOps/s $\color{#35bf28}+2.78\%$
test_step_mdp_speed[False-True-False-True-True] 76.9800μs 45.5962μs 21.9316 KOps/s 21.7033 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-True-False-True-False] 66.8810μs 29.3902μs 34.0250 KOps/s 33.4600 KOps/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[False-True-False-False-True] 3.3146ms 30.3300μs 32.9707 KOps/s 32.3722 KOps/s $\color{#35bf28}+1.85\%$
test_step_mdp_speed[False-True-False-False-False] 94.4810μs 18.7824μs 53.2414 KOps/s 52.0484 KOps/s $\color{#35bf28}+2.29\%$
test_step_mdp_speed[False-False-True-True-True] 78.8510μs 48.2851μs 20.7103 KOps/s 20.4925 KOps/s $\color{#35bf28}+1.06\%$
test_step_mdp_speed[False-False-True-True-False] 67.1910μs 32.0058μs 31.2444 KOps/s 30.6796 KOps/s $\color{#35bf28}+1.84\%$
test_step_mdp_speed[False-False-True-False-True] 65.3110μs 30.0506μs 33.2773 KOps/s 32.9107 KOps/s $\color{#35bf28}+1.11\%$
test_step_mdp_speed[False-False-True-False-False] 47.8910μs 18.7387μs 53.3654 KOps/s 52.3509 KOps/s $\color{#35bf28}+1.94\%$
test_step_mdp_speed[False-False-False-True-True] 81.3100μs 50.2313μs 19.9079 KOps/s 19.5754 KOps/s $\color{#35bf28}+1.70\%$
test_step_mdp_speed[False-False-False-True-False] 65.5710μs 33.8671μs 29.5272 KOps/s 28.7439 KOps/s $\color{#35bf28}+2.72\%$
test_step_mdp_speed[False-False-False-False-True] 69.8210μs 31.3336μs 31.9146 KOps/s 30.9904 KOps/s $\color{#35bf28}+2.98\%$
test_step_mdp_speed[False-False-False-False-False] 54.3700μs 20.9909μs 47.6398 KOps/s 47.2906 KOps/s $\color{#35bf28}+0.74\%$
test_values[generalized_advantage_estimate-True-True] 25.1556ms 24.8057ms 40.3132 Ops/s 40.3560 Ops/s $\color{#d91a1a}-0.11\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1146s 3.1871ms 313.7651 Ops/s 313.1170 Ops/s $\color{#35bf28}+0.21\%$
test_values[td0_return_estimate-False-False] 0.1058ms 79.5482μs 12.5710 KOps/s 12.7390 KOps/s $\color{#d91a1a}-1.32\%$
test_values[td1_return_estimate-False-False] 55.3047ms 54.8620ms 18.2276 Ops/s 18.3564 Ops/s $\color{#d91a1a}-0.70\%$
test_values[vec_td1_return_estimate-False-False] 1.2981ms 1.0776ms 928.0007 Ops/s 929.9576 Ops/s $\color{#d91a1a}-0.21\%$
test_values[td_lambda_return_estimate-True-False] 87.7644ms 87.0916ms 11.4822 Ops/s 11.5877 Ops/s $\color{#d91a1a}-0.91\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2978ms 1.0754ms 929.8440 Ops/s 932.0292 Ops/s $\color{#d91a1a}-0.23\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.0062ms 24.7727ms 40.3670 Ops/s 40.3869 Ops/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0247ms 0.7563ms 1.3222 KOps/s 1.3415 KOps/s $\color{#d91a1a}-1.44\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7511ms 0.6633ms 1.5077 KOps/s 1.5036 KOps/s $\color{#35bf28}+0.27\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5174ms 1.4798ms 675.7736 Ops/s 675.8804 Ops/s $\color{#d91a1a}-0.02\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8102ms 0.6781ms 1.4747 KOps/s 1.4727 KOps/s $\color{#35bf28}+0.13\%$
test_dqn_speed[False-None] 6.9375ms 1.4774ms 676.8421 Ops/s 661.0407 Ops/s $\color{#35bf28}+2.39\%$
test_dqn_speed[False-backward] 2.1054ms 2.0721ms 482.5953 Ops/s 471.9882 Ops/s $\color{#35bf28}+2.25\%$
test_dqn_speed[True-None] 0.6713ms 0.5456ms 1.8329 KOps/s 1.8343 KOps/s $\color{#d91a1a}-0.08\%$
test_dqn_speed[True-backward] 1.2742ms 1.2188ms 820.4732 Ops/s 806.2811 Ops/s $\color{#35bf28}+1.76\%$
test_dqn_speed[reduce-overhead-None] 0.6191ms 0.5588ms 1.7896 KOps/s 1.7802 KOps/s $\color{#35bf28}+0.53\%$
test_dqn_speed[reduce-overhead-backward] 1.1314ms 1.0644ms 939.5186 Ops/s 945.9928 Ops/s $\color{#d91a1a}-0.68\%$
test_ddpg_speed[False-None] 3.1299ms 2.8154ms 355.1837 Ops/s 349.2724 Ops/s $\color{#35bf28}+1.69\%$
test_ddpg_speed[False-backward] 4.7291ms 4.2731ms 234.0221 Ops/s 235.9146 Ops/s $\color{#d91a1a}-0.80\%$
test_ddpg_speed[True-None] 1.3916ms 1.3087ms 764.1382 Ops/s 753.7807 Ops/s $\color{#35bf28}+1.37\%$
test_ddpg_speed[True-backward] 2.5858ms 2.5255ms 395.9585 Ops/s 388.3956 Ops/s $\color{#35bf28}+1.95\%$
test_ddpg_speed[reduce-overhead-None] 1.5570ms 1.3866ms 721.2030 Ops/s 723.1869 Ops/s $\color{#d91a1a}-0.27\%$
test_ddpg_speed[reduce-overhead-backward] 2.0482ms 2.0011ms 499.7346 Ops/s 491.7895 Ops/s $\color{#35bf28}+1.62\%$
test_sac_speed[False-None] 8.2224ms 7.8059ms 128.1089 Ops/s 121.0629 Ops/s $\textbf{\color{#35bf28}+5.82\%}$
test_sac_speed[False-backward] 11.7148ms 10.9930ms 90.9669 Ops/s 88.6502 Ops/s $\color{#35bf28}+2.61\%$
test_sac_speed[True-None] 1.8719ms 1.7937ms 557.5071 Ops/s 549.8506 Ops/s $\color{#35bf28}+1.39\%$
test_sac_speed[True-backward] 3.7432ms 3.6340ms 275.1777 Ops/s 270.2553 Ops/s $\color{#35bf28}+1.82\%$
test_sac_speed[reduce-overhead-None] 21.1521ms 11.8632ms 84.2942 Ops/s 83.1162 Ops/s $\color{#35bf28}+1.42\%$
test_sac_speed[reduce-overhead-backward] 1.7949ms 1.7539ms 570.1508 Ops/s 554.1580 Ops/s $\color{#35bf28}+2.89\%$
test_redq_speed[False-None] 7.6479ms 7.2763ms 137.4331 Ops/s 127.9040 Ops/s $\textbf{\color{#35bf28}+7.45\%}$
test_redq_speed[False-backward] 11.9192ms 11.4345ms 87.4544 Ops/s 83.9583 Ops/s $\color{#35bf28}+4.16\%$
test_redq_speed[True-None] 2.3066ms 2.2174ms 450.9784 Ops/s 441.3557 Ops/s $\color{#35bf28}+2.18\%$
test_redq_speed[True-backward] 4.4640ms 4.0674ms 245.8563 Ops/s 242.3518 Ops/s $\color{#35bf28}+1.45\%$
test_redq_speed[reduce-overhead-None] 2.4194ms 2.2490ms 444.6370 Ops/s 438.1171 Ops/s $\color{#35bf28}+1.49\%$
test_redq_speed[reduce-overhead-backward] 4.4495ms 4.0638ms 246.0739 Ops/s 240.0191 Ops/s $\color{#35bf28}+2.52\%$
test_redq_deprec_speed[False-None] 9.1190ms 8.7635ms 114.1103 Ops/s 110.2439 Ops/s $\color{#35bf28}+3.51\%$
test_redq_deprec_speed[False-backward] 12.4719ms 12.0063ms 83.2894 Ops/s 81.2325 Ops/s $\color{#35bf28}+2.53\%$
test_redq_deprec_speed[True-None] 2.6579ms 2.5712ms 388.9304 Ops/s 381.6639 Ops/s $\color{#35bf28}+1.90\%$
test_redq_deprec_speed[True-backward] 4.5418ms 4.4107ms 226.7223 Ops/s 217.7007 Ops/s $\color{#35bf28}+4.14\%$
test_redq_deprec_speed[reduce-overhead-None] 2.7279ms 2.5768ms 388.0822 Ops/s 381.4384 Ops/s $\color{#35bf28}+1.74\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.2536ms 4.1921ms 238.5429 Ops/s 223.9129 Ops/s $\textbf{\color{#35bf28}+6.53\%}$
test_td3_speed[False-None] 7.7724ms 7.7259ms 129.4345 Ops/s 123.8973 Ops/s $\color{#35bf28}+4.47\%$
test_td3_speed[False-backward] 10.6337ms 10.0587ms 99.4168 Ops/s 94.9848 Ops/s $\color{#35bf28}+4.67\%$
test_td3_speed[True-None] 1.6213ms 1.5994ms 625.2447 Ops/s 611.7893 Ops/s $\color{#35bf28}+2.20\%$
test_td3_speed[True-backward] 3.1701ms 3.1181ms 320.7048 Ops/s 298.1855 Ops/s $\textbf{\color{#35bf28}+7.55\%}$
test_td3_speed[reduce-overhead-None] 49.1672ms 25.4304ms 39.3230 Ops/s 39.2178 Ops/s $\color{#35bf28}+0.27\%$
test_td3_speed[reduce-overhead-backward] 1.3985ms 1.3184ms 758.4900 Ops/s 669.7818 Ops/s $\textbf{\color{#35bf28}+13.24\%}$
test_cql_speed[False-None] 16.7027ms 16.2259ms 61.6298 Ops/s 59.7158 Ops/s $\color{#35bf28}+3.21\%$
test_cql_speed[False-backward] 21.8187ms 21.2713ms 47.0117 Ops/s 45.0515 Ops/s $\color{#35bf28}+4.35\%$
test_cql_speed[True-None] 3.2520ms 3.1677ms 315.6818 Ops/s 311.7299 Ops/s $\color{#35bf28}+1.27\%$
test_cql_speed[True-backward] 5.9195ms 5.5345ms 180.6862 Ops/s 184.0372 Ops/s $\color{#d91a1a}-1.82\%$
test_cql_speed[reduce-overhead-None] 21.5153ms 13.0519ms 76.6171 Ops/s 77.4210 Ops/s $\color{#d91a1a}-1.04\%$
test_cql_speed[reduce-overhead-backward] 2.1428ms 1.9636ms 509.2788 Ops/s 508.6529 Ops/s $\color{#35bf28}+0.12\%$
test_a2c_speed[False-None] 3.2358ms 3.0998ms 322.6022 Ops/s 313.0328 Ops/s $\color{#35bf28}+3.06\%$
test_a2c_speed[False-backward] 6.8757ms 6.2528ms 159.9287 Ops/s 159.3698 Ops/s $\color{#35bf28}+0.35\%$
test_a2c_speed[True-None] 1.5156ms 1.3006ms 768.8735 Ops/s 758.4722 Ops/s $\color{#35bf28}+1.37\%$
test_a2c_speed[True-backward] 3.0170ms 2.9675ms 336.9802 Ops/s 327.5514 Ops/s $\color{#35bf28}+2.88\%$
test_a2c_speed[reduce-overhead-None] 15.9591ms 9.0014ms 111.0942 Ops/s 112.5878 Ops/s $\color{#d91a1a}-1.33\%$
test_a2c_speed[reduce-overhead-backward] 1.6787ms 1.5671ms 638.1287 Ops/s 695.3425 Ops/s $\textbf{\color{#d91a1a}-8.23\%}$
test_ppo_speed[False-None] 3.7686ms 3.6069ms 277.2492 Ops/s 261.2670 Ops/s $\textbf{\color{#35bf28}+6.12\%}$
test_ppo_speed[False-backward] 7.3920ms 6.9647ms 143.5821 Ops/s 145.8378 Ops/s $\color{#d91a1a}-1.55\%$
test_ppo_speed[True-None] 1.4528ms 1.3590ms 735.8558 Ops/s 723.5596 Ops/s $\color{#35bf28}+1.70\%$
test_ppo_speed[True-backward] 3.1285ms 3.0031ms 332.9909 Ops/s 327.7687 Ops/s $\color{#35bf28}+1.59\%$
test_ppo_speed[reduce-overhead-None] 1.0052ms 0.9327ms 1.0721 KOps/s 1.0603 KOps/s $\color{#35bf28}+1.11\%$
test_ppo_speed[reduce-overhead-backward] 1.4482ms 1.3617ms 734.3659 Ops/s 634.7615 Ops/s $\textbf{\color{#35bf28}+15.69\%}$
test_reinforce_speed[False-None] 2.3967ms 2.2880ms 437.0723 Ops/s 437.2163 Ops/s $\color{#d91a1a}-0.03\%$
test_reinforce_speed[False-backward] 3.7435ms 3.2730ms 305.5277 Ops/s 294.2711 Ops/s $\color{#35bf28}+3.83\%$
test_reinforce_speed[True-None] 1.3762ms 1.2447ms 803.3989 Ops/s 782.7709 Ops/s $\color{#35bf28}+2.64\%$
test_reinforce_speed[True-backward] 3.0114ms 2.8637ms 349.2010 Ops/s 342.3446 Ops/s $\color{#35bf28}+2.00\%$
test_reinforce_speed[reduce-overhead-None] 18.5543ms 9.9743ms 100.2581 Ops/s 100.1424 Ops/s $\color{#35bf28}+0.12\%$
test_reinforce_speed[reduce-overhead-backward] 1.4963ms 1.4290ms 699.7669 Ops/s 661.6425 Ops/s $\textbf{\color{#35bf28}+5.76\%}$
test_iql_speed[False-None] 9.3836ms 8.9526ms 111.6996 Ops/s 106.1516 Ops/s $\textbf{\color{#35bf28}+5.23\%}$
test_iql_speed[False-backward] 13.0479ms 12.5431ms 79.7250 Ops/s 76.2402 Ops/s $\color{#35bf28}+4.57\%$
test_iql_speed[True-None] 2.2327ms 2.1607ms 462.8050 Ops/s 434.3703 Ops/s $\textbf{\color{#35bf28}+6.55\%}$
test_iql_speed[True-backward] 4.7747ms 4.6366ms 215.6771 Ops/s 205.1462 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_iql_speed[reduce-overhead-None] 18.9266ms 11.0257ms 90.6973 Ops/s 91.0358 Ops/s $\color{#d91a1a}-0.37\%$
test_iql_speed[reduce-overhead-backward] 1.8881ms 1.8411ms 543.1533 Ops/s 510.8166 Ops/s $\textbf{\color{#35bf28}+6.33\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6199ms 6.0296ms 165.8485 Ops/s 162.3181 Ops/s $\color{#35bf28}+2.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5182ms 0.2672ms 3.7429 KOps/s 3.0679 KOps/s $\textbf{\color{#35bf28}+22.00\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5646ms 0.3065ms 3.2621 KOps/s 3.3936 KOps/s $\color{#d91a1a}-3.87\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.9425ms 5.8279ms 171.5887 Ops/s 170.9401 Ops/s $\color{#35bf28}+0.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.4174ms 0.3458ms 2.8918 KOps/s 3.5233 KOps/s $\textbf{\color{#d91a1a}-17.93\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5710ms 0.3256ms 3.0713 KOps/s 3.4825 KOps/s $\textbf{\color{#d91a1a}-11.81\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6096ms 1.3868ms 721.1033 Ops/s 764.9393 Ops/s $\textbf{\color{#d91a1a}-5.73\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5730ms 1.3395ms 746.5493 Ops/s 842.1833 Ops/s $\textbf{\color{#d91a1a}-11.36\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1507ms 5.9624ms 167.7187 Ops/s 165.3274 Ops/s $\color{#35bf28}+1.45\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0339ms 0.4937ms 2.0254 KOps/s 2.3412 KOps/s $\textbf{\color{#d91a1a}-13.49\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6648ms 0.4358ms 2.2945 KOps/s 2.3207 KOps/s $\color{#d91a1a}-1.13\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9129ms 5.7922ms 172.6449 Ops/s 169.8022 Ops/s $\color{#35bf28}+1.67\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8240ms 0.2840ms 3.5210 KOps/s 2.5169 KOps/s $\textbf{\color{#35bf28}+39.90\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5145ms 0.3081ms 3.2462 KOps/s 3.1360 KOps/s $\color{#35bf28}+3.51\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.2291ms 5.7914ms 172.6693 Ops/s 169.9091 Ops/s $\color{#35bf28}+1.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5927ms 0.3138ms 3.1868 KOps/s 3.6458 KOps/s $\textbf{\color{#d91a1a}-12.59\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5311ms 0.2924ms 3.4198 KOps/s 4.0021 KOps/s $\textbf{\color{#d91a1a}-14.55\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2845ms 5.9646ms 167.6569 Ops/s 165.4288 Ops/s $\color{#35bf28}+1.35\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4691ms 0.4733ms 2.1126 KOps/s 2.3285 KOps/s $\textbf{\color{#d91a1a}-9.27\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5659ms 0.3802ms 2.6305 KOps/s 2.4690 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9482ms 5.3525ms 186.8281 Ops/s 184.8054 Ops/s $\color{#35bf28}+1.09\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.1385ms 2.0139ms 496.5455 Ops/s 440.1900 Ops/s $\textbf{\color{#35bf28}+12.80\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.6386ms 1.1719ms 853.3191 Ops/s 862.9032 Ops/s $\color{#d91a1a}-1.11\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.9926ms 5.4418ms 183.7624 Ops/s 185.7092 Ops/s $\color{#d91a1a}-1.05\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.2615ms 2.0672ms 483.7501 Ops/s 439.8171 Ops/s $\textbf{\color{#35bf28}+9.99\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.6613ms 1.1372ms 879.3460 Ops/s 877.9829 Ops/s $\color{#35bf28}+0.16\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5047s 15.6128ms 64.0501 Ops/s 31.7491 Ops/s $\textbf{\color{#35bf28}+101.74\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.3602ms 2.1128ms 473.3095 Ops/s 504.0026 Ops/s $\textbf{\color{#d91a1a}-6.09\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.6293ms 1.3378ms 747.4848 Ops/s 821.2949 Ops/s $\textbf{\color{#d91a1a}-8.99\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4321ms 12.6489ms 79.0585 Ops/s 75.0714 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.2469ms 16.5253ms 60.5134 Ops/s 60.1714 Ops/s $\color{#35bf28}+0.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.0287ms 17.3263ms 57.7157 Ops/s 56.2093 Ops/s $\color{#35bf28}+2.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.5518ms 16.5947ms 60.2602 Ops/s 59.7110 Ops/s $\color{#35bf28}+0.92\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.3605ms 17.1250ms 58.3942 Ops/s 54.6202 Ops/s $\textbf{\color{#35bf28}+6.91\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.6990ms 18.1964ms 54.9558 Ops/s 53.5889 Ops/s $\color{#35bf28}+2.55\%$
