[CI] Install stable torch/tv in docs when on release branch #2761

Merged: vmoens merged 1 commit into gh/vmoens/85/base from gh/vmoens/85/head on Feb 5, 2025


vmoens (Contributor) commented Feb 5, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 5, 2025
ghstack-source-id: 7c39c049c7cff0ee112be2d07597f2e291d2fafd
Pull Request resolved: #2761

pytorch-bot bot commented Feb 5, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2761

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label Feb 5, 2025. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
@vmoens vmoens merged commit 8fa3925 into gh/vmoens/85/base Feb 5, 2025
53 of 57 checks passed
@vmoens vmoens deleted the gh/vmoens/85/head branch February 5, 2025 18:10

github-actions bot commented Feb 5, 2025

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 17. Worsened: 4.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.5306s | 0.4465s | 2.2394 Ops/s | 2.2105 Ops/s | +1.31% |
| test_transformed | 0.9738s | 0.8768s | 1.1405 Ops/s | 1.0981 Ops/s | +3.86% |
| test_serial | 1.4367s | 1.3479s | 0.7419 Ops/s | 0.7295 Ops/s | +1.70% |
| test_parallel | 1.3066s | 1.2125s | 0.8248 Ops/s | 0.8165 Ops/s | +1.02% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1724ms | 30.5859μs | 32.6948 KOps/s | 32.1842 KOps/s | +1.59% |
| test_step_mdp_speed[True-True-True-True-False] | 54.0930μs | 18.1913μs | 54.9714 KOps/s | 55.9238 KOps/s | -1.70% |
| test_step_mdp_speed[True-True-True-False-True] | 71.4900μs | 17.3959μs | 57.4847 KOps/s | 57.5890 KOps/s | -0.18% |
| test_step_mdp_speed[True-True-True-False-False] | 60.2440μs | 10.2564μs | 97.4997 KOps/s | 98.6647 KOps/s | -1.18% |
| test_step_mdp_speed[True-True-False-True-True] | 65.2610μs | 33.0010μs | 30.3021 KOps/s | 30.8372 KOps/s | -1.74% |
| test_step_mdp_speed[True-True-False-True-False] | 79.6270μs | 20.2585μs | 49.3620 KOps/s | 50.5119 KOps/s | -2.28% |
| test_step_mdp_speed[True-True-False-False-True] | 86.6110μs | 19.2523μs | 51.9420 KOps/s | 52.3738 KOps/s | -0.82% |
| test_step_mdp_speed[True-True-False-False-False] | 35.5260μs | 12.1517μs | 82.2928 KOps/s | 83.1418 KOps/s | -1.02% |
| test_step_mdp_speed[True-False-True-True-True] | 71.6330μs | 34.6724μs | 28.8414 KOps/s | 29.0707 KOps/s | -0.79% |
| test_step_mdp_speed[True-False-True-True-False] | 48.3790μs | 22.0597μs | 45.3315 KOps/s | 45.7983 KOps/s | -1.02% |
| test_step_mdp_speed[True-False-True-False-True] | 73.0760μs | 19.3895μs | 51.5743 KOps/s | 52.1605 KOps/s | -1.12% |
| test_step_mdp_speed[True-False-True-False-False] | 61.8240μs | 12.2191μs | 81.8394 KOps/s | 83.4486 KOps/s | -1.93% |
| test_step_mdp_speed[True-False-False-True-True] | 93.3000μs | 36.7882μs | 27.1827 KOps/s | 27.7542 KOps/s | -2.06% |
| test_step_mdp_speed[True-False-False-True-False] | 85.1100μs | 23.8971μs | 41.8460 KOps/s | 42.5295 KOps/s | -1.61% |
| test_step_mdp_speed[True-False-False-False-True] | 43.0400μs | 21.1307μs | 47.3246 KOps/s | 47.8676 KOps/s | -1.13% |
| test_step_mdp_speed[True-False-False-False-False] | 37.0790μs | 13.9712μs | 71.5760 KOps/s | 72.2224 KOps/s | -0.90% |
| test_step_mdp_speed[False-True-True-True-True] | 94.4960μs | 35.0446μs | 28.5351 KOps/s | 29.0392 KOps/s | -1.74% |
| test_step_mdp_speed[False-True-True-True-False] | 51.5360μs | 22.2032μs | 45.0386 KOps/s | 45.8857 KOps/s | -1.85% |
| test_step_mdp_speed[False-True-True-False-True] | 51.2760μs | 22.2063μs | 45.0324 KOps/s | 45.9075 KOps/s | -1.91% |
| test_step_mdp_speed[False-True-True-False-False] | 45.2240μs | 13.8334μs | 72.2887 KOps/s | 74.5630 KOps/s | -3.05% |
| test_step_mdp_speed[False-True-False-True-True] | 92.2550μs | 36.5826μs | 27.3354 KOps/s | 27.9174 KOps/s | -2.08% |
| test_step_mdp_speed[False-True-False-True-False] | 50.5040μs | 24.0495μs | 41.5809 KOps/s | 42.6417 KOps/s | -2.49% |
| test_step_mdp_speed[False-True-False-False-True] | 2.8119ms | 23.9138μs | 41.8169 KOps/s | 42.2119 KOps/s | -0.94% |
| test_step_mdp_speed[False-True-False-False-False] | 66.3330μs | 15.6430μs | 63.9264 KOps/s | 65.7198 KOps/s | -2.73% |
| test_step_mdp_speed[False-False-True-True-True] | 70.5510μs | 38.1814μs | 26.1908 KOps/s | 26.3768 KOps/s | -0.71% |
| test_step_mdp_speed[False-False-True-True-False] | 0.2404ms | 26.6508μs | 37.5224 KOps/s | 39.6743 KOps/s | **-5.42%** |
| test_step_mdp_speed[False-False-True-False-True] | 56.8450μs | 23.8287μs | 41.9662 KOps/s | 42.7745 KOps/s | -1.89% |
| test_step_mdp_speed[False-False-True-False-False] | 75.2190μs | 15.3321μs | 65.2225 KOps/s | 65.6638 KOps/s | -0.67% |
| test_step_mdp_speed[False-False-False-True-True] | 79.1260μs | 39.7705μs | 25.1443 KOps/s | 25.2755 KOps/s | -0.52% |
| test_step_mdp_speed[False-False-False-True-False] | 80.3690μs | 27.5299μs | 36.3242 KOps/s | 37.3085 KOps/s | -2.64% |
| test_step_mdp_speed[False-False-False-False-True] | 75.8910μs | 25.4166μs | 39.3444 KOps/s | 39.7947 KOps/s | -1.13% |
| test_step_mdp_speed[False-False-False-False-False] | 40.6650μs | 17.0246μs | 58.7386 KOps/s | 59.8797 KOps/s | -1.91% |
| test_values[generalized_advantage_estimate-True-True] | 11.7911ms | 9.9336ms | 100.6682 Ops/s | 102.2715 Ops/s | -1.57% |
| test_values[vec_generalized_advantage_estimate-True-True] | 25.5103ms | 24.2723ms | 41.1992 Ops/s | 41.4179 Ops/s | -0.53% |
| test_values[td0_return_estimate-False-False] | 0.2325ms | 0.1834ms | 5.4537 KOps/s | 5.6671 KOps/s | -3.77% |
| test_values[td1_return_estimate-False-False] | 28.1192ms | 24.4294ms | 40.9342 Ops/s | 42.7323 Ops/s | -4.21% |
| test_values[vec_td1_return_estimate-False-False] | 26.1566ms | 24.4034ms | 40.9779 Ops/s | 40.9617 Ops/s | +0.04% |
| test_values[td_lambda_return_estimate-True-False] | 38.7842ms | 35.1021ms | 28.4883 Ops/s | 29.3628 Ops/s | -2.98% |
| test_values[vec_td_lambda_return_estimate-True-False] | 26.8463ms | 24.5117ms | 40.7968 Ops/s | 41.1955 Ops/s | -0.97% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 10.1330ms | 8.5719ms | 116.6602 Ops/s | 119.6805 Ops/s | -2.52% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.4860ms | 1.9600ms | 510.1946 Ops/s | 515.4935 Ops/s | -1.03% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6648ms | 0.3683ms | 2.7152 KOps/s | 2.7870 KOps/s | -2.58% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 39.8217ms | 38.1519ms | 26.2110 Ops/s | 25.1957 Ops/s | +4.03% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.4306ms | 3.4255ms | 291.9324 Ops/s | 292.4366 Ops/s | -0.17% |
| test_dqn_speed[False-None] | 5.7960ms | 1.3867ms | 721.1547 Ops/s | 719.3883 Ops/s | +0.25% |
| test_dqn_speed[False-backward] | 1.9310ms | 1.8584ms | 538.1092 Ops/s | 540.1856 Ops/s | -0.38% |
| test_dqn_speed[True-None] | 0.7729ms | 0.4851ms | 2.0613 KOps/s | 2.0775 KOps/s | -0.78% |
| test_dqn_speed[True-backward] | 1.0282ms | 0.9111ms | 1.0975 KOps/s | 895.9244 Ops/s | **+22.50%** |
| test_dqn_speed[reduce-overhead-None] | 0.6749ms | 0.4788ms | 2.0886 KOps/s | 2.0656 KOps/s | +1.11% |
| test_dqn_speed[reduce-overhead-backward] | 1.0155ms | 0.9135ms | 1.0947 KOps/s | 1.0411 KOps/s | **+5.15%** |
| test_ddpg_speed[False-None] | 3.4747ms | 2.8510ms | 350.7587 Ops/s | 348.1413 Ops/s | +0.75% |
| test_ddpg_speed[False-backward] | 4.1100ms | 3.9751ms | 251.5681 Ops/s | 251.3902 Ops/s | +0.07% |
| test_ddpg_speed[True-None] | 1.8415ms | 1.2130ms | 824.3849 Ops/s | 819.6187 Ops/s | +0.58% |
| test_ddpg_speed[True-backward] | 2.2173ms | 2.1129ms | 473.2744 Ops/s | 424.3717 Ops/s | **+11.52%** |
| test_ddpg_speed[reduce-overhead-None] | 1.6161ms | 1.2337ms | 810.5597 Ops/s | 811.3166 Ops/s | -0.09% |
| test_ddpg_speed[reduce-overhead-backward] | 2.2061ms | 2.1106ms | 473.7929 Ops/s | 471.9271 Ops/s | +0.40% |
| test_sac_speed[False-None] | 9.2915ms | 7.9220ms | 126.2305 Ops/s | 125.2788 Ops/s | +0.76% |
| test_sac_speed[False-backward] | 11.0904ms | 10.6278ms | 94.0933 Ops/s | 94.1710 Ops/s | -0.08% |
| test_sac_speed[True-None] | 2.5824ms | 2.0724ms | 482.5406 Ops/s | 476.4146 Ops/s | +1.29% |
| test_sac_speed[True-backward] | 3.7744ms | 3.7256ms | 268.4148 Ops/s | 267.0239 Ops/s | +0.52% |
| test_sac_speed[reduce-overhead-None] | 2.6757ms | 2.0738ms | 482.2142 Ops/s | 467.2788 Ops/s | +3.20% |
| test_sac_speed[reduce-overhead-backward] | 4.5265ms | 3.7845ms | 264.2389 Ops/s | 266.9332 Ops/s | -1.01% |
| test_redq_speed[False-None] | 14.7835ms | 12.8176ms | 78.0177 Ops/s | 78.0686 Ops/s | -0.07% |
| test_redq_speed[False-backward] | 22.6728ms | 22.1623ms | 45.1217 Ops/s | 44.8006 Ops/s | +0.72% |
| test_redq_speed[True-None] | 6.0044ms | 4.9345ms | 202.6565 Ops/s | 196.9628 Ops/s | +2.89% |
| test_redq_speed[True-backward] | 15.6725ms | 12.4575ms | 80.2730 Ops/s | 74.5799 Ops/s | **+7.63%** |
| test_redq_speed[reduce-overhead-None] | 6.2300ms | 5.2768ms | 189.5075 Ops/s | 201.5672 Ops/s | **-5.98%** |
| test_redq_speed[reduce-overhead-backward] | 13.8574ms | 12.5764ms | 79.5139 Ops/s | 79.2937 Ops/s | +0.28% |
| test_redq_deprec_speed[False-None] | 14.5880ms | 12.6429ms | 79.0961 Ops/s | 78.3286 Ops/s | +0.98% |
| test_redq_deprec_speed[False-backward] | 20.3189ms | 18.5939ms | 53.7810 Ops/s | 54.2829 Ops/s | -0.92% |
| test_redq_deprec_speed[True-None] | 5.2992ms | 3.8340ms | 260.8257 Ops/s | 259.7998 Ops/s | +0.39% |
| test_redq_deprec_speed[True-backward] | 9.2212ms | 8.6644ms | 115.4154 Ops/s | 122.0439 Ops/s | **-5.43%** |
| test_redq_deprec_speed[reduce-overhead-None] | 4.2240ms | 3.8049ms | 262.8188 Ops/s | 259.8679 Ops/s | +1.14% |
| test_redq_deprec_speed[reduce-overhead-backward] | 9.3093ms | 8.7361ms | 114.4670 Ops/s | 120.9124 Ops/s | **-5.33%** |
| test_td3_speed[False-None] | 8.2829ms | 7.8853ms | 126.8177 Ops/s | 116.5192 Ops/s | **+8.84%** |
| test_td3_speed[False-backward] | 11.2207ms | 10.3497ms | 96.6209 Ops/s | 94.7293 Ops/s | +2.00% |
| test_td3_speed[True-None] | 1.9561ms | 1.7497ms | 571.5428 Ops/s | 552.8948 Ops/s | +3.37% |
| test_td3_speed[True-backward] | 3.3657ms | 3.3217ms | 301.0470 Ops/s | 294.4346 Ops/s | +2.25% |
| test_td3_speed[reduce-overhead-None] | 1.9768ms | 1.7522ms | 570.7127 Ops/s | 552.6763 Ops/s | +3.26% |
| test_td3_speed[reduce-overhead-backward] | 3.5100ms | 3.3349ms | 299.8602 Ops/s | 288.4544 Ops/s | +3.95% |
| test_cql_speed[False-None] | 37.9297ms | 35.8492ms | 27.8947 Ops/s | 27.5619 Ops/s | +1.21% |
| test_cql_speed[False-backward] | 50.3944ms | 46.3484ms | 21.5757 Ops/s | 21.2210 Ops/s | +1.67% |
| test_cql_speed[True-None] | 16.6328ms | 15.8650ms | 63.0319 Ops/s | 60.3184 Ops/s | +4.50% |
| test_cql_speed[True-backward] | 24.1201ms | 23.1834ms | 43.1342 Ops/s | 42.6537 Ops/s | +1.13% |
| test_cql_speed[reduce-overhead-None] | 17.3927ms | 16.4675ms | 60.7258 Ops/s | 61.2457 Ops/s | -0.85% |
| test_cql_speed[reduce-overhead-backward] | 23.7464ms | 23.0398ms | 43.4032 Ops/s | 42.5009 Ops/s | +2.12% |
| test_a2c_speed[False-None] | 8.4201ms | 7.1619ms | 139.6276 Ops/s | 135.7960 Ops/s | +2.82% |
| test_a2c_speed[False-backward] | 15.4154ms | 14.4667ms | 69.1242 Ops/s | 68.3730 Ops/s | +1.10% |
| test_a2c_speed[True-None] | 4.0704ms | 3.6991ms | 270.3393 Ops/s | 265.2264 Ops/s | +1.93% |
| test_a2c_speed[True-backward] | 10.8509ms | 10.2583ms | 97.4822 Ops/s | 92.4935 Ops/s | **+5.39%** |
| test_a2c_speed[reduce-overhead-None] | 4.1935ms | 3.6844ms | 271.4153 Ops/s | 260.0691 Ops/s | +4.36% |
| test_a2c_speed[reduce-overhead-backward] | 11.1430ms | 10.1511ms | 98.5112 Ops/s | 90.8818 Ops/s | **+8.39%** |
| test_ppo_speed[False-None] | 9.0841ms | 7.5214ms | 132.9535 Ops/s | 130.7778 Ops/s | +1.66% |
| test_ppo_speed[False-backward] | 16.6121ms | 14.8975ms | 67.1252 Ops/s | 66.7462 Ops/s | +0.57% |
| test_ppo_speed[True-None] | 4.9039ms | 4.0889ms | 244.5638 Ops/s | 242.3459 Ops/s | +0.92% |
| test_ppo_speed[True-backward] | 10.7209ms | 10.2805ms | 97.2717 Ops/s | 93.6281 Ops/s | +3.89% |
| test_ppo_speed[reduce-overhead-None] | 4.3127ms | 4.0402ms | 247.5154 Ops/s | 239.1782 Ops/s | +3.49% |
| test_ppo_speed[reduce-overhead-backward] | 10.8002ms | 9.9362ms | 100.6420 Ops/s | 92.3446 Ops/s | **+8.99%** |
| test_reinforce_speed[False-None] | 7.6173ms | 6.4820ms | 154.2739 Ops/s | 147.9245 Ops/s | +4.29% |
| test_reinforce_speed[False-backward] | 10.2820ms | 9.8430ms | 101.5954 Ops/s | 97.6367 Ops/s | +4.05% |
| test_reinforce_speed[True-None] | 3.6875ms | 3.0928ms | 323.3284 Ops/s | 297.9329 Ops/s | **+8.52%** |
| test_reinforce_speed[True-backward] | 10.2244ms | 9.2585ms | 108.0093 Ops/s | 101.6145 Ops/s | **+6.29%** |
| test_reinforce_speed[reduce-overhead-None] | 3.6631ms | 3.1448ms | 317.9819 Ops/s | 320.5128 Ops/s | -0.79% |
| test_reinforce_speed[reduce-overhead-backward] | 9.5266ms | 9.0302ms | 110.7396 Ops/s | 106.1657 Ops/s | +4.31% |
| test_iql_speed[False-None] | 41.9245ms | 33.2455ms | 30.0793 Ops/s | 30.8526 Ops/s | -2.51% |
| test_iql_speed[False-backward] | 56.9730ms | 46.4251ms | 21.5401 Ops/s | 21.4980 Ops/s | +0.20% |
| test_iql_speed[True-None] | 12.0395ms | 11.2792ms | 88.6586 Ops/s | 84.8651 Ops/s | +4.47% |
| test_iql_speed[True-backward] | 24.6320ms | 22.5708ms | 44.3051 Ops/s | 44.1420 Ops/s | +0.37% |
| test_iql_speed[reduce-overhead-None] | 12.0808ms | 11.5169ms | 86.8286 Ops/s | 84.5460 Ops/s | +2.70% |
| test_iql_speed[reduce-overhead-backward] | 24.4931ms | 23.1111ms | 43.2692 Ops/s | 42.9945 Ops/s | +0.64% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.2474ms | 5.0414ms | 198.3588 Ops/s | 197.5872 Ops/s | +0.39% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9209ms | 0.5410ms | 1.8485 KOps/s | 1.8234 KOps/s | +1.38% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7404ms | 0.5220ms | 1.9157 KOps/s | 1.9284 KOps/s | -0.66% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.2673ms | 4.8453ms | 206.3864 Ops/s | 209.0499 Ops/s | -1.27% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.6626ms | 0.5260ms | 1.9012 KOps/s | 1.8610 KOps/s | +2.16% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8839ms | 0.5096ms | 1.9621 KOps/s | 1.9844 KOps/s | -1.12% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.3297ms | 1.6969ms | 589.3078 Ops/s | 585.9482 Ops/s | +0.57% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 3.0599ms | 1.6739ms | 597.3941 Ops/s | 618.7387 Ops/s | -3.45% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.1082ms | 4.8797ms | 204.9322 Ops/s | 198.0249 Ops/s | +3.49% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.0572ms | 0.6718ms | 1.4885 KOps/s | 1.4677 KOps/s | +1.42% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0057ms | 0.6547ms | 1.5275 KOps/s | 1.5166 KOps/s | +0.72% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.5784ms | 4.7973ms | 208.4504 Ops/s | 207.2276 Ops/s | +0.59% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.9806ms | 0.5413ms | 1.8473 KOps/s | 1.8730 KOps/s | -1.38% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7408ms | 0.5147ms | 1.9429 KOps/s | 1.9538 KOps/s | -0.56% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.1060ms | 4.7791ms | 209.2454 Ops/s | 212.9690 Ops/s | -1.75% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.3455ms | 0.5298ms | 1.8874 KOps/s | 1.8366 KOps/s | +2.76% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8394ms | 0.5142ms | 1.9446 KOps/s | 1.9542 KOps/s | -0.49% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.4643ms | 5.0010ms | 199.9585 Ops/s | 204.7338 Ops/s | -2.33% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.9754ms | 0.6740ms | 1.4837 KOps/s | 1.4749 KOps/s | +0.60% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8641ms | 0.6415ms | 1.5587 KOps/s | 1.5288 KOps/s | +1.95% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 5.4772ms | 4.1850ms | 238.9485 Ops/s | 247.0311 Ops/s | -3.27% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 3.2791ms | 2.1675ms | 461.3516 Ops/s | 438.2992 Ops/s | **+5.26%** |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 6.5606ms | 1.3724ms | 728.6365 Ops/s | 748.3885 Ops/s | -2.64% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.4563s | 13.7478ms | 72.7389 Ops/s | 32.6567 Ops/s | **+122.74%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 6.7336ms | 2.3577ms | 424.1399 Ops/s | 407.6903 Ops/s | +4.03% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 5.7330ms | 1.4370ms | 695.8863 Ops/s | 681.7383 Ops/s | +2.08% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 8.9837ms | 4.4610ms | 224.1631 Ops/s | 220.5820 Ops/s | +1.62% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 9.1537ms | 2.4972ms | 400.4557 Ops/s | 367.1578 Ops/s | **+9.07%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 4.3325ms | 1.5009ms | 666.2489 Ops/s | 628.8199 Ops/s | **+5.95%** |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 13.2973ms | 11.6774ms | 85.6353 Ops/s | 80.5711 Ops/s | **+6.29%** |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 17.1445ms | 14.2307ms | 70.2706 Ops/s | 66.1278 Ops/s | **+6.26%** |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 23.2631ms | 20.7265ms | 48.2473 Ops/s | 47.0628 Ops/s | +2.52% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 16.5315ms | 14.5140ms | 68.8989 Ops/s | 66.5676 Ops/s | +3.50% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 22.2884ms | 20.3836ms | 49.0589 Ops/s | 47.6630 Ops/s | +2.93% |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 16.5664ms | 15.6227ms | 64.0092 Ops/s | 60.6003 Ops/s | **+5.63%** |
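
The Change column appears to be the relative difference between the PR throughput (Ops) and the throughput on repo HEAD, once both are expressed in the same unit. The sketch below reproduces that arithmetic for two rows of the table. The helper names (`parse_ops`, `relative_change`, `is_highlighted`) are hypothetical and not taken from the benchmark workflow, and the 5% cut-off for emphasized entries is only an assumption inferred from which rows are emphasized here.

```python
# Hypothetical helpers for re-deriving the "Change" column; they are not part
# of the benchmark workflow that produced the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text: str) -> float:
    """Convert a throughput string such as '1.0975 KOps/s' to ops per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(pr_ops: float, head_ops: float) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    return (pr_ops - head_ops) / head_ops * 100.0


def is_highlighted(change_pct: float, threshold_pct: float = 5.0) -> bool:
    """Assumed rule: a change is emphasized when its magnitude reaches 5%."""
    return abs(change_pct) >= threshold_pct


if __name__ == "__main__":
    rows = {
        "test_simple": ("2.2394 Ops/s", "2.2105 Ops/s"),
        # Mixed units: KOps/s on the PR side, Ops/s on repo HEAD.
        "test_dqn_speed[True-backward]": ("1.0975 KOps/s", "895.9244 Ops/s"),
    }
    for name, (pr, head) in rows.items():
        change = relative_change(parse_ops(pr), parse_ops(head))
        print(f"{name}: {change:+.2f}% highlighted={is_highlighted(change)}")
```

Normalizing units before dividing matters for rows such as `test_dqn_speed[True-backward]`, where the two columns are reported in different units.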


github-actions bot commented Feb 5, 2025

Result of GPU Benchmark Tests

| Name | Max | Mean | Ops |
| --- | --- | --- | --- |
| test_simple | 0.8477s | 0.7538s | 1.3265 Ops/s |
| test_transformed | 1.3268s | 1.3178s | 0.7588 Ops/s |
| test_serial | 2.2105s | 2.1837s | 0.4579 Ops/s |
| test_parallel | 1.8396s | 1.8142s | 0.5512 Ops/s |
| test_step_mdp_speed[True-True-True-True-True] | 0.1156ms | 39.5076μs | 25.3116 KOps/s |
| test_step_mdp_speed[True-True-True-True-False] | 51.3510μs | 23.5959μs | 42.3803 KOps/s |
| test_step_mdp_speed[True-True-True-False-True] | 56.8510μs | 22.5547μs | 44.3367 KOps/s |
| test_step_mdp_speed[True-True-True-False-False] | 39.8910μs | 13.2061μs | 75.7224 KOps/s |
| test_step_mdp_speed[True-True-False-True-True] | 70.9120μs | 43.1670μs | 23.1658 KOps/s |
| test_step_mdp_speed[True-True-False-True-False] | 50.3110μs | 25.6125μs | 39.0434 KOps/s |
| test_step_mdp_speed[True-True-False-False-True] | 58.7120μs | 25.1448μs | 39.7696 KOps/s |
| test_step_mdp_speed[True-True-False-False-False] | 0.5948ms | 15.3052μs | 65.3374 KOps/s |
| test_step_mdp_speed[True-False-True-True-True] | 83.6420μs | 45.4524μs | 22.0010 KOps/s |
| test_step_mdp_speed[True-False-True-True-False] | 62.1820μs | 28.2240μs | 35.4309 KOps/s |
| test_step_mdp_speed[True-False-True-False-True] | 53.9310μs | 24.8512μs | 40.2395 KOps/s |
| test_step_mdp_speed[True-False-True-False-False] | 44.5210μs | 15.5822μs | 64.1756 KOps/s |
| test_step_mdp_speed[True-False-False-True-True] | 85.2510μs | 47.6010μs | 21.0080 KOps/s |
| test_step_mdp_speed[True-False-False-True-False] | 66.7720μs | 30.6032μs | 32.6763 KOps/s |
| test_step_mdp_speed[True-False-False-False-True] | 58.6410μs | 27.0783μs | 36.9300 KOps/s |
| test_step_mdp_speed[True-False-False-False-False] | 44.3110μs | 17.9346μs | 55.7580 KOps/s |
| test_step_mdp_speed[False-True-True-True-True] | 86.0820μs | 45.6093μs | 21.9254 KOps/s |
| test_step_mdp_speed[False-True-True-True-False] | 57.1410μs | 28.5138μs | 35.0707 KOps/s |
| test_step_mdp_speed[False-True-True-False-True] | 61.4710μs | 29.2232μs | 34.2194 KOps/s |
| test_step_mdp_speed[False-True-True-False-False] | 59.4920μs | 17.3669μs | 57.5807 KOps/s |
| test_step_mdp_speed[False-True-False-True-True] | 84.4520μs | 48.1238μs | 20.7798 KOps/s |
| test_step_mdp_speed[False-True-False-True-False] | 75.7820μs | 30.9950μs | 32.2633 KOps/s |
| test_step_mdp_speed[False-True-False-False-True] | 3.2155ms | 31.5127μs | 31.7332 KOps/s |
| test_step_mdp_speed[False-True-False-False-False] | 54.7520μs | 19.6340μs | 50.9320 KOps/s |
| test_step_mdp_speed[False-False-True-True-True] | 89.6920μs | 49.7268μs | 20.1099 KOps/s |
| test_step_mdp_speed[False-False-True-True-False] | 66.8310μs | 32.9333μs | 30.3644 KOps/s |
| test_step_mdp_speed[False-False-True-False-True] | 60.9410μs | 31.2803μs | 31.9690 KOps/s |
| test_step_mdp_speed[False-False-True-False-False] | 48.3810μs | 19.5228μs | 51.2221 KOps/s |
| test_step_mdp_speed[False-False-False-True-True] | 82.1620μs | 51.8538μs | 19.2850 KOps/s |
| test_step_mdp_speed[False-False-False-True-False] | 67.8720μs | 35.1948μs | 28.4133 KOps/s |
| test_step_mdp_speed[False-False-False-False-True] | 66.4910μs | 32.9599μs | 30.3399 KOps/s |
| test_step_mdp_speed[False-False-False-False-False] | 51.8910μs | 21.8879μs | 45.6872 KOps/s |
| test_values[generalized_advantage_estimate-True-True] | 24.8668ms | 24.4440ms | 40.9099 Ops/s |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1051s | 2.9983ms | 333.5256 Ops/s |
| test_values[td0_return_estimate-False-False] | 0.1061ms | 79.5132μs | 12.5765 KOps/s |
| test_values[td1_return_estimate-False-False] | 55.6395ms | 55.1490ms | 18.1327 Ops/s |
| test_values[vec_td1_return_estimate-False-False] | 1.3214ms | 1.0810ms | 925.0764 Ops/s |
| test_values[td_lambda_return_estimate-True-False] | 88.3028ms | 87.0799ms | 11.4837 Ops/s |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3091ms | 1.0775ms | 928.0482 Ops/s |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.5940ms | 24.4524ms | 40.8957 Ops/s |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0165ms | 0.7492ms | 1.3347 KOps/s |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7838ms | 0.6658ms | 1.5020 KOps/s |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5225ms | 1.4838ms | 673.9475 Ops/s |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7107ms | 0.6825ms | 1.4651 KOps/s |
| test_dqn_speed[False-None] | 1.6079ms | 1.5346ms | 651.6343 Ops/s |
| test_dqn_speed[False-backward] | 2.2179ms | 2.1555ms | 463.9285 Ops/s |
| test_dqn_speed[True-None] | 0.6669ms | 0.5836ms | 1.7135 KOps/s |
| test_dqn_speed[True-backward] | 1.2991ms | 1.2495ms | 800.2978 Ops/s |
| test_dqn_speed[reduce-overhead-None] | 0.6705ms | 0.5938ms | 1.6839 KOps/s |
| test_dqn_speed[reduce-overhead-backward] | 1.1169ms | 1.0735ms | 931.5074 Ops/s |
| test_ddpg_speed[False-None] | 3.2472ms | 2.9448ms | 339.5788 Ops/s |
| test_ddpg_speed[False-backward] | 4.5846ms | 4.4557ms | 224.4322 Ops/s |
| test_ddpg_speed[True-None] | 1.4110ms | 1.3344ms | 749.3735 Ops/s |
| test_ddpg_speed[True-backward] | 2.4810ms | 2.4202ms | 413.1967 Ops/s |
| test_ddpg_speed[reduce-overhead-None] | 1.4185ms | 1.3456ms | 743.1454 Ops/s |
| test_ddpg_speed[reduce-overhead-backward] | 1.9434ms | 1.8894ms | 529.2615 Ops/s |
| test_sac_speed[False-None] | 8.4904ms | 8.0754ms | 123.8324 Ops/s |
| test_sac_speed[False-backward] | 11.5033ms | 10.9710ms | 91.1498 Ops/s |
| test_sac_speed[True-None] | 2.0157ms | 1.8544ms | 539.2702 Ops/s |
| test_sac_speed[True-backward] | 3.8745ms | 3.7313ms | 268.0017 Ops/s |
| test_sac_speed[reduce-overhead-None] | 21.3509ms | 12.0478ms | 83.0028 Ops/s |
| test_sac_speed[reduce-overhead-backward] | 1.8268ms | 1.7714ms | 564.5108 Ops/s |
| test_redq_speed[False-None] | 7.9800ms | 7.5174ms | 133.0246 Ops/s |
| test_redq_speed[False-backward] | 12.2259ms | 11.6601ms | 85.7628 Ops/s |
| test_redq_speed[True-None] | 2.4827ms | 2.2945ms | 435.8293 Ops/s |
| test_redq_speed[True-backward] | 4.1633ms | 4.1251ms | 242.4176 Ops/s |
| test_redq_speed[reduce-overhead-None] | 2.3865ms | 2.2920ms | 436.3089 Ops/s |
| test_redq_speed[reduce-overhead-backward] | 4.4023ms | 4.1346ms | 241.8602 Ops/s |
| test_redq_deprec_speed[False-None] | 9.4129ms | 9.0921ms | 109.9852 Ops/s |
| test_redq_deprec_speed[False-backward] | 12.8717ms | 12.3126ms | 81.2178 Ops/s |
| test_redq_deprec_speed[True-None] | 2.8323ms | 2.6699ms | 374.5416 Ops/s |
| test_redq_deprec_speed[True-backward] | 5.0145ms | 4.4619ms | 224.1214 Ops/s |
| test_redq_deprec_speed[reduce-overhead-None] | 2.7031ms | 2.6142ms | 382.5222 Ops/s |
| test_redq_deprec_speed[reduce-overhead-backward] | 4.7701ms | 4.4689ms | 223.7680 Ops/s |
| test_td3_speed[False-None] | 8.0751ms | 8.0022ms | 124.9658 Ops/s |
| test_td3_speed[False-backward] | 11.0916ms | 10.5724ms | 94.5862 Ops/s |
| test_td3_speed[True-None] | 1.6615ms | 1.6380ms | 610.4989 Ops/s |
| test_td3_speed[True-backward] | 3.3597ms | 3.3241ms | 300.8375 Ops/s |
| test_td3_speed[reduce-overhead-None] | 50.0443ms | 25.9080ms | 38.5981 Ops/s |
| test_td3_speed[reduce-overhead-backward] | 1.3792ms | 1.3269ms | 753.6145 Ops/s |
| test_cql_speed[False-None] | 17.2441ms | 16.8383ms | 59.3885 Ops/s |
| test_cql_speed[False-backward] | 22.8537ms | 22.0380ms | 45.3761 Ops/s |
| test_cql_speed[True-None] | 3.3658ms | 3.2420ms | 308.4472 Ops/s |
| test_cql_speed[True-backward] | 5.9255ms | 5.4975ms | 181.9007 Ops/s |
| test_cql_speed[reduce-overhead-None] | 21.3306ms | 13.1303ms | 76.1596 Ops/s |
| test_cql_speed[reduce-overhead-backward] | 1.9719ms | 1.8254ms | 547.8242 Ops/s |
| test_a2c_speed[False-None] | 3.2825ms | 3.1955ms | 312.9389 Ops/s |
| test_a2c_speed[False-backward] | 6.7719ms | 6.1296ms | 163.1418 Ops/s |
| test_a2c_speed[True-None] | 1.3966ms | 1.3353ms | 748.9004 Ops/s |
| test_a2c_speed[True-backward] | 3.1022ms | 2.9412ms | 340.0020 Ops/s |
| test_a2c_speed[reduce-overhead-None] | 15.9627ms | 9.0269ms | 110.7803 Ops/s |
| test_a2c_speed[reduce-overhead-backward] | 1.5144ms | 1.4459ms | 691.6338 Ops/s |
| test_ppo_speed[False-None] | 3.9695ms | 3.7253ms | 268.4351 Ops/s |
| test_ppo_speed[False-backward] | 7.2928ms | 6.8365ms | 146.2731 Ops/s |
| test_ppo_speed[True-None] | 1.5508ms | 1.3961ms | 716.2734 Ops/s |
| test_ppo_speed[True-backward] | 3.1918ms | 3.0544ms | 327.3936 Ops/s |
| test_ppo_speed[reduce-overhead-None] | 1.0369ms | 0.9446ms | 1.0586 KOps/s |
| test_ppo_speed[reduce-overhead-backward] | 1.4865ms | 1.3835ms | 722.8077 Ops/s |
| test_reinforce_speed[False-None] | 2.4382ms | 2.2902ms | 436.6378 Ops/s |
| test_reinforce_speed[False-backward] | 3.7248ms | 3.3021ms | 302.8382 Ops/s |
| test_reinforce_speed[True-None] | 1.3600ms | 1.2953ms | 772.0055 Ops/s |
| test_reinforce_speed[True-backward] | 2.9931ms | 2.9183ms | 342.6618 Ops/s |
| test_reinforce_speed[reduce-overhead-None] | 18.7716ms | 10.0968ms | 99.0409 Ops/s |
| test_reinforce_speed[reduce-overhead-backward] | 1.5161ms | 1.4618ms | 684.0742 Ops/s |
| test_iql_speed[False-None] | 9.8819ms | 9.2179ms | 108.4844 Ops/s |
| test_iql_speed[False-backward] | 13.3172ms | 12.8136ms | 78.0422 Ops/s |
| test_iql_speed[True-None] | 2.4319ms | 2.2223ms | 449.9757 Ops/s |
| test_iql_speed[True-backward] | 4.9111ms | 4.7413ms | 210.9120 Ops/s |
| test_iql_speed[reduce-overhead-None] | 18.8031ms | 11.1148ms | 89.9698 Ops/s |
| test_iql_speed[reduce-overhead-backward] | 1.9036ms | 1.8607ms | 537.4179 Ops/s |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.8481ms | 6.3387ms | 157.7619 Ops/s |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.5089ms | 0.2684ms | 3.7254 KOps/s |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.4991ms | 0.2531ms | 3.9505 KOps/s |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.3341ms | 6.0598ms | 165.0222 Ops/s |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.1593ms | 0.2939ms | 3.4025 KOps/s |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6569ms | 0.2721ms | 3.6755 KOps/s |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6118ms | 1.2796ms | 781.4722 Ops/s |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.5123ms | 1.1813ms | 846.5505 Ops/s |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.5628ms | 6.2486ms | 160.0365 Ops/s |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1050ms | 0.4846ms | 2.0636 KOps/s |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8113ms | 0.4418ms | 2.2635 KOps/s |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.3898ms | 6.0937ms | 164.1048 Ops/s |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.9013ms | 0.3236ms | 3.0905 KOps/s |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5474ms | 0.3349ms | 2.9859 KOps/s |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.4393ms | 6.0303ms | 165.8303 Ops/s |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9445ms | 0.2738ms | 3.6521 KOps/s |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4777ms | 0.2511ms | 3.9824 KOps/s |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.4535ms | 6.2674ms | 159.5547 Ops/s |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.1793ms | 0.4873ms | 2.0522 KOps/s |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6357ms | 0.4664ms | 2.1441 KOps/s |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 7.0841ms | 5.4550ms | 183.3166 Ops/s |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 9.4709ms | 2.0628ms | 484.7874 Ops/s |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.1971ms | 1.2236ms | 817.2491 Ops/s |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 7.0094ms | 5.5515ms | 180.1320 Ops/s |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 9.6355ms | 2.0658ms | 484.0711 Ops/s |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.7645ms | 1.1629ms | 859.9444 Ops/s |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.4917s | 15.3742ms | 65.0441 Ops/s |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 6.6973ms | 2.2018ms | 454.1730 Ops/s |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 9.2483ms | 1.4223ms | 703.0699 Ops/s |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 13.5134ms | 13.0639ms | 76.5466 Ops/s |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 17.9074ms | 16.4433ms | 60.8149 Ops/s |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 17.9084ms | 17.7121ms | 56.4587 Ops/s |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 18.1396ms | 16.6922ms | 59.9081 Ops/s |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 17.9318ms | 17.5744ms | 56.9009 Ops/s |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 19.3077ms | 18.0148ms | 55.5098 Ops/s |
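
In these results the Ops column tracks the reciprocal of the Mean column, so the two can be cross-checked against each other. The following is a minimal sketch of that check, assuming the duration suffixes seen in the table (`s`, `ms`, `μs`). The function names are hypothetical, and because the printed means are rounded, the reciprocal only matches the reported throughput approximately.

```python
# Hypothetical helpers for cross-checking the Mean and Ops columns.
# Both the Greek mu (U+03BC) and the micro sign (U+00B5) are accepted, since
# either may appear in copied benchmark output.
DURATION_SCALE = {"\u03bcs": 1e-6, "\u00b5s": 1e-6, "ms": 1e-3, "s": 1.0}


def parse_seconds(text: str) -> float:
    """Convert a duration such as '39.5076μs' or '0.7538s' to seconds."""
    # Two-character suffixes are tested first so 'ms' is not read as 's'.
    for unit in ("\u03bcs", "\u00b5s", "ms", "s"):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * DURATION_SCALE[unit]
    raise ValueError(f"unrecognized duration: {text!r}")


def ops_per_second(mean: str) -> float:
    """Throughput implied by a mean duration, in operations per second."""
    return 1.0 / parse_seconds(mean)


if __name__ == "__main__":
    # Mean values copied from the GPU table above.
    for name, mean in [
        ("test_simple", "0.7538s"),
        ("test_step_mdp_speed[True-True-True-True-True]", "39.5076\u03bcs"),
    ]:
        print(f"{name}: {ops_per_second(mean):.4f} Ops/s")
```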
