[BugFix] Compatibility with non-tensor inputs in CudaGraphModule #1039

Merged: 7 commits, Oct 11, 2024

@vmoens commented Oct 11, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 11, 2024
ghstack-source-id: 3965461dd9b4b3684cf2013093797d5306c11008
Pull Request resolved: #1039
@facebook-github-bot added the "CLA Signed" label Oct 11, 2024
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 11, 2024
ghstack-source-id: 756234dd11739d521ca04ac42cb0d83a1d361b6d
Pull Request resolved: #1039
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 11, 2024
ghstack-source-id: 3eff6c24b0fa381665823deb5a1efdb7d2cc1bd2
Pull Request resolved: #1039
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 11, 2024
ghstack-source-id: f6f296cdcf00498cae4be818f15eec75309906ce
Pull Request resolved: #1039
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 11, 2024
ghstack-source-id: f09bb358e4941818dad4a0ae1e4d48d347492d6f
Pull Request resolved: #1039
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 11, 2024
ghstack-source-id: 61e14db945864a8bcf5c19934b7761b8a94d1fc8
Pull Request resolved: #1039
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 11, 2024
ghstack-source-id: f5a48452c26ae0c28399355573fe0458e402574c
Pull Request resolved: #1039
@vmoens merged commit efce95b into gh/vmoens/29/base on Oct 11, 2024
23 of 28 checks passed
vmoens added a commit that referenced this pull request Oct 11, 2024
ghstack-source-id: f5a48452c26ae0c28399355573fe0458e402574c
Pull Request resolved: #1039
@vmoens deleted the gh/vmoens/29/head branch on October 11, 2024, 10:09

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 216. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}38$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 68.6680μs 25.5609μs 39.1222 KOps/s 41.5688 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_plain_set_stack_nested 90.6830μs 26.1473μs 38.2449 KOps/s 41.0413 KOps/s $\textbf{\color{#d91a1a}-6.81\%}$
test_plain_set_nested_inplace 90.4160μs 28.1242μs 35.5565 KOps/s 37.8238 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_plain_set_stack_nested_inplace 68.2080μs 28.5149μs 35.0694 KOps/s 37.8899 KOps/s $\textbf{\color{#d91a1a}-7.44\%}$
test_items 42.5390μs 4.1822μs 239.1058 KOps/s 234.9301 KOps/s $\color{#35bf28}+1.78\%$
test_items_nested 0.5270ms 0.3860ms 2.5907 KOps/s 2.5815 KOps/s $\color{#35bf28}+0.36\%$
test_items_nested_locked 0.7049ms 0.3879ms 2.5779 KOps/s 2.5746 KOps/s $\color{#35bf28}+0.13\%$
test_items_nested_leaf 0.1532ms 81.7796μs 12.2280 KOps/s 12.4404 KOps/s $\color{#d91a1a}-1.71\%$
test_items_stack_nested 0.6681ms 0.3884ms 2.5749 KOps/s 2.5679 KOps/s $\color{#35bf28}+0.27\%$
test_items_stack_nested_leaf 0.1524ms 83.3459μs 11.9982 KOps/s 11.9533 KOps/s $\color{#35bf28}+0.38\%$
test_items_stack_nested_locked 0.8341ms 0.3876ms 2.5801 KOps/s 2.5399 KOps/s $\color{#35bf28}+1.58\%$
test_keys 36.2750μs 3.5054μs 285.2717 KOps/s 286.5756 KOps/s $\color{#d91a1a}-0.46\%$
test_keys_nested 0.2553ms 0.1398ms 7.1506 KOps/s 7.3854 KOps/s $\color{#d91a1a}-3.18\%$
test_keys_nested_locked 1.4961ms 0.1397ms 7.1588 KOps/s 7.0919 KOps/s $\color{#35bf28}+0.94\%$
test_keys_nested_leaf 0.1837ms 0.1177ms 8.4939 KOps/s 8.3937 KOps/s $\color{#35bf28}+1.19\%$
test_keys_stack_nested 0.2543ms 0.1355ms 7.3815 KOps/s 7.4291 KOps/s $\color{#d91a1a}-0.64\%$
test_keys_stack_nested_leaf 0.2079ms 0.1183ms 8.4548 KOps/s 8.5049 KOps/s $\color{#d91a1a}-0.59\%$
test_keys_stack_nested_locked 0.2582ms 0.1444ms 6.9229 KOps/s 7.1095 KOps/s $\color{#d91a1a}-2.62\%$
test_values 9.0548μs 1.0426μs 959.1143 KOps/s 946.4509 KOps/s $\color{#35bf28}+1.34\%$
test_values_nested 0.1722ms 94.9326μs 10.5338 KOps/s 10.9274 KOps/s $\color{#d91a1a}-3.60\%$
test_values_nested_locked 0.1580ms 94.6643μs 10.5636 KOps/s 10.7749 KOps/s $\color{#d91a1a}-1.96\%$
test_values_nested_leaf 0.1480ms 81.2121μs 12.3134 KOps/s 12.3745 KOps/s $\color{#d91a1a}-0.49\%$
test_values_stack_nested 0.1587ms 96.8799μs 10.3221 KOps/s 10.6034 KOps/s $\color{#d91a1a}-2.65\%$
test_values_stack_nested_leaf 0.1881ms 80.8342μs 12.3710 KOps/s 12.5551 KOps/s $\color{#d91a1a}-1.47\%$
test_values_stack_nested_locked 0.1576ms 95.1402μs 10.5108 KOps/s 10.3305 KOps/s $\color{#35bf28}+1.75\%$
test_membership 37.6100μs 0.9122μs 1.0962 MOps/s 1.1124 MOps/s $\color{#d91a1a}-1.46\%$
test_membership_nested 37.7000μs 2.8388μs 352.2670 KOps/s 364.4237 KOps/s $\color{#d91a1a}-3.34\%$
test_membership_nested_leaf 51.1860μs 2.8331μs 352.9666 KOps/s 362.3185 KOps/s $\color{#d91a1a}-2.58\%$
test_membership_stacked_nested 17.6430μs 2.8411μs 351.9819 KOps/s 363.7218 KOps/s $\color{#d91a1a}-3.23\%$
test_membership_stacked_nested_leaf 38.9390μs 2.8625μs 349.3403 KOps/s 360.0837 KOps/s $\color{#d91a1a}-2.98\%$
test_membership_nested_last 37.3500μs 4.3139μs 231.8083 KOps/s 238.8381 KOps/s $\color{#d91a1a}-2.94\%$
test_membership_nested_leaf_last 41.7880μs 4.2854μs 233.3489 KOps/s 239.3048 KOps/s $\color{#d91a1a}-2.49\%$
test_membership_stacked_nested_last 37.4400μs 5.6021μs 178.5046 KOps/s 136.5082 KOps/s $\textbf{\color{#35bf28}+30.76\%}$
test_membership_stacked_nested_leaf_last 48.4410μs 5.5444μs 180.3617 KOps/s 135.7152 KOps/s $\textbf{\color{#35bf28}+32.90\%}$
test_nested_getleaf 47.6630μs 10.4839μs 95.3841 KOps/s 92.7710 KOps/s $\color{#35bf28}+2.82\%$
test_nested_get 48.1000μs 9.9818μs 100.1823 KOps/s 98.3399 KOps/s $\color{#35bf28}+1.87\%$
test_stacked_getleaf 52.3380μs 10.5587μs 94.7084 KOps/s 94.4509 KOps/s $\color{#35bf28}+0.27\%$
test_stacked_get 46.7180μs 10.0954μs 99.0550 KOps/s 98.6173 KOps/s $\color{#35bf28}+0.44\%$
test_nested_getitemleaf 47.6490μs 11.0752μs 90.2915 KOps/s 89.8590 KOps/s $\color{#35bf28}+0.48\%$
test_nested_getitem 56.7870μs 10.3668μs 96.4616 KOps/s 96.4934 KOps/s $\color{#d91a1a}-0.03\%$
test_stacked_getitemleaf 53.0290μs 11.0686μs 90.3456 KOps/s 91.2466 KOps/s $\color{#d91a1a}-0.99\%$
test_stacked_getitem 58.6200μs 10.3961μs 96.1902 KOps/s 97.2583 KOps/s $\color{#d91a1a}-1.10\%$
test_lock_nested 89.4535ms 0.5997ms 1.6675 KOps/s 1.9778 KOps/s $\textbf{\color{#d91a1a}-15.69\%}$
test_lock_stack_nested 0.8378ms 0.4730ms 2.1141 KOps/s 2.1290 KOps/s $\color{#d91a1a}-0.70\%$
test_unlock_nested 99.7273ms 0.5322ms 1.8792 KOps/s 2.3616 KOps/s $\textbf{\color{#d91a1a}-20.43\%}$
test_unlock_stack_nested 0.6320ms 0.3850ms 2.5972 KOps/s 2.5791 KOps/s $\color{#35bf28}+0.70\%$
test_flatten_speed 0.1928ms 0.1019ms 9.8111 KOps/s 10.0484 KOps/s $\color{#d91a1a}-2.36\%$
test_unflatten_speed 0.9445ms 0.5163ms 1.9367 KOps/s 1.9405 KOps/s $\color{#d91a1a}-0.20\%$
test_common_ops 3.7999ms 1.2220ms 818.3106 Ops/s 898.5405 Ops/s $\textbf{\color{#d91a1a}-8.93\%}$
test_creation 36.0570μs 2.1618μs 462.5734 KOps/s 472.4821 KOps/s $\color{#d91a1a}-2.10\%$
test_creation_empty 54.5220μs 21.6864μs 46.1118 KOps/s 56.1588 KOps/s $\textbf{\color{#d91a1a}-17.89\%}$
test_creation_nested_1 75.5920μs 25.1915μs 39.6960 KOps/s 48.4045 KOps/s $\textbf{\color{#d91a1a}-17.99\%}$
test_creation_nested_2 0.1013ms 29.3001μs 34.1296 KOps/s 39.3737 KOps/s $\textbf{\color{#d91a1a}-13.32\%}$
test_clone 0.1469ms 17.2667μs 57.9149 KOps/s 56.3933 KOps/s $\color{#35bf28}+2.70\%$
test_getitem[int] 1.2246ms 16.9448μs 59.0153 KOps/s 59.8372 KOps/s $\color{#d91a1a}-1.37\%$
test_getitem[slice_int] 0.1646ms 31.2298μs 32.0207 KOps/s 33.1821 KOps/s $\color{#d91a1a}-3.50\%$
test_getitem[range] 0.2556ms 57.0881μs 17.5168 KOps/s 17.4704 KOps/s $\color{#35bf28}+0.27\%$
test_getitem[tuple] 0.1425ms 25.4526μs 39.2888 KOps/s 40.0696 KOps/s $\color{#d91a1a}-1.95\%$
test_getitem[list] 0.3315ms 52.5512μs 19.0291 KOps/s 18.8991 KOps/s $\color{#35bf28}+0.69\%$
test_setitem_dim[int] 82.1750μs 33.4065μs 29.9343 KOps/s 30.8317 KOps/s $\color{#d91a1a}-2.91\%$
test_setitem_dim[slice_int] 0.1117ms 61.2164μs 16.3355 KOps/s 16.4271 KOps/s $\color{#d91a1a}-0.56\%$
test_setitem_dim[range] 0.1354ms 84.1958μs 11.8771 KOps/s 12.0553 KOps/s $\color{#d91a1a}-1.48\%$
test_setitem_dim[tuple] 0.1177ms 49.4817μs 20.2095 KOps/s 20.4041 KOps/s $\color{#d91a1a}-0.95\%$
test_setitem 0.1773ms 31.8906μs 31.3572 KOps/s 33.0976 KOps/s $\textbf{\color{#d91a1a}-5.26\%}$
test_set 0.1633ms 30.9857μs 32.2730 KOps/s 34.3446 KOps/s $\textbf{\color{#d91a1a}-6.03\%}$
test_set_shared 3.9570ms 0.2221ms 4.5015 KOps/s 4.6107 KOps/s $\color{#d91a1a}-2.37\%$
test_update 0.2035ms 40.3438μs 24.7870 KOps/s 27.2428 KOps/s $\textbf{\color{#d91a1a}-9.01\%}$
test_update_nested 0.1957ms 50.6821μs 19.7308 KOps/s 20.9333 KOps/s $\textbf{\color{#d91a1a}-5.74\%}$
test_update__nested 0.4235ms 45.5109μs 21.9728 KOps/s 22.1768 KOps/s $\color{#d91a1a}-0.92\%$
test_set_nested 0.1718ms 33.7762μs 29.6067 KOps/s 31.2477 KOps/s $\textbf{\color{#d91a1a}-5.25\%}$
test_set_nested_new 0.1635ms 38.4937μs 25.9783 KOps/s 27.1531 KOps/s $\color{#d91a1a}-4.33\%$
test_select 0.2140ms 58.0130μs 17.2375 KOps/s 18.0311 KOps/s $\color{#d91a1a}-4.40\%$
test_select_nested 0.1223ms 59.8790μs 16.7004 KOps/s 16.8351 KOps/s $\color{#d91a1a}-0.80\%$
test_exclude_nested 0.1548ms 76.9315μs 12.9986 KOps/s 13.3060 KOps/s $\color{#d91a1a}-2.31\%$
test_empty[True] 0.6459ms 0.3577ms 2.7954 KOps/s 2.8330 KOps/s $\color{#d91a1a}-1.32\%$
test_empty[False] 10.3870μs 1.2660μs 789.8949 KOps/s 805.6344 KOps/s $\color{#d91a1a}-1.95\%$
test_unbind_speed 0.5711ms 0.3044ms 3.2851 KOps/s 3.2731 KOps/s $\color{#35bf28}+0.37\%$
test_unbind_speed_stack0 0.6610ms 0.2957ms 3.3814 KOps/s 3.4083 KOps/s $\color{#d91a1a}-0.79\%$
test_unbind_speed_stack1 96.5183ms 0.8092ms 1.2358 KOps/s 1.3589 KOps/s $\textbf{\color{#d91a1a}-9.06\%}$
test_split 91.1060ms 2.1886ms 456.9046 Ops/s 456.5758 Ops/s $\color{#35bf28}+0.07\%$
test_chunk 3.1672ms 2.0099ms 497.5456 Ops/s 455.3382 Ops/s $\textbf{\color{#35bf28}+9.27\%}$
test_creation[device0] 3.3161ms 0.1201ms 8.3270 KOps/s 8.7671 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_creation_from_tensor 0.3255ms 0.1172ms 8.5325 KOps/s 8.5932 KOps/s $\color{#d91a1a}-0.71\%$
test_add_one[memmap_tensor0] 0.1748ms 6.9227μs 144.4519 KOps/s 137.5889 KOps/s $\color{#35bf28}+4.99\%$
test_contiguous[memmap_tensor0] 33.8330μs 1.9477μs 513.4239 KOps/s 533.9819 KOps/s $\color{#d91a1a}-3.85\%$
test_stack[memmap_tensor0] 61.5260μs 5.5022μs 181.7439 KOps/s 177.3345 KOps/s $\color{#35bf28}+2.49\%$
test_memmaptd_index 0.6604ms 0.4139ms 2.4160 KOps/s 2.4659 KOps/s $\color{#d91a1a}-2.02\%$
test_memmaptd_index_astensor 1.0218ms 0.5193ms 1.9256 KOps/s 1.9606 KOps/s $\color{#d91a1a}-1.78\%$
test_memmaptd_index_op 1.6408ms 1.1209ms 892.1718 Ops/s 968.5715 Ops/s $\textbf{\color{#d91a1a}-7.89\%}$
test_serialize_model 0.2386s 0.1337s 7.4811 Ops/s 8.4839 Ops/s $\textbf{\color{#d91a1a}-11.82\%}$
test_serialize_model_pickle 0.4477s 0.3950s 2.5314 Ops/s 2.4156 Ops/s $\color{#35bf28}+4.80\%$
test_serialize_weights 0.1238s 0.1155s 8.6598 Ops/s 8.2248 Ops/s $\textbf{\color{#35bf28}+5.29\%}$
test_serialize_weights_returnearly 0.2639s 0.1727s 5.7908 Ops/s 5.5926 Ops/s $\color{#35bf28}+3.55\%$
test_serialize_weights_pickle 1.0130s 0.6918s 1.4454 Ops/s 2.5175 Ops/s $\textbf{\color{#d91a1a}-42.59\%}$
test_serialize_weights_filesystem 0.1506s 0.1431s 6.9882 Ops/s 7.1176 Ops/s $\color{#d91a1a}-1.82\%$
test_serialize_model_filesystem 0.1559s 0.1443s 6.9319 Ops/s 6.6802 Ops/s $\color{#35bf28}+3.77\%$
test_reshape_pytree 85.0590μs 38.6935μs 25.8441 KOps/s 25.3629 KOps/s $\color{#35bf28}+1.90\%$
test_reshape_td 95.3090μs 44.9784μs 22.2329 KOps/s 22.0480 KOps/s $\color{#35bf28}+0.84\%$
test_view_pytree 95.3790μs 38.9839μs 25.6516 KOps/s 26.0911 KOps/s $\color{#d91a1a}-1.68\%$
test_view_td 0.1137ms 52.1441μs 19.1776 KOps/s 19.3224 KOps/s $\color{#d91a1a}-0.75\%$
test_unbind_pytree 79.1890μs 35.9758μs 27.7965 KOps/s 27.6849 KOps/s $\color{#35bf28}+0.40\%$
test_unbind_td 0.3032ms 45.2341μs 22.1072 KOps/s 21.9040 KOps/s $\color{#35bf28}+0.93\%$
test_split_pytree 91.8120μs 37.8812μs 26.3983 KOps/s 25.8690 KOps/s $\color{#35bf28}+2.05\%$
test_split_td 0.4998ms 57.7244μs 17.3237 KOps/s 17.2897 KOps/s $\color{#35bf28}+0.20\%$
test_add_pytree 0.1019ms 44.8682μs 22.2875 KOps/s 22.1637 KOps/s $\color{#35bf28}+0.56\%$
test_add_td 0.2580ms 90.4067μs 11.0611 KOps/s 11.8552 KOps/s $\textbf{\color{#d91a1a}-6.70\%}$
test_compile_add_one_nested[tensordict-compile] 0.1543ms 59.3970μs 16.8359 KOps/s 17.4008 KOps/s $\color{#d91a1a}-3.25\%$
test_compile_add_one_nested[tensordict-eager] 0.4438ms 0.1943ms 5.1479 KOps/s 5.0858 KOps/s $\color{#35bf28}+1.22\%$
test_compile_add_one_nested[pytree-compile] 0.1292ms 56.6683μs 17.6466 KOps/s 17.6242 KOps/s $\color{#35bf28}+0.13\%$
test_compile_add_one_nested[pytree-eager] 0.3300ms 0.1368ms 7.3089 KOps/s 7.1393 KOps/s $\color{#35bf28}+2.38\%$
test_compile_copy_nested[tensordict-compile] 58.8610μs 23.4061μs 42.7238 KOps/s 41.7495 KOps/s $\color{#35bf28}+2.33\%$
test_compile_copy_nested[tensordict-eager] 0.1758ms 76.4328μs 13.0834 KOps/s 13.3360 KOps/s $\color{#d91a1a}-1.89\%$
test_compile_copy_nested[pytree-compile] 0.1550ms 75.4810μs 13.2484 KOps/s 12.9833 KOps/s $\color{#35bf28}+2.04\%$
test_compile_copy_nested[pytree-eager] 0.1477ms 69.0063μs 14.4914 KOps/s 14.5170 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_add_one_flat[tensordict-compile] 0.2690ms 0.1859ms 5.3795 KOps/s 5.5532 KOps/s $\color{#d91a1a}-3.13\%$
test_compile_add_one_flat[tensordict-eager] 0.4361ms 0.2415ms 4.1401 KOps/s 4.1599 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_add_one_flat[tensorclass-compile] 0.1195ms 49.5013μs 20.2015 KOps/s 21.0713 KOps/s $\color{#d91a1a}-4.13\%$
test_compile_add_one_flat[tensorclass-eager] 0.6788ms 76.5115μs 13.0699 KOps/s 12.7764 KOps/s $\color{#35bf28}+2.30\%$
test_compile_add_one_flat[pytree-compile] 0.2779ms 0.1731ms 5.7756 KOps/s 5.7547 KOps/s $\color{#35bf28}+0.36\%$
test_compile_add_one_flat[pytree-eager] 0.3792ms 0.2801ms 3.5695 KOps/s 3.4207 KOps/s $\color{#35bf28}+4.35\%$
test_compile_add_self_flat[tensordict-eager] 0.5562ms 0.2781ms 3.5953 KOps/s 3.6315 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_add_self_flat[tensordict-compile] 0.6518ms 0.1873ms 5.3399 KOps/s 5.5452 KOps/s $\color{#d91a1a}-3.70\%$
test_compile_add_self_flat[tensorclass-eager] 0.1863ms 76.5071μs 13.0707 KOps/s 13.4980 KOps/s $\color{#d91a1a}-3.17\%$
test_compile_add_self_flat[tensorclass-compile] 0.1251ms 49.3500μs 20.2634 KOps/s 20.6997 KOps/s $\color{#d91a1a}-2.11\%$
test_compile_add_self_flat[pytree-eager] 0.4994ms 0.2287ms 4.3724 KOps/s 4.2532 KOps/s $\color{#35bf28}+2.80\%$
test_compile_add_self_flat[pytree-compile] 0.2387ms 0.1754ms 5.7007 KOps/s 5.7733 KOps/s $\color{#d91a1a}-1.26\%$
test_compile_copy_flat[tensordict-compile] 0.2100ms 0.1108ms 9.0244 KOps/s 9.0277 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_copy_flat[tensordict-eager] 0.1572ms 82.8723μs 12.0668 KOps/s 12.1751 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_copy_flat[pytree-compile] 0.1471ms 77.9259μs 12.8327 KOps/s 12.3232 KOps/s $\color{#35bf28}+4.13\%$
test_compile_copy_flat[pytree-eager] 0.1698ms 69.9335μs 14.2993 KOps/s 13.6020 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_compile_assign_and_add[tensordict-compile] 0.2951ms 0.1933ms 5.1745 KOps/s 5.1878 KOps/s $\color{#d91a1a}-0.26\%$
test_compile_assign_and_add[tensordict-eager] 2.0485ms 1.7646ms 566.7028 Ops/s 558.8199 Ops/s $\color{#35bf28}+1.41\%$
test_compile_assign_and_add[pytree-compile] 0.3601ms 0.1892ms 5.2862 KOps/s 5.2120 KOps/s $\color{#35bf28}+1.42\%$
test_compile_assign_and_add[pytree-eager] 1.3326ms 1.0641ms 939.7382 Ops/s 906.8611 Ops/s $\color{#35bf28}+3.63\%$
test_compile_assign_and_add_stack[compile] 0.5336ms 0.4097ms 2.4408 KOps/s 2.4296 KOps/s $\color{#35bf28}+0.46\%$
test_compile_assign_and_add_stack[eager] 6.4719ms 4.2142ms 237.2906 Ops/s 255.8997 Ops/s $\textbf{\color{#d91a1a}-7.27\%}$
test_compile_indexing[tensor-tensordict-compile] 99.4360μs 34.2576μs 29.1906 KOps/s 29.3765 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_indexing[tensor-tensordict-eager] 1.0700ms 47.9226μs 20.8670 KOps/s 21.0265 KOps/s $\color{#d91a1a}-0.76\%$
test_compile_indexing[tensor-tensorclass-compile] 99.5970μs 29.7912μs 33.5669 KOps/s 33.9720 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_indexing[tensor-tensorclass-eager] 96.5610μs 29.6479μs 33.7292 KOps/s 33.5001 KOps/s $\color{#35bf28}+0.68\%$
test_compile_indexing[tensor-pytree-compile] 98.8050μs 29.4615μs 33.9426 KOps/s 34.4206 KOps/s $\color{#d91a1a}-1.39\%$
test_compile_indexing[tensor-pytree-eager] 0.1061ms 29.4809μs 33.9203 KOps/s 34.6707 KOps/s $\color{#d91a1a}-2.16\%$
test_compile_indexing[slice-tensordict-compile] 0.1804ms 74.6895μs 13.3888 KOps/s 13.6045 KOps/s $\color{#d91a1a}-1.59\%$
test_compile_indexing[slice-tensordict-eager] 0.5405ms 27.8492μs 35.9077 KOps/s 35.2151 KOps/s $\color{#35bf28}+1.97\%$
test_compile_indexing[slice-tensorclass-compile] 0.1491ms 68.4474μs 14.6098 KOps/s 14.6393 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[slice-tensorclass-eager] 86.6620μs 22.9814μs 43.5134 KOps/s 42.8390 KOps/s $\color{#35bf28}+1.57\%$
test_compile_indexing[slice-pytree-compile] 0.1601ms 68.1011μs 14.6841 KOps/s 14.7263 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_indexing[slice-pytree-eager] 75.1610μs 22.8500μs 43.7637 KOps/s 42.3332 KOps/s $\color{#35bf28}+3.38\%$
test_compile_indexing[int-tensordict-compile] 0.1715ms 72.7920μs 13.7378 KOps/s 13.4590 KOps/s $\color{#35bf28}+2.07\%$
test_compile_indexing[int-tensordict-eager] 0.3984s 43.4155μs 23.0333 KOps/s 36.7152 KOps/s $\textbf{\color{#d91a1a}-37.27\%}$
test_compile_indexing[int-tensorclass-compile] 0.1286ms 67.9327μs 14.7204 KOps/s 14.7863 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_indexing[int-tensorclass-eager] 75.4320μs 22.5543μs 44.3375 KOps/s 43.0920 KOps/s $\color{#35bf28}+2.89\%$
test_compile_indexing[int-pytree-compile] 0.1441ms 67.2737μs 14.8647 KOps/s 14.8154 KOps/s $\color{#35bf28}+0.33\%$
test_compile_indexing[int-pytree-eager] 65.0910μs 22.7458μs 43.9642 KOps/s 43.1101 KOps/s $\color{#35bf28}+1.98\%$
test_mod_add[eager] 81.0420μs 26.7137μs 37.4339 KOps/s 39.5779 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_mod_add[compile] 82.6350μs 38.1781μs 26.1930 KOps/s 26.6170 KOps/s $\color{#d91a1a}-1.59\%$
test_mod_add[compile-overhead] 0.1118ms 38.1698μs 26.1987 KOps/s 26.6869 KOps/s $\color{#d91a1a}-1.83\%$
test_mod_wrap[eager] 0.3740ms 0.2042ms 4.8975 KOps/s 4.8439 KOps/s $\color{#35bf28}+1.11\%$
test_mod_wrap[compile] 0.4345ms 0.2291ms 4.3640 KOps/s 4.3471 KOps/s $\color{#35bf28}+0.39\%$
test_mod_wrap[compile-overhead] 0.4312ms 0.2280ms 4.3852 KOps/s 4.4068 KOps/s $\color{#d91a1a}-0.49\%$
test_mod_wrap_and_backward[eager] 13.8316ms 11.8182ms 84.6153 Ops/s 90.7208 Ops/s $\textbf{\color{#d91a1a}-6.73\%}$
test_mod_wrap_and_backward[compile] 14.2215ms 11.4195ms 87.5695 Ops/s 91.8601 Ops/s $\color{#d91a1a}-4.67\%$
test_mod_wrap_and_backward[compile-overhead] 15.6912ms 12.3351ms 81.0698 Ops/s 92.1110 Ops/s $\textbf{\color{#d91a1a}-11.99\%}$
test_seq_add[eager] 0.1714ms 96.6755μs 10.3439 KOps/s 11.0069 KOps/s $\textbf{\color{#d91a1a}-6.02\%}$
test_seq_add[compile] 0.1349ms 64.1625μs 15.5854 KOps/s 15.4166 KOps/s $\color{#35bf28}+1.10\%$
test_seq_add[compile-overhead] 0.1405ms 63.7505μs 15.6862 KOps/s 15.4104 KOps/s $\color{#35bf28}+1.79\%$
test_seq_wrap[eager] 0.7068ms 0.3860ms 2.5909 KOps/s 2.6518 KOps/s $\color{#d91a1a}-2.30\%$
test_seq_wrap[compile] 0.4818ms 0.2641ms 3.7869 KOps/s 3.7308 KOps/s $\color{#35bf28}+1.50\%$
test_seq_wrap[compile-overhead] 0.4964ms 0.2636ms 3.7939 KOps/s 3.7605 KOps/s $\color{#35bf28}+0.89\%$
test_func_call_runtime[False-eager] 0.9878ms 0.5084ms 1.9671 KOps/s 1.8828 KOps/s $\color{#35bf28}+4.48\%$
test_func_call_runtime[False-compile] 0.6433ms 0.4891ms 2.0448 KOps/s 2.0304 KOps/s $\color{#35bf28}+0.71\%$
test_func_call_runtime[False-compile-overhead] 0.6603ms 0.4919ms 2.0329 KOps/s 2.0339 KOps/s $\color{#d91a1a}-0.05\%$
test_func_call_runtime[True-eager] 0.8584ms 0.7250ms 1.3793 KOps/s 1.3272 KOps/s $\color{#35bf28}+3.92\%$
test_func_call_runtime[True-compile] 0.7807ms 0.5082ms 1.9677 KOps/s 1.9675 KOps/s $\color{#35bf28}+0.01\%$
test_func_call_runtime[True-compile-overhead] 0.7129ms 0.5109ms 1.9574 KOps/s 1.9560 KOps/s $\color{#35bf28}+0.07\%$
test_func_call_cm_runtime[False-eager] 0.9305ms 0.5169ms 1.9348 KOps/s 1.9200 KOps/s $\color{#35bf28}+0.77\%$
test_func_call_cm_runtime[False-compile] 0.6114ms 0.4923ms 2.0311 KOps/s 2.0185 KOps/s $\color{#35bf28}+0.63\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6484ms 0.4976ms 2.0096 KOps/s 2.0228 KOps/s $\color{#d91a1a}-0.65\%$
test_func_call_cm_runtime[True-eager] 1.0805ms 0.8730ms 1.1454 KOps/s 1.1191 KOps/s $\color{#35bf28}+2.35\%$
test_func_call_cm_runtime[True-compile] 1.0016ms 0.7143ms 1.3999 KOps/s 1.3499 KOps/s $\color{#35bf28}+3.71\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0233ms 0.7109ms 1.4067 KOps/s 1.3476 KOps/s $\color{#35bf28}+4.39\%$
test_vmap_func_call_cm_runtime[eager] 2.5226ms 1.8812ms 531.5659 Ops/s 518.4103 Ops/s $\color{#35bf28}+2.54\%$
test_vmap_func_call_cm_runtime[compile] 2.6473ms 1.9369ms 516.2857 Ops/s 507.9148 Ops/s $\color{#35bf28}+1.65\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.6104ms 1.9308ms 517.9296 Ops/s 509.6742 Ops/s $\color{#35bf28}+1.62\%$
test_distributed 0.2952ms 0.1275ms 7.8420 KOps/s 7.7174 KOps/s $\color{#35bf28}+1.61\%$
test_tdmodule 87.5940μs 19.6569μs 50.8728 KOps/s 57.1917 KOps/s $\textbf{\color{#d91a1a}-11.05\%}$
test_tdmodule_dispatch 58.1990μs 38.8629μs 25.7315 KOps/s 28.4388 KOps/s $\textbf{\color{#d91a1a}-9.52\%}$
test_tdseq 59.8420μs 22.1425μs 45.1619 KOps/s 51.1099 KOps/s $\textbf{\color{#d91a1a}-11.64\%}$
test_tdseq_dispatch 0.1061ms 46.3251μs 21.5866 KOps/s 24.6488 KOps/s $\textbf{\color{#d91a1a}-12.42\%}$
test_instantiation_functorch 1.7108ms 1.5617ms 640.3273 Ops/s 627.5378 Ops/s $\color{#35bf28}+2.04\%$
test_exec_functorch 0.2721ms 0.1829ms 5.4685 KOps/s 5.5778 KOps/s $\color{#d91a1a}-1.96\%$
test_exec_functional_call 0.3031ms 0.1694ms 5.9040 KOps/s 5.7886 KOps/s $\color{#35bf28}+1.99\%$
test_exec_td_decorator 0.5168ms 0.2267ms 4.4105 KOps/s 4.2915 KOps/s $\color{#35bf28}+2.77\%$
test_vmap_mlp_speed_decorator[True-True] 0.8866ms 0.6395ms 1.5636 KOps/s 1.5739 KOps/s $\color{#d91a1a}-0.65\%$
test_vmap_mlp_speed_decorator[True-False] 1.0184ms 0.6420ms 1.5577 KOps/s 1.5631 KOps/s $\color{#d91a1a}-0.35\%$
test_vmap_mlp_speed_decorator[False-True] 0.9681ms 0.5307ms 1.8842 KOps/s 1.8891 KOps/s $\color{#d91a1a}-0.26\%$
test_vmap_mlp_speed_decorator[False-False] 0.7866ms 0.5208ms 1.9201 KOps/s 1.8782 KOps/s $\color{#35bf28}+2.23\%$
test_to_module_speed[True] 1.7326ms 1.4337ms 697.5130 Ops/s 704.9324 Ops/s $\color{#d91a1a}-1.05\%$
test_to_module_speed[False] 1.9690ms 1.4132ms 707.6301 Ops/s 723.3719 Ops/s $\color{#d91a1a}-2.18\%$
test_tc_init 0.1245ms 51.0511μs 19.5882 KOps/s 21.2275 KOps/s $\textbf{\color{#d91a1a}-7.72\%}$
test_tc_init_nested 0.1885ms 0.1037ms 9.6409 KOps/s 10.7345 KOps/s $\textbf{\color{#d91a1a}-10.19\%}$
test_tc_first_layer_tensor 43.8920μs 1.6058μs 622.7614 KOps/s 663.2911 KOps/s $\textbf{\color{#d91a1a}-6.11\%}$
test_tc_first_layer_nontensor 29.0440μs 4.9544μs 201.8401 KOps/s 211.4343 KOps/s $\color{#d91a1a}-4.54\%$
test_tc_second_layer_tensor 42.3700μs 2.9922μs 334.2058 KOps/s 356.0910 KOps/s $\textbf{\color{#d91a1a}-6.15\%}$
test_tc_second_layer_nontensor 35.9170μs 6.3340μs 157.8789 KOps/s 164.2914 KOps/s $\color{#d91a1a}-3.90\%$
test_unbind 0.4440s 13.0013ms 76.9155 Ops/s 75.2684 Ops/s $\color{#35bf28}+2.19\%$
test_full_like 8.5405ms 7.2319ms 138.2754 Ops/s 134.3570 Ops/s $\color{#35bf28}+2.92\%$
test_zeros_like 12.8500ms 7.3232ms 136.5517 Ops/s 351.0415 Ops/s $\textbf{\color{#d91a1a}-61.10\%}$
test_ones_like 15.1793ms 7.4877ms 133.5517 Ops/s 162.1008 Ops/s $\textbf{\color{#d91a1a}-17.61\%}$
test_clone 15.0942ms 8.9759ms 111.4099 Ops/s 125.0199 Ops/s $\textbf{\color{#d91a1a}-10.89\%}$
test_squeeze 61.4660μs 12.8262μs 77.9656 KOps/s 77.0240 KOps/s $\color{#35bf28}+1.22\%$
test_unsqueeze 0.1673ms 92.7590μs 10.7806 KOps/s 10.5117 KOps/s $\color{#35bf28}+2.56\%$
test_split 0.4916ms 0.1946ms 5.1384 KOps/s 5.0088 KOps/s $\color{#35bf28}+2.59\%$
test_permute 0.3569ms 0.2143ms 4.6673 KOps/s 4.4731 KOps/s $\color{#35bf28}+4.34\%$
test_stack 28.9969ms 24.2488ms 41.2391 Ops/s 39.1042 Ops/s $\textbf{\color{#35bf28}+5.46\%}$
test_cat 27.3352ms 24.0222ms 41.6282 Ops/s 39.3059 Ops/s $\textbf{\color{#35bf28}+5.91\%}$
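
The Change column in the table above appears to be the relative difference between Ops and Ops on Repo HEAD, and the bold rows seem to be those that move by roughly 5% or more, which is consistent with the Improved and Worsened totals. A minimal sketch under those assumptions follows; the function names and the 5% threshold are hypothetical, inferred from the table rather than taken from the benchmark tooling.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput of this PR relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% cutoff is an assumption inferred from the bold rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU table.
# Both values in a row share the same unit (KOps/s here), so the ratio is unitless.
rows = {
    "test_plain_set_nested": (39.1222, 41.5688),
    "test_items": (239.1058, 234.9301),
    "test_membership_stacked_nested_last": (178.5046, 136.5082),
}

for name, (ops, ops_head) in rows.items():
    pct = relative_change(ops, ops_head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Because the cutoff is applied to a single run, large swings such as `test_zeros_like` may reflect runner noise rather than a real regression; rerunning the benchmark is the usual way to tell the two apart.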


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 218. Improved: $\large\color{#35bf28}31$. Worsened: $\large\color{#d91a1a}12$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1638ms 16.6855μs 59.9321 KOps/s 55.1854 KOps/s $\textbf{\color{#35bf28}+8.60\%}$
test_plain_set_stack_nested 44.3010μs 16.7760μs 59.6088 KOps/s 54.5676 KOps/s $\textbf{\color{#35bf28}+9.24\%}$
test_plain_set_nested_inplace 51.6710μs 18.0707μs 55.3383 KOps/s 51.2363 KOps/s $\textbf{\color{#35bf28}+8.01\%}$
test_plain_set_stack_nested_inplace 51.2110μs 18.0208μs 55.4915 KOps/s 51.5282 KOps/s $\textbf{\color{#35bf28}+7.69\%}$
test_items 25.0100μs 2.9777μs 335.8285 KOps/s 329.3874 KOps/s $\color{#35bf28}+1.96\%$
test_items_nested 0.3820ms 0.3422ms 2.9226 KOps/s 2.9415 KOps/s $\color{#d91a1a}-0.64\%$
test_items_nested_locked 0.3976ms 0.3460ms 2.8904 KOps/s 2.9250 KOps/s $\color{#d91a1a}-1.18\%$
test_items_nested_leaf 97.8620μs 62.6628μs 15.9584 KOps/s 15.8679 KOps/s $\color{#35bf28}+0.57\%$
test_items_stack_nested 0.3993ms 0.3469ms 2.8823 KOps/s 2.9574 KOps/s $\color{#d91a1a}-2.54\%$
test_items_stack_nested_leaf 99.0730μs 64.3699μs 15.5352 KOps/s 15.8721 KOps/s $\color{#d91a1a}-2.12\%$
test_items_stack_nested_locked 0.3861ms 0.3476ms 2.8770 KOps/s 2.9382 KOps/s $\color{#d91a1a}-2.08\%$
test_keys 34.0510μs 3.5176μs 284.2809 KOps/s 288.2366 KOps/s $\color{#d91a1a}-1.37\%$
test_keys_nested 0.1435ms 71.9040μs 13.9074 KOps/s 14.0864 KOps/s $\color{#d91a1a}-1.27\%$
test_keys_nested_locked 2.3270ms 78.0250μs 12.8164 KOps/s 12.8048 KOps/s $\color{#35bf28}+0.09\%$
test_keys_nested_leaf 85.5220μs 63.0413μs 15.8626 KOps/s 16.1462 KOps/s $\color{#d91a1a}-1.76\%$
test_keys_stack_nested 0.1078ms 73.0579μs 13.6878 KOps/s 13.7224 KOps/s $\color{#d91a1a}-0.25\%$
test_keys_stack_nested_leaf 92.8020μs 64.9582μs 15.3945 KOps/s 15.7413 KOps/s $\color{#d91a1a}-2.20\%$
test_keys_stack_nested_locked 0.1115ms 78.8803μs 12.6774 KOps/s 12.8138 KOps/s $\color{#d91a1a}-1.06\%$
test_values 6.6368μs 0.8835μs 1.1319 MOps/s 1.1242 MOps/s $\color{#35bf28}+0.69\%$
test_values_nested 0.1162ms 48.8264μs 20.4807 KOps/s 20.3163 KOps/s $\color{#35bf28}+0.81\%$
test_values_nested_locked 72.6820μs 50.8498μs 19.6657 KOps/s 19.6397 KOps/s $\color{#35bf28}+0.13\%$
test_values_nested_leaf 65.9610μs 43.5726μs 22.9502 KOps/s 23.2398 KOps/s $\color{#d91a1a}-1.25\%$
test_values_stack_nested 74.2010μs 50.8001μs 19.6850 KOps/s 20.1578 KOps/s $\color{#d91a1a}-2.35\%$
test_values_stack_nested_leaf 80.2810μs 44.5285μs 22.4575 KOps/s 22.8818 KOps/s $\color{#d91a1a}-1.85\%$
test_values_stack_nested_locked 78.4120μs 52.5020μs 19.0469 KOps/s 19.3749 KOps/s $\color{#d91a1a}-1.69\%$
test_membership 1.7050μs 0.5351μs 1.8687 MOps/s 1.8775 MOps/s $\color{#d91a1a}-0.47\%$
test_membership_nested 19.2705μs 1.9833μs 504.2115 KOps/s 500.5065 KOps/s $\color{#35bf28}+0.74\%$
test_membership_nested_leaf 14.8600μs 1.9921μs 501.9769 KOps/s 513.0412 KOps/s $\color{#d91a1a}-2.16\%$
test_membership_stacked_nested 35.8310μs 2.0647μs 484.3360 KOps/s 499.8336 KOps/s $\color{#d91a1a}-3.10\%$
test_membership_stacked_nested_leaf 33.1610μs 2.0584μs 485.8164 KOps/s 494.3424 KOps/s $\color{#d91a1a}-1.72\%$
test_membership_nested_last 33.0810μs 3.1096μs 321.5813 KOps/s 324.3490 KOps/s $\color{#d91a1a}-0.85\%$
test_membership_nested_leaf_last 36.0010μs 3.1278μs 319.7144 KOps/s 326.1055 KOps/s $\color{#d91a1a}-1.96\%$
test_membership_stacked_nested_last 39.9710μs 4.4993μs 222.2549 KOps/s 327.2699 KOps/s $\textbf{\color{#d91a1a}-32.09\%}$
test_membership_stacked_nested_leaf_last 26.2310μs 4.4850μs 222.9636 KOps/s 325.4662 KOps/s $\textbf{\color{#d91a1a}-31.49\%}$
test_nested_getleaf 42.3410μs 6.1998μs 161.2947 KOps/s 160.4330 KOps/s $\color{#35bf28}+0.54\%$
test_nested_get 39.6510μs 5.8866μs 169.8779 KOps/s 169.4269 KOps/s $\color{#35bf28}+0.27\%$
test_stacked_getleaf 27.4310μs 6.2417μs 160.2133 KOps/s 160.6070 KOps/s $\color{#d91a1a}-0.25\%$
test_stacked_get 31.9010μs 5.8209μs 171.7957 KOps/s 173.0850 KOps/s $\color{#d91a1a}-0.74\%$
test_nested_getitemleaf 26.8310μs 6.3190μs 158.2532 KOps/s 159.0815 KOps/s $\color{#d91a1a}-0.52\%$
test_nested_getitem 41.4010μs 5.7831μs 172.9168 KOps/s 168.6792 KOps/s $\color{#35bf28}+2.51\%$
test_stacked_getitemleaf 25.2200μs 6.2621μs 159.6908 KOps/s 160.0689 KOps/s $\color{#d91a1a}-0.24\%$
test_stacked_getitem 36.9610μs 5.8546μs 170.8061 KOps/s 170.1766 KOps/s $\color{#35bf28}+0.37\%$
test_lock_nested 5.0274ms 0.4549ms 2.1984 KOps/s 2.3029 KOps/s $\color{#d91a1a}-4.53\%$
test_lock_stack_nested 0.4454ms 0.4077ms 2.4530 KOps/s 2.5164 KOps/s $\color{#d91a1a}-2.52\%$
test_unlock_nested 0.7779ms 0.3773ms 2.6501 KOps/s 2.7161 KOps/s $\color{#d91a1a}-2.43\%$
test_unlock_stack_nested 0.3734ms 0.3404ms 2.9379 KOps/s 3.0092 KOps/s $\color{#d91a1a}-2.37\%$
test_flatten_speed 0.1170ms 77.6010μs 12.8864 KOps/s 12.9764 KOps/s $\color{#d91a1a}-0.69\%$
test_unflatten_speed 0.3734ms 0.3311ms 3.0205 KOps/s 3.0725 KOps/s $\color{#d91a1a}-1.69\%$
test_common_ops 1.5734ms 1.2308ms 812.4669 Ops/s 778.7113 Ops/s $\color{#35bf28}+4.33\%$
test_creation 22.3800μs 1.5560μs 642.6835 KOps/s 651.5088 KOps/s $\color{#d91a1a}-1.35\%$
test_creation_empty 45.2010μs 15.1133μs 66.1669 KOps/s 54.6025 KOps/s $\textbf{\color{#35bf28}+21.18\%}$
test_creation_nested_1 44.0010μs 16.8857μs 59.2219 KOps/s 47.7056 KOps/s $\textbf{\color{#35bf28}+24.14\%}$
test_creation_nested_2 54.2610μs 19.6859μs 50.7978 KOps/s 41.9500 KOps/s $\textbf{\color{#35bf28}+21.09\%}$
test_clone 60.0310μs 27.5019μs 36.3612 KOps/s 35.9424 KOps/s $\color{#35bf28}+1.16\%$
test_getitem[int] 1.2710ms 16.1128μs 62.0624 KOps/s 63.3410 KOps/s $\color{#d91a1a}-2.02\%$
test_getitem[slice_int] 92.7539ms 38.8388μs 25.7474 KOps/s 36.8771 KOps/s $\textbf{\color{#d91a1a}-30.18\%}$
test_getitem[range] 0.2148ms 0.1042ms 9.5940 KOps/s 9.4816 KOps/s $\color{#35bf28}+1.19\%$
test_getitem[tuple] 0.1236ms 23.8936μs 41.8522 KOps/s 40.7321 KOps/s $\color{#35bf28}+2.75\%$
test_getitem[list] 0.1838ms 94.0290μs 10.6350 KOps/s 10.0418 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_setitem_dim[int] 67.1720μs 42.2750μs 23.6546 KOps/s 21.5672 KOps/s $\textbf{\color{#35bf28}+9.68\%}$
test_setitem_dim[slice_int] 90.4820μs 64.0936μs 15.6022 KOps/s 15.5505 KOps/s $\color{#35bf28}+0.33\%$
test_setitem_dim[range] 0.1529ms 0.1226ms 8.1568 KOps/s 8.1159 KOps/s $\color{#35bf28}+0.50\%$
test_setitem_dim[tuple] 85.9020μs 62.8826μs 15.9027 KOps/s 17.0291 KOps/s $\textbf{\color{#d91a1a}-6.61\%}$
test_setitem 88.7620μs 44.2428μs 22.6025 KOps/s 23.9670 KOps/s $\textbf{\color{#d91a1a}-5.69\%}$
test_set 85.6920μs 43.4240μs 23.0287 KOps/s 24.3297 KOps/s $\textbf{\color{#d91a1a}-5.35\%}$
test_set_shared 0.3498ms 54.5132μs 18.3442 KOps/s 18.8673 KOps/s $\color{#d91a1a}-2.77\%$
test_update 89.6120μs 48.3047μs 20.7019 KOps/s 18.8238 KOps/s $\textbf{\color{#35bf28}+9.98\%}$
test_update_nested 99.9920μs 55.4945μs 18.0198 KOps/s 16.0261 KOps/s $\textbf{\color{#35bf28}+12.44\%}$
test_update__nested 0.1837ms 61.7679μs 16.1896 KOps/s 16.5999 KOps/s $\color{#d91a1a}-2.47\%$
test_set_nested 76.9420μs 41.6296μs 24.0214 KOps/s 22.6074 KOps/s $\textbf{\color{#35bf28}+6.25\%}$
test_set_nested_new 98.1130μs 45.2258μs 22.1113 KOps/s 20.8934 KOps/s $\textbf{\color{#35bf28}+5.83\%}$
test_select 0.1038ms 59.2452μs 16.8790 KOps/s 16.5600 KOps/s $\color{#35bf28}+1.93\%$
test_select_nested 0.4399ms 45.6784μs 21.8922 KOps/s 22.3497 KOps/s $\color{#d91a1a}-2.05\%$
test_exclude_nested 0.1115ms 61.7320μs 16.1990 KOps/s 16.1919 KOps/s $\color{#35bf28}+0.04\%$
test_empty[True] 0.3034ms 0.2653ms 3.7689 KOps/s 3.8101 KOps/s $\color{#d91a1a}-1.08\%$
test_empty[False] 3.0501μs 0.7592μs 1.3172 MOps/s 1.3077 MOps/s $\color{#35bf28}+0.73\%$
test_to 57.8110μs 26.7585μs 37.3713 KOps/s 36.0802 KOps/s $\color{#35bf28}+3.58\%$
test_to_nonblocking 67.1320μs 26.2691μs 38.0676 KOps/s 38.7117 KOps/s $\color{#d91a1a}-1.66\%$
test_unbind_speed 0.3252ms 0.2871ms 3.4830 KOps/s 3.5531 KOps/s $\color{#d91a1a}-1.97\%$
test_unbind_speed_stack0 0.3286ms 0.2881ms 3.4704 KOps/s 3.6168 KOps/s $\color{#d91a1a}-4.05\%$
test_unbind_speed_stack1 92.1259ms 0.7323ms 1.3655 KOps/s 1.3969 KOps/s $\color{#d91a1a}-2.25\%$
test_split 94.1119ms 2.1896ms 456.7016 Ops/s 459.7550 Ops/s $\color{#d91a1a}-0.66\%$
test_chunk 95.5782ms 2.1873ms 457.1811 Ops/s 459.7560 Ops/s $\color{#d91a1a}-0.56\%$
test_creation[device0] 0.3556ms 0.1257ms 7.9562 KOps/s 7.8731 KOps/s $\color{#35bf28}+1.05\%$
test_creation_from_tensor 0.4395ms 0.1284ms 7.7860 KOps/s 7.5137 KOps/s $\color{#35bf28}+3.62\%$
test_add_one[memmap_tensor0] 0.2337ms 8.7571μs 114.1930 KOps/s 109.4873 KOps/s $\color{#35bf28}+4.30\%$
test_contiguous[memmap_tensor0] 33.6210μs 2.1764μs 459.4643 KOps/s 453.8293 KOps/s $\color{#35bf28}+1.24\%$
test_stack[memmap_tensor0] 38.1210μs 6.8399μs 146.2002 KOps/s 153.5152 KOps/s $\color{#d91a1a}-4.76\%$
test_memmaptd_index 1.2629ms 0.4243ms 2.3566 KOps/s 2.3637 KOps/s $\color{#d91a1a}-0.30\%$
test_memmaptd_index_astensor 0.9803ms 0.4963ms 2.0149 KOps/s 2.0114 KOps/s $\color{#35bf28}+0.17\%$
test_memmaptd_index_op 1.4217ms 1.0218ms 978.6520 Ops/s 940.8073 Ops/s $\color{#35bf28}+4.02\%$
test_serialize_model 0.1306s 0.1299s 7.6994 Ops/s 7.6720 Ops/s $\color{#35bf28}+0.36\%$
test_serialize_model_pickle 1.3521s 1.2175s 0.8214 Ops/s 0.8242 Ops/s $\color{#d91a1a}-0.34\%$
test_serialize_weights 0.1306s 0.1301s 7.6868 Ops/s 7.7118 Ops/s $\color{#d91a1a}-0.32\%$
test_serialize_weights_returnearly 0.2158s 56.6634ms 17.6481 Ops/s 17.8598 Ops/s $\color{#d91a1a}-1.19\%$
test_serialize_weights_pickle 1.3471s 1.1856s 0.8435 Ops/s 0.8218 Ops/s $\color{#35bf28}+2.64\%$
test_reshape_pytree 70.4820μs 39.2319μs 25.4894 KOps/s 27.0514 KOps/s $\textbf{\color{#d91a1a}-5.77\%}$
test_reshape_td 80.3320μs 46.8819μs 21.3302 KOps/s 23.1507 KOps/s $\textbf{\color{#d91a1a}-7.86\%}$
test_view_pytree 73.1920μs 38.7283μs 25.8209 KOps/s 27.4944 KOps/s $\textbf{\color{#d91a1a}-6.09\%}$
test_view_td 0.1116ms 49.6978μs 20.1216 KOps/s 20.7581 KOps/s $\color{#d91a1a}-3.07\%$
test_unbind_pytree 69.6810μs 35.0751μs 28.5103 KOps/s 29.1902 KOps/s $\color{#d91a1a}-2.33\%$
test_unbind_td 0.5031ms 44.2393μs 22.6044 KOps/s 23.1951 KOps/s $\color{#d91a1a}-2.55\%$
test_split_pytree 0.1080ms 47.5342μs 21.0375 KOps/s 21.3982 KOps/s $\color{#d91a1a}-1.69\%$
test_split_td 0.6599ms 57.5211μs 17.3849 KOps/s 15.3502 KOps/s $\textbf{\color{#35bf28}+13.26\%}$
test_add_pytree 92.3320μs 55.9325μs 17.8787 KOps/s 17.4784 KOps/s $\color{#35bf28}+2.29\%$
test_add_td 0.1340ms 91.3042μs 10.9524 KOps/s 9.7555 KOps/s $\textbf{\color{#35bf28}+12.27\%}$
test_compile_add_one_nested[tensordict-compile] 0.2451ms 0.1602ms 6.2435 KOps/s 6.0786 KOps/s $\color{#35bf28}+2.71\%$
test_compile_add_one_nested[tensordict-eager] 0.2917ms 0.1629ms 6.1397 KOps/s 6.2090 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_add_one_nested[pytree-compile] 0.1999ms 0.1522ms 6.5688 KOps/s 6.4492 KOps/s $\color{#35bf28}+1.85\%$
test_compile_add_one_nested[pytree-eager] 0.2331ms 0.1824ms 5.4813 KOps/s 5.4822 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_copy_nested[tensordict-compile] 60.1310μs 22.2424μs 44.9591 KOps/s 45.2081 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_copy_nested[tensordict-eager] 93.0620μs 48.4121μs 20.6560 KOps/s 20.2864 KOps/s $\color{#35bf28}+1.82\%$
test_compile_copy_nested[pytree-compile] 0.4360ms 65.3987μs 15.2908 KOps/s 14.9866 KOps/s $\color{#35bf28}+2.03\%$
test_compile_copy_nested[pytree-eager] 80.8420μs 51.3374μs 19.4790 KOps/s 19.3675 KOps/s $\color{#35bf28}+0.58\%$
test_compile_add_one_flat[tensordict-compile] 0.3545ms 0.3163ms 3.1614 KOps/s 3.1299 KOps/s $\color{#35bf28}+1.01\%$
test_compile_add_one_flat[tensordict-eager] 0.3163ms 0.2320ms 4.3104 KOps/s 4.3117 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_add_one_flat[tensorclass-compile] 0.1661ms 0.1266ms 7.8995 KOps/s 7.8143 KOps/s $\color{#35bf28}+1.09\%$
test_compile_add_one_flat[tensorclass-eager] 0.1228ms 65.0247μs 15.3788 KOps/s 14.9531 KOps/s $\color{#35bf28}+2.85\%$
test_compile_add_one_flat[pytree-compile] 0.3812ms 0.3249ms 3.0778 KOps/s 3.0343 KOps/s $\color{#35bf28}+1.44\%$
test_compile_add_one_flat[pytree-eager] 0.6954ms 0.6189ms 1.6159 KOps/s 1.6546 KOps/s $\color{#d91a1a}-2.34\%$
test_compile_add_self_flat[tensordict-eager] 0.4249ms 0.2825ms 3.5401 KOps/s 3.5256 KOps/s $\color{#35bf28}+0.41\%$
test_compile_add_self_flat[tensordict-compile] 0.3733ms 0.3201ms 3.1241 KOps/s 3.1213 KOps/s $\color{#35bf28}+0.09\%$
test_compile_add_self_flat[tensorclass-eager] 0.1538ms 78.2538μs 12.7789 KOps/s 12.8435 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_self_flat[tensorclass-compile] 0.1763ms 0.1277ms 7.8303 KOps/s 7.7407 KOps/s $\color{#35bf28}+1.16\%$
test_compile_add_self_flat[pytree-eager] 0.6343ms 0.5192ms 1.9259 KOps/s 1.9503 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_add_self_flat[pytree-compile] 0.3791ms 0.3250ms 3.0770 KOps/s 3.0307 KOps/s $\color{#35bf28}+1.53\%$
test_compile_copy_flat[tensordict-compile] 49.0520μs 19.9900μs 50.0250 KOps/s 51.5626 KOps/s $\color{#d91a1a}-2.98\%$
test_compile_copy_flat[tensordict-eager] 70.9310μs 39.7833μs 25.1362 KOps/s 25.2256 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_copy_flat[pytree-compile] 0.1084ms 71.7595μs 13.9354 KOps/s 13.9247 KOps/s $\color{#35bf28}+0.08\%$
test_compile_copy_flat[pytree-eager] 87.2220μs 53.5551μs 18.6724 KOps/s 19.0043 KOps/s $\color{#d91a1a}-1.75\%$
test_compile_assign_and_add[tensordict-compile] 2.3736ms 0.8339ms 1.1991 KOps/s 1.1234 KOps/s $\textbf{\color{#35bf28}+6.74\%}$
test_compile_assign_and_add[tensordict-eager] 3.2985ms 3.1249ms 320.0059 Ops/s 315.0157 Ops/s $\color{#35bf28}+1.58\%$
test_compile_assign_and_add[pytree-compile] 2.3846ms 0.8328ms 1.2007 KOps/s 1.0929 KOps/s $\textbf{\color{#35bf28}+9.87\%}$
test_compile_assign_and_add[pytree-eager] 3.2782ms 3.1069ms 321.8671 Ops/s 323.5072 Ops/s $\color{#d91a1a}-0.51\%$
test_compile_indexing[tensor-tensordict-compile] 0.2000ms 0.1202ms 8.3165 KOps/s 8.5074 KOps/s $\color{#d91a1a}-2.24\%$
test_compile_indexing[tensor-tensordict-eager] 0.1880ms 63.3903μs 15.7753 KOps/s 15.9382 KOps/s $\color{#d91a1a}-1.02\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1768ms 0.1110ms 9.0126 KOps/s 8.8531 KOps/s $\color{#35bf28}+1.80\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1038ms 45.9610μs 21.7576 KOps/s 22.8745 KOps/s $\color{#d91a1a}-4.88\%$
test_compile_indexing[tensor-pytree-compile] 0.1652ms 0.1168ms 8.5587 KOps/s 8.3283 KOps/s $\color{#35bf28}+2.77\%$
test_compile_indexing[tensor-pytree-eager] 0.1104ms 44.6229μs 22.4100 KOps/s 21.5430 KOps/s $\color{#35bf28}+4.02\%$
test_compile_indexing[slice-tensordict-compile] 0.2030ms 0.1476ms 6.7764 KOps/s 6.9179 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_indexing[slice-tensordict-eager] 0.1459ms 25.1804μs 39.7134 KOps/s 40.2599 KOps/s $\color{#d91a1a}-1.36\%$
test_compile_indexing[slice-tensorclass-compile] 0.1967ms 0.1432ms 6.9843 KOps/s 7.2206 KOps/s $\color{#d91a1a}-3.27\%$
test_compile_indexing[slice-tensorclass-eager] 59.9410μs 20.8403μs 47.9840 KOps/s 46.0266 KOps/s $\color{#35bf28}+4.25\%$
test_compile_indexing[slice-pytree-compile] 0.1963ms 0.1442ms 6.9371 KOps/s 7.1573 KOps/s $\color{#d91a1a}-3.08\%$
test_compile_indexing[slice-pytree-eager] 60.2710μs 20.5491μs 48.6639 KOps/s 47.3919 KOps/s $\color{#35bf28}+2.68\%$
test_compile_indexing[int-tensordict-compile] 0.2587ms 0.1476ms 6.7729 KOps/s 6.7884 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_indexing[int-tensordict-eager] 0.4697ms 24.8065μs 40.3120 KOps/s 39.7571 KOps/s $\color{#35bf28}+1.40\%$
test_compile_indexing[int-tensorclass-compile] 0.1961ms 0.1432ms 6.9825 KOps/s 7.1772 KOps/s $\color{#d91a1a}-2.71\%$
test_compile_indexing[int-tensorclass-eager] 57.9710μs 20.8092μs 48.0556 KOps/s 47.8228 KOps/s $\color{#35bf28}+0.49\%$
test_compile_indexing[int-pytree-compile] 0.1894ms 0.1442ms 6.9327 KOps/s 7.1763 KOps/s $\color{#d91a1a}-3.39\%$
test_compile_indexing[int-pytree-eager] 0.3293ms 20.6698μs 48.3799 KOps/s 47.6125 KOps/s $\color{#35bf28}+1.61\%$
test_mod_add[eager] 75.1120μs 33.1251μs 30.1886 KOps/s 30.1242 KOps/s $\color{#35bf28}+0.21\%$
test_mod_add[compile] 0.1229ms 84.0373μs 11.8995 KOps/s 11.8354 KOps/s $\color{#35bf28}+0.54\%$
test_mod_add[compile-overhead] 0.2924ms 0.1474ms 6.7860 KOps/s 6.5127 KOps/s $\color{#35bf28}+4.20\%$
test_mod_wrap[eager] 0.3157ms 0.2440ms 4.0984 KOps/s 4.0047 KOps/s $\color{#35bf28}+2.34\%$
test_mod_wrap[compile] 1.3793ms 0.2989ms 3.3455 KOps/s 3.3262 KOps/s $\color{#35bf28}+0.58\%$
test_mod_wrap[compile-overhead] 7.6636ms 4.0512ms 246.8407 Ops/s 247.9119 Ops/s $\color{#d91a1a}-0.43\%$
test_mod_wrap_and_backward[eager] 1.4182ms 1.2965ms 771.3288 Ops/s 714.9866 Ops/s $\textbf{\color{#35bf28}+7.88\%}$
test_mod_wrap_and_backward[compile] 7.5924ms 1.3230ms 755.8310 Ops/s 708.8568 Ops/s $\textbf{\color{#35bf28}+6.63\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3231ms 0.8908ms 1.1226 KOps/s 942.6499 Ops/s $\textbf{\color{#35bf28}+19.09\%}$
test_seq_add[eager] 0.1471ms 96.5888μs 10.3532 KOps/s 9.8774 KOps/s $\color{#35bf28}+4.82\%$
test_seq_add[compile] 0.2690ms 89.6352μs 11.1563 KOps/s 11.2082 KOps/s $\color{#d91a1a}-0.46\%$
test_seq_add[compile-overhead] 0.1674ms 0.1229ms 8.1397 KOps/s 7.7378 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_seq_wrap[eager] 0.4374ms 0.3665ms 2.7286 KOps/s 2.4539 KOps/s $\textbf{\color{#35bf28}+11.19\%}$
test_seq_wrap[compile] 0.3756ms 0.3092ms 3.2337 KOps/s 3.1272 KOps/s $\color{#35bf28}+3.41\%$
test_seq_wrap[compile-overhead] 0.2678ms 0.2171ms 4.6059 KOps/s 4.5697 KOps/s $\color{#35bf28}+0.79\%$
test_func_call_runtime[False-eager] 0.8411ms 0.7643ms 1.3083 KOps/s 1.3203 KOps/s $\color{#d91a1a}-0.91\%$
test_func_call_runtime[False-compile] 0.8971ms 0.7779ms 1.2855 KOps/s 1.2040 KOps/s $\textbf{\color{#35bf28}+6.77\%}$
test_func_call_runtime[False-compile-overhead] 0.4365ms 0.3548ms 2.8185 KOps/s 2.7834 KOps/s $\color{#35bf28}+1.26\%$
test_func_call_runtime[True-eager] 0.9487ms 0.8667ms 1.1538 KOps/s 1.1314 KOps/s $\color{#35bf28}+1.98\%$
test_func_call_runtime[True-compile] 0.9175ms 0.8014ms 1.2478 KOps/s 1.2594 KOps/s $\color{#d91a1a}-0.92\%$
test_func_call_runtime[True-compile-overhead] 0.4562ms 0.3759ms 2.6603 KOps/s 2.6280 KOps/s $\color{#35bf28}+1.23\%$
test_func_call_cm_runtime[False-eager] 0.8849ms 0.7001ms 1.4284 KOps/s 1.4204 KOps/s $\color{#35bf28}+0.56\%$
test_func_call_cm_runtime[False-compile] 0.8642ms 0.7821ms 1.2787 KOps/s 1.2803 KOps/s $\color{#d91a1a}-0.12\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4187ms 0.3573ms 2.7986 KOps/s 2.7698 KOps/s $\color{#35bf28}+1.04\%$
test_func_call_cm_runtime[True-eager] 1.1066ms 0.9829ms 1.0174 KOps/s 1.0055 KOps/s $\color{#35bf28}+1.18\%$
test_func_call_cm_runtime[True-compile] 0.8957ms 0.8262ms 1.2103 KOps/s 1.2035 KOps/s $\color{#35bf28}+0.57\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4615ms 0.4060ms 2.4629 KOps/s 2.4548 KOps/s $\color{#35bf28}+0.33\%$
test_vmap_func_call_cm_runtime[eager] 2.4633ms 2.0203ms 494.9743 Ops/s 492.5395 Ops/s $\color{#35bf28}+0.49\%$
test_vmap_func_call_cm_runtime[compile] 0.9746ms 0.9027ms 1.1078 KOps/s 1.1941 KOps/s $\textbf{\color{#d91a1a}-7.23\%}$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4501ms 0.4041ms 2.4747 KOps/s 2.4534 KOps/s $\color{#35bf28}+0.87\%$
test_distributed 2.4661ms 0.1876ms 5.3305 KOps/s 8.4803 KOps/s $\textbf{\color{#d91a1a}-37.14\%}$
test_tdmodule 0.2523ms 15.4241μs 64.8335 KOps/s 57.0932 KOps/s $\textbf{\color{#35bf28}+13.56\%}$
test_tdmodule_dispatch 49.2310μs 29.3286μs 34.0965 KOps/s 30.9073 KOps/s $\textbf{\color{#35bf28}+10.32\%}$
test_tdseq 36.1500μs 15.8427μs 63.1205 KOps/s 58.2916 KOps/s $\textbf{\color{#35bf28}+8.28\%}$
test_tdseq_dispatch 52.0210μs 31.6685μs 31.5771 KOps/s 28.6993 KOps/s $\textbf{\color{#35bf28}+10.03\%}$
test_instantiation_functorch 2.0723ms 1.8912ms 528.7719 Ops/s 529.8665 Ops/s $\color{#d91a1a}-0.21\%$
test_exec_functorch 0.3268ms 0.2094ms 4.7749 KOps/s 4.8022 KOps/s $\color{#d91a1a}-0.57\%$
test_exec_functional_call 0.3037ms 0.2132ms 4.6901 KOps/s 4.5580 KOps/s $\color{#35bf28}+2.90\%$
test_exec_td_decorator 0.4427ms 0.2567ms 3.8959 KOps/s 3.7034 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_vmap_mlp_speed_decorator[True-True] 0.8414ms 0.6613ms 1.5121 KOps/s 1.4634 KOps/s $\color{#35bf28}+3.33\%$
test_vmap_mlp_speed_decorator[True-False] 0.8321ms 0.6692ms 1.4944 KOps/s 1.4674 KOps/s $\color{#35bf28}+1.84\%$
test_vmap_mlp_speed_decorator[False-True] 0.7128ms 0.5817ms 1.7190 KOps/s 1.6190 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_vmap_mlp_speed_decorator[False-False] 0.7400ms 0.5816ms 1.7194 KOps/s 1.7160 KOps/s $\color{#35bf28}+0.20\%$
test_vmap_transformer_speed_decorator[True-True] 19.2268ms 18.9621ms 52.7366 Ops/s 52.8356 Ops/s $\color{#d91a1a}-0.19\%$
test_vmap_transformer_speed_decorator[True-False] 19.0585ms 18.9413ms 52.7946 Ops/s 52.8079 Ops/s $\color{#d91a1a}-0.03\%$
test_vmap_transformer_speed_decorator[False-True] 19.3243ms 18.8541ms 53.0389 Ops/s 53.4392 Ops/s $\color{#d91a1a}-0.75\%$
test_vmap_transformer_speed_decorator[False-False] 19.5072ms 18.9682ms 52.7198 Ops/s 53.1335 Ops/s $\color{#d91a1a}-0.78\%$
test_to_module_speed[True] 1.4409ms 1.0554ms 947.5408 Ops/s 957.5927 Ops/s $\color{#d91a1a}-1.05\%$
test_to_module_speed[False] 1.4630ms 1.0207ms 979.7345 Ops/s 966.3061 Ops/s $\color{#35bf28}+1.39\%$
test_tc_init 60.6510μs 34.3597μs 29.1039 KOps/s 25.1654 KOps/s $\textbf{\color{#35bf28}+15.65\%}$
test_tc_init_nested 0.1262ms 70.3515μs 14.2143 KOps/s 12.6109 KOps/s $\textbf{\color{#35bf28}+12.72\%}$
test_tc_first_layer_tensor 6.8862μs 0.7063μs 1.4157 MOps/s 1.4242 MOps/s $\color{#d91a1a}-0.59\%$
test_tc_first_layer_nontensor 19.9510μs 2.2899μs 436.6947 KOps/s 435.5944 KOps/s $\color{#35bf28}+0.25\%$
test_tc_second_layer_tensor 20.9310μs 1.4819μs 674.7998 KOps/s 710.0120 KOps/s $\color{#d91a1a}-4.96\%$
test_tc_second_layer_nontensor 56.8010μs 2.9344μs 340.7887 KOps/s 329.1007 KOps/s $\color{#35bf28}+3.55\%$
test_unbind 0.1893s 11.9083ms 83.9750 Ops/s 91.4091 Ops/s $\textbf{\color{#d91a1a}-8.13\%}$
test_full_like 0.6594ms 0.5755ms 1.7376 KOps/s 1.7434 KOps/s $\color{#d91a1a}-0.33\%$
test_zeros_like 0.2796ms 0.1978ms 5.0551 KOps/s 5.0519 KOps/s $\color{#35bf28}+0.06\%$
test_ones_like 0.2380ms 0.1977ms 5.0578 KOps/s 5.0549 KOps/s $\color{#35bf28}+0.06\%$
test_clone 0.4535ms 0.4146ms 2.4118 KOps/s 2.4084 KOps/s $\color{#35bf28}+0.14\%$
test_squeeze 38.5210μs 10.0275μs 99.7257 KOps/s 99.6260 KOps/s $\color{#35bf28}+0.10\%$
test_unsqueeze 0.2402ms 77.4732μs 12.9077 KOps/s 12.9102 KOps/s $\color{#d91a1a}-0.02\%$
test_split 0.4552ms 0.1658ms 6.0303 KOps/s 6.2583 KOps/s $\color{#d91a1a}-3.64\%$
test_permute 0.2993ms 0.1867ms 5.3553 KOps/s 5.5243 KOps/s $\color{#d91a1a}-3.06\%$
test_stack 1.2605ms 0.8732ms 1.1452 KOps/s 1.1471 KOps/s $\color{#d91a1a}-0.16\%$
test_cat 1.2495ms 1.2312ms 812.1946 Ops/s 812.0175 Ops/s $\color{#35bf28}+0.02\%$
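
The last column of the table above appears to be the relative change in throughput between the two Ops/s columns. The sketch below reproduces that arithmetic for a few rows. It assumes the first Ops/s figure is this PR and the second is the base branch, and that entries are bolded once the change exceeds 5% in magnitude. Both are inferred from the numbers, not stated by the benchmark bot, and the helper names are hypothetical.

```python
# Hypothetical helpers: the benchmark bot's own code is not shown in this PR.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_sec(value: float, unit: str) -> float:
    """Convert a throughput reading such as (1.1226, 'KOps/s') to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(pr_ops: float, base_ops: float) -> float:
    """Percent change of the PR throughput relative to the base throughput."""
    return (pr_ops - base_ops) / base_ops * 100.0


def format_change(pct: float, bold_threshold: float = 5.0) -> str:
    """Render the change with a sign; bold it past the assumed 5% threshold."""
    text = f"{pct:+.2f}%"
    return f"**{text}**" if abs(pct) > bold_threshold else text


# A few rows copied from the table: (PR value, PR unit, base value, base unit).
rows = {
    "test_unflatten_speed": (3.0205, "KOps/s", 3.0725, "KOps/s"),
    "test_common_ops": (812.4669, "Ops/s", 778.7113, "Ops/s"),
    "test_distributed": (5.3305, "KOps/s", 8.4803, "KOps/s"),
    # Mixed units: both sides must be normalized before comparing.
    "test_mod_wrap_and_backward[compile-overhead]": (1.1226, "KOps/s", 942.6499, "Ops/s"),
}

changes = {
    name: relative_change(to_ops_per_sec(pr, pr_unit), to_ops_per_sec(base, base_unit))
    for name, (pr, pr_unit, base, base_unit) in rows.items()
}

for name, pct in changes.items():
    print(f"{name}: {format_change(pct)}")
```

Normalizing units first matters for rows where the two readings are reported at different scales, such as `test_mod_wrap_and_backward[compile-overhead]`.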
