[Refactor] better AddStateIndependentNormalScale #1028

Merged
merged 1 commit into from
Oct 4, 2024

vmoens
Contributor

@vmoens vmoens commented Oct 4, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 4, 2024
ghstack-source-id: b5911c0b4e023d3c8e20968732ff58da061f978b
Pull Request resolved: #1028
@facebook-github-bot added the "CLA Signed" label Oct 4, 2024

github-actions bot commented Oct 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}12$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 58.3890μs 24.2638μs 41.2137 KOps/s 41.6689 KOps/s $\color{#d91a1a}-1.09\%$
test_plain_set_stack_nested 64.7910μs 24.7675μs 40.3755 KOps/s 40.6260 KOps/s $\color{#d91a1a}-0.62\%$
test_plain_set_nested_inplace 62.8180μs 26.6703μs 37.4949 KOps/s 37.3632 KOps/s $\color{#35bf28}+0.35\%$
test_plain_set_stack_nested_inplace 74.2790μs 26.6319μs 37.5489 KOps/s 37.3452 KOps/s $\color{#35bf28}+0.55\%$
test_items 32.4210μs 4.1304μs 242.1064 KOps/s 240.6548 KOps/s $\color{#35bf28}+0.60\%$
test_items_nested 0.5777ms 0.3855ms 2.5938 KOps/s 2.5576 KOps/s $\color{#35bf28}+1.42\%$
test_items_nested_locked 0.6880ms 0.3840ms 2.6040 KOps/s 2.5077 KOps/s $\color{#35bf28}+3.84\%$
test_items_nested_leaf 0.1469ms 80.5938μs 12.4079 KOps/s 12.3626 KOps/s $\color{#35bf28}+0.37\%$
test_items_stack_nested 0.7196ms 0.3919ms 2.5519 KOps/s 2.4900 KOps/s $\color{#35bf28}+2.49\%$
test_items_stack_nested_leaf 0.1539ms 82.1395μs 12.1744 KOps/s 11.8075 KOps/s $\color{#35bf28}+3.11\%$
test_items_stack_nested_locked 0.7396ms 0.3926ms 2.5473 KOps/s 2.5183 KOps/s $\color{#35bf28}+1.15\%$
test_keys 39.6140μs 3.6141μs 276.6949 KOps/s 283.7571 KOps/s $\color{#d91a1a}-2.49\%$
test_keys_nested 0.1858ms 0.1345ms 7.4347 KOps/s 7.2949 KOps/s $\color{#35bf28}+1.92\%$
test_keys_nested_locked 1.6189ms 0.1409ms 7.0967 KOps/s 7.0873 KOps/s $\color{#35bf28}+0.13\%$
test_keys_nested_leaf 0.1883ms 0.1182ms 8.4624 KOps/s 8.4178 KOps/s $\color{#35bf28}+0.53\%$
test_keys_stack_nested 0.2288ms 0.1334ms 7.4948 KOps/s 7.4317 KOps/s $\color{#35bf28}+0.85\%$
test_keys_stack_nested_leaf 0.2017ms 0.1168ms 8.5624 KOps/s 8.4908 KOps/s $\color{#35bf28}+0.84\%$
test_keys_stack_nested_locked 0.2321ms 0.1400ms 7.1434 KOps/s 7.1055 KOps/s $\color{#35bf28}+0.53\%$
test_values 7.1414μs 1.0585μs 944.7608 KOps/s 923.0691 KOps/s $\color{#35bf28}+2.35\%$
test_values_nested 0.1550ms 94.2638μs 10.6085 KOps/s 10.7818 KOps/s $\color{#d91a1a}-1.61\%$
test_values_nested_locked 0.1541ms 93.1352μs 10.7371 KOps/s 10.7484 KOps/s $\color{#d91a1a}-0.11\%$
test_values_nested_leaf 0.1367ms 79.1001μs 12.6422 KOps/s 12.8107 KOps/s $\color{#d91a1a}-1.32\%$
test_values_stack_nested 0.1542ms 92.8300μs 10.7724 KOps/s 10.8777 KOps/s $\color{#d91a1a}-0.97\%$
test_values_stack_nested_leaf 0.1424ms 79.3704μs 12.5992 KOps/s 12.5925 KOps/s $\color{#35bf28}+0.05\%$
test_values_stack_nested_locked 0.2317ms 93.4508μs 10.7008 KOps/s 10.8741 KOps/s $\color{#d91a1a}-1.59\%$
test_membership 5.9697μs 0.7420μs 1.3477 MOps/s 1.1605 MOps/s $\textbf{\color{#35bf28}+16.13\%}$
test_membership_nested 25.0260μs 2.7282μs 366.5428 KOps/s 363.5107 KOps/s $\color{#35bf28}+0.83\%$
test_membership_nested_leaf 22.7930μs 2.7539μs 363.1259 KOps/s 362.5998 KOps/s $\color{#35bf28}+0.15\%$
test_membership_stacked_nested 26.1390μs 2.7345μs 365.6998 KOps/s 358.1913 KOps/s $\color{#35bf28}+2.10\%$
test_membership_stacked_nested_leaf 74.9500μs 2.7474μs 363.9764 KOps/s 363.2722 KOps/s $\color{#35bf28}+0.19\%$
test_membership_nested_last 30.8670μs 4.2279μs 236.5242 KOps/s 241.9288 KOps/s $\color{#d91a1a}-2.23\%$
test_membership_nested_leaf_last 22.0120μs 4.1882μs 238.7687 KOps/s 240.8186 KOps/s $\color{#d91a1a}-0.85\%$
test_membership_stacked_nested_last 34.1740μs 4.1383μs 241.6465 KOps/s 127.7012 KOps/s $\textbf{\color{#35bf28}+89.23\%}$
test_membership_stacked_nested_leaf_last 31.3380μs 4.1802μs 239.2213 KOps/s 127.4901 KOps/s $\textbf{\color{#35bf28}+87.64\%}$
test_nested_getleaf 37.7600μs 10.5795μs 94.5227 KOps/s 94.3793 KOps/s $\color{#35bf28}+0.15\%$
test_nested_get 57.4190μs 10.0153μs 99.8474 KOps/s 99.4219 KOps/s $\color{#35bf28}+0.43\%$
test_stacked_getleaf 38.3310μs 10.5478μs 94.8069 KOps/s 94.3094 KOps/s $\color{#35bf28}+0.53\%$
test_stacked_get 51.6770μs 9.9661μs 100.3405 KOps/s 99.4117 KOps/s $\color{#35bf28}+0.93\%$
test_nested_getitemleaf 61.7060μs 11.7445μs 85.1460 KOps/s 90.0718 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_nested_getitem 68.4360μs 10.1704μs 98.3243 KOps/s 96.3534 KOps/s $\color{#35bf28}+2.05\%$
test_stacked_getitemleaf 43.5520μs 11.0220μs 90.7279 KOps/s 90.5853 KOps/s $\color{#35bf28}+0.16\%$
test_stacked_getitem 36.8290μs 10.2397μs 97.6593 KOps/s 96.8239 KOps/s $\color{#35bf28}+0.86\%$
test_lock_nested 90.3454ms 0.6184ms 1.6170 KOps/s 1.9623 KOps/s $\textbf{\color{#d91a1a}-17.59\%}$
test_lock_stack_nested 0.6461ms 0.4865ms 2.0553 KOps/s 2.1230 KOps/s $\color{#d91a1a}-3.19\%$
test_unlock_nested 93.0569ms 0.5362ms 1.8650 KOps/s 2.3497 KOps/s $\textbf{\color{#d91a1a}-20.63\%}$
test_unlock_stack_nested 0.7374ms 0.3976ms 2.5152 KOps/s 2.6081 KOps/s $\color{#d91a1a}-3.56\%$
test_flatten_speed 0.1891ms 0.1007ms 9.9279 KOps/s 9.9992 KOps/s $\color{#d91a1a}-0.71\%$
test_unflatten_speed 0.7083ms 0.5150ms 1.9419 KOps/s 1.9861 KOps/s $\color{#d91a1a}-2.23\%$
test_common_ops 4.9984ms 1.1679ms 856.2506 Ops/s 860.7456 Ops/s $\color{#d91a1a}-0.52\%$
test_creation 21.8310μs 2.1055μs 474.9375 KOps/s 491.0379 KOps/s $\color{#d91a1a}-3.28\%$
test_creation_empty 64.7210μs 17.6829μs 56.5517 KOps/s 53.4103 KOps/s $\textbf{\color{#35bf28}+5.88\%}$
test_creation_nested_1 46.3560μs 20.9900μs 47.6418 KOps/s 44.9470 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_creation_nested_2 77.0340μs 25.0643μs 39.8974 KOps/s 37.9299 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_clone 0.1165ms 17.7673μs 56.2832 KOps/s 58.4988 KOps/s $\color{#d91a1a}-3.79\%$
test_getitem[int] 1.2183ms 16.5824μs 60.3047 KOps/s 59.6805 KOps/s $\color{#35bf28}+1.05\%$
test_getitem[slice_int] 0.1475ms 32.0243μs 31.2263 KOps/s 30.3361 KOps/s $\color{#35bf28}+2.93\%$
test_getitem[range] 0.3414ms 60.8627μs 16.4304 KOps/s 17.0029 KOps/s $\color{#d91a1a}-3.37\%$
test_getitem[tuple] 0.1325ms 25.4708μs 39.2606 KOps/s 38.2784 KOps/s $\color{#35bf28}+2.57\%$
test_getitem[list] 0.2636ms 55.5404μs 18.0049 KOps/s 18.2941 KOps/s $\color{#d91a1a}-1.58\%$
test_setitem_dim[int] 76.5630μs 35.1950μs 28.4131 KOps/s 29.0655 KOps/s $\color{#d91a1a}-2.24\%$
test_setitem_dim[slice_int] 0.1684ms 65.0976μs 15.3615 KOps/s 15.9109 KOps/s $\color{#d91a1a}-3.45\%$
test_setitem_dim[range] 0.1465ms 86.4543μs 11.5668 KOps/s 11.4979 KOps/s $\color{#35bf28}+0.60\%$
test_setitem_dim[tuple] 0.1063ms 52.1018μs 19.1932 KOps/s 19.2157 KOps/s $\color{#d91a1a}-0.12\%$
test_setitem 0.1996ms 31.5936μs 31.6520 KOps/s 32.1179 KOps/s $\color{#d91a1a}-1.45\%$
test_set 0.1391ms 30.3619μs 32.9360 KOps/s 33.2717 KOps/s $\color{#d91a1a}-1.01\%$
test_set_shared 1.1496ms 0.2220ms 4.5053 KOps/s 4.4746 KOps/s $\color{#35bf28}+0.69\%$
test_update 0.1643ms 36.9923μs 27.0327 KOps/s 25.8051 KOps/s $\color{#35bf28}+4.76\%$
test_update_nested 0.1409ms 50.1102μs 19.9560 KOps/s 19.9312 KOps/s $\color{#35bf28}+0.12\%$
test_update__nested 0.1262ms 40.8231μs 24.4959 KOps/s 26.6550 KOps/s $\textbf{\color{#d91a1a}-8.10\%}$
test_set_nested 95.2780μs 32.6547μs 30.6234 KOps/s 30.4666 KOps/s $\color{#35bf28}+0.51\%$
test_set_nested_new 0.1222ms 38.2067μs 26.1734 KOps/s 26.2415 KOps/s $\color{#d91a1a}-0.26\%$
test_select 0.1387ms 55.7309μs 17.9434 KOps/s 18.0212 KOps/s $\color{#d91a1a}-0.43\%$
test_select_nested 0.1549ms 60.6147μs 16.4976 KOps/s 16.8778 KOps/s $\color{#d91a1a}-2.25\%$
test_exclude_nested 0.1483ms 74.8158μs 13.3662 KOps/s 13.4155 KOps/s $\color{#d91a1a}-0.37\%$
test_empty[True] 1.0592ms 0.3547ms 2.8192 KOps/s 2.8422 KOps/s $\color{#d91a1a}-0.81\%$
test_empty[False] 8.5685μs 1.2222μs 818.2222 KOps/s 829.5272 KOps/s $\color{#d91a1a}-1.36\%$
test_unbind_speed 0.4914ms 0.3056ms 3.2721 KOps/s 3.2038 KOps/s $\color{#35bf28}+2.13\%$
test_unbind_speed_stack0 0.6571ms 0.3057ms 3.2714 KOps/s 3.4017 KOps/s $\color{#d91a1a}-3.83\%$
test_unbind_speed_stack1 93.3837ms 0.8294ms 1.2057 KOps/s 1.3552 KOps/s $\textbf{\color{#d91a1a}-11.03\%}$
test_split 3.1473ms 2.0165ms 495.9160 Ops/s 453.1185 Ops/s $\textbf{\color{#35bf28}+9.45\%}$
test_chunk 94.6383ms 2.1993ms 454.6842 Ops/s 450.6180 Ops/s $\color{#35bf28}+0.90\%$
test_creation[device0] 0.2543ms 0.1209ms 8.2723 KOps/s 8.1614 KOps/s $\color{#35bf28}+1.36\%$
test_creation_from_tensor 3.1850ms 0.1219ms 8.2003 KOps/s 8.3542 KOps/s $\color{#d91a1a}-1.84\%$
test_add_one[memmap_tensor0] 0.2815ms 7.9482μs 125.8152 KOps/s 137.3672 KOps/s $\textbf{\color{#d91a1a}-8.41\%}$
test_contiguous[memmap_tensor0] 28.7240μs 1.8921μs 528.5096 KOps/s 513.8749 KOps/s $\color{#35bf28}+2.85\%$
test_stack[memmap_tensor0] 49.6830μs 5.8076μs 172.1868 KOps/s 171.2012 KOps/s $\color{#35bf28}+0.58\%$
test_memmaptd_index 99.2589ms 0.5549ms 1.8020 KOps/s 2.3777 KOps/s $\textbf{\color{#d91a1a}-24.21\%}$
test_memmaptd_index_astensor 1.0905ms 0.5223ms 1.9147 KOps/s 1.9097 KOps/s $\color{#35bf28}+0.26\%$
test_memmaptd_index_op 1.5238ms 1.0646ms 939.3153 Ops/s 920.8463 Ops/s $\color{#35bf28}+2.01\%$
test_serialize_model 0.1323s 0.1217s 8.2166 Ops/s 8.3264 Ops/s $\color{#d91a1a}-1.32\%$
test_serialize_model_pickle 0.4509s 0.3951s 2.5308 Ops/s 2.5162 Ops/s $\color{#35bf28}+0.58\%$
test_serialize_weights 0.1250s 0.1167s 8.5685 Ops/s 8.8310 Ops/s $\color{#d91a1a}-2.97\%$
test_serialize_weights_returnearly 0.2490s 0.1749s 5.7187 Ops/s 6.3351 Ops/s $\textbf{\color{#d91a1a}-9.73\%}$
test_serialize_weights_pickle 1.0946s 0.7389s 1.3533 Ops/s 2.5299 Ops/s $\textbf{\color{#d91a1a}-46.51\%}$
test_serialize_weights_filesystem 0.1485s 0.1438s 6.9559 Ops/s 6.9332 Ops/s $\color{#35bf28}+0.33\%$
test_serialize_model_filesystem 0.2359s 0.1590s 6.2905 Ops/s 6.5340 Ops/s $\color{#d91a1a}-3.73\%$
test_reshape_pytree 85.2090μs 39.7971μs 25.1275 KOps/s 25.7610 KOps/s $\color{#d91a1a}-2.46\%$
test_reshape_td 0.1052ms 46.3455μs 21.5771 KOps/s 22.2916 KOps/s $\color{#d91a1a}-3.21\%$
test_view_pytree 95.2580μs 39.8337μs 25.1044 KOps/s 26.0319 KOps/s $\color{#d91a1a}-3.56\%$
test_view_td 0.1507ms 53.6176μs 18.6506 KOps/s 19.5714 KOps/s $\color{#d91a1a}-4.70\%$
test_unbind_pytree 76.1620μs 36.4780μs 27.4138 KOps/s 27.2173 KOps/s $\color{#35bf28}+0.72\%$
test_unbind_td 0.2928ms 45.9057μs 21.7838 KOps/s 22.1765 KOps/s $\color{#d91a1a}-1.77\%$
test_split_pytree 0.1065ms 38.4387μs 26.0154 KOps/s 26.1896 KOps/s $\color{#d91a1a}-0.67\%$
test_split_td 0.1938ms 57.8514μs 17.2857 KOps/s 17.2344 KOps/s $\color{#35bf28}+0.30\%$
test_add_pytree 0.1858ms 49.2759μs 20.2939 KOps/s 21.4968 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_add_td 0.1770ms 87.9896μs 11.3650 KOps/s 11.1680 KOps/s $\color{#35bf28}+1.76\%$
test_compile_add_one_nested[tensordict-compile] 0.1627ms 57.8996μs 17.2713 KOps/s 16.6227 KOps/s $\color{#35bf28}+3.90\%$
test_compile_add_one_nested[tensordict-eager] 0.2602ms 0.1970ms 5.0753 KOps/s 5.0000 KOps/s $\color{#35bf28}+1.51\%$
test_compile_add_one_nested[pytree-compile] 0.1319ms 58.1415μs 17.1994 KOps/s 17.3122 KOps/s $\color{#d91a1a}-0.65\%$
test_compile_add_one_nested[pytree-eager] 0.2597ms 0.1441ms 6.9415 KOps/s 6.9196 KOps/s $\color{#35bf28}+0.32\%$
test_compile_copy_nested[tensordict-compile] 84.6080μs 22.9797μs 43.5166 KOps/s 41.7986 KOps/s $\color{#35bf28}+4.11\%$
test_compile_copy_nested[tensordict-eager] 0.1505ms 74.7103μs 13.3850 KOps/s 13.4477 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_copy_nested[pytree-compile] 0.1780ms 75.8064μs 13.1915 KOps/s 13.1492 KOps/s $\color{#35bf28}+0.32\%$
test_compile_copy_nested[pytree-eager] 0.1205ms 68.5801μs 14.5815 KOps/s 14.4119 KOps/s $\color{#35bf28}+1.18\%$
test_compile_add_one_flat[tensordict-compile] 0.2747ms 0.1828ms 5.4690 KOps/s 5.4432 KOps/s $\color{#35bf28}+0.47\%$
test_compile_add_one_flat[tensordict-eager] 0.4468ms 0.2413ms 4.1449 KOps/s 4.1671 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_add_one_flat[tensorclass-compile] 0.1396ms 47.7805μs 20.9291 KOps/s 20.3363 KOps/s $\color{#35bf28}+2.91\%$
test_compile_add_one_flat[tensorclass-eager] 0.1556ms 78.1717μs 12.7923 KOps/s 12.9370 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_add_one_flat[pytree-compile] 0.2976ms 0.1766ms 5.6628 KOps/s 5.6445 KOps/s $\color{#35bf28}+0.32\%$
test_compile_add_one_flat[pytree-eager] 0.5208ms 0.2982ms 3.3534 KOps/s 3.4333 KOps/s $\color{#d91a1a}-2.33\%$
test_compile_add_self_flat[tensordict-eager] 0.5121ms 0.2755ms 3.6303 KOps/s 3.6499 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_add_self_flat[tensordict-compile] 0.4616ms 0.1873ms 5.3378 KOps/s 5.4219 KOps/s $\color{#d91a1a}-1.55\%$
test_compile_add_self_flat[tensorclass-eager] 0.1778ms 75.7233μs 13.2060 KOps/s 13.5323 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_add_self_flat[tensorclass-compile] 0.1328ms 49.0834μs 20.3735 KOps/s 20.0738 KOps/s $\color{#35bf28}+1.49\%$
test_compile_add_self_flat[pytree-eager] 0.4422ms 0.2416ms 4.1395 KOps/s 4.2614 KOps/s $\color{#d91a1a}-2.86\%$
test_compile_add_self_flat[pytree-compile] 0.3063ms 0.1788ms 5.5942 KOps/s 5.6960 KOps/s $\color{#d91a1a}-1.79\%$
test_compile_copy_flat[tensordict-compile] 0.2257ms 0.1125ms 8.8907 KOps/s 8.9498 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_copy_flat[tensordict-eager] 0.1810ms 78.1776μs 12.7914 KOps/s 13.0020 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_copy_flat[pytree-compile] 0.1485ms 80.8335μs 12.3711 KOps/s 12.3885 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_copy_flat[pytree-eager] 0.1428ms 72.6062μs 13.7729 KOps/s 14.0801 KOps/s $\color{#d91a1a}-2.18\%$
test_compile_assign_and_add[tensordict-compile] 0.3247ms 0.1940ms 5.1539 KOps/s 5.1122 KOps/s $\color{#35bf28}+0.81\%$
test_compile_assign_and_add[tensordict-eager] 2.8221ms 1.7444ms 573.2729 Ops/s 568.5812 Ops/s $\color{#35bf28}+0.83\%$
test_compile_assign_and_add[pytree-compile] 0.3798ms 0.1933ms 5.1726 KOps/s 5.1580 KOps/s $\color{#35bf28}+0.28\%$
test_compile_assign_and_add[pytree-eager] 1.3598ms 1.1162ms 895.9330 Ops/s 906.5866 Ops/s $\color{#d91a1a}-1.18\%$
test_compile_assign_and_add_stack[compile] 0.6615ms 0.4207ms 2.3773 KOps/s 2.4043 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_assign_and_add_stack[eager] 6.1645ms 4.0814ms 245.0139 Ops/s 242.9475 Ops/s $\color{#35bf28}+0.85\%$
test_compile_indexing[tensor-tensordict-compile] 0.1247ms 33.1638μs 30.1534 KOps/s 28.8330 KOps/s $\color{#35bf28}+4.58\%$
test_compile_indexing[tensor-tensordict-eager] 1.0904ms 50.4913μs 19.8054 KOps/s 20.3073 KOps/s $\color{#d91a1a}-2.47\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1286ms 30.3010μs 33.0022 KOps/s 33.0245 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1095ms 29.1299μs 34.3290 KOps/s 34.1421 KOps/s $\color{#35bf28}+0.55\%$
test_compile_indexing[tensor-pytree-compile] 0.1713ms 30.4313μs 32.8609 KOps/s 33.5366 KOps/s $\color{#d91a1a}-2.01\%$
test_compile_indexing[tensor-pytree-eager] 0.1075ms 29.2311μs 34.2101 KOps/s 34.1580 KOps/s $\color{#35bf28}+0.15\%$
test_compile_indexing[slice-tensordict-compile] 0.1630ms 73.7334μs 13.5624 KOps/s 13.7809 KOps/s $\color{#d91a1a}-1.59\%$
test_compile_indexing[slice-tensordict-eager] 0.5874ms 26.6961μs 37.4587 KOps/s 35.1406 KOps/s $\textbf{\color{#35bf28}+6.60\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1826ms 69.2789μs 14.4344 KOps/s 14.5478 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_indexing[slice-tensorclass-eager] 60.4730μs 23.5330μs 42.4934 KOps/s 41.5428 KOps/s $\color{#35bf28}+2.29\%$
test_compile_indexing[slice-pytree-compile] 0.1424ms 68.2797μs 14.6456 KOps/s 14.6505 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_indexing[slice-pytree-eager] 0.1110ms 23.1394μs 43.2163 KOps/s 42.3667 KOps/s $\color{#35bf28}+2.01\%$
test_compile_indexing[int-tensordict-compile] 0.1951ms 74.3403μs 13.4516 KOps/s 13.9065 KOps/s $\color{#d91a1a}-3.27\%$
test_compile_indexing[int-tensordict-eager] 0.7292ms 26.9878μs 37.0537 KOps/s 35.9358 KOps/s $\color{#35bf28}+3.11\%$
test_compile_indexing[int-tensorclass-compile] 0.1725ms 70.6209μs 14.1601 KOps/s 14.6974 KOps/s $\color{#d91a1a}-3.66\%$
test_compile_indexing[int-tensorclass-eager] 0.1780ms 23.2536μs 43.0041 KOps/s 42.4564 KOps/s $\color{#35bf28}+1.29\%$
test_compile_indexing[int-pytree-compile] 0.2498ms 70.5700μs 14.1703 KOps/s 14.5948 KOps/s $\color{#d91a1a}-2.91\%$
test_compile_indexing[int-pytree-eager] 72.7160μs 23.3207μs 42.8803 KOps/s 42.8341 KOps/s $\color{#35bf28}+0.11\%$
test_mod_add[eager] 86.3410μs 24.6481μs 40.5711 KOps/s 38.0650 KOps/s $\textbf{\color{#35bf28}+6.58\%}$
test_mod_add[compile] 85.6810μs 39.3445μs 25.4165 KOps/s 25.8943 KOps/s $\color{#d91a1a}-1.84\%$
test_mod_add[compile-overhead] 0.1121ms 40.1801μs 24.8880 KOps/s 25.7321 KOps/s $\color{#d91a1a}-3.28\%$
test_mod_wrap[eager] 0.4207ms 0.2152ms 4.6458 KOps/s 4.7840 KOps/s $\color{#d91a1a}-2.89\%$
test_mod_wrap[compile] 0.3700ms 0.2356ms 4.2452 KOps/s 4.2562 KOps/s $\color{#d91a1a}-0.26\%$
test_mod_wrap[compile-overhead] 0.4522ms 0.2327ms 4.2965 KOps/s 4.3035 KOps/s $\color{#d91a1a}-0.16\%$
test_mod_wrap_and_backward[eager] 13.0851ms 10.8222ms 92.4027 Ops/s 89.9961 Ops/s $\color{#35bf28}+2.67\%$
test_mod_wrap_and_backward[compile] 12.4883ms 11.0451ms 90.5380 Ops/s 88.1676 Ops/s $\color{#35bf28}+2.69\%$
test_mod_wrap_and_backward[compile-overhead] 11.9943ms 11.0348ms 90.6223 Ops/s 82.8575 Ops/s $\textbf{\color{#35bf28}+9.37\%}$
test_seq_add[eager] 0.2504ms 91.2330μs 10.9609 KOps/s 10.7176 KOps/s $\color{#35bf28}+2.27\%$
test_seq_add[compile] 0.1674ms 67.2862μs 14.8619 KOps/s 15.4722 KOps/s $\color{#d91a1a}-3.94\%$
test_seq_add[compile-overhead] 0.1517ms 65.3200μs 15.3092 KOps/s 15.6618 KOps/s $\color{#d91a1a}-2.25\%$
test_seq_wrap[eager] 0.6657ms 0.3939ms 2.5387 KOps/s 2.6178 KOps/s $\color{#d91a1a}-3.02\%$
test_seq_wrap[compile] 1.2460ms 0.2781ms 3.5961 KOps/s 3.7049 KOps/s $\color{#d91a1a}-2.94\%$
test_seq_wrap[compile-overhead] 1.3426ms 0.2771ms 3.6091 KOps/s 3.6439 KOps/s $\color{#d91a1a}-0.96\%$
test_func_call_runtime[False-eager] 1.0252ms 0.5429ms 1.8418 KOps/s 1.8864 KOps/s $\color{#d91a1a}-2.36\%$
test_func_call_runtime[False-compile] 0.7468ms 0.5251ms 1.9043 KOps/s 1.9416 KOps/s $\color{#d91a1a}-1.92\%$
test_func_call_runtime[False-compile-overhead] 0.7365ms 0.5225ms 1.9138 KOps/s 1.9402 KOps/s $\color{#d91a1a}-1.36\%$
test_func_call_runtime[True-eager] 1.6813ms 0.7824ms 1.2780 KOps/s 1.3416 KOps/s $\color{#d91a1a}-4.74\%$
test_func_call_runtime[True-compile] 1.1783ms 0.5285ms 1.8923 KOps/s 1.8914 KOps/s $\color{#35bf28}+0.05\%$
test_func_call_runtime[True-compile-overhead] 0.9523ms 0.5363ms 1.8645 KOps/s 1.8975 KOps/s $\color{#d91a1a}-1.74\%$
test_func_call_cm_runtime[False-eager] 1.2286ms 0.5463ms 1.8305 KOps/s 1.8727 KOps/s $\color{#d91a1a}-2.25\%$
test_func_call_cm_runtime[False-compile] 1.1377ms 0.5205ms 1.9214 KOps/s 1.9306 KOps/s $\color{#d91a1a}-0.48\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7198ms 0.5182ms 1.9297 KOps/s 1.9435 KOps/s $\color{#d91a1a}-0.71\%$
test_func_call_cm_runtime[True-eager] 1.5363ms 0.9220ms 1.0846 KOps/s 1.1059 KOps/s $\color{#d91a1a}-1.93\%$
test_func_call_cm_runtime[True-compile] 1.0701ms 0.7724ms 1.2946 KOps/s 1.3348 KOps/s $\color{#d91a1a}-3.01\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1173ms 0.7737ms 1.2925 KOps/s 1.3208 KOps/s $\color{#d91a1a}-2.14\%$
test_vmap_func_call_cm_runtime[eager] 2.5657ms 1.9555ms 511.3707 Ops/s 515.4657 Ops/s $\color{#d91a1a}-0.79\%$
test_vmap_func_call_cm_runtime[compile] 2.9231ms 2.0063ms 498.4410 Ops/s 502.9212 Ops/s $\color{#d91a1a}-0.89\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.7532ms 2.0164ms 495.9296 Ops/s 501.0523 Ops/s $\color{#d91a1a}-1.02\%$
test_distributed 0.3481ms 0.1259ms 7.9417 KOps/s 7.7822 KOps/s $\color{#35bf28}+2.05\%$
test_tdmodule 31.6290μs 17.3866μs 57.5156 KOps/s 54.6844 KOps/s $\textbf{\color{#35bf28}+5.18\%}$
test_tdmodule_dispatch 98.4840μs 36.3408μs 27.5173 KOps/s 27.1136 KOps/s $\color{#35bf28}+1.49\%$
test_tdseq 57.9980μs 20.5003μs 48.7798 KOps/s 45.4632 KOps/s $\textbf{\color{#35bf28}+7.30\%}$
test_tdseq_dispatch 96.2700μs 41.7139μs 23.9728 KOps/s 23.4581 KOps/s $\color{#35bf28}+2.19\%$
test_instantiation_functorch 1.9711ms 1.6217ms 616.6324 Ops/s 619.9585 Ops/s $\color{#d91a1a}-0.54\%$
test_instantiation_td 3.4287ms 1.2054ms 829.5926 Ops/s 832.1445 Ops/s $\color{#d91a1a}-0.31\%$
test_exec_functorch 0.4531ms 0.1949ms 5.1301 KOps/s 5.2333 KOps/s $\color{#d91a1a}-1.97\%$
test_exec_functional_call 0.3506ms 0.1790ms 5.5851 KOps/s 5.6597 KOps/s $\color{#d91a1a}-1.32\%$
test_exec_td 0.4848ms 0.2061ms 4.8523 KOps/s 4.9423 KOps/s $\color{#d91a1a}-1.82\%$
test_exec_td_decorator 0.9682ms 0.2397ms 4.1711 KOps/s 4.1967 KOps/s $\color{#d91a1a}-0.61\%$
test_vmap_mlp_speed[True-True] 1.4358ms 0.6991ms 1.4304 KOps/s 1.4422 KOps/s $\color{#d91a1a}-0.82\%$
test_vmap_mlp_speed[True-False] 0.9654ms 0.6956ms 1.4376 KOps/s 1.4499 KOps/s $\color{#d91a1a}-0.85\%$
test_vmap_mlp_speed[False-True] 0.8617ms 0.5479ms 1.8251 KOps/s 1.8523 KOps/s $\color{#d91a1a}-1.47\%$
test_vmap_mlp_speed[False-False] 0.8984ms 0.5420ms 1.8450 KOps/s 1.8507 KOps/s $\color{#d91a1a}-0.30\%$
test_vmap_mlp_speed_decorator[True-True] 1.6616ms 0.6505ms 1.5374 KOps/s 1.5398 KOps/s $\color{#d91a1a}-0.16\%$
test_vmap_mlp_speed_decorator[True-False] 1.0416ms 0.6496ms 1.5394 KOps/s 1.5330 KOps/s $\color{#35bf28}+0.42\%$
test_vmap_mlp_speed_decorator[False-True] 0.8587ms 0.5404ms 1.8505 KOps/s 1.8659 KOps/s $\color{#d91a1a}-0.83\%$
test_vmap_mlp_speed_decorator[False-False] 1.0426ms 0.5408ms 1.8490 KOps/s 1.8575 KOps/s $\color{#d91a1a}-0.46\%$
test_to_module_speed[True] 2.0379ms 1.4165ms 705.9527 Ops/s 712.7198 Ops/s $\color{#d91a1a}-0.95\%$
test_to_module_speed[False] 1.5458ms 1.3870ms 720.9640 Ops/s 734.5983 Ops/s $\color{#d91a1a}-1.86\%$
test_tc_init 92.9740μs 44.8525μs 22.2953 KOps/s 21.3358 KOps/s $\color{#35bf28}+4.50\%$
test_tc_init_nested 0.1606ms 86.6660μs 11.5385 KOps/s 10.5921 KOps/s $\textbf{\color{#35bf28}+8.94\%}$
test_tc_first_layer_tensor 26.4290μs 1.5149μs 660.1072 KOps/s 648.7821 KOps/s $\color{#35bf28}+1.75\%$
test_tc_first_layer_nontensor 22.7830μs 4.6929μs 213.0878 KOps/s 216.5014 KOps/s $\color{#d91a1a}-1.58\%$
test_tc_second_layer_tensor 65.0900μs 2.7741μs 360.4756 KOps/s 350.6166 KOps/s $\color{#35bf28}+2.81\%$
test_tc_second_layer_nontensor 63.3790μs 6.0620μs 164.9629 KOps/s 166.3476 KOps/s $\color{#d91a1a}-0.83\%$
test_unbind 0.4635s 13.0243ms 76.7797 Ops/s 71.2245 Ops/s $\textbf{\color{#35bf28}+7.80\%}$
test_full_like 14.0529ms 8.5487ms 116.9765 Ops/s 133.8156 Ops/s $\textbf{\color{#d91a1a}-12.58\%}$
test_zeros_like 6.8364ms 3.2949ms 303.5013 Ops/s 355.5127 Ops/s $\textbf{\color{#d91a1a}-14.63\%}$
test_ones_like 6.8777ms 3.7629ms 265.7539 Ops/s 167.1866 Ops/s $\textbf{\color{#35bf28}+58.96\%}$
test_clone 9.8628ms 5.4492ms 183.5116 Ops/s 130.4426 Ops/s $\textbf{\color{#35bf28}+40.68\%}$
test_squeeze 65.8230μs 12.6423μs 79.0994 KOps/s 80.2156 KOps/s $\color{#d91a1a}-1.39\%$
test_unsqueeze 0.3375ms 93.0169μs 10.7507 KOps/s 10.6437 KOps/s $\color{#35bf28}+1.01\%$
test_split 0.3672ms 0.1970ms 5.0751 KOps/s 4.9805 KOps/s $\color{#35bf28}+1.90\%$
test_permute 0.3580ms 0.2262ms 4.4212 KOps/s 4.4546 KOps/s $\color{#d91a1a}-0.75\%$
test_stack 27.4948ms 25.1978ms 39.6860 Ops/s 38.6689 Ops/s $\color{#35bf28}+2.63\%$
test_cat 29.0101ms 25.5683ms 39.1110 Ops/s 38.3051 Ops/s $\color{#35bf28}+2.10\%$
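
For readers checking the numbers, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, and the bolded rows match the Improved/Worsened totals if a ±5% cutoff is assumed. The sketch below reproduces that arithmetic on three rows copied from the table. The function names and the 5% threshold are assumptions inferred from the reported values, not the benchmark workflow's actual code.

```python
# Sketch only: relative_change, classify and the 5% threshold are
# hypothetical, inferred from the table rather than taken from the CI script.

def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change as improved/worsened only beyond the assumed cutoff."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_membership": (1.3477, 1.1605),          # MOps/s
    "test_plain_set_nested": (41.2137, 41.6689),  # KOps/s
    "test_lock_nested": (1.6170, 1.9623),         # KOps/s
}

for name, (ops, head) in rows.items():
    pct = relative_change(ops, head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Because the table shows rounded throughput values, recomputed percentages can differ from the reported ones in the last digit.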


github-actions bot commented Oct 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}31$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1423ms 16.7357μs 59.7524 KOps/s 56.5284 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_plain_set_stack_nested 56.1710μs 16.8278μs 59.4256 KOps/s 55.9927 KOps/s $\textbf{\color{#35bf28}+6.13\%}$
test_plain_set_nested_inplace 48.6200μs 18.0181μs 55.4998 KOps/s 52.7857 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_plain_set_stack_nested_inplace 48.6700μs 17.8248μs 56.1015 KOps/s 53.1827 KOps/s $\textbf{\color{#35bf28}+5.49\%}$
test_items 31.4600μs 2.9069μs 344.0033 KOps/s 348.7128 KOps/s $\color{#d91a1a}-1.35\%$
test_items_nested 0.3949ms 0.3411ms 2.9321 KOps/s 2.9389 KOps/s $\color{#d91a1a}-0.23\%$
test_items_nested_locked 0.3926ms 0.3437ms 2.9092 KOps/s 2.9305 KOps/s $\color{#d91a1a}-0.73\%$
test_items_nested_leaf 0.1009ms 62.6362μs 15.9652 KOps/s 15.8279 KOps/s $\color{#35bf28}+0.87\%$
test_items_stack_nested 0.3997ms 0.3482ms 2.8716 KOps/s 2.9235 KOps/s $\color{#d91a1a}-1.77\%$
test_items_stack_nested_leaf 89.0510μs 64.8153μs 15.4285 KOps/s 15.5715 KOps/s $\color{#d91a1a}-0.92\%$
test_items_stack_nested_locked 0.4038ms 0.3496ms 2.8601 KOps/s 2.9309 KOps/s $\color{#d91a1a}-2.42\%$
test_keys 44.1200μs 3.4366μs 290.9821 KOps/s 292.4053 KOps/s $\color{#d91a1a}-0.49\%$
test_keys_nested 0.1080ms 71.0590μs 14.0728 KOps/s 14.3451 KOps/s $\color{#d91a1a}-1.90\%$
test_keys_nested_locked 2.3715ms 77.3580μs 12.9269 KOps/s 12.8336 KOps/s $\color{#35bf28}+0.73\%$
test_keys_nested_leaf 95.6320μs 61.7956μs 16.1824 KOps/s 16.4192 KOps/s $\color{#d91a1a}-1.44\%$
test_keys_stack_nested 0.1167ms 71.5100μs 13.9841 KOps/s 14.0911 KOps/s $\color{#d91a1a}-0.76\%$
test_keys_stack_nested_leaf 91.0410μs 63.6188μs 15.7186 KOps/s 15.9523 KOps/s $\color{#d91a1a}-1.46\%$
test_keys_stack_nested_locked 0.1162ms 77.4400μs 12.9132 KOps/s 13.1604 KOps/s $\color{#d91a1a}-1.88\%$
test_values 9.0352μs 0.8407μs 1.1894 MOps/s 1.1658 MOps/s $\color{#35bf28}+2.03\%$
test_values_nested 80.4810μs 49.0078μs 20.4049 KOps/s 20.4939 KOps/s $\color{#d91a1a}-0.43\%$
test_values_nested_locked 79.5810μs 50.6781μs 19.7324 KOps/s 19.8986 KOps/s $\color{#d91a1a}-0.84\%$
test_values_nested_leaf 79.5420μs 42.5785μs 23.4861 KOps/s 23.3740 KOps/s $\color{#35bf28}+0.48\%$
test_values_stack_nested 88.5720μs 50.8311μs 19.6730 KOps/s 20.2900 KOps/s $\color{#d91a1a}-3.04\%$
test_values_stack_nested_leaf 72.7310μs 43.6166μs 22.9271 KOps/s 23.3851 KOps/s $\color{#d91a1a}-1.96\%$
test_values_stack_nested_locked 97.1420μs 51.1885μs 19.5356 KOps/s 19.5222 KOps/s $\color{#35bf28}+0.07\%$
test_membership 1.8995μs 0.5016μs 1.9935 MOps/s 1.9819 MOps/s $\color{#35bf28}+0.58\%$
test_membership_nested 25.8455μs 1.8917μs 528.6371 KOps/s 525.1662 KOps/s $\color{#35bf28}+0.66\%$
test_membership_nested_leaf 16.5600μs 1.9037μs 525.2828 KOps/s 538.3513 KOps/s $\color{#d91a1a}-2.43\%$
test_membership_stacked_nested 23.8810μs 1.9684μs 508.0165 KOps/s 515.5881 KOps/s $\color{#d91a1a}-1.47\%$
test_membership_stacked_nested_leaf 28.9500μs 2.0147μs 496.3458 KOps/s 513.3398 KOps/s $\color{#d91a1a}-3.31\%$
test_membership_nested_last 29.7110μs 3.0638μs 326.3884 KOps/s 337.7153 KOps/s $\color{#d91a1a}-3.35\%$
test_membership_nested_leaf_last 37.3800μs 3.0443μs 328.4821 KOps/s 334.2325 KOps/s $\color{#d91a1a}-1.72\%$
test_membership_stacked_nested_last 31.9710μs 3.5436μs 282.2017 KOps/s 120.3593 KOps/s $\textbf{\color{#35bf28}+134.47\%}$
test_membership_stacked_nested_leaf_last 25.7810μs 3.5477μs 281.8728 KOps/s 121.3895 KOps/s $\textbf{\color{#35bf28}+132.21\%}$
test_nested_getleaf 31.7610μs 6.1121μs 163.6088 KOps/s 163.7989 KOps/s $\color{#d91a1a}-0.12\%$
test_nested_get 36.3710μs 5.7616μs 173.5621 KOps/s 172.2025 KOps/s $\color{#35bf28}+0.79\%$
test_stacked_getleaf 37.1310μs 6.0498μs 165.2960 KOps/s 163.6326 KOps/s $\color{#35bf28}+1.02\%$
test_stacked_get 28.4500μs 5.6815μs 176.0112 KOps/s 171.8623 KOps/s $\color{#35bf28}+2.41\%$
test_nested_getitemleaf 36.4600μs 6.1655μs 162.1917 KOps/s 161.2197 KOps/s $\color{#35bf28}+0.60\%$
test_nested_getitem 34.6110μs 5.7949μs 172.5665 KOps/s 170.6196 KOps/s $\color{#35bf28}+1.14\%$
test_stacked_getitemleaf 36.0500μs 6.0714μs 164.7079 KOps/s 162.8861 KOps/s $\color{#35bf28}+1.12\%$
test_stacked_getitem 30.5800μs 5.6554μs 176.8211 KOps/s 171.5395 KOps/s $\color{#35bf28}+3.08\%$
test_lock_nested 4.9128ms 0.4433ms 2.2557 KOps/s 2.2909 KOps/s $\color{#d91a1a}-1.54\%$
test_lock_stack_nested 0.4644ms 0.4004ms 2.4975 KOps/s 2.5767 KOps/s $\color{#d91a1a}-3.08\%$
test_unlock_nested 0.7794ms 0.3772ms 2.6508 KOps/s 2.6654 KOps/s $\color{#d91a1a}-0.55\%$
test_unlock_stack_nested 0.3792ms 0.3384ms 2.9547 KOps/s 3.0780 KOps/s $\color{#d91a1a}-4.00\%$
test_flatten_speed 0.1118ms 77.3906μs 12.9215 KOps/s 13.0224 KOps/s $\color{#d91a1a}-0.78\%$
test_unflatten_speed 0.3806ms 0.3191ms 3.1335 KOps/s 3.0672 KOps/s $\color{#35bf28}+2.16\%$
test_common_ops 1.6218ms 1.2709ms 786.8371 Ops/s 768.2553 Ops/s $\color{#35bf28}+2.42\%$
test_creation 26.2910μs 1.4856μs 673.1458 KOps/s 674.8182 KOps/s $\color{#d91a1a}-0.25\%$
test_creation_empty 40.0610μs 15.4194μs 64.8532 KOps/s 57.9181 KOps/s $\textbf{\color{#35bf28}+11.97\%}$
test_creation_nested_1 54.6910μs 17.6016μs 56.8129 KOps/s 52.2371 KOps/s $\textbf{\color{#35bf28}+8.76\%}$
test_creation_nested_2 52.9200μs 19.9163μs 50.2100 KOps/s 45.8447 KOps/s $\textbf{\color{#35bf28}+9.52\%}$
test_clone 55.8510μs 29.9012μs 33.4435 KOps/s 35.1329 KOps/s $\color{#d91a1a}-4.81\%$
test_getitem[int] 92.4022ms 23.3864μs 42.7599 KOps/s 59.9319 KOps/s $\textbf{\color{#d91a1a}-28.65\%}$
test_getitem[slice_int] 0.1211ms 28.3042μs 35.3304 KOps/s 35.0527 KOps/s $\color{#35bf28}+0.79\%$
test_getitem[range] 0.2178ms 0.1103ms 9.0702 KOps/s 9.2988 KOps/s $\color{#d91a1a}-2.46\%$
test_getitem[tuple] 0.1272ms 24.4828μs 40.8450 KOps/s 41.2139 KOps/s $\color{#d91a1a}-0.90\%$
test_getitem[list] 0.1963ms 98.3504μs 10.1677 KOps/s 9.6626 KOps/s $\textbf{\color{#35bf28}+5.23\%}$
test_setitem_dim[int] 68.1720μs 45.3140μs 22.0682 KOps/s 21.0523 KOps/s $\color{#35bf28}+4.83\%$
test_setitem_dim[slice_int] 93.3120μs 66.8406μs 14.9610 KOps/s 14.9262 KOps/s $\color{#35bf28}+0.23\%$
test_setitem_dim[range] 0.1790ms 0.1312ms 7.6214 KOps/s 7.8111 KOps/s $\color{#d91a1a}-2.43\%$
test_setitem_dim[tuple] 96.0820μs 64.6615μs 15.4652 KOps/s 15.6947 KOps/s $\color{#d91a1a}-1.46\%$
test_setitem 88.9820μs 42.7362μs 23.3994 KOps/s 21.6875 KOps/s $\textbf{\color{#35bf28}+7.89\%}$
test_set 94.5420μs 43.3233μs 23.0823 KOps/s 22.2315 KOps/s $\color{#35bf28}+3.83\%$
test_set_shared 0.3458ms 54.7176μs 18.2756 KOps/s 18.4538 KOps/s $\color{#d91a1a}-0.97\%$
test_update 96.0420μs 51.7313μs 19.3307 KOps/s 19.2693 KOps/s $\color{#35bf28}+0.32\%$
test_update_nested 0.1094ms 59.7830μs 16.7272 KOps/s 16.6587 KOps/s $\color{#35bf28}+0.41\%$
test_update__nested 0.1106ms 65.5302μs 15.2601 KOps/s 15.3016 KOps/s $\color{#d91a1a}-0.27\%$
test_set_nested 83.4220μs 46.1569μs 21.6652 KOps/s 20.7886 KOps/s $\color{#35bf28}+4.22\%$
test_set_nested_new 89.8520μs 48.9444μs 20.4314 KOps/s 19.6188 KOps/s $\color{#35bf28}+4.14\%$
test_select 0.1048ms 62.8124μs 15.9204 KOps/s 15.3797 KOps/s $\color{#35bf28}+3.52\%$
test_select_nested 67.6320μs 44.1386μs 22.6559 KOps/s 23.2932 KOps/s $\color{#d91a1a}-2.74\%$
test_exclude_nested 0.1114ms 59.5756μs 16.7854 KOps/s 16.7120 KOps/s $\color{#35bf28}+0.44\%$
test_empty[True] 0.3298ms 0.2596ms 3.8517 KOps/s 3.8812 KOps/s $\color{#d91a1a}-0.76\%$
test_empty[False] 2.9811μs 0.7385μs 1.3541 MOps/s 1.3100 MOps/s $\color{#35bf28}+3.37\%$
test_to 55.9910μs 26.7009μs 37.4519 KOps/s 36.2138 KOps/s $\color{#35bf28}+3.42\%$
test_to_nonblocking 59.1710μs 25.6447μs 38.9944 KOps/s 35.0508 KOps/s $\textbf{\color{#35bf28}+11.25\%}$
test_unbind_speed 0.3316ms 0.2925ms 3.4194 KOps/s 3.4157 KOps/s $\color{#35bf28}+0.11\%$
test_unbind_speed_stack0 0.3307ms 0.2856ms 3.5010 KOps/s 3.5958 KOps/s $\color{#d91a1a}-2.64\%$
test_unbind_speed_stack1 91.4093ms 0.7135ms 1.4015 KOps/s 1.4090 KOps/s $\color{#d91a1a}-0.53\%$
test_split 92.8449ms 2.1980ms 454.9658 Ops/s 447.5672 Ops/s $\color{#35bf28}+1.65\%$
test_chunk 94.7864ms 2.1938ms 455.8360 Ops/s 444.7776 Ops/s $\color{#35bf28}+2.49\%$
test_creation[device0] 0.3389ms 0.1297ms 7.7113 KOps/s 7.8659 KOps/s $\color{#d91a1a}-1.97\%$
test_creation_from_tensor 0.3469ms 0.1281ms 7.8089 KOps/s 7.4675 KOps/s $\color{#35bf28}+4.57\%$
test_add_one[memmap_tensor0] 0.2193ms 9.2367μs 108.2640 KOps/s 107.8833 KOps/s $\color{#35bf28}+0.35\%$
test_contiguous[memmap_tensor0] 40.2000μs 2.2211μs 450.2210 KOps/s 458.2078 KOps/s $\color{#d91a1a}-1.74\%$
test_stack[memmap_tensor0] 38.9410μs 7.1184μs 140.4807 KOps/s 144.9507 KOps/s $\color{#d91a1a}-3.08\%$
test_memmaptd_index 1.2570ms 0.4363ms 2.2923 KOps/s 2.2368 KOps/s $\color{#35bf28}+2.48\%$
test_memmaptd_index_astensor 1.0010ms 0.5039ms 1.9846 KOps/s 1.9360 KOps/s $\color{#35bf28}+2.51\%$
test_memmaptd_index_op 1.4649ms 1.0593ms 944.0508 Ops/s 929.4001 Ops/s $\color{#35bf28}+1.58\%$
test_serialize_model 0.1274s 0.1267s 7.8937 Ops/s 7.9191 Ops/s $\color{#d91a1a}-0.32\%$
test_serialize_model_pickle 1.3692s 1.2170s 0.8217 Ops/s 0.8248 Ops/s $\color{#d91a1a}-0.38\%$
test_serialize_weights 0.1265s 0.1258s 7.9498 Ops/s 7.2366 Ops/s $\textbf{\color{#35bf28}+9.86\%}$
test_serialize_weights_returnearly 0.2123s 56.8247ms 17.5980 Ops/s 17.1329 Ops/s $\color{#35bf28}+2.71\%$
test_serialize_weights_pickle 1.3520s 1.2128s 0.8245 Ops/s 0.8222 Ops/s $\color{#35bf28}+0.28\%$
test_reshape_pytree 79.4510μs 36.8362μs 27.1472 KOps/s 26.9702 KOps/s $\color{#35bf28}+0.66\%$
test_reshape_td 75.9710μs 41.4816μs 24.1071 KOps/s 21.9840 KOps/s $\textbf{\color{#35bf28}+9.66\%}$
test_view_pytree 72.3910μs 37.4068μs 26.7331 KOps/s 26.1584 KOps/s $\color{#35bf28}+2.20\%$
test_view_td 87.1020μs 46.8623μs 21.3391 KOps/s 20.1478 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_unbind_pytree 65.2010μs 36.9212μs 27.0847 KOps/s 28.7121 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_unbind_td 0.4094ms 47.5471μs 21.0318 KOps/s 23.3650 KOps/s $\textbf{\color{#d91a1a}-9.99\%}$
test_split_pytree 0.5058ms 49.0169μs 20.4011 KOps/s 20.4075 KOps/s $\color{#d91a1a}-0.03\%$
test_split_td 92.9630ms 65.7714μs 15.2042 KOps/s 17.1371 KOps/s $\textbf{\color{#d91a1a}-11.28\%}$
test_add_pytree 0.2159ms 60.2547μs 16.5962 KOps/s 17.7736 KOps/s $\textbf{\color{#d91a1a}-6.62\%}$
test_add_td 0.1727ms 90.5780μs 11.0402 KOps/s 9.7175 KOps/s $\textbf{\color{#35bf28}+13.61\%}$
test_compile_add_one_nested[tensordict-compile] 0.2744ms 0.1622ms 6.1662 KOps/s 5.9074 KOps/s $\color{#35bf28}+4.38\%$
test_compile_add_one_nested[tensordict-eager] 0.3122ms 0.1664ms 6.0099 KOps/s 6.0587 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_add_one_nested[pytree-compile] 0.1822ms 0.1435ms 6.9679 KOps/s 6.9724 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_add_one_nested[pytree-eager] 0.2335ms 0.1834ms 5.4517 KOps/s 5.4662 KOps/s $\color{#d91a1a}-0.26\%$
test_compile_copy_nested[tensordict-compile] 49.9410μs 22.0971μs 45.2548 KOps/s 46.0296 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_copy_nested[tensordict-eager] 79.1610μs 49.2764μs 20.2937 KOps/s 20.0404 KOps/s $\color{#35bf28}+1.26\%$
test_compile_copy_nested[pytree-compile] 0.1050ms 64.6360μs 15.4712 KOps/s 15.3571 KOps/s $\color{#35bf28}+0.74\%$
test_compile_copy_nested[pytree-eager] 80.5920μs 49.4008μs 20.2426 KOps/s 20.2432 KOps/s $-0.00\%$
test_compile_add_one_flat[tensordict-compile] 0.3975ms 0.3209ms 3.1163 KOps/s 3.1066 KOps/s $\color{#35bf28}+0.31\%$
test_compile_add_one_flat[tensordict-eager] 0.3250ms 0.2379ms 4.2035 KOps/s 4.3498 KOps/s $\color{#d91a1a}-3.36\%$
test_compile_add_one_flat[tensorclass-compile] 0.1736ms 0.1267ms 7.8921 KOps/s 7.5782 KOps/s $\color{#35bf28}+4.14\%$
test_compile_add_one_flat[tensorclass-eager] 0.1080ms 66.0398μs 15.1424 KOps/s 14.7490 KOps/s $\color{#35bf28}+2.67\%$
test_compile_add_one_flat[pytree-compile] 0.4687ms 0.3261ms 3.0662 KOps/s 3.1179 KOps/s $\color{#d91a1a}-1.66\%$
test_compile_add_one_flat[pytree-eager] 0.7945ms 0.6434ms 1.5542 KOps/s 1.6148 KOps/s $\color{#d91a1a}-3.75\%$
test_compile_add_self_flat[tensordict-eager] 0.4166ms 0.2923ms 3.4206 KOps/s 3.5876 KOps/s $\color{#d91a1a}-4.66\%$
test_compile_add_self_flat[tensordict-compile] 0.4639ms 0.3324ms 3.0088 KOps/s 3.1091 KOps/s $\color{#d91a1a}-3.23\%$
test_compile_add_self_flat[tensorclass-eager] 0.2045ms 80.3316μs 12.4484 KOps/s 13.1272 KOps/s $\textbf{\color{#d91a1a}-5.17\%}$
test_compile_add_self_flat[tensorclass-compile] 5.1195ms 0.1354ms 7.3881 KOps/s 7.7893 KOps/s $\textbf{\color{#d91a1a}-5.15\%}$
test_compile_add_self_flat[pytree-eager] 0.6511ms 0.5534ms 1.8071 KOps/s 1.8847 KOps/s $\color{#d91a1a}-4.12\%$
test_compile_add_self_flat[pytree-compile] 0.3777ms 0.3189ms 3.1355 KOps/s 3.1363 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_copy_flat[tensordict-compile] 60.0210μs 20.5942μs 48.5574 KOps/s 52.7423 KOps/s $\textbf{\color{#d91a1a}-7.93\%}$
test_compile_copy_flat[tensordict-eager] 75.6310μs 38.9632μs 25.6653 KOps/s 24.2892 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_compile_copy_flat[pytree-compile] 0.1144ms 69.1166μs 14.4683 KOps/s 14.2719 KOps/s $\color{#35bf28}+1.38\%$
test_compile_copy_flat[pytree-eager] 0.1090ms 51.1624μs 19.5456 KOps/s 19.2744 KOps/s $\color{#35bf28}+1.41\%$
test_compile_assign_and_add[tensordict-compile] 2.3962ms 0.8541ms 1.1709 KOps/s 1.0973 KOps/s $\textbf{\color{#35bf28}+6.71\%}$
test_compile_assign_and_add[tensordict-eager] 3.4039ms 3.2170ms 310.8495 Ops/s 306.3086 Ops/s $\color{#35bf28}+1.48\%$
test_compile_assign_and_add[pytree-compile] 2.3065ms 0.8175ms 1.2233 KOps/s 1.1165 KOps/s $\textbf{\color{#35bf28}+9.56\%}$
test_compile_assign_and_add[pytree-eager] 3.2276ms 3.1475ms 317.7091 Ops/s 317.0454 Ops/s $\color{#35bf28}+0.21\%$
test_compile_indexing[tensor-tensordict-compile] 0.1530ms 0.1074ms 9.3082 KOps/s 9.2256 KOps/s $\color{#35bf28}+0.89\%$
test_compile_indexing[tensor-tensordict-eager] 0.1876ms 61.0547μs 16.3788 KOps/s 15.6756 KOps/s $\color{#35bf28}+4.49\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1460ms 0.1060ms 9.4302 KOps/s 9.0480 KOps/s $\color{#35bf28}+4.22\%$
test_compile_indexing[tensor-tensorclass-eager] 92.7410μs 45.0695μs 22.1880 KOps/s 21.6481 KOps/s $\color{#35bf28}+2.49\%$
test_compile_indexing[tensor-pytree-compile] 0.1522ms 0.1090ms 9.1784 KOps/s 8.9923 KOps/s $\color{#35bf28}+2.07\%$
test_compile_indexing[tensor-pytree-eager] 87.2310μs 45.1765μs 22.1354 KOps/s 21.4942 KOps/s $\color{#35bf28}+2.98\%$
test_compile_indexing[slice-tensordict-compile] 0.2177ms 0.1409ms 7.0969 KOps/s 7.1843 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_indexing[slice-tensordict-eager] 0.1583ms 24.7774μs 40.3594 KOps/s 38.3022 KOps/s $\textbf{\color{#35bf28}+5.37\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1756ms 0.1351ms 7.3993 KOps/s 7.5364 KOps/s $\color{#d91a1a}-1.82\%$
test_compile_indexing[slice-tensorclass-eager] 55.2510μs 21.0112μs 47.5936 KOps/s 46.3228 KOps/s $\color{#35bf28}+2.74\%$
test_compile_indexing[slice-pytree-compile] 0.1769ms 0.1317ms 7.5939 KOps/s 7.1622 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_compile_indexing[slice-pytree-eager] 74.6610μs 20.9046μs 47.8365 KOps/s 46.8255 KOps/s $\color{#35bf28}+2.16\%$
test_compile_indexing[int-tensordict-compile] 0.1737ms 0.1374ms 7.2802 KOps/s 7.1761 KOps/s $\color{#35bf28}+1.45\%$
test_compile_indexing[int-tensordict-eager] 0.4930ms 24.6602μs 40.5511 KOps/s 37.0827 KOps/s $\textbf{\color{#35bf28}+9.35\%}$
test_compile_indexing[int-tensorclass-compile] 0.2777ms 0.1365ms 7.3264 KOps/s 7.5099 KOps/s $\color{#d91a1a}-2.44\%$
test_compile_indexing[int-tensorclass-eager] 42.6210μs 21.2679μs 47.0192 KOps/s 47.5101 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_indexing[int-pytree-compile] 0.1906ms 0.1369ms 7.3065 KOps/s 7.1917 KOps/s $\color{#35bf28}+1.60\%$
test_compile_indexing[int-pytree-eager] 59.8810μs 20.7755μs 48.1336 KOps/s 46.1364 KOps/s $\color{#35bf28}+4.33\%$
test_mod_add[eager] 79.2420μs 33.4720μs 29.8757 KOps/s 30.4741 KOps/s $\color{#d91a1a}-1.96\%$
test_mod_add[compile] 0.2139ms 71.9198μs 13.9044 KOps/s 13.3007 KOps/s $\color{#35bf28}+4.54\%$
test_mod_add[compile-overhead] 0.2542ms 0.1354ms 7.3861 KOps/s 7.1248 KOps/s $\color{#35bf28}+3.67\%$
test_mod_wrap[eager] 0.8793ms 0.7762ms 1.2884 KOps/s 1.2712 KOps/s $\color{#35bf28}+1.35\%$
test_mod_wrap[compile] 2.0104ms 0.8370ms 1.1948 KOps/s 1.2054 KOps/s $\color{#d91a1a}-0.88\%$
test_mod_wrap[compile-overhead] 4.8816ms 3.0480ms 328.0816 Ops/s 326.3853 Ops/s $\color{#35bf28}+0.52\%$
test_mod_wrap_and_backward[eager] 4.1679ms 4.0289ms 248.2090 Ops/s 242.1125 Ops/s $\color{#35bf28}+2.52\%$
test_mod_wrap_and_backward[compile] 4.5927ms 4.0538ms 246.6793 Ops/s 243.1358 Ops/s $\color{#35bf28}+1.46\%$
test_mod_wrap_and_backward[compile-overhead] 1.3214ms 0.9075ms 1.1019 KOps/s 947.6109 Ops/s $\textbf{\color{#35bf28}+16.29\%}$
test_seq_add[eager] 0.1691ms 0.1025ms 9.7518 KOps/s 9.2484 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_seq_add[compile] 0.1512ms 81.7035μs 12.2394 KOps/s 11.9050 KOps/s $\color{#35bf28}+2.81\%$
test_seq_add[compile-overhead] 0.1631ms 0.1162ms 8.6078 KOps/s 8.7365 KOps/s $\color{#d91a1a}-1.47\%$
test_seq_wrap[eager] 1.1051ms 0.9270ms 1.0787 KOps/s 1.0695 KOps/s $\color{#35bf28}+0.86\%$
test_seq_wrap[compile] 1.1602ms 0.8627ms 1.1591 KOps/s 1.1730 KOps/s $\color{#d91a1a}-1.18\%$
test_seq_wrap[compile-overhead] 0.2754ms 0.2223ms 4.4977 KOps/s 4.3842 KOps/s $\color{#35bf28}+2.59\%$
test_func_call_runtime[False-eager] 2.4523ms 2.3534ms 424.9210 Ops/s 415.7170 Ops/s $\color{#35bf28}+2.21\%$
test_func_call_runtime[False-compile] 2.6346ms 2.4065ms 415.5372 Ops/s 421.0807 Ops/s $\color{#d91a1a}-1.32\%$
test_func_call_runtime[False-compile-overhead] 0.5160ms 0.3587ms 2.7880 KOps/s 2.7621 KOps/s $\color{#35bf28}+0.94\%$
test_func_call_runtime[True-eager] 2.7246ms 2.5124ms 398.0268 Ops/s 398.3127 Ops/s $\color{#d91a1a}-0.07\%$
test_func_call_runtime[True-compile] 2.5340ms 2.4269ms 412.0536 Ops/s 416.9603 Ops/s $\color{#d91a1a}-1.18\%$
test_func_call_runtime[True-compile-overhead] 0.4361ms 0.3817ms 2.6201 KOps/s 2.5884 KOps/s $\color{#35bf28}+1.22\%$
test_func_call_cm_runtime[False-eager] 2.5820ms 2.3504ms 425.4614 Ops/s 425.3230 Ops/s $\color{#35bf28}+0.03\%$
test_func_call_cm_runtime[False-compile] 3.3716ms 2.4235ms 412.6323 Ops/s 421.8261 Ops/s $\color{#d91a1a}-2.18\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4121ms 0.3627ms 2.7570 KOps/s 2.7277 KOps/s $\color{#35bf28}+1.07\%$
test_func_call_cm_runtime[True-eager] 2.7337ms 2.6280ms 380.5103 Ops/s 381.8068 Ops/s $\color{#d91a1a}-0.34\%$
test_func_call_cm_runtime[True-compile] 2.5885ms 2.4570ms 407.0020 Ops/s 411.6115 Ops/s $\color{#d91a1a}-1.12\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4524ms 0.4058ms 2.4641 KOps/s 2.4209 KOps/s $\color{#35bf28}+1.78\%$
test_vmap_func_call_cm_runtime[eager] 4.1681ms 3.7258ms 268.3953 Ops/s 265.8805 Ops/s $\color{#35bf28}+0.95\%$
test_vmap_func_call_cm_runtime[compile] 2.5715ms 2.4971ms 400.4624 Ops/s 407.4220 Ops/s $\color{#d91a1a}-1.71\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5064ms 0.4203ms 2.3795 KOps/s 2.4075 KOps/s $\color{#d91a1a}-1.16\%$
test_distributed 2.2564ms 0.2277ms 4.3911 KOps/s 8.4109 KOps/s $\textbf{\color{#d91a1a}-47.79\%}$
test_tdmodule 33.4010μs 14.8592μs 67.2985 KOps/s 63.0890 KOps/s $\textbf{\color{#35bf28}+6.67\%}$
test_tdmodule_dispatch 0.1389ms 28.4702μs 35.1245 KOps/s 31.9094 KOps/s $\textbf{\color{#35bf28}+10.08\%}$
test_tdseq 35.1410μs 15.4230μs 64.8382 KOps/s 59.8306 KOps/s $\textbf{\color{#35bf28}+8.37\%}$
test_tdseq_dispatch 51.5910μs 31.0826μs 32.1724 KOps/s 29.6275 KOps/s $\textbf{\color{#35bf28}+8.59\%}$
test_instantiation_functorch 2.0268ms 1.8565ms 538.6506 Ops/s 530.9760 Ops/s $\color{#35bf28}+1.45\%$
test_instantiation_td 1.8418ms 1.1940ms 837.5336 Ops/s 815.5357 Ops/s $\color{#35bf28}+2.70\%$
test_exec_functorch 1.0934ms 1.0021ms 997.9489 Ops/s 1.0003 KOps/s $\color{#d91a1a}-0.23\%$
test_exec_functional_call 1.0748ms 1.0003ms 999.6634 Ops/s 991.0057 Ops/s $\color{#35bf28}+0.87\%$
test_exec_td 1.1040ms 1.0300ms 970.8783 Ops/s 972.3659 Ops/s $\color{#d91a1a}-0.15\%$
test_exec_td_decorator 1.1508ms 1.0589ms 944.3695 Ops/s 938.4328 Ops/s $\color{#35bf28}+0.63\%$
test_vmap_mlp_speed[True-True] 1.3513ms 1.2597ms 793.8120 Ops/s 786.7953 Ops/s $\color{#35bf28}+0.89\%$
test_vmap_mlp_speed[True-False] 1.3639ms 1.2630ms 791.7664 Ops/s 789.6626 Ops/s $\color{#35bf28}+0.27\%$
test_vmap_mlp_speed[False-True] 1.2819ms 1.1529ms 867.3409 Ops/s 869.6870 Ops/s $\color{#d91a1a}-0.27\%$
test_vmap_mlp_speed[False-False] 1.2547ms 1.1512ms 868.6713 Ops/s 863.6865 Ops/s $\color{#35bf28}+0.58\%$
test_vmap_mlp_speed_decorator[True-True] 1.3386ms 1.2375ms 808.1064 Ops/s 810.0433 Ops/s $\color{#d91a1a}-0.24\%$
test_vmap_mlp_speed_decorator[True-False] 1.6914ms 1.2383ms 807.5816 Ops/s 809.4758 Ops/s $\color{#d91a1a}-0.23\%$
test_vmap_mlp_speed_decorator[False-True] 1.2927ms 1.1536ms 866.8157 Ops/s 867.1741 Ops/s $\color{#d91a1a}-0.04\%$
test_vmap_mlp_speed_decorator[False-False] 1.2501ms 1.1560ms 865.0159 Ops/s 865.4151 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_transformer_speed[True-True] 13.3088ms 12.9652ms 77.1297 Ops/s 76.3558 Ops/s $\color{#35bf28}+1.01\%$
test_vmap_transformer_speed[True-False] 13.4270ms 12.9892ms 76.9873 Ops/s 76.5024 Ops/s $\color{#35bf28}+0.63\%$
test_vmap_transformer_speed[False-True] 13.3921ms 12.8237ms 77.9806 Ops/s 78.0220 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_transformer_speed[False-False] 13.2321ms 12.7641ms 78.3445 Ops/s 77.9944 Ops/s $\color{#35bf28}+0.45\%$
test_vmap_transformer_speed_decorator[True-True] 33.9418ms 33.4486ms 29.8966 Ops/s 30.0246 Ops/s $\color{#d91a1a}-0.43\%$
test_vmap_transformer_speed_decorator[True-False] 34.0363ms 33.4086ms 29.9325 Ops/s 29.8874 Ops/s $\color{#35bf28}+0.15\%$
test_vmap_transformer_speed_decorator[False-True] 34.4638ms 33.3476ms 29.9871 Ops/s 30.0393 Ops/s $\color{#d91a1a}-0.17\%$
test_vmap_transformer_speed_decorator[False-False] 33.9998ms 33.3381ms 29.9957 Ops/s 30.0852 Ops/s $\color{#d91a1a}-0.30\%$
test_to_module_speed[True] 1.2446ms 0.9870ms 1.0132 KOps/s 980.9178 Ops/s $\color{#35bf28}+3.29\%$
test_to_module_speed[False] 1.3460ms 0.9653ms 1.0359 KOps/s 1.0061 KOps/s $\color{#35bf28}+2.96\%$
test_tc_init 70.6310μs 35.2663μs 28.3557 KOps/s 26.6116 KOps/s $\textbf{\color{#35bf28}+6.55\%}$
test_tc_init_nested 0.1128ms 70.3442μs 14.2158 KOps/s 13.2240 KOps/s $\textbf{\color{#35bf28}+7.50\%}$
test_tc_first_layer_tensor 4.1273μs 0.6876μs 1.4543 MOps/s 1.4907 MOps/s $\color{#d91a1a}-2.44\%$
test_tc_first_layer_nontensor 31.9610μs 2.2752μs 439.5173 KOps/s 446.8183 KOps/s $\color{#d91a1a}-1.63\%$
test_tc_second_layer_tensor 8.2575μs 1.3871μs 720.9354 KOps/s 728.5186 KOps/s $\color{#d91a1a}-1.04\%$
test_tc_second_layer_nontensor 26.0010μs 2.8743μs 347.9164 KOps/s 334.5002 KOps/s $\color{#35bf28}+4.01\%$
test_unbind 0.1929s 12.1914ms 82.0250 Ops/s 92.8402 Ops/s $\textbf{\color{#d91a1a}-11.65\%}$
test_full_like 0.6646ms 0.5751ms 1.7388 KOps/s 1.7402 KOps/s $\color{#d91a1a}-0.08\%$
test_zeros_like 0.2636ms 0.1978ms 5.0553 KOps/s 5.0550 KOps/s $+0.01\%$
test_ones_like 0.2336ms 0.1977ms 5.0593 KOps/s 5.0576 KOps/s $\color{#35bf28}+0.03\%$
test_clone 0.4500ms 0.4139ms 2.4163 KOps/s 2.4169 KOps/s $\color{#d91a1a}-0.03\%$
test_squeeze 34.9900μs 9.8769μs 101.2466 KOps/s 98.1188 KOps/s $\color{#35bf28}+3.19\%$
test_unsqueeze 0.2251ms 76.1287μs 13.1356 KOps/s 12.8015 KOps/s $\color{#35bf28}+2.61\%$
test_split 0.4429ms 0.1593ms 6.2790 KOps/s 6.2572 KOps/s $\color{#35bf28}+0.35\%$
test_permute 0.2209ms 0.1787ms 5.5944 KOps/s 5.2968 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_stack 1.2554ms 0.8653ms 1.1557 KOps/s 1.1817 KOps/s $\color{#d91a1a}-2.21\%$
test_cat 1.2503ms 1.2312ms 812.1867 Ops/s 812.1426 Ops/s $+0.01\%$

@vmoens vmoens added the Refactor Refactoring code - not a new feature label Oct 4, 2024
@vmoens vmoens merged commit 38d5629 into gh/vmoens/25/base Oct 4, 2024
50 of 51 checks passed
@vmoens vmoens deleted the gh/vmoens/25/head branch October 4, 2024 13:56