[Minor] Refactor is_dynamo_compiling for older torch versions #978

Merged: vmoens merged 1 commit into main from minor-refact on Sep 2, 2024

Conversation

vmoens (Contributor) commented Sep 2, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Sep 2, 2024
@vmoens merged commit a50f826 into main Sep 2, 2024
12 of 22 checks passed
@vmoens deleted the minor-refact branch September 2, 2024 15:24
@vmoens added the bug label Sep 2, 2024

github-actions bot commented Sep 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}32$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 40.6160μs 19.7482μs 50.6375 KOps/s 48.7404 KOps/s $\color{#35bf28}+3.89\%$
test_plain_set_stack_nested 79.9890μs 19.8562μs 50.3620 KOps/s 47.7754 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_plain_set_nested_inplace 59.3910μs 21.6849μs 46.1151 KOps/s 44.6706 KOps/s $\color{#35bf28}+3.23\%$
test_plain_set_stack_nested_inplace 81.9430μs 21.4769μs 46.5617 KOps/s 44.6234 KOps/s $\color{#35bf28}+4.34\%$
test_items 28.0920μs 4.1713μs 239.7355 KOps/s 245.3229 KOps/s $\color{#d91a1a}-2.28\%$
test_items_nested 0.5884ms 0.3293ms 3.0371 KOps/s 3.0432 KOps/s $\color{#d91a1a}-0.20\%$
test_items_nested_locked 0.5595ms 0.3295ms 3.0353 KOps/s 3.0518 KOps/s $\color{#d91a1a}-0.54\%$
test_items_nested_leaf 0.1669ms 85.2070μs 11.7361 KOps/s 11.9302 KOps/s $\color{#d91a1a}-1.63\%$
test_items_stack_nested 0.4367ms 0.3331ms 3.0023 KOps/s 3.0229 KOps/s $\color{#d91a1a}-0.68\%$
test_items_stack_nested_leaf 0.1440ms 83.7257μs 11.9438 KOps/s 11.9156 KOps/s $\color{#35bf28}+0.24\%$
test_items_stack_nested_locked 0.6072ms 0.3307ms 3.0235 KOps/s 3.0476 KOps/s $\color{#d91a1a}-0.79\%$
test_keys 23.4240μs 3.5820μs 279.1732 KOps/s 281.5793 KOps/s $\color{#d91a1a}-0.85\%$
test_keys_nested 0.1570ms 95.9911μs 10.4176 KOps/s 10.3411 KOps/s $\color{#35bf28}+0.74\%$
test_keys_nested_locked 1.7541ms 0.1033ms 9.6848 KOps/s 9.8433 KOps/s $\color{#d91a1a}-1.61\%$
test_keys_nested_leaf 0.1319ms 79.6212μs 12.5595 KOps/s 12.3974 KOps/s $\color{#35bf28}+1.31\%$
test_keys_stack_nested 0.1665ms 96.4507μs 10.3680 KOps/s 10.4116 KOps/s $\color{#d91a1a}-0.42\%$
test_keys_stack_nested_leaf 0.1560ms 79.6336μs 12.5575 KOps/s 12.5079 KOps/s $\color{#35bf28}+0.40\%$
test_keys_stack_nested_locked 0.1995ms 0.1006ms 9.9413 KOps/s 9.9847 KOps/s $\color{#d91a1a}-0.44\%$
test_values 10.0666μs 1.0665μs 937.6236 KOps/s 930.8752 KOps/s $\color{#35bf28}+0.72\%$
test_values_nested 0.1032ms 49.0933μs 20.3694 KOps/s 21.0510 KOps/s $\color{#d91a1a}-3.24\%$
test_values_nested_locked 0.1053ms 48.3739μs 20.6723 KOps/s 20.2186 KOps/s $\color{#35bf28}+2.24\%$
test_values_nested_leaf 0.1027ms 42.7788μs 23.3760 KOps/s 23.4633 KOps/s $\color{#d91a1a}-0.37\%$
test_values_stack_nested 93.5520μs 48.6406μs 20.5589 KOps/s 19.9018 KOps/s $\color{#35bf28}+3.30\%$
test_values_stack_nested_leaf 0.1033ms 42.0637μs 23.7735 KOps/s 23.7269 KOps/s $\color{#35bf28}+0.20\%$
test_values_stack_nested_locked 0.1059ms 48.6682μs 20.5473 KOps/s 20.9920 KOps/s $\color{#d91a1a}-2.12\%$
test_membership 6.8227μs 0.6953μs 1.4382 MOps/s 1.1760 MOps/s $\textbf{\color{#35bf28}+22.30\%}$
test_membership_nested 57.0670μs 2.5624μs 390.2649 KOps/s 387.6283 KOps/s $\color{#35bf28}+0.68\%$
test_membership_nested_leaf 17.1320μs 2.6004μs 384.5574 KOps/s 381.7824 KOps/s $\color{#35bf28}+0.73\%$
test_membership_stacked_nested 46.1760μs 2.5614μs 390.4139 KOps/s 394.1541 KOps/s $\color{#d91a1a}-0.95\%$
test_membership_stacked_nested_leaf 24.6860μs 2.5610μs 390.4688 KOps/s 388.7209 KOps/s $\color{#35bf28}+0.45\%$
test_membership_nested_last 44.1820μs 3.7612μs 265.8747 KOps/s 266.6414 KOps/s $\color{#d91a1a}-0.29\%$
test_membership_nested_leaf_last 29.8560μs 3.7437μs 267.1174 KOps/s 267.2166 KOps/s $\color{#d91a1a}-0.04\%$
test_membership_stacked_nested_last 57.3260μs 7.7178μs 129.5711 KOps/s 189.2594 KOps/s $\textbf{\color{#d91a1a}-31.54\%}$
test_membership_stacked_nested_leaf_last 31.5490μs 7.8017μs 128.1774 KOps/s 187.9519 KOps/s $\textbf{\color{#d91a1a}-31.80\%}$
test_nested_getleaf 52.7780μs 10.7730μs 92.8246 KOps/s 92.7957 KOps/s $\color{#35bf28}+0.03\%$
test_nested_get 48.8010μs 10.0898μs 99.1096 KOps/s 97.9743 KOps/s $\color{#35bf28}+1.16\%$
test_stacked_getleaf 33.1120μs 10.7260μs 93.2313 KOps/s 94.2531 KOps/s $\color{#d91a1a}-1.08\%$
test_stacked_get 59.5210μs 10.1917μs 98.1191 KOps/s 98.8359 KOps/s $\color{#d91a1a}-0.73\%$
test_nested_getitemleaf 35.1260μs 11.0653μs 90.3725 KOps/s 90.7773 KOps/s $\color{#d91a1a}-0.45\%$
test_nested_getitem 55.6040μs 10.2512μs 97.5497 KOps/s 97.7960 KOps/s $\color{#d91a1a}-0.25\%$
test_stacked_getitemleaf 58.7090μs 11.0604μs 90.4125 KOps/s 91.8992 KOps/s $\color{#d91a1a}-1.62\%$
test_stacked_getitem 36.6590μs 10.1591μs 98.4342 KOps/s 98.0965 KOps/s $\color{#35bf28}+0.34\%$
test_lock_nested 84.3474ms 0.5605ms 1.7842 KOps/s 2.0924 KOps/s $\textbf{\color{#d91a1a}-14.73\%}$
test_lock_stack_nested 0.8018ms 0.4330ms 2.3094 KOps/s 2.2620 KOps/s $\color{#35bf28}+2.10\%$
test_unlock_nested 92.6162ms 0.4919ms 2.0331 KOps/s 2.5200 KOps/s $\textbf{\color{#d91a1a}-19.32\%}$
test_unlock_stack_nested 0.5360ms 0.3530ms 2.8330 KOps/s 2.7589 KOps/s $\color{#35bf28}+2.69\%$
test_flatten_speed 0.2158ms 0.1035ms 9.6577 KOps/s 9.5899 KOps/s $\color{#35bf28}+0.71\%$
test_unflatten_speed 0.9575ms 0.4595ms 2.1765 KOps/s 2.1959 KOps/s $\color{#d91a1a}-0.88\%$
test_common_ops 1.7717ms 1.0616ms 941.9634 Ops/s 889.9881 Ops/s $\textbf{\color{#35bf28}+5.84\%}$
test_creation 19.2160μs 2.0780μs 481.2275 KOps/s 494.1742 KOps/s $\color{#d91a1a}-2.62\%$
test_creation_empty 41.4480μs 17.6180μs 56.7602 KOps/s 53.4166 KOps/s $\textbf{\color{#35bf28}+6.26\%}$
test_creation_nested_1 49.6820μs 20.4867μs 48.8121 KOps/s 46.0095 KOps/s $\textbf{\color{#35bf28}+6.09\%}$
test_creation_nested_2 0.1038ms 24.8753μs 40.2005 KOps/s 38.1403 KOps/s $\textbf{\color{#35bf28}+5.40\%}$
test_clone 61.1940μs 16.4864μs 60.6559 KOps/s 59.3598 KOps/s $\color{#35bf28}+2.18\%$
test_getitem[int] 1.2877ms 16.1142μs 62.0572 KOps/s 62.4060 KOps/s $\color{#d91a1a}-0.56\%$
test_getitem[slice_int] 0.1315ms 29.2896μs 34.1418 KOps/s 34.0860 KOps/s $\color{#35bf28}+0.16\%$
test_getitem[range] 0.4260ms 56.7385μs 17.6247 KOps/s 17.2958 KOps/s $\color{#35bf28}+1.90\%$
test_getitem[tuple] 0.1615ms 24.3308μs 41.1001 KOps/s 40.8824 KOps/s $\color{#35bf28}+0.53\%$
test_getitem[list] 0.1962ms 50.7886μs 19.6895 KOps/s 18.8750 KOps/s $\color{#35bf28}+4.32\%$
test_setitem_dim[int] 76.5720μs 40.4747μs 24.7068 KOps/s 23.4155 KOps/s $\textbf{\color{#35bf28}+5.51\%}$
test_setitem_dim[slice_int] 0.1354ms 69.5277μs 14.3828 KOps/s 13.7628 KOps/s $\color{#35bf28}+4.50\%$
test_setitem_dim[range] 0.1601ms 92.7722μs 10.7791 KOps/s 10.4523 KOps/s $\color{#35bf28}+3.13\%$
test_setitem_dim[tuple] 0.1037ms 57.4528μs 17.4056 KOps/s 16.9919 KOps/s $\color{#35bf28}+2.43\%$
test_setitem 0.1374ms 28.6791μs 34.8685 KOps/s 32.6768 KOps/s $\textbf{\color{#35bf28}+6.71\%}$
test_set 0.1421ms 27.7901μs 35.9840 KOps/s 33.4772 KOps/s $\textbf{\color{#35bf28}+7.49\%}$
test_set_shared 1.1912ms 0.2077ms 4.8147 KOps/s 4.7010 KOps/s $\color{#35bf28}+2.42\%$
test_update 0.1925ms 33.9619μs 29.4447 KOps/s 27.0021 KOps/s $\textbf{\color{#35bf28}+9.05\%}$
test_update_nested 0.1602ms 44.4179μs 22.5134 KOps/s 21.0671 KOps/s $\textbf{\color{#35bf28}+6.87\%}$
test_update__nested 0.1554ms 32.9518μs 30.3473 KOps/s 28.4562 KOps/s $\textbf{\color{#35bf28}+6.65\%}$
test_set_nested 0.1421ms 30.6158μs 32.6629 KOps/s 30.7262 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_set_nested_new 0.1604ms 34.5577μs 28.9371 KOps/s 27.0344 KOps/s $\textbf{\color{#35bf28}+7.04\%}$
test_select 0.1466ms 52.1139μs 19.1888 KOps/s 18.5801 KOps/s $\color{#35bf28}+3.28\%$
test_select_nested 0.1177ms 58.9244μs 16.9709 KOps/s 17.2266 KOps/s $\color{#d91a1a}-1.48\%$
test_exclude_nested 0.1495ms 74.3566μs 13.4487 KOps/s 13.7485 KOps/s $\color{#d91a1a}-2.18\%$
test_empty[True] 0.4567ms 0.3081ms 3.2460 KOps/s 3.2850 KOps/s $\color{#d91a1a}-1.19\%$
test_empty[False] 6.9588μs 1.1196μs 893.1597 KOps/s 875.9643 KOps/s $\color{#35bf28}+1.96\%$
test_unbind_speed 0.6441ms 0.2909ms 3.4377 KOps/s 3.3929 KOps/s $\color{#35bf28}+1.32\%$
test_unbind_speed_stack0 0.6391ms 0.2818ms 3.5488 KOps/s 3.4894 KOps/s $\color{#35bf28}+1.70\%$
test_unbind_speed_stack1 97.2726ms 0.7749ms 1.2904 KOps/s 1.4113 KOps/s $\textbf{\color{#d91a1a}-8.56\%}$
test_split 92.5389ms 2.1081ms 474.3705 Ops/s 471.7048 Ops/s $\color{#35bf28}+0.57\%$
test_chunk 3.1805ms 1.9457ms 513.9660 Ops/s 465.9159 Ops/s $\textbf{\color{#35bf28}+10.31\%}$
test_creation[device0] 0.2268ms 0.1150ms 8.6945 KOps/s 8.3123 KOps/s $\color{#35bf28}+4.60\%$
test_creation_from_tensor 0.2604ms 0.1151ms 8.6897 KOps/s 8.4566 KOps/s $\color{#35bf28}+2.76\%$
test_add_one[memmap_tensor0] 0.1993ms 7.0172μs 142.5073 KOps/s 134.9961 KOps/s $\textbf{\color{#35bf28}+5.56\%}$
test_contiguous[memmap_tensor0] 27.5720μs 1.9657μs 508.7366 KOps/s 520.5052 KOps/s $\color{#d91a1a}-2.26\%$
test_stack[memmap_tensor0] 49.3410μs 5.3989μs 185.2223 KOps/s 176.1673 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_memmaptd_index 1.0741ms 0.3901ms 2.5635 KOps/s 2.5683 KOps/s $\color{#d91a1a}-0.19\%$
test_memmaptd_index_astensor 0.9354ms 0.4709ms 2.1237 KOps/s 2.1354 KOps/s $\color{#d91a1a}-0.55\%$
test_memmaptd_index_op 1.8138ms 1.0037ms 996.2743 Ops/s 973.1946 Ops/s $\color{#35bf28}+2.37\%$
test_serialize_model 0.1226s 0.1138s 8.7871 Ops/s 8.6556 Ops/s $\color{#35bf28}+1.52\%$
test_serialize_model_pickle 0.4732s 0.3985s 2.5097 Ops/s 2.5117 Ops/s $\color{#d91a1a}-0.08\%$
test_serialize_weights 0.1237s 0.1139s 8.7810 Ops/s 8.5989 Ops/s $\color{#35bf28}+2.12\%$
test_serialize_weights_returnearly 0.1869s 0.1621s 6.1679 Ops/s 6.4022 Ops/s $\color{#d91a1a}-3.66\%$
test_serialize_weights_pickle 1.0885s 0.7444s 1.3435 Ops/s 2.5084 Ops/s $\textbf{\color{#d91a1a}-46.44\%}$
test_serialize_weights_filesystem 0.1473s 0.1405s 7.1158 Ops/s 6.9298 Ops/s $\color{#35bf28}+2.68\%$
test_serialize_model_filesystem 0.1451s 0.1398s 7.1526 Ops/s 6.0092 Ops/s $\textbf{\color{#35bf28}+19.03\%}$
test_reshape_pytree 85.6190μs 37.8718μs 26.4049 KOps/s 25.7362 KOps/s $\color{#35bf28}+2.60\%$
test_reshape_td 98.3630μs 43.9911μs 22.7319 KOps/s 22.3848 KOps/s $\color{#35bf28}+1.55\%$
test_view_pytree 0.1245ms 38.4217μs 26.0269 KOps/s 26.2002 KOps/s $\color{#d91a1a}-0.66\%$
test_view_td 0.1074ms 49.6384μs 20.1457 KOps/s 19.9064 KOps/s $\color{#35bf28}+1.20\%$
test_unbind_pytree 0.1151ms 34.9423μs 28.6186 KOps/s 28.1647 KOps/s $\color{#35bf28}+1.61\%$
test_unbind_td 0.2923ms 42.9722μs 23.2709 KOps/s 22.6437 KOps/s $\color{#35bf28}+2.77\%$
test_split_pytree 83.8760μs 37.6799μs 26.5393 KOps/s 26.5722 KOps/s $\color{#d91a1a}-0.12\%$
test_split_td 0.4432ms 55.1250μs 18.1406 KOps/s 17.4549 KOps/s $\color{#35bf28}+3.93\%$
test_add_pytree 90.7590μs 44.5197μs 22.4620 KOps/s 22.5277 KOps/s $\color{#d91a1a}-0.29\%$
test_add_td 0.2284ms 85.8752μs 11.6448 KOps/s 11.8737 KOps/s $\color{#d91a1a}-1.93\%$
test_compile_add_one_nested[tensordict-compile] 0.1257ms 56.6921μs 17.6391 KOps/s 17.7544 KOps/s $\color{#d91a1a}-0.65\%$
test_compile_add_one_nested[tensordict-eager] 0.3399ms 0.1854ms 5.3927 KOps/s 5.4037 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_add_one_nested[pytree-compile] 0.1317ms 56.6826μs 17.6421 KOps/s 17.8929 KOps/s $\color{#d91a1a}-1.40\%$
test_compile_add_one_nested[pytree-eager] 0.2861ms 0.1397ms 7.1600 KOps/s 7.0830 KOps/s $\color{#35bf28}+1.09\%$
test_compile_copy_nested[tensordict-compile] 61.2340μs 21.2160μs 47.1342 KOps/s 47.5739 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_copy_nested[tensordict-eager] 0.1806ms 68.2448μs 14.6531 KOps/s 15.3446 KOps/s $\color{#d91a1a}-4.51\%$
test_compile_copy_nested[pytree-compile] 0.1570ms 75.0825μs 13.3187 KOps/s 13.5801 KOps/s $\color{#d91a1a}-1.93\%$
test_compile_copy_nested[pytree-eager] 0.1348ms 67.8194μs 14.7450 KOps/s 14.6057 KOps/s $\color{#35bf28}+0.95\%$
test_compile_add_one_flat[tensordict-compile] 0.3079ms 0.1722ms 5.8073 KOps/s 5.7774 KOps/s $\color{#35bf28}+0.52\%$
test_compile_add_one_flat[tensordict-eager] 0.3177ms 0.1897ms 5.2709 KOps/s 5.3177 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_add_one_flat[tensorclass-compile] 0.1106ms 42.8869μs 23.3171 KOps/s 24.4666 KOps/s $\color{#d91a1a}-4.70\%$
test_compile_add_one_flat[tensorclass-eager] 0.2375ms 68.2506μs 14.6519 KOps/s 13.9383 KOps/s $\textbf{\color{#35bf28}+5.12\%}$
test_compile_add_one_flat[pytree-compile] 0.2826ms 0.1741ms 5.7435 KOps/s 5.7904 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_add_one_flat[pytree-eager] 0.6329ms 0.2836ms 3.5264 KOps/s 3.4234 KOps/s $\color{#35bf28}+3.01\%$
test_compile_add_self_flat[tensordict-eager] 0.3418ms 0.2008ms 4.9810 KOps/s 4.9916 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_add_self_flat[tensordict-compile] 0.3558ms 0.1742ms 5.7421 KOps/s 5.7044 KOps/s $\color{#35bf28}+0.66\%$
test_compile_add_self_flat[tensorclass-eager] 0.1254ms 60.8140μs 16.4436 KOps/s 15.9498 KOps/s $\color{#35bf28}+3.10\%$
test_compile_add_self_flat[tensorclass-compile] 97.7820μs 41.8546μs 23.8922 KOps/s 23.4791 KOps/s $\color{#35bf28}+1.76\%$
test_compile_add_self_flat[pytree-eager] 0.3978ms 0.2292ms 4.3628 KOps/s 4.2541 KOps/s $\color{#35bf28}+2.56\%$
test_compile_add_self_flat[pytree-compile] 0.3770ms 0.1742ms 5.7414 KOps/s 5.7433 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_copy_flat[tensordict-compile] 0.2814ms 0.1026ms 9.7436 KOps/s 9.8949 KOps/s $\color{#d91a1a}-1.53\%$
test_compile_copy_flat[tensordict-eager] 0.1238ms 59.3882μs 16.8384 KOps/s 17.3972 KOps/s $\color{#d91a1a}-3.21\%$
test_compile_copy_flat[pytree-compile] 0.1366ms 76.1852μs 13.1259 KOps/s 13.3092 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_copy_flat[pytree-eager] 0.1427ms 68.5840μs 14.5807 KOps/s 14.8653 KOps/s $\color{#d91a1a}-1.91\%$
test_compile_assign_and_add[tensordict-compile] 0.2713ms 0.1920ms 5.2082 KOps/s 5.1272 KOps/s $\color{#35bf28}+1.58\%$
test_compile_assign_and_add[tensordict-eager] 2.8136ms 1.6764ms 596.5160 Ops/s 598.0294 Ops/s $\color{#d91a1a}-0.25\%$
test_compile_assign_and_add[pytree-compile] 0.3703ms 0.1928ms 5.1862 KOps/s 5.1488 KOps/s $\color{#35bf28}+0.73\%$
test_compile_assign_and_add[pytree-eager] 1.3206ms 1.0787ms 927.0548 Ops/s 876.4051 Ops/s $\textbf{\color{#35bf28}+5.78\%}$
test_compile_assign_and_add_stack[compile] 0.5122ms 0.4197ms 2.3827 KOps/s 2.3621 KOps/s $\color{#35bf28}+0.87\%$
test_compile_assign_and_add_stack[eager] 3.9364ms 3.6559ms 273.5286 Ops/s 257.0175 Ops/s $\textbf{\color{#35bf28}+6.42\%}$
test_compile_indexing[tensor-tensordict-compile] 96.3300μs 34.6449μs 28.8643 KOps/s 29.0388 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_indexing[tensor-tensordict-eager] 0.7203ms 46.0178μs 21.7307 KOps/s 21.2441 KOps/s $\color{#35bf28}+2.29\%$
test_compile_indexing[tensor-tensorclass-compile] 74.3280μs 28.7141μs 34.8261 KOps/s 34.7307 KOps/s $\color{#35bf28}+0.27\%$
test_compile_indexing[tensor-tensorclass-eager] 82.5640μs 27.3659μs 36.5418 KOps/s 35.7886 KOps/s $\color{#35bf28}+2.10\%$
test_compile_indexing[tensor-pytree-compile] 80.6410μs 28.7168μs 34.8229 KOps/s 34.3185 KOps/s $\color{#35bf28}+1.47\%$
test_compile_indexing[tensor-pytree-eager] 73.3260μs 27.2981μs 36.6326 KOps/s 35.4352 KOps/s $\color{#35bf28}+3.38\%$
test_compile_indexing[slice-tensordict-compile] 0.1367ms 72.7728μs 13.7414 KOps/s 13.5515 KOps/s $\color{#35bf28}+1.40\%$
test_compile_indexing[slice-tensordict-eager] 0.5326ms 26.3956μs 37.8851 KOps/s 35.7127 KOps/s $\textbf{\color{#35bf28}+6.08\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1498ms 66.5166μs 15.0338 KOps/s 14.8123 KOps/s $\color{#35bf28}+1.50\%$
test_compile_indexing[slice-tensorclass-eager] 73.8680μs 22.8772μs 43.7117 KOps/s 43.7096 KOps/s $+0.00\%$
test_compile_indexing[slice-pytree-compile] 0.1472ms 66.4547μs 15.0479 KOps/s 14.7827 KOps/s $\color{#35bf28}+1.79\%$
test_compile_indexing[slice-pytree-eager] 66.3140μs 22.8312μs 43.7998 KOps/s 43.6431 KOps/s $\color{#35bf28}+0.36\%$
test_compile_indexing[int-tensordict-compile] 0.1759ms 72.3624μs 13.8193 KOps/s 13.8136 KOps/s $\color{#35bf28}+0.04\%$
test_compile_indexing[int-tensordict-eager] 0.9729ms 26.3668μs 37.9266 KOps/s 36.5549 KOps/s $\color{#35bf28}+3.75\%$
test_compile_indexing[int-tensorclass-compile] 0.1290ms 65.8332μs 15.1899 KOps/s 14.7307 KOps/s $\color{#35bf28}+3.12\%$
test_compile_indexing[int-tensorclass-eager] 65.2420μs 22.7201μs 44.0138 KOps/s 44.3869 KOps/s $\color{#d91a1a}-0.84\%$
test_compile_indexing[int-pytree-compile] 0.1449ms 65.8311μs 15.1904 KOps/s 14.6673 KOps/s $\color{#35bf28}+3.57\%$
test_compile_indexing[int-pytree-eager] 58.9200μs 22.5642μs 44.3181 KOps/s 44.2082 KOps/s $\color{#35bf28}+0.25\%$
test_mod_add[eager] 91.2690μs 23.9681μs 41.7222 KOps/s 40.5845 KOps/s $\color{#35bf28}+2.80\%$
test_mod_add[compile] 95.5680μs 38.4926μs 25.9790 KOps/s 25.3099 KOps/s $\color{#35bf28}+2.64\%$
test_mod_add[compile-overhead] 86.5810μs 39.4629μs 25.3403 KOps/s 25.8750 KOps/s $\color{#d91a1a}-2.07\%$
test_mod_wrap[eager] 0.4200ms 0.2065ms 4.8438 KOps/s 4.8433 KOps/s $+0.01\%$
test_mod_wrap[compile] 0.4630ms 0.2283ms 4.3805 KOps/s 4.2791 KOps/s $\color{#35bf28}+2.37\%$
test_mod_wrap[compile-overhead] 0.4310ms 0.2272ms 4.4007 KOps/s 4.3363 KOps/s $\color{#35bf28}+1.48\%$
test_mod_wrap_and_backward[eager] 13.5234ms 11.3194ms 88.3440 Ops/s 86.6768 Ops/s $\color{#35bf28}+1.92\%$
test_mod_wrap_and_backward[compile] 18.3699ms 11.3735ms 87.9240 Ops/s 81.9887 Ops/s $\textbf{\color{#35bf28}+7.24\%}$
test_mod_wrap_and_backward[compile-overhead] 12.9051ms 11.6189ms 86.0669 Ops/s 78.7318 Ops/s $\textbf{\color{#35bf28}+9.32\%}$
test_seq_add[eager] 0.1994ms 87.6922μs 11.4035 KOps/s 11.2252 KOps/s $\color{#35bf28}+1.59\%$
test_seq_add[compile] 0.1340ms 62.3279μs 16.0442 KOps/s 15.5756 KOps/s $\color{#35bf28}+3.01\%$
test_seq_add[compile-overhead] 0.1223ms 60.6914μs 16.4768 KOps/s 16.0220 KOps/s $\color{#35bf28}+2.84\%$
test_seq_wrap[eager] 0.7394ms 0.3811ms 2.6237 KOps/s 2.6420 KOps/s $\color{#d91a1a}-0.69\%$
test_seq_wrap[compile] 0.4045ms 0.2605ms 3.8385 KOps/s 3.7226 KOps/s $\color{#35bf28}+3.11\%$
test_seq_wrap[compile-overhead] 0.3859ms 0.2620ms 3.8166 KOps/s 3.7343 KOps/s $\color{#35bf28}+2.20\%$
test_func_call_runtime[False-eager] 0.7144ms 0.5166ms 1.9357 KOps/s 1.9488 KOps/s $\color{#d91a1a}-0.67\%$
test_func_call_runtime[False-compile] 0.6222ms 0.4892ms 2.0443 KOps/s 2.0319 KOps/s $\color{#35bf28}+0.61\%$
test_func_call_runtime[False-compile-overhead] 0.9819ms 0.4885ms 2.0472 KOps/s 2.0123 KOps/s $\color{#35bf28}+1.73\%$
test_func_call_runtime[True-eager] 0.8687ms 0.7268ms 1.3760 KOps/s 1.3612 KOps/s $\color{#35bf28}+1.08\%$
test_func_call_runtime[True-compile] 0.6517ms 0.5081ms 1.9681 KOps/s 1.9740 KOps/s $\color{#d91a1a}-0.30\%$
test_func_call_runtime[True-compile-overhead] 0.6257ms 0.5042ms 1.9834 KOps/s 1.9655 KOps/s $\color{#35bf28}+0.91\%$
test_func_call_cm_runtime[False-eager] 0.9788ms 0.5149ms 1.9422 KOps/s 1.9501 KOps/s $\color{#d91a1a}-0.40\%$
test_func_call_cm_runtime[False-compile] 0.7754ms 0.4923ms 2.0311 KOps/s 2.0250 KOps/s $\color{#35bf28}+0.30\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8255ms 0.4899ms 2.0413 KOps/s 2.0309 KOps/s $\color{#35bf28}+0.51\%$
test_func_call_cm_runtime[True-eager] 1.4115ms 0.8510ms 1.1750 KOps/s 1.1585 KOps/s $\color{#35bf28}+1.43\%$
test_func_call_cm_runtime[True-compile] 1.2110ms 0.7348ms 1.3610 KOps/s 1.3656 KOps/s $\color{#d91a1a}-0.34\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8881ms 0.7351ms 1.3603 KOps/s 1.3603 KOps/s $-0.00\%$
test_vmap_func_call_cm_runtime[eager] 2.5132ms 1.8153ms 550.8692 Ops/s 537.8812 Ops/s $\color{#35bf28}+2.41\%$
test_vmap_func_call_cm_runtime[compile] 2.4455ms 1.8630ms 536.7787 Ops/s 523.2766 Ops/s $\color{#35bf28}+2.58\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.5852ms 1.8658ms 535.9588 Ops/s 524.2920 Ops/s $\color{#35bf28}+2.23\%$
test_distributed 0.2680ms 0.1256ms 7.9622 KOps/s 7.8551 KOps/s $\color{#35bf28}+1.36\%$
test_tdmodule 42.3490μs 16.3598μs 61.1256 KOps/s 56.1962 KOps/s $\textbf{\color{#35bf28}+8.77\%}$
test_tdmodule_dispatch 58.9890μs 34.1283μs 29.3012 KOps/s 27.2697 KOps/s $\textbf{\color{#35bf28}+7.45\%}$
test_tdseq 37.2300μs 19.2893μs 51.8422 KOps/s 47.1549 KOps/s $\textbf{\color{#35bf28}+9.94\%}$
test_tdseq_dispatch 62.8570μs 38.4266μs 26.0236 KOps/s 23.6605 KOps/s $\textbf{\color{#35bf28}+9.99\%}$
test_instantiation_functorch 1.7228ms 1.5620ms 640.1936 Ops/s 640.3391 Ops/s $\color{#d91a1a}-0.02\%$
test_instantiation_td 1.8786ms 1.1485ms 870.7309 Ops/s 858.4039 Ops/s $\color{#35bf28}+1.44\%$
test_exec_functorch 0.2526ms 0.1811ms 5.5217 KOps/s 5.3173 KOps/s $\color{#35bf28}+3.84\%$
test_exec_functional_call 0.3298ms 0.1711ms 5.8441 KOps/s 5.7952 KOps/s $\color{#35bf28}+0.84\%$
test_exec_td 0.3121ms 0.1607ms 6.2224 KOps/s 5.8066 KOps/s $\textbf{\color{#35bf28}+7.16\%}$
test_exec_td_decorator 1.3913ms 0.2162ms 4.6254 KOps/s 4.5143 KOps/s $\color{#35bf28}+2.46\%$
test_vmap_mlp_speed[True-True] 0.9966ms 0.6269ms 1.5951 KOps/s 1.5620 KOps/s $\color{#35bf28}+2.12\%$
test_vmap_mlp_speed[True-False] 0.9967ms 0.6259ms 1.5976 KOps/s 1.5750 KOps/s $\color{#35bf28}+1.44\%$
test_vmap_mlp_speed[False-True] 0.7470ms 0.4859ms 2.0579 KOps/s 2.0095 KOps/s $\color{#35bf28}+2.41\%$
test_vmap_mlp_speed[False-False] 0.7620ms 0.4861ms 2.0573 KOps/s 2.0261 KOps/s $\color{#35bf28}+1.54\%$
test_vmap_mlp_speed_decorator[True-True] 1.4522ms 0.6029ms 1.6588 KOps/s 1.6077 KOps/s $\color{#35bf28}+3.18\%$
test_vmap_mlp_speed_decorator[True-False] 0.9287ms 0.6045ms 1.6542 KOps/s 1.6216 KOps/s $\color{#35bf28}+2.01\%$
test_vmap_mlp_speed_decorator[False-True] 0.7259ms 0.4958ms 2.0168 KOps/s 1.9702 KOps/s $\color{#35bf28}+2.36\%$
test_vmap_mlp_speed_decorator[False-False] 0.7738ms 0.4962ms 2.0152 KOps/s 1.9085 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_to_module_speed[True] 2.0709ms 1.2813ms 780.4787 Ops/s 781.8946 Ops/s $\color{#d91a1a}-0.18\%$
test_to_module_speed[False] 1.7846ms 1.2413ms 805.6262 Ops/s 809.5953 Ops/s $\color{#d91a1a}-0.49\%$
test_tc_init 90.1970μs 40.9940μs 24.3938 KOps/s 23.2404 KOps/s $\color{#35bf28}+4.96\%$
test_tc_init_nested 0.1615ms 82.3116μs 12.1490 KOps/s 11.8072 KOps/s $\color{#35bf28}+2.89\%$
test_tc_first_layer_tensor 17.0720μs 1.5520μs 644.3383 KOps/s 664.0841 KOps/s $\color{#d91a1a}-2.97\%$
test_tc_first_layer_nontensor 28.0220μs 4.7784μs 209.2759 KOps/s 216.5307 KOps/s $\color{#d91a1a}-3.35\%$
test_tc_second_layer_tensor 30.4170μs 2.8899μs 346.0375 KOps/s 360.7911 KOps/s $\color{#d91a1a}-4.09\%$
test_tc_second_layer_nontensor 35.1350μs 6.1068μs 163.7520 KOps/s 167.2705 KOps/s $\color{#d91a1a}-2.10\%$
test_unbind 7.7071ms 7.2453ms 138.0203 Ops/s 77.1217 Ops/s $\textbf{\color{#35bf28}+78.96\%}$
test_full_like 19.8340ms 11.7669ms 84.9840 Ops/s 143.6220 Ops/s $\textbf{\color{#d91a1a}-40.83\%}$
test_zeros_like 15.4499ms 7.2307ms 138.2995 Ops/s 376.9328 Ops/s $\textbf{\color{#d91a1a}-63.31\%}$
test_ones_like 16.0450ms 7.5102ms 133.1529 Ops/s 307.7219 Ops/s $\textbf{\color{#d91a1a}-56.73\%}$
test_clone 15.9288ms 8.8910ms 112.4731 Ops/s 198.1230 Ops/s $\textbf{\color{#d91a1a}-43.23\%}$
test_squeeze 59.5410μs 11.8390μs 84.4668 KOps/s 81.6620 KOps/s $\color{#35bf28}+3.43\%$
test_unsqueeze 0.1957ms 90.0215μs 11.1085 KOps/s 11.0503 KOps/s $\color{#35bf28}+0.53\%$
test_split 0.4724ms 0.1871ms 5.3457 KOps/s 5.1737 KOps/s $\color{#35bf28}+3.32\%$
test_permute 0.3813ms 0.2199ms 4.5480 KOps/s 4.5628 KOps/s $\color{#d91a1a}-0.33\%$
test_stack 32.3506ms 24.1939ms 41.3327 Ops/s 40.5624 Ops/s $\color{#35bf28}+1.90\%$
test_cat 29.5299ms 23.5464ms 42.4693 Ops/s 40.1336 Ops/s $\textbf{\color{#35bf28}+5.82\%}$
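
For readers checking the numbers: the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces three rows from the table under that assumption. The helper name `relative_change` is hypothetical and is not part of the benchmark tooling, and both values in a row must be in the same unit (KOps/s, Ops/s, ...) before comparing.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD.

    Assumption: Change = (Ops - Ops on HEAD) / Ops on HEAD * 100.
    Both arguments must use the same unit.
    """
    if ops_head <= 0:
        raise ValueError("ops_head must be positive")
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_nested": (50.6375, 48.7404),                # KOps/s
    "test_membership_stacked_nested_last": (129.5711, 189.2594),  # KOps/s
    "test_unbind": (138.0203, 77.1217),                         # Ops/s
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
```

A positive value means higher throughput than HEAD, a negative value means lower.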


github-actions bot commented Sep 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}37$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.6072ms 13.2161μs 75.6654 KOps/s 68.4079 KOps/s $\textbf{\color{#35bf28}+10.61\%}$
test_plain_set_stack_nested 36.3110μs 13.1081μs 76.2884 KOps/s 68.6673 KOps/s $\textbf{\color{#35bf28}+11.10\%}$
test_plain_set_nested_inplace 48.9810μs 14.0501μs 71.1737 KOps/s 64.4046 KOps/s $\textbf{\color{#35bf28}+10.51\%}$
test_plain_set_stack_nested_inplace 57.0110μs 13.9800μs 71.5309 KOps/s 63.4362 KOps/s $\textbf{\color{#35bf28}+12.76\%}$
test_items 30.8000μs 2.8456μs 351.4138 KOps/s 351.1814 KOps/s $\color{#35bf28}+0.07\%$
test_items_nested 0.3436ms 0.3122ms 3.2034 KOps/s 3.1545 KOps/s $\color{#35bf28}+1.55\%$
test_items_nested_locked 0.4091ms 0.3170ms 3.1546 KOps/s 3.1593 KOps/s $\color{#d91a1a}-0.15\%$
test_items_nested_leaf 98.5730μs 63.2314μs 15.8149 KOps/s 15.9364 KOps/s $\color{#d91a1a}-0.76\%$
test_items_stack_nested 0.3677ms 0.3157ms 3.1680 KOps/s 3.1806 KOps/s $\color{#d91a1a}-0.40\%$
test_items_stack_nested_leaf 88.3820μs 63.0284μs 15.8659 KOps/s 15.6403 KOps/s $\color{#35bf28}+1.44\%$
test_items_stack_nested_locked 0.3653ms 0.3164ms 3.1610 KOps/s 3.1541 KOps/s $\color{#35bf28}+0.22\%$
test_keys 37.6310μs 3.3670μs 297.0023 KOps/s 296.0093 KOps/s $\color{#35bf28}+0.34\%$
test_keys_nested 89.3120μs 55.0754μs 18.1569 KOps/s 18.0804 KOps/s $\color{#35bf28}+0.42\%$
test_keys_nested_locked 0.8166ms 59.3678μs 16.8441 KOps/s 16.7150 KOps/s $\color{#35bf28}+0.77\%$
test_keys_nested_leaf 86.7020μs 46.2114μs 21.6397 KOps/s 22.4212 KOps/s $\color{#d91a1a}-3.49\%$
test_keys_stack_nested 90.5420μs 54.3489μs 18.3996 KOps/s 18.2306 KOps/s $\color{#35bf28}+0.93\%$
test_keys_stack_nested_leaf 82.5120μs 46.8954μs 21.3240 KOps/s 20.9813 KOps/s $\color{#35bf28}+1.63\%$
test_keys_stack_nested_locked 89.3520μs 58.5948μs 17.0664 KOps/s 16.8183 KOps/s $\color{#35bf28}+1.48\%$
test_values 4.3400μs 0.8011μs 1.2483 MOps/s 1.2485 MOps/s $-0.01\%$
test_values_nested 0.1501ms 27.2043μs 36.7589 KOps/s 36.8266 KOps/s $\color{#d91a1a}-0.18\%$
test_values_nested_locked 0.1041ms 29.4658μs 33.9376 KOps/s 34.5887 KOps/s $\color{#d91a1a}-1.88\%$
test_values_nested_leaf 50.4510μs 24.1675μs 41.3779 KOps/s 41.6778 KOps/s $\color{#d91a1a}-0.72\%$
test_values_stack_nested 61.5410μs 28.1002μs 35.5869 KOps/s 35.4205 KOps/s $\color{#35bf28}+0.47\%$
test_values_stack_nested_leaf 0.2166ms 24.5866μs 40.6725 KOps/s 40.0861 KOps/s $\color{#35bf28}+1.46\%$
test_values_stack_nested_locked 0.2255ms 29.9074μs 33.4365 KOps/s 33.2617 KOps/s $\color{#35bf28}+0.53\%$
test_membership 2.2371μs 0.4762μs 2.1001 MOps/s 2.1348 MOps/s $\color{#d91a1a}-1.63\%$
test_membership_nested 13.9805μs 1.7555μs 569.6494 KOps/s 577.7176 KOps/s $\color{#d91a1a}-1.40\%$
test_membership_nested_leaf 29.3573μs 1.6998μs 588.2995 KOps/s 576.9035 KOps/s $\color{#35bf28}+1.98\%$
test_membership_stacked_nested 37.2610μs 1.7761μs 563.0375 KOps/s 564.6706 KOps/s $\color{#d91a1a}-0.29\%$
test_membership_stacked_nested_leaf 22.2100μs 1.7725μs 564.1730 KOps/s 561.9916 KOps/s $\color{#35bf28}+0.39\%$
test_membership_nested_last 27.8600μs 2.6132μs 382.6661 KOps/s 383.3116 KOps/s $\color{#d91a1a}-0.17\%$
test_membership_nested_leaf_last 28.5210μs 2.5969μs 385.0685 KOps/s 380.5323 KOps/s $\color{#35bf28}+1.19\%$
test_membership_stacked_nested_last 29.6310μs 7.5128μs 133.1057 KOps/s 387.0979 KOps/s $\textbf{\color{#d91a1a}-65.61\%}$
test_membership_stacked_nested_leaf_last 35.0410μs 7.5660μs 132.1707 KOps/s 378.4749 KOps/s $\textbf{\color{#d91a1a}-65.08\%}$
test_nested_getleaf 36.9910μs 6.2072μs 161.1029 KOps/s 163.7071 KOps/s $\color{#d91a1a}-1.59\%$
test_nested_get 35.3110μs 5.7967μs 172.5120 KOps/s 173.5751 KOps/s $\color{#d91a1a}-0.61\%$
test_stacked_getleaf 35.3110μs 6.0878μs 164.2627 KOps/s 166.0093 KOps/s $\color{#d91a1a}-1.05\%$
test_stacked_get 34.1910μs 5.7376μs 174.2886 KOps/s 176.2237 KOps/s $\color{#d91a1a}-1.10\%$
test_nested_getitemleaf 26.2100μs 6.2523μs 159.9403 KOps/s 164.0315 KOps/s $\color{#d91a1a}-2.49\%$
test_nested_getitem 0.1934ms 5.8194μs 171.8397 KOps/s 173.2113 KOps/s $\color{#d91a1a}-0.79\%$
test_stacked_getitemleaf 41.0210μs 6.1596μs 162.3490 KOps/s 165.5694 KOps/s $\color{#d91a1a}-1.95\%$
test_stacked_getitem 28.6600μs 5.6899μs 175.7509 KOps/s 177.4107 KOps/s $\color{#d91a1a}-0.94\%$
test_lock_nested 1.2135ms 0.4095ms 2.4420 KOps/s 2.4337 KOps/s $\color{#35bf28}+0.34\%$
test_lock_stack_nested 0.4986ms 0.3659ms 2.7332 KOps/s 2.6624 KOps/s $\color{#35bf28}+2.66\%$
test_unlock_nested 0.7512ms 0.3520ms 2.8409 KOps/s 2.8420 KOps/s $\color{#d91a1a}-0.04\%$
test_unlock_stack_nested 0.4737ms 0.3056ms 3.2718 KOps/s 3.1605 KOps/s $\color{#35bf28}+3.52\%$
test_flatten_speed 0.1646ms 79.2347μs 12.6207 KOps/s 12.7295 KOps/s $\color{#d91a1a}-0.85\%$
test_unflatten_speed 0.3100ms 0.2812ms 3.5563 KOps/s 3.6054 KOps/s $\color{#d91a1a}-1.36\%$
test_common_ops 1.5070ms 1.1974ms 835.1521 Ops/s 809.4717 Ops/s $\color{#35bf28}+3.17\%$
test_creation 21.5010μs 1.4785μs 676.3631 KOps/s 685.2129 KOps/s $\color{#d91a1a}-1.29\%$
test_creation_empty 0.7672ms 13.4055μs 74.5963 KOps/s 59.2329 KOps/s $\textbf{\color{#35bf28}+25.94\%}$
test_creation_nested_1 0.1032ms 15.2753μs 65.4653 KOps/s 53.6020 KOps/s $\textbf{\color{#35bf28}+22.13\%}$
test_creation_nested_2 47.9810μs 17.8282μs 56.0909 KOps/s 47.5734 KOps/s $\textbf{\color{#35bf28}+17.90\%}$
test_clone 0.1815ms 28.3602μs 35.2607 KOps/s 35.3438 KOps/s $\color{#d91a1a}-0.23\%$
test_getitem[int] 1.0398ms 15.8084μs 63.2575 KOps/s 65.2209 KOps/s $\color{#d91a1a}-3.01\%$
test_getitem[slice_int] 0.1434ms 26.3197μs 37.9944 KOps/s 37.7490 KOps/s $\color{#35bf28}+0.65\%$
test_getitem[range] 0.2949ms 0.1092ms 9.1615 KOps/s 9.6199 KOps/s $\color{#d91a1a}-4.76\%$
test_getitem[tuple] 0.1040s 29.6286μs 33.7512 KOps/s 44.7243 KOps/s $\textbf{\color{#d91a1a}-24.54\%}$
test_getitem[list] 0.2585ms 95.8082μs 10.4375 KOps/s 10.4691 KOps/s $\color{#d91a1a}-0.30\%$
test_setitem_dim[int] 0.1810ms 47.2325μs 21.1719 KOps/s 19.7480 KOps/s $\textbf{\color{#35bf28}+7.21\%}$
test_setitem_dim[slice_int] 97.7720μs 70.7853μs 14.1272 KOps/s 13.5847 KOps/s $\color{#35bf28}+3.99\%$
test_setitem_dim[range] 0.3146ms 0.1310ms 7.6313 KOps/s 7.5451 KOps/s $\color{#35bf28}+1.14\%$
test_setitem_dim[tuple] 0.2037ms 64.2059μs 15.5749 KOps/s 14.8295 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_setitem 0.1866ms 38.8292μs 25.7538 KOps/s 24.4427 KOps/s $\textbf{\color{#35bf28}+5.36\%}$
test_set 0.2377ms 37.2510μs 26.8449 KOps/s 25.2673 KOps/s $\textbf{\color{#35bf28}+6.24\%}$
test_set_shared 0.3464ms 49.2556μs 20.3023 KOps/s 20.1157 KOps/s $\color{#35bf28}+0.93\%$
test_update 0.2438ms 44.5160μs 22.4638 KOps/s 20.2961 KOps/s $\textbf{\color{#35bf28}+10.68\%}$
test_update_nested 0.2004ms 50.9683μs 19.6200 KOps/s 17.7534 KOps/s $\textbf{\color{#35bf28}+10.51\%}$
test_update__nested 0.2069ms 56.1181μs 17.8196 KOps/s 17.4142 KOps/s $\color{#35bf28}+2.33\%$
test_set_nested 0.2056ms 39.1830μs 25.5213 KOps/s 23.5206 KOps/s $\textbf{\color{#35bf28}+8.51\%}$
test_set_nested_new 0.1931ms 42.2977μs 23.6420 KOps/s 22.0390 KOps/s $\textbf{\color{#35bf28}+7.27\%}$
test_select 0.2083ms 55.7178μs 17.9476 KOps/s 16.8731 KOps/s $\textbf{\color{#35bf28}+6.37\%}$
test_select_nested 0.4767ms 43.1273μs 23.1872 KOps/s 24.0721 KOps/s $\color{#d91a1a}-3.68\%$
test_exclude_nested 0.1003ms 59.3833μs 16.8397 KOps/s 16.8220 KOps/s $\color{#35bf28}+0.11\%$
test_empty[True] 0.3661ms 0.2417ms 4.1377 KOps/s 4.1509 KOps/s $\color{#d91a1a}-0.32\%$
test_empty[False] 3.1651μs 0.7112μs 1.4061 MOps/s 1.3938 MOps/s $\color{#35bf28}+0.88\%$
test_to 0.1901ms 23.0867μs 43.3150 KOps/s 39.6802 KOps/s $\textbf{\color{#35bf28}+9.16\%}$
test_to_nonblocking 0.2079ms 23.3955μs 42.7433 KOps/s 42.1385 KOps/s $\color{#35bf28}+1.44\%$
test_unbind_speed 0.3485ms 0.2749ms 3.6374 KOps/s 3.6468 KOps/s $\color{#d91a1a}-0.26\%$
test_unbind_speed_stack0 0.4506ms 0.2606ms 3.8377 KOps/s 3.6186 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_unbind_speed_stack1 0.7874ms 0.6208ms 1.6108 KOps/s 1.4332 KOps/s $\textbf{\color{#35bf28}+12.39\%}$
test_split 0.1021s 2.1639ms 462.1203 Ops/s 471.5261 Ops/s $\color{#d91a1a}-1.99\%$
test_chunk 0.1075s 2.1933ms 455.9396 Ops/s 470.3793 Ops/s $\color{#d91a1a}-3.07\%$
test_creation[device0] 0.3992ms 0.1308ms 7.6452 KOps/s 7.9809 KOps/s $\color{#d91a1a}-4.21\%$
test_creation_from_tensor 0.4770ms 0.1315ms 7.6069 KOps/s 7.8747 KOps/s $\color{#d91a1a}-3.40\%$
test_add_one[memmap_tensor0] 0.1374ms 8.3269μs 120.0927 KOps/s 113.3581 KOps/s $\textbf{\color{#35bf28}+5.94\%}$
test_contiguous[memmap_tensor0] 39.1810μs 2.1557μs 463.8936 KOps/s 468.4915 KOps/s $\color{#d91a1a}-0.98\%$
test_stack[memmap_tensor0] 45.6710μs 6.3729μs 156.9150 KOps/s 154.0970 KOps/s $\color{#35bf28}+1.83\%$
test_memmaptd_index 1.1364ms 0.4107ms 2.4347 KOps/s 2.4406 KOps/s $\color{#d91a1a}-0.24\%$
test_memmaptd_index_astensor 0.9053ms 0.4670ms 2.1414 KOps/s 2.1515 KOps/s $\color{#d91a1a}-0.47\%$
test_memmaptd_index_op 1.3444ms 0.9528ms 1.0495 KOps/s 981.2824 Ops/s $\textbf{\color{#35bf28}+6.95\%}$
test_serialize_model 0.1308s 0.1300s 7.6894 Ops/s 7.6927 Ops/s $\color{#d91a1a}-0.04\%$
test_serialize_model_pickle 1.3475s 1.2138s 0.8239 Ops/s 0.8240 Ops/s $\color{#d91a1a}-0.02\%$
test_serialize_weights 0.2312s 0.1439s 6.9494 Ops/s 6.9192 Ops/s $\color{#35bf28}+0.44\%$
test_serialize_weights_returnearly 0.2546s 57.6230ms 17.3542 Ops/s 17.9120 Ops/s $\color{#d91a1a}-3.11\%$
test_serialize_weights_pickle 1.3471s 1.2126s 0.8247 Ops/s 0.8212 Ops/s $\color{#35bf28}+0.43\%$
test_reshape_pytree 0.1562ms 34.3108μs 29.1453 KOps/s 29.1172 KOps/s $\color{#35bf28}+0.10\%$
test_reshape_td 0.1319ms 42.0384μs 23.7878 KOps/s 24.8850 KOps/s $\color{#d91a1a}-4.41\%$
test_view_pytree 0.1980ms 35.5815μs 28.1045 KOps/s 29.0900 KOps/s $\color{#d91a1a}-3.39\%$
test_view_td 0.1724ms 48.1889μs 20.7517 KOps/s 21.9301 KOps/s $\textbf{\color{#d91a1a}-5.37\%}$
test_unbind_pytree 0.1427ms 33.7756μs 29.6072 KOps/s 29.0634 KOps/s $\color{#35bf28}+1.87\%$
test_unbind_td 0.4187ms 41.1576μs 24.2969 KOps/s 23.8007 KOps/s $\color{#35bf28}+2.08\%$
test_split_pytree 0.1926ms 45.3704μs 22.0408 KOps/s 21.8031 KOps/s $\color{#35bf28}+1.09\%$
test_split_td 0.5088ms 55.1047μs 18.1473 KOps/s 17.9058 KOps/s $\color{#35bf28}+1.35\%$
test_add_pytree 0.2037ms 55.1414μs 18.1352 KOps/s 17.8760 KOps/s $\color{#35bf28}+1.45\%$
test_add_td 0.2955ms 84.4261μs 11.8447 KOps/s 10.8269 KOps/s $\textbf{\color{#35bf28}+9.40\%}$
test_compile_add_one_nested[tensordict-compile] 0.4396ms 0.2218ms 4.5079 KOps/s 4.6172 KOps/s $\color{#d91a1a}-2.37\%$
test_compile_add_one_nested[tensordict-eager] 0.2995ms 0.1538ms 6.5032 KOps/s 6.3073 KOps/s $\color{#35bf28}+3.11\%$
test_compile_add_one_nested[pytree-compile] 0.3045ms 0.1489ms 6.7170 KOps/s 6.9179 KOps/s $\color{#d91a1a}-2.90\%$
test_compile_add_one_nested[pytree-eager] 0.3380ms 0.1757ms 5.6911 KOps/s 5.7252 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_copy_nested[tensordict-compile] 0.1562ms 20.9004μs 47.8461 KOps/s 47.9179 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_copy_nested[tensordict-eager] 0.1873ms 41.1138μs 24.3228 KOps/s 23.6706 KOps/s $\color{#35bf28}+2.76\%$
test_compile_copy_nested[pytree-compile] 0.2531ms 62.8251μs 15.9172 KOps/s 15.5134 KOps/s $\color{#35bf28}+2.60\%$
test_compile_copy_nested[pytree-eager] 0.1385ms 48.8312μs 20.4787 KOps/s 20.1897 KOps/s $\color{#35bf28}+1.43\%$
test_compile_add_one_flat[tensordict-compile] 0.5546ms 0.3215ms 3.1100 KOps/s 3.1444 KOps/s $\color{#d91a1a}-1.10\%$
test_compile_add_one_flat[tensordict-eager] 0.3999ms 0.2091ms 4.7835 KOps/s 4.7401 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_one_flat[tensorclass-compile] 0.3108ms 0.1330ms 7.5176 KOps/s 7.6961 KOps/s $\color{#d91a1a}-2.32\%$
test_compile_add_one_flat[tensorclass-eager] 0.2618ms 65.6150μs 15.2404 KOps/s 16.2252 KOps/s $\textbf{\color{#d91a1a}-6.07\%}$
test_compile_add_one_flat[pytree-compile] 0.4623ms 0.3150ms 3.1743 KOps/s 3.1711 KOps/s $\color{#35bf28}+0.10\%$
test_compile_add_one_flat[pytree-eager] 0.8190ms 0.5840ms 1.7124 KOps/s 1.6784 KOps/s $\color{#35bf28}+2.02\%$
test_compile_add_self_flat[tensordict-eager] 0.3807ms 0.2433ms 4.1095 KOps/s 4.0594 KOps/s $\color{#35bf28}+1.24\%$
test_compile_add_self_flat[tensordict-compile] 0.4532ms 0.3187ms 3.1378 KOps/s 3.1490 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_add_self_flat[tensorclass-eager] 0.2554ms 73.3717μs 13.6292 KOps/s 14.2183 KOps/s $\color{#d91a1a}-4.14\%$
test_compile_add_self_flat[tensorclass-compile] 0.2771ms 0.1299ms 7.6964 KOps/s 7.6179 KOps/s $\color{#35bf28}+1.03\%$
test_compile_add_self_flat[pytree-eager] 0.6903ms 0.4938ms 2.0251 KOps/s 1.9634 KOps/s $\color{#35bf28}+3.15\%$
test_compile_add_self_flat[pytree-compile] 0.5162ms 0.3179ms 3.1458 KOps/s 3.1733 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_copy_flat[tensordict-compile] 0.1476ms 18.4734μs 54.1320 KOps/s 56.1176 KOps/s $\color{#d91a1a}-3.54\%$
test_compile_copy_flat[tensordict-eager] 59.5910μs 26.6896μs 37.4678 KOps/s 35.9956 KOps/s $\color{#35bf28}+4.09\%$
test_compile_copy_flat[pytree-compile] 0.1171ms 69.7274μs 14.3416 KOps/s 14.2188 KOps/s $\color{#35bf28}+0.86\%$
test_compile_copy_flat[pytree-eager] 0.1037ms 50.2943μs 19.8830 KOps/s 19.4701 KOps/s $\color{#35bf28}+2.12\%$
test_compile_assign_and_add[tensordict-compile] 2.2605ms 0.8030ms 1.2453 KOps/s 1.1593 KOps/s $\textbf{\color{#35bf28}+7.42\%}$
test_compile_assign_and_add[tensordict-eager] 3.4532ms 3.1265ms 319.8420 Ops/s 325.4605 Ops/s $\color{#d91a1a}-1.73\%$
test_compile_assign_and_add[pytree-compile] 2.2717ms 0.7973ms 1.2543 KOps/s 1.1609 KOps/s $\textbf{\color{#35bf28}+8.04\%}$
test_compile_assign_and_add[pytree-eager] 3.2541ms 2.9996ms 333.3763 Ops/s 328.9188 Ops/s $\color{#35bf28}+1.36\%$
test_compile_indexing[tensor-tensordict-compile] 0.2555ms 0.1107ms 9.0314 KOps/s 8.9721 KOps/s $\color{#35bf28}+0.66\%$
test_compile_indexing[tensor-tensordict-eager] 0.2144ms 57.5054μs 17.3897 KOps/s 17.2276 KOps/s $\color{#35bf28}+0.94\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2982ms 0.1034ms 9.6688 KOps/s 9.4184 KOps/s $\color{#35bf28}+2.66\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2192ms 40.4743μs 24.7070 KOps/s 24.4812 KOps/s $\color{#35bf28}+0.92\%$
test_compile_indexing[tensor-pytree-compile] 0.2481ms 0.1036ms 9.6504 KOps/s 9.6425 KOps/s $\color{#35bf28}+0.08\%$
test_compile_indexing[tensor-pytree-eager] 0.2189ms 40.0921μs 24.9426 KOps/s 24.4352 KOps/s $\color{#35bf28}+2.08\%$
test_compile_indexing[slice-tensordict-compile] 0.3201ms 0.1383ms 7.2286 KOps/s 7.2133 KOps/s $\color{#35bf28}+0.21\%$
test_compile_indexing[slice-tensordict-eager] 0.1698ms 24.2041μs 41.3153 KOps/s 42.9237 KOps/s $\color{#d91a1a}-3.75\%$
test_compile_indexing[slice-tensorclass-compile] 0.3134ms 0.1307ms 7.6501 KOps/s 7.5846 KOps/s $\color{#35bf28}+0.86\%$
test_compile_indexing[slice-tensorclass-eager] 87.9920μs 20.5059μs 48.7666 KOps/s 49.9577 KOps/s $\color{#d91a1a}-2.38\%$
test_compile_indexing[slice-pytree-compile] 0.3106ms 0.1316ms 7.5981 KOps/s 7.5752 KOps/s $\color{#35bf28}+0.30\%$
test_compile_indexing[slice-pytree-eager] 86.1420μs 20.1007μs 49.7494 KOps/s 49.7524 KOps/s $-0.01\%$
test_compile_indexing[int-tensordict-compile] 0.3186ms 0.1389ms 7.2009 KOps/s 7.2024 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_indexing[int-tensordict-eager] 0.5187ms 23.6475μs 42.2878 KOps/s 41.6674 KOps/s $\color{#35bf28}+1.49\%$
test_compile_indexing[int-tensorclass-compile] 0.2771ms 0.1307ms 7.6484 KOps/s 7.5927 KOps/s $\color{#35bf28}+0.73\%$
test_compile_indexing[int-tensorclass-eager] 58.8120μs 20.2493μs 49.3845 KOps/s 49.4665 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_indexing[int-pytree-compile] 0.3261ms 0.1316ms 7.5976 KOps/s 7.6052 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_indexing[int-pytree-eager] 0.1186ms 20.2153μs 49.4676 KOps/s 49.6696 KOps/s $\color{#d91a1a}-0.41\%$
test_mod_add[eager] 0.1807ms 29.3173μs 34.1095 KOps/s 32.0413 KOps/s $\textbf{\color{#35bf28}+6.45\%}$
test_mod_add[compile] 0.2180ms 68.8697μs 14.5202 KOps/s 14.4732 KOps/s $\color{#35bf28}+0.32\%$
test_mod_add[compile-overhead] 0.2699ms 0.1383ms 7.2324 KOps/s 6.5198 KOps/s $\textbf{\color{#35bf28}+10.93\%}$
test_mod_wrap[eager] 0.3833ms 0.2356ms 4.2447 KOps/s 4.1194 KOps/s $\color{#35bf28}+3.04\%$
test_mod_wrap[compile] 1.0296ms 0.2823ms 3.5426 KOps/s 3.3238 KOps/s $\textbf{\color{#35bf28}+6.58\%}$
test_mod_wrap[compile-overhead] 8.1063ms 4.2068ms 237.7118 Ops/s 244.6791 Ops/s $\color{#d91a1a}-2.85\%$
test_mod_wrap_and_backward[eager] 1.5722ms 1.3120ms 762.1896 Ops/s 744.3428 Ops/s $\color{#35bf28}+2.40\%$
test_mod_wrap_and_backward[compile] 2.1810ms 1.2770ms 783.1059 Ops/s 765.8159 Ops/s $\color{#35bf28}+2.26\%$
test_mod_wrap_and_backward[compile-overhead] 1.2788ms 0.8802ms 1.1361 KOps/s 1.1098 KOps/s $\color{#35bf28}+2.37\%$
test_seq_add[eager] 0.2416ms 91.3018μs 10.9527 KOps/s 9.9369 KOps/s $\textbf{\color{#35bf28}+10.22\%}$
test_seq_add[compile] 0.4040ms 79.9971μs 12.5005 KOps/s 11.8280 KOps/s $\textbf{\color{#35bf28}+5.69\%}$
test_seq_add[compile-overhead] 0.2749ms 0.1199ms 8.3388 KOps/s 8.6390 KOps/s $\color{#d91a1a}-3.47\%$
test_seq_wrap[eager] 0.5170ms 0.3655ms 2.7359 KOps/s 2.6190 KOps/s $\color{#35bf28}+4.46\%$
test_seq_wrap[compile] 1.0083ms 0.3012ms 3.3199 KOps/s 3.2452 KOps/s $\color{#35bf28}+2.30\%$
test_seq_wrap[compile-overhead] 0.3647ms 0.2189ms 4.5677 KOps/s 4.5603 KOps/s $\color{#35bf28}+0.16\%$
test_func_call_runtime[False-eager] 0.9451ms 0.7427ms 1.3465 KOps/s 1.3697 KOps/s $\color{#d91a1a}-1.69\%$
test_func_call_runtime[False-compile] 1.3288ms 0.7580ms 1.3193 KOps/s 1.2463 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_func_call_runtime[False-compile-overhead] 0.4955ms 0.3558ms 2.8109 KOps/s 2.8181 KOps/s $\color{#d91a1a}-0.26\%$
test_func_call_runtime[True-eager] 1.0565ms 0.8852ms 1.1297 KOps/s 1.1086 KOps/s $\color{#35bf28}+1.90\%$
test_func_call_runtime[True-compile] 0.9734ms 0.7920ms 1.2627 KOps/s 1.2107 KOps/s $\color{#35bf28}+4.29\%$
test_func_call_runtime[True-compile-overhead] 0.5244ms 0.3872ms 2.5826 KOps/s 2.5468 KOps/s $\color{#35bf28}+1.41\%$
test_func_call_cm_runtime[False-eager] 0.9688ms 0.7657ms 1.3060 KOps/s 1.2850 KOps/s $\color{#35bf28}+1.64\%$
test_func_call_cm_runtime[False-compile] 0.9881ms 0.7542ms 1.3259 KOps/s 1.2812 KOps/s $\color{#35bf28}+3.49\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5081ms 0.3572ms 2.7998 KOps/s 2.8125 KOps/s $\color{#d91a1a}-0.45\%$
test_func_call_cm_runtime[True-eager] 1.1248ms 0.9761ms 1.0245 KOps/s 1.0110 KOps/s $\color{#35bf28}+1.33\%$
test_func_call_cm_runtime[True-compile] 1.0043ms 0.8072ms 1.2389 KOps/s 1.2043 KOps/s $\color{#35bf28}+2.87\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5732ms 0.4120ms 2.4269 KOps/s 2.4468 KOps/s $\color{#d91a1a}-0.81\%$
test_vmap_func_call_cm_runtime[eager] 2.5733ms 2.0459ms 488.7854 Ops/s 486.7284 Ops/s $\color{#35bf28}+0.42\%$
test_vmap_func_call_cm_runtime[compile] 0.9802ms 0.8367ms 1.1952 KOps/s 1.1762 KOps/s $\color{#35bf28}+1.61\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5668ms 0.4193ms 2.3850 KOps/s 2.3794 KOps/s $\color{#35bf28}+0.24\%$
test_distributed 6.8439ms 0.2082ms 4.8022 KOps/s 8.6138 KOps/s $\textbf{\color{#d91a1a}-44.25\%}$
test_tdmodule 98.2920μs 13.3824μs 74.7252 KOps/s 65.0965 KOps/s $\textbf{\color{#35bf28}+14.79\%}$
test_tdmodule_dispatch 58.0410μs 27.1635μs 36.8141 KOps/s 32.9296 KOps/s $\textbf{\color{#35bf28}+11.80\%}$
test_tdseq 33.9000μs 14.3748μs 69.5663 KOps/s 61.5740 KOps/s $\textbf{\color{#35bf28}+12.98\%}$
test_tdseq_dispatch 56.8820μs 29.3316μs 34.0929 KOps/s 30.1122 KOps/s $\textbf{\color{#35bf28}+13.22\%}$
test_instantiation_functorch 2.0726ms 1.8256ms 547.7760 Ops/s 542.5614 Ops/s $\color{#35bf28}+0.96\%$
test_instantiation_td 1.8354ms 1.1876ms 842.0190 Ops/s 835.1816 Ops/s $\color{#35bf28}+0.82\%$
test_exec_functorch 0.3481ms 0.2010ms 4.9755 KOps/s 4.8692 KOps/s $\color{#35bf28}+2.18\%$
test_exec_functional_call 0.3560ms 0.2038ms 4.9076 KOps/s 4.8241 KOps/s $\color{#35bf28}+1.73\%$
test_exec_td 0.3612ms 0.2091ms 4.7833 KOps/s 4.4610 KOps/s $\textbf{\color{#35bf28}+7.22\%}$
test_exec_td_decorator 0.9880ms 0.2536ms 3.9428 KOps/s 3.9109 KOps/s $\color{#35bf28}+0.82\%$
test_vmap_mlp_speed[True-True] 0.8487ms 0.6729ms 1.4861 KOps/s 1.4530 KOps/s $\color{#35bf28}+2.28\%$
test_vmap_mlp_speed[True-False] 0.8190ms 0.6713ms 1.4896 KOps/s 1.4733 KOps/s $\color{#35bf28}+1.11\%$
test_vmap_mlp_speed[False-True] 0.7350ms 0.5691ms 1.7571 KOps/s 1.7236 KOps/s $\color{#35bf28}+1.94\%$
test_vmap_mlp_speed[False-False] 0.7300ms 0.5674ms 1.7624 KOps/s 1.7586 KOps/s $\color{#35bf28}+0.21\%$
test_vmap_mlp_speed_decorator[True-True] 0.8821ms 0.6636ms 1.5070 KOps/s 1.4940 KOps/s $\color{#35bf28}+0.87\%$
test_vmap_mlp_speed_decorator[True-False] 0.9802ms 0.6624ms 1.5097 KOps/s 1.4884 KOps/s $\color{#35bf28}+1.43\%$
test_vmap_mlp_speed_decorator[False-True] 0.7441ms 0.5828ms 1.7159 KOps/s 1.6590 KOps/s $\color{#35bf28}+3.43\%$
test_vmap_mlp_speed_decorator[False-False] 0.7503ms 0.5834ms 1.7142 KOps/s 1.6989 KOps/s $\color{#35bf28}+0.90\%$
test_vmap_transformer_speed[True-True] 8.4159ms 8.1833ms 122.1999 Ops/s 121.3871 Ops/s $\color{#35bf28}+0.67\%$
test_vmap_transformer_speed[True-False] 8.3822ms 8.1328ms 122.9584 Ops/s 121.2329 Ops/s $\color{#35bf28}+1.42\%$
test_vmap_transformer_speed[False-True] 8.1345ms 7.9411ms 125.9269 Ops/s 124.6007 Ops/s $\color{#35bf28}+1.06\%$
test_vmap_transformer_speed[False-False] 8.2982ms 7.9897ms 125.1614 Ops/s 124.6303 Ops/s $\color{#35bf28}+0.43\%$
test_vmap_transformer_speed_decorator[True-True] 19.5087ms 19.1247ms 52.2884 Ops/s 52.2040 Ops/s $\color{#35bf28}+0.16\%$
test_vmap_transformer_speed_decorator[True-False] 19.6295ms 19.1266ms 52.2833 Ops/s 52.0480 Ops/s $\color{#35bf28}+0.45\%$
test_vmap_transformer_speed_decorator[False-True] 19.3717ms 19.0438ms 52.5106 Ops/s 52.4516 Ops/s $\color{#35bf28}+0.11\%$
test_vmap_transformer_speed_decorator[False-False] 19.3190ms 19.0241ms 52.5648 Ops/s 52.2497 Ops/s $\color{#35bf28}+0.60\%$
test_to_module_speed[True] 1.4461ms 0.9185ms 1.0887 KOps/s 1.0872 KOps/s $\color{#35bf28}+0.15\%$
test_to_module_speed[False] 1.2669ms 0.8926ms 1.1203 KOps/s 1.1113 KOps/s $\color{#35bf28}+0.81\%$
test_tc_init 0.1330ms 31.5668μs 31.6788 KOps/s 29.1547 KOps/s $\textbf{\color{#35bf28}+8.66\%}$
test_tc_init_nested 0.1881ms 61.7561μs 16.1927 KOps/s 14.1405 KOps/s $\textbf{\color{#35bf28}+14.51\%}$
test_tc_first_layer_tensor 5.2344μs 0.6729μs 1.4861 MOps/s 1.4830 MOps/s $\color{#35bf28}+0.20\%$
test_tc_first_layer_nontensor 21.5610μs 2.2455μs 445.3296 KOps/s 447.2126 KOps/s $\color{#d91a1a}-0.42\%$
test_tc_second_layer_tensor 26.6910μs 1.4528μs 688.3388 KOps/s 737.4686 KOps/s $\textbf{\color{#d91a1a}-6.66\%}$
test_tc_second_layer_nontensor 22.4600μs 2.9507μs 338.9069 KOps/s 340.8949 KOps/s $\color{#d91a1a}-0.58\%$
test_unbind 0.2052s 12.5227ms 79.8550 Ops/s 86.4881 Ops/s $\textbf{\color{#d91a1a}-7.67\%}$
test_full_like 0.7908ms 0.5745ms 1.7405 KOps/s 1.7370 KOps/s $\color{#35bf28}+0.20\%$
test_zeros_like 0.3604ms 0.1985ms 5.0376 KOps/s 5.0408 KOps/s $\color{#d91a1a}-0.06\%$
test_ones_like 0.3819ms 0.1984ms 5.0409 KOps/s 5.0476 KOps/s $\color{#d91a1a}-0.13\%$
test_clone 0.5882ms 0.4145ms 2.4124 KOps/s 2.4089 KOps/s $\color{#35bf28}+0.15\%$
test_squeeze 0.1502ms 9.7587μs 102.4728 KOps/s 99.0829 KOps/s $\color{#35bf28}+3.42\%$
test_unsqueeze 0.2311ms 70.7898μs 14.1263 KOps/s 13.9160 KOps/s $\color{#35bf28}+1.51\%$
test_split 0.4246ms 0.1536ms 6.5084 KOps/s 6.3680 KOps/s $\color{#35bf28}+2.21\%$
test_permute 0.2793ms 0.1699ms 5.8845 KOps/s 5.7314 KOps/s $\color{#35bf28}+2.67\%$
test_stack 1.3980ms 0.8644ms 1.1568 KOps/s 1.1228 KOps/s $\color{#35bf28}+3.03\%$
test_cat 1.3862ms 1.2324ms 811.3971 Ops/s 811.7858 Ops/s $\color{#d91a1a}-0.05\%$
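
The Change column appears to be the relative difference between the Ops value measured on this PR and the Ops value on repo HEAD. The sketch below reproduces that arithmetic for a few rows of the table. The helper names `relative_change` and `is_notable` are hypothetical, and the 5% bolding threshold is an assumption inferred from which rows are bolded, not something the benchmark bot documents here. Because the table shows rounded throughputs, a recomputed value can differ from the printed one in the last digit.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Both arguments must use the same unit (for example, both in KOps/s).
    """
    return (ops / ops_head - 1.0) * 100.0


def is_notable(change_pct: float, threshold_pct: float = 5.0) -> bool:
    """Assumed rule: a row is emphasized when |change| exceeds the threshold."""
    return abs(change_pct) > threshold_pct


# (Ops, Ops on Repo HEAD) pairs copied from the table, in KOps/s.
rows = {
    "test_getitem[tuple]": (33.7512, 44.7243),
    "test_update": (22.4638, 20.2961),
    "test_distributed": (4.8022, 8.6138),
    "test_getitem[range]": (9.1615, 9.6199),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    flag = "notable" if is_notable(change) else "within noise band"
    print(f"{name}: {change:+.2f}% ({flag})")
```

Rows whose two throughputs are printed in different units, such as `test_memmaptd_index_op` (KOps/s against Ops/s), need converting to a common unit before applying the formula.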

Labels
bug Something isn't working CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
2 participants