[Feature] NJT with lengths #1021
Merged
This was referenced Oct 2, 2024

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 53.3800μs | 25.0200μs | 39.9681 KOps/s | 39.4212 KOps/s | |
test_plain_set_stack_nested | 66.4550μs | 25.2316μs | 39.6329 KOps/s | 39.0633 KOps/s | |
test_plain_set_nested_inplace | 71.5340μs | 27.7053μs | 36.0941 KOps/s | 35.8677 KOps/s | |
test_plain_set_stack_nested_inplace | 76.3630μs | 27.3828μs | 36.5192 KOps/s | 35.9755 KOps/s | |
test_items | 25.8090μs | 4.1907μs | 238.6219 KOps/s | 242.2509 KOps/s | |
test_items_nested | 0.5462ms | 0.3809ms | 2.6255 KOps/s | 2.6089 KOps/s | |
test_items_nested_locked | 0.6870ms | 0.3807ms | 2.6270 KOps/s | 2.6261 KOps/s | |
test_items_nested_leaf | 0.1232ms | 81.9748μs | 12.1989 KOps/s | 12.3750 KOps/s | |
test_items_stack_nested | 0.5924ms | 0.3865ms | 2.5871 KOps/s | 2.5848 KOps/s | |
test_items_stack_nested_leaf | 0.1189ms | 83.1433μs | 12.0274 KOps/s | 11.9737 KOps/s | |
test_items_stack_nested_locked | 0.7364ms | 0.3871ms | 2.5836 KOps/s | 2.6016 KOps/s | |
test_keys | 26.1190μs | 3.5762μs | 279.6262 KOps/s | 288.3073 KOps/s | |
test_keys_nested | 0.2525ms | 0.1345ms | 7.4346 KOps/s | 7.3941 KOps/s | |
test_keys_nested_locked | 0.7023ms | 0.1395ms | 7.1661 KOps/s | 7.1053 KOps/s | |
test_keys_nested_leaf | 0.2213ms | 0.1176ms | 8.5000 KOps/s | 8.3967 KOps/s | |
test_keys_stack_nested | 0.6175ms | 0.1364ms | 7.3339 KOps/s | 7.3978 KOps/s | |
test_keys_stack_nested_leaf | 0.2045ms | 0.1174ms | 8.5197 KOps/s | 8.4471 KOps/s | |
test_keys_stack_nested_locked | 0.2699ms | 0.1390ms | 7.1962 KOps/s | 7.0800 KOps/s | |
test_values | 5.6806μs | 1.0515μs | 950.9997 KOps/s | 949.3443 KOps/s | |
test_values_nested | 0.2954ms | 95.4006μs | 10.4821 KOps/s | 10.7193 KOps/s | |
test_values_nested_locked | 0.1645ms | 93.8110μs | 10.6597 KOps/s | 10.9481 KOps/s | |
test_values_nested_leaf | 0.1451ms | 79.4256μs | 12.5904 KOps/s | 12.1165 KOps/s | |
test_values_stack_nested | 0.1716ms | 95.5031μs | 10.4709 KOps/s | 10.7536 KOps/s | |
test_values_stack_nested_leaf | 0.1474ms | 80.4712μs | 12.4268 KOps/s | 12.6380 KOps/s | |
test_values_stack_nested_locked | 0.1838ms | 93.5245μs | 10.6924 KOps/s | 10.8392 KOps/s | |
test_membership | 25.0070μs | 0.9702μs | 1.0307 MOps/s | 1.1572 MOps/s | |
test_membership_nested | 29.3550μs | 2.7981μs | 357.3891 KOps/s | 360.1800 KOps/s | |
test_membership_nested_leaf | 0.1414ms | 2.8774μs | 347.5415 KOps/s | 356.5648 KOps/s | |
test_membership_stacked_nested | 27.7930μs | 2.7714μs | 360.8296 KOps/s | 362.9257 KOps/s | |
test_membership_stacked_nested_leaf | 25.8890μs | 2.8242μs | 354.0766 KOps/s | 359.5121 KOps/s | |
test_membership_nested_last | 28.2030μs | 4.1972μs | 238.2519 KOps/s | 236.1840 KOps/s | |
test_membership_nested_leaf_last | 40.8470μs | 4.2309μs | 236.3575 KOps/s | 230.6388 KOps/s | |
test_membership_stacked_nested_last | 21.3800μs | 4.2352μs | 236.1189 KOps/s | 238.8099 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.3510μs | 4.2675μs | 234.3295 KOps/s | 236.1292 KOps/s | |
test_nested_getleaf | 0.1437ms | 10.6130μs | 94.2243 KOps/s | 88.5823 KOps/s | |
test_nested_get | 34.2440μs | 9.9001μs | 101.0089 KOps/s | 94.5915 KOps/s | |
test_stacked_getleaf | 54.5740μs | 10.4278μs | 95.8972 KOps/s | 90.8843 KOps/s | |
test_stacked_get | 50.4050μs | 9.9999μs | 100.0005 KOps/s | 94.8199 KOps/s | |
test_nested_getitemleaf | 58.9600μs | 10.9778μs | 91.0932 KOps/s | 85.9751 KOps/s | |
test_nested_getitem | 52.3080μs | 10.2363μs | 97.6915 KOps/s | 93.2864 KOps/s | |
test_stacked_getitemleaf | 43.6820μs | 10.8427μs | 92.2278 KOps/s | 86.8323 KOps/s | |
test_stacked_getitem | 38.0220μs | 10.3486μs | 96.6315 KOps/s | 92.4575 KOps/s | |
test_lock_nested | 0.9864ms | 0.4998ms | 2.0007 KOps/s | 1.9279 KOps/s | |
test_lock_stack_nested | 0.7085ms | 0.4721ms | 2.1181 KOps/s | 2.0368 KOps/s | |
test_unlock_nested | 0.1000s | 0.5187ms | 1.9280 KOps/s | 2.2779 KOps/s | |
test_unlock_stack_nested | 0.7451ms | 0.3853ms | 2.5953 KOps/s | 2.4485 KOps/s | |
test_flatten_speed | 0.1878ms | 0.1017ms | 9.8354 KOps/s | 10.0565 KOps/s | |
test_unflatten_speed | 0.7284ms | 0.5068ms | 1.9732 KOps/s | 1.8968 KOps/s | |
test_common_ops | 3.9040ms | 1.1559ms | 865.1340 Ops/s | 849.4647 Ops/s | |
test_creation | 18.3650μs | 2.0777μs | 481.2969 KOps/s | 488.6584 KOps/s | |
test_creation_empty | 66.3040μs | 20.1563μs | 49.6123 KOps/s | 51.1045 KOps/s | |
test_creation_nested_1 | 83.6350μs | 23.2965μs | 42.9248 KOps/s | 42.9168 KOps/s | |
test_creation_nested_2 | 1.1630ms | 27.6571μs | 36.1570 KOps/s | 36.8797 KOps/s | |
test_clone | 0.1166ms | 16.8798μs | 59.2425 KOps/s | 55.3604 KOps/s | |
test_getitem[int] | 0.8096ms | 16.6729μs | 59.9777 KOps/s | 57.6000 KOps/s | |
test_getitem[slice_int] | 0.1920ms | 30.0830μs | 33.2414 KOps/s | 31.4595 KOps/s | |
test_getitem[range] | 0.1705ms | 57.7900μs | 17.3040 KOps/s | 16.5776 KOps/s | |
test_getitem[tuple] | 0.1572ms | 25.6691μs | 38.9573 KOps/s | 38.0054 KOps/s | |
test_getitem[list] | 0.1741ms | 53.0323μs | 18.8564 KOps/s | 18.0890 KOps/s | |
test_setitem_dim[int] | 65.5830μs | 31.4647μs | 31.7817 KOps/s | 30.3288 KOps/s | |
test_setitem_dim[slice_int] | 93.9760μs | 59.2012μs | 16.8916 KOps/s | 16.0349 KOps/s | |
test_setitem_dim[range] | 0.1871ms | 82.4727μs | 12.1252 KOps/s | 11.5441 KOps/s | |
test_setitem_dim[tuple] | 90.5500μs | 48.0518μs | 20.8109 KOps/s | 18.8803 KOps/s | |
test_setitem | 0.1244ms | 30.3768μs | 32.9198 KOps/s | 30.7936 KOps/s | |
test_set | 0.1213ms | 29.8320μs | 33.5211 KOps/s | 31.6533 KOps/s | |
test_set_shared | 1.2555ms | 0.2146ms | 4.6588 KOps/s | 4.4192 KOps/s | |
test_update | 0.1574ms | 38.8254μs | 25.7563 KOps/s | 24.6900 KOps/s | |
test_update_nested | 0.4476ms | 50.2305μs | 19.9082 KOps/s | 19.1125 KOps/s | |
test_update__nested | 0.1108ms | 37.0921μs | 26.9599 KOps/s | 24.9905 KOps/s | |
test_set_nested | 85.0690μs | 32.2682μs | 30.9902 KOps/s | 28.9165 KOps/s | |
test_set_nested_new | 0.1175ms | 37.9677μs | 26.3382 KOps/s | 25.3750 KOps/s | |
test_select | 0.1509ms | 54.7559μs | 18.2629 KOps/s | 17.2343 KOps/s | |
test_select_nested | 0.1322ms | 59.7417μs | 16.7387 KOps/s | 15.6524 KOps/s | |
test_exclude_nested | 0.1510ms | 75.6784μs | 13.2138 KOps/s | 12.5378 KOps/s | |
test_empty[True] | 0.6344ms | 0.3525ms | 2.8366 KOps/s | 2.8030 KOps/s | |
test_empty[False] | 9.9737μs | 1.2265μs | 815.2952 KOps/s | 709.5573 KOps/s | |
test_unbind_speed | 0.4193ms | 0.2963ms | 3.3751 KOps/s | 3.1660 KOps/s | |
test_unbind_speed_stack0 | 0.5027ms | 0.2965ms | 3.3724 KOps/s | 3.1883 KOps/s | |
test_unbind_speed_stack1 | 94.3807ms | 0.8324ms | 1.2014 KOps/s | 1.2650 KOps/s | |
test_split | 95.0244ms | 2.1625ms | 462.4213 Ops/s | 439.0356 Ops/s | |
test_chunk | 2.1796ms | 1.9507ms | 512.6445 Ops/s | 435.9829 Ops/s | |
test_creation[device0] | 4.2702ms | 0.1168ms | 8.5639 KOps/s | 8.2587 KOps/s | |
test_creation_from_tensor | 0.2685ms | 0.1146ms | 8.7233 KOps/s | 8.5394 KOps/s | |
test_add_one[memmap_tensor0] | 0.3104ms | 6.8681μs | 145.6000 KOps/s | 132.6507 KOps/s | |
test_contiguous[memmap_tensor0] | 21.7910μs | 1.9216μs | 520.4013 KOps/s | 534.6308 KOps/s | |
test_stack[memmap_tensor0] | 51.4060μs | 5.3490μs | 186.9520 KOps/s | 171.4215 KOps/s | |
test_memmaptd_index | 0.6385ms | 0.4011ms | 2.4932 KOps/s | 2.3974 KOps/s | |
test_memmaptd_index_astensor | 0.8931ms | 0.5018ms | 1.9927 KOps/s | 1.9144 KOps/s | |
test_memmaptd_index_op | 1.9263ms | 1.0578ms | 945.3655 Ops/s | 911.1476 Ops/s | |
test_serialize_model | 0.2287s | 0.1341s | 7.4570 Ops/s | 8.2624 Ops/s | |
test_serialize_model_pickle | 0.4490s | 0.3976s | 2.5152 Ops/s | 2.5062 Ops/s | |
test_serialize_weights | 0.1278s | 0.1181s | 8.4653 Ops/s | 8.7002 Ops/s | |
test_serialize_weights_returnearly | 0.1729s | 0.1604s | 6.2330 Ops/s | 6.3212 Ops/s | |
test_serialize_weights_pickle | 0.5584s | 0.4155s | 2.4066 Ops/s | 1.0879 Ops/s | |
test_serialize_weights_filesystem | 0.2345s | 0.1565s | 6.3890 Ops/s | 7.0338 Ops/s | |
test_serialize_model_filesystem | 0.1644s | 0.1516s | 6.5971 Ops/s | 7.0658 Ops/s | |
test_reshape_pytree | 87.2830μs | 38.8832μs | 25.7181 KOps/s | 24.6251 KOps/s | |
test_reshape_td | 96.8320μs | 45.1612μs | 22.1429 KOps/s | 20.3950 KOps/s | |
test_view_pytree | 92.8740μs | 38.7193μs | 25.8269 KOps/s | 25.0866 KOps/s | |
test_view_td | 0.1394ms | 50.6637μs | 19.7380 KOps/s | 18.2163 KOps/s | |
test_unbind_pytree | 0.1530ms | 35.8123μs | 27.9233 KOps/s | 27.3047 KOps/s | |
test_unbind_td | 0.2914ms | 44.5252μs | 22.4592 KOps/s | 20.7565 KOps/s | |
test_split_pytree | 79.0980μs | 37.8468μs | 26.4223 KOps/s | 25.2338 KOps/s | |
test_split_td | 0.5052ms | 56.9364μs | 17.5635 KOps/s | 16.5672 KOps/s | |
test_add_pytree | 0.1004ms | 43.1381μs | 23.1814 KOps/s | 20.9568 KOps/s | |
test_add_td | 0.2247ms | 86.4150μs | 11.5721 KOps/s | 10.8633 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1205ms | 57.9576μs | 17.2540 KOps/s | 16.7274 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2620ms | 0.1919ms | 5.2105 KOps/s | 4.9869 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1147ms | 56.9487μs | 17.5597 KOps/s | 17.5924 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3702ms | 0.1383ms | 7.2294 KOps/s | 6.8930 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 56.9870μs | 23.8945μs | 41.8506 KOps/s | 43.7908 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1835ms | 73.4641μs | 13.6121 KOps/s | 12.9596 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1324ms | 75.3356μs | 13.2739 KOps/s | 13.0936 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1215ms | 69.0396μs | 14.4844 KOps/s | 14.4025 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3690ms | 0.1801ms | 5.5540 KOps/s | 5.4429 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3072ms | 0.2339ms | 4.2756 KOps/s | 4.1708 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1531ms | 46.3904μs | 21.5562 KOps/s | 20.6102 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1705ms | 74.5722μs | 13.4098 KOps/s | 12.8939 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3636ms | 0.1733ms | 5.7701 KOps/s | 5.6415 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3807ms | 0.2806ms | 3.5643 KOps/s | 3.3845 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.5575ms | 0.2701ms | 3.7024 KOps/s | 3.6156 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3656ms | 0.1902ms | 5.2568 KOps/s | 5.3851 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1482ms | 71.9673μs | 13.8952 KOps/s | 13.3697 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1117ms | 48.8302μs | 20.4791 KOps/s | 19.9970 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4608ms | 0.2329ms | 4.2942 KOps/s | 4.1754 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3997ms | 0.1737ms | 5.7579 KOps/s | 5.6611 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2588ms | 0.1124ms | 8.8932 KOps/s | 8.5377 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1572ms | 77.6944μs | 12.8709 KOps/s | 12.8352 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2013ms | 77.4166μs | 12.9171 KOps/s | 12.7981 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1400ms | 70.8953μs | 14.1053 KOps/s | 14.3171 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4223ms | 0.1928ms | 5.1856 KOps/s | 5.0580 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.9143ms | 1.6980ms | 588.9114 Ops/s | 565.4061 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2815ms | 0.1919ms | 5.2104 KOps/s | 5.1506 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.2536ms | 1.0809ms | 925.1515 Ops/s | 878.1216 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.7266ms | 0.4223ms | 2.3678 KOps/s | 2.3678 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.9074ms | 4.0609ms | 246.2482 Ops/s | 243.0277 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1135ms | 34.0722μs | 29.3495 KOps/s | 28.1804 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.4604ms | 47.8963μs | 20.8785 KOps/s | 19.8234 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 79.4790μs | 29.7533μs | 33.6097 KOps/s | 32.4803 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 93.8260μs | 29.3939μs | 34.0206 KOps/s | 33.2448 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 86.1110μs | 29.6962μs | 33.6743 KOps/s | 32.6041 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 73.7280μs | 29.3438μs | 34.0788 KOps/s | 33.2632 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1576ms | 73.9741μs | 13.5182 KOps/s | 12.8876 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5132ms | 26.9670μs | 37.0823 KOps/s | 34.5049 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1376ms | 68.7079μs | 14.5544 KOps/s | 14.2884 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 69.3290μs | 23.2875μs | 42.9414 KOps/s | 41.6290 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1379ms | 69.2488μs | 14.4407 KOps/s | 14.3773 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 69.3500μs | 23.1192μs | 43.2541 KOps/s | 41.1644 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1512ms | 73.8986μs | 13.5321 KOps/s | 13.4313 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8440ms | 26.9164μs | 37.1521 KOps/s | 34.7144 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1659ms | 68.8990μs | 14.5140 KOps/s | 14.5626 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2892ms | 22.8269μs | 43.8080 KOps/s | 42.5100 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1398ms | 69.0616μs | 14.4798 KOps/s | 14.5712 KOps/s | |
test_compile_indexing[int-pytree-eager] | 77.2940μs | 22.9215μs | 43.6271 KOps/s | 41.8979 KOps/s | |
test_mod_add[eager] | 0.1152ms | 25.4514μs | 39.2905 KOps/s | 39.0850 KOps/s | |
test_mod_add[compile] | 91.6120μs | 38.6160μs | 25.8960 KOps/s | 25.5926 KOps/s | |
test_mod_add[compile-overhead] | 98.2140μs | 38.8274μs | 25.7550 KOps/s | 25.4751 KOps/s | |
test_mod_wrap[eager] | 0.3474ms | 0.2030ms | 4.9261 KOps/s | 4.6852 KOps/s | |
test_mod_wrap[compile] | 0.4443ms | 0.2243ms | 4.4584 KOps/s | 4.1913 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3617ms | 0.2224ms | 4.4956 KOps/s | 4.2126 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.2967ms | 10.8835ms | 91.8824 Ops/s | 88.5985 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.3223ms | 10.8155ms | 92.4599 Ops/s | 83.9952 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.4372ms | 10.9018ms | 91.7280 Ops/s | 81.3399 Ops/s | |
test_seq_add[eager] | 0.2065ms | 90.2388μs | 11.0817 KOps/s | 10.6327 KOps/s | |
test_seq_add[compile] | 0.1345ms | 65.0796μs | 15.3658 KOps/s | 14.9145 KOps/s | |
test_seq_add[compile-overhead] | 0.2802ms | 63.1141μs | 15.8443 KOps/s | 15.3937 KOps/s | |
test_seq_wrap[eager] | 0.6623ms | 0.3837ms | 2.6063 KOps/s | 2.5095 KOps/s | |
test_seq_wrap[compile] | 1.3074ms | 0.2632ms | 3.7991 KOps/s | 3.6272 KOps/s | |
test_seq_wrap[compile-overhead] | 1.3440ms | 0.2621ms | 3.8152 KOps/s | 3.6443 KOps/s | |
test_func_call_runtime[False-eager] | 0.5995ms | 0.4871ms | 2.0531 KOps/s | 1.8655 KOps/s | |
test_func_call_runtime[False-compile] | 0.6339ms | 0.4841ms | 2.0657 KOps/s | 1.9431 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 1.2356ms | 0.4913ms | 2.0356 KOps/s | 1.9485 KOps/s | |
test_func_call_runtime[True-eager] | 0.8172ms | 0.7070ms | 1.4145 KOps/s | 1.2986 KOps/s | |
test_func_call_runtime[True-compile] | 1.0531ms | 0.5011ms | 1.9956 KOps/s | 1.9153 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8414ms | 0.5009ms | 1.9965 KOps/s | 1.9077 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8321ms | 0.4909ms | 2.0371 KOps/s | 1.8427 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8921ms | 0.4885ms | 2.0472 KOps/s | 1.9506 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9383ms | 0.4868ms | 2.0541 KOps/s | 1.9455 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0819ms | 0.8616ms | 1.1606 KOps/s | 1.0967 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8381ms | 0.7097ms | 1.4090 KOps/s | 1.3225 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8575ms | 0.7079ms | 1.4127 KOps/s | 1.3218 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4854ms | 1.8641ms | 536.4592 Ops/s | 512.8263 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.7762ms | 1.9511ms | 512.5311 Ops/s | 496.1255 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.7719ms | 1.9243ms | 519.6738 Ops/s | 493.2836 Ops/s | |
test_distributed | 0.2821ms | 0.1238ms | 8.0781 KOps/s | 7.8254 KOps/s | |
test_tdmodule | 0.1189ms | 18.6799μs | 53.5336 KOps/s | 53.0995 KOps/s | |
test_tdmodule_dispatch | 65.5530μs | 36.9647μs | 27.0529 KOps/s | 26.3639 KOps/s | |
test_tdseq | 49.2420μs | 21.5577μs | 46.3871 KOps/s | 45.1439 KOps/s | |
test_tdseq_dispatch | 67.4360μs | 43.0471μs | 23.2304 KOps/s | 22.2179 KOps/s | |
test_instantiation_functorch | 1.7668ms | 1.5212ms | 657.3773 Ops/s | 627.0977 Ops/s | |
test_instantiation_td | 2.0159ms | 1.1599ms | 862.1517 Ops/s | 842.3335 Ops/s | |
test_exec_functorch | 0.4105ms | 0.1810ms | 5.5244 KOps/s | 5.3047 KOps/s | |
test_exec_functional_call | 0.3366ms | 0.1663ms | 6.0137 KOps/s | 5.6172 KOps/s | |
test_exec_td | 0.3830ms | 0.1912ms | 5.2307 KOps/s | 4.7037 KOps/s | |
test_exec_td_decorator | 1.2647ms | 0.2256ms | 4.4319 KOps/s | 4.1882 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.9840ms | 0.6740ms | 1.4838 KOps/s | 1.4200 KOps/s | |
test_vmap_mlp_speed[True-False] | 1.5970ms | 0.7610ms | 1.3140 KOps/s | 1.4254 KOps/s | |
test_vmap_mlp_speed[False-True] | 8.6105ms | 0.5719ms | 1.7486 KOps/s | 1.7864 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.9102ms | 0.5264ms | 1.8998 KOps/s | 1.8217 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9924ms | 0.6310ms | 1.5848 KOps/s | 1.5159 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1695ms | 0.6287ms | 1.5906 KOps/s | 1.5107 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7843ms | 0.5159ms | 1.9383 KOps/s | 1.8443 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7446ms | 0.5164ms | 1.9364 KOps/s | 1.8470 KOps/s | |
test_to_module_speed[True] | 2.3256ms | 1.4106ms | 708.9163 Ops/s | 699.1890 Ops/s | |
test_to_module_speed[False] | 1.9047ms | 1.3624ms | 733.9848 Ops/s | 714.3055 Ops/s | |
test_tc_init | 0.1325ms | 48.1454μs | 20.7704 KOps/s | 20.4392 KOps/s | |
test_tc_init_nested | 0.1728ms | 95.6428μs | 10.4556 KOps/s | 10.3447 KOps/s | |
test_tc_first_layer_tensor | 39.7950μs | 1.4890μs | 671.5709 KOps/s | 642.6344 KOps/s | |
test_tc_first_layer_nontensor | 37.1990μs | 4.6538μs | 214.8788 KOps/s | 214.0217 KOps/s | |
test_tc_second_layer_tensor | 23.2830μs | 2.7627μs | 361.9708 KOps/s | 350.8080 KOps/s | |
test_tc_second_layer_nontensor | 39.8750μs | 5.8090μs | 172.1476 KOps/s | 164.5517 KOps/s | |
test_unbind | 0.5008s | 14.0735ms | 71.0556 Ops/s | 72.5494 Ops/s | |
test_full_like | 10.0488ms | 8.5033ms | 117.6015 Ops/s | 127.4186 Ops/s | |
test_zeros_like | 3.5584ms | 2.8977ms | 345.1006 Ops/s | 340.6211 Ops/s | |
test_ones_like | 4.4911ms | 3.5654ms | 280.4728 Ops/s | 285.2310 Ops/s | |
test_clone | 7.2982ms | 5.6991ms | 175.4668 Ops/s | 191.6563 Ops/s | |
test_squeeze | 86.6720μs | 12.1880μs | 82.0476 KOps/s | 78.7862 KOps/s | |
test_unsqueeze | 0.1588ms | 89.2467μs | 11.2049 KOps/s | 10.5569 KOps/s | |
test_split | 0.3954ms | 0.1909ms | 5.2384 KOps/s | 4.9394 KOps/s | |
test_permute | 0.3872ms | 0.2154ms | 4.6429 KOps/s | 4.4424 KOps/s | |
test_stack | 28.6764ms | 26.4943ms | 37.7440 Ops/s | 39.7300 Ops/s | |
test_cat | 36.7037ms | 26.1818ms | 38.1945 Ops/s | 40.1681 Ops/s |
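
The `Change` column in these tables is empty. As a reading aid, here is a minimal sketch of how a row could be interpreted: it parses the unit suffixes, checks that `Ops` is roughly the reciprocal of `Mean`, and computes a relative difference against `Ops on Repo HEAD`. The helper names (`parse_quantity`, `relative_change`) and the definition of change as `(Ops - HEAD) / HEAD` are assumptions for illustration, not something the benchmark tooling is confirmed to use.

```python
import re

# Unit multipliers for the suffixes that appear in the table.
# Both the Greek mu and the micro sign are accepted for microseconds.
TIME_UNITS = {"s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6}
OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_quantity(text, units):
    """Convert a cell such as '0.3809ms' or '39.9681 KOps/s' to base units."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*(\S+)\s*", text)
    if match is None or match.group(2) not in units:
        raise ValueError(f"unrecognized quantity: {text!r}")
    return float(match.group(1)) * units[match.group(2)]


def relative_change(ops_pr, ops_head):
    """Hypothetical definition of 'Change': fractional gain over repo HEAD."""
    return (ops_pr - ops_head) / ops_head


# First data row of the table above, copied verbatim.
row = "test_plain_set_nested | 53.3800\u03bcs | 25.0200\u03bcs | 39.9681 KOps/s | 39.4212 KOps/s | |"
cells = [cell.strip() for cell in row.split("|")]

mean_s = parse_quantity(cells[2], TIME_UNITS)      # mean time per call, seconds
ops_pr = parse_quantity(cells[3], OPS_UNITS)       # throughput on this PR
ops_head = parse_quantity(cells[4], OPS_UNITS)     # throughput on repo HEAD

ops_from_mean = 1.0 / mean_s                       # should be close to ops_pr
change = relative_change(ops_pr, ops_head)

print(f"{cells[0]}: 1/mean = {ops_from_mean:.1f} Ops/s, change = {change:+.2%}")
```

For this row the reciprocal of the mean agrees with the reported `Ops` value to within rounding, and the relative difference is roughly +1.4%. Differences of that size are small enough that they may reflect run-to-run noise rather than a real change.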

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 94.3450μs | 18.1591μs | 55.0687 KOps/s | 56.5595 KOps/s | |
test_plain_set_stack_nested | 46.4320μs | 18.3920μs | 54.3716 KOps/s | 56.0996 KOps/s | |
test_plain_set_nested_inplace | 59.9230μs | 19.4651μs | 51.3740 KOps/s | 52.5217 KOps/s | |
test_plain_set_stack_nested_inplace | 54.3030μs | 19.5858μs | 51.0573 KOps/s | 52.3912 KOps/s | |
test_items | 25.9510μs | 2.8387μs | 352.2749 KOps/s | 351.5825 KOps/s | |
test_items_nested | 0.4176ms | 0.3398ms | 2.9426 KOps/s | 2.9339 KOps/s | |
test_items_nested_locked | 0.3750ms | 0.3378ms | 2.9606 KOps/s | 2.9312 KOps/s | |
test_items_nested_leaf | 86.0550μs | 62.3385μs | 16.0415 KOps/s | 15.9361 KOps/s | |
test_items_stack_nested | 0.4456ms | 0.3444ms | 2.9033 KOps/s | 2.9351 KOps/s | |
test_items_stack_nested_leaf | 98.8750μs | 64.4393μs | 15.5185 KOps/s | 15.4247 KOps/s | |
test_items_stack_nested_locked | 0.3696ms | 0.3414ms | 2.9292 KOps/s | 2.9386 KOps/s | |
test_keys | 28.0420μs | 3.3976μs | 294.3222 KOps/s | 292.7498 KOps/s | |
test_keys_nested | 95.1550μs | 71.3099μs | 14.0233 KOps/s | 14.0728 KOps/s | |
test_keys_nested_locked | 2.4709ms | 76.8099μs | 13.0192 KOps/s | 12.8970 KOps/s | |
test_keys_nested_leaf | 92.7750μs | 62.1053μs | 16.1017 KOps/s | 16.0817 KOps/s | |
test_keys_stack_nested | 0.1034ms | 71.6412μs | 13.9585 KOps/s | 14.0114 KOps/s | |
test_keys_stack_nested_leaf | 89.7340μs | 63.2744μs | 15.8042 KOps/s | 15.8235 KOps/s | |
test_keys_stack_nested_locked | 0.1109ms | 78.0420μs | 12.8136 KOps/s | 12.8390 KOps/s | |
test_values | 5.6087μs | 0.8608μs | 1.1617 MOps/s | 1.1919 MOps/s | |
test_values_nested | 76.1340μs | 49.0730μs | 20.3778 KOps/s | 20.5651 KOps/s | |
test_values_nested_locked | 83.7840μs | 50.6829μs | 19.7305 KOps/s | 20.0726 KOps/s | |
test_values_nested_leaf | 74.3840μs | 42.3151μs | 23.6322 KOps/s | 23.4391 KOps/s | |
test_values_stack_nested | 95.6450μs | 50.1254μs | 19.9500 KOps/s | 20.2612 KOps/s | |
test_values_stack_nested_leaf | 74.7440μs | 44.0773μs | 22.6874 KOps/s | 23.0130 KOps/s | |
test_values_stack_nested_locked | 80.1040μs | 51.4687μs | 19.4293 KOps/s | 19.5549 KOps/s | |
test_membership | 2.2286μs | 0.4984μs | 2.0063 MOps/s | 1.9756 MOps/s | |
test_membership_nested | 40.9820μs | 1.9569μs | 511.0035 KOps/s | 526.3374 KOps/s | |
test_membership_nested_leaf | 17.7210μs | 1.8577μs | 538.2895 KOps/s | 523.1507 KOps/s | |
test_membership_stacked_nested | 41.2420μs | 1.9477μs | 513.4361 KOps/s | 508.1804 KOps/s | |
test_membership_stacked_nested_leaf | 23.2110μs | 1.9548μs | 511.5743 KOps/s | 509.0857 KOps/s | |
test_membership_nested_last | 24.8910μs | 2.9795μs | 335.6225 KOps/s | 337.5107 KOps/s | |
test_membership_nested_leaf_last | 23.2610μs | 2.9782μs | 335.7756 KOps/s | 334.5817 KOps/s | |
test_membership_stacked_nested_last | 32.7120μs | 3.4994μs | 285.7592 KOps/s | 327.0336 KOps/s | |
test_membership_stacked_nested_leaf_last | 41.7120μs | 3.4773μs | 287.5811 KOps/s | 332.3641 KOps/s | |
test_nested_getleaf | 28.0210μs | 6.1106μs | 163.6506 KOps/s | 162.9005 KOps/s | |
test_nested_get | 29.8920μs | 5.7934μs | 172.6094 KOps/s | 172.1643 KOps/s | |
test_stacked_getleaf | 25.9610μs | 6.0576μs | 165.0821 KOps/s | 164.9229 KOps/s | |
test_stacked_get | 36.0920μs | 5.7910μs | 172.6809 KOps/s | 174.8111 KOps/s | |
test_nested_getitemleaf | 32.2320μs | 6.1508μs | 162.5811 KOps/s | 162.2485 KOps/s | |
test_nested_getitem | 26.2210μs | 5.7927μs | 172.6304 KOps/s | 172.5229 KOps/s | |
test_stacked_getitemleaf | 35.6620μs | 6.1091μs | 163.6907 KOps/s | 163.3225 KOps/s | |
test_stacked_getitem | 21.7110μs | 5.6611μs | 176.6445 KOps/s | 175.9428 KOps/s | |
test_lock_nested | 4.5354ms | 0.4401ms | 2.2725 KOps/s | 2.2632 KOps/s | |
test_lock_stack_nested | 0.4460ms | 0.3968ms | 2.5203 KOps/s | 2.5464 KOps/s | |
test_unlock_nested | 0.7782ms | 0.3715ms | 2.6915 KOps/s | 2.6520 KOps/s | |
test_unlock_stack_nested | 0.3671ms | 0.3328ms | 3.0053 KOps/s | 3.0223 KOps/s | |
test_flatten_speed | 0.1125ms | 76.9215μs | 13.0003 KOps/s | 13.0492 KOps/s | |
test_unflatten_speed | 0.3834ms | 0.3300ms | 3.0300 KOps/s | 3.0577 KOps/s | |
test_common_ops | 1.6554ms | 1.3381ms | 747.3095 Ops/s | 754.6756 Ops/s | |
test_creation | 21.7910μs | 1.4719μs | 679.4060 KOps/s | 678.8851 KOps/s | |
test_creation_empty | 51.4430μs | 18.0578μs | 55.3776 KOps/s | 57.5736 KOps/s | |
test_creation_nested_1 | 45.6520μs | 20.3602μs | 49.1155 KOps/s | 51.1718 KOps/s | |
test_creation_nested_2 | 83.1340μs | 21.9056μs | 45.6503 KOps/s | 45.4121 KOps/s | |
test_clone | 57.8430μs | 29.1474μs | 34.3083 KOps/s | 33.4741 KOps/s | |
test_getitem[int] | 1.3189ms | 16.0798μs | 62.1899 KOps/s | 59.5154 KOps/s | |
test_getitem[slice_int] | 0.1335ms | 28.2006μs | 35.4603 KOps/s | 34.5089 KOps/s | |
test_getitem[range] | 0.1476ms | 0.1088ms | 9.1894 KOps/s | 8.8480 KOps/s | |
test_getitem[tuple] | 0.1313ms | 24.3680μs | 41.0374 KOps/s | 41.2852 KOps/s | |
test_getitem[list] | 0.2016ms | 0.1000ms | 9.9960 KOps/s | 9.8716 KOps/s | |
test_setitem_dim[int] | 0.1270ms | 44.6685μs | 22.3871 KOps/s | 21.6715 KOps/s | |
test_setitem_dim[slice_int] | 91.5440μs | 67.8826μs | 14.7313 KOps/s | 14.5130 KOps/s | |
test_setitem_dim[range] | 0.1695ms | 0.1282ms | 7.7973 KOps/s | 7.6476 KOps/s | |
test_setitem_dim[tuple] | 86.8240μs | 61.2655μs | 16.3224 KOps/s | 16.1791 KOps/s | |
test_setitem | 75.8140μs | 44.6410μs | 22.4009 KOps/s | 22.4391 KOps/s | |
test_set | 0.1095ms | 44.0619μs | 22.6954 KOps/s | 23.0076 KOps/s | |
test_set_shared | 0.3659ms | 54.9636μs | 18.1939 KOps/s | 18.0586 KOps/s | |
test_update | 0.1107ms | 54.0859μs | 18.4891 KOps/s | 18.7140 KOps/s | |
test_update_nested | 0.1093ms | 60.4428μs | 16.5446 KOps/s | 16.4544 KOps/s | |
test_update__nested | 99.7050μs | 60.4492μs | 16.5428 KOps/s | 15.6716 KOps/s | |
test_set_nested | 86.2940μs | 46.4746μs | 21.5172 KOps/s | 21.8484 KOps/s | |
test_set_nested_new | 88.9850μs | 49.5548μs | 20.1797 KOps/s | 20.1936 KOps/s | |
test_select | 0.5015ms | 63.3523μs | 15.7847 KOps/s | 15.6813 KOps/s | |
test_select_nested | 78.7530μs | 41.7681μs | 23.9417 KOps/s | 23.7203 KOps/s | |
test_exclude_nested | 99.4050μs | 59.4546μs | 16.8196 KOps/s | 16.8494 KOps/s | |
test_empty[True] | 0.2880ms | 0.2580ms | 3.8761 KOps/s | 3.7804 KOps/s | |
test_empty[False] | 3.1731μs | 0.7392μs | 1.3527 MOps/s | 1.3422 MOps/s | |
test_to | 54.8830μs | 26.4904μs | 37.7495 KOps/s | 36.9746 KOps/s | |
test_to_nonblocking | 60.8430μs | 24.9973μs | 40.0043 KOps/s | 38.5780 KOps/s | |
test_unbind_speed | 1.5697ms | 0.2899ms | 3.4499 KOps/s | 3.4533 KOps/s | |
test_unbind_speed_stack0 | 0.3464ms | 0.2876ms | 3.4775 KOps/s | 3.5343 KOps/s | |
test_unbind_speed_stack1 | 91.9298ms | 0.7287ms | 1.3723 KOps/s | 1.4047 KOps/s | |
test_split | 94.0950ms | 2.2010ms | 454.3339 Ops/s | 434.6510 Ops/s | |
test_chunk | 94.3049ms | 2.1908ms | 456.4471 Ops/s | 432.6757 Ops/s | |
test_creation[device0] | 0.3298ms | 0.1270ms | 7.8759 KOps/s | 7.7971 KOps/s | |
test_creation_from_tensor | 0.3990ms | 0.1290ms | 7.7521 KOps/s | 7.6330 KOps/s | |
test_add_one[memmap_tensor0] | 0.2750ms | 8.8817μs | 112.5910 KOps/s | 110.4297 KOps/s | |
test_contiguous[memmap_tensor0] | 23.1810μs | 2.2307μs | 448.2820 KOps/s | 448.2500 KOps/s | |
test_stack[memmap_tensor0] | 35.6320μs | 6.6475μs | 150.4324 KOps/s | 142.9378 KOps/s | |
test_memmaptd_index | 1.1234ms | 0.4281ms | 2.3361 KOps/s | 2.2466 KOps/s | |
test_memmaptd_index_astensor | 0.7708ms | 0.5033ms | 1.9869 KOps/s | 1.9690 KOps/s | |
test_memmaptd_index_op | 1.4717ms | 1.0747ms | 930.5216 Ops/s | 921.3406 Ops/s | |
test_serialize_model | 0.1312s | 0.1309s | 7.6410 Ops/s | 7.6919 Ops/s | |
test_serialize_model_pickle | 1.3475s | 1.2122s | 0.8249 Ops/s | 0.8206 Ops/s | |
test_serialize_weights | 0.2221s | 0.1427s | 7.0071 Ops/s | 7.7429 Ops/s | |
test_serialize_weights_returnearly | 0.2191s | 56.0352ms | 17.8459 Ops/s | 17.5022 Ops/s | |
test_serialize_weights_pickle | 1.3730s | 1.2167s | 0.8219 Ops/s | 0.8220 Ops/s | |
test_reshape_pytree | 87.2940μs | 35.3559μs | 28.2838 KOps/s | 27.0274 KOps/s | |
test_reshape_td | 90.3240μs | 42.2968μs | 23.6425 KOps/s | 23.6769 KOps/s | |
test_view_pytree | 81.2040μs | 36.0725μs | 27.7220 KOps/s | 27.4229 KOps/s | |
test_view_td | 86.5450μs | 46.1661μs | 21.6609 KOps/s | 20.7161 KOps/s | |
test_unbind_pytree | 75.9230μs | 35.2578μs | 28.3625 KOps/s | 28.4316 KOps/s | |
test_unbind_td | 0.5239ms | 43.9439μs | 22.7563 KOps/s | 22.8208 KOps/s | |
test_split_pytree | 93.5640μs | 46.1924μs | 21.6486 KOps/s | 22.1953 KOps/s | |
test_split_td | 0.4713ms | 57.5467μs | 17.3772 KOps/s | 17.0933 KOps/s | |
test_add_pytree | 0.1454ms | 60.0033μs | 16.6658 KOps/s | 17.7596 KOps/s | |
test_add_td | 0.1440ms | 0.1046ms | 9.5613 KOps/s | 10.5779 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2328ms | 0.1611ms | 6.2083 KOps/s | 6.1726 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2917ms | 0.1654ms | 6.0443 KOps/s | 6.1328 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2083ms | 0.1450ms | 6.8973 KOps/s | 6.9338 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2381ms | 0.1834ms | 5.4524 KOps/s | 5.2996 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1005ms | 20.8125μs | 48.0480 KOps/s | 47.0397 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 77.5840μs | 49.0104μs | 20.4038 KOps/s | 19.7530 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2906ms | 64.5426μs | 15.4936 KOps/s | 15.4745 KOps/s | |
test_compile_copy_nested[pytree-eager] | 85.5440μs | 49.3214μs | 20.2752 KOps/s | 20.4643 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4258ms | 0.3234ms | 3.0923 KOps/s | 3.1197 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3170ms | 0.2401ms | 4.1647 KOps/s | 4.2561 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2227ms | 0.1286ms | 7.7749 KOps/s | 7.4455 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1053ms | 66.5697μs | 15.0219 KOps/s | 14.7062 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4711ms | 0.3212ms | 3.1132 KOps/s | 3.0694 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7064ms | 0.6239ms | 1.6029 KOps/s | 1.4364 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3684ms | 0.2887ms | 3.4642 KOps/s | 3.5709 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3789ms | 0.3260ms | 3.0679 KOps/s | 3.1033 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1179ms | 77.7951μs | 12.8543 KOps/s | 12.4816 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1757ms | 0.1310ms | 7.6320 KOps/s | 7.7374 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6653ms | 0.5288ms | 1.8912 KOps/s | 1.6864 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3730ms | 0.3208ms | 3.1173 KOps/s | 3.1391 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 57.8230μs | 19.3022μs | 51.8075 KOps/s | 51.3642 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 80.5530μs | 38.8061μs | 25.7691 KOps/s | 23.6251 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1238ms | 69.9257μs | 14.3009 KOps/s | 14.2410 KOps/s | |
test_compile_copy_flat[pytree-eager] | 82.6040μs | 50.9343μs | 19.6332 KOps/s | 19.3157 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.4274ms | 0.8450ms | 1.1834 KOps/s | 1.1089 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.6479ms | 3.2691ms | 305.8987 Ops/s | 304.1377 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3290ms | 0.8221ms | 1.2165 KOps/s | 1.1081 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.2709ms | 3.1567ms | 316.7894 Ops/s | 303.2997 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1701ms | 0.1092ms | 9.1545 KOps/s | 9.1488 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1919ms | 60.6901μs | 16.4772 KOps/s | 15.9317 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1504ms | 0.1040ms | 9.6165 KOps/s | 9.6208 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1534ms | 43.8246μs | 22.8183 KOps/s | 22.2996 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2033ms | 0.1037ms | 9.6406 KOps/s | 9.5413 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 89.1440μs | 43.0613μs | 23.2227 KOps/s | 22.0491 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1995ms | 0.1381ms | 7.2418 KOps/s | 7.2553 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1561ms | 24.8283μs | 40.2766 KOps/s | 37.5601 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1753ms | 0.1311ms | 7.6263 KOps/s | 7.4995 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 57.6130μs | 20.6637μs | 48.3941 KOps/s | 46.0420 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1811ms | 0.1326ms | 7.5394 KOps/s | 7.5306 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 55.3520μs | 20.4458μs | 48.9099 KOps/s | 46.7199 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1916ms | 0.1386ms | 7.2168 KOps/s | 7.1542 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5027ms | 24.3145μs | 41.1277 KOps/s | 37.1407 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1854ms | 0.1330ms | 7.5209 KOps/s | 7.5008 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 58.3630μs | 28.7004μs | 34.8427 KOps/s | 46.1948 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1746ms | 0.1324ms | 7.5531 KOps/s | 7.5012 KOps/s | |
test_compile_indexing[int-pytree-eager] | 57.1820μs | 20.5821μs | 48.5858 KOps/s | 39.1002 KOps/s | |
test_mod_add[eager] | 76.6540μs | 33.8482μs | 29.5436 KOps/s | 29.6159 KOps/s | |
test_mod_add[compile] | 0.1055ms | 67.7787μs | 14.7539 KOps/s | 13.8070 KOps/s | |
test_mod_add[compile-overhead] | 0.2616ms | 0.1361ms | 7.3496 KOps/s | 6.6274 KOps/s | |
test_mod_wrap[eager] | 0.9055ms | 0.7946ms | 1.2585 KOps/s | 1.2611 KOps/s | |
test_mod_wrap[compile] | 1.9298ms | 0.8425ms | 1.1869 KOps/s | 1.1882 KOps/s | |
test_mod_wrap[compile-overhead] | 4.9186ms | 3.0563ms | 327.1976 Ops/s | 324.6231 Ops/s | |
test_mod_wrap_and_backward[eager] | 4.1804ms | 4.0287ms | 248.2216 Ops/s | 241.3058 Ops/s | |
test_mod_wrap_and_backward[compile] | 4.2944ms | 4.0472ms | 247.0824 Ops/s | 241.1878 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3870ms | 0.9209ms | 1.0859 KOps/s | 978.8893 Ops/s | |
test_seq_add[eager] | 0.1419ms | 0.1034ms | 9.6708 KOps/s | 9.6389 KOps/s | |
test_seq_add[compile] | 0.1468ms | 79.1160μs | 12.6397 KOps/s | 12.0675 KOps/s | |
test_seq_add[compile-overhead] | 0.1660ms | 0.1141ms | 8.7627 KOps/s | 8.7051 KOps/s | |
test_seq_wrap[eager] | 1.1254ms | 0.9400ms | 1.0639 KOps/s | 1.0534 KOps/s | |
test_seq_wrap[compile] | 0.9453ms | 0.8568ms | 1.1672 KOps/s | 1.1729 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2715ms | 0.2199ms | 4.5485 KOps/s | 4.5162 KOps/s | |
test_func_call_runtime[False-eager] | 2.4327ms | 2.3588ms | 423.9449 Ops/s | 418.5620 Ops/s | |
test_func_call_runtime[False-compile] | 2.5411ms | 2.4097ms | 414.9936 Ops/s | 417.2169 Ops/s | |
test_func_call_runtime[False-compile-overhead] | 0.4154ms | 0.3599ms | 2.7787 KOps/s | 2.7557 KOps/s | |
test_func_call_runtime[True-eager] | 2.5917ms | 2.5165ms | 397.3713 Ops/s | 393.3121 Ops/s | |
test_func_call_runtime[True-compile] | 2.4900ms | 2.4192ms | 413.3574 Ops/s | 410.3641 Ops/s | |
test_func_call_runtime[True-compile-overhead] | 0.4345ms | 0.3814ms | 2.6217 KOps/s | 2.5925 KOps/s | |
test_func_call_cm_runtime[False-eager] | 2.4451ms | 2.3578ms | 424.1280 Ops/s | 423.7546 Ops/s | |
test_func_call_cm_runtime[False-compile] | 2.4984ms | 2.3993ms | 416.7904 Ops/s | 412.9805 Ops/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4223ms | 0.3621ms | 2.7616 KOps/s | 2.7432 KOps/s | |
test_func_call_cm_runtime[True-eager] | 2.7639ms | 2.6334ms | 379.7352 Ops/s | 376.8753 Ops/s | |
test_func_call_cm_runtime[True-compile] | 2.7004ms | 2.4625ms | 406.0948 Ops/s | 408.3351 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4551ms | 0.4049ms | 2.4698 KOps/s | 2.4330 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 4.1670ms | 3.7624ms | 265.7881 Ops/s | 265.1598 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.6021ms | 2.4851ms | 402.4037 Ops/s | 405.8004 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4959ms | 0.4114ms | 2.4309 KOps/s | 2.4154 KOps/s | |
test_distributed | 3.2177ms | 0.3131ms | 3.1934 KOps/s | 8.8422 KOps/s | |
test_tdmodule | 0.1221ms | 16.3919μs | 61.0056 KOps/s | 62.1633 KOps/s | |
test_tdmodule_dispatch | 61.3530μs | 31.9486μs | 31.3003 KOps/s | 32.5337 KOps/s | |
test_tdseq | 36.4610μs | 17.4014μs | 57.4667 KOps/s | 59.1538 KOps/s | |
test_tdseq_dispatch | 58.0530μs | 35.3585μs | 28.2818 KOps/s | 29.3953 KOps/s | |
test_instantiation_functorch | 2.0347ms | 1.8641ms | 536.4628 Ops/s | 519.8861 Ops/s | |
test_instantiation_td | 1.7880ms | 1.1985ms | 834.3578 Ops/s | 814.3373 Ops/s | |
test_exec_functorch | 1.0614ms | 1.0024ms | 997.5735 Ops/s | 993.7790 Ops/s | |
test_exec_functional_call | 1.1054ms | 0.9991ms | 1.0009 KOps/s | 984.4036 Ops/s | |
test_exec_td | 1.1170ms | 1.0339ms | 967.1808 Ops/s | 960.8452 Ops/s | |
test_exec_td_decorator | 1.7599ms | 1.0576ms | 945.5147 Ops/s | 931.3298 Ops/s | |
test_vmap_mlp_speed[True-True] | 1.3624ms | 1.2809ms | 780.6784 Ops/s | 777.0963 Ops/s | |
test_vmap_mlp_speed[True-False] | 1.3859ms | 1.2844ms | 778.5715 Ops/s | 782.0901 Ops/s | |
test_vmap_mlp_speed[False-True] | 1.3403ms | 1.1642ms | 858.9348 Ops/s | 860.3825 Ops/s | |
test_vmap_mlp_speed[False-False] | 1.2542ms | 1.1642ms | 858.9887 Ops/s | 856.9822 Ops/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.9971ms | 1.2538ms | 797.5615 Ops/s | 800.3945 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.3590ms | 1.2545ms | 797.1313 Ops/s | 794.7663 Ops/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.3367ms | 1.1660ms | 857.6344 Ops/s | 858.6826 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.2639ms | 1.1677ms | 856.3572 Ops/s | 860.8212 Ops/s | |
test_vmap_transformer_speed[True-True] | 13.3609ms | 13.1153ms | 76.2470 Ops/s | 75.3767 Ops/s | |
test_vmap_transformer_speed[True-False] | 13.2639ms | 13.0704ms | 76.5090 Ops/s | 75.3251 Ops/s | |
test_vmap_transformer_speed[False-True] | 13.2810ms | 12.8928ms | 77.5629 Ops/s | 76.8515 Ops/s | |
test_vmap_transformer_speed[False-False] | 12.9869ms | 12.9219ms | 77.3881 Ops/s | 76.4952 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 33.8700ms | 33.6856ms | 29.6863 Ops/s | 29.5403 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 34.0244ms | 33.7194ms | 29.6565 Ops/s | 29.4989 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 34.0312ms | 33.6259ms | 29.7390 Ops/s | 29.5847 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 33.7731ms | 33.5842ms | 29.7759 Ops/s | 29.5997 Ops/s | |
test_to_module_speed[True] | 1.5358ms | 0.9951ms | 1.0049 KOps/s | 993.2219 Ops/s | |
test_to_module_speed[False] | 1.4003ms | 0.9632ms | 1.0382 KOps/s | 1.0206 KOps/s | |
test_tc_init | 60.0630μs | 36.9607μs | 27.0558 KOps/s | 27.1113 KOps/s | |
test_tc_init_nested | 0.1214ms | 76.5382μs | 13.0654 KOps/s | 13.5149 KOps/s | |
test_tc_first_layer_tensor | 5.9917μs | 0.6750μs | 1.4815 MOps/s | 1.4676 MOps/s | |
test_tc_first_layer_nontensor | 23.6820μs | 2.2735μs | 439.8579 KOps/s | 450.4643 KOps/s | |
test_tc_second_layer_tensor | 17.7710μs | 1.3764μs | 726.5198 KOps/s | 735.4019 KOps/s | |
test_tc_second_layer_nontensor | 45.6120μs | 2.9740μs | 336.2518 KOps/s | 339.8299 KOps/s | |
test_unbind | 0.1982s | 12.3887ms | 80.7189 Ops/s | 89.9488 Ops/s | |
test_full_like | 0.6571ms | 0.5723ms | 1.7473 KOps/s | 1.7441 KOps/s | |
test_zeros_like | 0.3265ms | 0.1980ms | 5.0516 KOps/s | 5.0549 KOps/s | |
test_ones_like | 0.2287ms | 0.1978ms | 5.0556 KOps/s | 5.0589 KOps/s | |
test_clone | 0.4509ms | 0.4137ms | 2.4170 KOps/s | 2.4165 KOps/s | |
test_squeeze | 36.8420μs | 9.8555μs | 101.4665 KOps/s | 98.1107 KOps/s | |
test_unsqueeze | 0.2365ms | 73.8078μs | 13.5487 KOps/s | 12.8081 KOps/s | |
test_split | 0.4073ms | 0.1588ms | 6.2965 KOps/s | 6.0688 KOps/s | |
test_permute | 0.2328ms | 0.1775ms | 5.6328 KOps/s | 5.4087 KOps/s | |
test_stack | 1.2547ms | 0.8786ms | 1.1381 KOps/s | 1.1707 KOps/s | |
test_cat | 1.2568ms | 1.2314ms | 812.0839 Ops/s | 812.0048 Ops/s |
vmoens added a commit that referenced this pull request on Oct 4, 2024:
ghstack-source-id: 9e659036f70a1584a686453d4a4dd2c6a1cf932b
Pull Request resolved: #1021
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request
Stack from ghstack (oldest at bottom):