libtorch: new recipe #24759

valgur · 2024-07-30T06:28:42Z

Summary

Changes to recipe: libtorch/2.4.0

Motivation

Tensors and Dynamic neural networks in Python with strong GPU acceleration.

Details

Continues from #5100 by @SpaceIm.

The CUDA, HIP and SYCL backends are currently disabled because the PR is complex enough already; they can be addressed in a follow-up PR. Vulkan and Metal (TODO) should already be usable as GPU backends.

The distributed feature is disabled as well, both to limit the scope and because openmpi is not yet available (#18980).

Android and iOS builds are probably broken and need testing.

Non-OpenBLAS BLAS backends are probably not usable due to OpenBLAS being required for LAPACK. A separate LAPACK recipe would be required to fix that (such as #23798).

Closes #6861.

TODO:

Export missing CMake variables.
Test with Metal on macOS.
Submit bugfix patches upstream.
Create a recipe for pocketfft and unvendor.

Read the contributing guidelines
Checked that this PR is not a duplicate: list of PRs by recipe
Tested locally with at least one configuration using a recent version of Conan

XNNPACK was not correctly added to project dependencies. Prefer namespaced targets, if possible.

github-actions · 2024-07-30T22:53:09Z

Hooks produced the following warnings for commit 87a1370

libtorch/2.4.0@#f680755600363ae5e29186ad5b798792

post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './third_party/kineto/libkineto/third_party/dynolog/hbt/src/perf_event/json_events/generated/intel/sapphirerapids_uncore_experimental.cpp' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './third_party/kineto/libkineto/third_party/dynolog/third_party/googletest/googlemock/include/gmock/internal/custom/gmock-generated-actions.h' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './third_party/kineto/libkineto/third_party/dynolog/third_party/json/doc/mkdocs/docs/api/byte_container_with_subtype/byte_container_with_subtype.md' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './third_party/kineto/libkineto/third_party/dynolog/third_party/json/test/reports/2016-09-09-nativejson_benchmark/conformance_overall_Result.png' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM/testing/python3/dcptestautomation/parse_dcgmproftester_single_metric.py' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM/testing/python3/tests/nvswitch_tests/test_nvswitch_with_running_fm.py' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM/scripts/verify_package_contents/datacenter-gpu-manager_VERSION_arm64.deb.txt' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './test/dynamo_expected_failures/TestExpandedWeightFunctionalCPU.test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_group_norm_cpu_float64' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_source(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './test/dynamo_skips/TestProxyTensorOpInfoCPU.test_make_fx_symbolic_exhaustive_inplace_nn_functional_feature_alpha_dropout_without_train_cpu_float32' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_package(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './include/ATen/native/transformers/cuda/mem_eff_attention/iterators/predicated_tile_iterator_residual_last.h' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_package(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './include/ATen/native/transformers/cuda/mem_eff_attention/epilogue/epilogue_thread_apply_logsumexp.h' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
post_package(): WARN: [SHORT_PATHS USAGE (KB-H066)] The file './include/ATen/ops/max_pool2d_with_indices_backward_compositeexplicitautogradnonfunctional_dispatch.h' has a very long path and may exceed Windows max path length. Add 'short_paths = True' in your recipe.
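For anyone wanting to reproduce this kind of check locally before pushing, here is a minimal sketch of a path-length scan. It is not the actual KB-H066 hook: the function name `find_long_paths`, the `prefix_len` parameter and the 260-character threshold (the classic Windows MAX_PATH) are assumptions made for illustration, and the real hook may use a different limit.

```python
import os
import tempfile

# Assumption: 260 is the classic Windows MAX_PATH limit. The real hook may use
# a different (likely smaller) threshold to leave room for the cache prefix.
WINDOWS_MAX_PATH = 260


def find_long_paths(root, prefix_len=0, limit=WINDOWS_MAX_PATH):
    """Return relative file paths under `root` whose length, plus an assumed
    cache-prefix length, reaches or exceeds `limit` characters."""
    long_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if prefix_len + len(rel) >= limit:
                long_paths.append(rel)
    return sorted(long_paths)


if __name__ == "__main__":
    # Tiny demo on a throwaway tree; a low limit is used so something is flagged.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "include", "ATen"))
        open(os.path.join(root, "include", "ATen", "some_long_header.h"), "w").close()
        for path in find_long_paths(root, limit=20):
            print("too long:", path)
```

Pointing `find_long_paths` at an unpacked source or package folder, with `prefix_len` set to the length of the local cache path, gives a rough idea of which files would trigger the warning.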

conan-center-bot · 2024-08-13T01:16:43Z

Conan v1 pipeline ❌

Failure in build 6 (ff36ad93e96684208459986e2a5088b01a02883d):

libtorch/2.4.0:
An unexpected error happened and has been reported

Note: To save resources, CI tries to finish as soon as an error is found. For this reason you might find that not all the references have been launched or not all the configurations for a given reference. Also, take into account that we cannot guarantee the order of execution as it depends on CI workload and workers availability.

Conan v2 pipeline ❌

Note: Conan v2 builds are now mandatory. Please read our discussion about it.

The v2 pipeline failed. Please, review the errors and note this is required for pull requests to be merged. In case this recipe is still not ported to Conan 2.x, please, ping @conan-io/barbarians on the PR and we will help you.

Failure in build 8 (ff36ad93e96684208459986e2a5088b01a02883d):

libtorch/2.4.0:
CI failed to create some packages (All logs)

Logs for packageID 999239f19123416d584ffc8c46c1df33a363bf09:

[settings]
arch=armv8
build_type=Release
compiler=apple-clang
compiler.cppstd=17
compiler.libcxx=libc++
compiler.version=13
os=Macos
[options]
*/*:shared=False

[...]
--   USE_NCCL              : OFF
--   USE_NNPACK            : OFF
--   USE_NUMPY             : OFF
--   USE_OBSERVERS         : False
--   USE_OPENCL            : False
--   USE_OPENMP            : False
--   USE_MIMALLOC          : False
--   USE_VULKAN            : False
--   USE_PROF              : OFF
--   USE_PYTORCH_QNNPACK   : True
--   USE_XNNPACK           : True
--   USE_DISTRIBUTED       : OFF
--   Public Dependencies  : 
--   Private Dependencies : cpuinfo;fp16::fp16;fmt::fmt;pthreadpool::pthreadpool;flatbuffers::flatbuffers;xnnpack::xnnpack;Threads::Threads;cpuinfo;pytorch_qnnpack;fp16;onnx::onnx;foxi_loader;fmt::fmt-header-only;kineto
--   Public CUDA Deps.    : 
--   Private CUDA Deps.   : 
--   USE_COREML_DELEGATE     : False
--   BUILD_LAZY_TS_BACKEND   : True
--   USE_ROCM_KERNEL_ASSERT : OFF
-- Configuring done (5.6s)
-- Generating done (0.5s)
-- Build files have been written to: /Users/jenkins/workspace/prod-v2/bsr/75828/debae/p/b/libto3d26c80da6c4e/b/build/Release
[  0%] Linking C static library ../../lib/libfxdiv.a
[  0%] Built target clog
[  0%] Built target libkineto_defs.bzl
ar: no archive members specified
usage:  ar -d [-TLsv] archive file ...
	ar -m [-TLsv] archive file ...
	ar -m [-abiTLsv] position archive file ...
	ar -p [-TLsv] archive [file ...]
	ar -q [-cTLsv] archive file ...
	ar -r [-cuTLsv] archive file ...
	ar -r [-abciuTLsv] position archive file ...
	ar -t [-TLsv] archive [file ...]
	ar -x [-ouTLsv] archive [file ...]
make[2]: *** [lib/libfxdiv.a] Error 1
make[1]: *** [confu-deps/pytorch_qnnpack/CMakeFiles/fxdiv.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[  0%] Built target kineto_api
[  1%] Built target kineto_base
[  8%] Built target c10
[  8%] Built target ATEN_CPU_FILES_GEN_TARGET
make: *** [all] Error 2

libtorch/2.4.0: ERROR: 
Package '999239f19123416d584ffc8c46c1df33a363bf09' build failed
libtorch/2.4.0: WARN: Build folder /Users/jenkins/workspace/prod-v2/bsr/75828/debae/p/b/libto3d26c80da6c4e/b/build/Release
ERROR: libtorch/2.4.0: Error in build() method, line 497
	cmake.build(cli_args=["--parallel", "1"])
	ConanException: Error 2 while executing

hasB4K · 2024-09-26T09:52:58Z

Hello @valgur, thanks for this amazing PR. Do you plan to continue working on it? 🤞 Having libtorch in Conan would be so neat. Since OpenMPI is now available, do you plan to let the user enable the distributed feature?

keef-cognitiv · 2024-10-04T23:44:47Z