Add tools to build TF AOT models #9093

Merged
1 change: 1 addition & 0 deletions cmssw-tool-conf.spec
Original file line number Diff line number Diff line change
@@ -183,6 +183,7 @@ Requires: xtl
Requires: xgboost
Requires: pytorch

## INCLUDE tfaot-models
## INCLUDE cmssw-vectorization
## INCLUDE cmssw-drop-tools
## INCLUDE scram-tool-conf
12 changes: 12 additions & 0 deletions pip/cms-tfaot.file
@@ -0,0 +1,12 @@
BuildRequires: py3-pip py3-setuptools py3-wheel
Requires: py3-PyYAML py3-cmsml

%define github_user riga
%define tag a9ee8f7d7091015a496f4be0880763a23b5287ab
%define branch master
%define source0 git+https://github.com/%{github_user}/cms-tfaot.git?obj=%{branch}/%{tag}&export=%{n}-%{realversion}&output=/%{n}-%{realversion}-%{tag}.tgz

# copy test models
%define PipPostInstall \
mkdir -p %{i}/share; \
cp -r cmsdist-tmp/pip-req-build-*/test_models %{i}/share/
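The `source0` definition above is built from nested `%{...}` macros. As a rough illustration of how such a template resolves, here is a minimal Python sketch that substitutes the macros by hand. The helper `expand_macros` is hypothetical, and the values used for `%{n}` and `%{realversion}` are assumptions (the build system sets them); this is not how rpm or cmsBuild performs expansion.

```python
import re

def expand_macros(template, macros):
    """Substitute %{name} placeholders until no known macro is left."""
    pattern = re.compile(r"%\{(\w+)\}")

    def substitute(match):
        # Leave unknown macros untouched instead of guessing a value.
        return macros.get(match.group(1), match.group(0))

    previous = None
    while previous != template:
        previous = template
        template = pattern.sub(substitute, template)
    return template

macros = {
    "github_user": "riga",
    "tag": "a9ee8f7d7091015a496f4be0880763a23b5287ab",
    "branch": "master",
    # Assumed values for illustration only.
    "n": "py3-cms-tfaot",
    "realversion": "1.0.0",
}

source0 = (
    "git+https://github.com/%{github_user}/cms-tfaot.git"
    "?obj=%{branch}/%{tag}&export=%{n}-%{realversion}"
    "&output=/%{n}-%{realversion}-%{tag}.tgz"
)

url = expand_macros(source0, macros)
print(url)
```

Pinning `%{tag}` to a full commit hash, as the file does, keeps the fetched source reproducible even if the branch moves.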
3 changes: 2 additions & 1 deletion pip/requirements.txt
@@ -56,7 +56,8 @@ charset-normalizer==3.1.0
cleo==2.0.1
click==8.1.3
clikit==0.6.2
-cmsml==0.2.2
+cmsml==0.2.5
+cms-tfaot==1.0.0
contourpy==1.0.7
correctionlib==2.2.2
crashtest==0.4.1
1 change: 1 addition & 0 deletions python_tools.spec
@@ -12,6 +12,7 @@ Requires: py3-keras
Requires: py3-scikit-learn
#save for the end
Requires: py3-tensorflow
Requires: py3-cms-tfaot
Requires: py3-cmsml
Requires: py3-law
Requires: py3-protobuf
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
<tool name="tensorflow-xla-runtime" version="@TOOL_VERSION@">
<client>
<environment name="TENSORFLOW_XLA_RUNTIME_BASE" default="@TOOL_ROOT@"/>
<environment name="LIBDIR" default="$TENSORFLOW_XLA_RUNTIME_BASE/lib/archive"/>
<environment name="LIBDIR" default="$TENSORFLOW_XLA_RUNTIME_BASE/lib"/>
</client>
<lib name="tf_xla_runtime-static"/>
<lib name="tf_xla_runtime"/>

<use name="eigen"/>
<use name="tensorflow-includes"/>
</tool>
12 changes: 12 additions & 0 deletions tensorflow-xla-runtime-absl.patch
@@ -0,0 +1,12 @@
--- tensorflow/xla_aot_runtime_src/CMakeLists.txt 2024-03-24 08:28:34.000000000 +0100
+++ tensorflow/xla_aot_runtime_src/CMakeLists.txt 2024-03-25 11:17:58.108587945 +0100
@@ -14,6 +14,8 @@
-Wno-sign-compare
)

-add_library(tf_xla_runtime STATIC
+find_package(absl REQUIRED)
+add_library(tf_xla_runtime SHARED
$<TARGET_OBJECTS:tf_xla_runtime_objects>
)
+target_link_libraries(tf_xla_runtime absl::strings absl::str_format_internal)
23 changes: 17 additions & 6 deletions tensorflow-xla-runtime.spec
@@ -4,28 +4,39 @@

Source99: scram-tools.file/tools/eigen/env

Requires: eigen py3-tensorflow
Patch0: tensorflow-xla-runtime-absl

Requires: eigen py3-tensorflow abseil-cpp
BuildRequires: cmake

%prep

cp -r ${PY3_TENSORFLOW_ROOT}/lib/python%{cms_python3_major_minor_version}/site-packages/tensorflow .
%patch -p0

%build

source %{_sourcedir}/env
export CPATH="${CPATH}:${EIGEN_ROOT}/include/eigen3"

CXXFLAGS="-fPIC %{arch_build_flags} ${CMS_EIGEN_CXX_FLAGS}"
CXXFLAGS="-fPIC -Wl,-z,defs %{arch_build_flags} ${CMS_EIGEN_CXX_FLAGS}"
%ifarch x86_64
CXXFLAGS="${CXXFLAGS} -msse3"
CXXFLAGS="${CXXFLAGS} -msse3"
%endif

pushd tensorflow/xla_aot_runtime_src
cmake . -DCMAKE_CXX_FLAGS="${CXXFLAGS}" -DCMAKE_CXX_STANDARD=%{cms_cxx_standard} -DBUILD_SHARED_LIBS=OFF
# remove unnecessary implementations that use symbols that are not even existing
rm tensorflow/compiler/xla/service/cpu/runtime_fork_join.cc

cmake . \
-DCMAKE_CXX_FLAGS="${CXXFLAGS}" \
-DCMAKE_CXX_STANDARD=%{cms_cxx_standard} \
-DCMAKE_PREFIX_PATH=${ABSEIL_CPP_ROOT} \
-DBUILD_SHARED_LIBS=ON
make %{makeprocesses}
popd

%install

mkdir -p %{i}/lib/archive
mv tensorflow/xla_aot_runtime_src/libtf_xla_runtime.a %{i}/lib/archive/libtf_xla_runtime-static.a
mkdir -p %{i}/lib
mv tensorflow/xla_aot_runtime_src/libtf_xla_runtime.so %{i}/lib/
41 changes: 41 additions & 0 deletions tfaot-compile.file
@@ -0,0 +1,41 @@
## tfaot common compilation and requirement file
## specs including this file should provide:
## 1. a variable %{aot_config}, pointing to the aot config file of the model to compile (required)
## 2. a variable %{aot_source}, referring to a fetched source to unpack during %prep (optional)
## (in this case a "Source" should be defined and %{aot_source} is likely %{n}-%{realversion})

BuildRequires: py3-cms-tfaot
Requires: tensorflow-xla-runtime

%ifarch ppc64le
%define build_arch powerpc64le-unknown-linux-gnu
%else
%define build_arch %{_arch}-unknown-linux-gnu
%endif

%prep
%if "%{?aot_source}"
%setup -n %{aot_source}
%endif

%build
cms_tfaot_compile \
--aot-config "%{aot_config}" \
--tool-name "%{n}" \
--tool-base "%{i}" \
--output-directory compiled_model \
--additional-flags="--target_triple %{build_arch}"
Contributor

Thanks @riga, looks like this worked (at least the ppc64le build was successful).

Contributor

@smuzaffar smuzaffar Mar 28, 2024

@riga, I guess the cms_tfaot_compile tool is used by developers to work with new models, right? Do we want developers to pass --additional-flags/--target_triple explicitly, or could you set some default values inside the script?

Contributor Author

@riga riga Mar 28, 2024

Exactly. When run with --dev the tool (almost) mimics what the tfaot-compile.file does. The default arch for the --target_triple is x86_64 so I guess this should apply in most cases. However, in dev mode we could check which arch is currently used and issue a warning, together with instructions on how to change the triple.
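The architecture check described in this comment could look roughly like the following Python sketch. It mirrors the triple mapping from tfaot-compile.file (ppc64le becomes `powerpc64le-unknown-linux-gnu`, everything else `<arch>-unknown-linux-gnu`) and warns when the host differs from the x86_64 default. The names `target_triple`, `check_host_arch` and `DEFAULT_ARCH` are hypothetical; this is not the code from the cms-tfaot repository.

```python
import platform
import warnings

# Default architecture for --target_triple, as stated in the comment above.
DEFAULT_ARCH = "x86_64"

def target_triple(machine):
    """Map a machine name to a triple, following the %ifarch logic in tfaot-compile.file."""
    arch = "powerpc64le" if machine == "ppc64le" else machine
    return f"{arch}-unknown-linux-gnu"

def check_host_arch(machine=None):
    """Return the triple for the host and warn if it differs from the default arch."""
    machine = machine or platform.machine()
    triple = target_triple(machine)
    if machine != DEFAULT_ARCH:
        warnings.warn(
            f"host architecture is {machine!r}, but the default target is "
            f"{DEFAULT_ARCH!r}; consider passing "
            f'--additional-flags="--target_triple {triple}"'
        )
    return triple

print(check_host_arch())
```

Returning the suggested triple alongside the warning lets a caller either print instructions or apply the value directly.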

Contributor Author

@smuzaffar I just added a warning in cms-externals/cms-tfaot@a03ccef. Was your plan to move the project to cms-externals? If not, I would adjust the commit hash in this PR.

Contributor

Was your plan to move the project to cms-externals? If not, I would adjust the commit hash in this PR.

Right, we should move it to cms-externals. I have added you to the developers of cms-externals. Can you please transfer the repo there? If you do not want to transfer it, I can fork it.

Contributor Author

I moved it and updated the reference in the spec.


%install
mkdir -p %{i}/lib
mv compiled_model/*.o %{i}/lib/

mkdir -p %{i}/include/%{n}
mv compiled_model/*.h %{i}/include/%{n}

mkdir -p %{i}/etc/scram.d
mv compiled_model/%{n}.xml %{i}/etc/scram.d/

%post
%{relocateConfig}etc/scram.d/%{n}.xml
Contributor

@smuzaffar smuzaffar Mar 27, 2024

@riga, there is a hard-coded build path in tfaot-model-*/version/include/tfaot-model-test-multi/test_multi.h. Although it is only in a comment section, it is better to run relocation on these generated header files. I would suggest adding the following line here to relocate all generated header files:

%relocateConfigAll include/%{n} *.h

e.g.

#ifndef TFAOT_MODEL_TEST_MULTI_H
#define TFAOT_MODEL_TEST_MULTI_H

/*
 * Auto-generated AOT wrapper for
 *   model path  : /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/py3-cms-tfaot/1.0.0-b8bfd2f94522378f26f348963d4ff4ac/share/test_models/multi/saved_model
 *   prefix      : test_multi
 *   namespace   : tfaot_model
 *   class name  : test_multi
 *   batch sizes : 1, 2, 4
 */
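To illustrate why relocation matters for these headers, the sketch below rewrites a build-time prefix to an install-time prefix in every `*.h` file of a directory. It is a simplified stand-in under stated assumptions: the function `relocate_headers` and the example paths are hypothetical, and the actual mechanism behind `%relocateConfigAll` is not shown in this PR and may work differently.

```python
import tempfile
from pathlib import Path

def relocate_headers(include_dir, build_prefix, install_prefix):
    """Replace build_prefix with install_prefix in all *.h files; return the changed names."""
    changed = []
    for header in sorted(Path(include_dir).glob("*.h")):
        text = header.read_text()
        if build_prefix in text:
            header.write_text(text.replace(build_prefix, install_prefix))
            changed.append(header.name)
    return changed

# Small demonstration with made-up paths in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    header = Path(tmp) / "test_multi.h"
    header.write_text("/* model path : /build/area/share/test_models/multi */\n")
    print(relocate_headers(tmp, "/build/area", "/install/area"))
    print(header.read_text())
```

Even though the path only appears in a comment, leaving it in place would ship a reference to a build directory that does not exist on the installed system.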

Contributor Author

@riga riga Mar 27, 2024

Added 👍
Plus, the last commit includes some final changes to the aot python tools.
The accompanying cmssw PR is also updated with adjustments to the unit tests of the dev workflow which failed tonight.

%relocateConfigAll include/%{n} *.h
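The `%install` section of tfaot-compile.file sorts the compiler output into three locations: object files into `lib/`, headers into `include/<tool name>/`, and the tool XML into `etc/scram.d/`. The following Python sketch reproduces that layout for illustration. The function `install_compiled_model` is hypothetical and the file names in the demonstration are made up; the spec itself does this with `mkdir -p` and `mv`.

```python
import shutil
import tempfile
from pathlib import Path

def install_compiled_model(compiled_dir, prefix, tool_name):
    """Move compiled artifacts into the layout used by the %install section."""
    compiled_dir, prefix = Path(compiled_dir), Path(prefix)
    targets = {
        "*.o": prefix / "lib",
        "*.h": prefix / "include" / tool_name,
        f"{tool_name}.xml": prefix / "etc" / "scram.d",
    }
    installed = []
    for pattern, destination in targets.items():
        destination.mkdir(parents=True, exist_ok=True)
        for source in sorted(compiled_dir.glob(pattern)):
            target = destination / source.name
            shutil.move(str(source), str(target))
            installed.append(target.relative_to(prefix).as_posix())
    return installed

# Demonstration with placeholder files in temporary directories.
with tempfile.TemporaryDirectory() as build, tempfile.TemporaryDirectory() as root:
    for name in ("model_bs1.o", "model.h", "tfaot-model-demo.xml"):
        (Path(build) / name).write_text("")
    for path in install_compiled_model(build, root, "tfaot-model-demo"):
        print(path)
```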
5 changes: 5 additions & 0 deletions tfaot-model-test-multi.spec
@@ -0,0 +1,5 @@
### RPM external tfaot-model-test-multi 1.0.0

%define aot_config $PY3_CMS_TFAOT_ROOT/share/test_models/multi/aot_config.yaml

## INCLUDE tfaot-compile
5 changes: 5 additions & 0 deletions tfaot-model-test-simple.spec
@@ -0,0 +1,5 @@
### RPM external tfaot-model-test-simple 1.0.0

%define aot_config $PY3_CMS_TFAOT_ROOT/share/test_models/simple/aot_config.yaml

## INCLUDE tfaot-compile
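Both test-model specs follow the same short pattern: an `### RPM external` header, an `aot_config` definition, and the `tfaot-compile` include. As a sketch of that pattern, the Python helper below renders such a spec from three inputs. The function `render_model_spec` is hypothetical and not part of cmsdist or cms-tfaot; it only restates the structure visible in the two files.

```python
def render_model_spec(name, version, aot_config):
    """Render a minimal model spec following the pattern of the test-model specs."""
    lines = [
        f"### RPM external {name} {version}",
        "",
        f"%define aot_config {aot_config}",
        "",
        "## INCLUDE tfaot-compile",
    ]
    return "\n".join(lines) + "\n"

spec = render_model_spec(
    "tfaot-model-test-simple",
    "1.0.0",
    "$PY3_CMS_TFAOT_ROOT/share/test_models/simple/aot_config.yaml",
)
print(spec, end="")
```

A model whose sources must be fetched would additionally need a `Source` line and an `%{aot_source}` definition, as noted in the header comments of tfaot-compile.file.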
3 changes: 3 additions & 0 deletions tfaot-models.file
@@ -0,0 +1,3 @@
# test models needed by unit tests in PhysicsTools/TensorFlowAOT
Requires: tfaot-model-test-simple
Requires: tfaot-model-test-multi