[ML-108] Update PCA GPU, LiR CPU and Improve JAR packaging and libs loading (#111)

* Update build scripts for CPU_GPU_PROFILE

* Refactor and add PCA GPU

* Update load gpu libs and use checkClusterPlatformCompatibility to load libs

* update LinearRegression for checkClusterPlatformCompatibility

* Update pom and README

* Update README
xwu99 authored Aug 4, 2021
1 parent 7bf73e4 commit a639def
Showing 28 changed files with 535 additions and 144 deletions.
13 changes: 9 additions & 4 deletions README.md
@@ -10,7 +10,10 @@ OAP MLlib is an optimized package to accelerate machine learning algorithms in

## Compatibility

OAP MLlib tried to maintain the same API interfaces and produce same results that are identical with Spark MLlib. However due to the nature of float point operations, there may be some small deviation from the original result, we will try our best to make sure the error is within acceptable range.
OAP MLlib maintains the same API interfaces as Spark MLlib, so an application built with Spark MLlib can run directly with minimal configuration.

Most of the algorithms produce results identical to those of Spark MLlib. However, due to the nature of distributed floating-point operations, there may be small deviations from the original result. We will make sure the error is within an acceptable range and the accuracy is on par with Spark MLlib.

For algorithms that are not accelerated by OAP MLlib, the original Spark MLlib implementation will be used.

## Online Documentation
@@ -216,8 +219,10 @@ als-pyspark | ALS example for PySpark

Algorithm | Category | Maturity
------------------|----------|-------------
K-Means | CPU, GPU | Experimental
PCA | CPU | Experimental
ALS | CPU | Experimental
K-Means | CPU | Stable
K-Means | GPU | Experimental
PCA | CPU | Stable
PCA | GPU | Experimental
ALS | CPU | Stable
Naive Bayes | CPU | Experimental
Linear Regression | CPU | Experimental
85 changes: 85 additions & 0 deletions mllib-dal/build-cpu-gpu.sh
@@ -0,0 +1,85 @@
#!/usr/bin/env bash

# Check envs for building
if [[ -z $JAVA_HOME ]]; then
  echo JAVA_HOME not defined!
  exit 1
fi

if [[ -z $(which mvn) ]]; then
  echo Maven not found!
  exit 1
fi

if [[ -z $DAALROOT ]]; then
  echo DAALROOT not defined!
  exit 1
fi

if [[ -z $TBBROOT ]]; then
  echo TBBROOT not defined!
  exit 1
fi

if [[ -z $CCL_ROOT ]]; then
  echo CCL_ROOT not defined!
  exit 1
fi

versionArray=(
  spark-3.0.0 \
  spark-3.0.1 \
  spark-3.0.2 \
  spark-3.1.1
)

SPARK_VER=spark-3.1.1
MVN_NO_TRANSFER_PROGRESS=

print_usage() {
  echo
  echo Usage: ./build.sh [-p spark-x.x.x] [-q] [-h]
  echo
  echo Supported Spark versions:
  for version in ${versionArray[*]}
  do
    echo " $version"
  done
  echo
}

while getopts "hqp:" opt
do
  case $opt in
    p) SPARK_VER=$OPTARG ;;
    q) MVN_NO_TRANSFER_PROGRESS=--no-transfer-progress ;;
    h | *)
      print_usage
      exit 1
      ;;
  esac
done

if [[ ! ${versionArray[*]} =~ $SPARK_VER ]]; then
  echo Error: $SPARK_VER version is not supported!
  exit 1
fi

export PLATFORM_PROFILE=CPU_GPU_PROFILE

print_usage

echo === Building Environments ===
echo JAVA_HOME=$JAVA_HOME
echo DAALROOT=$DAALROOT
echo TBBROOT=$TBBROOT
echo CCL_ROOT=$CCL_ROOT
echo Maven Version: $(mvn -v | head -n 1 | cut -f3 -d" ")
echo Clang Version: $(clang -dumpversion)
echo Spark Version: $SPARK_VER
echo Platform Profile: $PLATFORM_PROFILE
echo =============================
echo
echo Building with $SPARK_VER ...
echo
mvn $MVN_NO_TRANSFER_PROGRESS -P$SPARK_VER -DskipTests clean package
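
Reviewer note on the version check in this script: `[[ ! ${versionArray[*]} =~ $SPARK_VER ]]` matches `$SPARK_VER` as a regular expression against the space-joined array, so a partial value such as `spark-3.0` would be accepted. A stricter variant could compare each element exactly. The sketch below is only an illustration; `is_supported_version` is a hypothetical helper name that does not appear in the commit, and the sample inputs are made up.

```shell
#!/usr/bin/env bash

# Supported versions, copied from the script in this commit.
versionArray=(spark-3.0.0 spark-3.0.1 spark-3.0.2 spark-3.1.1)

# Hypothetical helper (not part of the commit): returns 0 only when the
# argument equals one array element exactly, and 1 otherwise.
is_supported_version() {
  local candidate=$1 version
  for version in "${versionArray[@]}"; do
    # Quoting the right-hand side forces a literal string comparison.
    [[ $version == "$candidate" ]] && return 0
  done
  return 1
}

# Made-up sample inputs: one exact version and two partial strings.
for v in spark-3.1.1 spark-3.0 spark; do
  if is_supported_version "$v"; then
    echo "$v: supported"
  else
    echo "$v: not supported"
  fi
done
```

With this form, only the four listed profile names pass, which keeps the value handed to `mvn -P` aligned with a profile id that exists in the pom.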
3 changes: 3 additions & 0 deletions mllib-dal/build.sh
@@ -67,6 +67,8 @@ fi

print_usage

export PLATFORM_PROFILE=CPU_ONLY_PROFILE

echo === Building Environments ===
echo JAVA_HOME=$JAVA_HOME
echo DAALROOT=$DAALROOT
@@ -75,6 +77,7 @@ echo CCL_ROOT=$CCL_ROOT
echo Maven Version: $(mvn -v | head -n 1 | cut -f3 -d" ")
echo Clang Version: $(clang -dumpversion)
echo Spark Version: $SPARK_VER
echo Platform Profile: $PLATFORM_PROFILE
echo =============================
echo
echo Building with $SPARK_VER ...
22 changes: 19 additions & 3 deletions mllib-dal/pom.xml
@@ -24,6 +24,9 @@
<ccl.lib>libccl.so</ccl.lib>
<ccl.fabric.lib>libfabric.so.1</ccl.fabric.lib>
<ccl.mpi.lib>libmpi.so.12.0.0</ccl.mpi.lib>
<opencl.lib>libOpenCL.so.1</opencl.lib>
<sycl.lib>libsycl.so.5</sycl.lib>
<assembly.description>src/assembly/assembly.xml</assembly.description>
</properties>

<dependencies>
@@ -149,10 +152,20 @@
<profiles>

<profile>
<id>spark-3.0.0</id>
<id>cpu-gpu</id>
<activation>
<activeByDefault>true</activeByDefault>
<property>
<name>env.PLATFORM_PROFILE</name>
<value>CPU_GPU_PROFILE</value>
</property>
</activation>
<properties>
<assembly.description>src/assembly/assembly-cpu-gpu.xml</assembly.description>
</properties>
</profile>

<profile>
<id>spark-3.0.0</id>
<properties>
<spark.version>3.0.0</spark.version>
<scalatest.version>3.0.8</scalatest.version>
@@ -177,6 +190,9 @@

<profile>
<id>spark-3.1.1</id>
<activation>
<activeByDefault>true</activeByDefault>
</activation>
<properties>
<spark.version>3.1.1</spark.version>
<scalatest.version>3.2.3</scalatest.version>
@@ -435,7 +451,7 @@
<appendAssemblyId>false</appendAssemblyId>
<descriptors>
<!-- use customized assembly -->
<descriptor>src/assembly/assembly.xml</descriptor>
<descriptor>${assembly.description}</descriptor>
</descriptors>
</configuration>
<executions>
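
Reviewer note on how the pieces in this pom fit together: the build scripts export `PLATFORM_PROFILE`, the `cpu-gpu` profile activates when that variable equals `CPU_GPU_PROFILE`, and the profile overrides the `assembly.description` property that the assembly plugin reads. The shell sketch below only mirrors that mapping for illustration; it is not what Maven executes, and `descriptor_for_profile` is a hypothetical name.

```shell
#!/usr/bin/env bash

# Hypothetical helper that mirrors the pom logic in this commit:
# the default descriptor is src/assembly/assembly.xml, and the cpu-gpu
# profile (activated by PLATFORM_PROFILE=CPU_GPU_PROFILE) overrides it.
descriptor_for_profile() {
  case "${1:-}" in
    CPU_GPU_PROFILE) echo "src/assembly/assembly-cpu-gpu.xml" ;;
    *)               echo "src/assembly/assembly.xml" ;;
  esac
}

# Show the descriptor each profile value would select.
for profile in CPU_ONLY_PROFILE CPU_GPU_PROFILE ""; do
  echo "PLATFORM_PROFILE='$profile' -> $(descriptor_for_profile "$profile")"
done
```

One related Maven behavior is worth keeping in mind: an `activeByDefault` profile is switched off whenever another profile in the same POM is activated, so once `cpu-gpu` is triggered by the environment variable, `spark-3.1.1` is no longer active by default. That is presumably why both build scripts always pass `-P$SPARK_VER` explicitly.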
103 changes: 103 additions & 0 deletions mllib-dal/src/assembly/assembly-cpu-gpu.xml
@@ -0,0 +1,103 @@
<assembly xmlns="http://maven.apache.org/ASSEMBLY/2.1.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/ASSEMBLY/2.1.0 http://maven.apache.org/xsd/assembly-2.1.0.xsd">
  <id>jar-with-dependencies</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <dependencySets>
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <useProjectArtifact>true</useProjectArtifact>
      <unpack>true</unpack>
      <scope>runtime</scope>
    </dependencySet>
    <!-- Include local oneDAL jar-->
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <unpack>true</unpack>
      <scope>system</scope>
    </dependencySet>
  </dependencySets>
  <fileSets>
    <fileSet>
      <directory>${project.basedir}</directory>
      <outputDirectory>/</outputDirectory>
      <includes>
        <include>README*</include>
        <include>LICENSE*</include>
        <include>NOTICE*</include>
      </includes>
    </fileSet>
    <fileSet>
      <directory>${project.build.directory}</directory>
      <outputDirectory>lib</outputDirectory>
      <includes>
        <include>*.so</include>
      </includes>
    </fileSet>
  </fileSets>
  <files>
    <!-- Include TBB libraries into JAR -->
    <file>
      <source>${env.TBBROOT}/lib/intel64/gcc4.8/${tbb.lib}</source>
      <outputDirectory>lib</outputDirectory>
      <destName>libtbb.so.2</destName>
    </file>
    <file>
      <source>${env.TBBROOT}/lib/intel64/gcc4.8/${tbb.malloc.lib}</source>
      <outputDirectory>lib</outputDirectory>
      <destName>libtbbmalloc.so.2</destName>
    </file>
    <!-- Include DAL libraries into JAR -->
    <file>
      <source>${env.DAALROOT}/lib/intel64/${dal.java.lib}</source>
      <outputDirectory>lib</outputDirectory>
      <destName>libJavaAPI.so</destName>
    </file>
    <!-- Include oneCCL libraries into JAR -->
    <file>
      <source>${env.CCL_ROOT}/lib/${ccl.fabric.lib}</source>
      <outputDirectory>lib</outputDirectory>
    </file>
    <file>
      <source>${env.CCL_ROOT}/lib/${ccl.mpi.lib}</source>
      <outputDirectory>lib</outputDirectory>
      <destName>libmpi.so.12</destName>
    </file>
    <file>
      <source>${env.CCL_ROOT}/lib/libccl.so</source>
      <outputDirectory>lib</outputDirectory>
    </file>
    <file>
      <source>${env.CCL_ROOT}/lib/prov/libsockets-fi.so</source>
      <outputDirectory>lib</outputDirectory>
    </file>
    <!-- Include SYCL libraries into JAR -->
    <file>
      <source>${env.CMPLR_ROOT}/linux/compiler/lib/intel64_lin/libintlc.so.5</source>
      <outputDirectory>lib</outputDirectory>
    </file>
    <file>
      <source>${env.CMPLR_ROOT}/linux/compiler/lib/intel64_lin/libsvml.so</source>
      <outputDirectory>lib</outputDirectory>
    </file>
    <file>
      <source>${env.CMPLR_ROOT}/linux/compiler/lib/intel64_lin/libirng.so</source>
      <outputDirectory>lib</outputDirectory>
    </file>
    <file>
      <source>${env.CMPLR_ROOT}/linux/compiler/lib/intel64_lin/libimf.so</source>
      <outputDirectory>lib</outputDirectory>
    </file>
    <file>
      <source>${env.CMPLR_ROOT}/linux/lib/${opencl.lib}</source>
      <outputDirectory>lib</outputDirectory>
    </file>
    <file>
      <source>${env.CMPLR_ROOT}/linux/lib/${sycl.lib}</source>
      <outputDirectory>lib</outputDirectory>
    </file>
  </files>
</assembly>
23 changes: 22 additions & 1 deletion mllib-dal/src/main/java/org/apache/spark/ml/util/LibLoader.java
@@ -42,9 +42,12 @@ public static String getTempSubDir() {
}

/**
* Load oneCCL and MLlibDAL libs
* Load all native libs
*/
public static synchronized void loadLibraries() throws IOException {
    if (!loadLibSYCL()) {
        log.debug("SYCL libraries are not available, will load CPU libraries only.");
    }
    loadLibCCL();
    loadLibMLlibDAL();
}
@@ -59,6 +62,24 @@ private static synchronized void loadLibCCL() throws IOException {
loadFromJar(subDir, "libsockets-fi.so");
}

/**
 * Load SYCL libs in dependency order
 */
private static synchronized Boolean loadLibSYCL() throws IOException {
    // Check if SYCL libraries are available
    InputStream streamIn = LibLoader.class.getResourceAsStream(LIBRARY_PATH_IN_JAR + "/libsycl.so.5");
    if (streamIn == null) {
        return false;
    }
    loadFromJar(subDir, "libintlc.so.5");
    loadFromJar(subDir, "libimf.so");
    loadFromJar(subDir, "libirng.so");
    loadFromJar(subDir, "libsvml.so");
    loadFromJar(subDir, "libOpenCL.so.1");
    loadFromJar(subDir, "libsycl.so.5");
    return true;
}

/**
* Load MLlibDAL lib, it depends TBB libs that are loaded by oneDAL, so this
* function should be called after oneDAL loadLibrary
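
Reviewer note on the availability probe in `loadLibSYCL()`: the loader decides between CPU-only and CPU+GPU behavior by checking whether `libsycl.so.5` is present in the JAR. The same question can be asked of a built JAR from the command line. The sketch below is an illustration under one assumption: that `LIBRARY_PATH_IN_JAR` corresponds to the `lib` output directory used by the assembly descriptors, which the diff does not show. `has_sycl_lib` is a hypothetical helper that reads one archive entry name per line on standard input.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): succeeds when the entry
# list on stdin contains lib/libsycl.so.5 as a whole line.
# Assumption: native libraries live under lib/ inside the JAR.
has_sycl_lib() {
  grep -qx 'lib/libsycl\.so\.5'
}

# Made-up entry lists standing in for real JAR contents.
if printf '%s\n' META-INF/MANIFEST.MF lib/libccl.so lib/libsycl.so.5 | has_sycl_lib; then
  echo "first listing: SYCL library present (CPU+GPU packaging)"
fi

if ! printf '%s\n' META-INF/MANIFEST.MF lib/libccl.so | has_sycl_lib; then
  echo "second listing: SYCL library absent (CPU-only packaging)"
fi
```

Against a real artifact, the entry list could come from `unzip -Z1 path/to/your.jar | has_sycl_lib`, where the JAR path is whatever the build produced.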
1 change: 0 additions & 1 deletion mllib-dal/src/main/native/ALSDALImpl.cpp
@@ -16,7 +16,6 @@

#include <assert.h>
#include <chrono>
#include <daal.h>
#include <iostream>

#include "OneCCL.h"
1 change: 0 additions & 1 deletion mllib-dal/src/main/native/ALSShuffle.cpp
@@ -17,7 +17,6 @@
#include <algorithm>
#include <cstring>
#include <iostream>
#include <oneapi/ccl.hpp>
#include <set>
#include <vector>
