This document contains the instructions and scripts for installing the necessary dependencies and building OAP. You can get more detailed information from each OAP module below.
- SQL Index and Data Source Cache
- RDD Cache PMem Extension
- Shuffle Remote PMem Extension
- Remote Shuffle
- Intel MLlib
- Unified Arrow Data Source
- Native SQL Engine
OAP is built with Apache Maven and Oracle Java 8. The main tools that need to be installed on your cluster are listed below.
- Requirements for Shuffle Remote PMem Extension
If you enable Shuffle Remote PMem Extension with RDMA, refer to the Shuffle Remote PMem Extension Guide to configure and validate RDMA in advance.
We provide the scripts below to automatically install the dependencies above, except RDMA. Switch to the root account and run:
# git clone -b <tag-version> https://github.com/Intel-bigdata/OAP.git
# cd OAP
# sh dev/install-compile-time-dependencies.sh
Run the following command to learn more.
# sh $OAP_HOME/dev/scripts/prepare_oap_env.sh --help
Run the following command to automatically install a specific dependency, such as Maven.
# sh $OAP_HOME/dev/scripts/prepare_oap_env.sh --prepare_maven
NOTE: If you use install-compile-time-dependencies.sh or prepare_oap_env.sh to install GCC, or your GCC is not installed in the default path, please ensure you have exported CC (and CXX) before calling Maven.
# export CXX=$OAP_HOME/dev/thirdparty/gcc7/bin/g++
# export CC=$OAP_HOME/dev/thirdparty/gcc7/bin/gcc
To build the OAP package, use
$ sh $OAP_HOME/dev/compile-oap.sh
#or
$ mvn clean -DskipTests package
To build a specific OAP module, such as oap-cache, use
$ sh $OAP_HOME/dev/compile-oap.sh --oap-cache
#or
$ mvn clean -pl com.intel.oap:oap-cache -am package
To run all the tests, use
$ mvn clean test
To run the tests of a specific module, such as oap-cache, use
$ mvn clean -pl com.intel.oap:oap-cache -am test
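The NOTE above about exporting CC and CXX before calling Maven can be sketched as a small pre-build check. This is only a minimal sketch and not part of the OAP scripts: the function names gcc_major, needs_newer_gcc, and use_bundled_gcc are hypothetical, the version threshold of 7 is an assumption taken from the gcc7 directory name, and the bundled compiler is assumed to live under dev/thirdparty/gcc7/bin as in the export commands above.

```shell
#!/bin/sh
# Hypothetical helper: print the major version from a GCC version
# string such as "7.3.0", "4.8.5", or "10".
gcc_major() {
    echo "$1" | cut -d. -f1
}

# Hypothetical helper: succeed when the given version string is older
# than GCC 7 (assumed threshold), i.e. when CC/CXX should be exported.
needs_newer_gcc() {
    [ "$(gcc_major "$1")" -lt 7 ]
}

# Hypothetical helper: export CC/CXX only if the bundled compiler
# exists under the given OAP home directory; fail otherwise.
use_bundled_gcc() {
    gcc_dir="$1/dev/thirdparty/gcc7/bin"
    if [ -x "$gcc_dir/gcc" ] && [ -x "$gcc_dir/g++" ]; then
        export CC="$gcc_dir/gcc"
        export CXX="$gcc_dir/g++"
        return 0
    fi
    return 1
}
```

One possible use is to call needs_newer_gcc "$(gcc -dumpversion)" before running mvn and, if it succeeds, call use_bundled_gcc "$OAP_HOME" so that a missing bundled compiler is reported before the build starts rather than in the middle of it.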
When using SQL Index and Data Source Cache with PMem, first finish the steps in Prerequisites for building to ensure the needed dependencies have been installed.
Add -Ppersistent-memory to build OAP with PMem support.
$ mvn clean -q -Ppersistent-memory -DskipTests package
For the vmemcache strategy, build OAP with the command:
$ mvn clean -q -Pvmemcache -DskipTests package
To use both, build OAP with the command below:
$ mvn clean -q -Ppersistent-memory -Pvmemcache -DskipTests package
If you want to generate a release package after you have run mvn package for all modules, use the following command. You can then find a tarball named oap-$VERSION-bin-spark-3.0.0.tar.gz under the directory OAP/dev/release-package.
$ sh $OAP_HOME/dev/compile-oap.sh
This section introduces what is required before submitting a code change to OAP.
- We continue to use GitHub Issues to track new features, tasks, and issues.
- Every commit needs an issue ID.
- Format the log message as follows: [OAP-IssuesId][optional:ModuleName] detailed message
like [OAP-1406][rpmem-shuffle]Add shuffle block removing operation within one Spark context
- Always merge your pull request as a single commit, with a commit message that follows the above format.
- The formal feature names in 0.9 are: SQL Index, SQL Data Source Cache, Native SQL Engine, Unified Arrow Data Source, RDD Cache PMem Extension, RPMem Shuffle, Remote Shuffle, Intel MLlib.
We don’t strictly require the module ID to match the feature name. Please agree among the feature members on a consistent name to use in the log message.
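The log message format described above can be checked mechanically. The following is a minimal sketch, not an official OAP tool: the function name is_valid_oap_commit_msg is hypothetical, and the pattern assumes that issue IDs are numeric and that module names consist of letters, digits, hyphens, and underscores, as in the example [OAP-1406][rpmem-shuffle].

```shell
#!/bin/sh
# Hypothetical check: succeed when the first line of a commit message
# has the form [OAP-<issue id>][<optional module>] <detailed message>.
# Assumptions: the issue id is numeric; the module name uses only
# letters, digits, '-' and '_'; the space before the message is optional.
is_valid_oap_commit_msg() {
    printf '%s\n' "$1" | head -n 1 |
        grep -Eq '^\[OAP-[0-9]+](\[[A-Za-z0-9_-]+])? ?[^[:space:]]'
}
```

A check like this could be called from a Git commit-msg hook, which receives the path of the message file as its first argument, so that malformed messages are rejected before the commit is created.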