
Commit a97527b

Update build hibench readme
* docs/build-hibench.md:
  * Update 2.4 version to specify Spark Version.
  * Add Specify Hadoop version documentation.
  * Add Build using JDK 11 documentation.
* README.md:
  * Update Supported Hadoop/Spark releases to hadoop 3.2 and spark 2.4

Signed-off-by: Luis Ponce <[email protected]>
1 parent 3d0c2ad commit a97527b

File tree

2 files changed: +21 −6 lines changed

README.md

Lines changed: 5 additions & 5 deletions

```diff
@@ -135,12 +135,12 @@ There are totally 19 workloads in HiBench. The workloads are divided into 6 cate
 4. Fixwindow (fixwindow)
 
 The workloads performs a window based aggregation. It tests the performance of window operation in the streaming frameworks.
-
-
-### Supported Hadoop/Spark/Flink/Storm/Gearpump releases: ###
 
-- Hadoop: Apache Hadoop 2.x, CDH5, HDP
-- Spark: Spark 1.6.x, Spark 2.0.x, Spark 2.1.x, Spark 2.2.x
+### Supported Hadoop/Spark releases: ###
+- Hadoop: Apache Hadoop 2.x, 3.2, CDH5, HDP
+- Spark: Spark 1.6.x, Spark 2.0.x, Spark 2.1.x, Spark 2.2.x, Spark 2.4.x
+
+### Supported Flink/Storm/Gearpump releases: ###
 - Flink: 1.0.3
 - Storm: 1.0.1
 - Gearpump: 0.8.1
```

docs/build-hibench.md

Lines changed: 16 additions & 1 deletion

```diff
@@ -28,7 +28,7 @@ Because some Maven plugins cannot support Scala version perfectly, there are som
 
 
 ### Specify Spark Version ###
-To specify the spark version, use -Dspark=xxx(1.6, 2.0, 2.1 or 2.2). By default, it builds for spark 2.0
+To specify the spark version, use -Dspark=xxx(1.6, 2.0, 2.1, 2.2 or 2.4). By default, it builds for spark 2.0
 
     mvn -Psparkbench -Dspark=1.6 -Dscala=2.11 clean package
 tips:
@@ -37,6 +37,11 @@ default . For example , if we want use spark2.0 and scala2.11 to build hibench.
 package` , but for spark2.0 and scala2.10 , we need use the command `mvn -Dspark=2.0 -Dscala=2.10 clean package` .
 Similarly , the spark1.6 is associated with the scala2.10 by default.
 
+### Specify Hadoop Version ###
+To specify the hadoop version, use -Dhadoop=xxx(3.2). By default, it builds for hadoop 2.4
+
+    mvn -Psparkbench -Dhadoop=3.2 -Dspark=2.4 clean package
+
 ### Build a single module ###
 If you are only interested in a single workload in HiBench. You can build a single module. For example, the below command only builds the SQL workloads for Spark.
 
```
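The single-module `mvn` command itself falls just outside this hunk's context. As a hedged sketch of how such an invocation is assembled, following the pattern of the other `mvn` commands in this doc (the per-module profile name `-Psql` is an assumption, not confirmed by this diff):

```shell
# Hedged sketch: compose a single-module build command following the pattern
# of the other mvn invocations in this doc. The per-module profile name
# (-Psql) is an assumption; substitute micro, ml, websearch, etc. as needed.
MODULE=sql
CMD="mvn -Psparkbench -Dmodules -P${MODULE} clean package"
echo "$CMD"
```

Swapping `MODULE` lets the same pattern target any of the workload modules listed later in the doc.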
```diff
@@ -48,3 +53,13 @@ Supported modules includes: micro, ml(machine learning), sql, websearch, graph,
 For Spark 2.0 and Spark 2.1, we add the benchmark support for Structured Streaming. This is a new module which cannot be compiled in Spark 1.6. And it won't get compiled by default even if you specify the spark version as 2.0 or 2.1. You must explicitly specify it like this:
 
     mvn -Psparkbench -Dmodules -PstructuredStreaming clean package
+
+### Build using JDK 11 ###
+**For Java 11 it is suitable to be built for Spark 2.4 _(Compiled with Scala 2.12)_ and/or Hadoop 3.2 only**
+
+If you are interested in building using Java 11, note that the streaming benchmarks won't be compiled, and specify the scala, spark, hadoop and maven compiler versions as below:
+
+    mvn clean package -Psparkbench -Phadoopbench -Dhadoop=3.2 -Dspark=2.4 -Dscala=2.12 -Dexclude-streaming -Dmaven-compiler-plugin.version=3.8.0
+
+Supported frameworks only: hadoopbench, sparkbench (does not support flinkbench, stormbench, gearpumpbench)
+Supported modules include: micro, ml(machine learning), websearch and graph (does not support streaming, structuredStreaming and sql)
```
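Before running the Java 11 build described in this hunk, it can help to confirm which JDK is active. A minimal sketch, assuming a POSIX shell; `SAMPLE` is a placeholder standing in for the real output of `java -version 2>&1`:

```shell
# Hedged sketch: pattern-match a `java -version` banner for JDK 11.
# SAMPLE is a placeholder; in practice use: java -version 2>&1 | head -n 1
SAMPLE='openjdk version "11.0.2" 2019-01-15'
case "$SAMPLE" in
  *'version "11'*) echo "JDK 11 detected" ;;
  *)               echo "not JDK 11 - the build above expects JDK 11" ;;
esac
```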
