Merge pull request #284 from Yikun/patch-1
Update doc for Spark + Volcano
volcano-sh-bot authored Sep 28, 2022
2 parents 4c34699 + a8ab0d8 commit 8d9cf22
Showing 2 changed files with 21 additions and 5 deletions.
12 changes: 10 additions & 2 deletions content/en/docs/spark_on_volcano.md
@@ -24,7 +24,15 @@ Spark is a fast and versatile big data clustering computing system. It provides

### Spark on Volcano

Spark operates on Volcano in two forms. Here we take the simpler form, Spark-Operator [1]. A more complex deployment method is described in [2].
Currently, there are two ways to integrate Spark on Kubernetes with Volcano:
- Spark on Kubernetes native support: maintained by the [Apache Spark community](https://github.com/apache/spark) and the Volcano community
- Spark Operator support: maintained by the [GoogleCloudPlatform community](https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) and the Volcano community

#### Spark on Kubernetes native support (spark-submit)

Spark on Kubernetes has supported Volcano as a custom scheduler since Spark v3.3.0 and Volcano v1.5.1. For details, see the [Spark documentation](https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes).
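
As a rough illustration of the native path, the sketch below assembles a `spark-submit` command that selects Volcano as the scheduler. The `spark.kubernetes.scheduler.name` and `featureSteps` settings and the `VolcanoFeatureStep` class name follow the Spark documentation linked above. The API server URL, image name, and example jar path are hypothetical placeholders, and the Spark image is assumed to have been built with Volcano support. The script only prints the command so it can be reviewed before anything is submitted.

```shell
#!/bin/sh
# Sketch only: the API server URL, image name, and jar path below are
# hypothetical placeholders; replace them with values from your cluster.
K8S_API="${K8S_API:-https://kubernetes.example.com:6443}"
SPARK_IMAGE="${SPARK_IMAGE:-example.com/spark:v3.3.0}"

# Feature step class that the Spark documentation names for Volcano integration.
VOLCANO_STEP="org.apache.spark.deploy.k8s.features.VolcanoFeatureStep"

# Scheduler-related settings documented for Spark v3.3.0 and later.
VOLCANO_CONF="--conf spark.kubernetes.scheduler.name=volcano \
--conf spark.kubernetes.driver.pod.featureSteps=$VOLCANO_STEP \
--conf spark.kubernetes.executor.pod.featureSteps=$VOLCANO_STEP"

# Print the full command for review; remove the leading "echo" to submit the job.
echo spark-submit \
  --master "k8s://$K8S_API" \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --conf "spark.kubernetes.container.image=$SPARK_IMAGE" \
  $VOLCANO_CONF \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.3.0.jar
```

Keeping the Volcano settings in one variable makes it easy to add or drop them when comparing against the default Kubernetes scheduler.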

#### Spark Operator support (spark-operator)

Install Spark-Operator through Helm.

@@ -90,4 +98,4 @@ Deploy the Spark application and see the status.
```
$ kubectl apply -f spark-pi.yaml
$ kubectl get SparkApplication
```
```
14 changes: 11 additions & 3 deletions content/zh/docs/spark_on_volcano.md
@@ -22,9 +22,17 @@ linktitle = "Spark"

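Spark is a fast, general-purpose cluster computing system for big data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools, including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for stream processing.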

### Spark on volcano
### Spark on Volcano

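Spark runs on Volcano in two forms; here we use the simpler spark-operator form [1]. For a more complex deployment method, see [2].
Currently, there are two ways to integrate Spark with Volcano:
- Spark on Kubernetes native support: jointly maintained by the [Apache Spark community](https://github.com/apache/spark) and the Volcano community.
- Spark Operator support: jointly maintained by the [GoogleCloudPlatform community](https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) and the Volcano community.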

#### Spark on Kubernetes native support (spark-submit)

Starting with Apache Spark v3.3.0 and Volcano v1.5.1, Spark supports Volcano as a custom scheduler. See the [Spark documentation](https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes) for more.

#### Spark Operator support (spark-operator)

Install spark-operator through Helm.

@@ -90,4 +98,4 @@ spec:
```
$ kubectl apply -f spark-pi.yaml
$ kubectl get SparkApplication
```
```
