Update doc for Spark + Volcano #284

Merged
merged 1 commit on Sep 28, 2022
12 changes: 10 additions & 2 deletions content/en/docs/spark_on_volcano.md
@@ -24,7 +24,15 @@ Spark is a fast and versatile big data clustering computing system. It provides

### Spark on Volcano

Spark operates on Volcano in two forms. Here we take the simpler Spark-Operator form [1]. There is also a more complex deployment method, which is described in [2].
Currently, there are two ways to integrate Spark on Kubernetes with Volcano:
- Spark on Kubernetes native support: maintained by the [Apache Spark community](https://github.com/apache/spark) and the Volcano community
- Spark Operator support: maintained by the [GoogleCloudPlatform community](https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) and the Volcano community

#### Spark on Kubernetes native support (spark-submit)

Volcano is supported as a custom scheduler for Spark on Kubernetes since Spark v3.3.0 and Volcano v1.5.1. See the [Spark documentation](https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes) for more details.

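For illustration, a `spark-submit` invocation along the lines of the Spark documentation might look as follows; this is a sketch in which the API server address, container image, and PodGroup template path are placeholders, and the bundled SparkPi example jar is assumed:

```
$ ./bin/spark-submit \
    --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
    --deploy-mode cluster \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.container.image=<spark-image> \
    --conf spark.kubernetes.scheduler.name=volcano \
    --conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml \
    --conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
    --conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
    local:///opt/spark/examples/jars/spark-examples_2.12-3.3.0.jar
```
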
#### Spark Operator support (spark-operator)

Install Spark-Operator through Helm.

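As a rough sketch, installation typically uses the Helm chart repository published by the spark-on-k8s-operator project; the release name and `spark-operator` namespace below are illustrative:

```
$ helm repo add spark-operator https://googlecloudplatform.github.io/spark-on-k8s-operator
$ helm install my-release spark-operator/spark-operator --namespace spark-operator --create-namespace
```
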
@@ -90,4 +98,4 @@ Deploy the Spark application and see the status.
```
$ kubectl apply -f spark-pi.yaml
$ kubectl get SparkApplication
```
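
The `spark-pi.yaml` referenced above is defined earlier in the document; as a minimal sketch (image, versions, and service account are illustrative), a SparkApplication that hands scheduling to Volcano typically sets `batchScheduler: volcano` in its spec:

```
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: gcr.io/spark-operator/spark:v3.1.1
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar
  sparkVersion: "3.1.1"
  batchScheduler: volcano
  driver:
    cores: 1
    memory: "512m"
    serviceAccount: spark
  executor:
    cores: 1
    instances: 1
    memory: "512m"
```
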
14 changes: 11 additions & 3 deletions content/zh/docs/spark_on_volcano.md
@@ -22,9 +22,17 @@ linktitle = "Spark"

Spark is a fast and general-purpose big data cluster computing system. It provides high-level APIs in Scala, Java, Python, and R, as well as an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of high-level tools, including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for stream processing.

### Spark on volcano
### Spark on Volcano

Spark can run on Volcano in two forms; here we use the simpler spark-operator form [1]. There is also a more complex deployment method, which is described in [2].
Currently, there are two ways to integrate Spark with Volcano:
- Spark on Kubernetes native support: maintained jointly by the [Apache Spark community](https://github.com/apache/spark) and the Volcano community.
- Spark Operator support: maintained jointly by the [GoogleCloudPlatform community](https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) and the Volcano community.

#### Spark on Kubernetes native support (spark-submit)

Starting from Apache Spark v3.3.0 and Volcano v1.5.1, Spark supports Volcano as a custom scheduler; see this [link](https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes) for more details.

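When Volcano is used as the custom scheduler via spark-submit, the Spark configuration `spark.kubernetes.scheduler.volcano.podGroupTemplateFile` can point at a PodGroup template; a minimal sketch, with queue name and resource values chosen purely for illustration:

```
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
spec:
  minMember: 1
  minResources:
    cpu: "2"
    memory: "3Gi"
  queue: default
```
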
#### Spark Operator support (spark-operator)

Install spark-operator through Helm.

@@ -90,4 +98,4 @@ spec:
```
$ kubectl apply -f spark-pi.yaml
$ kubectl get SparkApplication
```