From a8ab0d8447f703745addd9c039379f6a3daf7ecf Mon Sep 17 00:00:00 2001
From: Yikun Jiang
Date: Thu, 15 Sep 2022 16:28:40 +0800
Subject: [PATCH] Update spark_on_volcano.md

Signed-off-by: Yikun Jiang
---
 content/en/docs/spark_on_volcano.md | 12 ++++++++++--
 content/zh/docs/spark_on_volcano.md | 14 +++++++++++---
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/content/en/docs/spark_on_volcano.md b/content/en/docs/spark_on_volcano.md
index 7e451260..638bc7b3 100644
--- a/content/en/docs/spark_on_volcano.md
+++ b/content/en/docs/spark_on_volcano.md
@@ -24,7 +24,15 @@ Spark is a fast and versatile big data clustering computing system. It provides
 
 ### Spark on Volcano
 
-Spark operates on Volcano in two forms.Here we take the form of a simpler Spark-Operator [1]. There is also a more complex deployment method that can be referred to [2].
+Currently, there are two ways to integrate Spark on Kubernetes with Volcano:
+- Spark on Kubernetes native support: maintained by the [Apache Spark community](https://github.com/apache/spark) and the Volcano community
+- Spark Operator support: maintained by the [GoogleCloudPlatform community](https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) and the Volcano community
+
+#### Spark on Kubernetes native support (spark-submit)
+
+Running Spark on Kubernetes with Volcano as a customized scheduler has been supported since Spark v3.3.0 and Volcano v1.5.1. See more details in the [Spark documentation](https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes).
+
+#### Spark Operator support (spark-operator)
 
 Install Spark-Operator through Helm.
 
@@ -90,4 +98,4 @@ Deploy the Spark application and see the status.
 ```
 $ kubectl apply -f spark-pi.yaml
 $ kubectl get SparkApplication
-```
\ No newline at end of file
+```
diff --git a/content/zh/docs/spark_on_volcano.md b/content/zh/docs/spark_on_volcano.md
index 1d08004b..de2fa0f4 100644
--- a/content/zh/docs/spark_on_volcano.md
+++ b/content/zh/docs/spark_on_volcano.md
@@ -22,9 +22,17 @@ linktitle = "Spark"
 
 Spark是一款快速通用的大数据集群计算系统。它提供了Scala、Java、Python和R的高级api，以及一个支持用于数据分析的通用计算图的优化引擎。它还支持一组丰富的高级工具，包括用于SQL和DataFrames的Spark SQL、用于机器学习的MLlib、用于图形处理的GraphX和用于流处理的Spark Streaming。
 
-### Spark on volcano
+### Spark on Volcano
 
-Spark在volcano上的运行有两种形式，这里采用比较简单的spark-operator的形式[1]。还有一种较为复杂的部署方式可以参考[2]。
+当前，有两种方式可以支持Spark和Volcano集成：
+- Spark on Kubernetes native支持：由[Apache Spark社区](https://github.com/apache/spark)和Volcano社区共同维护。
+- Spark Operator支持：由[GoogleCloudPlatform社区](https://github.com/GoogleCloudPlatform/spark-on-k8s-operator)和Volcano社区共同维护。
+
+#### Spark on Kubernetes native支持 (spark-submit)
+
+从Apache Spark v3.3.0版本及Volcano v1.5.1版本开始，Spark支持将Volcano作为自定义调度器，查看[链接](https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes)了解更多。
+
+#### Spark Operator支持 (spark-operator)
 
 通过helm安装spark-operator。
 
@@ -90,4 +98,4 @@ spec:
 ```
 $ kubectl apply -f spark-pi.yaml
 $ kubectl get SparkApplication
-```
\ No newline at end of file
+```
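
For reference, here is a minimal sketch of the native (spark-submit) integration that the new English section points to, based on the linked Spark documentation. The API server address, container image, PodGroup template path, and example jar path are placeholders; the properties assume a Spark v3.3.0+ distribution built with Volcano support and Volcano v1.5.1+ installed in the cluster.

```
# Submit the SparkPi example with Volcano as the customized scheduler.
# <k8s-apiserver-host>, <k8s-apiserver-port> and <spark-image> are placeholders;
# the podGroupTemplateFile setting is optional and its path is illustrative.
./bin/spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.container.image=<spark-image> \
  --conf spark.kubernetes.scheduler.name=volcano \
  --conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml \
  --conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
  --conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.3.0.jar
```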
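
And a sketch of the Spark Operator path ("Install Spark-Operator through Helm" and the spark-pi.yaml deployment step in the existing docs). The Helm repo URL, release name, and the batchScheduler.enable value come from the spark-on-k8s-operator chart and may differ between chart versions; the image, jar path, sparkVersion, and service account in the SparkApplication are illustrative. The batchScheduler: volcano field is what asks the operator to schedule the driver and executor pods through Volcano.

```
# Install the operator with batch-scheduler (Volcano) integration enabled.
helm repo add spark-operator https://googlecloudplatform.github.io/spark-on-k8s-operator
helm install spark-operator spark-operator/spark-operator \
  --namespace spark-operator --create-namespace \
  --set batchScheduler.enable=true

# Write a minimal spark-pi.yaml; values below are examples, not the docs' exact manifest.
# serviceAccount assumes an RBAC-enabled "spark" service account already exists.
cat <<'EOF' > spark-pi.yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: gcr.io/spark-operator/spark:v3.1.1
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar
  sparkVersion: "3.1.1"
  batchScheduler: volcano
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark
  executor:
    cores: 1
    instances: 2
    memory: 512m
EOF

# Deploy the application and check its status, as in the documented steps.
kubectl apply -f spark-pi.yaml
kubectl get SparkApplication
```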