[SPARK-43504][K8S] Mounts the hadoop config map on the executor pod #41181
Conversation
I encountered the same issue with Spark 3.3.1. This sounds like a regression (I suppose it worked before SPARK-25815; I don't have experience with running such an old version of Spark on K8s). The key point is that the executor needs to download artifacts during the bootstrap phase, so the assumption in SPARK-25815 is not always true.

Given that the executor uses the dependency-fetching logic in core/src/main/scala/org/apache/spark/executor/Executor.scala (lines 1006 to 1012 at 0df4c01), this PR definitely fixes some use cases. @turboFei, would you mind updating "Does this PR introduce any user-facing change?"

Updated.
Seems the K8s integration testing is stuck; I will check this PR on our dev Hadoop cluster tomorrow.
Thank you for making a PR, @turboFei.

However, this PR might cause an outage, because the number of ConfigMaps is controlled by quota:

```
$ kubectl describe quota | grep configmaps
count/configmaps    4    150
```

To avoid a production outage, this should at least be behind a new configuration that defaults to false.
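For illustration, such a gate in the K8s Config.scala style could look like the sketch below; the config name and wording here are hypothetical, not necessarily what the PR ended up doing:

```scala
import org.apache.spark.internal.config.ConfigBuilder

object HadoopConfMountFlag {
  // Hypothetical sketch only: an opt-in flag that preserves the old behavior
  // (no executor-side mount) by default. The name is made up for illustration.
  val KUBERNETES_EXECUTOR_MOUNT_HADOOP_CONF =
    ConfigBuilder("spark.kubernetes.executor.mountHadoopConf")
      .doc("Whether to mount the Hadoop config map on executor pods.")
      .version("3.5.0")
      .booleanConf
      .createWithDefault(false)
}
```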
Thanks for the comments, I will check it.
150 is a bit small for serious production usage; we may want to add a note about this. And BTW, this PR doesn't create new ConfigMaps: it either uses a user pre-set ConfigMap (no creation) or reuses the ConfigMap created by the driver, which is created only if necessary.

Yes, this PR doesn't create a new ConfigMap.
Oh, got it. Thank you for correcting me.
The UT has passed; gentle ping @dongjoon-hyun.
LGTM. Left one minor comment; I'm OK with merging this as it is.
Gentle ping @dongjoon-hyun, would you like to review again? Thanks.
The Hadoop configurations can be propagated after #27735. And putting extra configuration files in SPARK_HOME/conf is also a suggested approach in our docs, so is this step necessary? Alternatively, if both exist, what is the precedence between them? Is it idempotent?
I think it is necessary. Hadoop and Spark are different components, and it is better to maintain their configurations separately. In our company, we version the Hadoop conf, so we do not put Hadoop config files under SPARK_HOME/conf; we use a symlink to manage the Hadoop conf.

This PR just mounts the Hadoop config map on the executor side (setting the HADOOP_CONF_DIR env), and the mounted Hadoop conf is exactly the same as the one in the driver pod. As shown in resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh (lines 68 to 76 at e096bce), SPARK_CONF_DIR has higher precedence. I think it is idempotent.
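For what it's worth, that precedence can be checked from inside a pod: the JVM classloader returns the first matching resource on the classpath, and the entrypoint script prepends SPARK_CONF_DIR ahead of HADOOP_CONF_DIR. A small illustrative probe, assuming both directories contain a core-site.xml:

```scala
object ConfPrecedenceProbe {
  def main(args: Array[String]): Unit = {
    // If both SPARK_CONF_DIR and HADOOP_CONF_DIR hold a core-site.xml, the
    // URL printed here points into SPARK_CONF_DIR, because the entrypoint
    // script puts that directory earlier on the classpath.
    val url = Thread.currentThread().getContextClassLoader
      .getResource("core-site.xml")
    println(s"core-site.xml resolved from: $url")
  }
}
```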
Gentle ping @yaooqinn @dongjoon-hyun.
I do not fully agree. In the early days, Hadoop may have been special, and we have a specific code path to read HADOOP_CONF_DIR. But now Hadoop is just one option, as we have other choices for storage and scheduling, especially on the cloud or Kubernetes. Maybe we should treat it like Hive configurations or other third-party components, to reduce the maintenance burden and the complexity of the deployment.
I can also offer a use case for this: usually the submission client lives in a single environment (let's say in the cloud), and with Spark on K8s we can easily run jobs in different environments, e.g., in private cloud clusters submitted from the public cloud, where different properties need to be passed to the submission client as well as to drivers and executors. This is also a case where mounting the hadoopConfMap in executors helps make the configs easy to maintain.
Yes, I think this PR covers the general Hadoop conf use case, and it does not create more resources because it just uses the existing ConfigMap. @yaooqinn @dongjoon-hyun could you help take another look? Your help is appreciated.
Although this approach is a little controversial, as @yaooqinn pointed out, I agree with the intention and the use cases, and I believe we can additionally allow this path back without much burden or intervention, because spark.kubernetes.executor.hadoopConfigMapName is still reserved in the code.

If there is no strong objection from other people, +1 from my side. Please let us know if you still disagree, @yaooqinn.
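For context, reusing a pre-created ConfigMap through the setting mentioned above might look like the following; the ConfigMap name hadoop-conf is illustrative:

```scala
import org.apache.spark.SparkConf

object ExecutorHadoopConfSketch {
  // Illustrative sketch: point executor pods at an existing ConfigMap that
  // holds core-site.xml / hdfs-site.xml. "hadoop-conf" is a made-up name.
  val conf = new SparkConf()
    .set("spark.kubernetes.executor.hadoopConfigMapName", "hadoop-conf")
}
```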
Merged to master for Apache Spark 3.5.0.

Thanks all!!!

Thanks, @dongjoon-hyun and @turboFei. Late +1 from my side.
### What changes were proposed in this pull request?

In this PR, for Spark on K8s, the Hadoop config map is mounted on the executor side as well. Before, it was only mounted on the driver side.

### Why are the changes needed?

Since [SPARK-25815](https://issues.apache.org/jira/browse/SPARK-25815) (apache#22911), the Hadoop config map is no longer mounted on the executor side. Per the apache#22911 description:

> The main two things that don't need to happen in executors anymore are:
> 1. adding the Hadoop config to the executor pods: this is not needed since the Spark driver will serialize the Hadoop config and send it to executors when running tasks.

But in fact, the executor still needs the Hadoop configuration. As shown in the screenshot (omitted here), the driver can resolve `hdfs://zeus` but the executor cannot, so we still need to mount the Hadoop config map on the executor side.

### Does this PR introduce _any_ user-facing change?

Yes, users no longer need workarounds to make executors load the Hadoop configuration, such as:
- including the Hadoop conf in the executor image
- placing Hadoop conf files under `SPARK_CONF_DIR`

### How was this patch tested?

UT.

Closes apache#41181 from turboFei/exec_hadoop_conf.

Authored-by: fwang12 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
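To make the omitted screenshot concrete, here is a minimal sketch of the failure mode, assuming `hdfs://zeus` is an HA nameservice as the description implies; without the mounted Hadoop conf, the executor JVM has no hdfs-site.xml on its classpath and cannot resolve the nameservice:

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

object NameserviceProbe {
  def main(args: Array[String]): Unit = {
    // new Configuration() loads core-site.xml/hdfs-site.xml from the
    // classpath. With HADOOP_CONF_DIR on the classpath (driver), "zeus"
    // resolves to its NameNodes; without it (executor before this PR),
    // this throws java.net.UnknownHostException: zeus, because "zeus"
    // is a nameservice, not a real host.
    val fs = FileSystem.get(new URI("hdfs://zeus/"), new Configuration())
    println(fs.getUri)
  }
}
```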
Gentle ping @turboFei.
@maomaodev Generally, the executor is supposed to use the delegation token instead of a keytab to access kerberized HDFS. I wonder how you set up your Kerberos auth for Spark? And what was the behavior before this PR?

How about uploading the krb5.conf? See resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala (lines 572 to 579 at 36d23ef).
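For reference, the two krb5-related settings defined around the referenced Config.scala block are typically used in one of two ways (values here are illustrative; set one or the other, not both):

```scala
import org.apache.spark.SparkConf

object Krb5ConfSketch {
  // Option 1: upload a krb5.conf from a local path on the submission
  // machine; Spark mounts it into the driver and executor pods.
  val uploadLocalFile = new SparkConf()
    .set("spark.kubernetes.kerberos.krb5.path", "/etc/krb5.conf")

  // Option 2: reference a pre-created ConfigMap that contains krb5.conf.
  // "krb5-conf-map" is a made-up name.
  val usePrecreatedConfigMap = new SparkConf()
    .set("spark.kubernetes.kerberos.krb5.configMapName", "krb5-conf-map")
}
```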
The parameter (spark.kubernetes.kerberos.krb5.path) has been set.
@maomaodev I see what happens with the official Spark image. In short, creating a dummy /etc/krb5.conf in the base image should work around it.
Yes, creating a dummy /etc/krb5.conf in the base image does work. Is the community planning to fix this issue?
@maomaodev I will prepare a PR soon.

Thank you, I have created a corresponding JIRA: https://issues.apache.org/jira/browse/SPARK-50758





