[SPARK-28938][K8S] Move to supported OpenJDK docker image for Kubernetes (#616)

[SPARK-28938][K8S] Move to supported OpenJDK docker image for Kubernetes

The current Docker image used by Kubernetes is `openjdk:8-alpine`. It was not supported and was removed in commit docker-library/openjdk@3eb0351#diff-f95ffa3d1377774732c33f7b8368e099.

This PR proposes to move to a supported docker image.

I think there are at least two reasons:

1. According to the commit, Alpine/musl is not officially supported by the OpenJDK project.
2. Because OpenJDK 8 Alpine images are no longer published, new JDK updates, including security fixes, are not applied to it. See below:

```
docker run -it --rm openjdk:8-alpine java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (IcedTea 3.12.0) (Alpine 8.212.04-r0)
OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)
```
```
docker run -it --rm openjdk:8-jdk-slim java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (build 1.8.0_222-b10)
OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
```

User-facing change: Yes. This changes the base Docker image of Spark.

How this was tested: Existing tests.

Closes apache#26037 from viirya/SPARK-28938.

Authored-by: Liang-Chi Hsieh
Signed-off-by: Dongjoon Hyun
raiju authored and robert3005 committed Nov 7, 2019
1 parent f032370 commit 702b6b0
Showing 10 changed files with 27 additions and 20 deletions.
2 changes: 1 addition & 1 deletion dev/appveyor-install-dependencies.ps1
@@ -81,7 +81,7 @@ if (!(Test-Path $tools)) {
 # ========================== Maven
 Push-Location $tools

-$mavenVer = "3.6.0"
+$mavenVer = "3.6.2"
 Start-FileDownload "https://archive.apache.org/dist/maven/maven-3/$mavenVer/binaries/apache-maven-$mavenVer-bin.zip" "maven.zip"

 # extract

4 changes: 2 additions & 2 deletions docs/building-spark.md
@@ -12,8 +12,8 @@ redirect_from: "building-with-maven.html"
 ## Apache Maven

 The Maven-based build is the build of reference for Apache Spark.
-Building Spark using Maven requires Maven 3.6.0 and Java 8.
-Note that support for Java 7 was removed as of Spark 2.2.0.
+Building Spark using Maven requires Maven 3.6.2 and Java 8.
+Spark requires Scala 2.12; support for Scala 2.11 was removed in Spark 3.0.0.

 ### Setting up Maven's Memory Usage

2 changes: 1 addition & 1 deletion pom.xml
@@ -123,7 +123,7 @@
     <java.version>1.8</java.version>
     <maven.compiler.source>${java.version}</maven.compiler.source>
     <maven.compiler.target>${java.version}</maven.compiler.target>
-    <maven.version>3.6.0</maven.version>
+    <maven.version>3.6.2</maven.version>
     <sbt.project.name>spark</sbt.project.name>
     <slf4j.version>1.7.25</slf4j.version>
     <log4j.version>1.2.17</log4j.version>

@@ -15,7 +15,7 @@
 # limitations under the License.
 #

-FROM openjdk:8-alpine
+FROM openjdk:8-jdk-slim

 ARG spark_uid=185

@@ -27,14 +27,17 @@ ARG spark_uid=185
 # docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .

 RUN set -ex && \
-    apk upgrade --no-cache && \
-    apk add --no-cache bash tini krb5 krb5-libs && \
+    apt-get update && \
+    ln -s /lib /lib64 && \
+    apt install -y bash tini libc6 libpam-modules krb5-user libnss3 && \
     mkdir -p /opt/spark && \
     mkdir -p /opt/spark/work-dir && \
     touch /opt/spark/RELEASE && \
     rm /bin/sh && \
     ln -sv /bin/bash /bin/sh && \
-    chgrp root /etc/passwd && chmod ug+rw /etc/passwd
+    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
+    chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
+    rm -rf /var/cache/apt/*

 COPY jars /opt/spark/jars
 COPY bin /opt/spark/bin

@@ -25,7 +25,7 @@ USER 0

 RUN mkdir ${SPARK_HOME}/R

-RUN apk add --no-cache R R-dev
+RUN apt install -y r-base r-base-dev && rm -rf /var/cache/apt/*

 COPY R ${SPARK_HOME}/R
 ENV R_HOME /usr/lib/R

@@ -25,17 +25,15 @@ USER 0

 RUN mkdir ${SPARK_HOME}/python
 # TODO: Investigate running both pip and pip3 via virtualenvs
-RUN apk add --no-cache python && \
-    apk add --no-cache python3 && \
-    python -m ensurepip && \
-    python3 -m ensurepip && \
+RUN apt install -y python python-pip && \
+    apt install -y python3 python3-pip && \
     # We remove ensurepip since it adds no functionality since pip is
     # installed on the image and it just takes up 1.6MB on the image
     rm -r /usr/lib/python*/ensurepip && \
     pip install --upgrade pip setuptools && \
     # You may install with python3 packages by using pip3.6
     # Removed the .cache to save space
-    rm -r /root/.cache
+    rm -r /root/.cache && rm -rf /var/cache/apt/*

 COPY python/pyspark ${SPARK_HOME}/python/pyspark
 COPY python/lib ${SPARK_HOME}/python/lib

@@ -111,4 +111,4 @@ case "$1" in
 esac

 # Execute the container CMD under tini for better hygiene
-exec /sbin/tini -s -- "${CMD[@]}"
+exec /usr/bin/tini -s -- "${CMD[@]}"

@@ -22,6 +22,7 @@
 import java.nio.file.Files;
 import java.nio.file.StandardCopyOption;
 import java.util.List;
+import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.stream.Collectors;
 import java.util.stream.Stream;
 import org.gradle.api.DefaultTask;
@@ -73,8 +74,10 @@ public final void generateDockerFile() throws IOException {
         File currentDestDockerFile = getDestDockerFile();
         List<String> fileLines;
         try (Stream<String> rawLines = Files.lines(currentSrcDockerFile.toPath(), StandardCharsets.UTF_8)) {
+            AtomicBoolean isFirstFromCommand = new AtomicBoolean(true);
             fileLines = rawLines.map(line -> {
-                if (line.equals("FROM openjdk:8-alpine")) {
+                // The first command in any valid dockerfile must be a from instruction
+                if (line.startsWith("FROM ") && isFirstFromCommand.getAndSet(false)) {
                     return String.format("FROM %s", baseImage.get());
                 } else {
                     return line;

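The change above stops matching one hard-coded `FROM openjdk:8-alpine` line and instead rewrites whichever `FROM` line comes first. A minimal standalone sketch of that idea follows. The class name `FirstFromRewriteSketch`, the method `rewriteFirstFrom`, and the image `example/base:1` are hypothetical names chosen for illustration and are not part of this commit. It assumes a sequential stream, as in the task above, so the flag is flipped in line order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the Gradle task from this commit.
final class FirstFromRewriteSketch {

    // Returns a copy of the lines in which only the first line starting
    // with "FROM " is replaced by "FROM <baseImage>".
    static List<String> rewriteFirstFrom(List<String> lines, String baseImage) {
        // Lambdas can only capture effectively final locals, so the
        // "already replaced" flag lives in an AtomicBoolean.
        AtomicBoolean isFirstFrom = new AtomicBoolean(true);
        return lines.stream()
                // && short-circuits: getAndSet(false) runs only for FROM lines,
                // returning true the first time and false afterwards.
                .map(line -> line.startsWith("FROM ") && isFirstFrom.getAndSet(false)
                        ? String.format("FROM %s", baseImage)
                        : line)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> dockerfile = Arrays.asList(
                "# header comment",
                "FROM openjdk:8-jdk-slim",
                "ARG spark_uid=185",
                "FROM scratch");
        rewriteFirstFrom(dockerfile, "example/base:1").forEach(System.out::println);
    }
}
```

Running `main` prints the four lines with only the second one changed to `FROM example/base:1`; the later `FROM scratch` stage is left alone. Because the check looks for the first `FROM ` line rather than the first line of the file, leading comments or an `ARG` placed before `FROM` do not interfere.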
@@ -27,14 +27,17 @@ ARG spark_uid=185
 # docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .

 RUN set -ex && \
-    apk upgrade --no-cache && \
-    apk add --no-cache bash tini krb5 krb5-libs && \
+    apt-get update && \
+    ln -s /lib /lib64 && \
+    apt install -y bash tini libc6 libpam-modules krb5-user libnss3 && \
     mkdir -p /opt/spark && \
     mkdir -p /opt/spark/work-dir && \
     touch /opt/spark/RELEASE && \
     rm /bin/sh && \
     ln -sv /bin/bash /bin/sh && \
-    chgrp root /etc/passwd && chmod ug+rw /etc/passwd
+    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
+    chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
+    rm -rf /var/cache/apt/*

 COPY jars /opt/spark/jars
 COPY bin /opt/spark/bin

@@ -43,7 +43,7 @@ dependencies {
 }

 sparkDocker {
-    baseImage 'anapsix/alpine-java:8'
+    baseImage 'openjdk:8-jdk-slim'
     imageName 'docker.palantir.test/spark/spark-test-app'
     tags System.getProperty('docker-tag')
 }
