Automatically select Hadoop build to use for Spark #323

nchammas · 2021-01-08T03:12:10Z

Spark releases are built against multiple versions of Hadoop. Keeping track of which build to use can be irritating. Flintrock will now use the selected Hadoop version to automatically pick the matching build of Spark.
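To illustrate the idea, here is a minimal sketch of mapping a Hadoop version to a Spark build. It is not the code in this PR: the function names `pick_hadoop_build` and `spark_file_name` are hypothetical, and the mapping assumes the two builds published for Spark 3.0.x (`hadoop2.7` and `hadoop3.2`).

```python
def pick_hadoop_build(hadoop_version: str) -> str:
    """Return the Spark build label for a Hadoop version such as '3.2.1'.

    Hypothetical helper. Assumes one Spark build per Hadoop major line,
    as published for Spark 3.0.x.
    """
    major = int(hadoop_version.split(".")[0])
    builds = {2: "hadoop2.7", 3: "hadoop3.2"}  # assumed mapping
    if major not in builds:
        raise ValueError(f"No known Spark build for Hadoop {hadoop_version}")
    return builds[major]


def spark_file_name(spark_version: str, hadoop_version: str) -> str:
    """Build the release archive name from the Spark and Hadoop versions."""
    build = pick_hadoop_build(hadoop_version)
    return f"spark-{spark_version}-bin-{build}.tgz"


if __name__ == "__main__":
    print(spark_file_name("3.0.1", "3.2.1"))
```

With a lookup like this, the user only chooses a Hadoop version and the Spark archive name follows from it.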

This simplifies things for the user but changes the format of the two download-source options. Flintrock should smooth the transition from the old format to the new one for users.

TODO:

- Add changelog.
- Bump major version.
- Add test for new Hadoop build version utility.
- Explain how file name is generated, with reference to dist.apache.org.
- Detect old download source format and raise deprecation warning.
- ~~Print helper message on appropriate version of hadoop-aws to use. (?)~~
- Show generated download URLs in debug output.
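The download-source items above could be handled roughly as follows. This is a sketch under stated assumptions, not the merged implementation: it assumes the old format is a URL that already ends in an archive file name, that the new format is a directory URL with a `{v}` version placeholder, and that the function name `resolve_download_url` and the `example.com` URLs are purely illustrative.

```python
import logging
import warnings

logger = logging.getLogger("download-source-sketch")


def resolve_download_url(source: str, version: str, file_name: str) -> str:
    """Return the full download URL for a release archive.

    Hypothetical helper. Assumes '{v}' in the source is the version
    placeholder and that old-style sources end in an archive file name.
    """
    url = source.format(v=version)
    if url.endswith((".tgz", ".tar.gz")):
        # Old format (assumed): the file name is already part of the URL.
        warnings.warn(
            "Download source includes a file name; "
            "specify only the directory URL instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    else:
        # New format (assumed): append the generated file name.
        url = url.rstrip("/") + "/" + file_name
    logger.debug("Download URL: %s", url)
    return url


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    resolve_download_url(
        "https://example.com/spark/{v}/",
        "3.0.1",
        "spark-3.0.1-bin-hadoop3.2.tgz",
    )
```

Appending the file name only when it is missing keeps old-style configurations working while the warning nudges users toward the new format.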

Related: #322

nchammas added 3 commits January 7, 2021 22:07

automatically select hadoop build to use for spark

ee92dae

revert unrelated change

d125257

Merge branch 'master' of github.com:nchammas/flintrock into hadoop-bu…

26b03c1

…ild-selection

nchammas force-pushed the hadoop-build-selection branch from 6dc9504 to 26b03c1 Compare February 13, 2021 06:09

nchammas added 14 commits February 13, 2021 01:10

missing space

096f75a

fix template

186cd75

clarify file name in help text

d2c9bdd

Merge branch 'master' into hadoop-build-selection

8f0cc89

only append file name if not already specified

d842cee

show download URLs in debug output

22360f9

remove debug print

dca0098

warn on old download source format

22d6ddf

Merge branch 'master' into hadoop-build-selection

f67144a

source hadoop build profile automatically

8639cf2

add change log

ea6277f

clarify dist.apache.org locations

c91dff4

can't support without-hadoop now

02523c2

add test for spark_hadoop_build_version

194cede

nchammas marked this pull request as ready for review February 21, 2021 03:06

show full dist.apache.org url in config template

d0e87a6

nchammas merged commit 55ad7b9 into master Feb 21, 2021

nchammas deleted the hadoop-build-selection branch February 21, 2021 03:26
