Automatically select Hadoop build to use for Spark #323

Merged 18 commits on Feb 21, 2021

nchammas (Owner) commented Jan 8, 2021

Spark releases are built against multiple versions of Hadoop. Keeping track of which build to use can be irritating. Flintrock will now use the selected Hadoop version to automatically pick the matching build of Spark.
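
To make the selection concrete, here is a minimal sketch of how a Hadoop version could be mapped to a Spark build and turned into a download URL. It is not the code from this PR. The function names, the `{v}` placeholder, the example.com URL, and the mapping table are all assumptions for illustration; the table covers only the `hadoop2.7` and `hadoop3.2` builds that Spark 3.0.x releases were published with.

```python
# Hypothetical mapping from Hadoop major version to the build label used in
# Spark 3.0.x archive names. Other Spark lines publish different labels.
HADOOP_BUILDS = {2: "hadoop2.7", 3: "hadoop3.2"}


def spark_archive_name(spark_version: str, hadoop_version: str) -> str:
    """Build the archive file name for the given Spark and Hadoop versions."""
    hadoop_major = int(hadoop_version.split(".")[0])
    try:
        build = HADOOP_BUILDS[hadoop_major]
    except KeyError:
        raise ValueError(
            f"No known Spark build for Hadoop {hadoop_version}"
        ) from None
    return f"spark-{spark_version}-bin-{build}.tgz"


def spark_download_url(
    download_source: str, spark_version: str, hadoop_version: str
) -> str:
    """Join a directory-style download source with the generated file name."""
    # Assumption: the download source is a directory URL in which `{v}`
    # stands for the Spark version.
    base = download_source.format(v=spark_version)
    return base.rstrip("/") + "/" + spark_archive_name(spark_version, hadoop_version)


print(spark_download_url("https://www.example.com/files/spark/{v}/", "3.0.1", "3.2.1"))
```

With a table like this, the user only states the Hadoop version, and the file name follows from it.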

This simplifies things for the user but changes the format of the two download-source options. Flintrock should smooth the transition between formats for users, for example by detecting the old format and raising a deprecation warning (see the TODO list below).

TODO:

  • Add changelog.
  • Bump major version.
  • Add test for new Hadoop build version utility.
  • Explain how file name is generated, with reference to dist.apache.org.
  • Detect old download source format and raise deprecation warning.
  • Print helper message on appropriate version of hadoop-aws to use. (?)
  • Show generated download URLs in debug output.
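
The deprecation and debug-output items above could look roughly like the following sketch. It assumes that an old-format download source points directly at an archive file while the new format points at a directory; this PR description does not spell out either format, so treat the check as a guess. The helper names and the option label are hypothetical.

```python
import logging
import warnings

logger = logging.getLogger(__name__)

# Assumption: an old-style source ends with an archive file name.
ARCHIVE_SUFFIXES = (".tgz", ".tar.gz", ".zip")


def is_old_download_source(download_source: str) -> bool:
    """Guess whether a download source uses the old, file-based format."""
    return download_source.rstrip("/").lower().endswith(ARCHIVE_SUFFIXES)


def check_download_source(download_source: str, option_name: str) -> str:
    """Warn about an old-format source, then log the source at debug level."""
    if is_old_download_source(download_source):
        warnings.warn(
            f"{option_name} appears to point at an archive file. "
            "This format is deprecated; point it at a directory instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    logger.debug("Download source for %s: %s", option_name, download_source)
    return download_source
```

Python hides `DeprecationWarning` by default outside of `__main__`, so a command-line tool may prefer to print its own message instead of relying on the `warnings` filter.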

Related: #322

nchammas force-pushed the hadoop-build-selection branch from 6dc9504 to 26b03c1 on February 13, 2021 06:09
nchammas marked this pull request as ready for review on February 21, 2021 03:06
nchammas merged commit 55ad7b9 into master on Feb 21, 2021
nchammas deleted the hadoop-build-selection branch on February 21, 2021 03:26