[Website] Website deployment workflow (`deploy.yml`) is failing due to Node.js 18 version bump in `ubuntu-latest` GitHub Actions runner image and Webpack usage of `md4` hashing algorithm #325
Comments
FWIW, I'm in favor of both 2 & 3: we should control our upgrades and not have them unexpectedly change, but we should also follow a reasonable upgrade cycle and not use outdated cryptographic functions. That being said, the Arrow project in general lacks a surplus of front-end development skills, so option 3 makes the most sense to me if we only have the resources for 1 of the above.
@avantgardnerio - thanks for sharing your thoughts. I agree - 2 + 3 seems like the optimal solution. As you mentioned, if we are concerned with resourcing, starting with option 3 feels like a good first step which would unblock things in the short term. We could always choose to address 2 in a separate PR when resourcing permits.
take
…he/arrow-site website deployment workflow when pushing to master (#323) # Overview As part of apache/arrow#31142 and in response to the [recent rename of the `apache/arrow-site` repository default branch to `main`](https://issues.apache.org/jira/browse/INFRA-24242), this pull request removes support for triggering the website deployment workflow (`.github/workflows/deploy.yml`) on pushes to a branch with the name "master". # Qualification To qualify these changes, I verified that: 1. Pushing these changes to the `main` branch of `mathworks/arrow-site` [triggered the `deploy.yml` workflow as expected](https://github.com/mathworks/arrow-site/actions/runs/4296490412/jobs/7488276762). 2. The workflow step "Configure for GitHub Pages on push to main or master branch" has been renamed to "Configure for GitHub Pages on push to main branch" 3. The `deploy.yml` is no longer being triggered when commits are pushed to a branch named "master". # Future Directions 1. I will follow up with a separate PR to address apache/arrow#20161 # Notes **Note**: The [CI failures in the Build step](https://github.com/mathworks/arrow-site/actions/runs/4296490412/jobs/7488276762) are unrelated to this change. This is a result of apache/arrow#34379, which is being addressed separately. Closes apache/arrow#31412.
A quick status update: I've been actively working on implementing workaround 3. (run the website deployment workflow inside of an However, getting the entire website deployment workflow to run successfully inside of an I managed to get the workflow to succeed on the You can see the history of attempts at getting the workflow running here: https://github.com/mathworks/arrow-site/actions Two more important issues I ran into while trying to get the container approach working:
I suspect that 1. shouldn't be required to get the workflow running. It isn't clear to me at this point why this code was failing to create the Also, it seems like Given the issues / tradeoffs here, it will take some more time to get this working effectively. To unblock the website deployment for other PRs, we could consider pursuing workaround 1. or 2. described in this issue as short term solutions. I'm open to whatever approach the community prefers.
Update: By reverting back a debugging change I accidentally left behind in the This appears to have also resolved the issue where the workflow was unable to create a At this point, it seems like the main issue is reducing the time of the workflow (currently, it takes around 10 minutes) - possibly, through the use of caching.
Update: I've managed to reduce the time of the workflow to around 5 minutes by reverting back to using the It would still be great to reduce the workflow time further if possible.
Update: At this point, the only part of the workflow that is adding any significant additional time (around 1 minute) is the call to
Update: I've opened pull request #326 to address this issue.
…version of Node.js and Webpack 5.75.0 (#326) # Overview This pull request modifies the `apache/arrow-site` website deployment workflow (`.github/workflows/deploy.yml`) to use the latest LTS version of Node.js and Webpack 5.75.0 to work around the build issue described in #325. # Qualification To qualify these changes, I: 1. Submitted these changes to the `main` branch of the `mathworks/arrow-site` fork in order to trigger the `gh-pages` deployment workflow. I then selected `gh-pages` as the GitHub Pages deployment branch and verified that the site was deployed as expected to https://mathworks.github.io/arrow-site/. For an example of a successful workflow run, see: https://github.com/mathworks/arrow-site/actions/runs/4313253336/jobs/7524824999. 2. I inspected the GitHub Actions workflow steps to ensure there are no errors. # Future Directions 1. While qualifying with the [fork deployment workflow](https://github.com/apache/arrow-site#deployment), I realized that I needed to [manually change the GitHub Pages deployment branch](https://docs.github.com/en/pages/quickstart) from `asf-site` to `gh-pages` in the "Pages" settings of the `mathworks/arrow-site` fork. This wasn't immediately obvious, and it [isn't listed explicitly as a required step in the README.md](https://github.com/apache/arrow-site#deployment) of `apache/arrow-site`. It would be helpful to add an explicit note about this step. I've captured this as #327 and addressed it with PR #328. 2. As described in the "Workarounds" section of the description of #325, there is still more we could choose to do to address the root cause of these build failures (the deprecation of the `md4` hash algorithm in Node 18). This would include setting the `output.hashFunction` to `xxhash64` for Webpack. 3. We could move the workflow into a container to make it easier to reproduce the website build process on a local machine (see the discussion in the comments on this pull request). # Notes 1. Thank you @sgilmore10 for your help with this pull request! 2. Thank you to @avantgardnerio for your suggestion to move the deployment workflow inside of an `ubuntu:latest` container! Closes #325. --------- Co-authored-by: Sarah Gilmore <[email protected]> Co-authored-by: Sutou Kouhei <[email protected]>
Describe the bug, including details regarding any error messages, version, and platform.
See the following comment on apache/arrow#322.
A few weeks ago, the `apache/arrow-site` deployment workflow (`.github/workflows/deploy.yml`) started failing with the following output:
This appears to be related to the use of Webpack by `apache/arrow-site` and the following issues:
My high-level understanding is that in Node 18 (the build output above shows Node.js `v18.14.1` is being used), the `md4` hashing algorithm is deprecated (more specifically, it seems that Node 18 uses OpenSSL 3.0, which has deprecated `md4`) and the version of Webpack used by `apache/arrow-site` (`v5.21.2`) seems to default to using `md4`.
Webpack v5.61.0 added a WASM `md4` implementation as a fallback. However, the advice in webpack/webpack#14532 (comment) recommends setting `output.hashFunction` in the Webpack config to use an alternative hashing algorithm instead. Specifically, it recommends using `xxhash64` (which is planned to be the default hashing algorithm when Webpack 6 is released).
It seems that the version of Node.js in the `ubuntu-latest` GitHub Actions runner image (used by `deploy.yml`) was bumped to v18 on February 13, 2023. This would explain why this issue started appearing a few weeks ago.
Workarounds
There are a few different approaches we could pursue to address this issue:
1. We could pin the version of Node.js used by the GitHub Actions runner to v16 using the `actions/setup-node` action to work around this issue. Of course, this would mean we would be continuing to rely on an outdated version of Node.js, which doesn't seem ideal in the long term.
2. We could follow the advice in webpack/webpack#14532 (comment) ("nodejs 17: digital envelope routines::unsupported") and set `output.hashFunction` in the Webpack config to use an alternative hashing algorithm, like `xxhash64`.
3. We could follow the advice of @avantgardnerio in #322 (comment) ("DataFusion Substrait blog post") and move away from relying on the proprietary `ubuntu-latest` image, which is subject to sudden updates like the Node.js one that caused this issue. Instead, we can use the official `ubuntu:latest` container image (this is the approach followed by arrow-ballista). `ubuntu:latest` wouldn't have unexpected library updates, and it would also be possible to run the container image locally for debugging purposes.
Component(s)
Website