
feat(api): image building process improvement #404

Open
wants to merge 78 commits into
base: main

Conversation

naufalandika
Contributor

@naufalandika naufalandika commented Apr 8, 2025

Context

This PR improves the image building process for pyfunc-ensembler-service and pyfunc-ensembler-job. It removes the need to support specific Python versions; the Python version specified in the user's dependencies is used instead, similar to how Merlin builds the pyfunc-server and batch-predictor images.

The changes cover the imagebuilder, the pyfunc-ensembler-service / pyfunc-ensembler-job packages, and a small part of the SDK.

imagebuilder

  • Use a single base image regardless of the Python version the user chooses.
  • Inject the pyfunc-ensembler-service / pyfunc-ensembler-job dependency into the user's conda.yaml.
  • Hash the user's conda.yaml so that a conda.yaml with unchanged content does not get a different URL value, which would invalidate the Docker layer cache.
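
The content-hashing idea in the last bullet can be sketched as follows. This is a minimal Python illustration, not the Go code in imagebuilder.go; the function name `conda_env_cache_key` and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def conda_env_cache_key(conda_yaml: bytes) -> str:
    """Return a hex digest that depends only on the file's bytes (hypothetical helper)."""
    return hashlib.sha256(conda_yaml).hexdigest()


# Two files with identical content map to the same key,
# no matter where each copy was stored or fetched from.
first = b"dependencies:\n  - python=3.10\n"
second = b"dependencies:\n  - python=3.10\n"
changed = b"dependencies:\n  - python=3.9\n"

same_key = conda_env_cache_key(first) == conda_env_cache_key(second)
different_key = conda_env_cache_key(first) != conda_env_cache_key(changed)
```

A value derived from the content stays stable across builds with the same dependencies, so a build step keyed on it would only be invalidated when the dependencies actually change.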

pyfunc-ensembler-service / pyfunc-ensembler-job package

  • Add a new Makefile rule to build and publish the package to PyPI.
  • Add a new GitHub workflow that publishes the package to PyPI every time the sdk workflow runs successfully. pyfunc-ensembler-service and pyfunc-ensembler-job share the same version as the turing-sdk package, and both require the latest turing-sdk version, so their workflow must run only after the sdk workflow has completed.

sdk

  • Change the trigger of the sdk workflow from sdk/* tag pushes to python/* tag pushes. The python/* tag and its version will be used by the turing-sdk, pyfunc-ensembler-service, and pyfunc-ensembler-job packages.

Main Modifications

imagebuilder/imagebuilder.go:

  • use a single config for the --build-arg=BASE_IMAGE=%s value
  • hash the user's conda.yaml

engines/pyfunc-ensembler-*/Makefile:

  • add rule to build and publish pyfunc-ensembler-* package

engines/pyfunc-ensembler-*/app.Dockerfile:

  • inject turing-pyfunc-ensembler-* into the user's dependencies

.github/workflows/*:

  • add pushing a python/* tag as an sdk workflow trigger
  • run the pyfunc-ensembler-* workflow after the sdk workflow has run successfully

Muhammad Naufal Andika Natsir Putra added 25 commits March 18, 2025 15:14
@naufalandika naufalandika self-assigned this Apr 8, 2025
@naufalandika naufalandika force-pushed the DAT-2995_image_building_improvement branch from 00c1d89 to 65f9d3e Compare April 8, 2025 06:48
Contributor

@deadlycoconuts deadlycoconuts left a comment

Thanks a lot for all the refactoring and testing work that has gone into this PR! 🚀🙏🏼 Most of it looks good, but I left a couple of small comments here and there (the comments on the ensembler job stuff generally apply to the ensembler server stuff too). Feel free to merge this PR once you've addressed the comments!

Comment on lines 31 to 33
strategy:
  matrix:
    python-version: ["3.8", "3.9", "3.10"]
Contributor

Actually I think it's alright to keep this, so that we perform unit tests of the pyfunc ensembler job/service for different versions of Python 🤔

Contributor Author

Reverted

Comment on lines +6 to +9
workflow_run:
  workflows: ["sdk"]
  types:
    - completed
Contributor

Wow this is cool

Contributor Author

But I can't test this until the workflow is merged to main tho, hopefully it runs as expected

  || ( github.event_name != 'pull_request' )
- || ( github.event.pull_request.head.repo.full_name == github.repository )
+ || ( github.event.pull_request.head.repo.full_name == github.repository )) &&
+ ${{ github.event.workflow_run.conclusion == 'success' }}
Contributor

If I understand correctly, github.event.workflow_run refers to the SDK workflow right? If it is, then wouldn't the on.workflow_run.workflows thing you configured at the top of this file already guarantee that the workflow needs to be successful before this entire workflow can even begin running? 🤔 In other words, would ${{ github.event.workflow_run.conclusion == 'success' }} be redundant here?

Contributor Author

workflow_run:
    workflows: ["sdk"]
    types:
      - completed

The completed type here doesn't check whether the workflow's conclusion is success or failure, so I need to check the conclusion manually. The example in the documentation suggests that it works like that. Maybe my comment above the workflow_run is confusing.

run: |
  set -o pipefail
  make build-and-publish | tee output.log
  echo "pyfunc-ensembler-job=$(sed -n 's%Building docker image: \(.*\)%\1%p' output.log)" >> $GITHUB_OUTPUT
Contributor

Ooh, why do we need to store this as a GitHub output?

Contributor Author

I can't remember my reason for doing this haha, maybe because the Build Docker Image job saves the log to the output, so I also saved this job's log to the output. But if it isn't necessary I can delete this part. Also, setting the output using ::set-output is deprecated, and they ask us to set it using >> $GITHUB_OUTPUT instead.
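
For reference, the `sed` extraction and the `>> $GITHUB_OUTPUT` step in the quoted snippet amount to roughly the following, written in Python purely for illustration. The helper names and the example image name are hypothetical; the log line format is taken from the `sed` pattern in the snippet.

```python
import os
import re
import tempfile
from typing import Optional


def extract_image_name(log_text: str) -> Optional[str]:
    """Return the text after 'Building docker image: ' on the first matching line, if any."""
    match = re.search(r"Building docker image: (.*)", log_text)
    return match.group(1) if match else None


def append_output(name: str, value: str, output_path: str) -> None:
    """Append a name=value line, the format GitHub Actions reads from $GITHUB_OUTPUT."""
    with open(output_path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


# Made-up log content for the demonstration.
log = "Step 1/5\nBuilding docker image: registry.example/pyfunc-ensembler-job:v0.0.0\nDone\n"
image = extract_image_name(log)

# In a real workflow the path comes from the GITHUB_OUTPUT environment variable;
# a temporary file stands in for it here.
output_path = os.path.join(tempfile.mkdtemp(), "github_output")
if image is not None:
    append_output("pyfunc-ensembler-job", image, output_path)
```

A later step or job can only read the value if the step has an `id` and the value is referenced through `steps.<id>.outputs`, so the output is only useful when something downstream consumes it.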

Comment on lines 135 to 136
artifactServiceType string
artifactService artifact.Service
Contributor

I believe we can simplify this struct by removing the artifactServiceType field because it is the same as artifactService.GetType().

Contributor Author

Ah I missed that, thanks. Updated

Comment on lines -308 to +360
tt.artifactURI,
testArtifactURI,
Contributor

Lol thanks for realising that tt.artifactURI is the same for all test cases and for replacing it with a single re-usable variable instead.

Comment on lines 57 to 61
# Install yq
ENV YQ_VERSION=v4.42.1
RUN wget https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64 -O /usr/bin/yq && \
    chmod +x /usr/bin/yq

Contributor

ultra nit: can we move this up above the block that installs the gcloud SDK? So it looks similar to the Dockerfile found in Merlin? 😅

Contributor Author

hahaha sure no problem, updated!

Comment on lines -9 to +13
- setup: $(CONDA_ENV_NAME)
- $(CONDA_ENV_NAME):
-     @conda env update -f env-$(PYTHON_VERSION).yaml -n $(CONDA_ENV_NAME) --prune
-     $(ACTIVATE_ENV) && pip install -r requirements.dev.txt
+ setup: build
+     @pip install pipenv
+     @DIST_VERSION=$$(echo $(VERSION) | \
+         sed -E 's/^v([0-9]+\.[0-9]+\.[0-9]+)(-rc([0-9]+))?/\1rc\3/'); \
+     pipenv run pip install "dist/turing_pyfunc_ensembler_job-$${DIST_VERSION}-py3-none-any.whl[dev]"
Contributor

Just wondering, did you change all the conda virtual env stuff to pipenv to follow Merlin's implementation? I was wondering whether it's possible to just keep all the conda-related stuff 😅

Contributor Author

Yeah, I tried to mimic how Merlin does this, and they use pipenv. Should I rewrite it to use conda?
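
As an aside on the `DIST_VERSION` step in the quoted Makefile rule: the `sed` expression turns a git tag such as `v1.2.3-rc4` into the wheel version `1.2.3rc4`. Read literally, the substitution always emits `rc`, so a tag without an `-rc` suffix would come out as `1.2.3rc`. The Python sketch below shows the same mapping with that case handled separately; the helper name is hypothetical, and treating `v1.2.3` as `1.2.3` is an assumption about the intended behavior.

```python
import re
from typing import Optional

# Matches tags like "v1.2.3" or "v1.2.3-rc4" at the start of the string.
_TAG_PATTERN = re.compile(r"^v(\d+\.\d+\.\d+)(?:-rc(\d+))?")


def tag_to_dist_version(tag: str) -> Optional[str]:
    """Convert a release tag to a wheel-style version (hypothetical helper).

    Returns None when the tag does not start with a vX.Y.Z prefix.
    """
    match = _TAG_PATTERN.match(tag)
    if match is None:
        return None
    release, rc = match.groups()
    # Only append the "rc" marker when the tag actually carries an rc number.
    return f"{release}rc{rc}" if rc is not None else release


release_candidate = tag_to_dist_version("v1.2.3-rc4")
final_release = tag_to_dist_version("v1.2.3")
```

Unlike `sed`, which passes non-matching input through unchanged, this sketch returns `None` for a tag it does not recognize, which makes a malformed tag fail early rather than produce an unexpected wheel filename.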

@@ -5,3 +5,4 @@ mypy>=0.910
pytest<=8.1.2
pytest-cov
pylint
types-PyYAML
Contributor

Oh where is this used? :o

Contributor Author

Hmm I remember having an error saying types-PyYAML is not found, but I just retested it and it works fine now. Removed

Comment on lines 7 to 15
# get version from version.py
spec = importlib.util.spec_from_file_location(
    "pyfuncserver.version", os.path.join("version.py")
)

v_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(v_module)

version = v_module.VERSION
Contributor

Whoa this looks complex 😅 Can't we retrieve the version in a simpler way like this https://github.com/caraml-dev/turing/blob/main/sdk/setup.py#L5? 🤔

Contributor Author

updated
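
One common, shorter way to read a version constant in a setup script is to execute version.py into a plain dictionary and look up `VERSION`. The sketch below shows that pattern under stated assumptions: the helper name is hypothetical, and whether this matches the linked line in sdk/setup.py or the code after the update is not confirmed by this thread.

```python
import os
import tempfile


def read_version(version_file: str) -> str:
    """Execute a version.py-style file and return its VERSION value (hypothetical helper)."""
    namespace: dict = {}
    with open(version_file, encoding="utf-8") as fh:
        exec(fh.read(), namespace)
    return namespace["VERSION"]


# Demonstration against a throwaway file; the real version.py is not reproduced here.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "version.py")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write('VERSION = "0.0.0"\n')
    demo_version = read_version(path)
```

Compared with the `importlib.util` approach in the quoted snippet, this avoids building a module spec, at the cost of running the file's contents with `exec`.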

  uses: actions/setup-python@v5
  with:
-   python-version: ${{ matrix.python-version }}
+   python-version: "3.10"
Contributor

Is this intended to be 3.10, and why are we still hard-coding this specific version?

Contributor Author

I need to specify a Python version here for testing; there's no specific reason why I chose 3.10 tho haha. But it will be reverted to apply Zi's comment: https://github.com/caraml-dev/turing/pull/404/files/69419b6ebed4ebbd1c8c1ab0167b30d2395fe352#r2053718016

@@ -94,6 +95,8 @@ func NewAppContext(
// Init ensemblers service
ensemblersService := service.NewEnsemblersService(db)

artifactService, err := initArtifactService(cfg)
Contributor

Oh wait, is it expected that the error isn't being handled? 😮

Contributor Author

Thanks for catching this 🤝, updated

3 participants