feat(ml): arm64 build for cuda #12456

Draft
wants to merge 2 commits into
base: main
2 changes: 1 addition & 1 deletion .github/workflows/docker.yml
@@ -105,7 +105,7 @@ jobs:
   - platforms: linux/amd64,linux/arm64
     device: cpu

-  - platforms: linux/amd64
+  - platforms: linux/amd64,linux/arm64
     device: cuda
     suffix: -cuda

24 changes: 18 additions & 6 deletions machine-learning/Dockerfile
@@ -17,7 +17,7 @@ RUN mkdir /opt/armnn && \

FROM builder-${DEVICE} AS builder

-ARG DEVICE
+ARG DEVICE TARGETARCH
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PIP_NO_CACHE_DIR=true \
@@ -32,7 +32,11 @@ RUN poetry config installer.max-workers 10 && \
RUN python3 -m venv /opt/venv

COPY poetry.lock pyproject.toml ./
-RUN poetry install --sync --no-interaction --no-ansi --no-root --with ${DEVICE} --without dev
+RUN if [ "$DEVICE" = "cuda" ] && [ "$TARGETARCH" = "arm64" ]; then \
+    # hack to work around poetry not setting the right filename for the wheel https://github.com/python-poetry/poetry/issues/4472
+    wget -q -O onnxruntime_gpu-1.18.0-cp311-cp311-manylinux_aarch64.whl https://nvidia.box.com/shared/static/fy55jvniujjbigr4gwkv8z1ma6ipgspg.whl; fi && \
+    poetry install --sync --no-interaction --no-ansi --no-root --with ${DEVICE} --without dev && \
+    if [ "$DEVICE" = "cuda" ] && [ "$TARGETARCH" = "arm64" ]; then rm onnxruntime_gpu-1.18.0-cp311-cp311-manylinux_aarch64.whl; fi

FROM python:3.11-slim-bookworm@sha256:5148c0e4bbb64271bca1d3322360ebf4bfb7564507ae32dd639322e4952a6b16 AS prod-cpu

@@ -49,13 +53,21 @@ RUN apt-get update && \
apt-get remove wget -yqq && \
rm -rf /var/lib/apt/lists/*

-FROM nvidia/cuda:12.2.2-runtime-ubuntu22.04@sha256:94c1577b2cd9dd6c0312dc04dff9cb2fdce2b268018abc3d7c2dbcacf1155000 AS prod-cuda
-
+FROM nvidia/cuda:12.2.2-runtime-ubuntu22.04@sha256:94c1577b2cd9dd6c0312dc04dff9cb2fdce2b268018abc3d7c2dbcacf1155000 AS prod-cuda-amd64
RUN apt-get update && \
apt-get install --no-install-recommends -yqq libcudnn9-cuda-12 && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

+FROM nvidia/cuda:12.2.2-runtime-ubuntu22.04@sha256:94c1577b2cd9dd6c0312dc04dff9cb2fdce2b268018abc3d7c2dbcacf1155000 AS prod-cuda-arm64
+RUN apt-get update && \
+    apt-get install --no-install-recommends -yqq libcudnn8 && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+ENV LD_LIBRARY_PATH=/usr/local/cuda-12/compat:$LD_LIBRARY_PATH
+
+FROM prod-cuda-${TARGETARCH} AS prod-cuda
+
COPY --from=builder-cuda /usr/local/bin/python3 /usr/local/bin/python3
COPY --from=builder-cuda /usr/local/lib/python3.11 /usr/local/lib/python3.11
COPY --from=builder-cuda /usr/local/lib/libpython3.11.so /usr/local/lib/libpython3.11.so
@@ -81,10 +93,10 @@ COPY --from=builder-armnn \
/opt/armnn/

FROM prod-${DEVICE} AS prod
-ARG DEVICE
+ARG DEVICE TARGETARCH

RUN apt-get update && \
-    apt-get install -y --no-install-recommends tini $(if ! [ "$DEVICE" = "openvino" ]; then echo "libmimalloc2.0"; fi) && \
+    apt-get install -y --no-install-recommends tini $(if ! { [ "$DEVICE" = "openvino" ] || { [ "$DEVICE" = "cuda" ] && [ "$TARGETARCH" = "arm64" ]; }; }; then echo "libmimalloc2.0"; fi) && \
apt-get autoremove -yqq && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
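
The package condition in the hunk above is dense, so here it is unpacked as a minimal sketch. `mimalloc_package` is a hypothetical helper name, not part of this PR, and it takes the two build args as positional parameters instead of reading `DEVICE` and `TARGETARCH` from the environment.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not in the PR) mirroring the apt-get condition above.
# Prints "libmimalloc2.0" unless the image is openvino, or cuda on arm64.
mimalloc_package() {
  device="$1"
  targetarch="$2"
  if [ "$device" = "openvino" ]; then
    return 0
  fi
  if [ "$device" = "cuda" ] && [ "$targetarch" = "arm64" ]; then
    return 0
  fi
  echo "libmimalloc2.0"
}
```

Under this reading, `mimalloc_package cuda amd64` prints the package name, while `mimalloc_package cuda arm64` and `mimalloc_package openvino amd64` print nothing.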
24 changes: 23 additions & 1 deletion machine-learning/poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 5 additions & 2 deletions machine-learning/pyproject.toml
@@ -4,7 +4,7 @@ version = "1.120.2"
description = ""
authors = ["Hau Tran <[email protected]>"]
readme = "README.md"
-packages = [{include = "app"}]
+packages = [{ include = "app" }]

[tool.poetry.dependencies]
python = ">=3.10,<4.0"
@@ -45,7 +45,10 @@ onnxruntime = "^1.15.0"
optional = true

[tool.poetry.group.cuda.dependencies]
-onnxruntime-gpu = {version = "^1.17.0", source = "cuda12"}
+onnxruntime-gpu = [
+  { version = "^1.17.0", source = "cuda12", markers = "platform_machine == 'x86_64'" },
+  { python = "3.11", path = "onnxruntime_gpu-1.18.0-cp311-cp311-manylinux_aarch64.whl", markers = "platform_machine == 'aarch64'" }
+]

[tool.poetry.group.openvino]
optional = true
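
This PR keys on two spellings of the same architecture: Docker's `TARGETARCH` build arg uses `amd64`/`arm64`, while the `platform_machine` marker and `$(arch)` on Linux report `x86_64`/`aarch64`. A minimal sketch of that mapping, using a hypothetical helper `machine_for_targetarch` that is not part of the PR:

```shell
#!/usr/bin/env sh
# Hypothetical helper (not in the PR): translate a Docker TARGETARCH value
# into the machine name that platform_machine / uname -m report on Linux.
machine_for_targetarch() {
  case "$1" in
    amd64) echo "x86_64" ;;
    arm64) echo "aarch64" ;;
    *) echo "unknown TARGETARCH: $1" >&2; return 1 ;;
  esac
}
```

Only the two architectures built by the workflow are covered; any other value is reported as an error.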
19 changes: 13 additions & 6 deletions machine-learning/start.sh
@@ -1,19 +1,26 @@
#!/usr/bin/env sh

-lib_path="/usr/lib/$(arch)-linux-gnu/libmimalloc.so.2"
-# mimalloc seems to increase memory usage dramatically with openvino, need to investigate
-if ! [ "$DEVICE" = "openvino" ]; then
+mimalloc="/usr/lib/$(arch)-linux-gnu/libmimalloc.so.2"
+if [ -f "$mimalloc" ]; then
+  export LD_PRELOAD="$mimalloc"
+fi
+
+if { [ "$DEVICE" = "cuda" ] && [ "$(arch)" = "aarch64" ]; }; then
+  lib_path="/usr/lib/$(arch)-linux-gnu/libmimalloc.so.2"
   export LD_PRELOAD="$lib_path"
   export LD_BIND_NOW=1
-  : "${MACHINE_LEARNING_WORKER_TIMEOUT:=120}"
-else
-  : "${MACHINE_LEARNING_WORKER_TIMEOUT:=300}"
 fi
+export LD_BIND_NOW=1

: "${IMMICH_HOST:=[::]}"
: "${IMMICH_PORT:=3003}"
: "${MACHINE_LEARNING_WORKERS:=1}"
: "${MACHINE_LEARNING_HTTP_KEEPALIVE_TIMEOUT_S:=2}"
+if [ "$DEVICE" = "openvino" ]; then
+  : "${MACHINE_LEARNING_WORKER_TIMEOUT:=300}"
+else
+  : "${MACHINE_LEARNING_WORKER_TIMEOUT:=120}"
+fi

gunicorn app.main:app \
-k app.config.CustomUvicornWorker \
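
The `start.sh` hunk above preloads mimalloc only when the library file is present. A minimal sketch of that guard in isolation, assuming a hypothetical helper `existing_lib` (not part of the PR) that prints the path when the file exists and fails otherwise. It uses `uname -m` where the script uses `arch`; both report the machine name on Linux.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not in the PR): print the path if it is a regular
# file, otherwise return non-zero so the caller can skip the preload.
existing_lib() {
  if [ -f "$1" ]; then
    echo "$1"
  else
    return 1
  fi
}

# Preload only when the library is actually installed in the image.
if lib="$(existing_lib "/usr/lib/$(uname -m)-linux-gnu/libmimalloc.so.2")"; then
  export LD_PRELOAD="$lib"
fi
```

Returning a status instead of exporting inside the helper keeps the check easy to exercise without touching `LD_PRELOAD`.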