Skip to content

Added Dockerfile and accessing mdx2 through Docker container#57

Merged
jlee733 merged 12 commits intomainfrom
jlee-org-launch
Mar 11, 2026
Merged

Added Dockerfile and accessing mdx2 through Docker container#57
jlee733 merged 12 commits intomainfrom
jlee-org-launch

Conversation

@jlee733
Copy link
Copy Markdown

@jlee733 jlee733 commented Mar 10, 2026

Dockerfile Changes
-Expose ports 8880-8890
-Now users can set their own Jupyter port (the same as the container port [ports 8880-8890])
-Git cloned dials-extensions and pinned commit
-A Jupyter lab server is now launched at the creation of the docker container
*dockerfile is now primarily used for CI through a GitHub action that rebuilds the image after each push to main
*Updated README

Summary by CodeRabbit

  • New Features

    • Docker containerized development environment with Jupyter Lab and pre-configured scientific tools
    • Multi-platform support with accessible ports 8880-8890
  • Documentation

    • Updated README with Docker installation and setup instructions
  • Chores

    • Automated Docker image building and publishing workflow

jlee733 and others added 5 commits March 10, 2026 13:24
- Builds and pushes Docker images on main branch and PRs
- Main branch pushes :latest tag
- PRs push :test tag
- Uses DOCKERHUB_USERNAME var and DOCKERHUB_TOKEN secret
…ports, launch of jupyter lab server, users now specify their Jupyter port), updated README, github action pulls and pushes docker image to/from docker hub when pushing to main
@jlee733 jlee733 requested review from bielj and spmeisburger March 10, 2026 14:52
@jlee733 jlee733 self-assigned this Mar 10, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • ✅ Review completed - (🔄 Check again to review again)

Walkthrough

This pull request adds Docker containerization support to the project. It introduces a GitHub Actions workflow for building and pushing Docker images, a multi-stage Dockerfile defining a Python development environment with Jupyter Lab, and updates the README with Docker-based installation instructions.

Changes

Cohort / File(s) Summary
Docker CI/CD Infrastructure
.github/workflows/docker.yml, .github/Dockerfile
Adds GitHub Actions workflow to build and push Docker images on pushes to main and pull requests, with conditional tagging (main → latest, others → test). Multi-stage Dockerfile sets up Python 3.10 development environment with micromamba, installs conda dependencies from env.yaml, pins dials-extensions, and runs Jupyter Lab on port 8880-8890.
Documentation
README.md
Extends Prerequisites section with Docker requirement and adds new Developer Install (Docker container) section with steps to pull, run, and access the Docker image including shell and micromamba activation instructions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A container so snug, with dependencies tucked, ✨
From Python to Jupyter, not a thing out of place,
Docker builds swift as a hop through the space,
Multi-stages arranged in an elegant trace,
A dev environment bundled with computational grace! 🐋

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main changes: adding a Dockerfile and enabling Docker container access to mdx2, which aligns with the changeset introducing .github/Dockerfile, docker.yml workflow, and Docker documentation.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch jlee-org-launch

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (5)
.github/workflows/docker.yml (1)

34-38: Consider specifying the Dockerfile path and build context explicitly.

While defaults work when Dockerfile is at repository root, explicitly specifying context and file improves clarity and prevents issues if the file is renamed/moved.

♻️ Proposed enhancement
       - name: Build and push
         uses: docker/build-push-action@v6
         with:
+          context: .
+          file: ./dockerfile
           push: ${{ github.ref == 'refs/heads/main' }}
           tags: ${{ steps.tags.outputs.tags }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker.yml around lines 34 - 38, The "Build and push" step
using docker/build-push-action@v6 should explicitly set the build context and
Dockerfile path to avoid relying on defaults; update the step named "Build and
push" (docker/build-push-action@v6) to include with: context: <repo root or
desired path> and file: <path/to/Dockerfile> so the action always uses the
intended build context and Dockerfile even if the Dockerfile is moved or
renamed.
README.md (1)

77-86: Provide example values for placeholder variables.

The placeholders {CONTAINER_PORT} and {HOST_PORT} may confuse users. Consider adding a concrete example or explaining valid port ranges.

📝 Suggested clarification
 Pull image from Docker Hub and run the container. Search the Docker container logs for the localhost address to access mdx2 within a Jupyter Lab notebook or terminal.
   a. Exposes ports 8880-8890
   b. Platform supported: linux/amd64
+
+Replace `{HOST_PORT}` with your desired local port (e.g., 8880) and `{CONTAINER_PORT}` with a port in the 8880-8890 range.
+
 ```bash
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 77 - 86, Update the README's docker run example to
replace the ambiguous placeholders {CONTAINER_PORT} and {HOST_PORT} with
concrete example values and a short note about valid ranges: show a typical
example such as using container port 8888 and host port 8888 (e.g., in the
docker run command string that contains --name 'mdx2_container' -e
JUPYTER_PORT={CONTAINER_PORT} -p {HOST_PORT}:{CONTAINER_PORT}) and add a brief
parenthetical that valid ports are 1024–65535 (or common exposed range 8880–8890
mentioned above); ensure the example uses the same tag
(diffuseproject/mdx2:1.0.0 or :latest) as in the snippet to avoid confusion.
dockerfile (3)

37-37: Jupyter runs as root; consider creating a non-root user for improved security.

The --allow-root flag enables running Jupyter as root, which is a security concern in shared or production-like environments. For development containers this is often acceptable, but consider documenting this trade-off or adding a non-root user for stricter environments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dockerfile` at line 37, The CMD currently launches Jupyter via micromamba
with the --allow-root flag which permits running as root; instead add a non-root
user in the Dockerfile (create user/group, chown relevant dirs and micromamba
environments or conda env paths) and switch to that user with USER before the
CMD, then run micromamba run -n mdx2-dev jupyter lab without --allow-root so
Jupyter runs unprivileged; also ensure workspace, notebooks and any runtime
ports are writable by the new user and add a brief comment documenting the
security trade-off if a root Jupyter is ever required for development.

32-33: Consider combining pip install commands into a single RUN layer.

Two separate RUN instructions create extra layers. Combining them reduces image size slightly.

♻️ Proposed fix
-RUN /usr/local/bin/micromamba run -n mdx2-dev pip install git+https://github.com/FlexXBeamline/dials-extensions.git@963fe9e458a505a8443c988d45fb8dcb4532768f
-RUN /usr/local/bin/micromamba run -n mdx2-dev pip install -e .
+RUN /usr/local/bin/micromamba run -n mdx2-dev pip install \
+    git+https://github.com/FlexXBeamline/dials-extensions.git@963fe9e458a505a8443c988d45fb8dcb4532768f \
+    && /usr/local/bin/micromamba run -n mdx2-dev pip install -e .
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dockerfile` around lines 32 - 33, Replace the two separate RUN instructions
that call "/usr/local/bin/micromamba run -n mdx2-dev pip install ..." with a
single RUN that installs both targets in one pip invocation to avoid an extra
image layer; specifically combine the
git+https://github.com/FlexXBeamline/dials-extensions.git@963fe9e458a505a8443c988d45fb8dcb4532768f
install and the editable install (-e .) into one "/usr/local/bin/micromamba run
-n mdx2-dev pip install" command so both packages are installed in the same
layer.

9-9: Consider cleaning apt cache to reduce image size.

The apt-get install leaves cached files in /var/lib/apt/lists/, bloating the image.

🧹 Proposed fix
-RUN apt-get update && apt-get install -y ca-certificates git
+RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates git \
+    && rm -rf /var/lib/apt/lists/*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dockerfile` at line 9, The RUN apt-get update && apt-get install -y
ca-certificates git layer leaves apt caches in /var/lib/apt/lists/ and increases
image size; update the RUN instruction that installs packages to (1) use
--no-install-recommends when installing ca-certificates and git, (2) run apt-get
clean, and (3) remove /var/lib/apt/lists/* after installation so cached package
lists are deleted and the final image is smaller.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docker.yml:
- Around line 12-38: The workflow is missing the repository checkout so
docker/build-push-action cannot access the Dockerfile; add an
actions/checkout@v4 step before the build/push step (i.e., insert a step named
"Checkout repository" using actions/checkout immediately before the "Set up
QEMU"/"Set up Docker Buildx" or at least before the "Build and push" step) so
that the build action (docker/build-push-action@v6) has the repository context
and Dockerfile available.
- Around line 13-17: The Docker Hub login step ("Login to Docker Hub" using
docker/login-action@v3) will fail for forked-PRs because secrets.DOCKERHUB_TOKEN
is not available; add a conditional to that job/step to skip it when the
token/PR context is absent (for example: if: ${{ github.event.pull_request ==
null }} or if: ${{ secrets.DOCKERHUB_TOKEN != '' }}) and keep using
DOCKERHUB_USERNAME/DOCKERHUB_TOKEN only when present; also update the subsequent
build/push step so it only performs a docker push when the same condition is
true (i.e., build-only for forked PRs and push when login/secret is available).

In `@dockerfile`:
- Line 15: The ENV JUPYTER_PORT default (JUPYTER_PORT) is set to 8888 but the
Dockerfile EXPOSE range only covers 8880-8890, so update the Dockerfile to keep
these in sync: either change ENV JUPYTER_PORT to a value inside the exposed
range (e.g., 8880) or expand the EXPOSE directive to include 8888 (e.g., EXPOSE
8880-8890 -> include 8888 or widen the range). Make the same corresponding
change where JUPYTER_PORT is declared elsewhere (the other occurrence mentioned)
so both declarations match the EXPOSE range.

In `@README.md`:
- Line 47: Typo: change "installed" to "installation" in the README sentence.
Edit the sentence containing "The Docker-based installed requires you to install
and login to [Docker](https://www.docker.com/get-started/)." and replace
"installed" with "installation" so it reads "The Docker-based installation
requires you to install and login to
[Docker](https://www.docker.com/get-started/)."
- Around line 80-86: The README contains two issues: the volume mount string "-v
\"$(pwd)\":/ home/dev" has an errant space and should be corrected to "-v
\"$(pwd)\":/home/dev", and the image tag is inconsistent between "docker pull
diffuseproject/mdx2:1.0.0" and "docker run ... diffuseproject/mdx2:latest"; make
the tags match (e.g., use ":1.0.0" in both the docker pull and docker run
commands or change both to ":latest") so the pulled image and the run command
refer to the same tag.

---

Nitpick comments:
In @.github/workflows/docker.yml:
- Around line 34-38: The "Build and push" step using docker/build-push-action@v6
should explicitly set the build context and Dockerfile path to avoid relying on
defaults; update the step named "Build and push" (docker/build-push-action@v6)
to include with: context: <repo root or desired path> and file:
<path/to/Dockerfile> so the action always uses the intended build context and
Dockerfile even if the Dockerfile is moved or renamed.

In `@dockerfile`:
- Line 37: The CMD currently launches Jupyter via micromamba with the
--allow-root flag which permits running as root; instead add a non-root user in
the Dockerfile (create user/group, chown relevant dirs and micromamba
environments or conda env paths) and switch to that user with USER before the
CMD, then run micromamba run -n mdx2-dev jupyter lab without --allow-root so
Jupyter runs unprivileged; also ensure workspace, notebooks and any runtime
ports are writable by the new user and add a brief comment documenting the
security trade-off if a root Jupyter is ever required for development.
- Around line 32-33: Replace the two separate RUN instructions that call
"/usr/local/bin/micromamba run -n mdx2-dev pip install ..." with a single RUN
that installs both targets in one pip invocation to avoid an extra image layer;
specifically combine the
git+https://github.com/FlexXBeamline/dials-extensions.git@963fe9e458a505a8443c988d45fb8dcb4532768f
install and the editable install (-e .) into one "/usr/local/bin/micromamba run
-n mdx2-dev pip install" command so both packages are installed in the same
layer.
- Line 9: The RUN apt-get update && apt-get install -y ca-certificates git layer
leaves apt caches in /var/lib/apt/lists/ and increases image size; update the
RUN instruction that installs packages to (1) use --no-install-recommends when
installing ca-certificates and git, (2) run apt-get clean, and (3) remove
/var/lib/apt/lists/* after installation so cached package lists are deleted and
the final image is smaller.

In `@README.md`:
- Around line 77-86: Update the README's docker run example to replace the
ambiguous placeholders {CONTAINER_PORT} and {HOST_PORT} with concrete example
values and a short note about valid ranges: show a typical example such as using
container port 8888 and host port 8888 (e.g., in the docker run command string
that contains --name 'mdx2_container' -e JUPYTER_PORT={CONTAINER_PORT} -p
{HOST_PORT}:{CONTAINER_PORT}) and add a brief parenthetical that valid ports are
1024–65535 (or common exposed range 8880–8890 mentioned above); ensure the
example uses the same tag (diffuseproject/mdx2:1.0.0 or :latest) as in the
snippet to avoid confusion.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 39d511d1-7928-4c26-a271-cf215ff94b5c

📥 Commits

Reviewing files that changed from the base of the PR and between 76d439d and 639e12e.

📒 Files selected for processing (3)
  • .github/workflows/docker.yml
  • README.md
  • dockerfile

Comment thread .github/workflows/docker.yml
Comment on lines +13 to +17
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Docker Hub login will fail for PRs from forks.

Secrets (DOCKERHUB_TOKEN) are not available for pull requests from forked repositories. This will cause the login step to fail for external contributors. Consider adding a condition to skip login for fork PRs or restructuring the workflow.

🔧 Proposed fix: Skip login when secrets unavailable
       - name: Login to Docker Hub
+        if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository
         uses: docker/login-action@v3
         with:
           username: ${{ vars.DOCKERHUB_USERNAME }}
           password: ${{ secrets.DOCKERHUB_TOKEN }}

And update the build step to only build (not push) when login is skipped:

       - name: Build and push
         uses: docker/build-push-action@v6
         with:
-          push: ${{ github.ref == 'refs/heads/main' }}
+          push: ${{ github.ref == 'refs/heads/main' && (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository) }}
           tags: ${{ steps.tags.outputs.tags }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker.yml around lines 13 - 17, The Docker Hub login step
("Login to Docker Hub" using docker/login-action@v3) will fail for forked-PRs
because secrets.DOCKERHUB_TOKEN is not available; add a conditional to that
job/step to skip it when the token/PR context is absent (for example: if: ${{
github.event.pull_request == null }} or if: ${{ secrets.DOCKERHUB_TOKEN != ''
}}) and keep using DOCKERHUB_USERNAME/DOCKERHUB_TOKEN only when present; also
update the subsequent build/push step so it only performs a docker push when the
same condition is true (i.e., build-only for forked PRs and push when
login/secret is available).

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an issue, because right now we want a more rigorous review process of changes being made to the mdx2 repo

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jlee733, understood! That's a valid design decision — restricting contributions to internal collaborators ensures tighter control over the CI/CD pipeline and the changes that trigger it. I'll keep this in mind for future reviews.


🧠 Learnings used
Learnt from: jlee733
Repo: diff-use/mdx2 PR: 57
File: README.md:82-83
Timestamp: 2026-03-11T14:12:25.456Z
Learning: In the diff-use/mdx2 repository, the Docker image (diffuseproject/mdx2) does NOT have mdx2 pre-installed. The `git clone` step in the README's "Developer Install (Docker container)" section is required so the local repository can be mounted into the container via the volume flag (`-v "$(pwd)":/home/dev`).

Comment thread .github/Dockerfile
COPY --from=stage2 /bin/micromamba /usr/local/bin/micromamba

ENV PATH="/usr/local/bin:/opt/conda/bin:$PATH"
ENV JUPYTER_PORT=8888
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Default JUPYTER_PORT (8888) is outside the exposed range (8880-8890).

The default JUPYTER_PORT=8888 doesn't fall within the EXPOSE 8880-8890 range. Users relying on the default will have Jupyter running on an unexposed port. Either change the default to a port within the exposed range (e.g., 8880) or expand the exposed range to include 8888.

🔧 Proposed fix (Option 1: Change default port)
-ENV JUPYTER_PORT=8888
+ENV JUPYTER_PORT=8880

Also applies to: 35-35

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dockerfile` at line 15, The ENV JUPYTER_PORT default (JUPYTER_PORT) is set to
8888 but the Dockerfile EXPOSE range only covers 8880-8890, so update the
Dockerfile to keep these in sync: either change ENV JUPYTER_PORT to a value
inside the exposed range (e.g., 8880) or expand the EXPOSE directive to include
8888 (e.g., EXPOSE 8880-8890 -> include 8888 or widen the range). Make the same
corresponding change where JUPYTER_PORT is declared elsewhere (the other
occurrence mentioned) so both declarations match the EXPOSE range.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8888 is in range (8880-8890)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jlee733, you're absolutely right — 8888 falls within the inclusive range 8880–8890. My original comment was incorrect. Apologies for the noise!

Comment thread README.md Outdated
Comment thread README.md
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/docker.yml (1)

28-35: Consider more descriptive tags for PR builds.

All non-main builds will overwrite the same test tag, making it impossible to distinguish between different PR builds. For better traceability, consider using the PR number or short SHA:

       - name: Determine Docker tags
         id: tags
         run: |
           if [ "${{ github.ref }}" = "refs/heads/main" ]; then
             echo "tags=${{ vars.DOCKERHUB_USERNAME }}/mdx2:latest" >> $GITHUB_OUTPUT
+          elif [ "${{ github.event_name }}" = "pull_request" ]; then
+            echo "tags=${{ vars.DOCKERHUB_USERNAME }}/mdx2:pr-${{ github.event.pull_request.number }}" >> $GITHUB_OUTPUT
           else
             echo "tags=${{ vars.DOCKERHUB_USERNAME }}/mdx2:test" >> $GITHUB_OUTPUT
           fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker.yml around lines 28 - 35, The current "Determine
Docker tags" step (id: tags) always sets non-main builds to a static "test" tag
which causes overwrites; update the logic to generate a unique tag for
PRs/branch builds by using either the PR number
(github.event.pull_request.number) when present or a short commit SHA
(github.sha slice) and append it to the base image name from
vars.DOCKERHUB_USERNAME so tags become e.g. "${{ vars.DOCKERHUB_USERNAME
}}/mdx2:pr-<number>" or "mdx2:<short-sha>" instead of the single "test" tag, and
write that value to $GITHUB_OUTPUT the same way the current branch does.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docker.yml:
- Around line 37-41: The Build and push step using docker/build-push-action@v6
will fail on case-sensitive filesystems because it defaults to "Dockerfile" —
add the file: attribute pointing to the lowercase "dockerfile" and keep context
if needed; also enable build caching by adding cache-from and cache-to (or
cache: true) entries so the action reuses image layers between runs (reference
the same tags output from steps.tags.outputs.tags when using registry caching).
Ensure you update the "Build and push" step to include file: dockerfile and
appropriate cache-from/cache-to fields to enable registry layer caching.

---

Nitpick comments:
In @.github/workflows/docker.yml:
- Around line 28-35: The current "Determine Docker tags" step (id: tags) always
sets non-main builds to a static "test" tag which causes overwrites; update the
logic to generate a unique tag for PRs/branch builds by using either the PR
number (github.event.pull_request.number) when present or a short commit SHA
(github.sha slice) and append it to the base image name from
vars.DOCKERHUB_USERNAME so tags become e.g. "${{ vars.DOCKERHUB_USERNAME
}}/mdx2:pr-<number>" or "mdx2:<short-sha>" instead of the single "test" tag, and
write that value to $GITHUB_OUTPUT the same way the current branch does.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 63b4695a-7a29-4e64-864b-d692af55bc9a

📥 Commits

Reviewing files that changed from the base of the PR and between 639e12e and b039712.

📒 Files selected for processing (2)
  • .github/workflows/docker.yml
  • README.md
✅ Files skipped from review due to trivial changes (1)
  • README.md

Comment thread .github/workflows/docker.yml
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Around line 77-79: Update the README to explicitly state that Jupyter Lab
inside the container listens on port 8888 (as set in the Dockerfile) and that
the default docker run mapping uses 8888:8888; replace the vague "search the
Docker container logs" instruction with guidance that the Jupyter Lab URL will
be printed to the container console output (docker logs or stdout) and explain
how to choose a different host port by changing the mapping from "8888:8888" to
"{HOST_PORT}:8888" when running the container; mention the exposed range
8880-8890 only as available host port suggestions but clarify 8888 is the
container port Jupyter uses.
- Around line 81-82: Update the README around the git clone lines ("git clone
https://github.com/diff-use/mdx2.git && \\" and "cd mdx2 && \\") to clarify why
cloning is shown: state explicitly whether cloning is required to access bundled
examples/test-data, or optional for mounting local code during development, and
provide two brief alternatives (1) a note that cloning is optional if you only
want to run the preinstalled package from the Docker container, or (2)
instructions saying “Clone the repo to access examples and mount it into the
container for development” so users aren’t confused about the
Dockerfile-installed mdx2.
- Around line 81-86: Update the README's Docker command chain so it isn't
broken: stop chaining docker run and docker logs with &&; either remove the
final "docker logs 'mdx2_container'" or run the container detached (remove -it
and add -d and drop --rm) if you need to call docker logs later. Replace the
unclear placeholders {CONTAINER_PORT} and {HOST_PORT} with the concrete default
Jupyter port 8888 in the docker run command (e.g., -e JUPYTER_PORT=8888 -p
8888:8888) and add a short note that users can change the host side (8888:8888
-> {HOST_PORT}:8888) to map a different local port; keep the container name
'mdx2_container' as shown if you intend to reference it later.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 414e80a8-5509-4eb3-8f6b-5a305c82764c

📥 Commits

Reviewing files that changed from the base of the PR and between b039712 and a41afe8.

📒 Files selected for processing (1)
  • README.md

Comment thread README.md Outdated
Comment on lines +77 to +79
Pull image from Docker Hub and run the container. Search the Docker container logs for the localhost address to access mdx2 within a Jupyter Lab notebook or terminal.
a. Exposes ports 8880-8890
b. Platform supported: linux/amd64
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clarify port configuration for better user experience.

The documentation mentions exposing ports 8880-8890 but doesn't explain which specific port Jupyter uses (8888 by default, per the Dockerfile) or how users should choose a port from this range. The instruction to "search the Docker container logs" is also vague.

📝 Suggested improvement

Consider revising to:

Pull image from Docker Hub and run the container. The Jupyter Lab URL will be printed to the console output.

**Configuration:**
- Default Jupyter port: 8888 (mapped to host port 8888)
- Supported platform: linux/amd64
- To use a different host port, replace `8888:8888` with `{HOST_PORT}:8888` in the command below

This makes it clear that Jupyter runs on port 8888 inside the container and provides actionable guidance for customization.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 77 - 79, Update the README to explicitly state that
Jupyter Lab inside the container listens on port 8888 (as set in the Dockerfile)
and that the default docker run mapping uses 8888:8888; replace the vague
"search the Docker container logs" instruction with guidance that the Jupyter
Lab URL will be printed to the container console output (docker logs or stdout)
and explain how to choose a different host port by changing the mapping from
"8888:8888" to "{HOST_PORT}:8888" when running the container; mention the
exposed range 8880-8890 only as available host port suggestions but clarify 8888
is the container port Jupyter uses.

Comment thread README.md Outdated
Comment on lines +81 to +82
git clone https://github.com/diff-use/mdx2.git && \
cd mdx2 && \
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clarify the purpose of git clone in Docker workflow.

The Docker container already has mdx2 installed (per the Dockerfile), so it's unclear why users need to clone the repository. If the intent is to:

  • Mount local code for development → explain this explicitly
  • Access examples or test data → state this clearly
  • Neither → consider making the clone step optional

The current workflow may confuse users who expect a Docker-based install to "just work" without cloning.

💡 Suggested clarification

Option 1: If cloning is required for examples:

Clone the repository to access examples and mount it into the container:

Option 2: If cloning is optional:

(Optional) Clone the repository to access examples or mount your own code:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 81 - 82, Update the README around the git clone lines
("git clone https://github.com/diff-use/mdx2.git && \\" and "cd mdx2 && \\") to
clarify why cloning is shown: state explicitly whether cloning is required to
access bundled examples/test-data, or optional for mounting local code during
development, and provide two brief alternatives (1) a note that cloning is
optional if you only want to run the preinstalled package from the Docker
container, or (2) instructions saying “Clone the repo to access examples and
mount it into the container for development” so users aren’t confused about the
Dockerfile-installed mdx2.

Comment thread README.md Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
Dockerfile (4)

24-26: Potential dependency conflict from split installs.

The environment is created from env.yaml which pins python >=3.10, <3.11, then additional packages (nexpy, jupyterlab, dials, xia2, etc.) are installed separately. This two-step approach can cause the solver to downgrade or change dependencies from the initial environment. Consider consolidating all dependencies into env.yaml or using --freeze-installed to prevent modifications to already-installed packages.

♻️ Proposed fix using --freeze-installed
 RUN /usr/local/bin/micromamba create -f env.yaml -n mdx2-dev && \
-    /usr/local/bin/micromamba install -y -n mdx2-dev nexpy jupyterlab jupyterlab-h5web dials xia2 wget tar -c conda-forge && \
+    /usr/local/bin/micromamba install -y -n mdx2-dev --freeze-installed nexpy jupyterlab jupyterlab-h5web dials xia2 wget tar -c conda-forge && \
     /usr/local/bin/micromamba clean --all --yes
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` around lines 24 - 26, The Dockerfile currently creates the
micromamba env from env.yaml then separately installs extra packages which can
let the solver alter pinned versions; update the RUN line that calls
"/usr/local/bin/micromamba create -f env.yaml -n mdx2-dev" and the subsequent
"/usr/local/bin/micromamba install -y -n mdx2-dev ..." step so that either (a)
all packages (nexpy, jupyterlab, jupyterlab-h5web, dials, xia2, wget, tar, etc.)
are consolidated into env.yaml and created in one step, or (b) keep the split
but add the "--freeze-installed" flag to the "/usr/local/bin/micromamba install
-y -n mdx2-dev" invocation to prevent modifying already-installed pinned
dependencies for the mdx2-dev environment.

9-9: Add --no-install-recommends to reduce image size.

The apt-get install command is missing the --no-install-recommends flag, which pulls in unnecessary recommended packages and bloats the image.

♻️ Proposed fix
-RUN apt-get update && apt-get install -y ca-certificates git
+RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates git && \
+    rm -rf /var/lib/apt/lists/*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` at line 9, Update the Dockerfile RUN instruction that currently
reads "RUN apt-get update && apt-get install -y ca-certificates git" to add the
--no-install-recommends flag so apt-get installs only essential packages (i.e.,
change to use "apt-get install -y --no-install-recommends ca-certificates git");
keep the existing apt-get update and retain any subsequent cleanup (e.g.,
removal of apt lists) as-is to minimize image size.

28-33: Consider combining RUN instructions to reduce layers.

The two pip install commands can be combined into a single RUN instruction to reduce the number of image layers.

♻️ Proposed fix
-RUN /usr/local/bin/micromamba run -n mdx2-dev pip install git+https://github.com/FlexXBeamline/dials-extensions.git@963fe9e458a505a8443c988d45fb8dcb4532768f
-RUN /usr/local/bin/micromamba run -n mdx2-dev pip install -e .
+RUN /usr/local/bin/micromamba run -n mdx2-dev pip install \
+    git+https://github.com/FlexXBeamline/dials-extensions.git@963fe9e458a505a8443c988d45fb8dcb4532768f && \
+    /usr/local/bin/micromamba run -n mdx2-dev pip install -e .
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` around lines 28 - 33, Combine the two pip installs into a single
RUN invocation to reduce image layers: replace the two separate RUN lines that
call "/usr/local/bin/micromamba run -n mdx2-dev pip install
git+https://github.com/FlexXBeamline/dials-extensions.git@963fe9e458a505a8443c988d45fb8dcb4532768f"
and "/usr/local/bin/micromamba run -n mdx2-dev pip install -e ." with one RUN
that runs both pip installs in sequence under the same
"/usr/local/bin/micromamba run -n mdx2-dev" command so both installations occur
in a single layer.

37-37: Container runs as root user.

The container runs JupyterLab as root, which is a security concern flagged by static analysis (DS-0002). While --allow-root is required for root execution, consider creating a non-root user for better security posture, especially since this image may be used in CI environments.

🛡️ Proposed fix to add a non-root user

Add before the WORKDIR directive:

+RUN useradd -m -s /bin/bash devuser
+
 WORKDIR /home/dev

And update the CMD:

+USER devuser
+
 EXPOSE 8880-8890

-CMD /usr/local/bin/micromamba run -n mdx2-dev jupyter lab --ip=0.0.0.0 --port=$JUPYTER_PORT --no-browser --allow-root
+CMD /usr/local/bin/micromamba run -n mdx2-dev jupyter lab --ip=0.0.0.0 --port=$JUPYTER_PORT --no-browser

Note: This requires adjusting directory permissions and potentially the WORKDIR path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` at line 37, The Dockerfile currently starts JupyterLab as root
via the CMD that runs "micromamba run -n mdx2-dev jupyter lab ... --allow-root";
create a non-root user (e.g., adduser/ group and home directory), chown the
WORKDIR and any micromamba/conda env paths to that user, switch to that user
with the USER instruction, and update the CMD to run "micromamba run -n mdx2-dev
jupyter lab --ip=0.0.0.0 --port=$JUPYTER_PORT --no-browser" (remove
--allow-root) so Jupyter runs unprivileged; ensure permissions for the mdx2-dev
environment and JUPYTER_PORT-bound directories are adjusted to the new user's
UID/GID.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Around line 81-82: Clarify why the README asks users to run "git clone" before
starting Docker: add a short sentence above the two commands explaining that
cloning the repository is optional and is only needed if they want to mount
local code into the container for development so edits on the host are reflected
inside the container; alternatively mark the step as optional and state that the
Docker image already contains mdx2 for users who just want to run the container
without mounting local sources.
- Line 84: Replace the undefined placeholders `{CONTAINER_PORT}` and
`{HOST_PORT}` in the docker run example with concrete defaults and a
customization note: use the IMAGE env var JUPYTER_PORT default 8888 (set via
JUPYTER_PORT) and show the docker run mapping as `-p 8888:8888` by default,
explain that users can change the host side (e.g., `{YOUR_HOST_PORT}:8888`) to
customize, and add a note to run `docker logs <container_name>` to retrieve the
Jupyter URL; update the README.md docker run line accordingly and mention the
default port and how to override it.
- Around line 83-84: The README uses the image tag "diffuseproject/mdx2:1.0.0"
which doesn't match the CI workflow that pushes "diffuseproject/mdx2:latest";
update the README's docker commands (the lines containing "docker pull
diffuseproject/mdx2:1.0.0" and the subsequent "docker run" invocation) to
reference the correct tag "diffuseproject/mdx2:latest" (or change the workflow
to push :1.0.0 if you intend to keep that tag), and ensure the README and
.github/workflows/docker.yml are consistent.

---

Nitpick comments:
In `@Dockerfile`:
- Around line 24-26: The Dockerfile currently creates the micromamba env from
env.yaml then separately installs extra packages which can let the solver alter
pinned versions; update the RUN line that calls "/usr/local/bin/micromamba
create -f env.yaml -n mdx2-dev" and the subsequent "/usr/local/bin/micromamba
install -y -n mdx2-dev ..." step so that either (a) all packages (nexpy,
jupyterlab, jupyterlab-h5web, dials, xia2, wget, tar, etc.) are consolidated
into env.yaml and created in one step, or (b) keep the split but add the
"--freeze-installed" flag to the "/usr/local/bin/micromamba install -y -n
mdx2-dev" invocation to prevent modifying already-installed pinned dependencies
for the mdx2-dev environment.
- Line 9: Update the Dockerfile RUN instruction that currently reads "RUN
apt-get update && apt-get install -y ca-certificates git" to add the
--no-install-recommends flag so apt-get installs only essential packages (i.e.,
change to use "apt-get install -y --no-install-recommends ca-certificates git");
keep the existing apt-get update and retain any subsequent cleanup (e.g.,
removal of apt lists) as-is to minimize image size.
- Around line 28-33: Combine the two pip installs into a single RUN invocation
to reduce image layers: replace the two separate RUN lines that call
"/usr/local/bin/micromamba run -n mdx2-dev pip install
git+https://github.com/FlexXBeamline/dials-extensions.git@963fe9e458a505a8443c988d45fb8dcb4532768f"
and "/usr/local/bin/micromamba run -n mdx2-dev pip install -e ." with one RUN
that runs both pip installs in sequence under the same
"/usr/local/bin/micromamba run -n mdx2-dev" command so both installations occur
in a single layer.
- Line 37: The Dockerfile currently starts JupyterLab as root via the CMD that
runs "micromamba run -n mdx2-dev jupyter lab ... --allow-root"; create a
non-root user (e.g., adduser/ group and home directory), chown the WORKDIR and
any micromamba/conda env paths to that user, switch to that user with the USER
instruction, and update the CMD to run "micromamba run -n mdx2-dev jupyter lab
--ip=0.0.0.0 --port=$JUPYTER_PORT --no-browser" (remove --allow-root) so Jupyter
runs unprivileged; ensure permissions for the mdx2-dev environment and
JUPYTER_PORT-bound directories are adjusted to the new user's UID/GID.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6853448b-3cc4-4dc3-b3d3-f0679580e717

📥 Commits

Reviewing files that changed from the base of the PR and between a41afe8 and 8d5fa4a.

📒 Files selected for processing (2)
  • Dockerfile
  • README.md

Comment thread README.md
Comment on lines +81 to +82
git clone https://github.com/diff-use/mdx2.git
cd mdx2
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clarify why cloning is needed when Docker already contains mdx2.

The Docker image already has mdx2 installed. The git clone step is presumably to mount local code for development, but this isn't explained. Users expecting a "just works" Docker install will be confused.

📝 Suggested clarification

Add context before the commands:

Clone the repository to mount your local code into the container for development. This allows you to edit files on your host and see changes reflected inside the container.

Or, if cloning is optional:

(Optional) Clone the repository if you want to mount local code for development:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 81 - 82, Clarify why the README asks users to run
"git clone" before starting Docker: add a short sentence above the two commands
explaining that cloning the repository is optional and is only needed if they
want to mount local code into the container for development so edits on the host
are reflected inside the container; alternatively mark the step as optional and
state that the Docker image already contains mdx2 for users who just want to run
the container without mounting local sources.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Docker image doesn't have mdx2 installed, so that's why the git clone step is needed

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jlee733, thanks for the clarification! I was wrong about the image having mdx2 pre-installed.

That said, the README could still benefit from a brief note explaining that the git clone is required (not optional) because mdx2 is not bundled in the image — it's mounted from the local clone. This would prevent confusion for new users who might wonder why they need to clone before running Docker. For example:

### Developer Install (Docker container)

Pull image from Docker Hub and run the container. The repository must be cloned locally so it can be mounted into the container — mdx2 is not pre-installed in the image. Search the Docker container logs for the localhost address to access mdx2 within a Jupyter Lab notebook or terminal.

This is just a suggestion to keep the documentation clear. Feel free to word it however fits best!


✏️ Learnings added
Learnt from: jlee733
Repo: diff-use/mdx2 PR: 57
File: README.md:82-83
Timestamp: 2026-03-11T14:12:25.456Z
Learning: In the diff-use/mdx2 repository, the Docker image (diffuseproject/mdx2) does NOT have mdx2 pre-installed. The `git clone` step in the README's "Developer Install (Docker container)" section is required so the local repository can be mounted into the container via the volume flag (`-v "$(pwd)":/home/dev`).

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment thread README.md
Comment thread README.md
git clone https://github.com/diff-use/mdx2.git
cd mdx2
docker pull diffuseproject/mdx2:1.0.0
docker run --platform linux/amd64 --name mdx2_container -it -d -e JUPYTER_PORT={CONTAINER_PORT} -p {HOST_PORT}:{CONTAINER_PORT} -v "$(pwd)":/home/dev diffuseproject/mdx2:1.0.0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Undefined placeholders {CONTAINER_PORT} and {HOST_PORT} are not actionable.

Users cannot execute this command without knowing what values to substitute. The Dockerfile sets JUPYTER_PORT=8888 by default, so provide concrete defaults with guidance on customization.

🐛 Proposed fix with concrete values and explanation

Replace lines 77-86 with:

Pull image from Docker Hub and run the container. The Jupyter Lab URL will be printed when you run `docker logs`.

- Default Jupyter port: **8888**
- Platform: `linux/amd64`
- To use a different host port, replace `8888:8888` with `{YOUR_HOST_PORT}:8888`

```bash
git clone https://github.com/diff-use/mdx2.git
cd mdx2
docker pull diffuseproject/mdx2:latest
docker run --platform linux/amd64 --name mdx2_container -d -p 8888:8888 -v "$(pwd)":/home/dev diffuseproject/mdx2:latest
docker logs mdx2_container
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @README.md at line 84, Replace the undefined placeholders {CONTAINER_PORT}
and {HOST_PORT} in the docker run example with concrete defaults and a
customization note: use the IMAGE env var JUPYTER_PORT default 8888 (set via
JUPYTER_PORT) and show the docker run mapping as -p 8888:8888 by default,
explain that users can change the host side (e.g., {YOUR_HOST_PORT}:8888) to
customize, and add a note to run docker logs <container_name> to retrieve the
Jupyter URL; update the README.md docker run line accordingly and mention the
default port and how to override it.


</details>

<!-- fingerprinting:phantom:poseidon:ocelot -->

<!-- This is an auto-generated comment by CodeRabbit -->

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
.github/Dockerfile (3)

9-9: Add --no-install-recommends and clean apt cache to reduce image size.

The apt-get install is missing --no-install-recommends, which installs unnecessary suggested packages. Additionally, cleaning up the apt cache in the same layer avoids persisting it in the image.

♻️ Proposed fix
-RUN apt-get update && apt-get install -y ca-certificates git
+RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates git \
+    && rm -rf /var/lib/apt/lists/*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/Dockerfile at line 9, Update the RUN instruction that currently
performs "apt-get update && apt-get install -y ca-certificates git" to add the
--no-install-recommends flag and remove apt lists in the same layer;
specifically modify the RUN line that installs ca-certificates and git so it
uses "--no-install-recommends" and then cleans the apt cache (remove
/var/lib/apt/lists) in the same command to reduce image size and avoid
persisting cache.

35-37: Consider adding a non-root user for improved security.

The container runs as root by default, which is flagged by security scanners and could pose risks if the container is exposed. For a development container, running as root is sometimes acceptable for write permissions, but adding a dedicated user improves security posture.

🔒 Proposed fix to add non-root user
+RUN useradd -m -s /bin/bash dev && chown -R dev:dev /home/dev /opt/conda
+USER dev
+
 EXPOSE 8880-8890

 CMD /usr/local/bin/micromamba run -n mdx2-dev jupyter lab --ip=0.0.0.0 --port=$JUPYTER_PORT --no-browser --allow-root

Note: If running as non-root, you can remove --allow-root from the CMD.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/Dockerfile around lines 35 - 37, Add a dedicated non-root user in
the Dockerfile and switch to it before CMD to improve security: create a user
(e.g., mdxuser) and group, create a writable home/workdir, chown relevant
directories (including any binder/micromamba env dirs) to that user, and set
USER to that non-root user before the existing CMD that launches micromamba and
jupyter; update the CMD invocation referencing the existing environment name
"mdx2-dev" and port variable "JUPYTER_PORT" and remove the now-unnecessary
"--allow-root" flag from the micromamba run command if running as non-root.

3-12: Stage1 Python copy is unused — consider removing to reduce image size.

All Python execution routes through the conda environment via micromamba run -n mdx2-dev, which uses the Python installed in /opt/conda/envs/mdx2-dev. The Python from stage1, copied to /usr/local, is never directly or implicitly invoked. Removing stage1 and the COPY instruction would reduce final image size without affecting functionality.

♻️ Proposed fix
-FROM python:3.10-slim AS stage1
-
 FROM mambaorg/micromamba:1.5.5 AS stage2

 FROM debian:stable-slim AS final

 RUN apt-get update && apt-get install -y ca-certificates git

-COPY --from=stage1 /usr/local /usr/local
 COPY --from=stage2 /bin/micromamba /usr/local/bin/micromamba

-ENV PATH="/usr/local/bin:/opt/conda/bin:$PATH"
+ENV PATH="/opt/conda/bin:/usr/local/bin:$PATH"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/Dockerfile around lines 3 - 12, Remove the unused Python build stage
and its copy: drop the "FROM python:3.10-slim AS stage1" stage and remove the
"COPY --from=stage1 /usr/local /usr/local" line since runtime uses the conda
environment invoked via "micromamba run -n mdx2-dev" (Python from
/opt/conda/envs/mdx2-dev), ensuring no other references to stage1 remain.
.github/workflows/docker.yml (1)

37-43: Consider adding build caching to improve CI performance.

The build configuration is correct with explicit context and Dockerfile path. Adding GitHub Actions cache support would reduce build times for subsequent runs.

⚡ Proposed enhancement for build caching
       - name: Build and push
         uses: docker/build-push-action@v6
         with:
           context: .
           file: .github/Dockerfile
           push: ${{ github.ref == 'refs/heads/main' }}
           tags: ${{ steps.tags.outputs.tags }}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker.yml around lines 37 - 43, The "Build and push"
GitHub Actions step using docker/build-push-action@v6 should enable build
caching to speed CI; update that step (name "Build and push") to add the
docker/build-push-action inputs cache-from and cache-to (e.g., cache-from:
type=gha, cache-to: type=gha,mode=max) so the action stores and reuses layer
cache between runs and include any necessary permissions (id-token/read) or
registry login if required by your runner; keep existing context, file, push and
tags inputs and ensure the new cache keys use the action's supported syntax.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.github/Dockerfile:
- Line 9: Update the RUN instruction that currently performs "apt-get update &&
apt-get install -y ca-certificates git" to add the --no-install-recommends flag
and remove apt lists in the same layer; specifically modify the RUN line that
installs ca-certificates and git so it uses "--no-install-recommends" and then
cleans the apt cache (remove /var/lib/apt/lists) in the same command to reduce
image size and avoid persisting cache.
- Around line 35-37: Add a dedicated non-root user in the Dockerfile and switch
to it before CMD to improve security: create a user (e.g., mdxuser) and group,
create a writable home/workdir, chown relevant directories (including any
binder/micromamba env dirs) to that user, and set USER to that non-root user
before the existing CMD that launches micromamba and jupyter; update the CMD
invocation referencing the existing environment name "mdx2-dev" and port
variable "JUPYTER_PORT" and remove the now-unnecessary "--allow-root" flag from
the micromamba run command if running as non-root.
- Around line 3-12: Remove the unused Python build stage and its copy: drop the
"FROM python:3.10-slim AS stage1" stage and remove the "COPY --from=stage1
/usr/local /usr/local" line since runtime uses the conda environment invoked via
"micromamba run -n mdx2-dev" (Python from /opt/conda/envs/mdx2-dev), ensuring no
other references to stage1 remain.

In @.github/workflows/docker.yml:
- Around line 37-43: The "Build and push" GitHub Actions step using
docker/build-push-action@v6 should enable build caching to speed CI; update that
step (name "Build and push") to add the docker/build-push-action inputs
cache-from and cache-to (e.g., cache-from: type=gha, cache-to:
type=gha,mode=max) so the action stores and reuses layer cache between runs and
include any necessary permissions (id-token/read) or registry login if required
by your runner; keep existing context, file, push and tags inputs and ensure the
new cache keys use the action's supported syntax.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 18064df4-1dad-49e6-a18b-53ca19a5107b

📥 Commits

Reviewing files that changed from the base of the PR and between 8d5fa4a and ee992b8.

📒 Files selected for processing (2)
  • .github/Dockerfile
  • .github/workflows/docker.yml

@jlee733
Copy link
Copy Markdown
Author

jlee733 commented Mar 11, 2026

Also, I moved Dockerfile to the github actions directory (".github") to declutter the root directory

username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}

- name: Set up QEMU
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is purpose of setting up QEMU here?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QEMU is used for multi-platform builds of the docker image. Right now we're supporting linux/amd64 architecture, but if we want o support other platforms like linux/arm64 in the future, we can have a way of building the image on that platform too

Comment thread .github/Dockerfile
/usr/local/bin/micromamba install -y -n mdx2-dev nexpy jupyterlab jupyterlab-h5web dials xia2 wget tar -c conda-forge && \
/usr/local/bin/micromamba clean --all --yes

COPY . .
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add a .dockerignore file to prevent unnecessary files (.git, etc.) from going into the image if we decide to publish on Dockerhub?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created the ".dockerignore" file to reduce the size of the image being pushed to docker hub, but depending on how the volume is mounted, the user can still access the unnecessary files

Comment thread .github/Dockerfile Outdated

RUN apt-get update && apt-get install -y ca-certificates git

COPY --from=stage1 /usr/local /usr/local
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude is warning of potential mismatch and drift between libraries that are dynamically linked on python install from stage1 that may not match with final:

Multi-stage build copies dynamically-linked Python into a different distro — fragile

Copies /usr/local (including the Python runtime) from python:3.10-slim into debian:stable-slim
Python is dynamically linked against glibc, libssl, etc. at specific versions. Copying it into a different base risks those shared libraries being missing or at wrong versions
micromamba (the only binary from stage2) is statically linked and is safe to copy across images — the problem is solely the stage1 Python copy
debian:stable also changes over time, making the build non-reproducible
Fix: Use python:3.10-slim as the single final base (Python is then native to the image, no cross-image copy needed). Copy only the micromamba binary from stage2 — it's statically linked so this is safe:
FROM mambaorg/micromamba:1.5.5 AS micromamba_stage
FROM python:3.10-slim AS final
RUN apt-get update && apt-get install -y ca-certificates git
COPY --from=micromamba_stage /bin/micromamba /usr/local/bin/micromamba

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed "FROM debian:stable-slim AS final" to avoid this

…om Dockerfile to prevent shared library mismatch
@jlee733 jlee733 merged commit 0d86d17 into main Mar 11, 2026
2 checks passed
@jlee733 jlee733 deleted the jlee-org-launch branch March 11, 2026 20:48
@coderabbitai coderabbitai Bot mentioned this pull request Mar 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants