Adding ollama as a service to docker compose file #3


Open · krasch wants to merge 3 commits into main from ollama-in-docker-compose

Conversation

krasch (Collaborator) commented Jun 4, 2025

This PR adds ollama to the docker compose file, i.e. ollama is automatically started by docker-compose up
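
In outline, the kind of service entry this adds looks roughly like the following (a minimal sketch; the image tag, port mapping, and volume name here are illustrative rather than copied from the PR):

```yaml
# Sketch only - not necessarily the exact compose file in this PR.
services:
  ollama:
    image: ollama/ollama          # official Ollama image
    ports:
      - "11434:11434"             # Ollama's default HTTP API port
    volumes:
      - ollama:/root/.ollama      # persist pulled models across restarts

volumes:
  ollama:
```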

krasch (Collaborator, Author) commented Jun 4, 2025

With this PR, the README structure does not really work that well anymore. ollama is a prerequisite only in the non-docker case, and the troubleshooting advice is also specific to the non-docker case. Perhaps there should be two big "When not using docker" and "When using docker" sections?

Also, there is some repetition with the model names. And as a user I would actually like some guidance on when to use which model (e.g. when I only have a CPU). Perhaps just give one example for each model and then have a table with recommended alternatives and the situations they are suitable for? This table could then be referenced from both the docker and non-docker sections.

README.md Outdated
docker-compose up

# Install embedding model (required)
docker exec -it ollama ollama pull nomic-embed-text
krasch (Collaborator, Author) commented:

Here the first ollama is the name of the container and the second is the name of the command. This is a bit cryptic, but hopefully understandable enough for users of this first version?

clstaudt (Owner) commented Jun 5, 2025

@krasch Thank you for the PR. Is this ready to merge, so that I can update the README?

krasch (Collaborator, Author) commented Jun 5, 2025

> @krasch Thank you for the PR. Is this ready to merge, so that I can update the README?

There are still two issues that need to be investigated.

  1. Do we need a volume for the ollama service (see above)?
  2. For GPU support, do we need to add anything w.r.t. nvidia-docker, or does it just work out of the box? I have not yet tested this on a GPU machine. (A rough sketch of both points is below.)

If you want, we can create issues for these two and merge this one, so that you can update the README.
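
A rough sketch of what both points could look like in the compose file (a named volume for point 1, and the standard compose device reservation for NVIDIA GPUs for point 2; untested, as said):

```yaml
# Sketch only: the volume keeps pulled models across container restarts (point 1);
# the device reservation needs the NVIDIA Container Toolkit on the host (point 2).
services:
  ollama:
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

volumes:
  ollama:
```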

clstaudt (Owner) commented Jun 5, 2025

@krasch This is why I was unsure about adding ollama to the docker setup. I guess the PR needs to wait until this is understood.

krasch (Collaborator, Author) commented Jun 6, 2025

> @krasch This is why I was unsure about adding ollama to the docker setup. I guess the PR needs to wait until this is understood.

I understand that this is what you are worried about, but I know 100% that it is possible; I have done it before. It just might need some additional configuration.

Just make whatever changes you need in the README and I will get this branch synced later.

krasch force-pushed the ollama-in-docker-compose branch from 2e37997 to 87fcb5f on June 6, 2025, 13:13
krasch force-pushed the ollama-in-docker-compose branch from 87fcb5f to 26cda89 on June 6, 2025, 13:16
krasch force-pushed the ollama-in-docker-compose branch from 7dd82d1 to 9c9cc24 on June 6, 2025, 13:55
krasch changed the title from "[WIP] Adding ollama as a service to docker compose file" to "Adding ollama as a service to docker compose file" on June 6, 2025
krasch (Collaborator, Author) commented Jun 6, 2025

@clstaudt This is now ready to merge. I confirmed that ollama does indeed use the GPU in this setup.

I moved things around a bit in the README, please review.
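
For anyone who wants to double-check the GPU usage themselves, one way to do so (assuming the container is named ollama, as in the pull command above):

```sh
# Should show the NVIDIA GPU inside the container if the device reservation works;
# "ollama ps" reports whether a loaded model is running on GPU or CPU.
docker exec -it ollama nvidia-smi
docker exec -it ollama ollama ps
```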

clstaudt (Owner) commented Jun 7, 2025

@krasch Thank you for moving this forward!

  1. May be nitpicking, but why do I need to remember a more complicated docker command if I have a GPU? (A GPU is not required, but highly recommended to run this app smoothly.)

     services:
       ollama:
         deploy:
           resources:
             reservations:
               devices:
                 - driver: nvidia
                   count: 1
                   capabilities:
                     - gpu

  2. What if I have a GPU, but it is not an NVIDIA GPU?

krasch (Collaborator, Author) commented Jun 12, 2025

> 1. May be nitpicking, but why do I need to remember a more complicated docker command if I have a GPU? (A GPU is not required, but highly recommended to run this app smoothly.)

Because I have not found an easier way to make that happen with docker compose; unfortunately, there does not seem to be a way to simply set a "use GPU" flag on the command line.

What could be done to simplify things for GPU users is to move the content of docker-compose.gpu.yml directly into the main docker-compose.yml. Then GPU users would only need to run docker compose up. CPU users, on the other hand, would need to open the docker-compose.yml file and comment out those lines, so it is much worse for CPU users. Let me know if you want me to make that change.
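
For context, the override-file approach currently in the PR would presumably be invoked like this (assuming the file is named docker-compose.gpu.yml as mentioned above; the exact command in the README may differ):

```sh
# CPU-only users
docker compose up

# GPU users: layer the GPU override on top of the base file
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up
```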

> 2. What if I have a GPU, but it is not an NVIDIA GPU?

There isn't really CUDA support for other GPUs, so these are usually not supported anyway. Deep learning happens on nvidia.


So I have been reading up a little on the Apple side, and I have found that what I have here in the PR will actually only work on Linux (see https://chariotsolutions.com/blog/post/apple-silicon-gpus-docker-and-ollama-pick-two/), so perhaps it is best if you just close the PR.

Be aware, though, that your main branch might not work for Linux users instead. Accessing a host HTTP port (ollama) from within a docker container (your app) is a bit unusual, and the last time I tried it I could not make it work on Linux. There is this answer, https://stackoverflow.com/a/24326540, which I believe I tried last time and failed with, and then just refactored to do things in the standard manner (i.e. either everything in docker or nothing).
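
For reference, the usual workaround on Linux for reaching a host port from inside a container is the host-gateway mapping below. This is a sketch only, not verified for this app; the service name and the OLLAMA_URL variable are placeholders, not taken from this project:

```yaml
# Sketch: let the containerized app reach an ollama server running on the host.
# Requires Docker 20.10+; "app" and OLLAMA_URL are placeholder names.
services:
  app:
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      - OLLAMA_URL=http://host.docker.internal:11434
```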

clstaudt (Owner) commented:
> There isn't really CUDA support for other GPUs, so these are usually not supported anyway. Deep learning happens on nvidia.

With Ollama, LLM inference works just fine on Apple Silicon GPUs - out of the box when Mac users install and start Ollama.app.

At this point I am not even sure that a Docker configuration simplifies anything for this application.

krasch (Collaborator, Author) commented Jun 15, 2025

> At this point I am not even sure that a Docker configuration simplifies anything for this application.

So that Windows and Linux users, and Mac users with older hardware, can try out your application extremely quickly without having to install anything (if they already have docker).

But up to you, your application, your decision.
