Reuse Ollama cached image when available #782
Conversation
Signed-off-by: Kush Gupta <[email protected]>
Reuse of a local Ollama cache
Reviewer's Guide by Sourcery

This pull request implements a caching mechanism that reuses a locally cached Ollama model layer instead of re-downloading it. The changes add a function that checks for an existing cache, and modify the pull process to leverage that cache by symlinking the local layer, with retry logic on checksum mismatches.

Sequence diagram for reusing cached Ollama model layers

sequenceDiagram
participant PB as pull_blob
participant IEC as in_existing_cache
participant DF as download_file
participant VC as verify_checksum
participant RC as run_cmd
PB->>IEC: Call in_existing_cache(model_name, model_tag)
alt Cache Found
IEC-->>PB: local_blob
PB->>RC: run_cmd(ln -sf local_blob, layer_blob_path)
else No Cache Found
IEC-->>PB: None
PB->>DF: download_file(url, layer_blob_path)
PB->>VC: verify_checksum(layer_blob_path)
alt Checksum passes
note right of VC: Proceed with downloaded blob
else Checksum mismatch
PB->>PB: Print error and remove layer_blob_path
PB->>DF: download_file(url, layer_blob_path) again
PB->>VC: verify_checksum(layer_blob_path) again
alt Second Checksum passes
note right of VC: Proceed after re-download
else Second Checksum fails
PB->>PB: Raise ValueError
end
end
end
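A condensed sketch of the flow the diagram describes. The helper names (in_existing_cache, download_file, verify_checksum, run_cmd) are the diagram's participants; their signatures here are assumptions, not the exact ramalama code.

    import os

    def pull_blob(url, layer_blob_path, model_name, model_tag):
        # Reuse a blob already present in the local Ollama cache when possible.
        local_blob = in_existing_cache(model_name, model_tag)
        if local_blob is not None:
            # Link the cached blob into the ramalama store instead of downloading.
            run_cmd(['ln', '-sf', local_blob, layer_blob_path])
            return
        download_file(url, layer_blob_path)
        if verify_checksum(layer_blob_path):
            return
        # One retry on checksum mismatch before giving up.
        print(f"Checksum mismatch for blob {layer_blob_path}, retrying download...")
        os.remove(layer_blob_path)
        download_file(url, layer_blob_path)
        if not verify_checksum(layer_blob_path):
            raise ValueError(f"Checksum verification failed for {layer_blob_path}")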
Hey @kush-gupt - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding a check to ensure the layer_blob_path exists after the symlink operation in pull_blob.
- The in_existing_cache function could be improved by using os.path.expanduser to handle the home directory path (see the sketch after this list).
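A minimal sketch of that second suggestion; the cache path shown is Ollama's usual default and is an assumption here:

    import os

    # Expand "~" to the user's home directory so the default Ollama cache
    # path resolves correctly for any account.
    cache_dir = os.path.expanduser("~/.ollama/models")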
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Signed-off-by: Kush Gupta <[email protected]>
@kush-gupt Thanks, great work. Could you verify what happens if RamaLama or Ollama removes the model?
Want to make sure that if Ollama removes the model, RamaLama no longer has it? It might make more sense to use hard links rather than symlinks; then Ollama removing the model would not affect RamaLama, or vice versa. Of course, do hard links and soft links work well on Mac and Windows?
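A sketch of the trade-off being discussed; both paths below are hypothetical. A hard link keeps the data alive even after Ollama deletes its copy, but requires both paths to be on the same filesystem; a symlink works across filesystems but dangles when the target is removed, and creating symlinks on Windows may require elevated privileges or developer mode.

    import os

    ollama_blob = "/path/to/ollama/blobs/sha256-abc"   # hypothetical source
    ramalama_blob = "/path/to/ramalama/store/blob"     # hypothetical destination

    try:
        # Hard link: the blob survives even if Ollama removes its copy,
        # but both paths must live on the same filesystem.
        os.link(ollama_blob, ramalama_blob)
    except OSError:
        # Symlink fallback: cheap and cross-filesystem, but it dangles
        # if the Ollama blob is ever removed.
        os.symlink(ollama_blob, ramalama_blob)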
The code looks good and the core "ramalama run/ramalama serve" would work. I guess we are probably capable of corrupting the Ollama model store via RamaLama with "ramalama rm" etc.?
@rhatdan had a similar thought at the same time... 😄
RamaLama does not currently have the ability to fully remove an Ollama model from an Ollama model store, since we only look at the .gguf part; I think with the current code, we could only partially remove an Ollama model from its store. Writing/removing things in the Ollama store in general seems messy; it's not on our turf. My initial thoughts would be that we just remove our link to the Ollama registry (maybe print a message that we won't manipulate the Ollama model store). Links to the Ollama model store seem safe...
print(f"Checksum mismatch for blob {layer_blob_path}, retrying download...") | ||
os.remove(layer_blob_path) | ||
local_blob = in_existing_cache(model_name, model_tag) | ||
if local_blob is not None: |
In the if local_blob is not None: case, we seem to call ln -sf twice; should we call it once? In the rm case for an Ollama-stored object, I propose we just remove the symlink part: local_blob
I agree, I think we should only remove the ramalama content on ramalama rm. But on ollama rm, ramalama can end up in a bad state as well, which I think we should take care of too. ramalama ls should either ignore dangling symlinks or remove them (one way to detect them is sketched below).
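A sketch of detecting a dangling link, not the current implementation; the helper name is made up:

    import os

    def is_dangling_symlink(path):
        # islink() inspects the link itself, while exists() follows it,
        # so a dangling symlink is a link whose target no longer exists.
        return os.path.islink(path) and not os.path.exists(path)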
BTW, I have seen the database get futzed up by broken image removal, and we don't handle it gracefully. I had to find the symlink and rm -f it.
After thinking through this, I understand the double symlink now: once for the sha256 and a second time for the named version. I think the rm should handle this without touching the Ollama store, but we should test to be sure.
Once this gets in, we should do the same for the Hugging Face database. I think this is partially what @benoitf wants.
if layer["mediaType"] == "application/vnd.ollama.image.model": | ||
layer_digest = layer["digest"] | ||
ollama_digest_path = os.path.join(cache_dir, 'blobs', layer_digest) | ||
if os.path.exists(str(ollama_digest_path).replace(':','-')): |
If we ever make breaking changes to the model store, which I suspect we will, we should change ':' to '-' on disk; ':' doesn't work on Windows filesystems, for one. But that should be a hidden detail to the user: they should use ':'.
Let's make the change now, while we have not released 1.0. For now we can handle both ':' and '-' (a sketch of that follows).
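A sketch of handling both separators during the transition; the function names here are hypothetical:

    import os

    def digest_to_filename(digest):
        # Normalize the on-disk name to '-', which is valid on Windows filesystems.
        return digest.replace(':', '-')

    def find_blob(blobs_dir, digest):
        # Accept blobs stored under either separator while both are in the wild.
        for name in (digest.replace(':', '-'), digest):
            path = os.path.join(blobs_dir, name)
            if os.path.exists(path):
                return path
        return None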
It could be a hard link from the Ollama model blob to the ramalama blob store, retaining the ability for ...
Created a new function called in_existing_cache that takes in a model name and tag, searches a default Ollama cache, and sees if there's a matching model layer to symlink into the ramalama store (a sketch of the idea follows).
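A sketch of what such a lookup could look like. The manifest layout shown follows Ollama's usual default cache (manifests/registry.ollama.ai/library/<name>/<tag>); the exact paths and structure here are assumptions, not the PR's code.

    import json
    import os

    def in_existing_cache(model_name, model_tag):
        # Default Ollama cache; expanduser keeps this working for any account.
        models_dir = os.path.expanduser('~/.ollama/models')
        manifest = os.path.join(models_dir, 'manifests', 'registry.ollama.ai',
                                'library', model_name, model_tag)
        if not os.path.exists(manifest):
            return None
        with open(manifest) as f:
            data = json.load(f)
        for layer in data.get('layers', []):
            # Only the GGUF model layer is reusable by ramalama.
            if layer['mediaType'] == 'application/vnd.ollama.image.model':
                blob = os.path.join(models_dir, 'blobs',
                                    layer['digest'].replace(':', '-'))
                if os.path.exists(blob):
                    return blob
        return None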
This allows someone already running Ollama to use RamaLama with no model pulls!
Directory structure would look like:
Summary by Sourcery
New Features: