Local deepseek deepseek-r1:14b, working but took very long time #195
Maybe your Ollama is not using the GPU. Try these two methods to solve it: 1. Run `ollama run deepseek-r1:14b` first to check whether it runs faster on its own. 2. Try this: #185
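If you want to check the CPU/GPU split programmatically rather than watching a task manager, here is a minimal sketch (my addition, not from #185) that queries Ollama's documented `/api/ps` endpoint on the default local port and reports how much of each loaded model sits in VRAM:

```python
# Sketch: ask a running Ollama server which models are loaded and how much
# of each is offloaded to the GPU. Assumes the default endpoint localhost:11434.
import requests

resp = requests.get("http://localhost:11434/api/ps", timeout=5)
resp.raise_for_status()

for model in resp.json().get("models", []):
    size = model["size"]            # total bytes occupied by the loaded model
    size_vram = model["size_vram"]  # bytes resident in GPU memory
    pct = 100 * size_vram / size if size else 0
    print(f"{model['name']}: {pct:.0f}% on GPU")
```

A result near 50% matches the 50/50 CPU/GPU splits reported later in this thread; 100% means the model is fully on the GPU.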
It runs fast enough with `ollama run deepseek-r1:14b`; I have no issue prompting it directly. I already tried option 2, but no luck. It's still super slow: it works, completing step 1, but it stays super slow until it reaches step 2, and so on.
deepseek-reasoner doesn't work with 1.3
Please don't use 1.3; the code is under development. Keep up with the latest code.
I have the same issue on my laptop with a 13th-gen i9 and an RTX 4080 (Ollama splits the load 50/50 between CPU and GPU). I think it may be related to the large context length (the DeepSeek API was giving me errors about this as well).
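One way to test the context-length theory is to run the same model with a smaller `num_ctx` and see whether it then fits entirely on the GPU. A minimal sketch using the official `ollama` Python client (`pip install ollama`); the 8192 value is an arbitrary example, not a recommendation from this thread:

```python
# Sketch: retry with a smaller context window to see whether a large num_ctx
# is what forces part of the model off the GPU.
import ollama

response = ollama.chat(
    model="deepseek-r1:14b",
    messages=[{"role": "user", "content": "Say hello in one word."}],
    options={"num_ctx": 8192},  # try smaller values; web-ui may request far more
)
print(response["message"]["content"])
```

If the split moves toward 100% GPU with a smaller window, the context length is the culprit.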
I followed the link to #185 mentioned above and downloaded/installed the newest version of Ollama that (according to that post) allows the CPU and GPU to share the workload. However, running Mistral results in a nearly 15-minute execution time for "go to google.com and type 'OpenAI' click search and give me the first url". Running even deepseek 7b does pretty much nothing by the time I get tired of waiting, ten minutes later. I'm running an AMD Ryzen 9 7940HX with an RTX 4070 GPU. As others have mentioned, direct prompting in the terminal is very fast, but the browser manipulation is dreadfully slow.
This is exactly what's happened to me too: CPU/GPU 49/50, and the results took a very long time.
Related to #111. In langchain's Ollama chat model, comment out the `num_thread` entry in the options dict:

```python
options_dict = kwargs.pop(
    "options",
    {
        "mirostat": self.mirostat,
        "mirostat_eta": self.mirostat_eta,
        "mirostat_tau": self.mirostat_tau,
        "num_ctx": self.num_ctx,
        "num_gpu": self.num_gpu,
        # "num_thread": self.num_thread,  # commented out
        "num_predict": self.num_predict,
        "repeat_last_n": self.repeat_last_n,
        "repeat_penalty": self.repeat_penalty,
        "temperature": self.temperature,
        "seed": self.seed,
        "stop": self.stop if stop is None else stop,
        "tfs_z": self.tfs_z,
        "top_k": self.top_k,
        "top_p": self.top_p,
    },
)
```

This resolves the issue, as Ollama will set the thread count automatically while loading the model.
I tried commenting it out too, but nothing changed. It's still 51/49 CPU/GPU. Dunno why.
@harst21 You have to restart WebUI and Ollama when you do, or Ollama caches the model and keeps the old config loaded.
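For reference, a full restart of Ollama is not strictly required: its API documents that sending a request with `keep_alive: 0` unloads a model immediately, so the next request reloads it with the new configuration. A minimal sketch, assuming the default local endpoint:

```python
# Sketch: force Ollama to drop the cached model without restarting the server,
# so the next request picks up the changed options. keep_alive=0 is Ollama's
# documented way to unload a model.
import requests

requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "deepseek-r1:14b", "keep_alive": 0},
    timeout=30,
)
```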
What to do ??? |
I've also commented out the line mentioned above, and it got me no results. The default task description from the WebUI still takes nearly 30 minutes to run using Mistral, which is a lighter model. It takes almost 5 minutes just for the Google home page to load in the browser for this command: go to google.com and type 'OpenAI' click search and give me the first url
Any updates? Did anyone get it working on the GPU?
Hi there,
I'm using Ollama with the local model deepseek-r1:14b. It works with the default prompt, but each step takes a very long time, more than 4 minutes each. It finally gets to the first page of search results,
but it takes more than 30 minutes. Lol.
Any idea why this is happening? Is there any way to speed up the process?
Thanks in advance!