
bolt.diy is using more CPU and less GPU #1518

Open

Anybody2007 opened this issue Mar 16, 2025 · 7 comments

@Anybody2007

Describe the bug

When I am making improvements to my local project, which is medium sized (approximately 126 files), the requests to Ollama put more load on the CPU than on the GPU, and this results in a network error. I am running Ollama on the same system, with a GTX 2060 card, for the local model.

Link to the Bolt URL that caused the error

localhost

Steps to reproduce

  1. Clone a medium-sized project from GitHub.
  2. Import the project into bolt.diy.
  3. Set up Ollama on the same machine as bolt.diy.
  4. Set up bolt.diy and Ollama.
  5. Select an Ollama model from the bolt.diy dropdown.
  6. Start asking for improvements (a rough setup sketch follows this list).
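
A rough sketch of the setup, assuming Ollama is already installed and bolt.diy is cloned from its GitHub repository (the repository URL is an assumption; the model tag matches the one used later in this thread):

# pull a local model for Ollama to serve
ollama pull qwen2.5-coder:7b

# clone bolt.diy, install dependencies, and start the dev server
git clone https://github.com/stackblitz-labs/bolt.diy.git
cd bolt.diy
pnpm install
pnpm run dev

# then open the app in the browser, pick the Ollama model in the provider dropdown, and import the project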

Expected behavior

It should use less CPU and depend more on the GPU, and there should be a response.

Screen Recording / Screenshot

  • Model list from Ollama
  • Prompt started, high CPU
  • High CPU usage and low GPU usage
  • Error

Platform

  • OS: Linux
  • Browser: [not specified]
  • Version: [latest]

Provider Used

ollama

Model Used

qwen2.5-coder:7b and deepseek-r1:8b

Additional context

No response

@leex279 (Collaborator) commented Mar 18, 2025

Hey @Anybody2007,

This has nothing to do with bolt.diy itself; it depends on your Ollama configuration. It would not use more GPU even if you just used Ollama locally in the terminal.

Check your Ollama configuration and make sure it uses your GPU: https://github.com/ollama/ollama/blob/main/docs/gpu.md

As far as I can see in the table, your card is not supported by Ollama; only the RTX 2060 is listed, not the GTX 2060.
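
A quick way to verify whether Ollama is actually offloading to the GPU (a minimal sketch, assuming Ollama runs as a systemd service on Linux with the NVIDIA driver installed):

# overall GPU utilization and which processes are holding VRAM
nvidia-smi

# currently loaded models and their CPU/GPU split (PROCESSOR column)
ollama ps

# how many layers were offloaded to the GPU when the model was loaded
journalctl -u ollama --no-pager | grep -i offload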

@Anybody2007 (Author)

Hi @leex279,
Thanks for the documentation page; I will surely go through it once again.
I found my card in that table.

root@model:~# nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 2060 (UUID: GPU-4eabbb12-bdeb-4ab6-297c-6c5fd3b3df0d)

I will post whatever I find.

@Anybody2007 (Author)

Hi @leex279,
I checked the documentation page you shared and applied the manual GPU selection with the option below.

export CUDA_VISIBLE_DEVICES=GPU-4eabbb12-bdeb-4ab6-297c-6c5fd3b3df0d
systemctl restart ollama

But I am still seeing a 37%/63% CPU/GPU split.
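
Note that a variable exported in an interactive shell does not reach a systemd-managed service, so the setting would normally go into the unit itself; a sketch, assuming Ollama runs as the ollama systemd service restarted above:

# add an override for the service
systemctl edit ollama
# in the editor, add:
#   [Service]
#   Environment="CUDA_VISIBLE_DEVICES=GPU-4eabbb12-bdeb-4ab6-297c-6c5fd3b3df0d"

# apply the override
systemctl daemon-reload
systemctl restart ollama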

Below is the error log pasted from the bolt.diy web-ui.

Chat request failed

{
  "name": "TypeError",
  "message": "network error",
  "stack": "TypeError: network error",
  "component": "Chat",
  "action": "request",
  "error": "network error"
}

Ollama is configured on localhost; below is the URL.

http://127.0.0.1:11434

One additional thing: in Settings -> Local Provider I have entered the mentioned URL http://127.0.0.1:11434, but I am not able to pull any new model from the bolt.diy UI, and no models are listed there.

But surprisingly, in the chat prompt I can list all the models.
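
One way to check whether the Ollama endpoint is reachable and reports models from the machine running bolt.diy (a sketch; /api/tags is Ollama's model-listing endpoint):

# should return a JSON list of the locally available models
curl http://127.0.0.1:11434/api/tags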

@leex279 (Collaborator) commented Mar 19, 2025

I found my card in that table.

You wrote GTX 2060 in your initial post, not RTX 2060. ;)

@leex279 (Collaborator) commented Mar 19, 2025

Please check the dev console and the terminal log.

Also, how is your bolt.diy instance running? Locally, with pnpm run dev?

@Anybody2007 (Author) commented Mar 21, 2025

Also, how is your bolt.diy instance running? Locally, with pnpm run dev?

I am running it with pnpm run dev, and I am using nginx to forward the port so I can use it in my browser.

Below is the nginx config I am using

user www-data;
worker_processes auto;
pid /run/nginx.pid;
error_log /var/log/nginx/error.log;
include /etc/nginx/modules-enabled/*.conf;

events {
	worker_connections 768;
}

http {
	client_max_body_size 100M;
	keepalive_timeout 2000;
	sendfile on;
	tcp_nopush on;
	types_hash_max_size 2048;
	include /etc/nginx/mime.types;
	default_type application/octet-stream;
	ssl_prefer_server_ciphers on;
	access_log /var/log/nginx/access.log;
	gzip on;
	include /etc/nginx/conf.d/*.conf;
	include /etc/nginx/sites-enabled/*;
}

And the server block:

server {
    # Terminate TLS and proxy to the Vite dev server on port 5173
    listen 443 ssl;
    server_name bolt.my.site;
    ssl_certificate /root/certs/npm-4/fullchain.pem;
    ssl_certificate_key /root/certs/npm-4/privkey.pem;
    location / {
        proxy_pass http://localhost:5173;
        proxy_set_header Host localhost:5173;
        # WebSocket upgrade headers for Vite HMR
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
    }
}
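
To narrow down whether the "network error" comes from the proxy or from the dev server itself, both can be hit directly (a sketch; bolt.my.site is the hostname from the config above):

# direct to the Vite dev server
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:5173/

# through the nginx proxy
curl -sk -o /dev/null -w '%{http_code}\n' https://bolt.my.site/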

Please check the dev console and the terminal log.

I found nothing special. Below are the logs from when I get the error in the frontend.

 INFO   select-context  Total files: 4
 DEBUG   api.chat  files in context : ["/home/project/imports/addfiles.py","/home/project/imports/execute_command.py","/home/project/imports/planconfigs.py","/home/project/imports/cre_mod_wi_del.py"]
 INFO   LLMManager  Found 4 cached models for Ollama
 INFO   stream-text  Sending llm call to Ollama with model qwen2.5-coder:7b
 DEBUG  Ollama Base Url used:  http://127.0.0.1:11434
 DEBUG   api.chat  usage {"promptTokens":6050,"completionTokens":253,"totalTokens":6303}

@Anybody2007 (Author)

Hi @leex279,
I tried with qwen2.5-coder:3b, which is 1.9 GB in size, but the issue is still the same. So I think the model size is not the issue here.
