Ollama runs at 100% CPU after the first prompt. WebUI (v0.4.8)
After the first prompt finishes in the WebUI, Ollama moves from running on the GPU to the CPU. This started a couple of updates ago. I can't hold a chat; it takes several minutes for the CPU to drop from 100%. What's going on here, why, and how can I stop it?
Here is my setup:
OS: Fedora 40
CPU: Ryzen 5800x
GPU: TUF 7900XT
ROCm installation
Ollama (0.4.7) runs via systemd
WebUI (v0.4.8, latest) runs in a Docker container
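To confirm it really is falling back to the CPU (and not just being reported wrong), I've been watching the service and the GPU directly. Assuming the default install with an ollama.service systemd unit and ROCm's rocm-smi available, something like:

journalctl -u ollama -f     # follow the server log while a prompt runs
watch -n 1 ollama ps        # watch the PROCESSOR column flip from GPU to CPU
watch -n 1 rocm-smi         # VRAM use and GPU load on the 7900 XT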
ollama ps
NAME                         ID              SIZE     PROCESSOR    UNTIL
llama3.1:8b-instruct-fp16    1fbd8c253427    17 GB    100% GPU     4 minutes from now
After the prompt response finishes:
ollama ps
NAME                         ID              SIZE     PROCESSOR    UNTIL
llama3.1:8b-instruct-fp16    1fbd8c253427    15 GB    100% CPU     Stopping...
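If the CPU spike comes from the model being unloaded (the "Stopping..." state), I assume it should be reproducible without the WebUI by forcing an immediate unload through the Ollama API. A minimal check, assuming the default port 11434:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b-instruct-fp16",
  "prompt": "hello",
  "keep_alive": 0
}'
# keep_alive: 0 tells Ollama to unload the model right after the response

If ollama ps shows 100% CPU right after that call as well, the WebUI would only be triggering the unload rather than causing the CPU usage itself.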