Flexible LLM providers + manage Ollama with 4CAT #576
Pull request overview
This PR adds Ollama LLM container support to 4CAT's Docker stack, along with an admin UI to manage LLM models (pull, delete, enable/disable). The LLM model refresh logic is extracted from the old refresh_items worker into a dedicated OllamaManager worker. A new llm.enabled_models configuration setting allows admins to control which available models are exposed to users.
Changes:
- New `OllamaManager` backend worker for refreshing, pulling, and deleting Ollama models via the Ollama HTTP API
- New `/admin/llm/` admin panel (`views_llm.py` + `llm-server.html`) for managing LLM models, gated by both admin privileges and `llm.access`
- New `docker-compose_ollama.yml` override for running Ollama as a Docker sidecar, with auto-configuration in `docker_setup.py`
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `backend/workers/ollama_manager.py` | New worker for Ollama model refresh/pull/delete operations |
| `backend/workers/refresh_items.py` | LLM refresh logic removed; worker now does nothing |
| `webtool/views/views_llm.py` | New Flask blueprint for the admin LLM management panel |
| `webtool/templates/controlpanel/llm-server.html` | New admin panel template for model listing and actions |
| `webtool/templates/controlpanel/layout.html` | Adds "LLM Server" nav link when `llm.access` is enabled |
| `webtool/__init__.py` | Registers the new `views_llm` blueprint |
| `processors/machine_learning/llm_prompter.py` | Filters available models by enabled list before showing to users |
| `common/lib/config_definition.py` | Adds `llm.enabled_models` config definition |
| `docker/docker_setup.py` | Auto-configures LLM settings when Ollama is detected on Docker network |
| `docker-compose_ollama.yml` | New Docker Compose override for the Ollama sidecar service |
| `docker/README.md` | Documents the Ollama Docker setup and usage |

    elif task == "delete":
        success = self.delete_model(model_name)
        if success:
            self.refresh_models()

When a model is successfully deleted from the Ollama server, refresh_models() updates llm.available_models to remove the deleted model, but llm.enabled_models is never cleaned up. This means deleted models accumulate as stale entries in llm.enabled_models. While this doesn't cause an immediate runtime error (since llm_prompter.py intersects the two lists), it's misleading: after a delete-and-refresh cycle, the model would disappear from the available models table in the UI, but it remains in the enabled list. If the model is later re-pulled, it would reappear as already enabled, which could be surprising.
The delete_model() method (or the work() method after a successful delete) should remove the model from llm.enabled_models, or at minimum refresh_models() should reconcile llm.enabled_models to remove entries no longer present in llm.available_models.
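
The reconciliation suggested above could look roughly like the sketch below. It is not the code from the PR: the helper name `reconcile_enabled_models` is hypothetical, and it assumes `llm.available_models` is a dict keyed by model ID and `llm.enabled_models` is a list of model IDs.

```python
def reconcile_enabled_models(available, enabled):
    """Return `enabled` with every entry that is no longer in `available` removed.

    Assumption: `available` is a dict keyed by model ID (or any iterable of IDs)
    and `enabled` is a list of model IDs. Order of `enabled` is preserved.
    """
    available_ids = set(available)
    return [model for model in enabled if model in available_ids]


# Illustrative values only: mistral:7b stands in for a model that was just deleted
available_models = {"llama3.2:3b": {"name": "Llama 3.2 3B"}}
enabled_models = ["llama3.2:3b", "mistral:7b"]

enabled_models = reconcile_enabled_models(available_models, enabled_models)
print(enabled_models)
```

A step like this probably only belongs after a refresh that actually reached the server; if it ran after a failed connection, an empty available list would wipe every enabled model.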

    ### Configuring 4CAT to use Ollama

    1. Log in as admin and open **Control Panel → Settings**.
    2. Set the following LLM fields:

    | Setting | Value |
    |---|---|
    | LLM Provider Type | `ollama` |
    | LLM Server URL | `http://ollama:11434` |
    | LLM Access | enabled |

    3. Save settings.
    4. Open **Control Panel → LLM Server** (visible once *LLM Access* is enabled).
    5. Use the **Refresh** button to load available models, then **Pull** a model
       (e.g. `llama3.2:3b`) to download it from the Ollama library.
    6. Enable the models you want to make available to users.

The docker/README.md section "Configuring 4CAT to use Ollama" (steps 1–3) instructs users to manually set the LLM Provider Type, LLM Server URL, and LLM Access fields in the Control Panel Settings. However, docker/docker_setup.py now automatically detects the Ollama sidecar on first startup and configures these settings without user intervention. The README should mention this auto-configuration so users know they can skip steps 1–3 on a fresh install with the Ollama override.
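
For readers wondering what such auto-detection might involve, here is a minimal sketch. It is not the logic in `docker/docker_setup.py`; it only assumes that the sidecar is reachable at `http://ollama:11434` (the URL from the README) and that Ollama answers `GET /api/tags` with a JSON object containing a `models` list. The function names and the `llm.provider_type` and `llm.server` setting keys are hypothetical; only `llm.access` appears in this thread.

```python
import json
import urllib.request


def fetch_json(url, timeout=3):
    """GET a URL and decode the JSON body; raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def detect_ollama(base_url="http://ollama:11434", fetch=fetch_json):
    """Return suggested LLM settings if an Ollama server answers, else None.

    `fetch` is injectable so the probe can be exercised without a network.
    """
    try:
        payload = fetch(f"{base_url}/api/tags")
    except (OSError, ValueError):
        # URLError and timeouts are OSError subclasses; invalid JSON is a ValueError
        return None
    if not isinstance(payload, dict) or "models" not in payload:
        return None
    return {
        "llm.provider_type": "ollama",  # hypothetical setting key
        "llm.server": base_url,         # hypothetical setting key
        "llm.access": True,
    }
```

On a stack started without the Ollama override, the hostname would not resolve, the probe would return `None`, and the manual steps from the README would still apply.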
@copilot open a new pull request to apply changes based on the comments in this thread:

@dale-wahl I've opened a new pull request, #581, to work on those changes. Once the pull request is ready, I'll request review from you.
…config docs (#581) * Initial plan * Fix stale enabled models, disable refresh_items scheduling, update README docs Co-authored-by: dale-wahl <32108944+dale-wahl@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: dale-wahl <32108944+dale-wahl@users.noreply.github.com>
# Conflicts: # backend/workers/refresh_items.py
Todo:
@dale-wahl this is now mostly ready to merge, I think. I still need to test the migrate script (which I plan to do after merge, when we can test it on our own 4CAT) and the Docker setup. Can you do the latter? I tried it, but it pulled the stable image, which doesn't contain the relevant updates to the LLM code yet, so that would fail. Maybe that also needs to be tested after merging?

@stijn-uva running migrate from 1.53 results in the following error

@sal-uva should be fixed now, thanks!
So far, it seems to mean:
I am still reviewing, but I need to drop this someplace because it feels off. I think there are multiple axes here (e.g. the wrapper we use (Ollama, Anthropic, OpenAI etc.), where it is hosted, whose credentials are used). Possibly related (?), third party (the static file llms) are not showing as available to actually select. I think it is semi related because they have the Reminder for me: look at
`provider_key` was not used; made a general `wrapper` key for the model.
…d "Status" with action buttons
…y one provider; do not accidently delete enabled models when connection fails
…errors printing on fail)

    # if we have a categorised set of options, look deeper to get
    # valid option values
    is_categorised = all([type(o) is dict for o in options.values()])

@stijn-uva this looks weird to me. Maybe `options = settings.get("options", {})` if it really is supposed to be a dict; `list.values()` is going to fail.
Also, the `if choice not in match_options` check runs against the `chain` iterator, and a membership test on an iterator consumes it while searching for `choice`, so afterwards you'll only be left with the remaining items in `match_options`.
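
The iterator behaviour described in this comment can be reproduced in isolation. The snippet below is a standalone illustration, not the code under review: the shape of `options` (a dict of category dicts) is assumed from the `is_categorised` check, and the model names are placeholders.

```python
from itertools import chain

# Assumed shape: categorised options, i.e. a dict whose values are dicts
options = {
    "Local": {"llama3.2:3b": "Llama 3.2 3B"},
    "Third party": {"example-model": "Example model"},
}

# chain.from_iterable returns a one-shot iterator, not a reusable collection
match_options = chain.from_iterable(group.keys() for group in options.values())

first_check = "llama3.2:3b" in match_options   # consumes items up to and including the match
leftover = list(match_options)                 # only what the membership test did not consume
second_check = "llama3.2:3b" in match_options  # iterator is exhausted by now

print(first_check, leftover, second_check)

# Materialising the values once makes membership checks repeatable
valid_options = set(chain.from_iterable(group.keys() for group in options.values()))
repeatable = ("llama3.2:3b" in valid_options, "llama3.2:3b" in valid_options)
```

Wrapping the chain in `set()` or `list()` before the check is one way to keep every later use of `match_options` seeing the full set of values.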
dale-wahl left a comment
This one took me a bit. I do think some renaming would help, as "providers" collides with OpenAI, Anthropic, etc., and our "providers" are more like connections to LLM... services 😅 such as Ollama vs LiteLLM etc. I also think renaming `api` to `thirdparty` may be beneficial.
Once I understood the providers/clients were connections, I think we could expand on them to help with the other "axes" I mentioned in my earlier comment. I added `wrapper` to fix the Third Party class and allow you to choose which LangChain wrapper to use. We could also add an `egress` key so you could denote which connections are external vs internal (e.g. do we warn if a user is sending data to UvA via LiteLLM or whatever setup others come up with). You could also add `key_source` and allow users to provide the key. That axis is perhaps less important, but we are currently conflating it with the idea that `api` means a third-party "provider". And I wouldn't mind adding my own keys to providers for my own instance. (Plus we could then have keys available to groups of users by making providers/connections available by tag...).
All that said, I tested out some configurations and the docker setup and think we are pretty good.
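
To make the "axes" from this comment concrete, a connection record could be sketched as below. This is purely illustrative and not part of the PR: the class name, the field values, and the helper method are all hypothetical; only the key names `wrapper`, `egress` and `key_source` come from the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConnection:
    """One configured connection to an LLM service (hypothetical structure)."""

    name: str        # label shown to admins, e.g. "Local Ollama"
    wrapper: str     # which LangChain wrapper to use, e.g. "ollama" or "openai"
    egress: str      # "internal" or "external": does data leave the instance?
    key_source: str  # "instance" (admin-provided key) or "user" (user-provided key)

    def needs_egress_warning(self):
        """True when users should be told their data is sent to an external service."""
        return self.egress == "external"


connections = [
    LLMConnection("Local Ollama", wrapper="ollama", egress="internal", key_source="instance"),
    LLMConnection("Hosted API", wrapper="openai", egress="external", key_source="user"),
]

warn_for = [c.name for c in connections if c.needs_egress_warning()]
print(warn_for)
```

Keeping the three fields separate means the wrapper, the hosting location, and the owner of the credentials can vary independently rather than being implied by a single `api` prefix.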
That's right, boys and girls, now you can spin up an Ollama container right beside your 4CAT containers. Lil admin UI action to pull and delete models on it (should work with other Ollama servers as well) as well as enable and disable models (should work with `tags` as well, but did not test that).

The gist:

    docker compose -f docker-compose.yml -f docker-compose_ollama.yml up -d

It's that simple! (Or almost that simple; you do need to un-comment some lines if you want it to use GPU, but a) works without GPU--albeit slowly--and b) doesn't crash for those GPU-less users.)
You're welcome.
Fixes #564, fixes #563