(Zero Cost • Maximum Privacy • No API Keys)
Want to run your OpenClaw agent completely offline with zero monthly costs and zero data leaving your VPS? This guide shows you exactly how to switch to **100% local models** using Ollama, and highlights the brand-new **Gemma 4** family that's currently dominating local performance.
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma4:26b # My current daily driver (excellent balance)
ollama pull gemma4:31b # Maximum intelligence (if you have the RAM)
ollama pull gemma4:e4b # Lightweight & fast for lighter tasks
ollama list
| Model | Size | Best For | Recommended VPS RAM | Speed on 8-core VPS |
|---|---|---|---|---|
| gemma4:26b | 26B MoE | My daily driver: best overall balance | 32 GB | Very Fast |
| gemma4:31b | 31B Dense | Maximum intelligence & reasoning | 64+ GB | Fast |
| gemma4:e4b | ~9B effective | Lightweight & super fast | 16 GB | Extremely Fast |
| llama3.3 | 70B | Heavy tasks when needed | 64+ GB | Medium |
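The RAM column above can be turned into a small selection helper. This is a minimal sketch, assuming the thresholds from the table; the function names `pick_gemma_tag` and `total_ram_gb` are hypothetical, and the RAM probe reads `/proc/meminfo`, so it only works on Linux.

```shell
#!/bin/sh
# Hypothetical helper: map total RAM (whole GB) to a model tag,
# using the "Recommended VPS RAM" column from the table above.
pick_gemma_tag() {
  ram_gb=$1
  if [ "$ram_gb" -ge 64 ]; then
    echo "gemma4:31b"
  elif [ "$ram_gb" -ge 32 ]; then
    echo "gemma4:26b"
  elif [ "$ram_gb" -ge 16 ]; then
    echo "gemma4:e4b"
  else
    echo "none"   # below the smallest recommendation in the table
  fi
}

# Total RAM in GB, rounded up, read from /proc/meminfo (Linux only).
# Rounding up because a "32 GB" machine usually reports slightly less.
total_ram_gb() {
  awk '/^MemTotal:/ { g = $2 / 1048576; printf "%d\n", (g == int(g)) ? g : int(g) + 1 }' /proc/meminfo
}
```

A possible use is `tag=$(pick_gemma_tag "$(total_ram_gb)")`, followed by `ollama pull "$tag"` once you have checked that the result is not `none`.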
Once Ollama is running, switch your agent with one command (this is what I use daily):
openclaw models set ollama/gemma4:26b
Other useful commands:
- openclaw models list --local → See all your Ollama models
- openclaw models status → Check current model + speed
- openclaw models set ollama/gemma4:31b → Switch to the bigger beast when needed

After changing the model, always restart the gateway:
openclaw gateway restart
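Since the model switch and the gateway restart always go together, they can be wrapped in one function. This is only a sketch: `switch_model`, `run`, and the `DRY_RUN` variable are hypothetical names, and the two `openclaw` commands are the ones shown above.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1 (hypothetical convention).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Switch the agent to an Ollama tag, then restart the gateway.
# The restart only happens if the model switch succeeded.
switch_model() {
  tag=$1
  run openclaw models set "ollama/$tag" && run openclaw gateway restart
}
```

Running `DRY_RUN=1 switch_model gemma4:31b` prints the two commands without executing them, which is a cheap way to check the tag before touching the running agent.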
- Run ollama serve in the background with systemd
- Use /think low or medium for faster responses with Gemma 4

| Use Local (Ollama Gemma 4) | Use Cloud (Venice.ai) |
|---|---|
| Daily tasks, privacy-sensitive work, zero cost | Extremely complex reasoning or when you want Claude 4.6-level power |
| Offline capable | Faster on weaker hardware |
I run gemma4:26b locally as default and keep Venice.ai as fallback:
openclaw models fallbacks add venice/claude-4.6-opus
Here's a clean, practical list of Ollama CLI commands specifically for working with Gemma 4 (and general Ollama usage). All commands are run in your terminal.
ollama pull gemma4 # Default (E4B, ~9.6 GB), recommended starting point
ollama pull gemma4:e2b # Smaller & faster (~7.2 GB, great for laptops)
ollama pull gemma4:e4b # Same as default but explicit tag
ollama pull gemma4:26b # MoE model (~18 GB)
ollama pull gemma4:31b # Largest dense model (~20 GB)
ollama run gemma4 # Starts interactive chat with default model
ollama run gemma4:e2b # Or any specific tag
One-shot prompts (run once without entering chat):
ollama run gemma4 "Write a Python script to scrape a webpage"
ollama run gemma4:e2b "Explain quantum computing in simple terms"
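One-shot mode also lends itself to batch work. The sketch below sends each line of standard input as a separate prompt; `batch_prompts` and the `OLLAMA_BIN` override are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical batch runner: one prompt per input line,
# each sent to the model as its own one-shot run.
# OLLAMA_BIN can point at another binary (e.g. echo) for a dry run.
batch_prompts() {
  model=$1
  while IFS= read -r prompt; do
    [ -n "$prompt" ] || continue   # skip blank lines
    # Redirect stdin so the command cannot swallow the remaining prompts.
    "${OLLAMA_BIN:-ollama}" run "$model" "$prompt" < /dev/null
  done
}
```

For example, `batch_prompts gemma4:e2b < prompts.txt > answers.txt` would process a file of prompts in order, where `prompts.txt` is whatever file you prepared.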
Put the image path at the end of the prompt:
ollama run gemma4 "Describe this image in detail" /path/to/your/photo.jpg
ollama run gemma4 "What's written on this document?" ~/Desktop/invoice.png
(Works with the e2b/e4b variants too; they even support audio.)
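The same pattern extends to a folder of images. This is a rough sketch, assuming `.jpg` and `.png` files only; `describe_images` and `OLLAMA_BIN` are hypothetical names, and the prompt text is the one from the example above.

```shell
#!/bin/sh
# Hypothetical helper: describe every .jpg/.png in a directory,
# passing the image path after the prompt as in the examples above.
# Usage: describe_images DIR [MODEL]
describe_images() {
  dir=$1
  model=${2:-gemma4}
  for img in "$dir"/*.jpg "$dir"/*.png; do
    [ -f "$img" ] || continue   # an unmatched glob stays literal; skip it
    "${OLLAMA_BIN:-ollama}" run "$model" "Describe this image in detail" "$img" < /dev/null
  done
}
```

Setting `OLLAMA_BIN=echo` first shows which commands would run without loading a model.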
ollama list # (or ollama ls) โ see all models you have downloaded
ollama ps # show currently running models
ollama show gemma4 # view model info (parameters, architecture, etc.)
ollama show --modelfile gemma4 # see the full Modelfile
ollama rm gemma4 # delete a model to free up space
ollama cp gemma4 my-gemma4 # make a copy/rename
ollama stop gemma4 # stop a running model
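The output of `ollama list` is a plain table with a header row and the model name in the first column, so it is easy to script against. The helpers below are a sketch with hypothetical names (`model_names`, `is_downloaded`); both read the listing from standard input.

```shell
#!/bin/sh
# Print only the model names from `ollama list` output on stdin
# (skips the header row and any blank lines).
model_names() {
  awk 'NR > 1 && NF { print $1 }'
}

# Succeed if the exact tag given as $1 appears in the listing on stdin.
is_downloaded() {
  model_names | grep -Fqx -- "$1"
}
```

A typical use would be `ollama list | is_downloaded gemma4:26b || ollama pull gemma4:26b`. The match is exact, so a model pulled without a tag has to be looked up as `gemma4:latest`.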
ollama serve
ollama run gemma4 --keepalive 30m
ollama run gemma4 # no --temperature flag here; set it in-session with /set parameter temperature 0.7
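To keep a temperature or system prompt across sessions, one option is to bake them into a derived model with a Modelfile. The sketch below only generates the file contents; `make_modelfile` is a hypothetical helper, and the system prompt must not itself contain triple quotes.

```shell
#!/bin/sh
# Hypothetical helper: print a Modelfile that derives a model from a base
# tag with a fixed temperature and system prompt.
# Usage: make_modelfile BASE TEMPERATURE SYSTEM_PROMPT
make_modelfile() {
  base=$1
  temp=$2
  system=$3
  printf 'FROM %s\nPARAMETER temperature %s\nSYSTEM """%s"""\n' "$base" "$temp" "$system"
}
```

For instance, `make_modelfile gemma4 0.7 "You are a world-class Python developer." > Modelfile` followed by `ollama create my-gemma4 -f Modelfile` should give you a `my-gemma4` model that starts with those settings.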
Inside an ollama run session (after ollama run gemma4), you can type these:
- /bye or /exit → quit
- /clear → clear chat history
- /set system You are a world-class Python developer. → set a new system prompt
- /set parameter temperature 0.8 → change temperature on the fly

That's it! You now have a completely private, zero-cost OpenClaw agent running 24/7 on your VPS, powered by the latest Gemma 4 models.
Try gemma4:26b today and drop your favorite local model or performance results in the comments below!