From Blank Chat to Private Search: Setting Up SearXNG with Open WebUI on Windows
A step-by-step guide to decoupling your local LLMs from knowledge cutoffs using self-hosted, lightning-fast web search.
Local Large Language Models (LLMs) such as Microsoft’s phi-4-mini, Meta’s Llama 3, Alibaba Group’s Qwen3, and Mistral AI’s Mistral offer major advantages in privacy, low latency, and offline performance. However, they share a critical limitation: they are trapped in the past. When asked about recent events, emerging technologies, or real-time information, they rely solely on pre-trained knowledge constrained by a fixed training cutoff, often producing outdated or inaccurate responses.
By integrating Open WebUI with SearXNG—a privacy-respecting, self-hosted metasearch engine—you can build a completely private, localized alternative to commercial AI search tools. Here is the complete blueprint to install, connect, and optimize SearXNG with Open WebUI on a Windows machine using Docker.
1. Prerequisites
Before starting, ensure you have the following software active on your host system:
- WSL 2 (Windows Subsystem for Linux): Handles native Linux virtualization.
- Docker Desktop: Running with the WSL 2 backend enabled.
- Open WebUI: Installed and actively running in its own Docker container.
2. Preparing the Windows File Infrastructure
SearXNG requires configuration variables to be declared before the container fires up. Specifically, Open WebUI needs JSON integration format enabled, which is turned off by default in stock SearXNG deployments.
- Open PowerShell and create a dedicated root directory:
mkdir C:\searxng-docker
cd C:\searxng-docker
mkdir searxng - Open Notepad, paste the following text block, and save it as
C:\searxng-docker\searxng\settings.yml(Ensure you change the save type dropdown in Notepad to All Files (.) to prevent a hidden.txtextension):
use_default_settings: true
server:
port: 8080
bind_address: "0.0.0.0"
secret_key: "generate_a_long_random_string_here"
search:
formats:
- html
- json
3. Writing the Orchestration File
Next, create the execution file that tells Docker Desktop how to fetch, assemble, and run the container image.
- Open a new Notepad instance and paste this structure:
services:
searxng:
image: docker.io/searxng/searxng:latest
container_name: searxng
volumes:
- C:\searxng-docker\searxng:/etc/searxng:rw
ports:
- "8080:8080"
restart: unless-stopped
environment:
- SEARXNG_SETTINGS_PATH=/etc/searxng/settings.yml
- Save this file exactly as
docker-compose.ymldirectly inside your root directory:C:\searxng-docker\.
4. Deploying the Instance
- Return to your PowerShell terminal (ensure you are resting in
C:\searxng-docker). - Instruct Docker to build the environment cleanly:
docker compose up -d --force-recreate - Open Docker Desktop. You will now see a green active status ring next to both your
open-webuideployment and your freshsearxng-dockerapplication stack. - Verify functionality by executing a manual terminal command to ensure the JSON endpoint responds:
curl.exe "http://localhost:8080/search?q=test&format=json"
If raw text/JSON results pour down your screen, the container is working perfectly.
5. Bridging the Network Gap in Open WebUI
Because your Open WebUI container and your SearXNG container sit on completely different localized Docker networks, Open WebUI cannot identify localhost inside its own environment. It must speak to the Windows host proxy loopback.
- Open your web browser and load Open WebUI (
http://localhost:3000). - Navigate to: Admin Panel > Settings > Web Search.
- Toggle the Enable Web Search switch to ON.
- Select searxng from the search engine provider dropdown menu.
- In the SearXNG Query URL text box, use the internal host proxy line exactly as written:
http://host.docker.internal:8080/search?q=<query>
6. Crucial Speed Adjustments for Small Local Models
If you attempt a search query right now, a small model like phi4-mini may freeze or take a long time to answer. This lag happens because Open WebUI default settings force it to download and read every single website page found in the background.
To fix this latency and achieve instant search returns, check these boxes in your Web Search configuration panel:
- Bypass Web Loader: Turn ON. This forces the system to only read the quick summary snippets returned by SearXNG, skipping tedious full-page downloading.
- Bypass Embedding and Retrieval: Turn ON. This prevents your local PC from chewing through resource-heavy vector math operations on search data.
- Search Result Count: Lower this number down to
2or3. Small models handle small, dense context chunks much quicker than huge payloads. - Click Save.
7. Structuring the AI Synthesis Output
Because search engine data can be repetitive, you want your model to summarize facts gracefully rather than spitting out messy metadata.
- Click on your profile icon in the bottom-left corner and open your General Settings.
- Locate the System Prompt text zone.
- Inject clear, strict instructions so your model handles live data with poise:
“You are an AI assistant with access to real-time internet search results. Synthesize the provided search snippets into a direct, coherent, and clean response. Group relevant facts together logically, cite your sources naturally, and ignore duplicate text snippets.”
- Click Save.
The Verdict
Open a new chat tab, hit the + (plus) integration toggle, ensure Web Search is active, and type a question about current world news.
Your model will now systematically parse live network headlines, track ongoing data changes, and present a synthesized summary backed by link citations—all processed locally on your hardware.

