I found myself writing these notes up for someone and figured I’d document the steps publicly to help others. To be clear, this is not for the technically advanced. This is a very simple way to get this setup running, but it is not how I personally run it, nor would I recommend this approach for a home lab. This is for a non-technical person who just wants to get it running ASAP.

Step 1: Install Ollama

To get started, head over to ollama.com and download the application. If you’re on a Mac, simply drag the Ollama.app into your Applications folder. I recommend using the native app on Mac rather than the Docker version, as it generally runs faster.

After installation, you can open Ollama by double-clicking it. There’s no graphical user interface; Ollama runs quietly in the background, visible only as an icon in your menu bar.
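
If you want to confirm that Ollama is actually running, you can query its local API from the terminal. By default it listens on port 11434 and replies with a short “Ollama is running” message:

$ curl http://localhost:11434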

For more advanced usage, you can launch Ollama from the terminal with custom environment variables, for example to keep models loaded in memory for 30 minutes and to accept connections on all network interfaces:

$ OLLAMA_KEEP_ALIVE=30m OLLAMA_HOST=0.0.0.0:11434 ollama serve

Since I’m hosting on Linux, I run Ollama in Docker and manage it through a docker-compose.yaml file. Either way, Ollama should be listening on localhost:11434 by default.
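
For reference, a minimal docker-compose.yaml for Ollama might look something like the sketch below. The service name, volume path, and keep-alive value are example choices rather than the exact file I use, and GPU passthrough needs extra configuration not shown here:

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ./ollama:/root/.ollama
    environment:
      - OLLAMA_KEEP_ALIVE=30m
    restart: unless-stopped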

Step 2: Install Docker

We’ll need Docker to run our user interface. Install Docker Desktop by following Docker’s official installation guides for Windows or Mac.
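
Once Docker Desktop is installed and running, you can confirm it works from a terminal:

$ docker --version
$ docker run --rm hello-world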

Step 3: Install Open WebUI

With Docker installed, you can start Open WebUI by running this command in your terminal:

$ docker run -d -p 3000:8080 \
    --add-host=host.docker.internal:host-gateway \
    -v $HOME/open-webui:/app/backend/data \
    --name open-webui \
    --restart always \
    --env OLLAMA_KEEP_ALIVE=30m \
    --env OLLAMA_HOST=0.0.0.0 \
    --env WEBUI_AUTH=False \
    ghcr.io/open-webui/open-webui:main

You can now access Open WebUI by navigating to localhost:3000 in your web browser. Here’s what the command does:

  • Publishes the web interface on port 3000 (mapped to the container’s port 8080)
  • Stores data in ~/open-webui
  • Keeps LLMs loaded in memory for 30 minutes
  • Allows connections from other devices on your network
  • Disables authentication for quick access

Feel free to adjust the command parameters as needed.

If you prefer using Docker Compose, you can also configure Open WebUI with a docker-compose.yaml file.
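
Here is a rough Compose equivalent of the docker run command above. It’s a sketch rather than an official example; the service name is a placeholder, and ${HOME} stands in for your home directory:

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - ${HOME}/open-webui:/app/backend/data
    environment:
      - OLLAMA_KEEP_ALIVE=30m
      - OLLAMA_HOST=0.0.0.0
      - WEBUI_AUTH=False
    restart: always

Start it with docker compose up -d from the directory containing the file.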

Step 4: Configure Open WebUI

Open your browser and go to localhost:3000. In the bottom-right corner, click on User and select Admin Panel:

[Screenshot: Admin Panel]

Next, go to the Connections tab:

[Screenshot: Connections]

Enter http://localhost:11434 and test the connection using the button with the two arrows. If the test fails, try http://host.docker.internal:11434 instead, since Open WebUI runs inside Docker and localhost there refers to the container rather than your machine. Once the test succeeds, click Save.

Back in the Admin Settings, navigate to Models:

[Screenshot: Models]

Here, you can enter a model to download from Ollama; browse the available models at ollama.com/library. Depending on your hardware, I recommend one of the following (you can also pull a model from the terminal, as shown after the list):

  • llama3.1:8b
  • gemma2:2b
  • gemma2:9b
  • gemma2:27b (my personal favorite)
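
If you would rather download a model from the terminal instead of through the admin panel, Ollama’s CLI can pull it directly; substitute whichever tag you chose above:

$ ollama pull gemma2:9b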

After choosing a model, click New Chat in the top-left corner:

[Screenshot: New Chat]

Select your model and start chatting!
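
As a side note, you can also chat with a downloaded model directly in the terminal, without the web interface:

$ ollama run gemma2:9b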

Step 5: How to Chat with Your Documents

Click on Workspace in the top-left, then select Documents.

Add your documents by clicking the +, tag them if needed, and hit Save. Keep in mind that processing a large set of documents can take a while, and Open WebUI won’t show a progress indicator while it works.

To begin, click New Chat and type # in the message box to reference a specific document or all of your documents in your query. For example:

Tell me about my documents. Briefly summarize them.

How to Stop Everything

To shut down Ollama, simply quit the application from its icon in the menu bar.

To stop Open WebUI, run the following command in your terminal:

$ docker stop open-webui
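
Note that the container was created with --restart always, so it will come back automatically the next time Docker starts. To start it again by hand, or to remove it entirely, run:

$ docker start open-webui
$ docker rm -f open-webui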