Self-Host Llama 3 on Arch Linux

Technology is evolving fast. A person with no knowledge of how AI works can now easily set up and host a ChatGPT-like service on a personal computer. Llama is a large language model released by Meta AI, and Meta has recently released Llama 3, making it available for download. I have set it up in my Arch Linux home lab, but the instructions are not distro-specific and can be followed on other distros.

How It Works

Ollama is an application that downloads different language models and provides a CLI and an API web service for interacting with them. I am using it to download Llama 3.

Open WebUI provides a web UI for interacting with the model. It achieves this by integrating with the Ollama API.

Prerequisite: Make sure Docker is installed; I use it to install and run the Open WebUI container.
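
On Arch, a minimal way to get Docker running might look like this (package and service names assume the official docker package):

# Install Docker and start it on boot
sudo pacman -S docker
sudo systemctl enable --now docker.service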

Set Up Ollama

Run the installer script.

curl -fsSL https://ollama.com/install.sh | sudo sh

The script creates an ollama.service systemd service and starts it immediately. It also creates a group and a user, both named "ollama", and Ollama runs as the "ollama" user. You can check whether the Ollama service is running with the systemctl status ollama command.
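
A quick sanity check of what the installer created:

# Confirm the systemd service is active
systemctl status ollama

# Confirm the dedicated "ollama" user exists
id ollama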

Reference: https://github.com/ollama/ollama

Download Llama 3.

ollama pull llama3

This downloads the 8B version. If you prefer the 70B version, pull "llama3:70b" instead.
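
For example, pulling the larger variant instead (it needs significantly more memory than the 8B model):

ollama pull llama3:70b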

Run the model.

ollama run llama3

This launches a CLI for interacting with the model. You can start asking questions or ask it to write a poem.
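
You can also pass a prompt directly for a one-off answer, or list the models you have downloaded so far:

# One-off prompt without entering the interactive CLI
ollama run llama3 "Write a short poem about Arch Linux"

# List downloaded models
ollama list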

ollama.service launches an API web service, which listens on port 11434 by default. Let's set up a web UI to interact with it.
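
Before moving on, you can verify the API is reachable with curl; this is a minimal request against Ollama's /api/generate endpoint:

curl http://127.0.0.1:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'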

Web UI

docker run -d --network=host -v open-webui:/app/backend/data -e PORT=8080 -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

It downloads the Docker image and launches the container. -e PORT=8080 is optional, as the container listens on port 8080 by default; change it if you want a different port.
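
If the page does not come up, check that the container is running and look at its logs:

# Check container status
docker ps --filter name=open-webui

# Follow the container logs
docker logs -f open-webui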

Networking Issue

# Does not work for me
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Bridged networking with port mapping does not work for me. (If you insist on using bridged networking, check out github.com/open-webui/open-webui/issues/209.)

Now you can load the web UI in your browser at http://<server-ip>:8080. You have to sign up for a new user account; the first account created is assigned the administrator role. After logging in, you can disable new sign-ups.

In Settings > Connections > Ollama Base URL, make sure the value is "http://127.0.0.1:11434".

Start chatting...

To Infinity And Beyond!

Ollama makes it easy to try out different models. The list of supported models is available in the Ollama library (ollama.com/library).
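
For example, to try Mistral (one of the models in the library), the workflow is the same:

ollama pull mistral
ollama run mistral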

If you have a domain name and want to make the service publicly accessible, you can set up a reverse proxy. This is a sample Apache configuration:

# Ensure these 2 modules are loaded: mod_proxy, mod_proxy_http
<VirtualHost *:80>
    ServerName example.com

    ProxyPreserveHost On
    ProxyPass / http://127.0.0.1:8080/
    ProxyPassReverse / http://127.0.0.1:8080/

    ErrorLog /var/log/httpd/error_log
    CustomLog /var/log/httpd/access_log common
</VirtualHost>
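
On Arch, a minimal way to enable those two modules and start Apache, assuming the default /etc/httpd/conf/httpd.conf layout:

# In /etc/httpd/conf/httpd.conf, uncomment or add:
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

# Then enable and start the web server
sudo systemctl enable --now httpd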
