Host your own AI Server using Proxmox and Ollama and connect PHPStorm to it

Published on Feb 13, 2025

This is a guide on how to set up an AI server on Proxmox, connect it to PHPStorm, and use Tailscale for remote access.

I have a rather beefy server at home that I use mostly as my Windows gaming machine. It runs Proxmox because, from the beginning, I knew I wanted a server where I could run VMs and containers from a distance.

You may not know, but I travel a lot with my van, and I can't bring all my equipment with me.

For context, here are the hardware specs of my server:

  • CPU: AMD Ryzen 9 5900X 12-Core Processor
  • RAM: 32GB (I need to upgrade it to 128GB or something)
  • GPU: Nvidia RTX 3070 + Nvidia RTX 3060 Ti
  • Storage: 2TB NVMe + 2x 60GB SSD

Create a new VM

We begin by setting up an Ubuntu VM on Proxmox, allocating the necessary resources for your AI server. You can download an ISO and create a new VM from it with the console, but I prefer to use a script to automate the process.

The script comes from the community-scripts repository, and it's a simple bash script that you can run on your Proxmox server.

Run the following script to deploy the Ubuntu VM:

bash -c "$(wget -qLO - https://github.com/community-scripts/ProxmoxVE/raw/main/vm/ubuntu2404-vm.sh)"

The script will ask you for some information; here is what I used:

  • I allocated 512 GB of storage because LLM models are huge.
  • I gave it 24576 MB (24 GB) of RAM. I tried 32 GB, but it made my server crash, so keep some RAM for the host. Ollama is good at distributing the load across the CPU and GPUs, which gives me plenty of room to test large models.
  • Do not start the VM immediately after installation.

After the installation is finished, in the Cloud-Init tab:

  • Configure the user: set a password and/or add your public SSH key.
  • I chose to use DHCP for the network configuration because I will only access it from my Tailscale network.
  • Click on Regenerate image

Before starting the VM, go to the Hardware tab and add your GPU to the VM.

  • Add -> PCI Device -> Raw Device -> Your GPU
  • Check All functions
  • In the advanced options, check ROM-Bar
  • Start the VM.
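
Note that PCI passthrough only works if IOMMU is enabled on the Proxmox host (AMD-Vi on AMD CPUs, VT-d on Intel). If the GPU refuses to show up in the guest, a quick generic check to run on the Proxmox host itself is:

# On the Proxmox host: look for lines confirming IOMMU / AMD-Vi is enabled
dmesg | grep -e DMAR -e IOMMU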

After that, you have a fresh Ubuntu VM ready to be used.

I took some time to install ZSH and Oh-My-Zsh, but do as you wish.

Install required Nvidia CUDA Drivers and Toolkit

Now it's time to configure this server to work well with AI models.

Start by confirming that your GPU is detected with:

lspci | grep -i nvidia

Now we need to install Nvidia's CUDA Toolkit and drivers.
The instructions for your machine can be found on the CUDA downloads page.

For our Ubuntu 24.04, we will use the following commands:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8
sudo apt-get install -y cuda-drivers

Now we need to reboot the server to apply the changes:

sudo reboot

Set the environment variables:

export PATH=/usr/local/cuda-12/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
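
These exports only last for the current shell session. To make them permanent, append them to your shell's rc file (~/.zshrc in my case, ~/.bashrc otherwise):

echo 'export PATH=/usr/local/cuda-12/bin${PATH:+:${PATH}}' >> ~/.zshrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.zshrc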

And verify the installation:

nvcc --version
nvidia-smi

That last command shows the GPU usage and the CUDA version.

Later you can watch it live with:

watch -n 0.5 nvidia-smi

There is also a command called nvtop that you can install with:

sudo apt-get install nvtop

Docker Installation

For some software that we will use, we need to install Docker.

Follow the steps detailed in the Docker installation guide, or copy and paste the following commands:

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
 
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
 
# Install Docker
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Confirm the installation by running:

sudo docker run hello-world
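
If you don't want to prefix every Docker command with sudo, you can also add your user to the docker group (log out and back in for it to take effect):

sudo usermod -aG docker $USER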

Install Nvidia Container Toolkit

I don't use Ollama in a Docker container, but you may want to. Or maybe you need to run some other containers that need access to the GPU. This section ensures that you can use the GPU in a container in the future.

Nvidia container toolkit documentation

Follow the instructions for your distribution.
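
For reference, at the time of writing the apt-based instructions boil down to roughly the following; double-check the linked documentation, as the repository setup occasionally changes:

# Add the Nvidia Container Toolkit repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit and configure Docker to use it
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Quick test: the container should print the same output as nvidia-smi on the host
sudo docker run --rm --gpus all ubuntu nvidia-smi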

Use Ollama for running AI models

Ollama is a powerful tool for running AI models on your server. It provides a simple command-line interface to run various AI models with ease.

It also comes with an API that can be used by other applications.

One feature that I like is that it can distribute the load across multiple GPUs and CPUs. I try to run models that only use the GPU; the rule of thumb is to pick models that fit in the combined memory of all your GPUs.

This is our true AI server.

I decided to install Ollama directly on the VM, but you can also use a Docker container.

Install Ollama with the provided script:

curl -fsSL https://ollama.com/install.sh | sh
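
On Ubuntu, the script sets Ollama up as a systemd service, so you can confirm it is running with:

systemctl status ollama
ollama --version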

Basic Usage of Ollama

Run various AI models with commands such as:

ollama run llama3.2 --verbose
ollama run deepseek-r1:14b --verbose
ollama run deepseek-r1:8b --verbose

I like to use the --verbose flag to see stats like tokens per second and evaluation durations after every answer.

You can also just pull the models without running them:

ollama pull mistral

You can also see how the load is distributed on the GPUs and CPUs with:

ollama ps
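
Ollama also exposes its API on port 11434 by default, so any application can query it over HTTP. A quick test with curl, using one of the models pulled above:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain what a VM is in one sentence.",
  "stream": false
}'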

Make Ollama listen outside of localhost

This step is needed later so that PHPStorm can connect to Ollama.

sudo systemctl edit ollama.service

Add the following lines at the beginning of the file:

[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"

OpenWebUI Installation

OpenWebUI is a web interface that gives you a ChatGPT interface to interact with your AI server.

See the OpenWebUI GitHub repository for detailed installation instructions.

Here is a quick guide to install OpenWebUI:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

You can now access OpenWebUI at http://your-server-ip:3000.

The first account created will be the admin account.

Tailscale

Tailscale is your personal VPN that makes it easy to connect your devices over the internet without any port forwarding.

To install Tailscale, the easiest way is to go to your Tailscale admin console and get the command to install it on your server.
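
The command from the admin console is essentially the official install script followed by tailscale up (possibly with an auth key appended), roughly:

curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up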

The next step is to tell Tailscale to serve OpenWebUI:

sudo tailscale serve --bg 3000

You can now access OpenWebUI from anywhere, over HTTPS, using your Tailscale MagicDNS name.

Example: https://aiserver.tail1234567.ts.net

PHPStorm Integration

For PHPStorm to connect to Ollama, we need to install the official PHPStorm AI plugin.

In the configuration, you can add your Ollama server with the IP and port.
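
In my case, since the VM is reachable over Tailscale and Ollama now listens on all interfaces, the URL to point the plugin at looks like this (same MagicDNS name as before, with Ollama's default port):

http://aiserver.tail1234567.ts.net:11434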

Update Ollama and OpenWebUI

To update OpenWebUI automatically, you can use Watchtower.

Here is a command to update OpenWebUI once:

sudo docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui

And here is a command to update OpenWebUI every hour:

sudo docker run -d \
--restart always \
--volume /var/run/docker.sock:/var/run/docker.sock \
containrrr/watchtower --interval 3600 open-webui

The only way to update Ollama is to run the install script again.

Here is a command to update Ollama every week:

sudo crontab -e

Add the following line to the file:

0 0 * * 0 curl -fsSL https://ollama.com/install.sh | sh

Conclusion

And voilà!

You now have a personal AI server that you can access from anywhere with PHPStorm and OpenWebUI.

You should dig into the documentation of Ollama and OpenWebUI to see all the features they offer.

Examples of advanced features you can explore:

  • Image generation
  • Indexing documents with embeddings
  • Chat with ChatGPT and local models at the same time
  • Use multimodal models like llama3.2-vision

#proxmox #ai #phpstorm #tailscale #ollama #openwebui #docker
