Introduction

If you're running AMD's Strix Halo hardware -- specifically the Ryzen AI MAX+ 395 with its integrated Radeon 8060S GPU -- you already know the software ecosystem is a moving target. The gfx1151 architecture sits in an awkward spot: powerful hardware that isn't officially listed on AMD's ROCm support matrix, yet functional enough to run real workloads with the right driver stack. When ROCm 7.2 landed in early 2026, upgrading from 7.0.2 was a priority. The newer stack brings an updated HSA runtime, a refreshed amdgpu kernel module, and broader compatibility improvements that matter on bleeding-edge silicon.

This post documents the complete upgrade procedure from ROCm 7.0.2 to 7.2 on a production Ubuntu 24.04 system. It's not a theoretical exercise -- this was performed on a live server running QEMU virtual machines and network services, with the expectation that everything would come back online after a single reboot.

AMD's official documentation states that in-place ROCm upgrades are not supported. The recommended path is a full uninstall followed by a clean reinstall. That's exactly what we did, and the entire process took about 20 minutes of wall-clock time (excluding the reboot).

System Overview

The target system is a Bosgame mini PC running the Ryzen AI MAX+ 395 APU. If you've read the earlier review of this hardware, you'll be familiar with the specs. For context on this upgrade, here's what matters:

Hardware

  • CPU: AMD Ryzen AI MAX+ 395, 16 cores / 32 threads, Zen 5
  • GPU: Integrated Radeon 8060S, 40 Compute Units, RDNA 3.5 (gfx1151)
  • Memory: 128 GB LPDDR5X, unified architecture with up to 96 GB allocatable to the GPU
  • Peak GPU Clock: 2,900 MHz

Software (Pre-Upgrade)

  • OS: Ubuntu 24.04.3 LTS (Noble Numbat)
  • Kernel: 6.14.0-37-generic (HWE, pinned)
  • ROCm: 7.0.2
  • amdgpu-dkms: 6.14.14 (from repo.radeon.com/amdgpu/30.10.2)
  • ROCk Module: 6.14.14

Running Services

The system was actively serving several roles during the upgrade:

  • Five QEMU virtual machines (three x86, two aarch64)
  • A PXE boot server (dnsmasq) for the local network
  • Docker daemon with various containers

None of these services are tied to the GPU driver stack, so the plan was to perform the upgrade and reboot without shutting them down first. The VMs and network services would come back automatically after the reboot.
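
Still, it's worth a quick check that everything really is configured to come back on its own. A minimal sanity check, assuming the VMs are managed through libvirt and the services run as stock systemd units (adjust the names to your setup):

# Services that must start on boot (unit names assume stock Ubuntu packages)
systemctl is-enabled libvirtd dnsmasq docker

# libvirt domains flagged for autostart -- anything missing here stays down after a reboot
virsh list --all --autostart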

Why Upgrade

ROCm 7.0.2 worked on this hardware. Models loaded, inference ran, rocminfo detected the GPU. So why bother upgrading?

Three reasons:

  1. Driver maturity for gfx1151: The amdgpu kernel module jumped from 6.14.14 to 6.16.13 between the two releases. That's not a minor revision -- it represents months of kernel driver development. On hardware that isn't officially supported, newer drivers tend to bring meaningful stability improvements as AMD's internal teams encounter and fix issues on adjacent architectures.

  2. HSA Runtime improvements: ROCm 7.2 ships HSA Runtime Extension version 1.15, up from 1.11 in ROCm 7.0.2. The HSA (Heterogeneous System Architecture) runtime is the lowest layer of the ROCm software stack -- it handles device discovery, memory management, and kernel dispatch. Improvements here affect everything built on top of it.

  3. Ecosystem alignment: PyTorch wheels, Ollama builds, and other ROCm-dependent tools increasingly target 7.2 as the baseline. Running 7.0.2 was becoming an exercise in version pinning and compatibility workarounds.

The Kernel Hold: Why It Matters

Before diving into the procedure, a note on kernel management. This system runs the Ubuntu HWE (Hardware Enablement) kernel, which provides newer kernel versions on LTS releases. At the time of this upgrade, the HWE kernel was 6.14.0-37-generic. The upstream kernel had already moved to 6.17, but we didn't want the ROCm upgrade to pull in a kernel that AMD's DKMS module might not build against.

The solution is apt-mark hold:

sudo apt-mark hold linux-generic-hwe-24.04 linux-headers-generic-hwe-24.04 linux-image-generic-hwe-24.04

This prevents apt from upgrading the kernel meta-packages, effectively pinning the system to 6.14.0-37-generic. The hold was already in place before the upgrade and remained untouched throughout. After the upgrade, we confirmed it was still active:

apt-mark showhold
linux-generic-hwe-24.04
linux-headers-generic-hwe-24.04
linux-image-generic-hwe-24.04

If you're running Strix Halo or any other hardware where kernel compatibility with amdgpu-dkms is uncertain, kernel holds are essential. A kernel upgrade that breaks the DKMS build means no GPU driver after reboot.
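
A related pre-flight check before any DKMS-based install is confirming that headers for the running kernel are actually present -- without them the amdgpu module can't be compiled:

# The DKMS build compiles against headers matching the running kernel
uname -r
dpkg -l "linux-headers-$(uname -r)" | grep '^ii' || echo "headers missing for running kernel"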

Upgrade Procedure

Step 1: Uninstall the Current ROCm Stack

AMD provides the amdgpu-uninstall script for exactly this purpose. It removes all ROCm userspace packages and the amdgpu-dkms kernel module in a single operation:

sudo amdgpu-uninstall -y

This command removed approximately 120 packages, including the full HIP runtime, rocBLAS, MIOpen, MIGraphX, ROCm SMI, the LLVM-based compiler toolchain, and the Mesa graphics drivers that ship with ROCm. The DKMS module was purged, which means the amdgpu kernel module was removed from the 6.14.0-37-generic kernel's module tree.

After the ROCm stack was removed, we purged the amdgpu-install meta-package itself:

sudo apt purge -y amdgpu-install

This also cleaned up the APT repository entries that amdgpu-install had configured in /etc/apt/sources.list.d/. The old repos -- repo.radeon.com/amdgpu/30.10.2, repo.radeon.com/rocm/apt/7.0.2, and repo.radeon.com/graphics/7.0.2 -- were all removed automatically.
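
Before moving on, a quick check confirms the old stack and its repository entries are really gone (the package-name patterns below are a rough filter, not an exact list):

# Remaining ROCm/HIP/amdgpu packages -- ideally nothing prints
dpkg -l | grep -Ei 'rocm|hip|amdgpu' || echo "no ROCm packages remaining"

# Repository definitions left behind by amdgpu-install
grep -ri radeon /etc/apt/sources.list.d/ || echo "no radeon repos configured"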

Step 2: Clean Up Leftover Files

The package removal was thorough but not perfect. A few leftover directories remained in /opt/:

ls /opt/ | grep rocm
rocm-7.0.0
rocm-7.0.2
rocm-7.9.0

The rocm-7.0.0 directory was from a previous installation attempt. The rocm-7.9.0 directory was left over from an earlier experiment with a release-candidate build. The rocm-7.0.2 directory contained a single orphaned shared library (libamdhip64.so.6) that dpkg couldn't remove because the directory wasn't empty. All three were cleaned up manually:

sudo rm -rf /opt/rocm-7.0.0 /opt/rocm-7.0.2 /opt/rocm-7.9.0

It's worth checking for stale ROCm directories after any uninstall. They consume negligible disk space but can confuse build systems and scripts that scan /opt/rocm* for active installations.
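
Beyond /opt, a few other places can hold stale references to a removed ROCm install. The checks below are optional and cover only the usual suspects:

# Leftover install trees under /opt
ls -d /opt/rocm* 2>/dev/null

# Linker configuration still pointing at old ROCm library paths
grep -rl rocm /etc/ld.so.conf.d/ 2>/dev/null

# Shell environment still referencing old ROCm paths
printenv PATH LD_LIBRARY_PATH | tr ':' '\n' | grep -i rocm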

Step 3: Install the ROCm 7.2 Installer

AMD distributes ROCm through a meta-package called amdgpu-install. Each ROCm release has its own version of this package, which configures the appropriate APT repositories. The 7.2 installer was downloaded directly from AMD's repository:

cd /tmp
wget https://repo.radeon.com/amdgpu-install/7.2/ubuntu/noble/amdgpu-install_7.2.70200-1_all.deb
sudo apt install -y ./amdgpu-install_7.2.70200-1_all.deb
sudo apt update

After installation and apt update, three new repositories were active:

  • https://repo.radeon.com/amdgpu/30.30/ubuntu noble -- the kernel driver and Mesa components
  • https://repo.radeon.com/rocm/apt/7.2 noble -- the ROCm userspace stack
  • https://repo.radeon.com/graphics/7.2/ubuntu noble -- graphics libraries

The version numbering can be confusing. The amdgpu-install package version is 30.30.0.0.30300000-2278356.24.04, which maps to the amdgpu driver release 30.30. The ROCm version is 7.2.0. These are different version tracks that AMD maintains in parallel.
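
To confirm apt is now resolving packages from the new repositories, check the candidate versions before installing anything (amdgpu-dkms and rocm-core are typical package names to probe; any ROCm package works):

# Candidate versions and their origin repos should point at 30.30 / 7.2
apt-cache policy amdgpu-dkms rocm-core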

Step 4: Install ROCm 7.2

With the repositories configured, the actual installation was a single command:

sudo amdgpu-install -y --usecase=graphics,rocm

The --usecase=graphics,rocm flag tells the installer to include both the Mesa graphics drivers and the full ROCm compute stack. This is the right choice for a system that needs both display output and GPU compute capabilities.
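
If you're unsure which use cases a given installer release supports, the installer can list them, which is handy before committing to a combination:

# Enumerate the use cases this amdgpu-install release understands
sudo amdgpu-install --list-usecase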

The installation took approximately 10 minutes and included:

  • amdgpu-dkms 6.16.13: The kernel module, compiled via DKMS against the running kernel
  • Full ROCm 7.2 stack: HIP runtime, hipcc compiler, rocBLAS, rocFFT, MIOpen, MIGraphX, RCCL, ROCm SMI, ROCProfiler, and dozens of other libraries
  • Mesa graphics: Updated EGL, OpenGL, and Vulkan drivers from the amdgpu Mesa fork
  • ROCm LLVM toolchain: The LLVM-based compiler infrastructure that HIP uses for kernel compilation

The DKMS build is the critical step. During installation, DKMS compiled the amdgpu module against the kernel headers for 6.14.0-37-generic. The output confirmed a successful build:

depmod...
update-initramfs: Generating /boot/initrd.img-6.14.0-37-generic

The initramfs was regenerated to include the new module, ensuring it would be loaded at boot.
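
Both claims are easy to verify before rebooting -- that the new module is the one the running kernel will load, and that it was packed into the regenerated initramfs:

# Path of the amdgpu module that modprobe will pick up (expected under updates/dkms)
modinfo -n amdgpu

# Confirm the module made it into the new initramfs
lsinitramfs /boot/initrd.img-6.14.0-37-generic | grep -m1 'amdgpu\.ko'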

Step 5: Verify DKMS

Before rebooting, we confirmed the DKMS status:

dkms status
amdgpu/6.16.13-2278356.24.04, 6.14.0-37-generic, x86_64: installed
virtualbox/7.0.16, 6.14.0-36-generic, x86_64: installed
virtualbox/7.0.16, 6.14.0-37-generic, x86_64: installed
virtualbox/7.0.16, 6.8.0-100-generic, x86_64: installed

The new amdgpu module (6.16.13) was built and installed for 6.14.0-37-generic, the running kernel. It was only built for that kernel, unlike VirtualBox, which still shows modules for 6.14.0-36 and 6.8.0-100. This is expected: those VirtualBox builds date from when the older kernels and their headers were installed, whereas the freshly installed amdgpu package triggered a DKMS build only against the kernel currently in use.

Step 6: Reboot

sudo reboot

The server came back online in approximately 50 seconds.

Post-Reboot Verification

rocminfo

The first check after reboot was rocminfo, which queries the HSA runtime for available agents:

rocminfo
ROCk module version 6.16.13 is loaded
=====================
HSA System Attributes
=====================
Runtime Version:         1.18
Runtime Ext Version:     1.15
...
==========
HSA Agents
==========
Agent 1: AMD RYZEN AI MAX+ 395 w/ Radeon 8060S (CPU)
Agent 2: gfx1151 (GPU)
  Marketing Name:          AMD Radeon Graphics
  Compute Unit:            40
  Max Clock Freq. (MHz):   2900
  Memory Properties:       APU
  ISA 1: amdgcn-amd-amdhsa--gfx1151
  ISA 2: amdgcn-amd-amdhsa--gfx11-generic

Key observations:

  • ROCk module 6.16.13: The new kernel module loaded successfully.
  • Runtime Ext Version 1.15: Upgraded from 1.11 in ROCm 7.0.2.
  • gfx1151 detected: The GPU was recognized with its correct ISA identifier.
  • gfx11-generic ISA: ROCm 7.2 also exposes a generic gfx11 ISA, which allows software compiled for the broader RDNA 3 family to run on this device without gfx1151-specific builds.
  • APU memory: The memory properties correctly identify this as an APU with unified memory.

ROCm SMI

rocm-smi
Device  Node  Temp    Power     SCLK  MCLK     Fan  Perf  VRAM%  GPU%
0       1     33.0C   9.087W    N/A   1000Mhz  0%   auto  0%     0%

The GPU was visible and reporting telemetry. The 0% VRAM reading is expected on an APU -- rocm-smi reports dedicated VRAM usage, but on a unified memory architecture, GPU memory allocations come from system RAM and aren't reflected in this counter.
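
If you do want to see what the GPU is consuming on an APU, the driver exposes separate VRAM and GTT counters, and GTT is where unified-memory allocations typically land. Two ways to read them (the card index under /sys varies by system):

# Per-pool usage from rocm-smi: vram is the carveout, gtt is system RAM used by the GPU
rocm-smi --showmeminfo vram gtt

# The same counters straight from the amdgpu sysfs interface, reported in bytes
for f in /sys/class/drm/card*/device/mem_info_gtt_used; do
    echo "$f: $(cat "$f")"
done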

ROCm Version

cat /opt/rocm/.info/version
7.2.0

DKMS

dkms status

Confirmed amdgpu/6.16.13 remained installed for 6.14.0-37-generic after reboot.

PyTorch Validation

With the driver stack verified, the next step was confirming that PyTorch could see and use the GPU. ROCm 7.2 ships with prebuilt PyTorch wheels on AMD's repository.

Installing PyTorch for ROCm 7.2

We set up a Python virtual environment and installed the ROCm-specific wheels:

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip

The PyTorch wheel for ROCm 7.2 requires a matching ROCm-specific build of Triton. Both are available from AMD's manylinux repository. The order matters -- Triton must be installed first, since the PyTorch wheel declares it as a dependency with a specific version that doesn't exist on PyPI:

pip install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/triton-3.5.1%2Brocm7.2.0.gita272dfa8-cp312-cp312-linux_x86_64.whl
pip install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/torch-2.9.1%2Brocm7.2.0.lw.git7e1940d4-cp312-cp312-linux_x86_64.whl
pip install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/torchvision-0.24.0%2Brocm7.2.0.gitb919bd0c-cp312-cp312-linux_x86_64.whl

These are the ROCm 7.2 builds for Python 3.12. AMD also provides wheels for Python 3.10, 3.11, and 3.13.
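
A quick check that the installed wheels actually match the interpreter and the ROCm release (versions shown will vary with your environment):

# The cp312 wheels above require a Python 3.12 interpreter
python3 -V

# Installed versions should carry the +rocm7.2.0 local version tag
pip show torch torchvision triton | grep -E '^(Name|Version):'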

Smoke Test

import torch
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Device:", torch.cuda.get_device_name(0))
print("VRAM:", round(torch.cuda.get_device_properties(0).total_memory / 1e9, 1), "GB")
PyTorch: 2.9.1+rocm7.2.0.git7e1940d4
CUDA available: True
Device: AMD Radeon Graphics
VRAM: 103.1 GB

PyTorch detected the GPU through ROCm's HIP-to-CUDA translation layer. The 103.1 GB figure represents the total addressable memory on this unified-memory APU, which includes both the 96 GB GPU allocation and additional system memory accessible through the HSA runtime.

Note the use of torch.cuda despite this being an AMD GPU. ROCm's HIP runtime presents itself through PyTorch's CUDA interface, so all CUDA API calls in PyTorch (device selection, memory management, kernel launches) work transparently with AMD hardware.
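
Two one-liners make the HIP backend explicit -- the first prints the HIP version the wheel was built against, the second lists the GPU ISAs it ships kernels for (per the support notes below, expect gfx1100-class targets rather than gfx1151):

# A non-empty HIP version confirms this is a ROCm build, not a CUDA build
python3 -c "import torch; print(torch.version.hip)"

# GPU ISAs the wheel was compiled for
python3 -c "import torch; print(torch.cuda.get_arch_list())"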

Before and After Summary

Component         ROCm 7.0.2          ROCm 7.2.0
ROCm Version      7.0.2               7.2.0
amdgpu-dkms       6.14.14             6.16.13
ROCk Module       6.14.14             6.16.13
HSA Runtime Ext   1.11                1.15
amdgpu Repo       30.10.2             30.30
PyTorch           N/A                 2.9.1+rocm7.2.0
Triton            N/A                 3.5.1+rocm7.2.0
Kernel            6.14.0-37-generic   6.14.0-37-generic
Kernel Holds      In place            In place

Notes on gfx1151 Support

It's worth being explicit about the support situation. As of February 2026, gfx1151 (Strix Halo) is not listed on AMD's official ROCm support matrix. The supported RDNA 3 targets are gfx1100 (Navi 31, RX 7900 XTX) and gfx1101 (Navi 32). Strix Halo's gfx1151 is an RDNA 3.5 derivative that shares much of the ISA with gfx1100 but has architectural differences in the memory subsystem and compute unit layout.

In practice, ROCm 7.2 works on gfx1151. The kernel driver loads, rocminfo detects the GPU, and PyTorch can allocate tensors and dispatch compute kernels. The gfx11-generic ISA target in ROCm 7.2 is particularly helpful -- it provides a compatibility path for software that hasn't been explicitly compiled for gfx1151.

However, "works" and "fully supported" are different things. There are known quirks:

  • rocm-smi VRAM reporting: Always shows 0% on the APU since it only tracks discrete VRAM
  • No official PyTorch gfx1151 builds: The ROCm PyTorch wheels target gfx1100. They run on gfx1151 through ISA compatibility, but performance may not be optimal
  • Large model loading latency: Moving large models to the GPU device can be slow on the unified memory architecture, as the HSA runtime handles page migration differently than discrete GPU DMA transfers

If you're considering this hardware for production AI workloads, treat ROCm support as "functional but experimental." It works well enough for development, testing, and moderate inference workloads. For production training or latency-sensitive deployment, stick with hardware on AMD's official support list.

Rollback Plan

If the upgrade fails -- the DKMS module doesn't build, the GPU isn't detected after reboot, or something else goes wrong -- the rollback path is straightforward:

  1. Uninstall ROCm 7.2:
sudo amdgpu-uninstall -y
sudo apt purge -y amdgpu-install
  2. Reinstall ROCm 7.0.2:
wget https://repo.radeon.com/amdgpu-install/30.10.2/ubuntu/noble/amdgpu-install_30.10.2.0.30100200-2226257.24.04_all.deb
sudo apt install -y ./amdgpu-install_30.10.2.0.30100200-2226257.24.04_all.deb
sudo apt update
sudo amdgpu-install -y --usecase=graphics,rocm
sudo reboot

The entire rollback takes about 15 minutes. Keep the old amdgpu-install deb URL handy -- it's not linked from AMD's current download pages once a newer version is published.
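
One extra precaution makes any rollback less stressful: snapshot the exact package versions before starting an upgrade, so "back to 7.0.2" has a precise meaning. The output file name below is arbitrary:

# Record the installed driver/ROCm package versions before touching anything
dpkg-query -W -f='${Package}=${Version}\n' | grep -Ei 'rocm|amdgpu|hip' > ~/rocm-7.0.2-packages.txt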

Conclusion

Upgrading ROCm on hardware that isn't officially supported always carries some risk, but this upgrade from 7.0.2 to 7.2 on gfx1151 was uneventful. The procedure follows AMD's documented uninstall-reinstall approach with no deviations. The kernel hold strategy kept the kernel stable, the DKMS module built cleanly against 6.14.0-37-generic, and all post-reboot checks passed.

The improvements in ROCm 7.2 -- particularly the HSA runtime bump to 1.15 and the introduction of the gfx11-generic ISA target -- represent meaningful progress for Strix Halo users. The ecosystem is slowly catching up to the hardware. It's not there yet, but each release closes the gap.

For anyone running a Ryzen AI MAX+ 395 or similar Strix Halo hardware on Ubuntu 24.04, this upgrade is worth doing. The procedure is well-defined, the rollback path is clear, and the newer driver stack brings tangible benefits. Just remember to hold your kernel first.
