<!--
.. title: Upgrading ROCm 7.0 to 7.2 on AMD Strix Halo (gfx1151)
.. slug: upgrading-rocm-7.0-to-7.2-on-amd-strix-halo-gfx1151
.. date: 2026-02-18 10:00:00 UTC-06:00
.. tags: ROCm, AMD, Strix Halo, gfx1151, GPU Computing, Linux, Ubuntu, DKMS, PyTorch, amdgpu, Ryzen AI, driver upgrade
.. category: Hardware and Software Setup
.. link:
.. description: Step-by-step guide to upgrading from ROCm 7.0.2 to ROCm 7.2 on an AMD Ryzen AI MAX+ 395 system with the Radeon 8060S (gfx1151) GPU. Covers the full uninstall-reinstall procedure on Ubuntu 24.04, DKMS verification, kernel hold management, and post-upgrade PyTorch validation on Strix Halo hardware that isn't officially on AMD's support matrix.
.. type: text
-->

<div class="audio-widget">
<div class="audio-widget-header">
<span class="audio-widget-icon">🎧</span>
<span class="audio-widget-label">Listen to this article</span>
</div>
<audio controls preload="metadata">
<source src="/upgrading-rocm-7.0-to-7.2-on-amd-strix-halo-gfx1151_tts.mp3" type="audio/mpeg">
</audio>
<div class="audio-widget-footer">15 min · AI-generated narration</div>
</div>

## Introduction

If you're running AMD's Strix Halo hardware -- specifically the Ryzen AI MAX+ 395 with its integrated Radeon 8060S GPU -- you already know the software ecosystem is a moving target. The gfx1151 architecture sits in an awkward spot: powerful hardware that isn't officially listed on AMD's ROCm support matrix, yet functional enough to run real workloads with the right driver stack. When ROCm 7.2 landed in early 2026, upgrading from 7.0.2 was a priority. The newer stack brings an updated HSA runtime, a refreshed amdgpu kernel module, and broader compatibility improvements that matter on bleeding-edge silicon.

This post documents the complete upgrade procedure from ROCm 7.0.2 to 7.2 on a production Ubuntu 24.04 system. It's not a theoretical exercise -- this was performed on a live server running QEMU virtual machines and network services, with the expectation that everything would come back online after a single reboot.

AMD's official documentation states that in-place ROCm upgrades are not supported. The recommended path is a full uninstall followed by a clean reinstall. That's exactly what we did, and the entire process took about 20 minutes of wall-clock time (excluding the reboot).

## System Overview

The target system is a [Bosgame mini PC](https://baud.rs/WZgnl1) running the Ryzen AI MAX+ 395 APU. If you've read the [earlier review](/posts/amd-ai-max+-395-system-review-a-comprehensive-analysis/) of this hardware, you'll be familiar with the specs. For context on this upgrade, here's what matters:

### Hardware

- **CPU**: AMD Ryzen AI MAX+ 395, 16 cores / 32 threads, Zen 5
- **GPU**: Integrated Radeon 8060S, 40 Compute Units, RDNA 3.5 (gfx1151)
- **Memory**: 128 GB LPDDR5X, unified architecture with up to 96 GB allocatable to the GPU
- **Peak GPU Clock**: 2,900 MHz

### Software (Pre-Upgrade)

- **OS**: Ubuntu 24.04.3 LTS (Noble Numbat)
- **Kernel**: 6.14.0-37-generic (HWE, pinned)
- **ROCm**: 7.0.2
- **amdgpu-dkms**: 6.14.14 (from `repo.radeon.com/amdgpu/30.10.2`)
- **ROCk Module**: 6.14.14

### Running Services

The system was actively serving several roles during the upgrade:

- Five QEMU virtual machines (three x86, two aarch64)
- A PXE boot server (dnsmasq) for the local network
- Docker daemon with various containers

None of these services are tied to the GPU driver stack, so the plan was to perform the upgrade and reboot without shutting them down first. The VMs and network services would come back automatically after the reboot.

## Why Upgrade

ROCm 7.0.2 worked on this hardware. Models loaded, inference ran, `rocminfo` detected the GPU. So why bother upgrading?

Three reasons:

1. **Driver maturity for gfx1151**: The amdgpu kernel module jumped from 6.14.14 to 6.16.13 between the two releases. That's not a minor revision -- it represents months of kernel driver development. On hardware that isn't officially supported, newer drivers tend to bring meaningful stability improvements as AMD's internal teams encounter and fix issues on adjacent architectures.

2. **HSA Runtime improvements**: ROCm 7.2 ships HSA Runtime Extension version 1.15, up from 1.11 in ROCm 7.0.2. The HSA (Heterogeneous System Architecture) runtime is the lowest layer of the ROCm software stack -- it handles device discovery, memory management, and kernel dispatch. Improvements here affect everything built on top of it.

3. **Ecosystem alignment**: PyTorch wheels, Ollama builds, and other ROCm-dependent tools increasingly target 7.2 as the baseline. Running 7.0.2 was becoming an exercise in version pinning and compatibility workarounds.

## The Kernel Hold: Why It Matters

Before diving into the procedure, a note on kernel management. This system runs the Ubuntu HWE (Hardware Enablement) kernel, which provides newer kernel versions on LTS releases. At the time of this upgrade, the HWE kernel was 6.14.0-37-generic. The upstream kernel had already moved to 6.17, but we didn't want the ROCm upgrade to pull in a kernel that AMD's DKMS module might not build against.

The solution is `apt-mark hold`:

```bash
sudo apt-mark hold linux-generic-hwe-24.04 linux-headers-generic-hwe-24.04 linux-image-generic-hwe-24.04
```

This prevents `apt` from upgrading the kernel meta-packages, effectively pinning the system to 6.14.0-37-generic. The hold was already in place before the upgrade and remained untouched throughout. After the upgrade, we confirmed it was still active:

```bash
apt-mark showhold
```

```
linux-generic-hwe-24.04
linux-headers-generic-hwe-24.04
linux-image-generic-hwe-24.04
```

If you're running Strix Halo or any other hardware where kernel compatibility with `amdgpu-dkms` is uncertain, kernel holds are essential. A kernel upgrade that breaks the DKMS build means no GPU driver after reboot.
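
Because a missing hold is easy to overlook, it can be worth scripting a pre-flight check before touching the ROCm stack. Here's a minimal sketch; the `missing_holds` helper and `REQUIRED_HOLDS` set are my own names, not part of any AMD or Ubuntu tooling, and the package names assume the HWE meta-packages held above:

```python
# Kernel meta-packages that must stay held during the upgrade
# (same names as in the apt-mark hold command above).
REQUIRED_HOLDS = {
    "linux-generic-hwe-24.04",
    "linux-headers-generic-hwe-24.04",
    "linux-image-generic-hwe-24.04",
}

def missing_holds(showhold_output: str) -> set[str]:
    """Return the required holds absent from `apt-mark showhold` output."""
    held = {line.strip() for line in showhold_output.splitlines() if line.strip()}
    return REQUIRED_HOLDS - held

# Live usage on the target machine:
#   import subprocess
#   out = subprocess.run(["apt-mark", "showhold"],
#                        capture_output=True, text=True, check=True).stdout
#   assert not missing_holds(out), "kernel not held -- abort the upgrade"
```

If the returned set is non-empty, add the holds before running any `amdgpu-install` or `amdgpu-uninstall` command.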

## Upgrade Procedure

### Step 1: Uninstall the Current ROCm Stack

AMD provides the `amdgpu-uninstall` script for exactly this purpose. It removes all ROCm userspace packages and the amdgpu-dkms kernel module in a single operation:

```bash
sudo amdgpu-uninstall -y
```

This command removed approximately 120 packages, including the full HIP runtime, rocBLAS, MIOpen, MIGraphX, ROCm SMI, the LLVM-based compiler toolchain, and the Mesa graphics drivers that ship with ROCm. The DKMS module was purged, which means the amdgpu kernel module was removed from the 6.14.0-37-generic kernel's module tree.

After the ROCm stack was removed, we purged the `amdgpu-install` meta-package itself:

```bash
sudo apt purge -y amdgpu-install
```

This also cleaned up the APT repository entries that `amdgpu-install` had configured in `/etc/apt/sources.list.d/`. The old repos -- `repo.radeon.com/amdgpu/30.10.2`, `repo.radeon.com/rocm/apt/7.0.2`, and `repo.radeon.com/graphics/7.0.2` -- were all removed automatically.
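
To confirm the cleanup actually happened, you can scan the sources directory for anything still pointing at AMD's repos. A small sketch, assuming the entries live in `.list` files as `amdgpu-install` writes them (the `leftover_radeon_repos` helper is my own, not an AMD tool):

```python
from pathlib import Path

def leftover_radeon_repos(sources_dir: str) -> list[str]:
    """Name any apt source files that still reference repo.radeon.com."""
    return sorted(
        p.name
        for p in Path(sources_dir).glob("*.list")
        if "repo.radeon.com" in p.read_text()
    )

# Live usage: leftover_radeon_repos("/etc/apt/sources.list.d") should return []
# right after the purge, and list the new amdgpu/rocm entries again once the
# 7.2 installer has been set up.
```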

### Step 2: Clean Up Leftover Files

The package removal was thorough but not perfect. A few leftover directories remained in `/opt/`:

```bash
ls /opt/ | grep rocm
```

```
rocm-7.0.0
rocm-7.0.2
rocm-7.9.0
```

The `rocm-7.0.0` directory was from a previous installation attempt. The `rocm-7.9.0` directory was from an earlier experiment with a release-candidate build. The `rocm-7.0.2` directory contained a single orphaned shared library (`libamdhip64.so.6`) that dpkg couldn't remove because the directory wasn't empty. All three were cleaned up manually:

```bash
sudo rm -rf /opt/rocm-7.0.0 /opt/rocm-7.0.2 /opt/rocm-7.9.0
```

It's worth checking for stale ROCm directories after any uninstall. They consume negligible disk space but can confuse build systems and scripts that scan `/opt/rocm*` for active installations.
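
That check is easy to automate. A minimal sketch (the `stale_rocm_dirs` helper is a name I've made up for illustration), which keeps the `rocm` symlink and the directory matching the active version:

```python
def stale_rocm_dirs(opt_entries: list[str], active_version: str) -> list[str]:
    """Versioned rocm-* entries in /opt that don't match the active install.

    `opt_entries` is a directory listing of /opt; the `rocm` symlink that
    points at the active install is always kept."""
    keep = {f"rocm-{active_version}", "rocm"}
    return sorted(d for d in opt_entries if d.startswith("rocm-") and d not in keep)

# Live usage: stale_rocm_dirs(os.listdir("/opt"), "7.2.0")
# On the pre-upgrade listing above this flags all three leftover directories.
```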

### Step 3: Install the ROCm 7.2 Installer

AMD distributes ROCm through a meta-package called `amdgpu-install`. Each ROCm release has its own version of this package, which configures the appropriate APT repositories. The 7.2 installer was downloaded directly from AMD's repository:

```bash
cd /tmp
wget https://repo.radeon.com/amdgpu-install/7.2/ubuntu/noble/amdgpu-install_7.2.70200-1_all.deb
sudo apt install -y ./amdgpu-install_7.2.70200-1_all.deb
sudo apt update
```

After installation and `apt update`, three new repositories were active:

- `https://repo.radeon.com/amdgpu/30.30/ubuntu noble` -- the kernel driver and Mesa components
- `https://repo.radeon.com/rocm/apt/7.2 noble` -- the ROCm userspace stack
- `https://repo.radeon.com/graphics/7.2/ubuntu noble` -- graphics libraries

The version numbering can be confusing. The `amdgpu-install` package version is `30.30.0.0.30300000-2278356.24.04`, which maps to the amdgpu driver release 30.30. The ROCm version is 7.2.0. These are different version tracks that AMD maintains in parallel.

### Step 4: Install ROCm 7.2

With the repositories configured, the actual installation was a single command:

```bash
sudo amdgpu-install -y --usecase=graphics,rocm
```

The `--usecase=graphics,rocm` flag tells the installer to include both the Mesa graphics drivers and the full ROCm compute stack. This is the right choice for a system that needs both display output and GPU compute capabilities.

The installation took approximately 10 minutes and included:

- **amdgpu-dkms 6.16.13**: The kernel module, compiled via DKMS against the running kernel
- **Full ROCm 7.2 stack**: HIP runtime, hipcc compiler, rocBLAS, rocFFT, MIOpen, MIGraphX, RCCL, ROCm SMI, ROCProfiler, and dozens of other libraries
- **Mesa graphics**: Updated EGL, OpenGL, and Vulkan drivers from the amdgpu Mesa fork
- **ROCm LLVM toolchain**: The LLVM-based compiler infrastructure that HIP uses for kernel compilation

The DKMS build is the critical step. During installation, DKMS compiled the amdgpu module against the kernel headers for 6.14.0-37-generic. The output confirmed a successful build:

```
depmod...
update-initramfs: Generating /boot/initrd.img-6.14.0-37-generic
```

The initramfs was regenerated to include the new module, ensuring it would be loaded at boot.

### Step 5: Verify DKMS

Before rebooting, we confirmed the DKMS status:

```bash
dkms status
```

```
amdgpu/6.16.13-2278356.24.04, 6.14.0-37-generic, x86_64: installed
virtualbox/7.0.16, 6.14.0-36-generic, x86_64: installed
virtualbox/7.0.16, 6.14.0-37-generic, x86_64: installed
virtualbox/7.0.16, 6.8.0-100-generic, x86_64: installed
```

The new amdgpu module (6.16.13) was built and installed for 6.14.0-37-generic. Note that it was built only for the currently running kernel, whereas VirtualBox had modules for older kernels as well. This is expected -- the fresh amdgpu-dkms install builds against the running kernel, while the VirtualBox entries date from earlier installations, when 6.14.0-36 and 6.8.0-100 were the active kernels and their headers were present.
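
For an unattended upgrade you can turn this check into an assertion by parsing the `dkms status` output. A sketch, assuming the line format shown above (the `module_installed_for` helper and `DKMS_LINE` pattern are mine, not part of dkms):

```python
import re

# Matches dkms status lines such as:
#   amdgpu/6.16.13-2278356.24.04, 6.14.0-37-generic, x86_64: installed
DKMS_LINE = re.compile(
    r"(?P<module>[^/]+)/(?P<version>[^,]+),\s*(?P<kernel>[^,]+),\s*"
    r"(?P<arch>[^:]+):\s*(?P<state>\w+)"
)

def module_installed_for(status_output: str, module: str, kernel: str) -> bool:
    """True if `dkms status` shows an installed build of `module` for `kernel`."""
    for line in status_output.splitlines():
        m = DKMS_LINE.match(line.strip())
        if (m and m["module"] == module and m["kernel"].strip() == kernel
                and m["state"] == "installed"):
            return True
    return False

# Live usage: check against the running kernel before rebooting, e.g.
#   import os, subprocess
#   out = subprocess.run(["dkms", "status"], capture_output=True, text=True).stdout
#   assert module_installed_for(out, "amdgpu", os.uname().release)
```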

### Step 6: Reboot

```bash
sudo reboot
```

The server came back online in approximately 50 seconds.

## Post-Reboot Verification

### rocminfo

The first check after reboot was `rocminfo`, which queries the HSA runtime for available agents:

```bash
rocminfo
```

```
ROCk module version 6.16.13 is loaded
=====================
HSA System Attributes
=====================
Runtime Version:         1.18
Runtime Ext Version:     1.15
...
==========
HSA Agents
==========
Agent 1: AMD RYZEN AI MAX+ 395 w/ Radeon 8060S (CPU)
Agent 2: gfx1151 (GPU)
  Marketing Name:          AMD Radeon Graphics
  Compute Unit:            40
  Max Clock Freq. (MHz):   2900
  Memory Properties:       APU
  ISA 1: amdgcn-amd-amdhsa--gfx1151
  ISA 2: amdgcn-amd-amdhsa--gfx11-generic
```

Key observations:

- **ROCk module 6.16.13**: The new kernel module loaded successfully.
- **Runtime Ext Version 1.15**: Upgraded from 1.11 in ROCm 7.0.2.
- **gfx1151 detected**: The GPU was recognized with its correct ISA identifier.
- **gfx11-generic ISA**: ROCm 7.2 also exposes a generic gfx11 ISA, which allows software compiled for the broader RDNA 3 family to run on this device without gfx1151-specific builds.
- **APU memory**: The memory properties correctly identify this as an APU with unified memory.
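
Since the ISA list is the part that matters most for software compatibility, a scripted check can confirm both targets are exposed. A minimal sketch that scrapes the ISA identifiers out of `rocminfo` output (the `gpu_isas` helper is my own name):

```python
def gpu_isas(rocminfo_output: str) -> set[str]:
    """Collect ISA targets (e.g. 'gfx1151') reported by rocminfo."""
    marker = "amdgcn-amd-amdhsa--"
    isas = set()
    for line in rocminfo_output.splitlines():
        if marker in line:
            # everything between the triple-dash marker and the next whitespace
            isas.add(line.split(marker, 1)[1].split()[0])
    return isas

# Live usage:
#   import subprocess
#   out = subprocess.run(["rocminfo"], capture_output=True, text=True).stdout
#   assert {"gfx1151", "gfx11-generic"} <= gpu_isas(out)
```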

### ROCm SMI

```bash
rocm-smi
```

```
Device  Node  Temp    Power     SCLK  MCLK     Fan  Perf  VRAM%  GPU%
0       1     33.0C   9.087W    N/A   1000Mhz  0%   auto  0%     0%
```

The GPU was visible and reporting telemetry. The 0% VRAM reading is expected on an APU -- `rocm-smi` reports dedicated VRAM usage, but on a unified memory architecture, GPU memory allocations come from system RAM and aren't reflected in this counter.

### ROCm Version

```bash
cat /opt/rocm/.info/version
```

```
7.2.0
```

### DKMS

```bash
dkms status
```

Confirmed `amdgpu/6.16.13` remained installed for 6.14.0-37-generic after reboot.

## PyTorch Validation

With the driver stack verified, the next step was confirming that PyTorch could see and use the GPU. ROCm 7.2 ships with prebuilt PyTorch wheels on AMD's repository.

### Installing PyTorch for ROCm 7.2

We set up a Python virtual environment and installed the ROCm-specific wheels:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
```

The PyTorch wheel for ROCm 7.2 requires a matching ROCm-specific build of Triton. Both are available from AMD's manylinux repository. The order matters -- Triton must be installed first, since the PyTorch wheel declares it as a dependency with a specific version that doesn't exist on PyPI:

```bash
pip install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/triton-3.5.1%2Brocm7.2.0.gita272dfa8-cp312-cp312-linux_x86_64.whl
pip install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/torch-2.9.1%2Brocm7.2.0.lw.git7e1940d4-cp312-cp312-linux_x86_64.whl
pip install https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/torchvision-0.24.0%2Brocm7.2.0.gitb919bd0c-cp312-cp312-linux_x86_64.whl
```

These are the ROCm 7.2 builds for Python 3.12. AMD also provides wheels for Python 3.10, 3.11, and 3.13.
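
Because the ROCm version is embedded in each wheel's local-version tag, you can sanity-check that the installed wheels match the system's ROCm install. A sketch (the `rocm_suffix` helper is a name I've invented for illustration):

```python
def rocm_suffix(wheel_version: str):
    """Extract the ROCm version from a wheel's local-version tag,
    e.g. '2.9.1+rocm7.2.0.git7e1940d4' -> '7.2.0'; None for non-ROCm builds."""
    if "+rocm" not in wheel_version:
        return None
    digits = []
    for piece in wheel_version.split("+rocm", 1)[1].split("."):
        if piece.isdigit():
            digits.append(piece)
        else:
            break  # stop at the first non-numeric segment (e.g. 'lw', 'gitXXXX')
    return ".".join(digits) or None

# Live usage: compare against the system install, e.g.
#   import torch
#   system = open("/opt/rocm/.info/version").read().strip()  # '7.2.0'
#   assert rocm_suffix(torch.__version__) == system
```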

### Smoke Test

```python
import torch
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Device:", torch.cuda.get_device_name(0))
print("VRAM:", round(torch.cuda.get_device_properties(0).total_memory / 1e9, 1), "GB")
```

```
PyTorch: 2.9.1+rocm7.2.0.git7e1940d4
CUDA available: True
Device: AMD Radeon Graphics
VRAM: 103.1 GB
```

PyTorch detected the GPU through its ROCm build, in which the HIP runtime implements a CUDA-compatible API. The 103.1 GB figure represents the total addressable memory on this unified-memory APU, which includes both the 96 GB GPU allocation and additional system memory accessible through the HSA runtime.

Note the use of `torch.cuda` despite this being an AMD GPU. ROCm's HIP runtime presents itself through PyTorch's CUDA interface, so all CUDA API calls in PyTorch (device selection, memory management, kernel launches) work transparently with AMD hardware.

## Before and After Summary

| Component       | ROCm 7.0.2             | ROCm 7.2.0             |
|-----------------|------------------------|------------------------|
| ROCm Version    | 7.0.2                  | **7.2.0**              |
| amdgpu-dkms     | 6.14.14                | **6.16.13**            |
| ROCk Module     | 6.14.14                | **6.16.13**            |
| HSA Runtime Ext | 1.11                   | **1.15**               |
| amdgpu Repo     | 30.10.2                | **30.30**              |
| PyTorch         | N/A                    | 2.9.1+rocm7.2.0        |
| Triton          | N/A                    | 3.5.1+rocm7.2.0        |
| Kernel          | 6.14.0-37-generic      | 6.14.0-37-generic      |
| Kernel Holds    | In place               | In place               |

## Notes on gfx1151 Support

It's worth being explicit about the support situation. As of February 2026, gfx1151 (Strix Halo) is **not listed** on AMD's official ROCm support matrix. The supported RDNA 3 targets are gfx1100 (Navi 31, RX 7900 XTX) and gfx1101 (Navi 32). Strix Halo's gfx1151 is an RDNA 3.5 derivative that shares much of the ISA with gfx1100 but has architectural differences in the memory subsystem and compute unit layout.

In practice, ROCm 7.2 works on gfx1151. The kernel driver loads, `rocminfo` detects the GPU, and PyTorch can allocate tensors and dispatch compute kernels. The `gfx11-generic` ISA target in ROCm 7.2 is particularly helpful -- it provides a compatibility path for software that hasn't been explicitly compiled for gfx1151.

However, "works" and "fully supported" are different things. There are known quirks:

- **rocm-smi VRAM reporting**: Always shows 0% on the APU since it only tracks discrete VRAM
- **No official PyTorch gfx1151 builds**: The ROCm PyTorch wheels target gfx1100. They run on gfx1151 through ISA compatibility, but performance may not be optimal
- **Large model loading latency**: Moving large models to the GPU device can be slow on the unified memory architecture, as the HSA runtime handles page migration differently than discrete GPU DMA transfers

If you're considering this hardware for production AI workloads, treat ROCm support as "functional but experimental." It works well enough for development, testing, and moderate inference workloads. For production training or latency-sensitive deployment, stick with hardware on AMD's official support list.

## Rollback Plan

If the upgrade fails -- the DKMS module doesn't build, the GPU isn't detected after reboot, or something else goes wrong -- the rollback path is straightforward:

1. Uninstall ROCm 7.2:
```bash
sudo amdgpu-uninstall -y
sudo apt purge -y amdgpu-install
```

2. Reinstall ROCm 7.0.2:
```bash
wget https://repo.radeon.com/amdgpu-install/30.10.2/ubuntu/noble/amdgpu-install_30.10.2.0.30100200-2226257.24.04_all.deb
sudo apt install -y ./amdgpu-install_30.10.2.0.30100200-2226257.24.04_all.deb
sudo apt update
sudo amdgpu-install -y --usecase=graphics,rocm
sudo reboot
```

The entire rollback takes about 15 minutes. Keep the old `amdgpu-install` deb URL handy -- it's not linked from AMD's current download pages once a newer version is published.

## Conclusion

Upgrading ROCm on hardware that isn't officially supported always carries some risk, but this upgrade from 7.0.2 to 7.2 on gfx1151 was uneventful. The procedure follows AMD's documented uninstall-reinstall approach with no deviations. The kernel hold strategy kept the kernel stable, the DKMS module built cleanly against 6.14.0-37-generic, and all post-reboot checks passed.

The improvements in ROCm 7.2 -- particularly the HSA runtime bump to 1.15 and the introduction of the `gfx11-generic` ISA target -- represent meaningful progress for Strix Halo users. The ecosystem is slowly catching up to the hardware. It's not there yet, but each release closes the gap.

For anyone running a Ryzen AI MAX+ 395 or similar Strix Halo hardware on Ubuntu 24.04, this upgrade is worth doing. The procedure is well-defined, the rollback path is clear, and the newer driver stack brings tangible benefits. Just remember to hold your kernel first.

## Recommended Resources

### Hardware
- [Bosgame M5 AI Mini PC (Ryzen AI MAX+ 395)](https://baud.rs/WZgnl1) - The system used in this post
- [GMKtec EVO X2 (Ryzen AI MAX+ 395)](https://baud.rs/q87EAZ) - Another Strix Halo mini PC option on Amazon

### Books
- [*Deep Learning with PyTorch*](https://baud.rs/NTAPGg) by Stevens, Antiga, Huang, Viehmann - Comprehensive guide to building, training, and tuning neural networks with PyTorch
- [*Programming PyTorch for Deep Learning*](https://baud.rs/Iu8KR4) by Ian Pointer - Practical guide to creating and deploying deep learning applications
- [*Understanding Deep Learning*](https://baud.rs/zmKSQj) by Simon Prince - Modern treatment of deep learning fundamentals
