Distilled Reasoning on Strix Halo: Running a Claude-Trained Thinking Model Locally
This post runs a 27B reasoning model distilled from Claude 4.6 Opus on an AMD Strix Halo APU. The model exposes its chain-of-thought via think tags and runs locally at 10 tokens per second with 4-bit quantization. The results show both the promise and the limits of distilling reasoning from frontier models.
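
Because the chain-of-thought arrives inline with the answer, most local tooling needs to separate the two before displaying or logging a response. The snippet below is a minimal sketch of that step, not the model's official parser. It assumes the reasoning is wrapped in literal `<think>` and `</think>` tags, and `split_reasoning` is a hypothetical helper name introduced here for illustration.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>.
# Change the tag name here if the model emits a different one.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Split raw model output into (reasoning, answer)."""
    # Collect the contents of every think block, in order.
    reasoning = "\n".join(m.strip() for m in THINK_RE.findall(text))
    # Whatever remains once the think blocks are removed is the answer.
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


raw = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(raw)
print(reasoning)
print(answer)
```

Output with no think block passes through unchanged as the answer, with an empty reasoning string, so the same helper works for non-reasoning models.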