Posts about inference
- Running DiffusionGemma on AMD Strix Halo and Decade-Old Tesla P40s
- Running DeepSeek V4 Flash on AMD Strix Halo
- Distilled Reasoning on Strix Halo: Running a Claude-Trained Thinking Model Locally
- Running a 22B Video Model on Four Tesla P40s
- The Economics of Owning Your Own Inference
- The Real Cost of Running Qwen TTS Locally: Three Machines Compared
- Repurposing Enterprise GPUs: The Tesla P40 Home Lab Story
- Moore's Law for Intelligence: What Happens When Thinking Gets Cheap