<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>TinyComputers.io (Posts about ai acceleration)</title><link>https://tinycomputers.io/</link><description></description><atom:link href="https://tinycomputers.io/categories/ai-acceleration.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2026 A.C. Jokela 
&lt;!-- div style="width: 100%" --&gt;
&lt;a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"&gt;&lt;img alt="" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/80x15.png" /&gt; Creative Commons Attribution-ShareAlike&lt;/a&gt;&amp;nbsp;|&amp;nbsp;
&lt;!-- /div --&gt;
</copyright><lastBuildDate>Mon, 06 Apr 2026 22:12:56 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Banana Pi CM5-Pro Review: A Solid Middle Ground with AI Ambitions</title><link>https://tinycomputers.io/posts/banana-pi-cm5-pro-review.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/banana-pi-cm5-pro-review_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;49 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;h3&gt;Introduction&lt;/h3&gt;
&lt;p&gt;The Banana Pi CM5-Pro (also sold as the ArmSoM-CM5) represents Banana Pi's entry into the Raspberry Pi Compute Module 4 form factor market, powered by Rockchip's RK3576 SoC. Released in 2024, this compute module targets developers seeking a CM4-compatible solution with enhanced specifications: up to 16GB of RAM, 128GB of storage, WiFi 6 connectivity, and a 6 TOPS Neural Processing Unit for AI acceleration. With a price point of approximately $103 for the 8GB/64GB configuration and a guaranteed production life until at least August 2034, Banana Pi positions the CM5-Pro as a long-term alternative to Raspberry Pi's official offerings.&lt;/p&gt;
&lt;p&gt;After extensive testing, benchmarking, and comparison against contemporary single-board computers including the Orange Pi 5 Max, Raspberry Pi 5, and LattePanda IOTA, the Banana Pi CM5-Pro emerges as a competent but not exceptional offering. It delivers solid performance, useful features including AI acceleration, and good expandability, but falls short of being a clear winner in any specific category. This review examines where the CM5-Pro excels, where it disappoints, and who should consider it for their projects.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Banana Pi CM5-Pro compute module" src="https://tinycomputers.io/images/bananapi-cm5-pro/IMG_4048.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Banana Pi CM5-Pro showing the dual 100-pin connectors and CM4-compatible form factor&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;Hardware Architecture: The Rockchip RK3576&lt;/h3&gt;
&lt;p&gt;At the heart of the Banana Pi CM5-Pro lies the Rockchip RK3576, a second-generation 8nm SoC featuring a big.LITTLE ARM architecture:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;4x ARM Cortex-A72 cores @ 2.2 GHz (high performance)&lt;/li&gt;
&lt;li&gt;4x ARM Cortex-A53 cores @ 1.8 GHz (power efficiency)&lt;/li&gt;
&lt;li&gt;6 TOPS Neural Processing Unit (NPU)&lt;/li&gt;
&lt;li&gt;Mali-G52 MC3 GPU&lt;/li&gt;
&lt;li&gt;8K@30fps H.265/VP9 decoding, 4K@60fps H.264/H.265 encoding&lt;/li&gt;
&lt;li&gt;Up to 16GB LPDDR5 RAM support&lt;/li&gt;
&lt;li&gt;Dual-channel DDR4/LPDDR4/LPDDR5 memory controller&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Cortex-A72, originally released by ARM in 2015, represents a significant step up from the ancient Cortex-A53 (2012) but still trails the more modern Cortex-A76 (2018) found in Raspberry Pi 5 and Orange Pi 5 Max. The A72 offers approximately 1.8-2x the performance per clock compared to the A53, with better branch prediction, wider execution units, and more sophisticated memory prefetching. However, it lacks the A76's more advanced microarchitecture improvements and typically runs at lower clock speeds (2.2 GHz vs. 2.4 GHz for the A76 in the Pi 5).&lt;/p&gt;
&lt;p&gt;The inclusion of four Cortex-A53 efficiency cores alongside the A72 performance cores gives the RK3576 a total of eight cores, allowing it to balance power consumption and performance. In practice, this means the system can handle background tasks and light workloads on the A53 cores while reserving the A72 cores for demanding applications. The big.LITTLE scheduler in the Linux kernel attempts to make intelligent decisions about which cores to use for which tasks, though the effectiveness varies depending on workload characteristics.&lt;/p&gt;
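&lt;p&gt;When the kernel's placement decisions aren't good enough, you can pin latency-critical work to the A72 cluster yourself. On this board the A72 cores enumerate as CPUs 4-7 (the A53 cores as 0-3). The sketch below is a minimal, Linux-only helper using Python's standard library; the function name is ours:&lt;/p&gt;

```python
import os

def pin_to_cores(cpus):
    """Restrict the calling process to the given set of CPU ids.

    On the CM5-Pro the Cortex-A72 performance cores are CPUs 4-7,
    so pin_to_cores({4, 5, 6, 7}) keeps a hot loop off the A53s.
    Linux-only: sched_setaffinity is not available on other OSes.
    """
    os.sched_setaffinity(0, cpus)   # pid 0 means "this process"
    return os.sched_getaffinity(0)  # report the effective mask

if __name__ == "__main__":
    # CPU 0 is used here so the example runs on any Linux machine;
    # on the CM5-Pro you would pass {4, 5, 6, 7} instead.
    print(sorted(pin_to_cores({0})))
```

&lt;p&gt;The shell equivalent is &lt;code&gt;taskset -c 4-7 your_app&lt;/code&gt;.&lt;/p&gt;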
&lt;h3&gt;Memory, Storage, and Connectivity&lt;/h3&gt;
&lt;p&gt;Our test unit came configured with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;4GB LPDDR5 RAM (8GB and 16GB options available)&lt;/li&gt;
&lt;li&gt;29GB eMMC internal storage (32GB nominal, formatted capacity lower)&lt;/li&gt;
&lt;li&gt;M.2 NVMe SSD support (our unit had a 932GB NVMe drive installed)&lt;/li&gt;
&lt;li&gt;WiFi 6 (802.11ax) and Bluetooth 5.3&lt;/li&gt;
&lt;li&gt;Gigabit Ethernet&lt;/li&gt;
&lt;li&gt;HDMI 2.0 output supporting 4K@60fps&lt;/li&gt;
&lt;li&gt;Multiple MIPI CSI camera interfaces&lt;/li&gt;
&lt;li&gt;USB 3.0 and USB 2.0 interfaces via the 100-pin connectors&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The LPDDR5 memory is a notable upgrade over the LPDDR4 found in many competing boards, offering higher bandwidth and better power efficiency. In our testing, memory bandwidth didn't appear to be a significant bottleneck for CPU-bound workloads, though applications that heavily stress memory subsystems (large dataset processing, video encoding, etc.) may benefit from the faster RAM.&lt;/p&gt;
&lt;p&gt;The inclusion of both eMMC storage and M.2 NVMe support provides excellent flexibility. The eMMC serves as a reliable boot medium with consistent performance, while the NVMe slot allows for high-capacity, high-speed storage expansion. This dual-storage approach is superior to SD card-only solutions, which suffer from reliability issues and inconsistent performance.&lt;/p&gt;
&lt;p&gt;WiFi 6 and Bluetooth 5.3 represent current-generation wireless standards, providing better performance and lower latency than the WiFi 5 found in older boards. For robotics applications, low-latency wireless communication can be crucial for remote control and telemetry, making this a meaningful upgrade.&lt;/p&gt;
&lt;h3&gt;The NPU: 6 TOPS of AI Potential&lt;/h3&gt;
&lt;p&gt;The RK3576's integrated 6 TOPS Neural Processing Unit is the CM5-Pro's headline AI feature, designed to accelerate machine learning inference workloads. The NPU supports multiple quantization formats (INT4/INT8/INT16/BF16/TF32) and can interface with mainstream frameworks including TensorFlow, PyTorch, MXNet, and Caffe through Rockchip's RKNN toolkit.&lt;/p&gt;
&lt;p&gt;In our testing, we confirmed the presence of the NPU hardware at &lt;code&gt;/sys/kernel/iommu_groups/0/devices/27700000.npu&lt;/code&gt; and verified that the RKNN runtime library (&lt;code&gt;librknnrt.so&lt;/code&gt;) and server (&lt;code&gt;rknn_server&lt;/code&gt;) were installed and accessible. To validate real-world NPU performance, we ran MobileNet V1 image classification inference tests using the pre-installed RKNN model.&lt;/p&gt;
&lt;p&gt;NPU Inference Benchmarks - MobileNet V1:&lt;/p&gt;
&lt;p&gt;Running 10 inference iterations on a 224x224 RGB image (bell.jpg), we measured consistent performance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Average inference time: 161.8ms per image&lt;/li&gt;
&lt;li&gt;Min/Max: 146ms to 172ms&lt;/li&gt;
&lt;li&gt;Standard deviation: ~7.2ms&lt;/li&gt;
&lt;li&gt;Throughput: ~6.2 frames per second&lt;/li&gt;
&lt;/ul&gt;
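&lt;p&gt;The throughput figure follows directly from the average latency; a quick sanity check using the measurement above:&lt;/p&gt;

```python
# Derive throughput from the measured average NPU inference latency.
avg_latency_ms = 161.8  # MobileNet V1 on the RK3576 NPU

throughput_fps = 1000.0 / avg_latency_ms
print(f"{throughput_fps:.1f} frames per second")  # about 6.2 fps
```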
&lt;p&gt;The model successfully classified test images with appropriate confidence scores across 1,001 ImageNet classes. The inference pipeline includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;JPEG decoding and preprocessing&lt;/li&gt;
&lt;li&gt;Image resizing and color space conversion&lt;/li&gt;
&lt;li&gt;INT8 quantized inference on the NPU&lt;/li&gt;
&lt;li&gt;FP16 output tensor postprocessing&lt;/li&gt;
&lt;/ul&gt;
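&lt;p&gt;To make the resizing stage concrete, here is a dependency-free nearest-neighbor resize in Python. This is purely illustrative - the real pipeline would use OpenCV or Rockchip's RGA hardware for this step:&lt;/p&gt;

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize of an image stored as a list of rows.

    Illustrates the 'image resizing' stage of the pipeline; production
    code would use OpenCV or Rockchip's RGA hardware instead.
    """
    in_h = len(img)
    in_w = len(img[0])
    out = []
    for y in range(out_h):
        src_y = y * in_h // out_h      # map output row to source row
        row = []
        for x in range(out_w):
            src_x = x * in_w // out_w  # map output col to source col
            row.append(img[src_y][src_x])
        out.append(row)
    return out

if __name__ == "__main__":
    tiny = [[0, 1], [2, 3]]            # a 2x2 "image"
    print(resize_nearest(tiny, 4, 4))  # upscaled to 4x4
```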
&lt;p&gt;This demonstrates that the NPU is fully functional and provides practical acceleration for computer vision workloads. The ~160ms inference time for MobileNet V1 is reasonable for edge AI applications, though more demanding models such as YOLOv8 or larger classification networks would come closer to exercising the NPU's full 6 TOPS capacity.&lt;/p&gt;
&lt;p&gt;Rockchip's RKNN toolkit provides a development workflow that converts trained models into RKNN format for efficient execution on the NPU. The process involves:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Training a model using a standard framework (TensorFlow, PyTorch, etc.)&lt;/li&gt;
&lt;li&gt;Exporting the model to ONNX or framework-specific format&lt;/li&gt;
&lt;li&gt;Converting the model using rknn-toolkit2 on a PC&lt;/li&gt;
&lt;li&gt;Quantizing the model to INT8 or other supported formats&lt;/li&gt;
&lt;li&gt;Deploying the RKNN model file to the board&lt;/li&gt;
&lt;li&gt;Running inference using RKNN C/C++ or Python APIs&lt;/li&gt;
&lt;/ol&gt;
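&lt;p&gt;Step 4 is worth unpacking: INT8 quantization maps floating-point tensors onto 8-bit integers via a scale factor. Below is a minimal symmetric-quantization sketch of the idea; rknn-toolkit2 performs the same basic mapping per tensor or per channel, driven by a calibration dataset:&lt;/p&gt;

```python
def quantize_int8(values):
    """Symmetric INT8 quantization: map floats into [-127, 127].

    Returns (quantized ints, scale). Illustrative only; the RKNN
    toolchain chooses scales from calibration data, not a single max.
    """
    scale = max(abs(v) for v in values) / 127.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from INT8 values."""
    return [v * scale for v in q]

if __name__ == "__main__":
    weights = [0.05, -1.27, 0.8, 0.33]
    q, scale = quantize_int8(weights)
    print(q)                     # integers in [-127, 127]
    print(dequantize(q, scale))  # close to the original floats
```

&lt;p&gt;The accuracy cost of this rounding is exactly what the quantization step trades for the NPU's INT8 throughput.&lt;/p&gt;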
&lt;p&gt;This workflow is more complex than simply running a PyTorch or TensorFlow model directly, but the trade-off is significantly improved inference performance and lower power consumption compared to CPU-only execution. For applications like real-time object detection, the 6 TOPS NPU is rated to deliver:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Face recognition: 240fps @ 1080p&lt;/li&gt;
&lt;li&gt;Object detection (YOLO-based models): 50fps @ 4K&lt;/li&gt;
&lt;li&gt;Semantic segmentation: 30fps @ 2K&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These performance figures represent substantial improvements over CPU-based inference, making the NPU genuinely useful for edge AI applications. However, they also require investment in learning the RKNN toolchain, optimizing models for the specific NPU architecture, and managing the conversion pipeline as part of your development workflow.&lt;/p&gt;
&lt;p&gt;RKLLM and Large Language Model Support:&lt;/p&gt;
&lt;p&gt;To thoroughly test LLM capabilities, we performed end-to-end testing: model conversion on an x86_64 platform (LattePanda IOTA), transfer to the CM5-Pro, and NPU inference validation. RKLLM (Rockchip Large Language Model) toolkit enables running quantized LLMs on the RK3576's 6 TOPS NPU, supporting models including Qwen, Llama, ChatGLM, Phi, Gemma, InternLM, MiniCPM, and others.&lt;/p&gt;
&lt;p&gt;LLM Model Conversion Benchmark:&lt;/p&gt;
&lt;p&gt;We converted TinyLLAMA 1.1B Chat from Hugging Face format to RKLLM format using an Intel N150-powered LattePanda IOTA:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Source Model: TinyLLAMA 1.1B Chat v1.0 (505 MB safetensors)&lt;/li&gt;
&lt;li&gt;Conversion Platform: x86_64 (RKLLM-Toolkit is distributed for x86 only, not ARM)&lt;/li&gt;
&lt;li&gt;Quantization: W4A16 (4-bit weights, 16-bit activations)&lt;/li&gt;
&lt;li&gt;Conversion Time Breakdown:
&lt;ul&gt;
&lt;li&gt;Model loading: 6.95 seconds&lt;/li&gt;
&lt;li&gt;Building/Quantizing: 220.47 seconds (293 layers)&lt;/li&gt;
&lt;li&gt;Optimization: 206.72 seconds (22 optimization steps, counted within the build phase)&lt;/li&gt;
&lt;li&gt;Export to RKLLM format: 37.41 seconds&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Total Conversion Time: 264.83 seconds (4.41 minutes)&lt;/li&gt;
&lt;li&gt;Output File Size: 644.75 MB (larger than the 505 MB source due to RKLLM format overhead)&lt;/li&gt;
&lt;/ul&gt;
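&lt;p&gt;The stage timings add up once you notice that the optimization pass appears to run inside the build phase: loading, building, and export account exactly for the reported total.&lt;/p&gt;

```python
# Stage timings reported by RKLLM-Toolkit for the TinyLLAMA conversion.
loading = 6.95      # seconds
building = 220.47   # includes the 206.72 s optimization pass
export = 37.41

total = loading + building + export
print(f"{total:.2f} s ({total / 60:.2f} min)")  # 264.83 s (4.41 min)
```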
&lt;p&gt;The cross-platform requirement is important: RKLLM-Toolkit is distributed as x86_64-only Python wheels, so model conversion must be performed on an x86 PC or VM, not on the ARM-based CM5-Pro itself. Conversion time scales with model size and CPU performance - larger models on slower CPUs will take proportionally longer.&lt;/p&gt;
&lt;p&gt;NPU LLM Inference Testing:&lt;/p&gt;
&lt;p&gt;After transferring the converted model to the CM5-Pro, we successfully:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;✓ Loaded the TinyLLAMA 1.1B model (645 MB) into RKLLM runtime&lt;/li&gt;
&lt;li&gt;✓ Initialized NPU with 2-core configuration for W4A16 inference&lt;/li&gt;
&lt;li&gt;✓ Verified token generation and text output&lt;/li&gt;
&lt;li&gt;✓ Confirmed the model runs on NPU cores (not CPU fallback)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The RKLLM runtime v1.2.2 correctly identified the model configuration (W4A16, max_context=2048, 2 NPU cores) and enabled the Cortex-A72 cores [4,5,6,7] for host processing while the NPU handled inference.&lt;/p&gt;
&lt;p&gt;Actual RK3576 LLM Performance (Official Rockchip Benchmarks):&lt;/p&gt;
&lt;p&gt;Based on Rockchip's published benchmarks for the RK3576, small language models perform as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Qwen2 0.5B (w4a16): 34.24 tokens/second, 327ms first token latency, 426 MB memory&lt;/li&gt;
&lt;li&gt;MiniCPM4 0.5B (w4a16): 35.8 tokens/second, 349ms first token latency, 322 MB memory&lt;/li&gt;
&lt;li&gt;TinyLLAMA 1.1B (w4a16): 21.32 tokens/second, 518ms first token latency, 591 MB memory&lt;/li&gt;
&lt;li&gt;InternLM2 1.8B (w4a16): 13.65 tokens/second, 772ms first token latency, 966 MB memory&lt;/li&gt;
&lt;/ul&gt;
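&lt;p&gt;These decode rates translate directly into wall-clock response times. For an illustrative 200-token reply (the reply length is our assumption; the rates and first-token latencies are Rockchip's):&lt;/p&gt;

```python
# Estimated wall-clock time for a 200-token reply on the RK3576 NPU,
# using Rockchip's published decode rates and first-token latencies.
models = {
    "Qwen2 0.5B":     (34.24, 0.327),  # (tokens/s, first-token s)
    "TinyLLAMA 1.1B": (21.32, 0.518),
    "InternLM2 1.8B": (13.65, 0.772),
}

reply_tokens = 200  # illustrative reply length
for name, (tps, ttft) in models.items():
    seconds = ttft + reply_tokens / tps
    print(f"{name}: {seconds:.1f} s")
```

&lt;p&gt;A full answer arrives in roughly 6 seconds from the 0.5B model but over 15 seconds from the 1.8B model, which matches the subjective pacing described below.&lt;/p&gt;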
&lt;p&gt;For context, the RK3588 (with a more powerful NPU) achieves 42.58 tokens/second for Qwen2 0.5B - about 1.24x faster than the RK3576's 34.24 tokens/second.&lt;/p&gt;
&lt;p&gt;Practical Assessment:&lt;/p&gt;
&lt;p&gt;The 30-35 tokens/second achieved with 0.5B models is usable for offline chatbots, text classification, and simple Q&amp;amp;A applications, though it still trails cloud LLM APIs and GPU-accelerated solutions. Humans typically read at 200-300 words per minute (roughly 4-7 tokens per second), so 30-35 tokens/second comfortably outpaces reading speed when output is streamed. Larger models (1.8B+) drop to around 13 tokens/second or less, which, combined with longer first-token latency, feels sluggish for interactive use.&lt;/p&gt;
&lt;p&gt;The complete workflow (download model → convert on x86 → transfer to ARM → run inference) works as designed but requires infrastructure: an x86 machine or VM for conversion, network transfer for large model files (645 MB), and familiarity with Python environments and RKLLM APIs. For embedded deployments, this is acceptable; for rapid prototyping, it adds friction compared to cloud-based LLM solutions.&lt;/p&gt;
&lt;p&gt;Compared to Google's Coral TPU (4 TOPS), the RK3576's 6 TOPS provides 1.5x more computational power, though the Coral benefits from more mature tooling and broader community support. Against the Horizon X3's 5 TOPS, the RK3576 offers 20% more capability with far better CPU performance backing it up. For serious AI workloads, NVIDIA's Jetson platforms (40+ TOPS) remain in a different performance class, but at significantly higher price points and power requirements.&lt;/p&gt;
&lt;h3&gt;Performance Testing: Real-World Compilation Benchmarks&lt;/h3&gt;
&lt;p&gt;To assess the Banana Pi CM5-Pro's CPU performance, we ran our standard Rust compilation benchmark: building a complex ballistics simulation engine with numerous dependencies from a clean state, three times, and averaging the results. This real-world workload stresses CPU cores, memory bandwidth, compiler performance, and I/O subsystems.&lt;/p&gt;
&lt;p&gt;Banana Pi CM5-Pro Compilation Times:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Run 1: 173.16 seconds (2 minutes 53 seconds)&lt;/li&gt;
&lt;li&gt;Run 2: 162.29 seconds (2 minutes 42 seconds)&lt;/li&gt;
&lt;li&gt;Run 3: 165.99 seconds (2 minutes 46 seconds)&lt;/li&gt;
&lt;li&gt;Average: 167.15 seconds (2 minutes 47 seconds)&lt;/li&gt;
&lt;/ul&gt;
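&lt;p&gt;The reported average is the arithmetic mean of the three runs:&lt;/p&gt;

```python
from statistics import mean

runs = [173.16, 162.29, 165.99]  # seconds, three clean builds
avg = mean(runs)
print(f"{avg:.2f} s")            # 167.15 s
```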
&lt;p&gt;For context, here's how the CM5-Pro compares to other contemporary single-board computers:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;CPU&lt;/th&gt;
&lt;th&gt;Cores&lt;/th&gt;
&lt;th&gt;Average Time&lt;/th&gt;
&lt;th&gt;vs. CM5-Pro&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Orange Pi 5 Max&lt;/td&gt;
&lt;td&gt;Cortex-A55/A76&lt;/td&gt;
&lt;td&gt;8 (4+4)&lt;/td&gt;
&lt;td&gt;62.31s&lt;/td&gt;
&lt;td&gt;2.68x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raspberry Pi CM5&lt;/td&gt;
&lt;td&gt;Cortex-A76&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;71.04s&lt;/td&gt;
&lt;td&gt;2.35x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LattePanda IOTA&lt;/td&gt;
&lt;td&gt;Intel N150&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;72.21s&lt;/td&gt;
&lt;td&gt;2.31x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raspberry Pi 5&lt;/td&gt;
&lt;td&gt;Cortex-A76&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;76.65s&lt;/td&gt;
&lt;td&gt;2.18x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Banana Pi CM5-Pro&lt;/td&gt;
&lt;td&gt;Cortex-A53/A72&lt;/td&gt;
&lt;td&gt;8 (4+4)&lt;/td&gt;
&lt;td&gt;167.15s&lt;/td&gt;
&lt;td&gt;1.00x (baseline)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
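&lt;p&gt;The "vs. CM5-Pro" column is each board's average build time divided into the CM5-Pro baseline:&lt;/p&gt;

```python
# Relative speed versus the CM5-Pro baseline (167.15 s average build).
baseline = 167.15
boards = {
    "Orange Pi 5 Max":  62.31,
    "Raspberry Pi CM5": 71.04,
    "LattePanda IOTA":  72.21,
    "Raspberry Pi 5":   76.65,
}

for name, seconds in boards.items():
    print(f"{name}: {baseline / seconds:.2f}x faster")
```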
&lt;p&gt;The results reveal the CM5-Pro's positioning: it's significantly slower than top-tier ARM and x86 single-board computers, but respectable within its price and power class. The 2.68x performance deficit versus the Orange Pi 5 Max is substantial, explained by the RK3588's newer Cortex-A76 cores running at higher clock speeds (2.4 GHz) with more advanced microarchitecture.&lt;/p&gt;
&lt;p&gt;More telling is the comparison to the Raspberry Pi 5 and Raspberry Pi CM5, both featuring four Cortex-A76 cores at 2.4 GHz. Despite having eight cores to the Pi's four, the CM5-Pro is approximately 2.2x slower. This performance gap illustrates the generational advantage of the A76 architecture - the Pi 5's four newer cores outperform the CM5-Pro's four A72 cores plus four A53 cores combined for this workload.&lt;/p&gt;
&lt;p&gt;The LattePanda IOTA's Intel N150, despite having only four cores, also outperforms the CM5-Pro by 2.3x. Intel's Alder Lake-N architecture, even in its low-power form, delivers superior single-threaded performance and more effective multi-threading than the RK3576.&lt;/p&gt;
&lt;p&gt;However, context matters. The CM5-Pro's 167-second compilation time is still quite usable for development workflows. A project that takes 77 seconds to compile on a Raspberry Pi 5 will take 167 seconds on the CM5-Pro - an additional 90 seconds. For most developers, this difference is noticeable but not crippling. Compile times remain in the "get a coffee" range rather than the "go to lunch" range.&lt;/p&gt;
&lt;p&gt;More importantly, the CM5-Pro vastly outperforms older ARM platforms. Compared to boards using only Cortex-A53 cores (like the Horizon X3 CM at 379 seconds), the CM5-Pro is 2.27x faster, demonstrating the value of the Cortex-A72 performance cores.&lt;/p&gt;
&lt;h3&gt;Geekbench 6 CPU Performance&lt;/h3&gt;
&lt;p&gt;To provide standardized synthetic benchmarks, we ran Geekbench 6.5.0 on the Banana Pi CM5-Pro:&lt;/p&gt;
&lt;p&gt;Geekbench 6 Scores:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Single-Core Score: 328&lt;/li&gt;
&lt;li&gt;Multi-Core Score: 1337&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These scores reflect the RK3576's positioning as a mid-range ARM platform. The single-core score of 328 indicates modest per-core performance from the Cortex-A72 cores, while the multi-core score of 1337 demonstrates reasonable scaling across all eight cores (4x A72 + 4x A53). For context, the Raspberry Pi 5 with Cortex-A76 cores typically scores around 550-600 single-core and 1700-1900 multi-core, showing the generational advantage of the newer ARM architecture.&lt;/p&gt;
&lt;p&gt;Notable individual benchmark results include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;PDF Renderer: 542 single-core, 2904 multi-core&lt;/li&gt;
&lt;li&gt;Ray Tracer: 2763 multi-core&lt;/li&gt;
&lt;li&gt;Asset Compression: 2756 multi-core&lt;/li&gt;
&lt;li&gt;Horizon Detection: 540 single-core&lt;/li&gt;
&lt;li&gt;HTML5 Browser: 455 single-core&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The relatively strong performance on PDF rendering and asset compression tasks suggests the RK3576 handles real-world productivity workloads reasonably well, though the lower single-core scores indicate that latency-sensitive interactive applications may feel less responsive than on platforms with faster per-core performance.&lt;/p&gt;
&lt;p&gt;Full Geekbench results: &lt;a href="https://browser.geekbench.com/v6/cpu/14853854"&gt;browser.geekbench.com/v6/cpu/14853854&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;Comparative Analysis: CM5-Pro vs. the Competition&lt;/h3&gt;
&lt;h4&gt;vs. Orange Pi 5 Max&lt;/h4&gt;
&lt;p&gt;The Orange Pi 5 Max represents the performance leader in our testing, powered by Rockchip's flagship RK3588 SoC with four Cortex-A76 + four Cortex-A55 cores. The 5 Max compiled our benchmark in 62.31 seconds - 2.68x faster than the CM5-Pro's 167.15 seconds.&lt;/p&gt;
&lt;p&gt;Key differences:&lt;/p&gt;
&lt;p&gt;Performance: The 5 Max's Cortex-A76 cores deliver substantially better single-threaded and multi-threaded performance. For CPU-intensive development work, the performance gap is significant.&lt;/p&gt;
&lt;p&gt;NPU: The RK3588 includes a 6 TOPS NPU, matching the RK3576's AI capabilities. Both boards can run similar RKNN-optimized models with comparable inference performance.&lt;/p&gt;
&lt;p&gt;Form Factor: The 5 Max is a full-sized single-board computer with on-board ports and connectors, while the CM5-Pro is a compute module requiring a carrier board. This makes the 5 Max more suitable for standalone projects and the CM5-Pro better for embedded integration.&lt;/p&gt;
&lt;p&gt;Price: The Orange Pi 5 Max sells for approximately $150-180 with 8GB RAM, compared to $103 for the CM5-Pro. The 5 Max's superior performance comes at a premium, but the cost-per-performance ratio remains competitive.&lt;/p&gt;
&lt;p&gt;Memory: Both support up to 16GB RAM, though the 5 Max typically ships with higher-capacity configurations.&lt;/p&gt;
&lt;p&gt;Verdict: If raw CPU performance is your priority and you can accommodate a full-sized SBC, the Orange Pi 5 Max is the clear choice. The CM5-Pro makes sense if you need the compute module form factor, want to minimize cost, or have thermal/power constraints that favor the slightly more efficient RK3576.&lt;/p&gt;
&lt;h4&gt;vs. Raspberry Pi 5&lt;/h4&gt;
&lt;p&gt;The Raspberry Pi 5, with its Broadcom BCM2712 SoC featuring four Cortex-A76 cores at 2.4 GHz, compiled our benchmark in 76.65 seconds - 2.18x faster than the CM5-Pro.&lt;/p&gt;
&lt;p&gt;Key differences:&lt;/p&gt;
&lt;p&gt;Performance: The Pi 5's four A76 cores outperform the CM5-Pro's 4+4 big.LITTLE configuration for most workloads. Single-threaded performance heavily favors the Pi 5, while multi-threaded performance depends on whether the workload can effectively utilize the CM5-Pro's additional A53 cores.&lt;/p&gt;
&lt;p&gt;NPU: The Pi 5 lacks integrated AI acceleration, while the CM5-Pro includes a 6 TOPS NPU. For AI-heavy applications, this is a significant advantage for the CM5-Pro.&lt;/p&gt;
&lt;p&gt;Ecosystem: The Raspberry Pi ecosystem is vastly more mature, with extensive documentation, massive community support, and guaranteed long-term software maintenance. While Banana Pi has committed to supporting the CM5-Pro until 2034, the Pi Foundation's track record inspires more confidence.&lt;/p&gt;
&lt;p&gt;Software: Raspberry Pi OS is polished and actively maintained, with hardware-specific optimizations. The CM5-Pro runs generic ARM Linux distributions (Debian, Ubuntu) which work well but lack Pi-specific refinements.&lt;/p&gt;
&lt;p&gt;Price: The Raspberry Pi 5 (8GB model) retails for $80, significantly cheaper than the CM5-Pro's $103. The Pi 5 offers better performance for less money - a compelling value proposition.&lt;/p&gt;
&lt;p&gt;Expansion: The Pi 5's standard SBC form factor provides easier access to GPIO, HDMI, USB, and other interfaces. The CM5-Pro requires a carrier board, adding cost and complexity but enabling more customized designs.&lt;/p&gt;
&lt;p&gt;Verdict: For general-purpose computing, development, and hobbyist projects, the Raspberry Pi 5 is the better choice: faster, cheaper, and better supported. The CM5-Pro makes sense if you specifically need AI acceleration, prefer the compute module form factor, or want more RAM/storage capacity than the Pi 5 offers.&lt;/p&gt;
&lt;h4&gt;vs. LattePanda IOTA&lt;/h4&gt;
&lt;p&gt;The LattePanda IOTA, powered by Intel's N150 Alder Lake-N processor with four cores, compiled our benchmark in 72.21 seconds - 2.31x faster than the CM5-Pro.&lt;/p&gt;
&lt;p&gt;Key differences:&lt;/p&gt;
&lt;p&gt;Architecture: The IOTA uses x86_64 architecture, providing compatibility with a wider range of software that may not be well-optimized for ARM. The CM5-Pro's ARM architecture benefits from lower power consumption and better mobile/embedded software support.&lt;/p&gt;
&lt;p&gt;Performance: Intel's N150, despite having only four cores, delivers superior single-threaded performance and competitive multi-threaded performance against the CM5-Pro's eight cores. Intel's microarchitecture and higher sustained frequencies provide an edge for CPU-bound tasks.&lt;/p&gt;
&lt;p&gt;NPU: The IOTA lacks dedicated AI acceleration, relying on CPU or external accelerators for machine learning workloads. The CM5-Pro's integrated 6 TOPS NPU is a clear advantage for AI applications.&lt;/p&gt;
&lt;p&gt;Power Consumption: The N150 is a low-power x86 chip, but still consumes more power than ARM solutions under typical workloads. The CM5-Pro's big.LITTLE configuration can achieve better power efficiency for mixed workloads.&lt;/p&gt;
&lt;p&gt;Form Factor: The IOTA is a small x86 board with Arduino co-processor integration, targeting maker/IoT applications. The CM5-Pro's compute module format serves different use cases, primarily embedded systems and custom carrier board designs.&lt;/p&gt;
&lt;p&gt;Price: The LattePanda IOTA sells for approximately $149, more expensive than the CM5-Pro. However, it includes unique features like the Arduino co-processor and x86 compatibility that may justify the premium for specific applications.&lt;/p&gt;
&lt;p&gt;Software Ecosystem: x86 enjoys broader commercial software support, while ARM excels in embedded and mobile-focused applications. Choose based on your software requirements.&lt;/p&gt;
&lt;p&gt;Verdict: If you need x86 compatibility or want a compact standalone board with Arduino integration, the LattePanda IOTA makes sense despite its higher price. If you're working in ARM-native embedded Linux, need AI acceleration, or want the compute module form factor, the CM5-Pro is the better choice at a lower price point.&lt;/p&gt;
&lt;h4&gt;vs. Raspberry Pi CM5&lt;/h4&gt;
&lt;p&gt;The Raspberry Pi Compute Module 5 is the most direct competitor to the Banana Pi CM5-Pro, offering the same CM4-compatible form factor with different specifications. The Pi CM5 compiled our benchmark in 71.04 seconds - 2.35x faster than the CM5-Pro.&lt;/p&gt;
&lt;p&gt;Key differences:&lt;/p&gt;
&lt;p&gt;Performance: The Pi CM5's four Cortex-A76 cores at 2.4 GHz significantly outperform the CM5-Pro's 4x A72 + 4x A53 configuration. The architectural advantage of the A76 over the A72 translates to approximately 2.35x better performance in our testing.&lt;/p&gt;
&lt;p&gt;NPU: The CM5-Pro's 6 TOPS NPU provides integrated AI acceleration, while the Pi CM5 requires external solutions (Hailo-8, Coral TPU) for hardware-accelerated inference. If AI is central to your application, the CM5-Pro's integrated NPU is more elegant.&lt;/p&gt;
&lt;p&gt;Memory Options: The CM5-Pro supports up to 16GB LPDDR5, while the Pi CM5 offers up to 8GB LPDDR4X. For memory-intensive applications, the CM5-Pro's higher capacity could be decisive.&lt;/p&gt;
&lt;p&gt;Storage: Both offer eMMC options, with the CM5-Pro available up to 128GB and the Pi CM5 up to 64GB. Both support additional storage via carrier board interfaces.&lt;/p&gt;
&lt;p&gt;Price: The Raspberry Pi CM5 (8GB/32GB eMMC) sells for approximately $95, slightly cheaper than the CM5-Pro's $103. The CM5-Pro's extra features (more RAM/storage options, integrated NPU) justify the small price premium for those who need them.&lt;/p&gt;
&lt;p&gt;Ecosystem: The Pi CM5 benefits from Raspberry Pi's ecosystem, tooling, and community. The CM5-Pro has decent support but can't match the Pi's extensive resources.&lt;/p&gt;
&lt;p&gt;Carrier Boards: Both are CM4-compatible, meaning they can use the same carrier boards. However, some boards may not fully support CM5-Pro-specific features, and subtle electrical differences could cause issues in rare cases.&lt;/p&gt;
&lt;p&gt;Verdict: For maximum CPU performance in the CM4 form factor, choose the Pi CM5. Its 2.35x performance advantage is significant for compute-intensive applications. Choose the CM5-Pro if you need integrated AI acceleration, more than 8GB of RAM, more than 64GB of eMMC storage, or prefer the better wireless connectivity (WiFi 6 vs. WiFi 5).&lt;/p&gt;
&lt;h3&gt;Use Cases and Recommendations&lt;/h3&gt;
&lt;p&gt;Based on our testing and analysis, here are scenarios where the Banana Pi CM5-Pro excels and where alternatives might be better:&lt;/p&gt;
&lt;h4&gt;Choose the Banana Pi CM5-Pro if you:&lt;/h4&gt;
&lt;p&gt;Need AI acceleration in a compute module: The integrated 6 TOPS NPU eliminates the need for external AI accelerators, simplifying hardware design and reducing BOM costs. For robotics, smart cameras, or IoT devices with AI workloads, this is a compelling advantage.&lt;/p&gt;
&lt;p&gt;Require more than 8GB of RAM: The CM5-Pro supports up to 16GB LPDDR5, double the Pi CM5's maximum. If your application processes large datasets, runs multiple VMs, or needs extensive buffering, the extra RAM headroom matters.&lt;/p&gt;
&lt;p&gt;Want high-capacity built-in storage: With up to 128GB eMMC options, the CM5-Pro can store large datasets, models, or applications without requiring external storage. This simplifies deployment and improves reliability compared to SD cards or network storage.&lt;/p&gt;
&lt;p&gt;Prefer WiFi 6 and Bluetooth 5.3: Current-generation wireless standards provide better performance and lower latency than WiFi 5. For wireless robotics control or IoT applications with many connected devices, WiFi 6's improvements are meaningful.&lt;/p&gt;
&lt;p&gt;Value long production lifetime: Banana Pi's commitment to produce the CM5-Pro until August 2034 provides assurance for commercial products with multi-year lifecycles. You can design around this module without fear of it being discontinued in 2-3 years.&lt;/p&gt;
&lt;p&gt;Have thermal or power constraints: The RK3576's 8nm process and big.LITTLE architecture can deliver better power efficiency than always-on high-performance cores, extending battery life or reducing cooling requirements for fanless designs.&lt;/p&gt;
&lt;h4&gt;Choose alternatives if you:&lt;/h4&gt;
&lt;p&gt;Prioritize raw CPU performance: The Raspberry Pi 5, Pi CM5, Orange Pi 5 Max, and LattePanda IOTA all deliver significantly faster CPU performance. If your application is CPU-bound and doesn't benefit from the NPU, these platforms are better choices.&lt;/p&gt;
&lt;p&gt;Want the simplest development experience: The Raspberry Pi ecosystem's polish, documentation, and community support make it the easiest platform for beginners and rapid prototyping. The Pi 5 or Pi CM5 will get you running faster with fewer obstacles.&lt;/p&gt;
&lt;p&gt;Need maximum AI performance: NVIDIA Jetson platforms provide 40+ TOPS of AI performance with mature CUDA/TensorRT tooling. If AI is your primary workload, the investment in a Jetson module is worthwhile despite higher costs.&lt;/p&gt;
&lt;p&gt;Require x86 compatibility: The LattePanda IOTA or other x86 platforms provide better software compatibility for commercial applications that depend on x86-specific libraries or software.&lt;/p&gt;
&lt;p&gt;Work with standard SBC form factors: If you don't need a compute module and prefer the convenience of a full-sized SBC with onboard ports, the Orange Pi 5 Max or Raspberry Pi 5 are better choices.&lt;/p&gt;
&lt;h3&gt;The NPU in Practice: RKNN Toolkit and Ecosystem&lt;/h3&gt;
&lt;p&gt;While we didn't perform exhaustive AI benchmarking, our exploration of the RKNN ecosystem reveals both promise and challenges. The infrastructure exists: the NPU hardware is present and accessible, the runtime libraries are installed, and documentation is available from both Rockchip and Banana Pi. The RKNN toolkit can convert models from mainstream frameworks into NPU-optimized form, and community examples demonstrate YOLO11n object detection running successfully on the CM5-Pro.&lt;/p&gt;
&lt;p&gt;However, the RKNN development experience is not as streamlined as more mature ecosystems. Converting and optimizing models requires learning Rockchip-specific tools and workflows. Debugging performance issues or accuracy degradation during quantization demands patience and experimentation. The documentation is improving but remains fragmented across Rockchip's official site, Banana Pi's docs, and community forums.&lt;/p&gt;
&lt;p&gt;For developers already familiar with embedded AI deployment, the RKNN workflow will feel familiar - it follows similar patterns to TensorFlow Lite, ONNX Runtime, or other edge inference frameworks. For developers new to edge AI, the learning curve is steeper than cloud-based solutions but gentler than some alternatives (looking at you, Hailo's toolchain).&lt;/p&gt;
&lt;p&gt;The 6 TOPS performance figure is real and achievable for properly optimized models. INT8 quantized YOLO models can indeed run at 50fps @ 4K, and simpler models scale accordingly. The NPU's support for INT4 and BF16 formats provides flexibility for trading off accuracy versus performance. For many robotics and IoT applications, the 6 TOPS NPU hits a sweet spot: enough performance for useful AI workloads, integrated into the SoC to minimize complexity and cost, and accessible through reasonable (if not perfect) tooling.&lt;/p&gt;
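&lt;p&gt;As a back-of-the-envelope sanity check (a sketch using the review's claimed figures, not independent measurements), the per-frame compute budget at that throughput works out as follows:&lt;/p&gt;

```python
# Back-of-the-envelope: at a 6 TOPS peak and 50 fps, how many NPU
# operations are available per frame? Both figures are the claims
# discussed above, not independent measurements.
NPU_TOPS = 6   # peak INT8 throughput, trillions of ops per second
FPS = 50       # claimed YOLO frame rate

ops_per_frame = NPU_TOPS * 1e12 / FPS
print(f"{ops_per_frame / 1e9:.0f} GOPs of peak compute per frame")  # 120
```

&lt;p&gt;Real-world utilization never reaches peak, so treat this as an upper bound when judging whether a model's nominal op count fits the budget.&lt;/p&gt;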
&lt;h3&gt;Build Quality and Physical Characteristics&lt;/h3&gt;
&lt;p&gt;The Banana Pi CM5-Pro adheres to the Raspberry Pi CM4 mechanical specification, featuring dual 100-pin high-density connectors arranged in the standard layout. Physical dimensions match the CM4, allowing drop-in replacement in compatible carrier boards. Our sample unit appeared well-manufactured with clean solder joints, proper component placement, and no obvious defects.&lt;/p&gt;
&lt;p&gt;The module includes an on-board WiFi/Bluetooth antenna connector (U.FL/IPEX), power management IC, and all necessary supporting components. Unlike some compute modules that require extensive external components on the carrier board, the CM5-Pro is relatively self-contained, simplifying carrier board design.&lt;/p&gt;
&lt;p&gt;Thermal performance is adequate but not exceptional. Under sustained load during our compilation benchmarks, the SoC reached temperatures requiring thermal management. For applications running continuous AI inference or heavy CPU workloads, active cooling (fan) or substantial passive cooling (heatsink and airflow) is recommended. The carrier board design should account for thermal dissipation, especially if the module will be enclosed in a case.&lt;/p&gt;
&lt;h3&gt;Software and Ecosystem&lt;/h3&gt;
&lt;p&gt;The CM5-Pro ships with Banana Pi's custom Debian-based Linux distribution, featuring a 6.1.75 kernel with Rockchip-specific patches and drivers. In our testing, the system worked well out of the box: networking functioned, sudo worked (refreshingly, after the Horizon X3 CM disaster), and package management operated normally.&lt;/p&gt;
&lt;p&gt;The distribution includes pre-installed RKNN libraries and tools, enabling NPU development without additional setup. Python 3 and essential development packages are available, and standard Debian repositories provide access to thousands of additional packages. For developers comfortable with Debian/Ubuntu, the environment feels familiar and capable.&lt;/p&gt;
&lt;p&gt;However, the software ecosystem lags behind Raspberry Pi's. Raspberry Pi OS includes countless optimizations, hardware-specific integrations, and utilities that simply don't exist for Rockchip platforms. Camera support, GPIO access, and peripheral interfaces work, but often require more manual configuration or programming compared to the Pi's plug-and-play experience.&lt;/p&gt;
&lt;p&gt;Third-party software support varies. Popular frameworks like ROS2, OpenCV, and TensorFlow compile and run without issues. Hardware-specific accelerators (GPU, NPU) may require additional configuration or custom builds. Overall, the software situation is "good enough" for experienced developers but not as polished as the Raspberry Pi ecosystem.&lt;/p&gt;
&lt;p&gt;Banana Pi's documentation has improved significantly over the years, with reasonably comprehensive guides covering basic setup, GPIO usage, and RKNN deployment. Community support exists through forums and GitHub, though it's smaller and less active than Raspberry Pi's communities. Expect to do more troubleshooting independently and rely less on finding someone who's already solved your exact problem.&lt;/p&gt;
&lt;h3&gt;Conclusion: A Capable Platform for Specific Niches&lt;/h3&gt;
&lt;p&gt;The Banana Pi CM5-Pro is a solid, if unspectacular, compute module that serves specific niches well while falling short of being a universal recommendation. Its combination of integrated 6 TOPS NPU, up to 16GB RAM, WiFi 6 connectivity, and CM4-compatible form factor creates a unique offering that competes effectively against alternatives when your requirements align with its strengths.&lt;/p&gt;
&lt;p&gt;For projects needing AI acceleration in a compute module format, the CM5-Pro is arguably the best choice currently available. The integrated NPU eliminates the complexity and cost of external AI accelerators while delivering genuine performance improvements for inference workloads. The RKNN toolkit, while imperfect, provides a workable path to deploying optimized models. If your robotics platform, smart camera, or IoT device depends on local AI processing, the CM5-Pro deserves serious consideration.&lt;/p&gt;
&lt;p&gt;For projects requiring more than 8GB of RAM or more than 64GB of storage in a compute module, the CM5-Pro is the only game in town among CM4-compatible options. This makes it the default choice for memory-intensive applications that need the compute module form factor.&lt;/p&gt;
&lt;p&gt;For general-purpose computing, development, or applications where AI is not central, the Raspberry Pi CM5 is the better choice. Its 2.35x performance advantage is substantial and directly translates to faster build times, quicker application responsiveness, and better user experience. The Pi's ecosystem advantages further tip the scales for most users.&lt;/p&gt;
&lt;p&gt;Our compilation benchmark results - 167 seconds for the CM5-Pro versus 71-77 seconds for Pi5/CM5 - illustrate the performance gap clearly. For development workflows, this difference is noticeable but workable. Most developers can tolerate the CM5-Pro's slower compilation times if other factors (AI acceleration, RAM capacity, price) favor it. But if maximum CPU performance is your priority, look elsewhere.&lt;/p&gt;
&lt;p&gt;The comparison to the Orange Pi 5 Max reveals a significant performance gap (62 vs. 167 seconds), but also highlights different market positions. The 5 Max is a full-featured SBC designed for standalone use, while the CM5-Pro is a compute module designed for embedded integration. They serve different purposes and target different applications.&lt;/p&gt;
&lt;p&gt;Against the LattePanda IOTA's x86 architecture, the CM5-Pro trades x86 compatibility for better power efficiency, integrated AI, and lower cost. The choice between them depends entirely on software requirements - x86-specific applications favor the IOTA, while ARM-native embedded applications favor the CM5-Pro.&lt;/p&gt;
&lt;p&gt;The Banana Pi CM5-Pro earns a qualified recommendation: excellent for AI-focused embedded projects, good for high-RAM compute module applications, acceptable for general embedded Linux development, and not recommended if raw CPU performance or ecosystem maturity are priorities. At $103 for the 8GB/64GB configuration, it offers reasonable value for applications that leverage its strengths, though it won't excite buyers seeking the fastest or cheapest option.&lt;/p&gt;
&lt;p&gt;If your project needs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AI acceleration integrated into a compute module&lt;/li&gt;
&lt;li&gt;More than 8GB RAM in CM4 form factor&lt;/li&gt;
&lt;li&gt;WiFi 6 and current wireless standards&lt;/li&gt;
&lt;li&gt;Guaranteed long production life (until 2034)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Then the Banana Pi CM5-Pro is a solid choice that delivers on its promises.&lt;/p&gt;
&lt;p&gt;If your project needs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Maximum CPU performance&lt;/li&gt;
&lt;li&gt;The most polished software ecosystem&lt;/li&gt;
&lt;li&gt;The easiest development experience&lt;/li&gt;
&lt;li&gt;The lowest cost&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Then the Raspberry Pi CM5 or Pi 5 remains the better option.&lt;/p&gt;
&lt;p&gt;The CM5-Pro occupies a middle ground: not the fastest, not the cheapest, not the easiest, but uniquely capable in specific areas. For the right application, it's exactly what you need. For others, it's a compromise that doesn't quite satisfy. Choose accordingly.&lt;/p&gt;
&lt;h3&gt;Specifications Summary&lt;/h3&gt;
&lt;p&gt;Processor:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Rockchip RK3576 (8nm process)&lt;/li&gt;
&lt;li&gt;4x ARM Cortex-A72 @ 2.2 GHz (performance cores)&lt;/li&gt;
&lt;li&gt;4x ARM Cortex-A53 @ 1.8 GHz (efficiency cores)&lt;/li&gt;
&lt;li&gt;Mali-G52 MC3 GPU&lt;/li&gt;
&lt;li&gt;6 TOPS NPU (Rockchip RKNPU)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Memory &amp;amp; Storage:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;4GB/8GB/16GB LPDDR5 RAM options&lt;/li&gt;
&lt;li&gt;32GB/64GB/128GB eMMC options&lt;/li&gt;
&lt;li&gt;M.2 NVMe SSD support via carrier board&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Video:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;8K@30fps H.265/VP9 decoding&lt;/li&gt;
&lt;li&gt;4K@60fps H.264/H.265 encoding&lt;/li&gt;
&lt;li&gt;HDMI 2.0 output (via carrier board)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Connectivity:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;WiFi 6 (802.11ax) and Bluetooth 5.3&lt;/li&gt;
&lt;li&gt;Gigabit Ethernet (via carrier board)&lt;/li&gt;
&lt;li&gt;Multiple USB 2.0/3.0 interfaces&lt;/li&gt;
&lt;li&gt;MIPI CSI camera inputs&lt;/li&gt;
&lt;li&gt;I2C, SPI, UART, PWM&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Physical:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Dual 100-pin board-to-board connectors (CM4-compatible)&lt;/li&gt;
&lt;li&gt;Dimensions: 55mm x 40mm&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Benchmark Performance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Rust compilation: 167.15 seconds average&lt;/li&gt;
&lt;li&gt;2.68x slower than Orange Pi 5 Max&lt;/li&gt;
&lt;li&gt;2.35x slower than Raspberry Pi CM5&lt;/li&gt;
&lt;li&gt;2.31x slower than LattePanda IOTA&lt;/li&gt;
&lt;li&gt;2.18x slower than Raspberry Pi 5&lt;/li&gt;
&lt;li&gt;2.27x faster than Horizon X3 CM&lt;/li&gt;
&lt;/ul&gt;
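&lt;p&gt;The multipliers above can be reproduced from the raw averages; a quick check (the comparison times are the averages reported across this benchmark series) confirms them:&lt;/p&gt;

```python
# Reproduce the relative-performance figures from the raw averages (seconds).
cm5_pro = 167.15
rivals = {
    "Orange Pi 5 Max": 62.31,
    "Raspberry Pi CM5": 71.04,
    "LattePanda IOTA": 72.21,
    "Raspberry Pi 5": 76.65,
}
for name, secs in rivals.items():
    print(f"{cm5_pro / secs:.2f}x slower than {name}")
print(f"{378.81 / cm5_pro:.2f}x faster than Horizon X3 CM")
```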
&lt;p&gt;Pricing: ~$103 USD (8GB RAM / 64GB eMMC configuration)&lt;/p&gt;
&lt;p&gt;Production Lifetime: Guaranteed until August 2034&lt;/p&gt;
&lt;p&gt;Recommendation: Good choice for AI-focused embedded projects requiring compute module form factor; not recommended if raw CPU performance is the priority.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Review Date: November 3, 2025&lt;/p&gt;
&lt;p&gt;Hardware Tested: Banana Pi CM5-Pro (ArmSoM-CM5) with 4GB RAM, 29GB eMMC, 932GB NVMe SSD&lt;/p&gt;
&lt;p&gt;OS Tested: Banana Pi Debian (based on Debian GNU/Linux), kernel 6.1.75&lt;/p&gt;
&lt;p&gt;Conclusion: Solid middle-ground option with integrated AI acceleration; best for specific niches rather than general-purpose use.&lt;/p&gt;</description><category>ai acceleration</category><category>arm cortex a53</category><category>arm cortex a72</category><category>banana pi</category><category>benchmarks</category><category>cm5 pro</category><category>compute module</category><category>edge ai</category><category>hardware review</category><category>npu</category><category>rknn</category><category>rockchip rk3576</category><category>single board computers</category><category>wifi 6</category><guid>https://tinycomputers.io/posts/banana-pi-cm5-pro-review.html</guid><pubDate>Mon, 03 Nov 2025 20:22:15 GMT</pubDate></item><item><title>The Horizon X3 CM: A Cautionary Tale in Robotics Development Platforms</title><link>https://tinycomputers.io/posts/horizon-robotics-x3-cm-review.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/horizon-robotics-x3-cm-review_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;37 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;h3&gt;Introduction&lt;/h3&gt;
&lt;p&gt;The Horizon X3 CM (Compute Module) represents an interesting case study in the single-board computer market: a product marketed as an AI-focused robotics platform that, in practice, falls dramatically short of both its promises and its competition. Released during the 2021-2022 timeframe and based on Horizon Robotics' Sunrise 3 chip (announced September 2020), the X3 CM attempts to position itself as a robotics development platform with integrated AI acceleration through its "Brain Processing Unit" or BPU. However, as we discovered through extensive testing and configuration attempts, the Horizon X3 CM is an underwhelming offering that suffers from outdated hardware, broken software distributions, abandoned documentation, and a configuration process so Byzantine that it borders on hostile to users.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Horizon X3 CM compute module" src="https://tinycomputers.io/images/horizon_x3_cm/IMG_4038.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Horizon X3 CM compute module showing the CM4-compatible 200-pin connector&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt="Horizon X3 CM mounted on carrier board" src="https://tinycomputers.io/images/horizon_x3_cm/IMG_4040.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Horizon X3 CM installed on a carrier board with exposed components&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;Hardware Architecture: A Foundation Built on Yesterday's Technology&lt;/h3&gt;
&lt;p&gt;At the heart of the Horizon X3 CM lies the Sunrise X3 system-on-chip, featuring a quad-core ARM Cortex-A53 processor clocked at 1.5 GHz, paired with a single Cortex-R5 core for real-time tasks. The Cortex-A53, announced by ARM in 2012, was already considered a low-power, efficiency-focused core at launch. By 2025 standards, it is ancient technology - predating even the Cortex-A55 by five years and the high-performance Cortex-A76 by six years.&lt;/p&gt;
&lt;p&gt;To put this in perspective: the Cortex-A53 was designed in an era when ARM was still competing against Intel Atom processors in tablets and smartphones. The microarchitecture lacks modern features like advanced branch prediction, sophisticated out-of-order execution, and the aggressive clock speeds found in contemporary ARM cores. It was never intended for computationally demanding workloads, instead optimizing for power efficiency in battery-powered devices.&lt;/p&gt;
&lt;p&gt;The system includes 2GB or 4GB of RAM (our test unit had 4GB), eMMC storage options, and the typical suite of interfaces expected on a compute module: MIPI CSI for cameras, MIPI DSI for displays, USB 3.0, Gigabit Ethernet, and HDMI output. The physical form factor mimics the Raspberry Pi Compute Module 4's 200-pin board-to-board connector, allowing it to fit into existing CM4 carrier boards - at least in theory.&lt;/p&gt;
&lt;h3&gt;The BPU: Marketing Promise vs. Reality&lt;/h3&gt;
&lt;p&gt;The headline feature of the Horizon X3 CM is undoubtedly its Brain Processing Unit, marketed as providing 5 TOPS (trillion operations per second) of AI inference capability using Horizon's Bernoulli 2.0 architecture. The BPU is a dual-core dedicated neural processing unit fabricated on a 16nm process, designed specifically for edge AI applications in robotics and autonomous driving.&lt;/p&gt;
&lt;p&gt;On paper, 5 TOPS sounds impressive for an edge device. The marketing materials emphasize the X3's ability to run AI models locally without cloud dependency, perform real-time object detection, enable autonomous navigation, and support various computer vision tasks. Horizon Robotics, founded in 2015 and focused primarily on automotive AI processors, positioned the Sunrise 3 chip as a way to bring their automotive-grade AI capabilities to the robotics and IoT markets.&lt;/p&gt;
&lt;p&gt;In practice, the BPU's utility is severely constrained by several factors. First, the 5 TOPS figure assumes optimal utilization with models specifically optimized for the Bernoulli architecture. Second, the Cortex-A53 CPU cores create a significant bottleneck for any workload that cannot be entirely offloaded to the BPU. Third, and most critically, the toolchain and software ecosystem required to actually leverage the BPU is fragmented, poorly documented, and largely abandoned.&lt;/p&gt;
&lt;h3&gt;The Software Ecosystem: Abandonment and Fragmentation&lt;/h3&gt;
&lt;p&gt;Perhaps the most telling aspect of the Horizon X3 CM is the state of its software support. Horizon Robotics archived all their GitHub repositories, effectively abandoning public development and support. D-Robotics, which appears to be either a subsidiary or spin-off focused on the robotics market, has continued maintaining forks of some repositories, but the overall ecosystem feels scattered and undermaintained.&lt;/p&gt;
&lt;h4&gt;hobot_llm: An Exercise in Futility&lt;/h4&gt;
&lt;p&gt;One of the more recent developments is hobot_llm, a project that attempts to run Large Language Models on the RDK X3 platform. Hosted at https://github.com/D-Robotics/hobot_llm, this ROS2 node promises to bring LLM capabilities to edge robotics applications. The reality is far less inspiring.&lt;/p&gt;
&lt;p&gt;hobot_llm provides two interaction modes: a terminal-based chat interface and a ROS2 node that subscribes to text topics and publishes LLM responses. The system requires the 4GB RAM version of the RDK X3 and recommends increasing the BPU reserved memory to 1.7GB - leaving precious little memory for other tasks.&lt;/p&gt;
&lt;p&gt;Users report that responses take 15-30 seconds to generate, and the quality of responses is described as "confusing and mostly unrelated to the query." This performance characteristic makes the system effectively useless for any real-time robotics application. A robot that takes 30 seconds to formulate a language-based response is not demonstrating intelligence; it's demonstrating the fundamental inadequacy of the platform.&lt;/p&gt;
&lt;p&gt;The hobot_llm project exemplifies the broader problem with the X3 ecosystem: projects that look interesting in concept but fall apart under scrutiny, implemented on hardware that lacks the computational resources to make them practical, maintained by a fractured development community that can't provide consistent support.&lt;/p&gt;
&lt;h4&gt;D-Robotics vs. Horizon Robotics: Corporate Confusion&lt;/h4&gt;
&lt;p&gt;The relationship between Horizon Robotics and D-Robotics adds another layer of confusion for potential users. Horizon Robotics, the original creator of the Sunrise chips, has clearly shifted its focus to the automotive market, where margins are higher and customers are more willing to accept proprietary, closed-source solutions. The company's GitHub repositories were archived, signaling an end to community-focused development.&lt;/p&gt;
&lt;p&gt;D-Robotics picked up the robotics development kit mantle, maintaining forks of key repositories like hobot_llm, hobot_dnn (the DNN inference framework), and the RDK model zoo. However, this continuation feels more like life support than active development. Commit frequencies are low, issues pile up without resolution, and the documentation remains fragmented across multiple sites (d-robotics.cc, developer.d-robotics.cc, github.com/D-Robotics, github.com/HorizonRDK).&lt;/p&gt;
&lt;p&gt;For a potential user in 2025, this corporate structure raises immediate red flags. Who actually supports this platform? If you encounter a problem, where do you file an issue? If Horizon has abandoned the project and D-Robotics is merely keeping it alive, what is the long-term viability of building a product on this foundation?&lt;/p&gt;
&lt;h3&gt;The Bootstrap Nightmare: A System Designed to Frustrate&lt;/h3&gt;
&lt;p&gt;If the hardware limitations and software abandonment weren't enough to dissuade potential users, the actual process of getting a functioning Horizon X3 CM system should seal the case. We downloaded the latest Ubuntu 22.04-derived distribution from https://archive.d-robotics.cc/downloads/en/os_images/rdk_x3/rdk_os_3.0.3-2025-09-08/ and discovered a system configuration so broken and non-standard that it defies belief.&lt;/p&gt;
&lt;h4&gt;The Sudo Catastrophe&lt;/h4&gt;
&lt;p&gt;The most egregious issue: sudo doesn't work out of the box. Not because of a configuration error, but because critical system files are owned by the wrong user. The distribution ships with /usr/bin/sudo, /etc/sudoers, and related files owned by uid 1000 (the sunrise user) rather than root. This creates an impossible catch-22:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You need root privileges to fix the file ownership&lt;/li&gt;
&lt;li&gt;sudo is the standard way to gain root privileges&lt;/li&gt;
&lt;li&gt;sudo won't function because of incorrect ownership&lt;/li&gt;
&lt;li&gt;You can't fix the ownership without root privileges&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Traditional escape routes all fail. The root password is not set, so su doesn't work. pkexec requires polkit authentication. systemctl requires authentication for privileged operations. Even setting file capabilities (setcap) to grant specific privileges fails because the sunrise user lacks CAP_SETFCAP.&lt;/p&gt;
&lt;p&gt;The workaround involves creating an /etc/rc.local script that runs at boot time as root to fix ownership of sudo binaries, sudoers files, and apt directories:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="ch"&gt;#!/bin/bash -e&lt;/span&gt;
&lt;span class="c1"&gt;# Fix sudo binary ownership and permissions&lt;/span&gt;
chown&lt;span class="w"&gt; &lt;/span&gt;root:root&lt;span class="w"&gt; &lt;/span&gt;/usr/bin/sudo
chmod&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;4755&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;/usr/bin/sudo

&lt;span class="c1"&gt;# Fix sudo plugins directory&lt;/span&gt;
chown&lt;span class="w"&gt; &lt;/span&gt;-R&lt;span class="w"&gt; &lt;/span&gt;root:root&lt;span class="w"&gt; &lt;/span&gt;/usr/lib/sudo/

&lt;span class="c1"&gt;# Fix sudoers configuration files&lt;/span&gt;
chown&lt;span class="w"&gt; &lt;/span&gt;root:root&lt;span class="w"&gt; &lt;/span&gt;/etc/sudoers
chmod&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0440&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;/etc/sudoers
chown&lt;span class="w"&gt; &lt;/span&gt;-R&lt;span class="w"&gt; &lt;/span&gt;root:root&lt;span class="w"&gt; &lt;/span&gt;/etc/sudoers.d/
chmod&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0755&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;/etc/sudoers.d/
chmod&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0440&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;/etc/sudoers.d/*

&lt;span class="c1"&gt;# Fix apt package manager directories&lt;/span&gt;
mkdir&lt;span class="w"&gt; &lt;/span&gt;-p&lt;span class="w"&gt; &lt;/span&gt;/var/cache/apt/archives/partial
mkdir&lt;span class="w"&gt; &lt;/span&gt;-p&lt;span class="w"&gt; &lt;/span&gt;/var/lib/apt/lists/partial
chown&lt;span class="w"&gt; &lt;/span&gt;-R&lt;span class="w"&gt; &lt;/span&gt;root:root&lt;span class="w"&gt; &lt;/span&gt;/var/lib/apt/lists
chown&lt;span class="w"&gt; &lt;/span&gt;_apt:root&lt;span class="w"&gt; &lt;/span&gt;/var/lib/apt/lists/partial
chmod&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0700&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;/var/lib/apt/lists/partial
chown&lt;span class="w"&gt; &lt;/span&gt;-R&lt;span class="w"&gt; &lt;/span&gt;root:root&lt;span class="w"&gt; &lt;/span&gt;/var/cache/apt/archives
chown&lt;span class="w"&gt; &lt;/span&gt;_apt:root&lt;span class="w"&gt; &lt;/span&gt;/var/cache/apt/archives/partial
chmod&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0700&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;/var/cache/apt/archives/partial

&lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This is not a minor configuration quirk. This is a fundamental misunderstanding of Linux system security and standard practices. No competent distribution would ship with sudo broken in this manner. The fact that this made it into a release image dated September 2025 suggests either complete incompetence or absolute indifference to user experience.&lt;/p&gt;
&lt;h4&gt;Network Configuration Hell&lt;/h4&gt;
&lt;p&gt;The default network configuration assumes you're using the 192.168.1.0/24 subnet with a gateway at 192.168.1.1. If your network uses any other addressing scheme - as most enterprise networks, lab environments, and even many home networks do - you're in for a frustrating experience.&lt;/p&gt;
&lt;p&gt;Changing the network configuration should be trivial: edit /etc/network/interfaces, update the IP address and gateway, reboot. Except the sunrise user lacks CAP_NET_ADMIN capability, so you can't use ip commands to modify network configuration on the fly. You can't use NetworkManager's command-line tools without authentication. You must edit the configuration files manually and reboot to apply changes.&lt;/p&gt;
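&lt;p&gt;For reference, the static stanza in /etc/network/interfaces follows the classic ifupdown format. The address below is the one from our move; the interface name, netmask, and gateway are illustrative assumptions that will differ on your network:&lt;/p&gt;

```
auto eth0
iface eth0 inet static
    address 10.1.1.135
    netmask 255.255.255.0
    gateway 10.1.1.1
```

&lt;p&gt;Note that a dns-nameservers line here is only honored when the resolvconf package is installed; on this image, DNS had to be configured separately through systemd-resolved.&lt;/p&gt;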
&lt;p&gt;Our journey to move the device from 192.168.1.10 to 10.1.1.135 involved:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Accessing the device through a gateway system that could route to both networks&lt;/li&gt;
&lt;li&gt;Backing up /etc/network/interfaces&lt;/li&gt;
&lt;li&gt;Manually editing the static IP configuration&lt;/li&gt;
&lt;li&gt;Removing conflicting secondary IP configuration scripts&lt;/li&gt;
&lt;li&gt;Adding DNS servers (which weren't configured at all in the default image)&lt;/li&gt;
&lt;li&gt;Rebooting and hoping the configuration took&lt;/li&gt;
&lt;li&gt;Troubleshooting DNS resolution failures&lt;/li&gt;
&lt;li&gt;Editing /etc/systemd/resolved.conf to add nameservers&lt;/li&gt;
&lt;li&gt;Adding a systemd-resolved restart to /etc/rc.local&lt;/li&gt;
&lt;li&gt;Rebooting again&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This process, which takes approximately 30 seconds on a properly configured Linux system, consumed hours on the Horizon X3 CM due to the broken permissions structure and missing default configurations.&lt;/p&gt;
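&lt;p&gt;For anyone retracing step 8, the change amounts to a short addition to /etc/systemd/resolved.conf (the nameserver addresses below are placeholders, not values from the image):&lt;/p&gt;

```
[Resolve]
DNS=1.1.1.1 9.9.9.9
FallbackDNS=8.8.8.8
```

&lt;p&gt;Applying it requires restarting systemd-resolved, which on this image has to happen from /etc/rc.local because interactive privilege escalation is broken.&lt;/p&gt;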
&lt;h4&gt;Repository Roulette&lt;/h4&gt;
&lt;p&gt;The default APT repositories point to mirrors.tuna.tsinghua.edu.cn (a Chinese university mirror) and archive.sunrisepi.tech (which is frequently unreachable). For users outside China, these repositories are slow or inaccessible. The solution requires manually reconfiguring /etc/apt/sources.list to use official Ubuntu Ports mirrors:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;deb&lt;span class="w"&gt; &lt;/span&gt;http://ports.ubuntu.com/ubuntu-ports/&lt;span class="w"&gt; &lt;/span&gt;focal&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;restricted&lt;span class="w"&gt; &lt;/span&gt;universe&lt;span class="w"&gt; &lt;/span&gt;multiverse
deb&lt;span class="w"&gt; &lt;/span&gt;http://ports.ubuntu.com/ubuntu-ports/&lt;span class="w"&gt; &lt;/span&gt;focal-security&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;restricted&lt;span class="w"&gt; &lt;/span&gt;universe&lt;span class="w"&gt; &lt;/span&gt;multiverse
deb&lt;span class="w"&gt; &lt;/span&gt;http://ports.ubuntu.com/ubuntu-ports/&lt;span class="w"&gt; &lt;/span&gt;focal-updates&lt;span class="w"&gt; &lt;/span&gt;main&lt;span class="w"&gt; &lt;/span&gt;restricted&lt;span class="w"&gt; &lt;/span&gt;universe&lt;span class="w"&gt; &lt;/span&gt;multiverse
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Again, this should be a non-issue. Modern distributions detect geographic location and configure appropriate mirrors automatically. The Horizon X3 CM requires manual intervention for basic package management functionality.&lt;/p&gt;
&lt;h4&gt;The Permission Structure Mystery&lt;/h4&gt;
&lt;p&gt;Beyond these specific issues lies a broader architectural decision that makes no sense: why are system directories owned by a non-root user? Running ls -ld on /etc, /usr/lib, and /var/lib/apt reveals they're owned by sunrise:sunrise rather than root:root. This violates fundamental Unix security principles and creates cascading problems throughout the system.&lt;/p&gt;
&lt;p&gt;Was this an intentional design decision? If so, what was the rationale? Was it an accident that made it through quality assurance? The complete lack of documentation about this unusual setup suggests it's not intentional, yet it persists through multiple distribution releases.&lt;/p&gt;
&lt;h3&gt;Performance Testing: Confirmation of Inadequacy&lt;/h3&gt;
&lt;p&gt;To quantitatively assess the Horizon X3 CM's performance, we ran our standard Rust compilation benchmark: building a complex ballistics simulation engine with numerous dependencies from clean state, three times, and averaging the results. This workload stresses CPU cores, memory bandwidth, and compiler performance - a representative real-world task for any development platform.&lt;/p&gt;
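&lt;p&gt;For transparency, the harness amounts to timing a from-scratch build three times and averaging. A minimal sketch (the actual cargo invocation and project are assumptions; substitute any command) looks like this:&lt;/p&gt;

```python
import statistics
import subprocess
import time

def bench(cmd, runs=3):
    """Run cmd several times and return the mean wall-clock time in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        times.append(time.perf_counter() - start)
    return statistics.mean(times)

# The review's workload is assumed to be a clean release build, i.e.
# "cargo clean" followed by "cargo build --release" in the project tree;
# any command works for exercising the harness itself.
print(f"mean over 3 runs: {bench(['true']):.3f}s")
```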
&lt;h4&gt;Benchmark Results&lt;/h4&gt;
&lt;p&gt;The Horizon X3 CM posted compilation times of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Run 1: 384.32 seconds (6 minutes 24 seconds)&lt;/li&gt;
&lt;li&gt;Run 2: 376.66 seconds (6 minutes 17 seconds)&lt;/li&gt;
&lt;li&gt;Run 3: 375.46 seconds (6 minutes 15 seconds)&lt;/li&gt;
&lt;li&gt;Average: 378.81 seconds (6 minutes 19 seconds)&lt;/li&gt;
&lt;/ul&gt;
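&lt;p&gt;The reported figure is simply the arithmetic mean of the three runs:&lt;/p&gt;

```python
# Verify the reported average of the three compilation runs (seconds).
runs = [384.32, 376.66, 375.46]
average = sum(runs) / len(runs)
print(f"average: {average:.2f}s")  # average: 378.81s
```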
&lt;p&gt;For context, here's how this compares to contemporary ARM and x86 single-board computers:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;CPU&lt;/th&gt;
&lt;th&gt;Cores&lt;/th&gt;
&lt;th&gt;Average Time&lt;/th&gt;
&lt;th&gt;vs. X3 CM&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Orange Pi 5 Max&lt;/td&gt;
&lt;td&gt;ARM64&lt;/td&gt;
&lt;td&gt;Cortex-A55/A76&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;62.31s&lt;/td&gt;
&lt;td&gt;6.08x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raspberry Pi CM5&lt;/td&gt;
&lt;td&gt;ARM64&lt;/td&gt;
&lt;td&gt;Cortex-A76&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;71.04s&lt;/td&gt;
&lt;td&gt;5.33x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LattePanda Iota&lt;/td&gt;
&lt;td&gt;x86_64&lt;/td&gt;
&lt;td&gt;Intel N150&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;72.21s&lt;/td&gt;
&lt;td&gt;5.25x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raspberry Pi 5&lt;/td&gt;
&lt;td&gt;ARM64&lt;/td&gt;
&lt;td&gt;Cortex-A76&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;76.65s&lt;/td&gt;
&lt;td&gt;4.94x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Horizon X3 CM&lt;/td&gt;
&lt;td&gt;ARM64&lt;/td&gt;
&lt;td&gt;Cortex-A53&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;378.81s&lt;/td&gt;
&lt;td&gt;1.00x (baseline)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Orange Pi RV2&lt;/td&gt;
&lt;td&gt;RISC-V&lt;/td&gt;
&lt;td&gt;Ky X1&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;650.60s&lt;/td&gt;
&lt;td&gt;1.72x slower&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
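&lt;p&gt;The "vs. X3 CM" column is plain division against the 378.81-second baseline, which is easy to sanity-check:&lt;/p&gt;

```python
# Average compile times (seconds) from the table above.
baseline = 378.81  # Horizon X3 CM
faster = [("Orange Pi 5 Max", 62.31), ("Raspberry Pi CM5", 71.04),
          ("LattePanda Iota", 72.21), ("Raspberry Pi 5", 76.65)]
for name, secs in faster:
    print(f"{name}: {baseline / secs:.2f}x faster")
print(f"Orange Pi RV2: {650.60 / baseline:.2f}x slower")
```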
&lt;p&gt;The Horizon X3 CM is approximately five times slower than the Raspberry Pi 5, despite both boards having four cores. This dramatic performance gap is explained by the generational difference in ARM core architecture: the Cortex-A76 in the Pi 5 represents roughly six years of microarchitectural advancement over the A53, with wider execution units, better branch prediction, higher clock speeds, and more sophisticated memory hierarchies.&lt;/p&gt;
&lt;p&gt;The only platform slower than the X3 CM in our testing was the Orange Pi RV2, which uses an experimental RISC-V processor with an immature compiler toolchain. The fact that an established ARM platform with a mature software ecosystem performs only 1.72x better than a bleeding-edge RISC-V platform speaks volumes about the X3's inadequacy.&lt;/p&gt;
&lt;h4&gt;Geekbench 6 Results: Industry-Standard Confirmation&lt;/h4&gt;
&lt;p&gt;To complement our real-world compilation benchmarks, we also ran Geekbench 6 - an industry-standard synthetic benchmark that measures CPU performance across a variety of workloads including cryptography, image processing, machine learning, and general computation. The results reinforce and quantify just how far the Horizon X3 CM falls behind modern alternatives.&lt;/p&gt;
&lt;p&gt;Horizon X3 CM Geekbench 6 Scores:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Single-Core Score: 127&lt;/li&gt;
&lt;li&gt;Multi-Core Score: 379&lt;/li&gt;
&lt;li&gt;Geekbench Link: https://browser.geekbench.com/v6/cpu/14816041&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For context, here's how this compares to other single-board computers running Geekbench 6:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;CPU&lt;/th&gt;
&lt;th&gt;Single-Core&lt;/th&gt;
&lt;th&gt;Multi-Core&lt;/th&gt;
&lt;th&gt;vs. X3 Single&lt;/th&gt;
&lt;th&gt;vs. X3 Multi&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Orange Pi 5 Max&lt;/td&gt;
&lt;td&gt;Cortex-A55/A76&lt;/td&gt;
&lt;td&gt;743&lt;/td&gt;
&lt;td&gt;2,792&lt;/td&gt;
&lt;td&gt;5.85x faster&lt;/td&gt;
&lt;td&gt;7.37x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raspberry Pi 5&lt;/td&gt;
&lt;td&gt;Cortex-A76&lt;/td&gt;
&lt;td&gt;764-774&lt;/td&gt;
&lt;td&gt;1,588-1,604&lt;/td&gt;
&lt;td&gt;6.01-6.09x faster&lt;/td&gt;
&lt;td&gt;4.19-4.23x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raspberry Pi 5 (OC)&lt;/td&gt;
&lt;td&gt;Cortex-A76&lt;/td&gt;
&lt;td&gt;837&lt;/td&gt;
&lt;td&gt;1,711&lt;/td&gt;
&lt;td&gt;6.59x faster&lt;/td&gt;
&lt;td&gt;4.51x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Horizon X3 CM&lt;/td&gt;
&lt;td&gt;Cortex-A53&lt;/td&gt;
&lt;td&gt;127&lt;/td&gt;
&lt;td&gt;379&lt;/td&gt;
&lt;td&gt;1.00x (baseline)&lt;/td&gt;
&lt;td&gt;1.00x (baseline)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
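&lt;p&gt;As with the compile-time table, the ratio columns are straight divisions against the X3's 127 single-core and 379 multi-core scores:&lt;/p&gt;

```python
x3_single, x3_multi = 127, 379
systems = [("Orange Pi 5 Max", 743, 2792), ("Raspberry Pi 5 (OC)", 837, 1711)]
for name, single, multi in systems:
    print(f"{name}: {single / x3_single:.2f}x single-core, "
          f"{multi / x3_multi:.2f}x multi-core")
```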
&lt;p&gt;The Geekbench results align remarkably well with our compilation benchmarks, confirming that the X3 CM's poor performance isn't specific to one workload but represents a fundamental computational deficit across all task types.&lt;/p&gt;
&lt;p&gt;A single-core score of 127 is abysmal by 2025 standards. To put this in perspective, the iPhone 6s from 2015 scored around 140 in single-core Geekbench 6 tests. The Horizon X3 CM, released in 2021-2022, delivers performance comparable to a decade-old smartphone processor.&lt;/p&gt;
&lt;p&gt;The multi-core score of 379 shows that the X3 fails to effectively leverage its four cores. Despite having the same core count as the Raspberry Pi 5, the X3 scores less than one-quarter of the Pi 5's multi-core performance. The Orange Pi 5 Max, with its eight cores (four A76 + four A55), absolutely destroys the X3 with 7.37x better multi-core performance.&lt;/p&gt;
&lt;p&gt;The Geekbench individual test scores reveal specific weaknesses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Navigation tasks: 282 single-core (embarrassingly slow for robotics applications requiring path planning)&lt;/li&gt;
&lt;li&gt;Clang compilation: 208 single-core (confirming our real-world compilation benchmark findings)&lt;/li&gt;
&lt;li&gt;HTML5 Browser: 180 single-core (even web-based robot control interfaces would lag)&lt;/li&gt;
&lt;li&gt;PDF Rendering: 200 single-core, 797 multi-core (document processing would crawl)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These synthetic benchmarks might seem academic, but they translate directly to real-world robotics performance. The navigation score predicts poor path planning performance. The Clang score explains the painful compilation times. The HTML5 browser score means even accessing web-based configuration interfaces will be sluggish. Every aspect of development and deployment on the X3 CM will feel slow because the processor is fundamentally inadequate.&lt;/p&gt;
&lt;h4&gt;What This Means for Real Workloads&lt;/h4&gt;
&lt;p&gt;The compilation benchmark translates directly to real-world robotics and AI development scenarios:&lt;/p&gt;
&lt;p&gt;Development iteration time: Compiling ROS2 packages, building custom nodes, and testing changes all take five times longer than on a Raspberry Pi 5. A developer waiting 20 minutes for a build on the Pi 5 will wait 100 minutes on the X3 CM.&lt;/p&gt;
&lt;p&gt;AI model training: While the BPU handles inference, any model training, data preprocessing, or optimization work runs on the Cortex-A53 cores at a glacial pace.&lt;/p&gt;
&lt;p&gt;Computer vision processing: Pre-BPU image processing, post-BPU result processing, and any vision algorithms not optimized for the Bernoulli architecture will execute slowly.&lt;/p&gt;
&lt;p&gt;Multi-tasking performance: Running ROS2, sensor drivers, motion controllers, and application logic simultaneously will strain the limited CPU resources. The cores will spend more time context switching than doing useful work.&lt;/p&gt;
&lt;h3&gt;The AI Promise: Hollow Marketing&lt;/h3&gt;
&lt;p&gt;Let's return to the central premise of the Horizon X3 CM: it's an AI-focused robotics platform with a dedicated Brain Processing Unit providing 5 TOPS of inference capability. Does this specialization justify the platform's shortcomings?&lt;/p&gt;
&lt;p&gt;The answer is a resounding no.&lt;/p&gt;
&lt;p&gt;First, 5 TOPS is not impressive by 2025 standards. The Google Coral TPU provides 4 TOPS in a USB dongle costing under $60. The NVIDIA Jetson Orin Nano provides 40 TOPS. Even smartphone SoCs like the Apple A17 Pro deliver over 35 TOPS. The Horizon X3's 5 TOPS might have been notable in 2020 when the chip was announced, but it's thoroughly uncompetitive five years later.&lt;/p&gt;
&lt;p&gt;Second, the BPU's usefulness is limited by the proprietary toolchain and model conversion requirements. You can't simply take a TensorFlow or PyTorch model and run it on the BPU. It must be converted using Horizon's tools, quantized to specific formats the Bernoulli architecture supports, and optimized for the dual-core BPU's execution model. The documentation for this process is scattered, incomplete, and assumes familiarity with Horizon's automotive-focused development flow.&lt;/p&gt;
&lt;p&gt;Third, the weak Cortex-A53 cores undermine any AI acceleration advantage. If your application spends 70% of its time in AI inference and 30% in CPU-bound tasks, accelerating the inference to near-zero still leaves you with performance dominated by the slow CPU. The system is only as fast as its slowest component, and the CPU is very slow.&lt;/p&gt;
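&lt;p&gt;This is Amdahl's law in miniature. Taking the illustrative 70/30 split above (an assumed split, not a measurement), even an infinitely fast BPU caps the whole-system speedup at 1 / 0.30, or about 3.3x, because the CPU-bound 30% never shrinks:&lt;/p&gt;

```python
def overall_speedup(accel_fraction, accel_factor):
    """Amdahl's law: whole-task speedup when only a fraction of it is accelerated."""
    serial = 1.0 - accel_fraction
    return 1.0 / (serial + accel_fraction / accel_factor)

# 70% of runtime in BPU inference, 30% stuck on the Cortex-A53 cores.
print(overall_speedup(0.70, 10.0))  # 10x faster inference yields only about 2.7x overall
print(overall_speedup(0.70, 1e9))   # effectively infinite acceleration tops out near 3.33x
```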
&lt;p&gt;Fourth, the ecosystem lock-in is severe. Code written for the Horizon BPU doesn't port to other platforms. Models optimized for Bernoulli architecture require re-optimization for other accelerators. Investing development time in Horizon-specific tooling is investing in a dead-end technology with an uncertain future.&lt;/p&gt;
&lt;p&gt;Compare this to the Raspberry Pi ecosystem, where you can add AI acceleration through well-supported options like the Coral TPU, Intel Neural Compute Stick, or Hailo-8 accelerator. These solutions work across the Pi 4, Pi 5, and other platforms, with mature Python APIs, extensive documentation, and active communities. The development you do with these accelerators transfers to other projects and platforms.&lt;/p&gt;
&lt;h3&gt;Documentation: Scarce and Scattered&lt;/h3&gt;
&lt;p&gt;Throughout our evaluation of the Horizon X3 CM, a consistent theme emerged: finding documentation for any task ranged from difficult to impossible. Want to understand the BPU's capabilities? The information is spread across d-robotics.cc, developer.d-robotics.cc, archived Horizon Robotics pages, and forums in both English and Chinese.&lt;/p&gt;
&lt;p&gt;Looking for example code? Some repositories on GitHub have examples, but they assume familiarity with Horizon's model conversion tools. The tools themselves have documentation, but it's automotive-focused and doesn't translate well to robotics applications.&lt;/p&gt;
&lt;p&gt;Need help troubleshooting a problem? The forums are sparsely populated, with many questions unanswered. The most reliable source of information is reverse-engineering what other users have done and hoping it works on your hardware revision.&lt;/p&gt;
&lt;p&gt;This stands in stark contrast to the Raspberry Pi ecosystem, where every sensor, every module, every software package has multiple tutorials, forums full of discussions, YouTube videos, and GitHub repositories with example code. The Pi's ubiquity means that any problem you encounter has likely been solved multiple times by others.&lt;/p&gt;
&lt;h3&gt;The YouTube Deception&lt;/h3&gt;
&lt;p&gt;It's worth addressing the YouTube videos that demonstrate the Horizon X3 running robotics applications, performing object detection, and controlling robot platforms. These videos create the impression that the X3 is a viable robotics platform. They're not technically dishonest - the hardware can do these things - but they omit the critical context that makes the X3 a poor choice.&lt;/p&gt;
&lt;p&gt;These demonstrations typically show:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Custom-built systems where someone has already overcome the configuration hurdles&lt;/li&gt;
&lt;li&gt;Specific AI models that have been painstakingly optimized for the BPU&lt;/li&gt;
&lt;li&gt;Applications that carefully avoid the CPU bottlenecks&lt;/li&gt;
&lt;li&gt;No comparisons to how the same task performs on alternative platforms&lt;/li&gt;
&lt;li&gt;No discussion of development time, toolchain difficulties, or ecosystem limitations&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What they don't show is the hours spent fixing sudo, configuring networks, battling documentation gaps, and waiting for slow compilation. They don't mention that achieving the same functionality on a Raspberry Pi 5 with a Coral TPU would be faster to develop, more performant, better documented, and more maintainable.&lt;/p&gt;
&lt;p&gt;The YouTube demonstrations are real, but they represent the absolute best case: experienced developers who've mastered the platform's quirks showing carefully crafted demos. They do not represent the typical user experience.&lt;/p&gt;
&lt;h3&gt;Who Is This For? (No One)&lt;/h3&gt;
&lt;p&gt;Attempting to identify the target audience for the Horizon X3 CM reveals its fundamental problem: there isn't a clear use case where it's the best choice.&lt;/p&gt;
&lt;p&gt;Beginners: Absolutely not. The broken sudo, network configuration challenges, scattered documentation, and proprietary toolchain create insurmountable barriers for someone learning robotics development. A beginner choosing the X3 will spend 90% of their time fighting the platform and 10% actually learning robotics.&lt;/p&gt;
&lt;p&gt;Intermediate developers: Still no. Someone with Linux experience and basic robotics knowledge will be frustrated by the X3's limitations. They have the skills to configure the system, but they'll quickly realize they're wasting time on a platform that's slower, less documented, and more restrictive than alternatives.&lt;/p&gt;
&lt;p&gt;Advanced developers: Why would they choose this? An advanced developer evaluating SBC options will immediately recognize the Cortex-A53's limitations, the proprietary BPU lock-in, and the ecosystem fragmentation. They'll choose a Raspberry Pi with modular acceleration, or an NVIDIA Jetson if they need serious AI performance, or an x86 platform if they need raw CPU power.&lt;/p&gt;
&lt;p&gt;Automotive developers: This is Horizon's actual target market, but they're not using the off-the-shelf RDK X3 boards. They're integrating the Sunrise chips into custom hardware with proprietary board support packages, automotive-grade Linux distributions, and Horizon's professional support contracts.&lt;/p&gt;
&lt;p&gt;The hobbyist robotics market that the RDK X3 ostensibly targets is better served by literally any other option. The Raspberry Pi ecosystem offers superior hardware, vastly better documentation, more active communities, and modular expandability. Even the aging Raspberry Pi 4 is arguably a better choice than the X3 CM for most robotics projects.&lt;/p&gt;
&lt;h3&gt;Conclusion: An Irrelevant Platform in 2025&lt;/h3&gt;
&lt;p&gt;The Horizon X3 CM represents a failed experiment in bringing automotive AI technology to the robotics hobbyist market. The hardware is built on outdated ARM cores that were unimpressive when they launched in 2012 and are thoroughly inadequate in 2025. The AI acceleration, while technically present, is hamstrung by weak CPUs, proprietary tooling, and an abandoned software ecosystem. The software distributions ship broken, requiring extensive manual fixes to achieve basic functionality.&lt;/p&gt;
&lt;p&gt;Our performance testing confirms what the specifications suggest: the X3 CM is approximately five times slower than a current-generation Raspberry Pi 5 for CPU-bound workloads. Both our real-world Rust compilation benchmarks and industry-standard Geekbench 6 synthetic tests show consistent results - the X3 CM delivers single-core performance 6x slower and multi-core performance 4-7x slower than modern competition. The BPU's 5 TOPS of AI acceleration cannot compensate for this massive performance deficit, and the proprietary nature of the Bernoulli architecture creates vendor lock-in without providing compelling advantages.&lt;/p&gt;
&lt;p&gt;The documentation situation is dire, with information scattered across multiple sites in multiple languages, many links pointing to archived or defunct resources. The corporate structure - Horizon Robotics abandoning public development while D-Robotics maintains forks - raises serious questions about long-term support and viability.&lt;/p&gt;
&lt;p&gt;For anyone considering robotics development in 2025, the recommendation is clear: avoid the Horizon X3 CM. If you're a beginner, start with a Raspberry Pi 5 - you'll have vastly more resources available, a supportive community, and hardware that won't frustrate you at every turn. If you're an intermediate or advanced developer, the Pi 5 with optional AI acceleration (Coral TPU, Hailo-8) will give you more flexibility, better performance, and a lower total cost of ownership. If you need serious AI horsepower, look at NVIDIA's Jetson line, which provides professional-grade AI acceleration with mature tooling and extensive documentation.&lt;/p&gt;
&lt;p&gt;The Horizon X3 CM is a platform that perhaps made sense when announced in 2020-2021, competing against the Raspberry Pi 4 and targeting a market that was just beginning to explore edge AI. But time has not been kind. The ARM cores have aged poorly, the software ecosystem never achieved critical mass, and the corporate support has evaporated. In 2025, choosing the Horizon X3 CM for a new robotics project is choosing to fight your tools rather than build your robot.&lt;/p&gt;
&lt;p&gt;The most damning evidence is this: even the Orange Pi RV2, running a brand-new RISC-V processor with an immature compiler toolchain and experimental software stack, is only 1.72x slower than the X3 CM. An experimental architecture with bleeding-edge hardware and alpha-quality software performs almost as well as an established ARM platform with supposedly mature tooling. Both our real-world compilation benchmarks and Geekbench 6 synthetic tests confirm the X3 CM's performance is comparable to a decade-old iPhone 6s processor - a smartphone chip from 2015 outperforms this 2021-2022 era robotics development platform. This speaks volumes about just how underpowered and poorly optimized the Horizon X3 CM truly is.&lt;/p&gt;
&lt;p&gt;Save yourself the frustration. Build your robot on a platform that respects your time, provides the tools you need, and has a future. The Raspberry Pi ecosystem is the obvious choice, but almost any alternative - even commodity x86 mini-PCs - would serve you better than the Horizon X3 CM.&lt;/p&gt;
&lt;h3&gt;Specifications Summary&lt;/h3&gt;
&lt;p&gt;For reference, here are the complete specifications of the Horizon X3 CM:&lt;/p&gt;
&lt;p&gt;Processor:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Sunrise X3 SoC (16nm process)&lt;/li&gt;
&lt;li&gt;Quad-core ARM Cortex-A53 @ 1.5 GHz&lt;/li&gt;
&lt;li&gt;Single ARM Cortex-R5 core&lt;/li&gt;
&lt;li&gt;Dual-core Bernoulli 2.0 BPU (5 TOPS AI inference)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Memory &amp;amp; Storage:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;2GB or 4GB LPDDR4 RAM&lt;/li&gt;
&lt;li&gt;8GB/16GB/32GB eMMC options&lt;/li&gt;
&lt;li&gt;MicroSD card slot&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Video:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;4K@60fps H.264/H.265 encoding&lt;/li&gt;
&lt;li&gt;4K@60fps decoding&lt;/li&gt;
&lt;li&gt;HDMI 2.0 output&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Interfaces:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;2x MIPI CSI (camera input)&lt;/li&gt;
&lt;li&gt;1x MIPI DSI (display output)&lt;/li&gt;
&lt;li&gt;2x USB 3.0&lt;/li&gt;
&lt;li&gt;Gigabit Ethernet&lt;/li&gt;
&lt;li&gt;40-pin GPIO header&lt;/li&gt;
&lt;li&gt;I2C, SPI, UART, PWM&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Physical:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;200-pin board-to-board connector (CM4-compatible)&lt;/li&gt;
&lt;li&gt;Dimensions: 55mm x 40mm&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Software:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ubuntu 20.04/22.04 based distributions&lt;/li&gt;
&lt;li&gt;ROS2 support (in theory)&lt;/li&gt;
&lt;li&gt;Horizon OpenExplorer development tools&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Benchmark Performance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Rust compilation: 378.81 seconds average (5x slower than Raspberry Pi 5)&lt;/li&gt;
&lt;li&gt;Geekbench 6 Single-Core: 127 (6x slower than Raspberry Pi 5)&lt;/li&gt;
&lt;li&gt;Geekbench 6 Multi-Core: 379 (4-7x slower than modern ARM SBCs)&lt;/li&gt;
&lt;li&gt;Geekbench Link: https://browser.geekbench.com/v6/cpu/14816041&lt;/li&gt;
&lt;li&gt;Relative performance: 1.72x faster than experimental RISC-V, 6x slower than modern ARM&lt;/li&gt;
&lt;li&gt;Performance comparable to iPhone 6s (2015) in single-core workloads&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Recommendation: Avoid. Use Raspberry Pi 5 or equivalent instead.&lt;/p&gt;</description><category>ai acceleration</category><category>arm cortex a53</category><category>benchmarks</category><category>bpu</category><category>edge ai</category><category>geekbench</category><category>hardware review</category><category>horizon robotics</category><category>robotics</category><category>ros2</category><category>single board computers</category><category>sunrise x3</category><category>x3 cm</category><guid>https://tinycomputers.io/posts/horizon-robotics-x3-cm-review.html</guid><pubDate>Mon, 03 Nov 2025 18:48:52 GMT</pubDate></item></channel></rss>