- The Raspberry Pi AI HAT+ 2 lets the Raspberry Pi 5 run LLMs locally
- Its Hailo-10H accelerator delivers 40 TOPS of INT4 inference performance
- A PCIe interface provides high-bandwidth communication between the board and the Raspberry Pi 5
Raspberry Pi has expanded its edge computing ambitions with the launch of AI HAT+ 2, an add-on board designed to bring generative AI workloads to the Raspberry Pi 5.
Previous AI HAT hardware focused almost entirely on computer vision acceleration, handling tasks such as object detection and scene segmentation.
The new board extends that reach by supporting large language models and vision language models that run locally, without relying on cloud infrastructure or persistent network access.
Hardware changes that enable local language models
At the center of the upgrade is the Hailo-10H neural network accelerator, which delivers 40 TOPS of INT4 inference performance.
Unlike its predecessor, the AI HAT+ 2 features 8GB of dedicated onboard memory, allowing larger models to run without consuming system RAM on the host Raspberry Pi.
This change enables direct execution of LLMs and VLMs on the device while maintaining low latency and keeping data local, a key requirement for many edge deployments.
Using a standard Raspberry Pi distribution, users can install compatible models and access them through familiar interfaces, such as browser-based chat tools.
The AI HAT+ 2 connects to the Raspberry Pi 5 via the GPIO header and relies on the system’s PCIe interface for data transfer, ruling out compatibility with the Raspberry Pi 4.
This connection supports high-bandwidth data transfer between the accelerator and the host, which is essential for moving model inputs, outputs, and camera data efficiently.
Demonstrations include text-based question answering with Qwen2, code generation using Qwen2.5-Coder, basic translation tasks, and visual descriptions of scenes from live camera feeds.
These workloads rely on AI tools packaged to run within the Pi software stack, including containerized backends and local inference servers.
All processing occurs on the device, without external computing resources.
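To give a sense of what those local inference servers look like in use, here is a minimal Python sketch that sends a prompt to a server assumed to be running on the Pi itself; the port, endpoint path, and model identifier are illustrative assumptions, not the exact interface shipped with the AI HAT+ 2 software stack.

```python
# Minimal sketch: query a local inference server running on the Raspberry Pi.
# The URL, port, and model name below are placeholders for illustration,
# assuming an OpenAI-compatible chat endpoint is exposed on localhost.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",   # assumed local endpoint
    json={
        "model": "qwen2",                           # placeholder model identifier
        "messages": [
            {"role": "user", "content": "Describe the scene in the latest camera frame."}
        ],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the request never leaves localhost, the prompt and the response stay on the device, which is the privacy property the local-only setup is meant to provide.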
Supported models range from 500 million to 1 billion parameters, which is modest compared with cloud-based systems that operate at far larger scales.
These smaller LLMs are designed to fit within tight memory and power budgets rather than to provide broad, general-purpose knowledge.
To address this limitation, the AI HAT+ 2 supports fine-tuning methods such as Low-Rank Adaptation (LoRA), which allow developers to customize models for specific tasks while leaving most parameters unchanged.
Vision models can also be retrained using application-specific data sets through the Hailo toolchain.
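To make the Low-Rank Adaptation idea concrete, here is a short Python sketch using the Hugging Face peft library. It shows a generic LoRA setup for a small causal language model, not the Hailo toolchain's own retraining flow, and the model name and hyperparameters are illustrative assumptions.

```python
# Generic LoRA fine-tuning setup with the Hugging Face peft library.
# Model name and hyperparameters are illustrative, not Hailo-specific.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")  # small model, for illustration

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling applied to the adapter update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # adapt only the attention projections
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
# Training then proceeds as usual (e.g. with transformers.Trainer); only the
# small adapter weights are updated while the base model stays frozen.
```

Because only the adapter matrices are trained, this kind of task-specific customization stays within the memory and compute budget of a small edge model.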
The AI HAT+ 2 is available for $130, putting it above previous vision-focused accessories in price while offering similar computer vision performance.
For workloads focused solely on image processing, the upgrade offers limited benefits, as its appeal depends largely on local LLM execution and privacy-sensitive applications.
In practical terms, the board shows that generative AI on Raspberry Pi hardware is now feasible, although limited memory and small model sizes remain constraints.