- Exo supports Llama, Mistral, LLaVA, Qwen, and DeepSeek
- It runs on Linux, macOS, Android, and iOS, but not Windows
- An AI model that needs 16 GB of RAM can run on two 8 GB laptops
Running large language models (LLMs) generally requires expensive, high-performance hardware with substantial memory and GPU power. The Exo software now offers an alternative: distributed artificial intelligence (AI) inference across a network of devices.
The project lets users pool the computing power of multiple computers, smartphones, and even single-board computers (SBCs) such as Raspberry Pis to run models that would otherwise be inaccessible.
This decentralized approach shares similarities with the SETI@home project, which distributed computing tasks across volunteer machines. By taking advantage of a peer-to-peer (P2P) network, Exo eliminates the need for a single powerful system, making AI inference more accessible to individuals and organizations.
How Exo distributes AI workloads
Exo aims to challenge the dominance of large technology companies in artificial intelligence development. By decentralizing inference, it seeks to give individuals and smaller organizations more control over AI models, similar to initiatives focused on expanding access to GPU resources.
“The fundamental constraint with AI is compute,” argues Alex Cheema, co-founder of Exo Labs. “If you don’t have the compute, you can’t compete. But if you create this distributed network, maybe we can.”
The software dynamically partitions an LLM across the devices available on the network, assigning model layers based on each machine's memory and processing power. Supported LLMs include Llama, Mistral, LLaVA, Qwen, and DeepSeek.
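To make the idea concrete, here is a minimal Python sketch of memory-weighted layer partitioning. The `Device` class and `partition_layers` helper are illustrative assumptions, not Exo's actual API; they simply mirror the proportional-to-memory strategy described above.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    ram_gb: float  # memory available for model weights

def partition_layers(devices: list[Device], num_layers: int) -> dict[str, range]:
    """Assign contiguous blocks of model layers to devices in
    proportion to each device's available memory."""
    total_ram = sum(d.ram_gb for d in devices)
    assignments: dict[str, range] = {}
    start = 0
    for i, d in enumerate(devices):
        if i == len(devices) - 1:
            count = num_layers - start  # last device takes the remainder
        else:
            count = round(num_layers * d.ram_gb / total_ram)
        assignments[d.name] = range(start, start + count)
        start += count
    return assignments

# Two 8 GB laptops and a 16 GB desktop splitting a 32-layer model:
cluster = [Device("laptop-a", 8), Device("laptop-b", 8), Device("desktop", 16)]
print(partition_layers(cluster, num_layers=32))
# {'laptop-a': range(0, 8), 'laptop-b': range(8, 16), 'desktop': range(16, 32)}
```

Each device ends up hosting a contiguous slice of layers sized to its memory, so activations only need to travel between neighbors in the chain.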
Users can install Exo on Linux, macOS, Android, or iOS, although Windows support is not currently available. Python 3.12.0 or newer is required, along with additional drivers for Linux systems equipped with NVIDIA GPUs.
One of Exo's key strengths is that, unlike traditional setups that depend on high-end GPUs, it allows collaboration between different hardware configurations.
For example, an AI model that requires 16 GB of RAM can run on two 8 GB laptops working together. A more demanding model such as DeepSeek R1, which requires approximately 1.3 TB of RAM, could theoretically operate on a cluster of 170 Raspberry Pi 5 boards with 8 GB of RAM each.
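The arithmetic behind those figures is easy to sanity-check (a hedged sketch; `devices_needed` is a hypothetical helper that ignores runtime overhead):

```python
import math

def devices_needed(model_ram_gb: float, device_ram_gb: float) -> int:
    """Minimum number of devices whose combined RAM covers the model."""
    return math.ceil(model_ram_gb / device_ram_gb)

print(devices_needed(16, 8))    # 2   -> two 8 GB laptops for a 16 GB model
print(devices_needed(1300, 8))  # 163 -> the article's 170 Pis leave some headroom
```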
Network speed and latency are critical concerns. Exo's developers acknowledge that adding lower-performance devices can increase inference latency, but they insist that overall throughput improves with each device added to the network.
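A toy pipeline model illustrates that trade-off (hedged, with invented numbers; this is not a measurement of Exo itself): per-token latency grows with the number of stages and network hops, while steady-state throughput, with several requests in flight, is bounded only by the slowest stage.

```python
def pipeline_stats(stage_ms: list[float], hop_ms: float) -> tuple[float, float]:
    """Single-token latency: sum of all stage compute times plus one
    network hop per stage. Steady-state throughput (tokens/second,
    with requests pipelined): bounded by the slowest stage."""
    latency = sum(stage_ms) + hop_ms * len(stage_ms)
    throughput = 1000 / max(stage_ms)
    return latency, throughput

# One fast machine vs. the same layers split across three slower devices:
print(pipeline_stats([30.0], hop_ms=5.0))              # (35.0, ~33 tok/s)
print(pipeline_stats([15.0, 15.0, 15.0], hop_ms=5.0))  # (60.0, ~67 tok/s)
```

In this made-up scenario, splitting the model raises single-request latency (60 ms vs. 35 ms) while improving aggregate throughput, which matches the developers' claim.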
Security risks also arise when multiple machines share workloads, requiring safeguards against data leaks and unauthorized access.
Adoption is another hurdle, since AI tool developers currently depend on large-scale data centers. The low cost of Exo's approach may appeal, but it simply will not match the speed of those high-end AI clusters.
Via CNX Software