As you will have seen, OpenAI has just launched two new AI models: GPT-OSS-120B and GPT-OSS-20B, the company's first open-weight models since GPT-2.
These two models, one more compact and the other much larger, are defined by the fact that you can run them locally. They'll work on your desktop PC or laptop, right on the device, with no need to go online or tap cloud computing power, provided your hardware is powerful enough.
So, you can download the 20B version, or, if your PC is a powerful machine, the 120B variant, and play around with it on your computer, checking out how it works (in a text-to-text fashion) and how the model thinks (its full reasoning process is broken down into steps). And, indeed, you can tweak and build on these open models, although safety guardrails and censorship measures will, of course, remain in place.
But what kind of hardware do you need to run these AI models? In this article, I'm examining the PC spec requirements for both GPT-OSS-20B, the more compact model that packs 21 billion parameters, and GPT-OSS-120B, which offers 117 billion parameters. The latter is designed for data center use, but will run on a high-end PC, whereas GPT-OSS-20B is the model designed specifically for consumer devices.
Indeed, when announcing these new AI models, Sam Altman referred to the 20B model working not just on laptops, but also on smartphones; suffice it to say that's an ambitious claim, and one I'll come back to later.
These models can be downloaded from Hugging Face (here's GPT-OSS-20B and here's GPT-OSS-120B) under the Apache 2.0 license, or, for the merely curious, there's an online demo you can check out (no download necessary).
The smaller GPT-OSS-20B model
Minimum RAM needed: 16 GB
The official OpenAI documentation simply provides a required amount of RAM for these AI models, which in the case of this more compact GPT-OSS-20B effort is 16 GB.
This means you can run GPT-OSS-20B on any laptop or PC that has 16 GB of system memory (or 16 GB of video RAM, or a combo of both). However, this is very much a case of more being better, or faster, rather. The model can chug along with that bare minimum of 16 GB, but ideally you'll want a bit more headroom.
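To get a feel for where that 16 GB figure comes from, here's a rough back-of-envelope sketch in Python, purely for illustration. It assumes roughly 4 bits per weight, since OpenAI ships these models with MXFP4-quantized MoE weights; the real footprint will be higher once activations and context are factored in.

def approx_weight_gb(params_billion, bits_per_weight=4):
    """Rough weight-memory estimate: parameters x bits per weight, in gigabytes."""
    return params_billion * bits_per_weight / 8

# gpt-oss-20b: 21 billion parameters at roughly 4 bits each
print(approx_weight_gb(21))    # ~10.5 GB of weights, leaving some headroom within 16 GB
# gpt-oss-120b: 117 billion parameters
print(approx_weight_gb(117))   # ~58.5 GB of weights, hence the 80 GB guidance further down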
As for the CPU, AMD recommends using a Ryzen AI 300 series processor paired with 32 GB of memory (with half of that, 16 GB, allocated as Variable Graphics Memory). For the GPU, AMD recommends any RX 7000 or 9000 model that has 16 GB of memory, though these aren't hard-and-fast requirements as such.
Really, the key factor is simply having enough memory: the aforementioned 16 GB allocation, and preferably having all of that on your GPU. That allows all the work to take place on the graphics card, without it being slowed down by having to offload some of the model to the PC's system memory. Fortunately, the so-called Mixture of Experts, or MoE, design OpenAI has used here helps to minimize that performance drag.
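For the curious, here's a minimal toy sketch (Python with NumPy; not OpenAI's actual code) of what Mixture of Experts routing looks like, and why only a small slice of the model's total parameters does work for any given token:

import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # toy figure for illustration; real MoE models use more
TOP_K = 2         # experts that actually run per token
D_MODEL = 16      # toy hidden size

# Each "expert" is just a small feed-forward weight matrix here.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS))

def moe_forward(x):
    """Route a token vector to its top-k experts and mix their outputs."""
    scores = x @ router                  # one router logit per expert
    top = np.argsort(scores)[-TOP_K:]    # pick the best-scoring experts
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over the chosen few
    # Only TOP_K of NUM_EXPERTS weight matrices are touched for this token,
    # which is why an MoE model's effective compute per token is far below
    # what its total parameter count would suggest.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
print(moe_forward(token).shape)  # (16,)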
Anecdotally, to pick an example drawn from Reddit, GPT-OSS-20B runs fine on an M3 MacBook with 18 GB of memory.
The larger GPT-OSS-120B model
Minimum RAM needed: 80 GB
It's the same general deal with the beefier GPT-OSS-120B model, except, as you might guess, it demands a lot more memory. Officially, that means 80 GB, though remember that you don't have to have all of that RAM on your graphics card. That said, this big AI model is really designed for data center use on a GPU with 80 GB of memory on board.
However, the RAM allocation can be split. So, you can run GPT-OSS-120B on a computer with 64 GB of system memory plus a 24 GB graphics card (an Nvidia RTX 3090 Ti, for example, as per this Redditor), which makes for 88 GB of pooled RAM in total.
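The arithmetic is simply additive, as this trivial illustrative check shows (a hypothetical helper, not an official sizing tool), bearing in mind that anything spilling out of VRAM into system RAM will run more slowly:

def pooled_memory_ok(vram_gb, system_ram_gb, required_gb=80):
    """Does VRAM plus system RAM clear the model's stated requirement?"""
    return vram_gb + system_ram_gb >= required_gb

print(pooled_memory_ok(24, 64))   # True: RTX 3090 Ti (24 GB) + 64 GB system RAM = 88 GB pooled
print(pooled_memory_ok(16, 32))   # False: 48 GB pooled falls well short of the 80 GB bar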
AMD's recommendation in this case, CPU-wise, is for its Ryzen AI Max+ 395 processor, paired with 128 GB of system RAM (with 96 GB of that allocated as Variable Graphics Memory).
In other words, you're looking at a high-end workstation laptop or desktop (perhaps one with multiple GPUs) for GPT-OSS-120B. You may be able to get away with a bit less than the stipulated 80 GB of memory, going by some anecdotal reports, but I wouldn't count on it by any means.
How to run these models on your PC
Assuming you meet the system requirements outlined above, you can run either of these new GPT-OSS releases in Ollama, which is OpenAI's platform of choice for using these models.
Go here to get Ollama for your PC (Windows, Mac, or Linux): click the button to download the executable, and when it's finished downloading, double-click the executable file to run it, then click Install.
Next, run the following two commands in Ollama to fetch and then run the model you want. In the example below, we're running GPT-OSS-20B, but if you want the larger model, simply replace 20b with 120b.
ollama pull gpt-oss:20b
ollama run gpt-oss:20b
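Once the model is up and running, you can also talk to it programmatically. Here's a short Python sketch using Ollama's local REST API, which listens on localhost:11434 by default (the prompt is just an example):

import json
import urllib.request

payload = {
    "model": "gpt-oss:20b",   # swap for "gpt-oss:120b" if you pulled the big one
    "messages": [{"role": "user", "content": "In one sentence, what is an open-weight model?"}],
    "stream": False,          # ask for a single JSON reply rather than a stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)
print(reply["message"]["content"])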
If you'd prefer an alternative to Ollama, you can use LM Studio instead, via the following command. Again, you can swap 20b for 120b, or vice versa, as appropriate:
lms get openai/gpt-oss-20b
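LM Studio can also serve downloaded models over an OpenAI-compatible local API (on localhost:1234 by default, once you start the server from the app or with lms server start). Here's a sketch of that in Python, assuming the served model name matches the identifier above; check LM Studio's server view for the exact name it exposes:

# Requires the openai package: pip install openai
from openai import OpenAI

# The API key is unused by the local server but required by the client.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
response = client.chat.completions.create(
    model="openai/gpt-oss-20b",   # assumed to match the downloaded identifier
    messages=[{"role": "user", "content": "Summarize what Mixture of Experts means in one line."}],
)
print(response.choices[0].message.content)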
Windows 11 (or 10) users can exercise the option of Windows AI Foundry (hat tip to The Verge).
In this case, you'll need to install Foundry Local; there's a caveat here, though, as this is still in preview. See this guide for the full instructions on what to do. Also note that, for now, you'll need an Nvidia graphics card with 16 GB of VRAM on board (though other GPUs, such as AMD Radeon models, will be supported eventually; remember, this is still a preview release).
Furthermore, macOS support "will arrive soon," we're told.
What about smartphones?
As noted at the outset, while Sam Altman said the smaller AI model runs on a phone, that claim is pushing it.
It's true that Qualcomm issued a press release (as spotted by Android Authority) about GPT-OSS-20B running on devices with Snapdragon chips, but this is more about laptops: Copilot+ PCs with Snapdragon X silicon, rather than smartphone CPUs.
Running GPT-OSS-20B isn't a realistic proposition for today's phones, even if it may be technically possible (assuming your handset has 16 GB of RAM or more). Even so, I doubt the results would be impressive.
That said, we're probably not far from models of this kind working properly on smartphones, and that's surely in the cards for the near enough future.