Anthropic has just launched a new model called Claude 3.7 Sonnet, and although I am always interested in the latest capabilities of AI, it was the new "extended thinking" mode that really caught my attention. It reminded me of how OpenAI first debuted its o1 model for ChatGPT. It offered a way to access o1 without leaving a chat window running the GPT-4o model. You could type "/reason", and the AI chatbot would use o1 instead. That shortcut is now superfluous, although it still works in the app. Either way, the deeper, more structured reasoning promised by both made me want to see how they would stack up against each other.
Claude 3.7's extended thinking mode is designed as a hybrid reasoning tool, giving users the option to alternate between quick, conversational responses and deep, step-by-step problem-solving. It takes time to analyze your prompt before delivering its answer, which makes it excellent for mathematics, coding, and logic. You can even adjust the balance between speed and depth by setting a limit on how long it thinks about its response. Anthropic positions this as a way of making AI more useful for real-world applications that require methodical, layered problem-solving instead of surface-level answers.
Accessing Claude 3.7's extended thinking requires a Claude Pro subscription, so I decided to use Anthropic's video demonstration below as my test case. To challenge extended thinking, Anthropic asked the AI to analyze and explain the classic probability puzzle known as the Monty Hall problem. It is a deceptively tricky question that trips up many people, even those who consider themselves good at math.
The setup is simple: you are on a game show and asked to choose one of three doors. Behind one is a car; behind the others, goats. On a whim, Anthropic decided to go with crabs instead of goats, but the principle is the same. After you make your choice, the host, who knows what is behind each door, opens one of the remaining two doors to reveal a goat (or crab). Now you have a choice: stick with your original selection or switch to the last unopened door. Most people assume it doesn't matter, but counterintuitively, switching actually gives you a 2/3 chance of winning, while staying with your first pick leaves you with just a 1/3 chance.
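If you don't trust the math, the switching advantage is easy to verify empirically. Here is a minimal Python sketch (my own, not taken from either chatbot's output) that runs repeated trials of the game and tallies wins for each strategy:

```python
import random

def monty_hall(trials=100_000):
    """Simulate the Monty Hall game and return the win rate for
    staying vs. switching."""
    stay_wins = switch_wins = 0
    for _ in range(trials):
        doors = [0, 1, 2]
        car = random.choice(doors)    # door hiding the car
        pick = random.choice(doors)   # contestant's first pick
        # Host opens a door that is neither the pick nor the car
        opened = random.choice([d for d in doors if d != pick and d != car])
        # Switching means taking the one remaining closed door
        switched = next(d for d in doors if d != pick and d != opened)
        stay_wins += (pick == car)
        switch_wins += (switched == car)
    return stay_wins / trials, switch_wins / trials

stay, switch = monty_hall()
print(f"stay: {stay:.3f}, switch: {switch:.3f}")  # roughly 0.333 vs 0.667
```

Over enough trials, staying converges to about 1/3 and switching to about 2/3, which is exactly the kind of repeated-trial demonstration described below.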
Clearing out bad options
With extended thinking enabled, Claude 3.7 took a measured, almost academic approach to explaining the problem. Instead of just declaring the correct answer, it carefully laid out the underlying logic in multiple steps, emphasizing why the probabilities change after the host reveals a crab. It didn't explain only in terms of dry math, either. Claude ran through hypothetical scenarios, demonstrating how the probabilities play out over repeated trials, which makes it much easier to understand why switching is always the best move. The answer wasn't rushed; it felt like a teacher guiding me through it slowly and deliberately, making sure I really understood why the common intuition is wrong.
ChatGPT o1 offered a great breakdown as well and explained the problem capably. In fact, it explained it in multiple forms and styles. Along with the basic probability, it also went through game theory, narrative framings, psychological perspectives, and even an economic breakdown. If anything, it was a bit overwhelming.
Gameplay
However, that is not all Claude's extended thinking could do. As you can see in the video, Claude could even build a playable version of the Monty Hall problem right in the chat window. Trying the same prompt with ChatGPT o1 didn't produce the same result. Instead, ChatGPT wrote an HTML script for a simulation of the problem that I could save and open in my browser. It worked, as you can see below, but it took a few extra steps.
While there are almost certainly small quality differences depending on the kind of code or math you are working on, both Claude's extended thinking and the ChatGPT model offer solid, analytical approaches to logic problems. I can see the advantage of the adjustable reasoning time and depth that Claude offers. That said, unless you are really in a hurry or demand unusually heavy analysis, ChatGPT doesn't take too long and produces plenty from its deliberation.
The ability to turn the problem into a simulation within the chat is much more notable. It makes Claude feel more flexible and powerful, even if the actual simulation probably runs on code very similar to the HTML ChatGPT wrote.