Anthropic just released a new model called Claude 3.7 Sonnet, and while I’m always interested in the latest AI capabilities, it was the new “extended” thinking mode that really caught my eye. It reminded me of how OpenAI first debuted its o1 model for ChatGPT: you could access o1 without leaving a conversation with the GPT-4o model by typing “/reason,” and the chatbot would answer with o1 instead. The command is superfluous now, though it still works in the app. Regardless, the deeper, more structured reasoning promised by both made me want to see how they would do against one another.
Claude 3.7’s Extended Thinking mode is designed as a hybrid reasoning tool, giving users the option to toggle between quick, conversational responses and in-depth, step-by-step problem-solving. With it enabled, the model takes time to analyze your prompt before delivering its answer, which makes it well suited to math, coding, and logic. You can even fine-tune the balance between speed and depth by giving it a budget for how long to think about its response. Anthropic positions this as a way to make AI more useful for real-world applications that require layered, methodical problem-solving, as opposed to just surface-level responses.
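For developers, Anthropic exposes the same depth control through its API, where the “budget” is measured in tokens of reasoning rather than a literal clock. Here’s a minimal sketch, assuming the anthropic Python SDK and the model ID Anthropic published at launch; the prompt and budget values are illustrative:

```python
# Minimal sketch of enabling extended thinking via Anthropic's API.
# Assumes the `anthropic` Python SDK and an ANTHROPIC_API_KEY set in
# the environment; the budget value here is just an example.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=20000,  # must be larger than the thinking budget
    # "budget_tokens" caps how much the model may spend reasoning
    # before it writes its visible answer.
    thinking={"type": "enabled", "budget_tokens": 16000},
    messages=[{"role": "user", "content": "Walk me through the Monty Hall problem."}],
)

# The reply interleaves "thinking" blocks with the final "text" blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)
```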
Accessing Claude 3.7 requires a subscription to Claude Pro, so I decided to use the demonstration in the video below as my test instead. To challenge Extended Thinking mode, Anthropic asked the AI to analyze and explain the classic probability puzzle known as the Monty Hall Problem. It’s a deceptively tricky question that stumps a lot of people, even those who consider themselves good at math.
The setup is simple: you’re on a game show and asked to pick one of three doors. Behind one is a car; behind the others, goats. On a whim, Anthropic decided to go with crabs instead of goats, but the principle is the same. After you make your choice, the host, who knows what’s behind each door, opens one of the remaining two to reveal a goat (or crab). Now you have a choice: stick with your original pick or switch to the last unopened door. Most people assume it doesn’t matter, but counterintuitively, switching actually gives you a 2/3 chance of winning, while sticking with your first choice leaves you with just a 1/3 probability. The reason is that your first pick is right only 1/3 of the time, and the host’s reveal tells you nothing new about your own door, so the remaining 2/3 of the probability collapses onto the one door left unopened.
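If that still feels wrong, you can check it by brute force. This short Python sketch is my own, not something either chatbot produced; it simply enumerates every equally likely combination of car location and first pick and counts who wins:

```python
from itertools import product

# All 9 equally likely (car location, first pick) combinations.
stay_wins = switch_wins = 0
for car, pick in product(range(3), repeat=2):
    # The host opens a door that is neither your pick nor the car.
    # (When your pick IS the car, he could open either remaining door;
    # which one he chooses doesn't change who wins.)
    opened = next(d for d in range(3) if d != pick and d != car)
    switched = next(d for d in range(3) if d != pick and d != opened)
    stay_wins += (pick == car)
    switch_wins += (switched == car)

print(f"stay:   {stay_wins}/9")    # 3/9 -> 1/3
print(f"switch: {switch_wins}/9")  # 6/9 -> 2/3
```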
Crabby Choices

With Extended Thinking enabled, Claude 3.7 took a measured, almost academic approach to explaining the problem. Instead of just stating the correct answer, it carefully laid out the underlying logic in multiple steps, emphasizing why the probabilities shift after the host reveals a crab. It didn’t just explain in dry math terms, either. Claude ran through hypothetical scenarios, demonstrating how the probabilities played out over repeated trials, making it much easier to grasp why switching is always the better move. The response wasn’t rushed; it felt like having a professor walk me through it in a slow, deliberate manner, ensuring I truly understood why the common intuition was wrong.
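That repeated-trials style of argument is easy to reproduce yourself. Here’s a quick Monte Carlo sketch, again my own rather than Claude’s actual output, that plays the game many times under each strategy:

```python
import random

def play(switch: bool, trials: int = 100_000) -> float:
    """Simulate the Monty Hall game and return the win rate."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # Host opens a door hiding a crab (or goat) that wasn't picked.
        opened = random.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(f"stick:  {play(switch=False):.3f}")  # ~0.333
print(f"switch: {play(switch=True):.3f}")   # ~0.667
```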
ChatGPT o1 offered just as thorough a breakdown and explained the issue well. In fact, it explained it in multiple forms and styles: along with the basic probability, it went through game theory, a narrative view, the psychological experience, and even an economic breakdown. If anything, it was a little overwhelming.
Gameplay
That’s not all Claude’s Extended Thinking could do, though. As you can see in the video, Claude was even able to turn the Monty Hall Problem into a game you could play right in the window. Attempting the same prompt with ChatGPT o1 didn’t produce quite the same result. Instead, ChatGPT wrote an HTML script for a simulation of the problem that I could save and open in my browser. It worked, as you can see below, but it took a few extra steps.
While there are almost certainly small differences in quality depending on the kind of code or math you’re working on, both Claude’s Extended Thinking and ChatGPT’s o1 model offer solid, analytical approaches to logical problems. I can see the advantage of adjusting the time and depth of reasoning the way Claude allows. That said, unless you’re really in a hurry or demand an unusually heavy bit of analysis, ChatGPT doesn’t take up too much time and produces quite a lot of content from its pondering.
The ability to render the problem as a simulation within the chat is much more notable. It makes Claude feel more flexible and powerful, even if the actual simulation likely uses very similar code to the HTML written by ChatGPT.