OpenAI's GPT-4.5 dominates multiple categories on Chatbot Arena

Last week, OpenAI introduced GPT-4.5, its largest frontier model to date. OpenAI claimed that GPT-4.5 is the most knowledgeable model yet and that it was built by further scaling the pre-training process. In addition to having more knowledge, the GPT-4.5 model features improved writing skills and a refined personality when compared to OpenAI’s older models.

Today, GPT-4.5 made its debut on Chatbot Arena, taking the #1 position across most categories. It also leads the Style Control leaderboard. GPT-4.5 topped the following categories, with a clear lead in Multi-Turn:

Multi-Turn
Hard Prompts
Coding
Math
Creative Writing
Instruction Following
Longer Query

xAI’s latest Grok-3 model (grok-3-preview-02-24) also made its debut on the Arena leaderboard, taking the #1 position on Hard Prompts (English) and tying for #1 overall as well as in Coding, Math, Creative Writing, Instruction Following, and Longer Query. The rapid improvements shown by GPT-4.5 and Grok-3 highlight the intensifying competition in the AI landscape.

OpenAI's GPT-4.5 has also topped several other AI benchmarks. It ranked #1 in the Elimination Game Benchmark, a multi-player tournament that tests LLMs on social reasoning, strategy, and deception. In IQ Test Score rankings, GPT-4.5 outperformed all other non-reasoning models in the industry. On the SimpleQA Hallucination Rate benchmark, GPT-4.5 recorded the lowest hallucination rate among all of OpenAI's large language models.

Last month, OpenAI CEO Sam Altman revealed that GPT-4.5 is OpenAI's last non-chain-of-thought model. Additionally, OpenAI will no longer release o3 as a standalone model. Instead, OpenAI will unify the o-series and GPT-series models by creating systems that can determine the appropriate thinking time based on the user query.

Sam Altman also confirmed that even ChatGPT free tier users will have access to GPT-5, though only at the standard intelligence setting. ChatGPT Plus subscribers will be able to run GPT-5 at a higher level of intelligence, and Pro subscribers at an even higher level. Furthermore, the unified model will support all existing ChatGPT features, such as voice, canvas, search, deep research, and more.