OpenAI has announced the release of a new family of artificial intelligence models designed to excel at coding, in an effort to compete with companies such as Google and Anthropic. These models are accessible to developers through OpenAI’s application programming interface (API).
The release includes three model sizes: GPT-4.1, GPT-4.1 Mini, and GPT-4.1 Nano. Speaking during a livestream event, Kevin Weil, OpenAI’s Chief Product Officer, said the new models outperform the widely used GPT-4o and, in some respects, surpass the larger GPT-4.5.
The GPT-4.1 model achieved a score of 55 percent on SWE-Bench, a recognized benchmark for assessing the coding skills of AI models. This score exceeds that of previous models from OpenAI. Weil highlighted that the new models excel in coding, following complex instructions, and building AI agents.
Recent advances in AI models have significantly improved their ability to write and edit code, enabling more automated approaches to software prototyping and making AI agents more capable. Competitors such as Anthropic and Google have also introduced models with a strong focus on coding proficiency.
The release of GPT-4.1 had been anticipated for weeks, with reports indicating that OpenAI tested the model under the pseudonym Quasar Alpha on popular leaderboards. Users of this “stealth” model have reported remarkable coding capabilities, with one individual on Reddit noting that Quasar resolved the open issues they had with code generated by other models.
The new models can analyze eight times more code at once, which improves their ability to refine code and fix bugs. They are also better at following user instructions, reducing the need to repeat prompts to get the desired result. OpenAI demonstrated GPT-4.1 by having it build several applications, including a flashcard app for language learning.
Michelle Pokrass, who works on post-training at OpenAI, said during the livestream that the company has focused on improving the model’s ability to explore repositories, run unit tests, and write code that both functions and compiles.
According to OpenAI, GPT-4.1 is 40 percent faster than GPT-4o, and the cost of user queries has been reduced by 80 percent.
During the livestream, Varun Mohan, CEO of Windsurf, a widely used AI coding tool, revealed that his company had tested GPT-4.1 and found it to be 60 percent more effective than GPT-4o based on their benchmarks. Mohan also noted a substantial reduction in instances of degenerate behavior with the new model, as it spends less time processing irrelevant files.
Over the past couple of years, OpenAI has leveraged interest in ChatGPT, the chatbot it introduced in late 2022, to build a business selling access to advanced chatbots and AI models. In a TED interview last week, CEO Sam Altman said OpenAI has reached 500 million weekly active users and that usage is increasing rapidly.