OpenAI Releases GPT-4.1: Focus on Coding and AI Agents
OpenAI has introduced a new generation of its language models: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. The models are aimed specifically at software developers and promise improvements in code generation and instruction following over their predecessors, GPT-4o and GPT-4o mini. Key innovations include a more recent knowledge cutoff (June 2024) and context windows of up to one million tokens, enough to process inputs of roughly 750,000 words. GPT-4.1 is available exclusively through the API; integration into ChatGPT is not currently planned.
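Because the models are reachable only through the API, working with GPT-4.1 means sending requests to OpenAI's chat-completions endpoint with the appropriate model identifier. The sketch below merely assembles a request body; the helper name and its defaults are illustrative, and the model identifiers should be verified against OpenAI's current API reference before use.

```python
# Minimal sketch of a chat-completions request body for the GPT-4.1 family.
# build_request is a hypothetical helper, not part of any SDK.
GPT_41_MODELS = ["gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano"]

def build_request(model: str, prompt: str, max_tokens: int = 1024) -> dict:
    """Assemble a request body for the OpenAI chat-completions endpoint."""
    if model not in GPT_41_MODELS:
        raise ValueError(f"not a GPT-4.1 variant: {model}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
```

The resulting dictionary would be sent as the JSON body of a POST request, authenticated with an API key, to the chat-completions endpoint.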
Performance Comparison: Benchmarks and Evaluations
OpenAI's internal evaluations show GPT-4.1 achieving results similar to GPT-4.1 mini, GPT-4.5, o1, and o3-mini at following instructions. GPT-4.1 nano performs slightly worse, but remains on a par with GPT-4o and GPT-4o mini. On the MultiChallenge benchmark, GPT-4.1 surpasses the smaller mini model but lags behind the reasoning models and GPT-4.5.
For coding, OpenAI points to the SWE-bench Verified benchmark, which measures language-model performance on 500 real-world programming tasks. Here, GPT-4.1 solves roughly 55 percent of the tasks; by comparison, Google's Gemini 2.5 Pro and Anthropic's Claude 3.7 Sonnet each reach around 63 percent. Among OpenAI's own models, however, GPT-4.1 ranks ahead of GPT-4o (33 percent), GPT-4.5 (38 percent), and o3-mini (49 percent).
GPT-4.1 also targets better front-end coding and less post-processing effort; building and revising user interfaces is likewise expected to improve with the new models. On Aider's polyglot benchmark, which tests the ability to edit individual code blocks, GPT-4.1 reaches a solution rate of 53 percent, trailing OpenAI o1 and o3-mini (each around 60 percent).
Outlook and Pricing
OpenAI emphasizes that GPT-4.1 strikes a good balance between performance and efficiency: while the large model scores well in benchmarks, the smaller variants offer a faster, more cost-effective alternative. GPT-4.1 costs 2 US dollars per million input tokens and 8 US dollars per million output tokens; GPT-4.1 mini costs 0.40 and 1.60 US dollars, and GPT-4.1 nano 0.10 and 0.40 US dollars per million input and output tokens, respectively. OpenAI also plans to retire the GPT-4.5 preview. The new models are intended to enable AI agents that can support real-world software development.
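The per-token prices above translate directly into a per-request cost estimate. The following sketch hard-codes the prices quoted in this article (USD per million tokens); the helper name is illustrative, and current prices should be checked against OpenAI's pricing page.

```python
# Per-million-token prices in USD as quoted in the article.
PRICES_USD_PER_MTOK = {
    "gpt-4.1":      {"input": 2.00, "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
    "gpt-4.1-nano": {"input": 0.10, "output": 0.40},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in US dollars of a single request."""
    p = PRICES_USD_PER_MTOK[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

For example, a GPT-4.1 mini request with 500,000 input tokens and 100,000 output tokens would cost about 0.36 US dollars, illustrating the price gap to the full model.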