
OpenAI Launches Flex API to Minimise Costs for Low-Priority Tasks

OpenAI has introduced a new API pricing option known as “Flex processing.” It is designed to minimise the cost of using OpenAI’s AI models. The reduced pricing, however, comes with slower processing, delayed responses, and occasional resource unavailability. This makes Flex processing suitable for low-priority tasks such as model evaluations, data enrichment, and asynchronous workloads.

The launch of Flex processing comes amid increasing competition from rivals like Google and Anthropic, which are continually releasing more efficient and aggressively priced AI models. Google, for example, has pushed cheaper models such as Gemini 2.5 Flash into the market.

Flex processing is available in beta for two of OpenAI’s large language models (LLMs), o3 and o4-mini, both geared towards reasoning tasks. Under Flex processing, these models cost 50% less than the standard API rates.

The standard rate for the o3 model is $10 per million input tokens and $40 per million output tokens; under the Flex tier, the price is halved to $5 and $20 per million input and output tokens respectively.

Similarly, for the o4-mini model, costs drop to $0.55 per million input tokens and $2.20 per million output tokens, compared with standard rates of $1.10 and $4.40.
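The per-token rates above can be turned into a simple cost estimate. The sketch below uses only the figures quoted in this article; the helper function name is illustrative.

```python
# Standard rates quoted above, in USD per million tokens.
STANDARD_RATES = {
    "o3": {"input": 10.00, "output": 40.00},
    "o4-mini": {"input": 1.10, "output": 4.40},
}

FLEX_DISCOUNT = 0.5  # Flex tier charges 50% of the standard rate


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 flex: bool = False) -> float:
    """Estimate the USD cost of a request at standard or Flex pricing."""
    rates = STANDARD_RATES[model]
    cost = (input_tokens / 1_000_000) * rates["input"] \
         + (output_tokens / 1_000_000) * rates["output"]
    return cost * FLEX_DISCOUNT if flex else cost


# A batch job sending 2M input tokens and receiving 500K output tokens on o3:
standard = request_cost("o3", 2_000_000, 500_000)         # $40.00
flex = request_cost("o3", 2_000_000, 500_000, flex=True)  # $20.00
```

For large evaluation or enrichment batches, this halving compounds quickly, which is the main appeal of the tier.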

What is “Flex Processing”?

Flex processing is OpenAI’s newly introduced API service tier, designed to give developers a cost-effective option for non-urgent AI tasks. By accepting slower response times and occasional resource unavailability, users can access OpenAI’s models at a significantly reduced cost.

What does it mean for developers?

Flex processing is built for tasks that don’t need instant results. By opting in, developers can significantly reduce costs while handling large volumes of data or conducting extensive testing.

The reduced cost also makes the tier more widely accessible, particularly for small developers, who usually care more about cost than real-time performance. AI models can be expensive, especially as development costs keep rising.

Moreover, those building AI-powered tools can experiment and test models without significant financial investment.
