How to Resolve ChatGPT Rate Limit Errors


Ever get overwhelmed by a chatty best friend asking a lot of questions? That's what the ChatGPT API "Over the Rate Limit" error is like. It's telling you, "Woah, slow down! Let me take a sip of water before we continue." In this guide, we will shed light on the rate limit and show you several ways to make your API requests more efficient to ensure you face as few interruptions as possible.

What is the Rate Limit?

ChatGPT’s API imposes a constraint on the number of times you can send requests or messages to the server within a given time period. This limitation is called the rate limit. There are actually two rate limits:

  1. RPM (requests per minute)
  2. TPM (tokens per minute)

The table below shows the default rate limits for ChatGPT’s API.

| User type | Text & Embedding | Chat | Edit | Image | Audio |
|---|---|---|---|---|---|
| Free trial users | 3 RPM, 150,000 TPM | 3 RPM, 40,000 TPM | 3 RPM, 150,000 TPM | 5 images / min | 3 RPM |
| Pay-as-you-go users (first 48 hours) | 60 RPM, 250,000 TPM | 60 RPM, 60,000 TPM | 20 RPM, 150,000 TPM | 50 images / min | 50 RPM |
| Pay-as-you-go users (after 48 hours) | 3,500 RPM, 350,000 TPM | 3,500 RPM, 90,000 TPM | 20 RPM, 150,000 TPM | 50 images / min | 50 RPM |

If you need higher limits, you can request an increase by filling out the OpenAI API Rate Limit Increase Request form.

What causes the “Over the Rate Limit” error?

Simply put, you’re making too many API requests in a short period of time. Rate limits ensure fair use of system resources and prevent overuse by any single user.
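To make "too many requests in a short period" concrete, here is a minimal sketch that converts an RPM limit into the minimum spacing between requests. The class and method names (`RequestPacer`, `minIntervalMillis`, `waitBeforeNext`) are hypothetical and not part of any OpenAI library, and the sketch covers only the request limit, not the token limit.

```java
public class RequestPacer {

    // Minimum gap between requests, in milliseconds, that keeps a steady
    // stream of calls at or under the given requests-per-minute limit.
    public static long minIntervalMillis(int requestsPerMinute) {
        if (requestsPerMinute <= 0) {
            throw new IllegalArgumentException("requestsPerMinute must be positive");
        }
        // Round up so the resulting pace never exceeds the limit.
        return (60_000L + requestsPerMinute - 1) / requestsPerMinute;
    }

    // How long to wait before sending the next request, given when the
    // previous one was sent. Returns 0 if enough time has already passed.
    public static long waitBeforeNext(long lastRequestMillis, long nowMillis, int requestsPerMinute) {
        long elapsed = nowMillis - lastRequestMillis;
        return Math.max(0L, minIntervalMillis(requestsPerMinute) - elapsed);
    }

    public static void main(String[] args) {
        System.out.println(minIntervalMillis(3));    // 20000
        System.out.println(minIntervalMillis(60));   // 1000
        System.out.println(minIntervalMillis(3500)); // 18
    }
}
```

At the free-trial limit of 3 RPM, requests sent at a steady pace need to be about 20 seconds apart.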

Example: “Over the Rate Limit” Error in Java

The Java code below can trigger the “Over the Rate Limit” error by sending several requests in quick succession:

import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;

public class ChatGPTAPIExample {
   public static String chatGPT(String prompt) {
       String url = "https://api.openai.com/v1/chat/completions";
       String apiKey = "YOUR API KEY";
       String model = "gpt-3.5-turbo";

       try {
           URL obj = new URL(url);
           HttpURLConnection connection = (HttpURLConnection) obj.openConnection();
           connection.setRequestMethod("POST");
           connection.setRequestProperty("Authorization", "Bearer " + apiKey);
           connection.setRequestProperty("Content-Type", "application/json");

           // The request body
           String body = "{\"model\": \"" + model + "\", \"messages\": [{\"role\": \"user\", \"content\": \"" + prompt + "\"}]}";
           connection.setDoOutput(true);
           OutputStreamWriter writer = new OutputStreamWriter(connection.getOutputStream());
           writer.write(body);
           writer.flush();
           writer.close();

           // Response from ChatGPT
           BufferedReader br = new BufferedReader(new InputStreamReader(connection.getInputStream()));
           String line;

           StringBuffer response = new StringBuffer();

           while ((line = br.readLine()) != null) {
               response.append(line);
           }
           br.close();

           // calls the method to extract the message.
           return extractMessageFromJSONResponse(response.toString());

       } catch (IOException e) {
           throw new RuntimeException(e);
       }
   }

   public static String extractMessageFromJSONResponse(String response) {
       int start = response.indexOf("content") + 11;
       int end = response.indexOf("\"", start);
       return response.substring(start, end);
   }

   public static void main(String[] args) {
       System.out.println(chatGPT("hello, how are you? Can you tell what's a Fibonacci Number"));
       System.out.println(chatGPT("Is it same as factorial of a number?"));
       System.out.println(chatGPT("Is it same armstrong number?"));
   }
}

Output:

Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Server returned HTTP response code: 429 for URL: https://api.openai.com/v1/chat/completions
   at ChatGPTAPIExample.chatGPT(ChatGPTAPIExample.java:44)
   at ChatGPTAPIExample.main(ChatGPTAPIExample.java:67)
Caused by: java.io.IOException: Server returned HTTP response code: 429 for URL: https://api.openai.com/v1/chat/completions
   at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1997)
   at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1589)
   at java.base/sun.net.www.protocol.http.HttpURLConnectionImpl.getInputStream(HttpURLConnectionImp.java:224)
   at ChatGPTAPIExample.chatGPT(ChatGPTAPIExample.java:30)
   …1 more

Process finished with exit code 1

The HTTP response code 429 indicates that too many requests were made and you have exceeded the rate limits set by the OpenAI API.

Note: The above code example does not guarantee that it will consistently generate an "Over the Rate Limit" error. The behavior of the OpenAI API, including error responses, can vary based on several factors such as server load, usage patterns, and rate limits.
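In the example above the 429 only surfaces as an IOException thrown by getInputStream(). The sketch below shows one way to check the status code directly and read the error body instead. `RateLimitCheck` and its methods are hypothetical names, and the local stub server with its placeholder JSON body is an assumption used only so the example runs without calling the real API.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RateLimitCheck {

    // HTTP 429 "Too Many Requests" is the status used for rate limiting.
    public static boolean isRateLimited(int statusCode) {
        return statusCode == 429;
    }

    // For 4xx/5xx responses HttpURLConnection exposes the body through
    // getErrorStream(); calling getInputStream() would throw an IOException.
    public static String readBody(HttpURLConnection connection) throws IOException {
        InputStream stream = connection.getResponseCode() >= 400
                ? connection.getErrorStream()
                : connection.getInputStream();
        if (stream == null) {
            return "";
        }
        try (InputStream in = stream) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Local stub that always answers 429, standing in for the real API.
        HttpServer stub = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        stub.createContext("/", exchange -> {
            byte[] body = "{\"error\":\"stub: rate limited\"}".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(429, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        stub.start();
        try {
            URL url = new URL("http://127.0.0.1:" + stub.getAddress().getPort() + "/");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            int status = connection.getResponseCode();
            System.out.println("Rate limited: " + isRateLimited(status));
            System.out.println("Body: " + readBody(connection));
        } finally {
            stub.stop(0);
        }
    }
}
```

Checking the status code this way lets a program tell a rate limit apart from other failures, such as a network error or an invalid API key, before deciding whether to retry.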

How to Resolve the “Over the Rate Limit” Error

Though going over the rate limit can be disruptive, there are ways to address this issue:

  • Check the API documentation: Rate limits may change and it’s a good idea to view OpenAI’s Rate Limits page to see if that’s the case, especially as new models get introduced.
  • Monitor usage and plan ahead: Aside from checking your rate limit on your account page, you can also find essential details like remaining requests, tokens, and other metadata in the HTTP response headers. Use this information to plan ahead and optimize the rate at which requests are made.
  • Use back-off tactics: To avoid hitting the "Over the Rate Limit" error repeatedly, apply back-off tactics. This means adding delays or pauses between requests so that you stay within the permitted rate limits.
  • Create a new OpenAI account: This method is more of a workaround; you can make more requests using a new OpenAI API key.
  • Upgrade the API plan: You may want to do this if your usage routinely exceeds the given rate limits. To accommodate customers with more demanding needs, service providers frequently offer tiers with higher rate caps. Upgrading to a plan that fits your needs lowers the likelihood of hitting the limit.
  • Request an increase: It’s possible to request an increase in the rate limit by completing the OpenAI API Rate Limit Increase Request form. Note you’ll need to “share evidence of need”.
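As a sketch of the "monitor usage" tip above, the helper below reads the remaining request and token budgets from response headers and decides whether to slow down. The header names `x-ratelimit-remaining-requests` and `x-ratelimit-remaining-tokens` reflect my understanding of OpenAI's documentation and should be verified against the Rate Limits page. The class name, method names, and thresholds are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

public class RateLimitHeaders {

    // Case-insensitive lookup of a numeric header in a map shaped like the one
    // returned by HttpURLConnection.getHeaderFields(), which can contain a
    // null key for the status line.
    public static OptionalLong numericHeader(Map<String, List<String>> headers, String name) {
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            String key = entry.getKey();
            if (key != null && key.equalsIgnoreCase(name) && !entry.getValue().isEmpty()) {
                try {
                    return OptionalLong.of(Long.parseLong(entry.getValue().get(0).trim()));
                } catch (NumberFormatException e) {
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty();
    }

    // True when either remaining budget is known and at or below its threshold.
    // Header names are an assumption; check them against the current API docs.
    public static boolean shouldSlowDown(Map<String, List<String>> headers,
                                         long minRequests, long minTokens) {
        OptionalLong requests = numericHeader(headers, "x-ratelimit-remaining-requests");
        OptionalLong tokens = numericHeader(headers, "x-ratelimit-remaining-tokens");
        return (requests.isPresent() && requests.getAsLong() <= minRequests)
                || (tokens.isPresent() && tokens.getAsLong() <= minTokens);
    }

    public static void main(String[] args) {
        // Sample values for illustration only, not real API output.
        Map<String, List<String>> sample = Map.of(
                "x-ratelimit-remaining-requests", List.of("1"),
                "x-ratelimit-remaining-tokens", List.of("39000"));
        System.out.println(shouldSlowDown(sample, 1, 1000)); // true: one request left
    }
}
```

With an HttpURLConnection, the map passed in would come from connection.getHeaderFields() after the response has been received.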

Example: Using Back-off Tactics

The code below illustrates one of the ways to resolve the “Over the Rate Limit” error by introducing successively greater delays between the API calls each time this error is encountered:

import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;

public class ChatGPTAPIExample {

   public static String chatGPT(String prompt) {
       String url = "https://api.openai.com/v1/chat/completions";
       String apiKey = "YOUR_API_KEY";
       String model = "gpt-3.5-turbo";
       int maxRetries = 3; // Maximum number of retries
       int retryDelay = 1000; // Initial retry delay in milliseconds

       for (int retry = 0; retry < maxRetries; retry++) {
           try {
               URL obj = new URL(url);
               HttpURLConnection connection = (HttpURLConnection) obj.openConnection();
               connection.setRequestMethod("POST");
               connection.setRequestProperty("Authorization", "Bearer " + apiKey);
               connection.setRequestProperty("Content-Type", "application/json");

               // The request body
               String body = "{\"model\": \"" + model + "\", \"messages\": [{\"role\": \"user\", \"content\": \"" + prompt + "\"}]}";
               connection.setDoOutput(true);
               OutputStreamWriter writer = new OutputStreamWriter(connection.getOutputStream());
               writer.write(body);
               writer.flush();
               writer.close();

               // Response from ChatGPT
               BufferedReader br = new BufferedReader(new InputStreamReader(connection.getInputStream()));
               String line;

               StringBuffer response = new StringBuffer();

               while ((line = br.readLine()) != null) {
                   response.append(line);
               }
               br.close();

               // Calls the method to extract the message.
               return extractMessageFromJSONResponse(response.toString());

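           } catch (IOException e) {
               // getInputStream() throws an IOException on a 429 response, so the
               // retry happens here. Note that other I/O failures are retried too.
               System.out.println("Error: " + e.getMessage());
               System.out.println("Retry attempt: " + (retry + 1));
               try {
                   // Exponential backoff: wait, then double the delay for the next retry
                   Thread.sleep(retryDelay);
                   retryDelay *= 2;
               } catch (InterruptedException ex) {
                   // Restore the interrupt flag and stop retrying
                   Thread.currentThread().interrupt();
                   return "Error: Interrupted while waiting to retry.";
               }
           }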
       }

       // Return an error message if maxRetries are reached
       return "Error: Maximum number of retries reached. Unable to process the request.";
   }

   public static String extractMessageFromJSONResponse(String response) {
       int start = response.indexOf("content") + 11;
       int end = response.indexOf("\"", start);
       return response.substring(start, end);
   }

   public static void main(String[] args) {
       System.out.println(chatGPT("hello, how are you? Can you tell what's a Fibonacci Number"));
       System.out.println(chatGPT("Is it the same as the factorial of a number?"));
       System.out.println(chatGPT("Is it the same as an Armstrong number?"));
   }
}

In the above code, a loop attempts the API call up to a maximum number of times (maxRetries). If an IOException occurs during the API call, the code logs the error, waits, and then doubles the delay before the next attempt. With the values above, the waits are 1, 2, and 4 seconds. This approach helps prevent overloading the server with rapid retries.

Output:

Hello! I am an AI language model, so I don't have emotions, but I am here to help you. \n\nA Fibonacci number is a sequence of numbers in which each number is the sum of the two preceding ones. The sequence starts with 0 and 1, and the subsequent numbers are found by adding the two numbers before it. So, the Fibonacci sequence begins as follows: 0, 1, 1, 2, 3, 5, 8, 13, 21, and so on.
No, the factorial of a number is a specific mathematical operation where you multiply all positive integers less than or equal to the given number. It is denoted by an exclamation mark (!) after the number. For example, the factorial of 5 is 5! = 5 x 4 x 3 x 2 x 1 = 120.\n\nExponentiation, on the other hand, refers to the raising of a number to a power. It involves multiplying the base number by itself a certain number of times. For example, 5 raised to the power of 3 (5^3) is equal to 5 x 5 x 5 = 125. \n\nSo, factorial and exponentiation are different mathematical operations with distinct meanings.
No, an Armstrong number is different from a perfect number. \n\n An Armstrong number (also known as a narcissistic number) is a number that is equal to the sum of its own digits raised to the power of the number of digits. For example, 153 is an Armstrong number because 1^3 + 5^3 + 3^3 = 1 + 125 + 27 = 153.\n\nOn the other hand, a perfect number is a positive integer that is equal to the sum of its proper divisors (excluding itself). For example, 6 is a perfect number because its proper divisors are 1, 2 and 3, and 1 + 2 + 3 = 6.

Process finished with exit code 0
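The delay schedule used in the back-off example can be isolated into a small helper, which makes it easier to add an upper bound and some randomness. The sketch below is a variation, not the code from the example: the cap and the random jitter are additions, and the names `Backoff`, `baseDelayMillis`, and `jitteredDelayMillis` are hypothetical.

```java
import java.util.Random;

public class Backoff {

    // Delay before retry number `attempt` (0-based): the initial delay doubled
    // once per previous attempt, never exceeding the cap.
    public static long baseDelayMillis(long initialMillis, long capMillis, int attempt) {
        long delay = initialMillis;
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, capMillis);
    }

    // Same schedule with jitter: a random value between half the base delay
    // and the full base delay, so that many clients do not retry in lockstep.
    public static long jitteredDelayMillis(long initialMillis, long capMillis,
                                           int attempt, Random random) {
        long base = baseDelayMillis(initialMillis, capMillis, attempt);
        long half = base / 2;
        return half + (long) (random.nextDouble() * (base - half + 1));
    }

    public static void main(String[] args) {
        // Same starting values as the example: 1000 ms initial delay, 3 attempts.
        for (int attempt = 0; attempt < 3; attempt++) {
            System.out.println(baseDelayMillis(1000, 8000, attempt)); // 1000, 2000, 4000
        }
    }
}
```

In the retry loop, Thread.sleep(retryDelay) could be replaced with a sleep for jitteredDelayMillis(1000, 8000, retry, random), where the 8000 ms cap is an arbitrary choice for illustration.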

In summary, the "Over the Rate Limit" error occurs when you call the ChatGPT API more often than your rate limits permit. The limits exist to protect system resources and keep usage fair. You can manage the error by knowing your limits, monitoring consumption, backing off between API calls, and requesting an increase through the OpenAI API Rate Limit Increase Request form. You can also upgrade your API plan if you need to.

A Solution to Monitor Errors from the ChatGPT API

It's a best practice to monitor exceptions that occur when interacting with any external API. For example, the API might be temporarily unavailable, or the expected parameters or response format might have changed, requiring an update to your code. Your code should be what tells you about this. Here's how to do it with the error monitoring tool Rollbar:

Step 1: Set up your account on Rollbar and obtain the Rollbar access token.

Step 2: Add the Rollbar Java SDK dependency to your project's build configuration file (for example, the Maven pom.xml or the Gradle build.gradle). Here is an example for Maven:

<dependency>
    <groupId>com.rollbar</groupId>
    <artifactId>rollbar-java</artifactId>
    <version>1.10.0</version>
</dependency>

Step 3: Next, set up the Rollbar configuration in the Java code that interacts with the ChatGPT API:

import com.rollbar.notifier.Rollbar;
import com.rollbar.notifier.config.Config;
import com.rollbar.notifier.config.ConfigBuilder;

public class ChatGPTExample {

    public static void main(String[] args) {
        // Configure Rollbar
        Config rollbarConfig = ConfigBuilder.withAccessToken("YOUR_ROLLBAR_ACCESS_TOKEN")
                .environment("production")
                .build();
        Rollbar rollbar = new Rollbar(rollbarConfig);

        try {
            // Code for interacting with the ChatGPT API
        } catch (Exception e) {
            rollbar.error(e);
        } finally {
            rollbar.close();
        }
    }
}

Make sure you replace YOUR_ROLLBAR_ACCESS_TOKEN with the access token you obtained from Rollbar.

To get your Rollbar access token, sign up for free and follow the instructions for Java.

We can't wait to see what you build with ChatGPT. Happy coding!
