Lesson 4. API Limits and Request Costs#
Goal: understand that APIs have limits and how to avoid exceeding them.
What Are Rate Limits#
APIs can't handle unlimited requests. So services set limits:
- requests per second (e.g., 10 requests/sec)
- requests per minute (e.g., 100 requests/min)
- requests per day (e.g., 10,000 requests/day)
If you exceed the limit, the API returns an error:
{
"error": "Rate limit exceeded. Try again in 60 seconds."
}
Why Limits Exist#
Limits protect the service from:
- overload (if everyone makes a million requests at once, the service goes down)
- abuse (e.g., DDoS attacks)
- inefficient use (if your agent makes 1000 requests instead of 1, that's bad code)
How to Find Out the Limits#
Limits are described in the API documentation. Look for sections named:
- Rate Limits
- Quotas
- Usage Limits
Example: OpenAI API (ChatGPT)
On the free tier (2026):
- 3 requests per minute (RPM)
- 200 requests per day (RPD)
On the paid tier (Pay-as-you-go):
- 3,500 requests per minute (RPM) for GPT-5.2
- 10,000 requests per minute (RPM) for GPT-4o mini
How to Avoid Exceeding Limits#
1. Make requests only when needed
Don't make a request for every user message if you can handle it locally.
2. Use caching
If data doesn't change often (e.g., product list), cache it for 5–10 minutes.
3. Use batch requests
Some APIs support bulk requests (e.g., read 100 clients in one request instead of 100 separate requests).
4. Add delays (throttling)
If the API allows 10 requests per second, add a 100 ms delay between requests.
5. Handle errors
If the API returns "Rate limit exceeded", wait the specified time and retry.
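Item 2 (caching) can be sketched as a tiny in-memory cache with a time-to-live (TTL). Everything here is illustrative: `fetch_products` stands in for a real API call, and the 300-second TTL is the 5-minute figure from the text.

```python
import time

class TTLCache:
    """Tiny in-memory cache: keeps each value for ttl_seconds, then refetches."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, stored_at)

    def get_or_fetch(self, key, fetch):
        """Return the cached value if still fresh, otherwise call fetch() once."""
        entry = self._store.get(key)
        if entry is not None:
            value, stored_at = entry
            if time.monotonic() - stored_at < self.ttl:
                return value  # still fresh: no API request needed
        value = fetch()  # cache miss or expired: one real request
        self._store[key] = (value, time.monotonic())
        return value

# Usage: the product list is fetched once, then served from cache for 5 minutes.
calls = 0
def fetch_products():  # hypothetical API call
    global calls
    calls += 1
    return ["widget", "gadget"]

cache = TTLCache(ttl_seconds=300)
cache.get_or_fetch("products", fetch_products)
cache.get_or_fetch("products", fetch_products)
print(calls)  # → 1: the second lookup hit the cache
```

Ten users asking for the same product list within those five minutes cost one API request instead of ten.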
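Item 3 (batch requests) as a sketch: instead of one request per client, split the IDs into chunks and send each chunk in one call. The batch size of 100 and the `fetch_clients_bulk` endpoint are hypothetical — check your API's docs for its actual bulk operations and limits.

```python
def chunked(items, size):
    """Split a list into batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

request_count = 0
def fetch_clients_bulk(ids):  # hypothetical bulk endpoint: many IDs, one request
    global request_count
    request_count += 1
    return [{"id": i} for i in ids]

client_ids = list(range(250))
clients = []
for batch in chunked(client_ids, 100):  # 250 IDs -> 3 requests, not 250
    clients.extend(fetch_clients_bulk(batch))

print(request_count)  # → 3
print(len(clients))   # → 250
```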
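Items 4 and 5 combine naturally: space requests out so you stay under the limit, and when a rate-limit error arrives anyway, wait the time the API asks for and retry. This is a minimal sketch — `RateLimitError`, `fake_api`, and the 10 requests/sec limit are all stand-ins for whatever your real API and client library provide.

```python
import time

class RateLimitError(Exception):
    """Simulates a 'Rate limit exceeded' response from the API."""
    def __init__(self, retry_after):
        self.retry_after = retry_after  # seconds the API asks us to wait

MIN_INTERVAL = 0.1  # 10 requests/sec allowed -> at least 100 ms between requests
_last_request = 0.0

def throttled_call(api_call, max_retries=3):
    """Call api_call(), keeping MIN_INTERVAL between requests and retrying on rate limits."""
    global _last_request
    for attempt in range(max_retries + 1):
        wait = MIN_INTERVAL - (time.monotonic() - _last_request)
        if wait > 0:
            time.sleep(wait)  # throttle: never exceed 10 req/sec
        _last_request = time.monotonic()
        try:
            return api_call()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # out of retries: let the caller handle it
            time.sleep(err.retry_after)  # wait exactly as long as the API asked

# Usage with a fake API that rejects the first call, then succeeds:
attempts = []
def fake_api():
    attempts.append(1)
    if len(attempts) == 1:
        raise RateLimitError(retry_after=0.05)
    return {"status": "ok"}

print(throttled_call(fake_api))  # → {'status': 'ok'} after one retry
```

Real HTTP APIs usually signal this with status code 429 and a `Retry-After` header; the pattern is the same.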
Request Costs#
Many APIs are paid. Cost depends on:
- number of requests (e.g., $0.01 per 1000 requests)
- data volume (e.g., $0.02 per 1 GB transferred)
- resource usage (e.g., OpenAI charges per token count)
Example: OpenAI API (2026)
- GPT-5.2 Pro: $0.015 per 1000 input tokens, $0.045 per 1000 output tokens
- GPT-5.2: $0.01 per 1000 input tokens, $0.03 per 1000 output tokens
- GPT-4o mini: $0.00015 per 1000 input tokens, $0.0006 per 1000 output tokens (roughly 65x cheaper than GPT-5.2 on input!)
Alternatives (Chinese models, 2026):
- DeepSeek-R1: $0.003 per 1000 input tokens (over 3x cheaper than GPT-5.2)
- GLM-4.5: $0.004 per 1000 input tokens
- Kimi K2: $0.005 per 1000 input tokens
What is a token? Roughly 4 characters. The phrase "Hello, how are you?" is ~5 tokens.
Example calculation:
- User writes 100 characters (~25 input tokens)
- Agent replies 400 characters (~100 output tokens)
- Cost (GPT-5.2): (25 / 1000) × $0.01 + (100 / 1000) × $0.03 = $0.00025 + $0.003 = $0.00325 (~$0.003 per conversation)
- Cost (DeepSeek-R1, assuming the same price for output tokens): (25 / 1000) × $0.003 + (100 / 1000) × $0.003 = $0.000075 + $0.0003 ≈ $0.0004 per conversation (~9x cheaper!)
- At 1000 conversations per month → GPT-5.2: ~$3.25/month, DeepSeek: ~$0.40/month
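The calculation above is easy to script so you can plug in your own traffic and any model's price list. The token estimate uses the same "one token ≈ 4 characters" heuristic; the prices are the illustrative per-1000-token figures from this lesson, not live pricing.

```python
def estimate_tokens(text_chars):
    """Rough heuristic: one token is about 4 characters of English text."""
    return text_chars / 4

def conversation_cost(input_tokens, output_tokens, price_in, price_out):
    """Cost of one exchange; prices are in dollars per 1000 tokens."""
    return (input_tokens / 1000) * price_in + (output_tokens / 1000) * price_out

# The worked example: 100-char question (~25 tokens), 400-char reply (~100 tokens).
in_tok = estimate_tokens(100)    # 25.0
out_tok = estimate_tokens(400)   # 100.0

gpt = conversation_cost(in_tok, out_tok, price_in=0.01, price_out=0.03)
deepseek = conversation_cost(in_tok, out_tok, price_in=0.003, price_out=0.003)

print(round(gpt, 5))         # → 0.00325  dollars per conversation
print(round(deepseek, 6))    # → 0.000375 dollars per conversation
print(round(gpt * 1000, 2))  # → 3.25    dollars/month at 1000 conversations
```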
How to Control Costs#
1. Set limits in the service settings
Many services let you set a spending limit (e.g., "no more than $50 per month").
2. Monitor usage
Check API usage stats once a week.
3. Use cheaper models
If GPT-5.2 is too expensive → use GPT-4o mini (roughly 65x cheaper) or Chinese models like DeepSeek/GLM (2–3x cheaper).
4. Optimize prompts
Shorter prompts → fewer tokens → lower cost.