OpenAI | Fast | High

GPT-5 Nano

Ultra-fast and lightweight variant of GPT-5. Designed for high-throughput, low-latency applications with a 1M token context window and 32K output tokens at minimal cost.

1 credit
per 1K tokens (avg)
1M token context window
32K max output tokens
Ultra-fast inference
Lowest cost GPT-5 variant
Function calling & JSON mode
Structured outputs
Ideal for high-throughput tasks

Run it right now

Test this model instantly in the Console Playground, no code required

Model Specifications

Context Window
1.0M
tokens
Max Output
32K
tokens
Training Cutoff
2025-03
Compatible SDK
OpenAI

Capabilities

Vision
Function Calling
Streaming
JSON Mode
System Prompt

Token Pricing (per 1M tokens)

Token Type | Credits | USD Equivalent
Input Tokens | 100 | $0.10
Output Tokens | 800 | $0.80
Cached Tokens | 25 | $0.025

* 1 credit ≈ $0.001 (actual charges may vary based on usage)
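
To turn the per-1M-token prices into a per-request figure, the arithmetic can be sketched as below. This is a rough estimate only: the function names are hypothetical, and it assumes cached tokens are billed at the cached rate instead of the input rate, which this page does not state.

```python
# Credit prices per 1M tokens, copied from the pricing table above.
CREDITS_PER_MILLION = {"input": 100, "output": 800, "cached": 25}
USD_PER_CREDIT = 0.001  # approximate conversion from the pricing note


def estimate_credits(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the credits consumed by one request.

    Assumption: `input_tokens` counts only uncached input tokens, and
    cached tokens are charged separately at the cached rate.
    """
    total = (
        input_tokens * CREDITS_PER_MILLION["input"]
        + output_tokens * CREDITS_PER_MILLION["output"]
        + cached_tokens * CREDITS_PER_MILLION["cached"]
    )
    return total / 1_000_000


def credits_to_usd(credits: float) -> float:
    """Convert credits to an approximate USD amount."""
    return credits * USD_PER_CREDIT
```

For example, `estimate_credits(2000, 500)` covers a request with 2,000 input tokens and 500 output tokens; actual charges may differ, as the note above says.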

Quick Start

curl -X POST "https://api.core.today/llm/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gpt-5-nano",
  "messages": [
    {
      "role": "system",
      "content": "Classify the following customer message into one of these categories: billing, technical, general, feedback. Respond with only the category name."
    },
    {
      "role": "user",
      "content": "I was charged twice for my subscription last month and need a refund."
    }
  ],
  "max_completion_tokens": 50
}'
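
The same request can be sketched in Python with only the standard library. The endpoint, headers, and body mirror the curl command above. The helper names `build_request` and `send` are illustrative, and the shape of the response JSON is not documented on this page, so `send` simply returns whatever the API sends back.

```python
import json
import urllib.request

API_URL = "https://api.core.today/llm/openai/v1/chat/completions"

SYSTEM_PROMPT = (
    "Classify the following customer message into one of these categories: "
    "billing, technical, general, feedback. Respond with only the category name."
)


def build_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": "gpt-5-nano",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        "max_completion_tokens": 50,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without spending credits.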

Parameters

Parameter | Type | Required | Default | Description
messages | array | Yes | - | Array of message objects with role and content
model | string | Yes | gpt-5-nano | Model identifier
max_completion_tokens | integer | No | 4096 | Maximum tokens in response (up to 32768). Note: use max_completion_tokens, not max_tokens
reasoning_effort | string | No | medium | Reasoning effort level: low, medium, or high
stream | boolean | No | false | Enable Server-Sent Events streaming
response_format | object | No | - | Format of response: { type: 'json_object' } for JSON mode
tools | array | No | - | List of tools (functions) the model can call
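
The constraints in the parameter list can be checked on the client before a request is sent. The sketch below is one possible way to do that. `build_payload` is a hypothetical helper, not part of any SDK, and the server may apply further validation that this page does not describe.

```python
from typing import Any, Dict, List, Optional

MAX_OUTPUT_TOKENS = 32768                      # upper bound from the parameter list
REASONING_EFFORTS = {"low", "medium", "high"}  # allowed reasoning_effort values


def build_payload(
    messages: List[Dict[str, str]],
    max_completion_tokens: int = 4096,
    reasoning_effort: str = "medium",
    stream: bool = False,
    json_mode: bool = False,
    tools: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Assemble a request body, rejecting values outside the documented ranges."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    if not 1 <= max_completion_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_completion_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")
    if reasoning_effort not in REASONING_EFFORTS:
        raise ValueError("reasoning_effort must be low, medium, or high")

    payload: Dict[str, Any] = {
        "model": "gpt-5-nano",
        "messages": messages,
        # The page says to use max_completion_tokens, not max_tokens.
        "max_completion_tokens": max_completion_tokens,
        "reasoning_effort": reasoning_effort,
        "stream": stream,
    }
    if json_mode:
        payload["response_format"] = {"type": "json_object"}
    if tools:
        payload["tools"] = tools
    return payload
```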

Examples

Quick Classification

High-speed text classification with GPT-5 Nano

curl -X POST "https://api.core.today/llm/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gpt-5-nano",
  "messages": [
    {
      "role": "system",
      "content": "Classify the following customer message into one of these categories: billing, technical, general, feedback. Respond with only the category name."
    },
    {
      "role": "user",
      "content": "I was charged twice for my subscription last month and need a refund."
    }
  ],
  "max_completion_tokens": 50
}'
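
Even when the prompt asks for only the category name, the reply may come back with stray whitespace, capitalization, or punctuation. A small normalization step, sketched below, keeps downstream routing predictable. The function name and the choice of "general" as the fallback are assumptions made for illustration.

```python
# The four categories named in the system prompt above.
CATEGORIES = ("billing", "technical", "general", "feedback")


def normalize_category(raw: str, fallback: str = "general") -> str:
    """Map a raw model reply onto one of the known categories.

    Strips surrounding whitespace, quotes, and periods, lowercases the
    result, and returns `fallback` when the reply is not a known category.
    """
    cleaned = raw.strip().strip(".\"'").lower()
    return cleaned if cleaned in CATEGORIES else fallback
```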

Data Extraction

Fast structured data extraction

curl -X POST "https://api.core.today/llm/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gpt-5-nano",
  "messages": [
    {
      "role": "system",
      "content": "Extract key information from the text and return as JSON."
    },
    {
      "role": "user",
      "content": "Meeting scheduled with John Smith from Acme Corp on March 15, 2026 at 2:00 PM in Conference Room B to discuss Q1 budget review."
    }
  ],
  "response_format": {
    "type": "json_object"
  },
  "max_completion_tokens": 500
}'
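
With JSON mode enabled, the extracted data arrives as a JSON string inside the reply. The sketch below assumes the gateway returns the standard OpenAI chat-completions response shape (`choices[0].message.content`), which this page does not show explicitly. The helper name is hypothetical.

```python
import json
from typing import Any, Dict


def extract_json_content(response: Dict[str, Any]) -> Dict[str, Any]:
    """Parse the assistant message of a JSON-mode response into a dict.

    Assumption: the response follows the OpenAI chat-completions layout,
    with the reply text at choices[0].message.content.
    """
    content = response["choices"][0]["message"]["content"]
    data = json.loads(content)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object in the model reply")
    return data
```

Catching `json.JSONDecodeError` at the call site is a reasonable place to retry, since a reply cut off by `max_completion_tokens` would produce invalid JSON.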

Tips & Best Practices

1. Most cost-effective GPT-5 model: 25x cheaper than GPT-5 on input tokens
2. Ultra-fast response times ideal for real-time applications
3. Perfect for classification, routing, and simple extraction tasks
4. Use temperature 0 for deterministic classification results
5. Great for high-volume batch processing
6. 1M context window available even at the lowest price point
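
For the high-volume batch processing mentioned in the tips, one simple approach is to fan requests out over a thread pool. The sketch below takes the per-message call as an argument (`classify_fn`, a hypothetical callable that you supply), so it makes no assumptions about the client library. The default of 8 workers is arbitrary, and any rate limits on the API are not documented here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def classify_batch(
    messages: Iterable[str],
    classify_fn: Callable[[str], str],
    max_workers: int = 8,
) -> List[str]:
    """Run classify_fn over many messages concurrently.

    Results come back in the same order as the input messages, because
    ThreadPoolExecutor.map preserves input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(classify_fn, messages))
```

Threads suit this workload because each call spends most of its time waiting on the network and very little on the CPU.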

Use Cases

High-volume classification and routing
Quick text generation and completion
Real-time chat applications
Lightweight data extraction
Auto-tagging and categorization
Simple code assistance