Google | Fast | High

Gemini 2.5 Flash

Google's fast and efficient model with built-in thinking capabilities. Great balance of speed, reasoning, and cost for high-volume applications.

1 credit per 1K tokens (avg)
1,048,576 token context window
65,536 max output tokens
Multimodal input: text, image, video, audio
Built-in thinking mode
Function calling, structured outputs, code execution
Search grounding, caching, Batch API
Cost-effective

Model Specifications

Context Window: 1.0M tokens
Max Output: 66K tokens
Training Cutoff: 2025-01
Compatible SDK: OpenAI, Google

Capabilities

Vision
Function Calling
Streaming
JSON Mode
System Prompt

Token Pricing (per 1M tokens)

Token Type | Credits | USD Equivalent
Input Tokens | 300 | $0.30
Output Tokens | 2,500 | $2.50

* 1 credit ≈ $0.001 (actual charges may vary based on usage)
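
The pricing table can be turned into a rough per-request estimate. The sketch below is illustrative only: the function name `estimate_cost` and the constants are hypothetical, the rates are copied from the table above, and the USD figure relies on the approximate conversion of 1 credit ≈ $0.001, so actual charges may differ.

```python
# Rates copied from the pricing table above (credits per 1M tokens).
INPUT_CREDITS_PER_MILLION = 300
OUTPUT_CREDITS_PER_MILLION = 2_500
# Approximate conversion from the footnote: 1 credit is about $0.001.
CREDITS_PER_USD = 1_000


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (credits, approximate USD) for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    credits = (
        input_tokens * INPUT_CREDITS_PER_MILLION
        + output_tokens * OUTPUT_CREDITS_PER_MILLION
    ) / 1_000_000
    return credits, credits / CREDITS_PER_USD


if __name__ == "__main__":
    credits, usd = estimate_cost(input_tokens=10_000, output_tokens=4_000)
    print(f"{credits:.2f} credits, about ${usd:.4f}")
```

Output tokens cost roughly eight times as much as input tokens at these rates, so capping `max_tokens` is the main lever for controlling spend.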

Quick Start

curl -X POST "https://api.core.today/llm/gemini/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": "A train leaves station A at 9 AM traveling at 60 mph. Another train leaves station B (300 miles away) at 10 AM traveling toward A at 80 mph. When and where will they meet?"
    }
  ],
  "max_tokens": 4000
}'
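
For callers who prefer Python over curl, the same request can be sketched with the standard library alone. The URL, headers, model name, and body fields mirror the curl command above; the helper names `build_request` and `send` are hypothetical, and `send` assumes the response follows the OpenAI chat completions shape (`choices[0].message.content`), which this page implies but does not show.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
API_URL = "https://api.core.today/llm/gemini/v1beta/openai/chat/completions"


def build_request(api_key: str, prompt: str, max_tokens: int = 4000) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": "gemini-2.5-flash",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the reply text.

    Assumes an OpenAI-style response body; adjust if the actual shape differs.
    """
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    request = build_request("cdt_your_api_key", "When will the two trains meet?")
    print(send(request))
```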

Parameters

Parameter | Type | Required | Default | Description
messages | array | Yes | - | Array of message objects (OpenAI format). Supports text, image, video, and audio inputs.
temperature | float | No | 1.0 | Sampling temperature (0-2).
top_p | float | No | 0.95 | Nucleus sampling parameter (0-1).
max_tokens | integer | No | - | Maximum output tokens. Max: 65,536. Context window (input + output): 1,048,576 tokens.
stop | string or array | No | - | Up to 4 sequences where the model stops generating.
response_format | object | No | - | Output format constraint. Use `{ type: 'json_object' }` for structured JSON output.
presence_penalty | float | No | 0 | Penalty (-2.0 to 2.0) for repeating tokens.
frequency_penalty | float | No | 0 | Penalty (-2.0 to 2.0) by token frequency.
seed | integer | No | - | Seed for deterministic sampling (best-effort).
stream | boolean | No | false | Enable Server-Sent Events streaming.
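
The ranges in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: `build_payload` and `_check_range` are hypothetical helpers, the limits are copied from the table above, and rejecting an empty `messages` list is an assumption on top of the table's "required" flag. Optional parameters are omitted from the payload unless set, so the server defaults apply.

```python
from typing import Any

MAX_OUTPUT_TOKENS = 65_536  # max_tokens ceiling from the table above
MAX_STOP_SEQUENCES = 4      # "up to 4 sequences"


def _check_range(name: str, value: float, low: float, high: float) -> None:
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_payload(messages, *, temperature=None, top_p=None, max_tokens=None,
                  stop=None, presence_penalty=None, frequency_penalty=None,
                  seed=None, json_mode=False, stream=False) -> dict[str, Any]:
    """Validate parameters against the documented ranges and build the JSON body."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    payload: dict[str, Any] = {"model": "gemini-2.5-flash", "messages": list(messages)}
    # (name, value, low, high) for every float parameter with a documented range.
    ranged = [
        ("temperature", temperature, 0.0, 2.0),
        ("top_p", top_p, 0.0, 1.0),
        ("presence_penalty", presence_penalty, -2.0, 2.0),
        ("frequency_penalty", frequency_penalty, -2.0, 2.0),
    ]
    for name, value, low, high in ranged:
        if value is not None:
            _check_range(name, value, low, high)
            payload[name] = value
    if max_tokens is not None:
        _check_range("max_tokens", max_tokens, 1, MAX_OUTPUT_TOKENS)
        payload["max_tokens"] = max_tokens
    if stop is not None:
        sequences = [stop] if isinstance(stop, str) else list(stop)
        if len(sequences) > MAX_STOP_SEQUENCES:
            raise ValueError(f"at most {MAX_STOP_SEQUENCES} stop sequences are allowed")
        payload["stop"] = sequences
    if seed is not None:
        payload["seed"] = seed
    if json_mode:
        payload["response_format"] = {"type": "json_object"}
    if stream:
        payload["stream"] = True
    return payload
```

For example, `build_payload([{"role": "user", "content": "hi"}], temperature=0.2, json_mode=True)` yields a body with only `model`, `messages`, `temperature`, and `response_format` set.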

Examples

Complex Problem

Solve with step-by-step reasoning

curl -X POST "https://api.core.today/llm/gemini/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": "A train leaves station A at 9 AM traveling at 60 mph. Another train leaves station B (300 miles away) at 10 AM traveling toward A at 80 mph. When and where will they meet?"
    }
  ],
  "max_tokens": 4000
}'

Tips & Best Practices

1. Max output tokens: 65,536. Set max_tokens up to this limit.
2. Context window: 1,048,576 tokens (input + output).
3. Use for problems requiring step-by-step reasoning.
4. Most cost-effective Gemini model with thinking.
5. Supports OpenAI SDK format for easy migration.
6. Great for high-volume applications.
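
The first two limits in the list interact: because the context window covers input and output together, a very long prompt leaves less room for output than the 65,536-token ceiling. A small sketch of that arithmetic follows; `max_tokens_budget` is a hypothetical helper, and the input token count is assumed to come from whatever tokenizer or usage data is available, since this page does not describe one.

```python
CONTEXT_WINDOW = 1_048_576   # input + output tokens combined
MAX_OUTPUT_TOKENS = 65_536   # ceiling for max_tokens


def max_tokens_budget(input_tokens: int) -> int:
    """Return the largest max_tokens value that fits alongside the given input."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - input_tokens
    if remaining <= 0:
        raise ValueError("input alone fills or exceeds the context window")
    return min(MAX_OUTPUT_TOKENS, remaining)


if __name__ == "__main__":
    print(max_tokens_budget(1_000))      # short prompt: full output ceiling
    print(max_tokens_budget(1_000_000))  # long prompt: limited by the window
```

Under these limits, the output ceiling only starts to shrink once the input exceeds 983,040 tokens (1,048,576 minus 65,536).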

Use Cases

Complex math problems
Multi-step reasoning
Scientific analysis
Debugging complex code
High-volume applications