Core.Today
Google · Fast · Standard

Gemini 2.0 Flash Lite

Ultra-lightweight version of Gemini 2.0 Flash optimized for maximum speed and minimal cost. Perfect for high-volume, latency-sensitive applications.

0.5 credits per 1K tokens (avg)
Ultra-fast inference
Minimal cost per request
1,048,576 token context window
8,192 max output tokens
Multimodal input: audio, image, video, text
Function calling, structured outputs, caching, Batch API
Deprecated: migrate to Gemini 3.1 Flash-Lite

Model Specifications

Context Window: 1.0M tokens
Max Output: 8K tokens
Training Cutoff: August 2024
Compatible SDK: OpenAI, Google AI

Capabilities

Vision
Function Calling
Streaming
JSON Mode
System Prompt

Token Pricing (per 1M tokens)

| Token Type | Credits | USD Equivalent |
| --- | --- | --- |
| Input Tokens | 150 | $0.15 |
| Output Tokens | 600 | $0.60 |

* 1 credit ≈ $0.001 (actual charges may vary based on usage)
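
As a rough illustration of how the per-1M-token rates translate into the cost of a single request, here is a minimal sketch. The rates come from the pricing table above; the function name `estimate_cost` and the constants are hypothetical helpers, not part of any Core.Today SDK, and the USD figure is only approximate because the credit-to-dollar rate is approximate.

```python
# Rates copied from the pricing table (credits per 1M tokens).
INPUT_CREDITS_PER_MILLION = 150
OUTPUT_CREDITS_PER_MILLION = 600
USD_PER_CREDIT = 0.001  # approximate, per the pricing footnote


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (credits, approximate USD) for one request.

    Hypothetical helper: actual charges may vary based on usage.
    """
    credits = (
        input_tokens * INPUT_CREDITS_PER_MILLION
        + output_tokens * OUTPUT_CREDITS_PER_MILLION
    ) / 1_000_000
    return credits, credits * USD_PER_CREDIT


if __name__ == "__main__":
    credits, usd = estimate_cost(input_tokens=2_000, output_tokens=10)
    print(f"{credits:.3f} credits (about ${usd:.6f})")
```

Because output tokens cost four times as much as input tokens, capping `max_tokens` tightly for short-answer tasks such as classification keeps the per-request cost close to the input cost alone.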

Quick Start

curl -X POST "https://api.core.today/llm/gemini/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gemini-2.0-flash-lite",
  "messages": [
    {
      "role": "system",
      "content": "Classify the sentiment of the text as positive, negative, or neutral. Respond with just the label."
    },
    {
      "role": "user",
      "content": "The product works great but the delivery was slow."
    }
  ],
  "max_tokens": 10,
  "temperature": 0
}'
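
The same request can be sent from Python using only the standard library. This is a sketch under assumptions: the endpoint, headers, and payload mirror the curl command above, but the response shape (`choices[0].message.content`) is assumed from the OpenAI-compatible path and is not confirmed by this page. `build_request` and `classify` are illustrative names.

```python
import json
import urllib.request

API_URL = "https://api.core.today/llm/gemini/v1beta/openai/chat/completions"

SYSTEM_PROMPT = (
    "Classify the sentiment of the text as positive, negative, or neutral. "
    "Respond with just the label."
)


def build_request(api_key: str, text: str) -> urllib.request.Request:
    """Build the POST request; kept separate so it can be inspected offline."""
    payload = {
        "model": "gemini-2.0-flash-lite",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        "max_tokens": 10,
        "temperature": 0,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def classify(api_key: str, text: str) -> str:
    """Send the request and return the model's reply text."""
    with urllib.request.urlopen(build_request(api_key, text), timeout=30) as resp:
        body = json.load(resp)
    # Assumption: the response follows the OpenAI chat-completions shape.
    return body["choices"][0]["message"]["content"].strip()


if __name__ == "__main__":
    # Replace the placeholder key before running; this makes a real API call.
    print(classify("cdt_your_api_key", "The product works great but the delivery was slow."))
```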

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| messages | array | Yes | - | Array of message objects (OpenAI format) |
| temperature | float | No | 1 | Sampling temperature (0-2) |
| max_tokens | integer | No | - | Maximum output tokens. Max: 8,192. Context window (input + output): 1,048,576 tokens. |
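
The limits in the table can be checked on the client before a request is sent. The sketch below is a hypothetical helper (`validate_params` is not part of any SDK) that enforces the documented ranges; the requirement that `messages` be non-empty is an added assumption, and the API itself may validate differently.

```python
MAX_OUTPUT_TOKENS = 8_192  # documented maximum for max_tokens


def validate_params(messages, temperature=1.0, max_tokens=None) -> dict:
    """Return a request-parameter dict, or raise ValueError on a bad value."""
    # Assumption: an empty messages array is treated as invalid.
    if not isinstance(messages, list) or not messages:
        raise ValueError("messages must be a non-empty list of message objects")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens is not None and not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")

    params = {"messages": messages, "temperature": temperature}
    if max_tokens is not None:
        # Only include max_tokens when the caller set it (no documented default).
        params["max_tokens"] = max_tokens
    return params


if __name__ == "__main__":
    print(validate_params([{"role": "user", "content": "Hi"}], temperature=0, max_tokens=10))
```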

Examples

Quick Classification

Fast text classification

curl -X POST "https://api.core.today/llm/gemini/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gemini-2.0-flash-lite",
  "messages": [
    {
      "role": "system",
      "content": "Classify the sentiment of the text as positive, negative, or neutral. Respond with just the label."
    },
    {
      "role": "user",
      "content": "The product works great but the delivery was slow."
    }
  ],
  "max_tokens": 10,
  "temperature": 0
}'
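
Even with a prompt that asks for "just the label", a reply may carry stray whitespace, capitalization, or punctuation. A small post-processing step, sketched here with a hypothetical `normalize_label` helper, maps the raw reply onto one of the three expected labels and returns `None` for anything else.

```python
from typing import Optional

# The three labels requested by the system prompt in the example above.
LABELS = ("positive", "negative", "neutral")


def normalize_label(raw: str) -> Optional[str]:
    """Map a raw model reply to a known label, or None if it does not match."""
    cleaned = raw.strip().strip(".!\"'").lower()
    return cleaned if cleaned in LABELS else None


if __name__ == "__main__":
    print(normalize_label(" Positive.\n"))
    print(normalize_label("mixed"))
```

Treating an unmatched reply as `None` rather than guessing lets the caller decide whether to retry or route the item for review.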

Tips & Best Practices

1. Deprecated: migrate to Gemini 3.1 Flash-Lite for better performance
2. Max output tokens: 8,192, much lower than 3.x Flash-Lite (65,536)
3. Context window: 1,048,576 tokens (input + output)
4. Best for simple, high-volume tasks
5. Use temperature 0 for deterministic classification
6. Ideal for real-time applications requiring low latency
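
Since the context window covers input and output together, the room left for the prompt depends on how many output tokens are reserved. The arithmetic can be sketched as follows; `max_input_tokens` is a hypothetical helper, and counting the tokens in an actual prompt is left to whatever tokenizer the caller uses.

```python
CONTEXT_WINDOW = 1_048_576  # input + output tokens
MAX_OUTPUT_TOKENS = 8_192   # documented output ceiling


def max_input_tokens(max_tokens: int = MAX_OUTPUT_TOKENS) -> int:
    """Return how many input tokens fit once max_tokens are reserved for output."""
    if not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")
    return CONTEXT_WINDOW - max_tokens


if __name__ == "__main__":
    print(max_input_tokens())    # full output budget reserved
    print(max_input_tokens(10))  # short classification reply
```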

Use Cases

High-volume chatbots
Real-time classification
Content filtering
Simple data extraction
Batch processing