Core.Today
Google | Fast | High

Gemini 3.1 Flash Live Preview

Gemini 3.1 Flash, optimized for real-time interactions and live streaming scenarios. It offers low-latency responses and accepts audio input, which is billed at a separate rate.

300 credits per request
Live API support (real-time bidirectional)
Low-latency streaming responses
131,072 token context window
65,536 max output tokens
Multimodal input: text, image, audio, video
Output modalities: text + audio
Function calling, thinking, audio generation, search grounding

Model Specifications

Context Window: 131K tokens (131,072)
Max Output: 66K tokens (65,536)
Training Cutoff: January 2025
Compatible SDK: OpenAI, Google AI

Capabilities

Vision
Function Calling
Streaming
JSON Mode
System Prompt

Token Pricing (per 1M tokens)

Token Type | Credits | USD Equivalent
Input Tokens | 750 | $0.75
Output Tokens | 4,500 | $4.50

* 1 credit ≈ $0.001 (actual charges may vary based on usage)
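
As a rough illustration of the pricing arithmetic above, the following Python sketch converts token counts into credits and an approximate USD figure. The two rates are taken from the pricing table; the helper name `estimate_cost` is hypothetical, and the sketch treats 1 credit as exactly $0.001 even though the footnote says this is only approximate. It covers the text token rates only and does not model any per-request or audio pricing.

```python
# Rates from the pricing table (credits per 1M tokens).
INPUT_CREDITS_PER_M = 750
OUTPUT_CREDITS_PER_M = 4_500
USD_PER_CREDIT = 0.001  # approximate per the footnote; treated as exact here


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (credits, approximate USD) for one request. Hypothetical helper."""
    credits = (
        input_tokens * INPUT_CREDITS_PER_M + output_tokens * OUTPUT_CREDITS_PER_M
    ) / 1_000_000
    return credits, credits * USD_PER_CREDIT


credits, usd = estimate_cost(input_tokens=2_000, output_tokens=1_000)
print(f"{credits:.2f} credits, about ${usd:.4f}")
```

Because output tokens cost six times as much as input tokens, capping `max_tokens` is usually the most direct way to bound the cost of a request.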

Quick Start

curl -X POST "https://api.core.today/llm/gemini/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gemini-3.1-flash-live-preview",
  "messages": [
    {
      "role": "system",
      "content": "You are a real-time assistant. Respond quickly and concisely."
    },
    {
      "role": "user",
      "content": "What are the key differences between HTTP/2 and HTTP/3?"
    }
  ],
  "max_tokens": 1000,
  "stream": true
}'
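
For readers who prefer Python over curl, here is a minimal sketch that builds the same request using only the standard library. The URL, headers, model name, and body mirror the curl command above; the function name `build_request` is an assumption for illustration, and the sketch only constructs the request without sending it.

```python
import json
import urllib.request

API_URL = "https://api.core.today/llm/gemini/v1beta/openai/chat/completions"


def build_request(api_key: str, user_message: str, max_tokens: int = 1000) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl command."""
    payload = {
        "model": "gemini-3.1-flash-live-preview",
        "messages": [
            {
                "role": "system",
                "content": "You are a real-time assistant. Respond quickly and concisely.",
            },
            {"role": "user", "content": user_message},
        ],
        "max_tokens": max_tokens,
        "stream": True,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_request(
    "cdt_your_api_key",
    "What are the key differences between HTTP/2 and HTTP/3?",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. With `"stream": true`, the response body should then be read incrementally rather than all at once.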

Parameters

Parameter | Type | Required | Default | Description
messages | array | Yes | - | Array of message objects (OpenAI format)
temperature | float | No | 1 | Sampling temperature (0-2)
top_p | float | No | 0.95 | Nucleus sampling parameter
max_tokens | integer | No | - | Maximum output tokens. Max: 65,536. Context window (input + output): 131,072 tokens.
stream | boolean | No | true | Enable Server-Sent Events streaming (recommended for live use)
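
The limits in the parameter table can be checked on the client before a request is sent. The sketch below is one possible way to do that; `validate_params` is a hypothetical helper, not part of any SDK, and the rule that `messages` must be non-empty is an assumption (the table only says it is required). No range is checked for `top_p` because the table does not state one.

```python
MAX_OUTPUT_TOKENS = 65_536  # from the max_tokens row of the table


def validate_params(params: dict) -> list[str]:
    """Return a list of problems found in the request parameters."""
    errors = []

    messages = params.get("messages")
    if not isinstance(messages, list) or not messages:
        # Non-empty is an assumption; the table only marks messages as required.
        errors.append("messages is required and must be a non-empty array")

    temperature = params.get("temperature", 1)
    if not isinstance(temperature, (int, float)) or not 0 <= temperature <= 2:
        errors.append("temperature must be a number between 0 and 2")

    max_tokens = params.get("max_tokens")
    if max_tokens is not None and not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
        errors.append(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")

    if not isinstance(params.get("stream", True), bool):
        errors.append("stream must be a boolean")

    return errors


print(validate_params({"messages": [], "max_tokens": 70_000}))
```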

Examples

Live Chat

Real-time streaming conversation

curl -X POST "https://api.core.today/llm/gemini/v1beta/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gemini-3.1-flash-live-preview",
  "messages": [
    {
      "role": "system",
      "content": "You are a real-time assistant. Respond quickly and concisely."
    },
    {
      "role": "user",
      "content": "What are the key differences between HTTP/2 and HTTP/3?"
    }
  ],
  "max_tokens": 1000,
  "stream": true
}'
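
With `"stream": true`, the response arrives as Server-Sent Events. The sketch below shows one way to pull the text out of such a stream. It assumes the endpoint emits OpenAI-style chunks (`data: {...}` lines with `choices[0].delta.content`, ending in `data: [DONE]`), which this page does not confirm; the helper name `iter_text_deltas` and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed chunk format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines standing in for a real response stream.
sample = [
    'data: {"choices": [{"delta": {"content": "HTTP/3 runs "}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "over QUIC."}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_text_deltas(sample)))
```

In a live client, `lines` would be the decoded lines of the HTTP response body, and each fragment could be displayed as soon as it is yielded.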

Tips & Best Practices

1. Max output tokens: 65,536. Set max_tokens up to this limit
2. Context window: 131,072 tokens, smaller than non-Live Flash variants
3. Enable streaming for the best real-time experience
4. Audio input tokens are priced at $3.00/M separately
5. Output supports both text and audio modalities
6. Ideal for live interactions requiring low latency
7. Use for real-time voice assistants and customer support
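
The first two tips interact: the context window covers input and output together, so a long prompt reduces the output budget. A minimal sketch of that arithmetic follows; `max_tokens_budget` is a hypothetical helper, and the input token count is assumed to come from whatever tokenizer or usage report is available.

```python
CONTEXT_WINDOW = 131_072    # input + output, per the tips above
MAX_OUTPUT_TOKENS = 65_536  # hard cap on max_tokens


def max_tokens_budget(input_tokens: int) -> int:
    """Largest max_tokens value that still fits in the context window."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - input_tokens
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


print(max_tokens_budget(1_000))    # short prompt: the output cap applies
print(max_tokens_budget(100_000))  # long prompt: the context window applies
```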

Use Cases

Live customer support interactions
Real-time voice-based assistants
Interactive streaming applications
Live transcription and analysis
Real-time content moderation

Model Info

Provider: Google
Version: 3.1-preview
Category: LLM
Price: 300 credits

API Endpoint

POST /llm/gemini/v1beta/openai/chat/completions