LLM API
Use conversational AI models from OpenAI, Anthropic, and Google with a single API key. 100% compatible with the existing OpenAI SDK.
Model details
See each model's detailed parameters, example code, and usage tips.
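The "one key, three providers" idea can be sketched as a small request builder. The gateway paths below are taken from the curl examples in this document; the helper name and structure are illustrative, not part of the gateway API.

```python
# Sketch: one gateway key ("cdt_...") routes to all three providers.
# Paths mirror the curl examples in this document.
BASE = "https://api.core.today/llm"

ENDPOINTS = {
    "openai": BASE + "/openai/v1/chat/completions",
    "anthropic": BASE + "/anthropic/v1/messages",
    "gemini": BASE + "/gemini/v1beta/models/{model}:generateContent",
}

def build_request(provider: str, api_key: str, model: str = "") -> tuple[str, dict]:
    """Return (url, headers) for the given provider using one gateway key."""
    url = ENDPOINTS[provider].format(model=model)
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return url, headers
```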
GPT-4o
OpenAI
OpenAI's flagship multimodal model. Industry-leading performance in reasoning, coding, and creative tasks with native vision capabilities and structured output support.
GPT-4.1
OpenAI
OpenAI's most capable model for coding and instruction following. Features a 1M token context window, 32K output tokens, and major improvements in coding, complex prompts, and long-context tasks. 20% cheaper than GPT-4o on output.
GPT-4.1 Mini
OpenAI
A significant leap in small model performance. Matches or exceeds GPT-4o in intelligence while reducing latency by nearly half and cost by 83%. Ideal balance of speed, quality, and affordability.
GPT-4.1 Nano
OpenAI
OpenAI's fastest and cheapest model. Optimized for classification, autocompletion, and low-latency tasks. Ultra-affordable at $0.10/1M input tokens.
GPT-4o Mini
OpenAI
Cost-effective, fast model with strong performance. Best for high-volume tasks where speed and cost matter more than absolute capability.
OpenAI o1
OpenAI
OpenAI's most advanced reasoning model. Uses extended thinking time to solve complex problems in science, coding, and math with exceptional accuracy.
OpenAI o4-mini
OpenAI
Fast, cost-effective reasoning model optimized for coding and STEM tasks. Provides strong reasoning at a fraction of the cost of larger reasoning models.
OpenAI o3-mini
OpenAI
Efficient reasoning model that delivers strong performance at lower cost. Ideal for tasks requiring reasoning without the overhead of larger models.
GPT-5
OpenAI
OpenAI's latest flagship model. Delivers exceptional performance across reasoning, coding, and creative tasks with a massive 1M token context window and 32K output tokens. Supports vision, function calling, and JSON mode.
GPT-5.2
OpenAI
OpenAI's latest and most advanced GPT model. Delivers state-of-the-art performance across reasoning, coding, and creative tasks with enhanced capabilities.
GPT-5 Mini
OpenAI
Fast and efficient variant of GPT-5. Delivers strong performance across reasoning, coding, and creative tasks with a 1M token context window and 32K output tokens, at a fraction of the cost of GPT-5.
GPT-5 Nano
OpenAI
Ultra-fast and lightweight variant of GPT-5. Designed for high-throughput, low-latency applications with a 1M token context window and 32K output tokens at minimal cost.
GPT Audio Mini
OpenAI
Lightweight multimodal model with native audio input/output capabilities. Optimized for voice-based interactions and audio processing tasks.
Claude Opus 4.6
Anthropic
Anthropic's most capable model. Delivers breakthrough performance in reasoning, coding, and complex analysis with enhanced safety and instruction following.
Claude Sonnet 4.5
Anthropic
Anthropic's most intelligent and capable Sonnet model. Best-in-class for complex reasoning, nuanced understanding, and coding tasks with exceptional instruction following.
Claude Opus 4.5
Anthropic
Anthropic's most powerful model for highly complex tasks. Exceptional at research, analysis, and creative projects requiring deep expertise.
Claude Haiku 4.5
Anthropic
Fast, cost-effective model for everyday tasks. Great balance of speed, intelligence, and cost for high-volume applications.
Claude Sonnet 4
Anthropic
Balanced Sonnet 4 model offering strong reasoning and coding abilities at an efficient price point. Ideal for everyday production workloads that need a good mix of speed and intelligence.
Claude 3.7 Sonnet
Anthropic
Enhanced Claude 3.7 Sonnet with improved reasoning and coding capabilities. A strong mid-tier model offering reliable performance across a wide range of tasks.
GPT-5.4
OpenAI
OpenAI's newest flagship model with 1M context window and 128K output tokens. Delivers top-tier reasoning across all domains with adjustable reasoning effort levels from none to xhigh.
GPT-5.4 Mini
OpenAI
Fast and cost-efficient variant of GPT-5.4 with 400K context window and 128K output tokens. Excellent balance of performance and affordability for everyday tasks.
GPT-5.4 Nano
OpenAI
Ultra-lightweight and fastest GPT-5.4 variant with 400K context and 128K output. Designed for high-throughput, low-latency applications at minimal cost. Supports MCP for tool integration.
GPT-5.1 (2025-11-13)
OpenAI
Dated snapshot of GPT-5.1 for reproducible results. Supports cached input tokens for cost savings on repeated context. Ideal for production deployments requiring model version pinning.
Gemini 3 Pro Preview
Google
Google's most powerful Gemini model in preview. Features breakthrough reasoning, coding, and multimodal capabilities with the largest context window.
Gemini 2.0 Flash Lite
Google
Ultra-lightweight version of Gemini 2.0 Flash optimized for maximum speed and minimal cost. Perfect for high-volume, latency-sensitive applications.
Gemini Embedding 001
Google
Google's text embedding model for generating vector representations. Optimized for semantic search, clustering, and similarity tasks.
Gemini 2.0 Flash
Google
Google's fastest and most capable model. Features a massive 1M token context window, native multimodal support, and real-time capabilities.
Gemini 2.5 Flash
Google
Google's fast and efficient model with built-in thinking capabilities. Great balance of speed, reasoning, and cost for high-volume applications.
Gemini 2.5 Pro
Google
Google's most capable model with state-of-the-art reasoning and 1M token context. Excels at complex coding, math, and multi-document analysis.
Gemini 3 Flash
Google
Google's most advanced reasoning model with state-of-the-art multimodal understanding, PhD-level reasoning, and leading coding performance.
Gemini 3.1 Pro Preview
Google
Google's latest and most capable Gemini model in preview. Features dynamic pricing that adjusts based on context length, with enhanced pricing for inputs over 200K tokens.
Gemini 3.1 Flash Image Preview
Google
Gemini 3.1 Flash with native image generation capabilities. Can generate images directly in chat responses alongside text. Features separate pricing for text and image output tokens.
Gemini 3.1 Flash Lite Preview
Google
Ultra-lightweight variant of Gemini 3.1 Flash. The most cost-effective Gemini model with support for cached input and audio input. Ideal for high-throughput, budget-conscious applications.
Gemini 3.1 Flash Live Preview
Google
Gemini 3.1 Flash optimized for real-time interactions and live streaming scenarios. Features low-latency responses with audio input support at dedicated pricing.
Gemini 3 Pro Image Preview
Google
Google's premium image generation model within the Gemini 3 Pro family. Generates high-quality images directly in chat with the highest fidelity among Gemini image models. Image output tokens are priced at 10x text output tokens.
Pricing unit: credits per token. Example: 1,000 input tokens and 500 output tokens on gpt-4o-mini cost 0.3 + 0.6 = 0.9 credits.
GPT-5 / GPT-5.2 / O-Series caveats
Reasoning models such as GPT-5, GPT-5.2, o1, and o3 take different parameters from standard models:
- Use max_completion_tokens instead of max_tokens
- temperature and top_p are not supported
- New parameter: reasoning_effort (minimal/low/medium/high)
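The rules above can be sketched as a small payload adapter. The helper name and the model-prefix list are illustrative assumptions, not part of the gateway API.

```python
# Sketch: adapt a standard Chat Completions payload for reasoning models
# (gpt-5*, o1, o3, o4 variants) per the rules above.
REASONING_PREFIXES = ("gpt-5", "o1", "o3", "o4")

def adapt_for_reasoning(payload: dict) -> dict:
    """Rename max_tokens and drop sampling params unsupported by reasoning models."""
    p = dict(payload)
    if not p["model"].startswith(REASONING_PREFIXES):
        return p  # standard models pass through unchanged
    if "max_tokens" in p:
        p["max_completion_tokens"] = p.pop("max_tokens")
    p.pop("temperature", None)  # not supported on reasoning models
    p.pop("top_p", None)
    p.setdefault("reasoning_effort", "medium")
    return p
```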
OpenAI (GPT)
GPT-4o / GPT-4.1 (standard models)
curl -X POST https://api.core.today/llm/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 1000,
"temperature": 0.7
}'
GPT-5 / O-Series (reasoning models)
curl -X POST https://api.core.today/llm/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "gpt-5",
"messages": [{"role": "user", "content": "Explain quantum computing"}],
"max_completion_tokens": 16000,
"reasoning_effort": "medium"
}'
GPT-5-specific parameters
- max_completion_tokens - maximum output tokens (use instead of max_tokens)
- reasoning_effort - reasoning level: minimal, low, medium, or high
Codex models (code-specialized)
Codex models require the Responses API
Codex models such as gpt-5.1-codex and gpt-5.1-codex-mini do not support /v1/chat/completions; use the /v1/responses endpoint instead.
The gateway proxies paths as-is, so you only need to change the endpoint path on the client: /llm/openai/v1/responses → OpenAI /v1/responses
# Codex models: use the /v1/responses endpoint
curl -X POST https://api.core.today/llm/openai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "gpt-5.1-codex",
"instructions": "You are a helpful coding assistant.",
"input": "Write a Python function to merge two sorted lists",
"max_output_tokens": 16000
}'
- messages → input (a string or an array of messages)
- System prompt: use the instructions parameter
- Output token limit: use max_output_tokens
- Streaming event format: response.output_text.delta
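The mapping above can be sketched as a converter from a Chat Completions-style request to a Responses API payload. The function name is illustrative, not part of any SDK.

```python
# Sketch: convert a Chat Completions-style request into a Responses API
# payload for Codex models, following the mapping above.
def to_responses_payload(chat: dict) -> dict:
    system = [m["content"] for m in chat["messages"] if m["role"] == "system"]
    out = {
        "model": chat["model"],
        # non-system messages become the "input" array
        "input": [m for m in chat["messages"] if m["role"] != "system"],
    }
    if system:
        out["instructions"] = "\n".join(system)  # system prompt -> instructions
    if "max_tokens" in chat:
        out["max_output_tokens"] = chat["max_tokens"]  # output-token limit rename
    return out
```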
Applies to any model whose name contains "codex", e.g. gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.2-codex.
Anthropic (Claude)
curl -X POST https://api.core.today/llm/anthropic/v1/messages \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "claude-sonnet-4",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "Explain quantum computing simply."}
]
}'
Google (Gemini)
curl -X POST "https://api.core.today/llm/gemini/v1beta/models/gemini-2.5-pro:generateContent" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"contents": [
{
"parts": [{"text": "Write a haiku about programming"}]
}
]
}'
| Model | Input | Output |
|---|---|---|
| gemini-embedding-001 (embedding only) | 0.0003 | N/A |
| gemini-3-pro-preview-longcontext | 0.0080 | 0.0360 |
Streaming responses
To receive responses in real time, add stream: true:
curl -X POST https://api.core.today/llm/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "gpt-5",
"messages": [{"role": "user", "content": "Tell me a long story"}],
"stream": true
}'
Cost calculation example
Based on 1,000 input tokens and 500 output tokens:
| Model | Calculation | Total cost |
|---|---|---|
| gpt-4o-mini | 0.3 + 0.6 | 0.9 credits |
| gpt-5 | 2.5 + 10.0 | 12.5 credits |
| claude-3-haiku | 0.5 + 1.25 | 1.75 credits |
| claude-sonnet-4 | 6.0 + 15.0 | 21.0 credits |
| gemini-2.0-flash | 0.2 + 0.4 | 0.6 credits |
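The arithmetic in the table above can be reproduced with a small helper. The per-token rates below are derived from the example rows (credits for 1,000 input / 500 output tokens) and are illustrative only; check the pricing tables for current rates.

```python
# Sketch: reproduce the cost table above.
# Rates are (input credits per token, output credits per token),
# back-computed from the example rows; illustrative, not authoritative.
RATES = {
    "gpt-4o-mini": (0.0003, 0.0012),
    "gpt-5": (0.0025, 0.0200),
    "claude-3-haiku": (0.0005, 0.0025),
    "claude-sonnet-4": (0.0060, 0.0300),
    "gemini-2.0-flash": (0.0002, 0.0008),
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total credits for a request, rounded to 4 decimal places."""
    rate_in, rate_out = RATES[model]
    return round(input_tokens * rate_in + output_tokens * rate_out, 4)
```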