LLM API
Use conversational AI models from OpenAI, Anthropic, and Google with a single API key. 100% compatible with the existing OpenAI SDK.
Model details
See each model's detailed parameters, example code, and usage tips.
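The "one key, three providers" idea can be sketched as a small request builder. The gateway paths below are taken from the curl examples in this document; the helper name and structure are illustrative, not part of the gateway API.

```python
# Sketch: one gateway key ("cdt_...") routes to all three providers.
# Paths mirror the curl examples in this document.
BASE = "https://api.core.today/llm"

ENDPOINTS = {
    "openai": BASE + "/openai/v1/chat/completions",
    "anthropic": BASE + "/anthropic/v1/messages",
    "gemini": BASE + "/gemini/v1beta/models/{model}:generateContent",
}

def build_request(provider: str, api_key: str, model: str = "") -> tuple[str, dict]:
    """Return (url, headers) for the given provider using one gateway key."""
    url = ENDPOINTS[provider].format(model=model)
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return url, headers
```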
GPT-4o
OpenAI
OpenAI's flagship multimodal model. Industry-leading performance in reasoning, coding, and creative tasks with native vision capabilities and structured output support.
GPT-4.1
OpenAI
OpenAI's most capable model for coding and instruction following. Features a 1M token context window, 32K output tokens, and major improvements in coding, complex prompts, and long-context tasks. 20% cheaper than GPT-4o on output.
GPT-4.1 Mini
OpenAI
A significant leap in small model performance. Matches or exceeds GPT-4o in intelligence while reducing latency by nearly half and cost by 83%. Ideal balance of speed, quality, and affordability.
GPT-4.1 Nano
OpenAI
OpenAI's fastest and cheapest model. Optimized for classification, autocompletion, and low-latency tasks. Ultra-affordable at $0.10/1M input tokens.
GPT-4o Mini
OpenAI
Cost-effective, fast model with strong performance. Best for high-volume tasks where speed and cost matter more than absolute capability.
OpenAI o1
OpenAI
OpenAI's most advanced reasoning model. Uses extended thinking time to solve complex problems in science, coding, and math with exceptional accuracy.
OpenAI o4-mini
OpenAI
Fast, cost-effective reasoning model optimized for coding and STEM tasks. Provides strong reasoning at a fraction of the cost of larger reasoning models.
OpenAI o3-mini
OpenAI
Efficient reasoning model that delivers strong performance at lower cost. Ideal for tasks requiring reasoning without the overhead of larger models.
GPT-5
OpenAI
OpenAI's latest flagship model. Delivers exceptional performance across reasoning, coding, and creative tasks with a massive 1M token context window and 32K output tokens. Supports vision, function calling, and JSON mode.
GPT-5.2
OpenAI
OpenAI's latest and most advanced GPT model. Delivers state-of-the-art performance across reasoning, coding, and creative tasks with enhanced capabilities.
GPT-5 Mini
OpenAI
Fast and efficient variant of GPT-5. Delivers strong performance across reasoning, coding, and creative tasks with a 1M token context window and 32K output tokens, at a fraction of the cost of GPT-5.
GPT-5 Nano
OpenAI
Ultra-fast and lightweight variant of GPT-5. Designed for high-throughput, low-latency applications with a 1M token context window and 32K output tokens at minimal cost.
GPT Audio Mini
OpenAI
Lightweight multimodal model with native audio input/output capabilities. Optimized for voice-based interactions and audio processing tasks.
Claude Opus 4.6
Anthropic
Anthropic's most capable model. Delivers breakthrough performance in reasoning, coding, and complex analysis with enhanced safety and instruction following.
Claude Sonnet 4.5
Anthropic
Anthropic's most intelligent and capable Sonnet model. Best-in-class for complex reasoning, nuanced understanding, and coding tasks with exceptional instruction following.
Claude Opus 4.5
Anthropic
Anthropic's most powerful model for highly complex tasks. Exceptional at research, analysis, and creative projects requiring deep expertise.
Claude Haiku 4.5
Anthropic
Fast, cost-effective model for everyday tasks. Great balance of speed, intelligence, and cost for high-volume applications.
Claude Sonnet 4
Anthropic
Balanced Sonnet 4 model offering strong reasoning and coding abilities at an efficient price point. Ideal for everyday production workloads that need a good mix of speed and intelligence.
Claude 3.7 Sonnet
Anthropic
Enhanced Claude 3.7 Sonnet with improved reasoning and coding capabilities. A strong mid-tier model offering reliable performance across a wide range of tasks.
GPT-5.4
OpenAI
OpenAI's newest flagship model with 1M context window and 128K output tokens. Delivers top-tier reasoning across all domains with adjustable reasoning effort levels from none to xhigh.
GPT-5.4 Mini
OpenAI
Fast and cost-efficient variant of GPT-5.4 with 400K context window and 128K output tokens. Excellent balance of performance and affordability for everyday tasks.
GPT-5.4 Nano
OpenAI
Ultra-lightweight and fastest GPT-5.4 variant with 400K context and 128K output. Designed for high-throughput, low-latency applications at minimal cost. Supports MCP for tool integration.
GPT-5.1 (2025-11-13)
OpenAI
Dated snapshot of GPT-5.1 for reproducible results. Supports cached input tokens for cost savings on repeated context. Ideal for production deployments requiring model version pinning.
Gemini 3 Pro Preview
Google
Google's most powerful Gemini model in preview. Features breakthrough reasoning, coding, and multimodal capabilities with the largest context window.
Gemini 2.0 Flash Lite
Google
Ultra-lightweight version of Gemini 2.0 Flash optimized for maximum speed and minimal cost. Perfect for high-volume, latency-sensitive applications.
Gemini Embedding 001
Google
Google's text embedding model for generating vector representations. Optimized for semantic search, clustering, and similarity tasks.
Gemini 2.0 Flash
Google
Google's fastest and most capable model. Features a massive 1M token context window, native multimodal support, and real-time capabilities.
Gemini 2.5 Flash
Google
Google's fast and efficient model with built-in thinking capabilities. Great balance of speed, reasoning, and cost for high-volume applications.
Gemini 2.5 Pro
Google
Google's most capable model with state-of-the-art reasoning and 1M token context. Excels at complex coding, math, and multi-document analysis.
Gemini 3 Flash
Google
Google's most advanced reasoning model with state-of-the-art multimodal understanding, PhD-level reasoning, and leading coding performance.
Gemini 3.1 Pro Preview
Google
Google's latest and most capable Gemini model in preview. Features dynamic pricing that adjusts based on context length, with enhanced pricing for inputs over 200K tokens.
Gemini 3.1 Flash Image Preview
Google
Gemini 3.1 Flash with native image generation capabilities. Can generate images directly in chat responses alongside text. Features separate pricing for text and image output tokens.
Gemini 3.1 Flash Lite Preview
Google
Ultra-lightweight variant of Gemini 3.1 Flash. The most cost-effective Gemini model with support for cached input and audio input. Ideal for high-throughput, budget-conscious applications.
Gemini 3.1 Flash Live Preview
Google
Gemini 3.1 Flash optimized for real-time interactions and live streaming scenarios. Features low-latency responses with audio input support at dedicated pricing.
Gemini 3 Pro Image Preview
Google
Google's premium image generation model within the Gemini 3 Pro family. Generates high-quality images directly in chat with the highest fidelity among Gemini image models. Image output tokens are priced at 10x text output tokens.
Pricing unit: credits per token. Example: 1,000 input tokens and 500 output tokens on gpt-4o-mini cost 0.3 + 0.6 = 0.9 credits.
GPT-5 / GPT-5.2 / O-Series caveats
Reasoning models such as GPT-5, GPT-5.2, o1, and o3 take different parameters from standard models:
- Use max_completion_tokens instead of max_tokens
- temperature and top_p are not supported
- New parameter: reasoning_effort (minimal/low/medium/high)
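The rules above can be sketched as a small payload adapter. The helper name and the model-prefix list are illustrative assumptions, not part of the gateway API.

```python
# Sketch: adapt a standard Chat Completions payload for reasoning models
# (gpt-5*, o1, o3, o4 variants) per the rules above.
REASONING_PREFIXES = ("gpt-5", "o1", "o3", "o4")

def adapt_for_reasoning(payload: dict) -> dict:
    """Rename max_tokens and drop sampling params unsupported by reasoning models."""
    p = dict(payload)
    if not p["model"].startswith(REASONING_PREFIXES):
        return p  # standard models pass through unchanged
    if "max_tokens" in p:
        p["max_completion_tokens"] = p.pop("max_tokens")
    p.pop("temperature", None)  # not supported on reasoning models
    p.pop("top_p", None)
    p.setdefault("reasoning_effort", "medium")
    return p
```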
OpenAI (GPT)
GPT-4o / GPT-4.1 (standard models)
curl -X POST https://api.core.today/llm/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 1000,
"temperature": 0.7
}'
GPT-5 / O-Series (reasoning models)
curl -X POST https://api.core.today/llm/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "gpt-5",
"messages": [{"role": "user", "content": "Explain quantum computing"}],
"max_completion_tokens": 16000,
"reasoning_effort": "medium"
}'
GPT-5-specific parameters
- max_completion_tokens - maximum output tokens (use instead of max_tokens)
- reasoning_effort - reasoning level: minimal, low, medium, or high
Codex models (code-specialized)
Codex models require the Responses API
Codex models such as gpt-5.1-codex and gpt-5.1-codex-mini do not support /v1/chat/completions; use the /v1/responses endpoint instead.
The gateway proxies paths as-is, so you only need to change the endpoint path on the client: /llm/openai/v1/responses → OpenAI /v1/responses
# Codex models: use the /v1/responses endpoint
curl -X POST https://api.core.today/llm/openai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "gpt-5.1-codex",
"instructions": "You are a helpful coding assistant.",
"input": "Write a Python function to merge two sorted lists",
"max_output_tokens": 16000
}'
- messages → input (a string or an array of messages)
- System prompt: use the instructions parameter
- Output token limit: use max_output_tokens
- Streaming event format: response.output_text.delta
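The mapping above can be sketched as a converter from a Chat Completions-style request to a Responses API payload. The function name is illustrative, not part of any SDK.

```python
# Sketch: convert a Chat Completions-style request into a Responses API
# payload for Codex models, following the mapping above.
def to_responses_payload(chat: dict) -> dict:
    system = [m["content"] for m in chat["messages"] if m["role"] == "system"]
    out = {
        "model": chat["model"],
        # non-system messages become the "input" array
        "input": [m for m in chat["messages"] if m["role"] != "system"],
    }
    if system:
        out["instructions"] = "\n".join(system)  # system prompt -> instructions
    if "max_tokens" in chat:
        out["max_output_tokens"] = chat["max_tokens"]  # output-token limit rename
    return out
```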
Applies to any model whose name contains "codex", e.g. gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.2-codex.
Anthropic (Claude)
curl -X POST https://api.core.today/llm/anthropic/v1/messages \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "claude-sonnet-4",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "Explain quantum computing simply."}
]
}'
Google (Gemini)
curl -X POST "https://api.core.today/llm/gemini/v1beta/models/gemini-2.5-pro:generateContent" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"contents": [
{
"parts": [{"text": "Write a haiku about programming"}]
}
]
}'
| Model | Input | Output |
|---|---|---|
| gemini-embedding-001 (embedding only) | 0.0003 | N/A |
| gemini-3-pro-preview-longcontext | 0.0080 | 0.0360 |
Streaming responses
To receive responses in real time, add stream: true:
curl -X POST https://api.core.today/llm/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer cdt_your_api_key" \
-d '{
"model": "gpt-5",
"messages": [{"role": "user", "content": "Tell me a long story"}],
"stream": true
}'
Cost calculation example
Based on 1,000 input tokens and 500 output tokens:
| Model | Calculation | Total cost |
|---|---|---|
| gpt-4o-mini | 0.3 + 0.6 | 0.9 credits |
| gpt-5 | 2.5 + 10.0 | 12.5 credits |
| claude-3-haiku | 0.5 + 1.25 | 1.75 credits |
| claude-sonnet-4 | 6.0 + 15.0 | 21.0 credits |
| gemini-2.0-flash | 0.2 + 0.4 | 0.6 credits |
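The arithmetic in the table above can be reproduced with a small helper. The per-token rates below are derived from the example rows (credits for 1,000 input / 500 output tokens) and are illustrative only; check the pricing tables for current rates.

```python
# Sketch: reproduce the cost table above.
# Rates are (input credits per token, output credits per token),
# back-computed from the example rows; illustrative, not authoritative.
RATES = {
    "gpt-4o-mini": (0.0003, 0.0012),
    "gpt-5": (0.0025, 0.0200),
    "claude-3-haiku": (0.0005, 0.0025),
    "claude-sonnet-4": (0.0060, 0.0300),
    "gemini-2.0-flash": (0.0002, 0.0008),
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total credits for a request, rounded to 4 decimal places."""
    rate_in, rate_out = RATES[model]
    return round(input_tokens * rate_in + output_tokens * rate_out, 4)
```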