OpenAI · Fast · High

GPT-4.1 Mini

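A major leap in small-model performance. It matches or exceeds GPT-4o in intelligence while cutting latency nearly in half and cost by 83%, giving a strong balance of speed, quality, and economy.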

1 credit per 1K tokens (average)

1M-token context window

32K max output tokens

GPT-4o-level intelligence at a much lower cost

~50% lower latency than GPT-4o

Native vision (text + image)

Function calling & JSON mode

Model specifications

Context window: 1.0M tokens

Max output: 32K tokens (32,768)

Training data cutoff: 2024-05

Compatible SDK: OpenAI

Supported features

Vision

Function calling

Streaming

JSON mode

System prompt
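
As a rough illustration of how the vision and function-calling features combine in one request, the sketch below builds a request body in Python. It assumes this endpoint accepts the same multi-part `content` and `tools` shapes as the OpenAI Chat Completions API, which the page implies by listing OpenAI as the compatible SDK but does not spell out. The tool `lookup_product`, the helper name, and the image URL are hypothetical.

```python
import json


def build_vision_tool_payload(question: str, image_url: str) -> dict:
    """Build a chat-completions body with a text part, an image part, and one tool."""
    return {
        "model": "gpt-4.1-mini",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {
                "role": "user",
                # Multi-part content: one text part plus one image part.
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
        ],
        # Hypothetical tool, included only to show the schema shape.
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "lookup_product",
                    "description": "Look up a product by its SKU.",
                    "parameters": {
                        "type": "object",
                        "properties": {"sku": {"type": "string"}},
                        "required": ["sku"],
                    },
                },
            }
        ],
        "max_tokens": 300,
    }


payload = build_vision_tool_payload(
    "What product is shown here?", "https://example.com/photo.jpg"
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be passed as the `-d` body of a request to the chat completions endpoint.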

Pricing by token type (per 1M tokens)

Token type	Credits	USD equivalent
Input tokens	400	$0.40
Output tokens	1,600	$1.60
Cached tokens	100	$0.10

* 1 credit ≈ $0.001 (actual charges may vary with usage)
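
To make the pricing table concrete, here is a minimal cost estimator. The rates and the credit-to-dollar conversion are taken from the table and note above. Treating cached tokens as a separate count billed only at the cached rate is an assumption, and the function name is illustrative.

```python
# Rates in credits per 1M tokens, copied from the pricing table.
CREDITS_PER_MILLION = {"input": 400, "output": 1600, "cached": 100}
USD_PER_CREDIT = 0.001  # approximate conversion from the pricing note


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0):
    """Return (credits, usd) for one request.

    Assumption: cached_tokens are counted separately from input_tokens
    and billed only at the cached rate.
    """
    credits = (
        input_tokens * CREDITS_PER_MILLION["input"]
        + output_tokens * CREDITS_PER_MILLION["output"]
        + cached_tokens * CREDITS_PER_MILLION["cached"]
    ) / 1_000_000
    return credits, credits * USD_PER_CREDIT


# 10,000 input tokens and 2,000 output tokens:
# 10,000 * 400 / 1M = 4.0 credits, plus 2,000 * 1,600 / 1M = 3.2 credits.
credits, usd = estimate_cost(10_000, 2_000)
print(f"{credits:.2f} credits (about ${usd:.4f})")
```

Because output tokens cost four times as much as input tokens, capping `max_tokens` usually reduces cost more than trimming the prompt does.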

Quick start

curl -X POST "https://api.core.today/llm/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gpt-4.1-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant. Be concise."
    },
    {
      "role": "user",
      "content": "What are the top 3 design patterns for microservices?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000
}'
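
For readers who prefer Python to curl, the sketch below builds the same request with the standard library only. The URL, headers, and body mirror the curl command above. The environment variable name `CORE_TODAY_API_KEY` and the helper names are assumptions, and the request is sent only when that variable is set.

```python
import json
import os
import urllib.request

API_URL = "https://api.core.today/llm/openai/v1/chat/completions"


def build_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = {
        "model": "gpt-4.1-mini",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant. Be concise."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
        "max_tokens": 1000,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical environment variable name; nothing is sent if it is unset.
    key = os.environ.get("CORE_TODAY_API_KEY")
    if key:
        question = "What are the top 3 design patterns for microservices?"
        print(send(build_request(key, question)))
```

Keeping request construction separate from sending makes the body easy to inspect or unit-test without network access.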

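Parameters

Parameter	Type	Required	Default	Description
`messages`	array	Yes	-	Array of message objects, each with a role and content
`model`	string	Yes	gpt-4.1-mini	Model identifier
`temperature`	float	No	1.0	Sampling temperature (0-2). Lower values are more focused, higher values more creative
`max_tokens`	integer	No	4096	Maximum number of tokens in the response (up to 32768)
`stream`	boolean	No	false	Enables Server-Sent Events streaming
`response_format`	object	No	-	Response format; use { type: 'json_object' } for JSON mode
`tools`	array	No	-	List of tools (functions) the model may call
`top_p`	float	No	1.0	Nucleus sampling threshold (0-1)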
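
The ranges and defaults in the table can be checked on the client before a request is sent. The following is a minimal sketch: the function name and error messages are illustrative, and the server's own validation may differ.

```python
MAX_OUTPUT_TOKENS = 32768  # upper bound for max_tokens, from the table


def validate_params(params: dict) -> list:
    """Return a list of problems found; an empty list means the body looks valid."""
    errors = []
    if not params.get("messages"):
        errors.append("messages is required and must be a non-empty array")
    if not params.get("model"):
        errors.append("model is required")

    # Fall back to the documented defaults when a field is omitted.
    temperature = params.get("temperature", 1.0)
    if not 0 <= temperature <= 2:
        errors.append("temperature must be between 0 and 2")

    top_p = params.get("top_p", 1.0)
    if not 0 <= top_p <= 1:
        errors.append("top_p must be between 0 and 1")

    max_tokens = params.get("max_tokens", 4096)
    if not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
        errors.append(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")

    return errors


request_body = {
    "model": "gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 2.5,    # out of range on purpose
    "max_tokens": 50000,   # out of range on purpose
}
for problem in validate_params(request_body):
    print(problem)
```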

Examples

Quick chat

Fast, economical conversation

curl -X POST "https://api.core.today/llm/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gpt-4.1-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant. Be concise."
    },
    {
      "role": "user",
      "content": "What are the top 3 design patterns for microservices?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000
}'

Data classification

Large-scale content classification and tagging

curl -X POST "https://api.core.today/llm/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer cdt_your_api_key" \
  -d '{
  "model": "gpt-4.1-mini",
  "messages": [
    {
      "role": "system",
      "content": "Classify the given text into categories. Respond with JSON."
    },
    {
      "role": "user",
      "content": "Classify this support ticket: \"My payment was charged twice and I need a refund for the duplicate charge\""
    }
  ],
  "response_format": {
    "type": "json_object"
  },
  "max_tokens": 200
}'
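
In JSON mode the model's answer arrives as a JSON string inside the message content, so the client still has to parse it. The sketch below assumes the response follows the OpenAI Chat Completions shape (`choices[0].message.content` and `finish_reason`). The sample response is hand-written for illustration and is not real model output.

```python
import json


def extract_json_content(response: dict) -> dict:
    """Parse the JSON document the model returned in JSON mode."""
    choice = response["choices"][0]
    # A "length" finish reason means the output hit max_tokens and the
    # JSON is probably truncated, so fail early with a clear message.
    if choice.get("finish_reason") == "length":
        raise ValueError("response was cut off by max_tokens; JSON may be incomplete")
    return json.loads(choice["message"]["content"])


# Hand-written sample in the assumed response shape (not real output).
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": '{"category": "billing", "priority": "high"}',
            },
            "finish_reason": "stop",
        }
    ]
}

result = extract_json_content(sample_response)
print(result["category"], result["priority"])
```

With a small `max_tokens` such as the 200 used above, checking `finish_reason` before parsing avoids confusing JSON decode errors on truncated output.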

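Tips & best practices

1. Best price-performance model: GPT-4o-level intelligence at 83% lower cost

2. Ideal for high-volume production workloads

3. Nearly half the latency of GPT-4o, which suits real-time apps

4. Use it for tasks where GPT-4o feels too expensive

5. Supports the same 1M-token context window as GPT-4.1

6. Enable streaming to greatly shorten the time to first token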
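
The streaming tip can be illustrated with a small parser for Server-Sent Events. The sketch assumes the endpoint emits OpenAI-style chunks: lines of the form `data: {json}` carrying `choices[0].delta.content`, ending with `data: [DONE]`. The sample lines are hand-written for illustration, and the function name is an assumption.

```python
import json


def collect_stream_text(lines) -> str:
    """Concatenate the text deltas from an iterable of SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta", {})
        # The first chunk often carries only the role, with no content.
        parts.append(delta.get("content") or "")
    return "".join(parts)


# Hand-written sample stream in the assumed format (not real output).
sample_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample_lines))
```

In a real client each delta would be shown to the user as it arrives, which is where the shorter time to first token comes from. This sketch joins the deltas only to keep the example short.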

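Use cases

Cost-efficient production deployments

Real-time chatbots and assistants

High-volume data processing

Code completion and suggestions

Content classification and tagging

Fast document analysis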

Model information

Provider: OpenAI

Version: 2025-04-14

Category: LLM

Price: 1 credit per 1K tokens (average)

API Endpoint

POST /llm/openai/v1/chat/completions
