MiniMax Speech 2.8 HD

Ranked #1 on Artificial Analysis Speech Arena and Hugging Face TTS Arena. Broadcast-quality TTS with autoregressive Transformer + Flow-VAE decoder, 32+ languages, voice cloning, natural interjections, and emotion control.

180 credits per 1,000 characters (0.18 credits/char, billed per character)
#1 ranked on major TTS benchmarks
Studio-grade broadcast quality audio
Natural interjections (laughs, sighs, coughs, etc.)
Voice cloning from 5-second audio samples
32+ language support with emotion control
Subtitle timestamp export
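
The per-character pricing above lends itself to a quick estimate before sending a request. The following is a minimal sketch, not an official billing calculator: it assumes every character of the submitted text is billed, including pause markers and interjection tags, and that a "character" is one Unicode code point as counted by Python's len(). Neither assumption is confirmed on this page. The function name estimate_credits is hypothetical.

```python
from decimal import Decimal

# From the pricing line: 180 credits per 1,000 characters.
CREDITS_PER_CHAR = Decimal("0.18")


def estimate_credits(text: str) -> Decimal:
    """Estimate the credits for one request.

    Assumption: every character in `text` is billed, markers and tags included.
    """
    return CREDITS_PER_CHAR * len(text)


print(estimate_credits("a" * 1000))  # 180.00
```

Decimal is used instead of float so that totals over many requests do not pick up rounding noise.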

Quick Start

curl -X POST "https://api.core.today/v1/predictions" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: cdt_your_api_key" \
  -d '{
  "model": "minimax/speech-2.8-hd",
  "input": {
    "text": "안녕하세요, 코어닷투데이입니다. (sighs) 오늘은 특별한 이야기를 들려드리려고 합니다. <#0.5#> AI가 만드는 새로운 세상, 함께 경험해 보시겠어요?",
    "voice_id": "Korean_SweetGirl",
    "emotion": "calm",
    "speed": 0.9,
    "pitch": 0,
    "sample_rate": 32000,
    "subtitle_enable": true,
    "language_boost": "Korean"
  }
}'
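
For callers working in Python rather than the shell, the same request can be assembled with the standard library. This is a sketch under stated assumptions: it only builds the request object and does not send it, because the response format is not shown on this page. The helper name build_request is hypothetical; the URL, headers, and field names are taken from the curl command above.

```python
import json
import urllib.request

API_URL = "https://api.core.today/v1/predictions"


def build_request(api_key: str, input_params: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the Quick Start."""
    body = {"model": "minimax/speech-2.8-hd", "input": input_params}
    return urllib.request.Request(
        API_URL,
        # ensure_ascii=False keeps non-ASCII text as-is; the bytes are UTF-8.
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


request = build_request(
    "cdt_your_api_key",  # placeholder key, as in the curl example
    {
        "text": "안녕하세요, 코어닷투데이입니다. (sighs)",
        "voice_id": "Korean_SweetGirl",
        "emotion": "calm",
        "speed": 0.9,
        "subtitle_enable": True,
    },
)
print(request.get_method(), request.full_url)
# Sending would be urllib.request.urlopen(request); how to read the result
# depends on a response schema that this page does not document.
```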

Voice Gallery

Preview 74 preset voices

Quirky Girl (F)
Lively Girl (F)
Brave Female Warrior (F)
Calm Lady (F)
Warm Woman (F)
Charming Older Sister (F)
Charming Younger Sister (F)
Bubbly Younger Sister (F)
Childhood Friend Girl (F)
Cold Girl (F)
Charismatic Queen (F)
Elegant Princess (F)
Enchanting Older Sister (F)
Affectionate Older Sister (F)
Gentle Woman (F)
Haughty Lady (F)
Mature Woman (F)
Mysterious Girl (F)
Distinctive Girl (F)
Dependable Older Sister (F)
Confident Girl (F)
Shy Girl (F)
Soothing Woman (F)
Sweet Girl (F)
Thoughtful Woman (F)
Wise Elf (F)
Athletic Student (M)
Brave Adventurer (M)
Brave Young Man (M)
Calm Gentleman (M)
Cheerful Boyfriend (M)
Cheerful Junior (M)
Cocky Man (M)
Cold Young Man (M)
Confident Boss (M)
Considerate Senior (M)
Tough Man (M)
Passionate Teen (M)
Gentle Boss (M)
Innocent Boy (M)
Intellectual Man (M)
Intellectual Senior (M)
Lonely Warrior (M)
Optimistic Young Man (M)
Charming Man (M)
Possessive Man (M)
Dependable Young Man (M)
Strict Boss (M)
Wise Teacher (M)
Wise Woman (F, default)
Friendly Person (-)
Inspirational Girl (F)
Deep Voice Man (M)
Calm Woman (F)
Casual Guy (M)
Lively Girl (F)
Patient Man (M)
Young Knight (M)
Determined Man (M)
Lovely Girl (F)
Polite Boy (M)
Imposing Manner (M)
Elegant Man (M)
Abbess (F)
Sweet Girl 2 (F)
Energetic Girl (F)
Expressive Narrator (M)
Calm Woman (F)
Deep Voice Man (M)
Friendly Person (-)
Captivating Storyteller (M)
Graceful Lady (F)
Kind Lady (F)
Gentle Butler (M)

Sample: "Hello? Welcome to Core.Today"

Parameters

Parameter | Type | Required | Default | Description
text | string | Yes | - | Text to narrate (max 10,000 characters). Use <#0.5#> to insert pauses. Supports interjections: (laughs), (sighs), (coughs), (gasps), (humming), (whistles), (sneezes), etc.
voice_id | string | No | Wise_Woman | Voice preset or cloned voice ID. The Voice Gallery lists 74 preset voices.
emotion | string | No | auto | Delivery style. One of: auto, happy, sad, angry, fearful, disgusted, surprised, calm, fluent, neutral
speed | number | No | 1 | Speech speed multiplier (0.5-2.0)
pitch | integer | No | 0 | Semitone offset (-12 to +12)
volume | number | No | 1 | Relative loudness (0-10)
sample_rate | integer | No | 32000 | Audio sample rate in Hz (8000-44100)
audio_format | string | No | mp3 | Output audio format. One of: mp3, wav, flac, pcm
subtitle_enable | boolean | No | false | Enable subtitle timestamps
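
The ranges in the table above can be checked on the client before a request is sent. This is a minimal sketch, not the service's own validation. It assumes the documented bounds are inclusive and that any integer in the sample_rate range is accepted, neither of which this page states. The names validate_input and _in_range are hypothetical.

```python
EMOTIONS = {"auto", "happy", "sad", "angry", "fearful",
            "disgusted", "surprised", "calm", "fluent", "neutral"}
AUDIO_FORMATS = {"mp3", "wav", "flac", "pcm"}


def _in_range(value, low, high, integer=False) -> bool:
    """True if value is a number (not a bool) within [low, high]."""
    if isinstance(value, bool):
        return False
    if integer and not isinstance(value, int):
        return False
    return isinstance(value, (int, float)) and low <= value <= high


def validate_input(params: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    errors = []
    text = params.get("text")
    if not isinstance(text, str) or not text:
        errors.append("text: required, non-empty string")
    elif len(text) > 10_000:
        errors.append("text: longer than 10,000 characters")
    if params.get("emotion", "auto") not in EMOTIONS:
        errors.append("emotion: not a documented value")
    if params.get("audio_format", "mp3") not in AUDIO_FORMATS:
        errors.append("audio_format: not a documented value")
    if not _in_range(params.get("speed", 1), 0.5, 2.0):
        errors.append("speed: outside 0.5-2.0")
    if not _in_range(params.get("pitch", 0), -12, 12, integer=True):
        errors.append("pitch: not an integer in -12..12")
    if not _in_range(params.get("volume", 1), 0, 10):
        errors.append("volume: outside 0-10")
    if not _in_range(params.get("sample_rate", 32000), 8000, 44100, integer=True):
        errors.append("sample_rate: not an integer in 8000..44100")
    return errors


print(validate_input({"text": "Hello", "speed": 3}))
```

Missing optional keys fall back to the defaults from the table, so only values the caller actually sets can fail.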

Common Parameters

Common parameters used when calling POST /v1/predictions.

Parameter | Type | Required | Default | Description
model | string | Yes | - | Model identifier
input | object | Yes | - | Object containing the model-specific parameters from the table above
output_folder | string | No | - | Folder path for output files (max 256 chars, '..' not allowed)
webhook_url | string | No | - | Webhook URL to call on completion
is_public | boolean | No | false | If true, output files are also available via permanent public URLs
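
To show how the common parameters wrap the model-specific input, here is a small sketch of a request-body builder. The function name build_prediction_body, the folder path, and the webhook URL are hypothetical and only illustrate the shape. The output_folder checks mirror the two constraints in the table (at most 256 characters, no '..'); the service may apply further rules that this page does not list.

```python
def build_prediction_body(model, input_params, output_folder=None,
                          webhook_url=None, is_public=False) -> dict:
    """Assemble the JSON body for POST /v1/predictions."""
    body = {"model": model, "input": input_params}
    if output_folder is not None:
        if len(output_folder) > 256:
            raise ValueError("output_folder is longer than 256 characters")
        if ".." in output_folder:
            raise ValueError("output_folder must not contain '..'")
        body["output_folder"] = output_folder
    if webhook_url is not None:
        body["webhook_url"] = webhook_url
    if is_public:
        # Omitted when false, since false is the documented default.
        body["is_public"] = True
    return body


body = build_prediction_body(
    "minimax/speech-2.8-hd",
    {"text": "Hello? Welcome to Core.Today", "voice_id": "Wise_Woman"},
    output_folder="audiobooks/chapter-01",      # hypothetical folder
    webhook_url="https://example.com/hooks/tts",  # hypothetical endpoint
)
print(sorted(body))
```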

Examples

Expressive Audiobook Narration

Generate broadcast-quality narration with natural interjections and emotion

curl -X POST "https://api.core.today/v1/predictions" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: cdt_your_api_key" \
  -d '{
  "model": "minimax/speech-2.8-hd",
  "input": {
    "text": "안녕하세요, 코어닷투데이입니다. (sighs) 오늘은 특별한 이야기를 들려드리려고 합니다. <#0.5#> AI가 만드는 새로운 세상, 함께 경험해 보시겠어요?",
    "voice_id": "Korean_SweetGirl",
    "emotion": "calm",
    "speed": 0.9,
    "pitch": 0,
    "sample_rate": 32000,
    "subtitle_enable": true,
    "language_boost": "Korean"
  }
}'
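
Audiobook chapters can easily exceed the 10,000-character limit on text given in the Parameters table, so longer narration would need to be split across several requests. The sketch below shows one possible approach, greedy packing of whole sentences; it is not a method described on this page. The function name split_text is hypothetical, and the sentence splitter is deliberately simple (it breaks after '.', '!' or '?' followed by whitespace).

```python
import re

MAX_CHARS = 10_000  # documented limit for the text parameter


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split as a last resort.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_text("A. B. C.", limit=5))  # ['A. B.', 'C.']
```

A pause marker such as <#0.5#> is not treated as a sentence boundary, because its period is not followed by whitespace. The hard-split fallback could still cut through a marker or interjection tag, so very long unpunctuated passages are better broken up by hand.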

Tips & Best Practices

1. Use natural interjection tags like (laughs), (sighs), (gasps) for lifelike delivery.
2. Insert pauses with <#0.5#> markers for dramatic timing (0.01-99.99 seconds).
3. Write out numbers and dates in words for more natural pronunciation.
4. Combine slow speed (0.8-0.9) with calm emotion for audiobook narration.
5. Enable subtitle_enable for video production with synced captions.
6. Use sample_rate 32000+ for professional broadcast quality.
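
Several of the tips above can be combined in a small helper that assembles narration input. This is a sketch, with hypothetical function names pause and narration_input. It assumes pause durations take at most two decimal places, which is inferred from the 0.01-99.99 range rather than stated, and that a whole number such as <#2#> is an accepted marker.

```python
def pause(seconds: float) -> str:
    """Return a pause marker such as <#0.5#>, checking the documented range."""
    if not 0.01 <= seconds <= 99.99:
        raise ValueError("pause must be between 0.01 and 99.99 seconds")
    # Two decimals at most, with trailing zeros trimmed: 0.50 -> 0.5, 2.00 -> 2.
    value = f"{seconds:.2f}".rstrip("0").rstrip(".")
    return f"<#{value}#>"


def narration_input(segments: list[str], pause_seconds: float = 0.5) -> dict:
    """Join text segments with pauses and apply the audiobook settings from the tips."""
    return {
        "text": f" {pause(pause_seconds)} ".join(segments),
        "emotion": "calm",         # tip 4
        "speed": 0.85,             # tip 4: within the 0.8-0.9 range
        "sample_rate": 32000,      # tip 6
        "subtitle_enable": True,   # tip 5
    }


params = narration_input(["Chapter one.", "(sighs) It was a long night."])
print(params["text"])  # Chapter one. <#0.5#> (sighs) It was a long night.
```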

Use Cases

Professional voiceovers and audiobooks
Broadcast-quality podcast production
Video narration and advertising
Multilingual content localization
Game character voice acting
Accessibility applications