Model Catalog
Explore our comprehensive collection of AI models for image, video, audio generation, and LLM.
Featured Models
Top picks from each category - the best models for getting started
FLUX.1 Schnell
Black Forest Labs
Ultra-fast image generation model optimized for speed. Generates high-quality images in just 1-2 seconds, perfect for real-time applications and rapid prototyping.
MiniMax Hailuo 2.3
MiniMax
Realistic human motion video generation with advanced character consistency and natural movement.
MiniMax Speech-02-Turbo
MiniMax
Low-latency text-to-speech model with multilingual support, emotional voice control, and 300+ voice options.
GPT-4o
OpenAI
OpenAI's flagship multimodal model. Industry-leading performance in reasoning, coding, and creative tasks with native vision capabilities and structured output support.
How to Choose the Right Model
Need Speed?
flux-schnell, Gemini Flash
Need Quality?
FLUX Pro, Kling Pro, Claude
Budget-Friendly?
flux-schnell, MiniMax, Gemini
Most Versatile?
GPT-4o, Claude, FLUX Dev
Image Generation Models
Generate stunning images with FLUX, Stable Diffusion, and more
FLUX.1 Schnell
FeaturedBlack Forest Labs
Ultra-fast image generation model optimized for speed. Generates high-quality images in just 1-2 seconds, perfect for real-time applications and rapid prototyping.
Seedream 4.0
ByteDance
ByteDance's latest image generation model with exceptional prompt understanding and creative capabilities.
FLUX 1.1 Pro
Black Forest Labs
Fast high-quality image generation, an upgrade to FLUX.1 Pro with faster speed and improved quality. Perfect for production workloads requiring both speed and fidelity.
FLUX.2 Dev
Black Forest Labs
Development version of FLUX.2 with image editing capabilities and reference image support. Ideal for iterative design workflows and experimentation.
FLUX 2 Flex
Black Forest Labs
Maximum-quality FLUX model supporting up to 10 reference images and advanced typography. The most capable model for complex, multi-reference creative projects.
FLUX 2 Pro
Black Forest Labs
Professional-grade FLUX 2 with high-quality editing and up to 8 reference image support. Excellent balance of quality, speed, and creative control.
FLUX.1 Krea [dev]
Krea AI
Photorealistic image generation that specifically avoids the 'AI look', producing natural-looking images indistinguishable from real photographs.
Nano Banana
Google Gemini 2.5 Flash-based image generation with multimodal editing capabilities. Fast and versatile for both creation and editing tasks.
Nano Banana Pro (Gemini 3 Pro Image)
Google's state-of-the-art image generation and editing model built on Gemini 3 Pro. Creates detailed visuals with legible text in multiple languages, connects to real-time information from Google Search, and provides professional-grade creative controls. Supports up to 14 reference images and resolutions up to 4K.
Nano Banana 2 (Gemini 3.1 Flash Image)
Google's fast image generation model built on Gemini 3.1 Flash Image. The high-efficiency counterpart to Nano Banana Pro — combining Pro-level visual quality with Flash-level speed and pricing. Features conversational editing, multi-image fusion, character consistency, accurate text rendering, and Google Search grounding. Supports up to 14 reference images and resolutions up to 4K.
Remove Background
Bria AI
AI-powered background removal tool for images. Clean, accurate cutouts for any subject with professional-quality edge detection.
Video Generation Models
Create AI videos with Kling, MiniMax, and cutting-edge models
Google Veo 3.1
FeaturedGoogle's state-of-the-art video generation model with built-in audio generation, producing cinematic-quality videos with synchronized sound.
Google Veo 3.1 Fast
Fast version of Veo 3.1 with audio generation, optimized for speed while maintaining high quality output.
OpenAI Sora 2
OpenAI
OpenAI's video generation model with realistic physics simulation and audio generation capabilities, producing highly coherent videos.
Kling v2.1
Kuaishou
Kling v2.1 with 720p/1080p support and frame transition capabilities for smooth, high-quality video generation.
Kling 2.5 Turbo Pro
Kuaishou
Cinematic-grade video generation with enhanced motion and scene coherence. Top-tier Kling model for professional output.
MiniMax Hailuo 2.3
MiniMax
Realistic human motion video generation with advanced character consistency and natural movement.
MiniMax Hailuo 2.3 Fast
MiniMax
Lower-latency version of Hailuo 2.3 optimized for faster generation while maintaining good quality for human motion videos.
PixVerse V5
PixVerse
Advanced video generation with special effects capabilities and anime-optimized output, supporting multiple visual styles.
Wan 2.5 T2V
Alibaba
Text-to-video model with audio synchronization support, producing high-quality videos from text prompts with natural motion.
Wan 2.5 T2V Fast
Alibaba
Fast text-to-video generation variant of Wan 2.5, optimized for speed with good quality output.
Wan 2.5 I2V
Alibaba
Image-to-video model with lip sync support, animating still images into realistic videos with natural motion.
Wan 2.5 I2V Fast
Alibaba
Fast image-to-video variant of Wan 2.5, optimized for rapid generation of animated videos from still images.
Seedance 1 Pro Fast
ByteDance
ByteDance's cinematic video generation model with fast generation speed and professional output quality.
Audio & TTS Models
Text-to-speech, voice cloning, and audio generation
Clova Voice TTS Premium
FeaturedNCP Clova
NAVER Clova Voice Premium TTS with 108 voices across 6 languages. High-quality Korean voice synthesis with emotion control, Pro voices, and bilingual support.
MiniMax Speech-02-Turbo
MiniMax
Low-latency text-to-speech model with multilingual support, emotional voice control, and 300+ voice options.
MiniMax Speech 2.6 HD
MiniMax
Studio-quality multilingual text-to-speech with nuanced prosody, subtitle export, and premium voices for professional applications.
MiniMax Speech 2.6 Turbo
MiniMax
Fast multilingual text-to-speech with emotional control, optimized for real-time applications with low latency.
MiniMax Speech 2.8 HD
MiniMax
Ranked #1 on Artificial Analysis Speech Arena and Hugging Face TTS Arena. Broadcast-quality TTS with autoregressive Transformer + Flow-VAE decoder, 32+ languages, voice cloning, natural interjections, and emotion control.
MiniMax Speech 2.8 Turbo
MiniMax
Low-latency MiniMax Speech 2.8 Turbo with under 250ms latency, 40+ languages, voice cloning, natural interjections, and real-time pricing. Ideal for interactive and real-time applications.
LLM Models
GPT-4o, Claude, Gemini - OpenAI-compatible chat API
GPT-4o
FeaturedOpenAI
OpenAI's flagship multimodal model. Industry-leading performance in reasoning, coding, and creative tasks with native vision capabilities and structured output support.
GPT-4.1
OpenAI
OpenAI's most capable model for coding and instruction following. Features a 1M token context window, 32K output tokens, and major improvements in coding, complex prompts, and long-context tasks. 20% cheaper than GPT-4o on output.
GPT-4.1 Mini
OpenAI
A significant leap in small model performance. Matches or exceeds GPT-4o in intelligence while reducing latency by nearly half and cost by 83%. Ideal balance of speed, quality, and affordability.
GPT-4.1 Nano
OpenAI
OpenAI's fastest and cheapest model. Optimized for classification, autocompletion, and low-latency tasks. Ultra-affordable at $0.10/1M input tokens.
GPT-4o Mini
OpenAI
Cost-effective, fast model with strong performance. Best for high-volume tasks where speed and cost matter more than absolute capability.
OpenAI o1
OpenAI
OpenAI's most advanced reasoning model. Uses extended thinking time to solve complex problems in science, coding, and math with exceptional accuracy.
OpenAI o4-mini
OpenAI
Fast, cost-effective reasoning model optimized for coding and STEM tasks. Provides strong reasoning at a fraction of the cost of larger reasoning models.
OpenAI o3-mini
OpenAI
Efficient reasoning model that delivers strong performance at lower cost. Ideal for tasks requiring reasoning without the overhead of larger models.
GPT-5
OpenAI
OpenAI's latest flagship model. Delivers exceptional performance across reasoning, coding, and creative tasks with a massive 1M token context window and 32K output tokens. Supports vision, function calling, and JSON mode.
GPT-5.2
OpenAI
OpenAI's latest and most advanced GPT model. Delivers state-of-the-art performance across reasoning, coding, and creative tasks with enhanced capabilities.
GPT-5 Mini
OpenAI
Fast and efficient variant of GPT-5. Delivers strong performance across reasoning, coding, and creative tasks with a 1M token context window and 32K output tokens, at a fraction of the cost of GPT-5.
GPT-5 Nano
OpenAI
Ultra-fast and lightweight variant of GPT-5. Designed for high-throughput, low-latency applications with a 1M token context window and 32K output tokens at minimal cost.
GPT Audio Mini
OpenAI
Lightweight multimodal model with native audio input/output capabilities. Optimized for voice-based interactions and audio processing tasks.
Claude Opus 4.6
Anthropic
Anthropic's most capable model. Delivers breakthrough performance in reasoning, coding, and complex analysis with enhanced safety and instruction following.
Claude Sonnet 4.5
Anthropic
Anthropic's most intelligent and capable Sonnet model. Best-in-class for complex reasoning, nuanced understanding, and coding tasks with exceptional instruction following.
Claude Opus 4.5
Anthropic
Anthropic's most powerful model for highly complex tasks. Exceptional at research, analysis, and creative projects requiring deep expertise.
Claude Haiku 4.5
Anthropic
Fast, cost-effective model for everyday tasks. Great balance of speed, intelligence, and cost for high-volume applications.
Claude Sonnet 4
Anthropic
Balanced Sonnet 4 model offering strong reasoning and coding abilities at an efficient price point. Ideal for everyday production workloads that need a good mix of speed and intelligence.
Claude 3.7 Sonnet
Anthropic
Enhanced Claude 3.7 Sonnet with improved reasoning and coding capabilities. A strong mid-tier model offering reliable performance across a wide range of tasks.
GPT-5.4
OpenAI
OpenAI's newest flagship model with 1M context window and 128K output tokens. Delivers top-tier reasoning across all domains with adjustable reasoning effort levels from none to xhigh.
GPT-5.4 Mini
OpenAI
Fast and cost-efficient variant of GPT-5.4 with 400K context window and 128K output tokens. Excellent balance of performance and affordability for everyday tasks.
GPT-5.4 Nano
OpenAI
Ultra-lightweight and fastest GPT-5.4 variant with 400K context and 128K output. Designed for high-throughput, low-latency applications at minimal cost. Supports MCP for tool integration.
GPT-5.1 (2025-11-13)
OpenAI
Dated snapshot of GPT-5.1 for reproducible results. Supports cached input tokens for cost savings on repeated context. Ideal for production deployments requiring model version pinning.
Gemini 3 Pro Preview
Google's most powerful Gemini model in preview. Features breakthrough reasoning, coding, and multimodal capabilities with the largest context window.
Gemini 2.0 Flash Lite
Ultra-lightweight version of Gemini 2.0 Flash optimized for maximum speed and minimal cost. Perfect for high-volume, latency-sensitive applications.
Gemini Embedding 001
Google's text embedding model for generating vector representations. Optimized for semantic search, clustering, and similarity tasks.
Gemini 2.0 Flash
Google's fastest and most capable model. Features a massive 1M token context window, native multimodal support, and real-time capabilities.
Gemini 2.5 Flash
Google's fast and efficient model with built-in thinking capabilities. Great balance of speed, reasoning, and cost for high-volume applications.
Gemini 2.5 Pro
Google's most capable model with state-of-the-art reasoning and 1M token context. Excels at complex coding, math, and multi-document analysis.
Gemini 3 Flash
Google's most advanced reasoning model with state-of-the-art multimodal understanding, PhD-level reasoning, and leading coding performance.
Gemini 3.1 Pro Preview
Google's latest and most capable Gemini model in preview. Features dynamic pricing that adjusts based on context length, with enhanced pricing for inputs over 200K tokens.
Gemini 3.1 Flash Image Preview
Gemini 3.1 Flash with native image generation capabilities. Can generate images directly in chat responses alongside text. Features separate pricing for text and image output tokens.
Gemini 3.1 Flash Lite Preview
Ultra-lightweight variant of Gemini 3.1 Flash. The most cost-effective Gemini model with support for cached input and audio input. Ideal for high-throughput, budget-conscious applications.
Gemini 3.1 Flash Live Preview
Gemini 3.1 Flash optimized for real-time interactions and live streaming scenarios. Features low-latency responses with audio input support at dedicated pricing.
Gemini 3 Pro Image Preview
Google's premium image generation model within the Gemini 3 Pro family. Generates high-quality images directly in chat with the highest fidelity among Gemini image models. Image output tokens are priced at 10x text output tokens.
Pricing Comparison
Compare credit costs across all models to find the best fit for your needs
| Category | Model | Credits | Unit | Best For |
|---|---|---|---|---|
| Image | flux-schnell | 5 | per image | Fast generation, real-time apps |
| flux-2-pro | 100 | per image | Best quality, commercial | |
| seedream-4 | 59 | per image | Text rendering, logos | |
| Video | kling-v2.5-turbo-pro | 525 | per video | Best quality video |
| hailuo-2.3 | 420 | per video | High quality, balanced | |
| Audio | speech-02-turbo | 108 | per request | Real-time TTS, fast response |
| speech-2.6-hd | 180 | per request | High quality HD voice | |
| LLM | gpt-4o | 3 | per 1K tokens | General, code, multimodal |
| claude-sonnet-4-5 | 4 | per 1K tokens | Analysis, long-form, code | |
| gemini-2.0-flash | 1 | per 1K tokens | Ultra-fast, bulk processing |
Ready to Get Started?
Try our models in the Playground or follow the Quickstart guide