Models - Eversince

By default, the agent picks the best model for each task. You can override this choice for a single generation or for an entire project, either from the prompt box settings or by telling the agent directly. When using the API, `GET /models` returns the current list.
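
As a rough sketch of the API route, the snippet below fetches the list and pulls out the model IDs. Only the `GET /models` path comes from this page. The base URL, the bearer-token header, and the response shape (a JSON object with a `models` array whose entries carry an `id` field) are assumptions for illustration, so check the API reference for the real values.

```python
import json
import urllib.request

# Hypothetical placeholders: use the real base URL and key from the API reference.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_models_request(base_url: str = BASE_URL, api_key: str = API_KEY) -> urllib.request.Request:
    """Build the GET /models request. The Authorization header format is an assumption."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(payload: dict) -> list[str]:
    """Return the IDs from an assumed {"models": [{"id": ...}, ...]} response body."""
    return [entry["id"] for entry in payload.get("models", [])]


def list_model_ids() -> list[str]:
    """Call GET /models and return the model IDs (performs a network request)."""
    with urllib.request.urlopen(build_models_request()) as response:
        return extract_model_ids(json.load(response))
```

Fetching the list at runtime, rather than hard-coding IDs from the tables on this page, keeps a client in step with whatever models are currently available.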

Image

Mode	Description
Text to image	Generate from a text prompt
Image to image	Edit or transform an existing image
Ingredients to image	Combine multiple reference images into a new image

Available image models

Model	ID	Modes	Aspect ratios	Max refs	Max variants
Nano Banana Pro	`google-nano-banana-pro`	text, image, references	16:9, 9:16, 21:9, 1:1	5	1
Nano Banana 2	`google-nano-banana-2`	text, image, references	16:9, 9:16, 21:9, 1:1	14	1
Nano Banana	`google-nano-banana`	text, image, references	16:9, 9:16, 21:9, 1:1	3	1
GPT Image 2	`gpt-image-2`	text, image, references	1:1, 16:9, 9:16	10	5
GPT Image 1.5	`gpt-image-1.5`	text, image, references	16:9, 9:16	10	5
Seedream 5.0 Lite	`seedream-5`	text, image, references	16:9, 9:16, 21:9, 1:1	14	5
Seedream 4.5	`seedream-4.5`	text, image, references	16:9, 9:16, 21:9, 1:1	10	5
Flux 2 Max	`flux-2-max`	text, image, references	16:9, 9:16	8	1
Grok Imagine Pro	`grok-imagine-image-pro`	text, image	16:9, 9:16, 1:1	—	1
Riverflow 2.0 Pro	`riverflow-2-pro`	text, image, references	16:9, 9:16, 21:9, 1:1	10	1
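
When overriding the default, it can help to check which image models satisfy a request before picking one. The sketch below transcribes the table above into a small lookup and filters it by aspect ratio, number of reference images, and number of variants. The `ImageModel` class and `find_image_models` function are illustrative names, not part of any SDK, and the dash in the Grok Imagine Pro row is treated as zero reference images, which is an assumption based on that model not listing the references mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageModel:
    model_id: str
    aspect_ratios: frozenset[str]
    max_refs: int  # 0 = no reference images (assumed meaning of the dash in the table)
    max_variants: int


ALL_FOUR = frozenset({"16:9", "9:16", "21:9", "1:1"})
NO_ULTRAWIDE = frozenset({"16:9", "9:16", "1:1"})
WIDE_AND_TALL = frozenset({"16:9", "9:16"})

# Values copied from the "Available image models" table.
IMAGE_MODELS = [
    ImageModel("google-nano-banana-pro", ALL_FOUR, 5, 1),
    ImageModel("google-nano-banana-2", ALL_FOUR, 14, 1),
    ImageModel("google-nano-banana", ALL_FOUR, 3, 1),
    ImageModel("gpt-image-2", NO_ULTRAWIDE, 10, 5),
    ImageModel("gpt-image-1.5", WIDE_AND_TALL, 10, 5),
    ImageModel("seedream-5", ALL_FOUR, 14, 5),
    ImageModel("seedream-4.5", ALL_FOUR, 10, 5),
    ImageModel("flux-2-max", WIDE_AND_TALL, 8, 1),
    ImageModel("grok-imagine-image-pro", NO_ULTRAWIDE, 0, 1),
    ImageModel("riverflow-2-pro", ALL_FOUR, 10, 1),
]


def find_image_models(aspect_ratio: str, refs: int = 0, variants: int = 1) -> list[str]:
    """Return IDs of models that support the ratio, reference count, and variant count."""
    return [
        m.model_id
        for m in IMAGE_MODELS
        if aspect_ratio in m.aspect_ratios
        and refs <= m.max_refs
        and variants <= m.max_variants
    ]


print(find_image_models("21:9", refs=12))
# ['google-nano-banana-2', 'seedream-5']
```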

Video

Mode	Description
Text to video	Generate from a text prompt
Image to video	Generate video from a starting image
Ingredients to video	Combine multiple reference images into a video
Video to video	Edit or transform an existing video
Audio to video	Generate video synced to speech or music
Multi-shot	Multiple shots in a single generation, each with its own direction
Lip-sync	Generate video synced to a voiceover
Start + end frame	Set the first and last frames; the video is generated between them

Available video models

Model	ID	Modes	Durations	Audio	Key features
Seedance 2.0	`seedance-2.0`	text, image, references, video, audio, end frame, multi-shot, lip-sync	1–15s	Yes	9 ref images, 3 ref videos, 3 ref audios, prose multi-shot, native lip-sync, 720p
Seedance 2.0 Fast	`seedance-2.0-fast`	text, image, references, video, audio, end frame, multi-shot, lip-sync	1–15s	Yes	Same capabilities as Seedance 2.0, lower cost, 720p
Kling 3.0 Omni	`kling-3.0-omni`	image, video, references, multi-shot, end frame	3–15s	Yes	7 ref images, 1 ref video, V2V
Kling 3.0	`kling-3.0`	image, multi-shot, end frame	3–15s	Yes	1080p
Kling O1 Edit	`kling-o1`	image, video, references, end frame	3–10s	No	V2V editing, 7 ref images, preserves original sound
Kling 2.6	`kling-2.6`	text, image	5, 10s	Yes	Negative prompt, 1080p
Veo 3.1	`google-veo-3.1`	text, image, end frame	4, 6, 8s	Yes	Up to 1080p
Veo 3.1 Fast	`google-veo-3.1-fast`	text, image, end frame	4, 6, 8s	Yes	Up to 1080p
Seedance 1.5 Pro	`seedance-1.5-pro`	text, image, end frame	4–12s	Yes	All aspect ratios, 1080p
Wan 2.6	`wan-2.6`	text, image, audio, end frame, lip-sync	5, 10, 15s	Always on	Audio input sync
LTX 2.3 Pro	`ltx-2.3-pro`	text, image, end frame	6, 8, 10s	Yes	Camera motion (dolly, jib, tracking, static, focus shift), up to 4K
LTX 2.3 Fast	`ltx-2.3-fast`	text, image, end frame	6–20s	Yes	Camera motion, up to 4K, longest durations
Sora 2 Pro	`openai-sora-2-pro`	text, image	4, 8, 12s	Always on	Up to 1024p
Sora 2	`openai-sora-2`	text, image	4, 8, 12s	Always on	720p
Grok Imagine Video	`grok-imagine-video`	text, image, video	1–15s	Always on	V2V, flexible durations, 720p
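
The Durations column mixes ranges (such as 3–15s) with fixed lists (such as 4, 6, 8s), so a quick way to narrow the choice is to expand each cell into a set of seconds and filter by mode. The sketch below does that for the table above. The helper names are illustrative, and treating a range as "every whole second from the low to the high value" is an assumption; the table does not state the allowed step.

```python
def parse_durations(spec: str) -> set[int]:
    """Expand a Durations cell such as "3–15s" or "4, 6, 8s" into whole seconds."""
    spec = spec.strip().removesuffix("s")
    if "–" in spec:  # an en dash marks a range, assumed inclusive in 1-second steps
        low, high = (int(part) for part in spec.split("–"))
        return set(range(low, high + 1))
    return {int(part) for part in spec.split(",")}


# model ID -> (Modes cell, Durations cell), copied from the table above.
VIDEO_MODELS = {
    "seedance-2.0": ("text, image, references, video, audio, end frame, multi-shot, lip-sync", "1–15s"),
    "seedance-2.0-fast": ("text, image, references, video, audio, end frame, multi-shot, lip-sync", "1–15s"),
    "kling-3.0-omni": ("image, video, references, multi-shot, end frame", "3–15s"),
    "kling-3.0": ("image, multi-shot, end frame", "3–15s"),
    "kling-o1": ("image, video, references, end frame", "3–10s"),
    "kling-2.6": ("text, image", "5, 10s"),
    "google-veo-3.1": ("text, image, end frame", "4, 6, 8s"),
    "google-veo-3.1-fast": ("text, image, end frame", "4, 6, 8s"),
    "seedance-1.5-pro": ("text, image, end frame", "4–12s"),
    "wan-2.6": ("text, image, audio, end frame, lip-sync", "5, 10, 15s"),
    "ltx-2.3-pro": ("text, image, end frame", "6, 8, 10s"),
    "ltx-2.3-fast": ("text, image, end frame", "6–20s"),
    "openai-sora-2-pro": ("text, image", "4, 8, 12s"),
    "openai-sora-2": ("text, image", "4, 8, 12s"),
    "grok-imagine-video": ("text, image, video", "1–15s"),
}


def find_video_models(mode: str, seconds: int) -> list[str]:
    """Return IDs of models that list `mode` and allow a clip of `seconds` seconds."""
    matches = []
    for model_id, (modes, durations) in VIDEO_MODELS.items():
        supported_modes = {m.strip() for m in modes.split(",")}
        if mode in supported_modes and seconds in parse_durations(durations):
            matches.append(model_id)
    return matches


print(find_video_models("lip-sync", 10))
# ['seedance-2.0', 'seedance-2.0-fast', 'wan-2.6']
```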

Audio

Type	Description
Voiceover	Generate speech across 60+ languages
Music	Generate music from text prompts
Sound effects	Generate sound effects and ambient audio

Voiceover

Model	ID	Key features
ElevenLabs v3	`elevenlabs`	60+ languages, tone control, voice transform, adjustable speed

Music

Model	ID	Duration	Duration control
ElevenLabs Music	`elevenlabs`	3s–10 min	Yes
MiniMax Music 2.5	`minimax`	Varies by lyrics length	No
Google Lyria 3 Pro	`google`	Up to 3 min	No

Sound effects

Model	ID	Duration	Key features
ElevenLabs SFX	`elevenlabs`	0.5–22s	Looping, prompt influence control
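
Note that the ID `elevenlabs` appears under voiceover, music, and sound effects, so an audio model is only unambiguous when the ID is paired with its type. The sketch below keys a duration check on that pair, using the limits from the tables above. The type names (`music`, `sound_effects`) and the function are hypothetical, and it assumes the sound effects duration can be requested within the listed range, which the table does not state explicitly.

```python
# (audio type, model ID) -> (min seconds, max seconds) for a requested duration.
# Only models whose duration can be set are listed; MiniMax Music 2.5 and
# Google Lyria 3 Pro are omitted because their "Duration control" is "No".
DURATION_LIMITS = {
    ("music", "elevenlabs"): (3.0, 600.0),         # 3s to 10 min
    ("sound_effects", "elevenlabs"): (0.5, 22.0),  # 0.5 to 22s (control assumed)
}


def duration_is_valid(audio_type: str, model_id: str, seconds: float) -> bool:
    """Check a requested duration against the limits for one (type, ID) pair."""
    limits = DURATION_LIMITS.get((audio_type, model_id))
    if limits is None:
        raise ValueError(f"{model_id!r} ({audio_type}) has no adjustable duration")
    low, high = limits
    return low <= seconds <= high


print(duration_is_valid("music", "elevenlabs", 120))          # True
print(duration_is_valid("sound_effects", "elevenlabs", 30))   # False
```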