Image2Video

See example prompts in the gallery.

Pricing for video generation.

Overview

Astria Image2Video supports a wide catalog of image-to-video, text-to-video, reference-to-video, and motion-control video models. Pick a video_model and pass a video_prompt describing the motion; Astria renders the first frame using the prompt's image stage (the chosen tune + text) and animates it.

Form fields

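Pass these alongside `text` on `POST /tunes/:id/prompts`. In multipart requests each field is nested under `prompt[...]`, for example `prompt[video_model]`: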

`video_model` (required)

The model to animate with. See the enum table below.

`video_prompt` (required)

Describes camera movement, scene, and object interactions. It should not include the token or LoRA used by the image prompt.

Example: `text` is `<lora:1533312:1.0> ohwx woman hiking in the alps` and `video_prompt` is `Woman looking at the camera, smiling, puts hands on her hips, confident`.

`video_duration` (optional)

Length of the video in whole seconds. Allowed values depend on the chosen model; see the Models table.
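Because the allowed durations differ per model, a client may want to check a value before submitting. The following is a minimal sketch, not part of the Astria API: `duration_allowed` is a hypothetical helper, and its first argument is assumed to be copied by hand from the Models table, written either as a range with an ASCII hyphen (`4-12`) or as a comma-separated list without spaces (`5,10,15`).

```shell
# Hypothetical helper: succeed if DURATION is permitted by SPEC.
# SPEC is either an inclusive range such as "4-12" or a comma-separated
# list such as "5,10,15" (both copied by hand from the Models table).
duration_allowed() {
  spec=$1
  duration=$2
  case $duration in
    ''|*[!0-9]*) return 1 ;;   # video_duration must be a whole number of seconds
  esac
  case $spec in
    *-*)
      min=${spec%%-*}
      max=${spec##*-}
      [ "$duration" -ge "$min" ] && [ "$duration" -le "$max" ]
      ;;
    *)
      # Wrap both sides in commas so that "5" does not match "15".
      case ",$spec," in
        *",$duration,"*) return 0 ;;
        *) return 1 ;;
      esac
      ;;
  esac
}
```

For example, `duration_allowed "5,10,15" 10` succeeds, while `duration_allowed "4-12" 13` fails.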

`video_first_frame` (optional)

Multipart image upload. When provided, it replaces the image-stage render, and `text` is no longer required.

`video_last_frame` (optional)

Multipart image upload for first+last keyframe models.

`input_video` (optional)

Multipart video upload. Required for motion-control models.

`aspect_ratio` (optional)

Forwarded to the video model when it supports an aspect-ratio knob (e.g. HappyHorse text-to-video, Kling, Wan, LTX). For image-to-video the aspect ratio is derived from the input image.

Example

curl -X POST -H "Authorization: Bearer $API_KEY" \
  https://api.astria.ai/tunes/$TUNE_ID/prompts \
  --form-string 'prompt[text]=<lora:1533312:1.0> A highly detailed image of thoughtful ohwx woman exploring a hidden urban garden' \
  -F 'prompt[video_model]=seedance_v15_720p' \
  -F 'prompt[video_prompt]=Woman looking at the camera, smiling, puts hands on her hips, confident' \
  -F 'prompt[video_duration]=5' \
  -F 'prompt[aspect_ratio]=16:9'
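When a `video_first_frame` is uploaded, the image stage is skipped and `text` can be omitted. The sketch below shows one way such a request might look. It assumes `API_KEY` and `TUNE_ID` are set as in the example above, that `seedance_v15_720p` accepts an uploaded first frame, and that the field is sent as `prompt[video_first_frame]` like the other fields; the function name is hypothetical. Free-text values are passed with `--form-string` because curl's `-F` treats a value starting with `@` or `<` as a file reference.

```shell
# Sketch: animate an uploaded image instead of rendering the first frame.
# Assumes API_KEY and TUNE_ID are set; the function name is hypothetical.
submit_video_from_image() {
  first_frame=$1    # path to a local image file
  video_prompt=$2   # description of the motion
  curl -X POST -H "Authorization: Bearer $API_KEY" \
    "https://api.astria.ai/tunes/$TUNE_ID/prompts" \
    -F 'prompt[video_model]=seedance_v15_720p' \
    --form-string "prompt[video_prompt]=$video_prompt" \
    -F 'prompt[video_duration]=5' \
    -F "prompt[video_first_frame]=@$first_frame"
}
```

A call might look like `submit_video_from_image ./portrait.png "Slow push-in, subject turns toward the camera"`.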

Text-to-video

Models in the text-to-video group accept a `video_prompt` with no first frame and no `text`; the model generates the video directly from the prompt. Currently: seedance2_*, happyhorse_720p, happyhorse_1080p.

curl -X POST -H "Authorization: Bearer $API_KEY" \
  https://api.astria.ai/tunes/$TUNE_ID/prompts \
  -F 'prompt[video_model]=happyhorse_720p' \
  -F 'prompt[video_prompt]=A lone explorer walks across endless dunes at sunrise, cinematic.' \
  -F 'prompt[video_duration]=5' \
  -F 'prompt[aspect_ratio]=16:9'
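A client that supports several models may need to decide whether `text` or a first frame has to be supplied. The sketch below encodes the text-to-video group listed above as a shell pattern. Both function names are hypothetical, and the rule that every other model needs either `text` or a first frame is inferred from the field descriptions rather than stated as an API guarantee.

```shell
# Hypothetical helper: succeed if MODEL belongs to the text-to-video group
# (seedance2_*, happyhorse_720p, happyhorse_1080p).
is_text_to_video() {
  case $1 in
    seedance2_*|happyhorse_720p|happyhorse_1080p) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if the request has a source for the first frame, or does not need
# one. TEXT and FIRST_FRAME may be empty strings.
has_first_frame_source() {
  model=$1
  text=$2
  first_frame=$3
  is_text_to_video "$model" && return 0
  [ -n "$text" ] || [ -n "$first_frame" ]
}
```

For example, `has_first_frame_source happyhorse_720p "" ""` succeeds, while `has_first_frame_source kling25 "" ""` fails.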

Motion control

Motion-control models replay the motion from input_video while restyling the subject. Models: kling30_motion_control, kling30_motion_control_pro, wan_animate_720p, dreamactor_m2, happyhorse_motion_control.

curl -X POST -H "Authorization: Bearer $API_KEY" \
  https://api.astria.ai/tunes/$TUNE_ID/prompts \
  -F 'prompt[text]=ohwx man <faceid:123>' \
  -F 'prompt[video_model]=kling30_motion_control_pro' \
  -F 'prompt[video_prompt]=match the dance moves' \
  -F 'prompt[video_duration]=10' \
  -F 'prompt[input_video]=@/path/to/reference.mp4'
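Since motion-control models require `input_video`, it can help to verify the file locally before uploading. The following pre-flight check is a sketch only; `check_input_video` is a hypothetical helper, and the model names are the ones listed above.

```shell
# Hypothetical pre-flight check: a motion-control model needs a readable
# input_video file; other models pass regardless of the second argument.
check_input_video() {
  model=$1
  input_video=$2   # local path, may be empty
  case $model in
    kling30_motion_control*|wan_animate_720p|dreamactor_m2|happyhorse_motion_control)
      if [ -z "$input_video" ] || [ ! -r "$input_video" ]; then
        echo "error: $model requires a readable input_video file" >&2
        return 1
      fi
      ;;
  esac
  return 0
}
```

A script could run `check_input_video "$MODEL" "$VIDEO_PATH" || exit 1` before the curl call.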

Models

Cost is the per-prompt charge in cents at the base duration listed for the model; for longer videos the charge scales linearly with duration. Models with an `_audio` suffix also generate an audio track.

`video_model`	base cost (¢)	allowed `video_duration` (seconds)
`seedance_480p`	10	2–12
`seedance_v15_720p`	14	4–12
`seedance_v15_audio_720p`	29	4–12
`seedance2_fast_480p`	60	4–15
`seedance2_fast_720p`	140	4–15
`seedance2_480p`	120	4–15
`seedance2_720p`	280	4–15
`seedance2_1080p`	450	4–15
`wan22_720p`	43	5
`wan22_fast_480p`	6	5
`wan22_fast_580p`	8	5
`wan22_fast_720p`	11	5
`wan25_720p`	53	5, 10
`wan26_720p`	53	5, 10, 15
`wan26_1080p`	79	5, 10, 15
`wan27_720p`	55	5, 10, 15
`wan27_1080p`	83	5, 10, 15
`wan_animate_720p`	44	10
`ltx23_720p`	17	5, 10, 15, 20
`ltx23_1080p`	22	5, 10, 15, 20
`kling25`	39	5, 10
`kling30_standard`	92	3–15
`kling30_standard_audio`	139	3–15
`kling30_pro`	123	3–15
`kling30_pro_audio`	185	3–15
`kling30_4k`	263	3–15
`kling30_motion_control`	277	10
`kling30_motion_control_pro`	370	10
`cinematic_video`	84	5, 10, 15
`dreamactor_m2`	29	10
`happyhorse_720p`	77	3–10
`happyhorse_1080p`	132	3–10
`happyhorse_motion_control`	154	10
`veo31_fast_720p`	85	4, 6, 8
`veo31_fast_audio_720p`	126	4, 6, 8
`veo31_fast_1080p`	85	4, 6, 8
`veo31_fast_audio_1080p`	126	4, 6, 8
`veo31_fast_4k`	264	8
`veo31_fast_audio_4k`	308	8
`veo31_lite_720p`	44	4, 6, 8
`veo31_lite_audio_720p`	44	4, 6, 8
`veo31_lite_1080p`	71	4, 6, 8
`veo31_lite_audio_1080p`	71	4, 6, 8
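The linear scaling rule can be sketched as a small calculation. `estimate_cost_cents` is a hypothetical helper, and it rests on assumptions that the table does not confirm: the caller supplies the base duration, durations at or below the base are charged the base cost, and fractional cents are rounded up. Check the pricing page for authoritative figures.

```shell
# Hypothetical helper: estimate the charge in cents for one video.
# Usage: estimate_cost_cents BASE_COST BASE_DURATION DURATION
# Assumptions: durations at or below BASE_DURATION cost BASE_COST, and
# fractional results are rounded up to the next cent.
estimate_cost_cents() {
  awk -v cost="$1" -v base="$2" -v dur="$3" 'BEGIN {
    if (dur + 0 <= base + 0) { print cost + 0; exit }
    total = cost * dur / base      # linear scaling beyond the base duration
    rounded = int(total)
    if (rounded < total) rounded++
    print rounded
  }'
}
```

If the base duration for `wan25_720p` (53¢) were 5 seconds, `estimate_cost_cents 53 5 10` would print `106`.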

Capability matrix

Text-to-video (no first frame required): seedance2_*, happyhorse_720p, happyhorse_1080p.
Multi-reference images: seedance2_*, happyhorse_720p, happyhorse_1080p.
First+last keyframe (video_last_frame): seedance_v15_*, seedance2_*, wan21_*, wan26_*, wan27_*, wan_fast_*, ltx23_*, kling*, veo31_*, hailuo*.
Motion control (requires input_video): kling30_motion_control*, wan_animate_720p, dreamactor_m2, happyhorse_motion_control.
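As an illustration of the first+last keyframe row, the sketch below checks a model name against that list and then uploads both frames. It assumes `API_KEY` and `TUNE_ID` are set, and that the frames are sent as `prompt[video_first_frame]` and `prompt[video_last_frame]` in the same way as the other fields; both function names are hypothetical.

```shell
# Hypothetical helper: succeed if MODEL matches the first+last keyframe list.
supports_last_frame() {
  case $1 in
    seedance_v15_*|seedance2_*|wan21_*|wan26_*|wan27_*|wan_fast_*|ltx23_*|kling*|veo31_*|hailuo*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sketch: request a video that moves from FIRST_FRAME to LAST_FRAME.
# video_duration is left out so the model's default applies.
submit_keyframe_video() {
  model=$1
  first_frame=$2
  last_frame=$3
  video_prompt=$4
  if ! supports_last_frame "$model"; then
    echo "error: $model is not listed as a first+last keyframe model" >&2
    return 1
  fi
  curl -X POST -H "Authorization: Bearer $API_KEY" \
    "https://api.astria.ai/tunes/$TUNE_ID/prompts" \
    --form-string "prompt[video_model]=$model" \
    --form-string "prompt[video_prompt]=$video_prompt" \
    -F "prompt[video_first_frame]=@$first_frame" \
    -F "prompt[video_last_frame]=@$last_frame"
}
```

A call might look like `submit_keyframe_video wan26_720p ./start.png ./end.png "The camera orbits the subject as day turns to night"`.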

Backwards compatibility

The pre-2026-04 syntax, which embeds flags inside `text`, is still accepted; the server maps the flags to the corresponding dedicated fields:

<lora:1533312:1.0> ohwx woman hiking in the alps --video --video_model seedance_v15_720p --duration 5 --video_prompt "Woman looking at the camera, smiling, confident"

New integrations should prefer the dedicated form fields.
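For comparison, the legacy prompt above might be rewritten with dedicated form fields as sketched below. This assumes `API_KEY` and `TUNE_ID` are set and that the legacy `--duration` flag corresponds to `video_duration`; the function name is hypothetical. The `text` value is passed with `--form-string` because curl's `-F` would interpret the leading `<` as a request to read the value from a file.

```shell
# Sketch: the legacy prompt expressed with dedicated form fields.
# Assumes API_KEY and TUNE_ID are set, and that --duration maps to
# video_duration; the function name is hypothetical.
submit_migrated_prompt() {
  curl -X POST -H "Authorization: Bearer $API_KEY" \
    "https://api.astria.ai/tunes/$TUNE_ID/prompts" \
    --form-string 'prompt[text]=<lora:1533312:1.0> ohwx woman hiking in the alps' \
    -F 'prompt[video_model]=seedance_v15_720p' \
    --form-string 'prompt[video_prompt]=Woman looking at the camera, smiling, confident' \
    -F 'prompt[video_duration]=5'
}
```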
