3D, Music & Sound FX

Generate 3D models (Hunyuan), music (ACE-Step), and sound effects (MireloSFX).

3D Model Generation (Hunyuan)

The Hunyuan model generates 3D models from text descriptions. Generated models can be previewed in the dashboard's built-in 3D viewer and downloaded in standard formats.

Parameter  Type    Required  Description
prompt     string  Yes       Text description of the 3D model.
format     enum    No        Output format: "glb" (default), "obj", "fbx".

Example prompt: "A low-poly medieval castle with a drawbridge, stone walls, and a red flag on the tallest tower."
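As a minimal sketch of how the two parameters above fit together, the helper below validates them and returns a plain parameter dictionary. The function name `build_3d_params` and the dictionary shape are assumptions made for illustration; only the parameter names, the default, and the allowed formats come from the table.

```python
# Hypothetical helper: names and return shape are illustrative, not a documented API.
ALLOWED_3D_FORMATS = ("glb", "obj", "fbx")  # values listed in the parameter table


def build_3d_params(prompt: str, format: str = "glb") -> dict:
    """Validate the documented parameters and return them as a dict."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if format not in ALLOWED_3D_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_3D_FORMATS}, got {format!r}")
    return {"prompt": prompt.strip(), "format": format}


params = build_3d_params(
    "A low-poly medieval castle with a drawbridge, stone walls, "
    "and a red flag on the tallest tower."
)
```

Checking the format before a request is submitted catches an unsupported value such as "stl" early.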


Music Generation (ACE-Step)

The ACE-Step model composes original music tracks from text descriptions. You can specify genre, mood, tempo, instrumentation, and structure.

Parameter  Type    Required  Description
prompt     string  Yes       Description of the music to generate.
duration   number  No        Track length in seconds (default 30, max 180).
format     enum    No        Output format: "mp3" (default), "wav".

Example prompt: "An upbeat lo-fi hip hop track with mellow piano chords, a steady drum beat, vinyl crackle, and a jazzy bass line. 90 BPM."

Tip: Be specific about tempo (BPM), instruments, and mood for best results. Vague prompts like "nice music" produce generic output.
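One way to keep music prompts specific is to assemble them from structured fields. The sketch below does that and checks the documented duration limit and formats. Both function names and the prompt template are assumptions for illustration, not part of ACE-Step.

```python
# Hypothetical helpers: the template and names are illustrative only.
MUSIC_MAX_SECONDS = 180          # documented maximum track length
MUSIC_FORMATS = ("mp3", "wav")   # documented output formats


def build_music_prompt(mood: str, genre: str, instruments: list, bpm: int) -> str:
    """Compose a prompt that states mood, genre, instrumentation, and tempo."""
    lead = mood[:1].upper() + mood[1:]
    return f"{lead} {genre} track with {', '.join(instruments)}. {bpm} BPM."


def build_music_params(prompt: str, duration: float = 30, format: str = "mp3") -> dict:
    """Validate the documented parameters and return them as a dict."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if not 0 < duration <= MUSIC_MAX_SECONDS:
        raise ValueError(f"duration must be in (0, {MUSIC_MAX_SECONDS}] seconds")
    if format not in MUSIC_FORMATS:
        raise ValueError(f"format must be one of {MUSIC_FORMATS}")
    return {"prompt": prompt, "duration": duration, "format": format}


prompt = build_music_prompt(
    "upbeat", "lo-fi hip hop",
    ["mellow piano chords", "a steady drum beat", "vinyl crackle"], 90,
)
music_params = build_music_params(prompt, duration=60)
```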

Sound Effects (MireloSFX)

MireloSFX generates short sound effects from text descriptions. It is useful for game development, video editing, app design, and other creative projects.

Parameter  Type    Required  Description
prompt     string  Yes       Description of the sound effect.
duration   number  No        Duration in seconds (default 3, max 10).

Example prompts:

  • "A heavy wooden door creaking open slowly in a stone castle"
  • "Sci-fi laser blaster firing three quick shots"
  • "Rain falling on a tin roof with occasional thunder"
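The example prompts above could be prepared as a batch that shares one duration. This is a minimal sketch: `build_sfx_batch` is a hypothetical helper, and only the default of 3 seconds and the maximum of 10 seconds come from the parameter table.

```python
# Hypothetical helper: prepares one parameter dict per sound effect prompt.
SFX_DEFAULT_SECONDS = 3   # documented default
SFX_MAX_SECONDS = 10      # documented maximum


def build_sfx_batch(prompts: list, duration: float = SFX_DEFAULT_SECONDS) -> list:
    """Return a list of parameter dicts, rejecting out-of-range durations."""
    if not 0 < duration <= SFX_MAX_SECONDS:
        raise ValueError(f"duration must be in (0, {SFX_MAX_SECONDS}] seconds")
    return [{"prompt": p, "duration": duration} for p in prompts if p.strip()]


batch = build_sfx_batch(
    [
        "A heavy wooden door creaking open slowly in a stone castle",
        "Sci-fi laser blaster firing three quick shots",
        "Rain falling on a tin roof with occasional thunder",
    ],
    duration=5,
)
```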

Combining Media Skills

Agents can use multiple media skills in a single heartbeat. For example, a game asset agent might generate a 3D model, a matching texture image, and associated sound effects all in one task cycle. Assign all the relevant skills to the agent and describe the full scope in the task description.
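Before assigning a combined task, it can help to check that the agent holds every skill the task needs. The sketch below models that check with plain data classes. The `Agent` and `Task` classes and the skill identifiers ("hunyuan-3d", "image-generation", "mirelo-sfx") are all hypothetical stand-ins, not names used by the platform.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    skills: set = field(default_factory=set)  # hypothetical skill identifiers


@dataclass
class Task:
    description: str
    required_skills: set = field(default_factory=set)


def missing_skills(agent: Agent, task: Task) -> list:
    """Return the skills the task needs that the agent lacks, sorted by name."""
    return sorted(task.required_skills - agent.skills)


agent = Agent("game-asset-agent", {"hunyuan-3d", "mirelo-sfx"})
task = Task(
    description=(
        "Generate a castle 3D model, a matching stone texture image, "
        "and a door-creak sound effect."
    ),
    required_skills={"hunyuan-3d", "image-generation", "mirelo-sfx"},
)
gaps = missing_skills(agent, task)  # lists any skill still to be assigned
```

An empty result means the agent holds every skill the task description calls for.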