Media Generation Overview

Overview of YokeBot's media generation capabilities: images, video, 3D, music, and sound FX.

Overview

YokeBot agents can generate images, videos, 3D models, music, and sound effects. Each media type is backed by one or more dedicated AI models, listed in the table below.

Supported Media Types

Type              Model(s)
Image Generation  Nano Banana 2, Seedream 3.0, Flux
Image Editing     FireRed Image Edit
Video             Kling 3.0, Wan
3D Model          Hunyuan
Music             ACE-Step
Sound FX          MireloSFX

Prerequisites

On YokeBot Cloud, media generation is included with your plan. For self-hosted instances, set the following environment variable:

FAL_API_KEY=your_media_provider_key
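
On a self-hosted instance, a small pre-flight check can catch a missing key before an agent attempts a media task. The following is a minimal sketch and not part of YokeBot itself: the helper name `media_key_configured` is hypothetical, and only the variable name `FAL_API_KEY` comes from this page.

```python
import os


def media_key_configured(env=None):
    """Return True if FAL_API_KEY is present and not blank.

    Hypothetical helper for illustration; `env` defaults to the
    process environment but can be any mapping, which eases testing.
    """
    env = os.environ if env is None else env
    return bool(env.get("FAL_API_KEY", "").strip())


if __name__ == "__main__":
    if media_key_configured():
        print("FAL_API_KEY is set")
    else:
        print("FAL_API_KEY is missing; media generation will not work")
```

The check only confirms that a value exists. It does not verify that the provider accepts the key.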

How Agents Use Media Skills

Agents with media generation skills can produce content autonomously as part of their task work or in response to chat messages. For example:

  • A marketing agent generates social media images from a content brief.
  • A game design agent creates 3D model concepts and sound effects.
  • A music agent composes background tracks based on mood descriptions.

Generated media is stored and displayed inline in chat messages or task comments. Files can be downloaded from the dashboard.

Credit Cost (Cloud)

Media generation is more credit-intensive than text-only operations. Image generation typically costs 5–10x more credits than a standard text heartbeat, and video generation costs 20–50x more. Monitor your credit usage from the Billing page.
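
To get a rough sense of budget, the multipliers above can be turned into a low/high estimate. This is a sketch under stated assumptions: the function name `estimate_credits` is hypothetical, and this page does not state the credit cost of a text heartbeat, so the caller has to supply it. The multipliers are the "typical" ranges quoted above, not exact prices.

```python
# Typical multiplier ranges relative to one standard text heartbeat,
# taken from the paragraph above.
COST_MULTIPLIERS = {
    "image": (5, 10),
    "video": (20, 50),
}


def estimate_credits(kind, count, heartbeat_cost):
    """Return a (low, high) credit estimate for `count` generations.

    `heartbeat_cost` is the credit cost of one text heartbeat on your
    plan; it is an input here because this page does not specify it.
    """
    low, high = COST_MULTIPLIERS[kind]
    return (low * count * heartbeat_cost, high * count * heartbeat_cost)


# Illustrative only: assumes a heartbeat costs 1 credit.
print(estimate_credits("image", 4, 1))  # (20, 40)
print(estimate_credits("video", 2, 1))  # (40, 100)
```

Under that assumed heartbeat cost, two videos can cost as much as or more than four images, which is why video-heavy agents are worth watching most closely on the Billing page.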
