Reference for all built-in skills that ship with YokeBot.
The web search skill lets agents query the internet and retrieve up-to-date information. YokeBot supports two search providers:
| Provider | Env Variable | Notes |
|---|---|---|
| Tavily | TAVILY_API_KEY | Optimized for AI consumption. Returns structured summaries. Recommended default. |
| Brave Search | BRAVE_API_KEY | Privacy-focused search engine. Returns traditional web results. |
Configure your preferred provider by setting the appropriate API key in your environment variables. If both are set, agents can choose between them.
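As a minimal sketch of how provider selection could work, the snippet below checks which API keys are present and picks one. The environment variable names come from the table above; the function names, the Tavily-first ordering, and the `preferred` argument are illustrative assumptions, not YokeBot's actual implementation.

```python
import os

# Env variable names are taken from the provider table; everything else
# in this snippet (function names, ordering) is a hypothetical illustration.
PROVIDER_ENV_VARS = {
    "tavily": "TAVILY_API_KEY",  # listed first because it is the recommended default
    "brave": "BRAVE_API_KEY",
}


def available_search_providers(env=None):
    """Return the providers whose API key is set, in table order."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]


def pick_search_provider(env=None, preferred=None):
    """Pick a provider: the preferred one if configured, else the first available."""
    available = available_search_providers(env)
    if not available:
        raise RuntimeError(
            "Set TAVILY_API_KEY or BRAVE_API_KEY to enable the web search skill"
        )
    if preferred in available:
        return preferred
    return available[0]
```

Passing a plain dictionary as `env` makes the selection logic easy to exercise without touching the real environment.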
Agents can generate images using multiple models. The default model is Nano Banana 2, which is fast, high-quality, and cost-effective. Style references let agents provide up to 6 existing images to guide the visual output.
| Model | Strengths | Credit Cost |
|---|---|---|
| Nano Banana 2 | Fast, versatile, supports style references via /edit endpoint. | 100 |
| Seedream 3.0 | Photorealistic, high detail, great for product imagery. | 100 |
| Flux | Artistic styles, creative compositions. | 100 |
Skill: generate_image
Required env: FAL_API_KEY
Parameters: prompt (required), aspect_ratio, num_images, image_urls (style refs, up to 6)

The edit_image skill uses the FireRed model to modify existing images based on text instructions. Agents can change backgrounds, swap elements, adjust styles, or composite multiple images together.
Skill: edit_image
Provider: FireRed Image Edit
Required env: FAL_API_KEY
Parameters: prompt (required), image_url (required), aspect_ratio
Credit cost: 150

Agents have full browser automation capabilities via Playwright. These tools let agents complete multi-step online tasks such as filling forms, submitting orders, downloading files, and navigating dashboards.
| Tool | Description |
|---|---|
| browser_navigate | Navigate to a URL with SSRF protection. |
| browser_click | Click an element by CSS selector or coordinates. |
| browser_type | Type text into a focused input field. |
| browser_screenshot | Capture a screenshot of the current page. |
| browser_snapshot | Get an accessibility snapshot of the page DOM. |
| browser_fill_form | Fill multiple form fields at once. |
| browser_download_file | Download a file and save to workspace. |
| browser_ask_human | Ask the human a question when the agent hits ambiguity. |
| browser_select_option | Select an option from a dropdown. |
| browser_press_key | Press a keyboard key (Enter, Tab, etc.). |
Browser tools are covered in detail in the Browser Automation section.
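The table notes that browser_navigate applies SSRF protection. The source does not describe how that check is implemented, so the following is only a generic sketch of the technique: allow http and https URLs, resolve the hostname, and reject the request if any resolved address is not publicly routable. The names `is_url_allowed` and `default_resolve` are hypothetical.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def default_resolve(host):
    """Resolve a hostname to a sorted list of IP address strings."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def is_url_allowed(url, resolve=None):
    """Generic SSRF check (an illustration, not YokeBot's actual logic).

    Returns True only for http/https URLs whose host resolves exclusively
    to globally routable addresses.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname
    if not host:
        return False
    resolve = resolve or default_resolve
    try:
        addresses = resolve(host)
        # Loopback, private, and link-local ranges all report is_global == False.
        return bool(addresses) and all(
            ipaddress.ip_address(addr).is_global for addr in addresses
        )
    except (OSError, ValueError):
        # Resolution failure or an unparsable address: refuse to navigate.
        return False
```

A check like this is only a first line of defense. A production implementation would also need to pin the resolved address for the actual connection, since a hostname can resolve differently between the check and the request.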
YokeBot supports two video generation models:
Set the FAL_API_KEY environment variable to enable video generation skills.
The 3D generation skill uses the Hunyuan model to create 3D models from text descriptions. Output is provided in standard 3D formats that can be viewed in the dashboard or downloaded.
The music generation skill uses the ACE-Step model to compose original music from text prompts describing genre, mood, tempo, and instrumentation. Generated audio files are playable directly in the dashboard.
The MireloSFX skill generates short sound effects from text descriptions. It is useful for game development, video production, and creative projects.
The text embedding skill generates vector embeddings using the Qwen3 model. These embeddings power the Knowledge Base's semantic search. Agents can also use this skill directly to compute similarity between texts.
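To illustrate how similarity between texts can be computed from embeddings, here is a small sketch using cosine similarity, a common choice for comparing embedding vectors. The source does not say which metric YokeBot uses, and the vectors below are made-up stand-ins for real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document names ordered from most to least similar to the query."""
    scored = {name: cosine_similarity(query_vec, vec) for name, vec in doc_vecs.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
docs = {
    "pricing": [0.9, 0.1, 0.0],
    "setup": [0.1, 0.9, 0.2],
}
ranking = rank_by_similarity([1.0, 0.0, 0.0], docs)
```

Semantic search follows the same pattern at scale: embed the query, score it against stored document embeddings, and return the top matches.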