This page explains how vector embeddings and semantic search work in the Knowledge Base.
An embedding is a numerical vector representation of text that captures its semantic meaning. Texts with similar meanings produce vectors that are close together in embedding space. YokeBot uses embeddings to power semantic search — agents can find relevant documents even when the exact words differ.
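The idea of "close together in embedding space" is usually measured with cosine similarity. A minimal sketch, using toy 3-dimensional vectors in place of real 1024-dimensional embeddings (the vector values below are made up for illustration):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1;
    # higher means the texts are semantically closer.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real ones have 1024 dimensions).
refund_policy = [0.90, 0.10, 0.20]
money_back    = [0.85, 0.15, 0.25]  # similar meaning -> nearby vector
gpu_drivers   = [0.10, 0.90, 0.80]  # unrelated topic -> distant vector

print(cosine_similarity(refund_policy, money_back))   # high
print(cosine_similarity(refund_policy, gpu_drivers))  # low
```

Because the comparison is by direction rather than exact words, a query like "money back guarantee" can match a document titled "Refund policy".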
YokeBot uses the Qwen3 embedding model to generate vectors. This model produces high-quality embeddings optimized for retrieval tasks across multiple languages.
| Property | Value |
|---|---|
| Model | Qwen3 Embedding |
| Dimensions | 1024 |
| Max Input Tokens | 8192 |
| Provider | Configurable |
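In practice this means every embedding is a list of 1024 floats, and inputs longer than 8192 tokens must be rejected or truncated before embedding. A hypothetical sketch of such a wrapper; the function name and the hash-based stand-in vector are illustrative assumptions, not YokeBot's real API:

```python
import hashlib

def get_embedding(text, dimensions=1024, max_input_tokens=8192):
    # Hypothetical wrapper: the real call would go to the configured
    # embedding provider. Here a SHA-256 digest fakes a deterministic vector.
    tokens = text.split()  # crude tokenization, for illustration only
    if len(tokens) > max_input_tokens:
        raise ValueError("input exceeds the 8192-token limit")
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i % len(digest)] / 255.0 for i in range(dimensions)]

vec = get_embedding("How do I reset my password?")
print(len(vec))  # 1024
```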
When an agent queries the Knowledge Base, the following steps occur:

1. The query text is converted into an embedding vector using the same model that embedded the documents.
2. The query vector is compared against the stored chunk vectors using vector similarity search.
3. Chunks scoring below the similarity threshold are discarded.
4. The highest-scoring chunks are ranked and returned to the agent as context.
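The query flow can be sketched end to end. The `embed()` stub below is a stand-in for the real Qwen3 embedding call (an assumption for illustration), but the filter-and-rank logic mirrors the `top_k` and `similarity_threshold` parameters described on this page:

```python
import math

def embed(text):
    # Stand-in for the real embedding model: hashes characters into a
    # small fixed-size vector so the example is self-contained.
    vec = [0.0] * 8
    for i, ch in enumerate(text.lower()):
        vec[i % 8] += ord(ch) % 13 + 1
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def search(query, chunks, top_k=5, similarity_threshold=0.7):
    q = embed(query)                                      # embed the query
    scored = [(cosine(q, embed(c)), c) for c in chunks]   # score every chunk
    kept = [s for s in scored if s[0] >= similarity_threshold]  # filter
    kept.sort(key=lambda s: s[0], reverse=True)           # rank
    return kept[:top_k]                                   # return top_k
```

A production system would not re-embed chunks at query time; their vectors are computed once at upload and stored in the index.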
You can adjust search behavior with these parameters:
| Parameter | Default | Description |
|---|---|---|
| top_k | 5 | Number of chunks to retrieve per query. |
| similarity_threshold | 0.7 | Minimum similarity score (0–1) for a chunk to be included. |
| chunk_size | 500 | Approximate chunk size in tokens. Smaller chunks give more precise matches but less surrounding context. |
| chunk_overlap | 50 | Overlap between adjacent chunks to preserve context boundaries. |
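The interaction of `chunk_size` and `chunk_overlap` is easiest to see in code. A minimal sketch of overlapping sliding-window chunking (the exact splitting logic YokeBot uses is not specified here; this shows the general technique):

```python
def chunk_tokens(tokens, chunk_size=500, chunk_overlap=50):
    # Slide a window of chunk_size tokens, stepping by
    # chunk_size - chunk_overlap, so adjacent chunks share
    # chunk_overlap tokens and no sentence is cut off without context.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = list(range(1200))          # pretend token IDs
chunks = chunk_tokens(tokens)       # windows start at 0, 450, 900
```

With the defaults, a 1200-token document yields three chunks, and the last 50 tokens of each chunk reappear at the start of the next.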
For best results, YokeBot combines vector similarity search with keyword matching. If the agent's query contains specific names, codes, or identifiers that may not be captured well by embeddings alone, keyword search ensures those documents are still surfaced.
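One common way to combine the two signals is a weighted score blend. The weighting below is an illustrative assumption, not YokeBot's documented fusion method; it shows how an exact identifier match can lift a chunk that vector similarity alone would rank lower:

```python
def keyword_score(query, chunk):
    # Fraction of query tokens that appear literally in the chunk.
    # Exact matches surface identifiers (e.g. an error code) that
    # embeddings may not capture well.
    query_tokens = query.lower().split()
    chunk_tokens = set(chunk.lower().split())
    return sum(1 for t in query_tokens if t in chunk_tokens) / len(query_tokens)

def hybrid_score(vector_score, kw_score, alpha=0.7):
    # Weighted blend of the two signals. The 0.7/0.3 split is an
    # illustrative assumption, not a documented default.
    return alpha * vector_score + (1 - alpha) * kw_score

# A chunk with a weaker vector score but an exact keyword hit can
# outrank a chunk with a stronger vector score and no keyword hit:
print(hybrid_score(0.55, 1.0))  # keyword match present
print(hybrid_score(0.72, 0.0))  # no keyword match
```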
Embedding generation happens once per document upload and is the most compute-intensive step. Queries are fast even for large knowledge bases because vector search is optimized with approximate nearest neighbor (ANN) indexing.