Upload, organize, and manage documents in the Knowledge Base.
| Format | Extension | Notes |
|---|---|---|
| Plain Text | .txt | Direct text ingestion. |
| Markdown | .md | Headings are used to improve chunk boundaries. |
| PDF | .pdf | Text is extracted. Scanned PDFs require OCR preprocessing. |
| Word Document | .docx | Text and basic formatting are extracted. |
| CSV | .csv | Each row is treated as a separate chunk. |
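
As a rough illustration of the CSV behavior in the table, the sketch below turns each data row into its own chunk. The function name `csv_rows_to_chunks` and the `column: value` rendering are assumptions made for this example; the documentation does not describe how YokeBot formats a row internally.

```python
import csv
import io


def csv_rows_to_chunks(csv_text):
    """Return one text chunk per CSV data row (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    chunks = []
    for row in reader:
        # Assumed rendering: pair each value with its column header so the
        # chunk still makes sense when retrieved on its own.
        chunks.append("; ".join(f"{key}: {value}" for key, value in row.items()))
    return chunks


sample = "name,role\nAda,engineer\nGrace,admiral\n"
for chunk in csv_rows_to_chunks(sample):
    print(chunk)
```

Keeping the header name next to each value is one way to make a single-row chunk self-describing; the actual engine may do something different.
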
Documents are split into chunks of approximately 500 tokens with a 50-token overlap. This balances retrieval precision with context completeness. For markdown files, the engine respects heading boundaries to keep sections intact.
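
The chunking described above can be sketched as follows. This is a minimal approximation, not YokeBot's implementation: it uses whitespace-separated words as a stand-in for real tokenizer tokens, and the names `chunk_tokens`, `split_on_headings`, and `chunk_markdown` are hypothetical. Only the 500-token size and the 50-token overlap come from the text.

```python
import re

# Matches ATX-style markdown headings such as "# Title" or "### Section".
HEADING_RE = re.compile(r"^#{1,6}\s")


def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # each new chunk starts 450 tokens after the last
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the final window already reached the end of the document
    return chunks


def split_on_headings(markdown_text):
    """Split markdown into sections, starting a new section at each heading."""
    sections, current = [], []
    for line in markdown_text.splitlines():
        if HEADING_RE.match(line) and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    return sections


def chunk_markdown(markdown_text, chunk_size=500, overlap=50):
    """Chunk each section separately so that no chunk spans two headings."""
    chunks = []
    for section in split_on_headings(markdown_text):
        tokens = section.split()  # assumption: words approximate tokens
        chunks.extend(" ".join(c) for c in chunk_tokens(tokens, chunk_size, overlap))
    return chunks
```

With these defaults, a 1,200-token document yields three chunks starting at tokens 0, 450, and 900, and the last 50 tokens of each chunk are repeated at the start of the next. The heading splitter is deliberately simple and would misread `#` lines inside fenced code blocks.
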
Each document in the KB shows its name, upload date, chunk count, and status (processing, ready, or error). You can:
On YokeBot Cloud, document storage counts against your team's storage quota. Self-hosted instances are limited only by disk space. There is no hard limit on the number of documents per knowledge base, but very large KBs (thousands of documents) may increase query latency.
To update a document, delete the old version and upload the new one. YokeBot does not currently support in-place document updates; each upload creates a fresh set of embeddings.