Managing Documents

Upload, organize, and manage documents in the Knowledge Base.

Supported File Types

Format          Extension   Notes
Plain Text      .txt        Direct text ingestion.
Markdown        .md         Headings are used to improve chunk boundaries.
PDF             .pdf        Text is extracted. Scanned PDFs require OCR preprocessing.
Word Document   .docx       Text and basic formatting are extracted.
CSV             .csv        Each row is treated as a separate chunk.
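
To illustrate what "each row is treated as a separate chunk" can mean for a CSV file, here is a minimal sketch. The function name csv_rows_to_chunks and the "column: value" serialization are assumptions made for illustration; the engine's actual row format is not documented here.

```python
import csv
import io


def csv_rows_to_chunks(csv_text):
    """Turn each data row into one text chunk.

    Assumption: values are paired with their header names so that each
    chunk is readable on its own. The real engine may format rows differently.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        "; ".join(f"{key}: {value}" for key, value in row.items())
        for row in reader
    ]


sample = (
    "question,answer\n"
    "How do I reset my password?,Use the account settings page.\n"
    "Where are invoices?,Under Billing in the sidebar.\n"
)

for chunk in csv_rows_to_chunks(sample):
    print(chunk)
```

Because every row becomes its own chunk, a CSV works best when each row is meaningful on its own, for example one question-and-answer pair per row.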

Uploading Documents

  1. Navigate to Knowledge Base in the sidebar.
  2. Select an existing KB or create a new one.
  3. Click "Upload Documents".
  4. Drag and drop files or click to browse. You can upload multiple files at once.
  5. The engine processes each file by extracting its text, splitting it into chunks, and generating embeddings. Progress is shown in the upload panel.

Chunking Strategy

Documents are split into chunks of approximately 500 tokens, and consecutive chunks share a 50-token overlap. This balances retrieval precision with context completeness. For Markdown files, the engine respects heading boundaries to keep sections intact.
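
The fixed-size windowing described above can be sketched as follows. This is an illustration under stated assumptions, not the engine's implementation: chunk_tokens is a hypothetical helper, the token list stands in for the output of a real tokenizer, and heading-aware splitting for Markdown is not modeled.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split a token list into windows of chunk_size tokens.

    Each window starts (chunk_size - overlap) tokens after the previous
    one, so neighbouring chunks share `overlap` tokens.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the document.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Placeholder tokens; a real pipeline would use its model's tokenizer.
tokens = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])          # → [500, 500, 300]
print(chunks[0][-50:] == chunks[1][:50])  # → True
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some tokens twice.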

Tip: If retrieval quality seems poor, try breaking long documents into smaller, topic-focused files before uploading.
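
One way to break a long Markdown document into topic-focused pieces before uploading is to split it at a chosen heading level. The sketch below is a local preprocessing idea, not a YokeBot feature; split_by_heading and the "preamble" label are hypothetical names, and the simple pattern does not skip "#" lines inside fenced code blocks.

```python
import re


def split_by_heading(markdown_text, level=2):
    """Split Markdown at headings of exactly `level` and return (title, body) pairs.

    Text before the first matching heading is returned under the title
    "preamble". Deeper headings stay inside their parent section.
    """
    pattern = re.compile(rf"^{'#' * level} +(.+)$", re.MULTILINE)
    matches = list(pattern.finditer(markdown_text))
    sections = []

    preamble = markdown_text[:matches[0].start()] if matches else markdown_text
    if preamble.strip():
        sections.append(("preamble", preamble.strip()))

    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown_text)
        sections.append((match.group(1).strip(), markdown_text[match.start():end].strip()))
    return sections


doc = (
    "# Handbook\n\nIntro.\n\n"
    "## Billing\n\nInvoices are monthly.\n\n### Refunds\n\nWithin 30 days.\n\n"
    "## Security\n\nUse SSO.\n"
)

for title, body in split_by_heading(doc):
    print(title, len(body))
```

Each returned body can then be saved as its own .md file and uploaded separately, so that every file covers a single topic.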

Viewing and Managing Documents

Each document in the KB shows its name, upload date, chunk count, and status (processing, ready, or error). You can:

  • Preview — view the extracted text and chunk boundaries.
  • Re-process — re-run chunking and embedding (useful if you updated the source file).
  • Delete — remove the document and its embeddings permanently.

Document Limits

On YokeBot Cloud, document storage counts against your team's storage quota. Self-hosted instances are limited only by disk space. There is no hard limit on the number of documents per knowledge base, but very large KBs (thousands of documents) may increase query latency.

Replacing Documents

To update a document, delete the old version and upload the new one. YokeBot does not currently support in-place document updates — each upload creates a fresh set of embeddings.