Learn how to add content from files, websites, URLs, and manual entry to build your knowledge base.
Document sources are the connectors between your content and knowledge collections. Each source type is optimized for different use cases, from uploading internal documents to continuously syncing public documentation sites.
All sources automatically:
Upload files directly from your computer to add them to a knowledge collection.
| Format | Extensions | Notes |
|---|---|---|
| Text | .txt | Plain text files |
| Markdown | .md, .markdown | Preserves formatting structure |
| HTML | .html, .htm | Converted to markdown |
| | | Text extracted from document |
Batch Upload
You can upload multiple files at once. Each file becomes a separate document in your collection, but they're all tracked under a single source.
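Before a batch upload, it can help to check which files have a supported extension. The following is a minimal sketch, assuming only the extensions that appear in the format table above; `SUPPORTED_EXTENSIONS` and `split_batch` are hypothetical names for illustration, not part of the product.

```python
from pathlib import Path

# Extensions taken from the format table above. This mapping is an
# illustrative assumption; extend it if your collection accepts more formats.
SUPPORTED_EXTENSIONS = {
    ".txt": "Text",
    ".md": "Markdown",
    ".markdown": "Markdown",
    ".html": "HTML",
    ".htm": "HTML",
}


def split_batch(paths):
    """Split file paths into (accepted, rejected) by extension, ignoring case."""
    accepted, rejected = [], []
    for path in paths:
        extension = Path(path).suffix.lower()
        if extension in SUPPORTED_EXTENSIONS:
            accepted.append(path)
        else:
            rejected.append(path)
    return accepted, rejected


accepted, rejected = split_batch(["faq.md", "README.TXT", "index.html", "logo.png"])
print(accepted)  # ['faq.md', 'README.TXT', 'index.html']
print(rejected)  # ['logo.png']
```

Each accepted path would become its own document, all tracked under the single source created by the upload.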
Automatically crawl and index an entire website or documentation site.
The starting point for crawling (for example, https://docs.example.com). The crawler only indexes pages on this domain, under the path you provide.
- https://docs.example.com (crawls all pages under docs.example.com)
- https://example.com/help (crawls /help and its subdirectories)
- https://example.com/blog/post-1 (too specific; use /blog/ instead)
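The scoping behavior in these examples can be approximated with a same-host, path-prefix check. This is a sketch of one plausible rule, not the crawler's actual implementation; `in_crawl_scope` is a hypothetical helper, and it assumes the scheme (http or https) does not affect scope.

```python
from urllib.parse import urlsplit


def in_crawl_scope(start_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url is on the same host as start_url
    and at or below the start URL's path."""
    start, candidate = urlsplit(start_url), urlsplit(candidate_url)
    if candidate.hostname != start.hostname:
        return False
    base = start.path.rstrip("/")
    # Require an exact match or a "/" boundary so /help does not match /helpdesk.
    return candidate.path == base or candidate.path.startswith(base + "/")


print(in_crawl_scope("https://docs.example.com", "https://docs.example.com/api/auth"))  # True
print(in_crawl_scope("https://example.com/help", "https://example.com/help/billing"))   # True
print(in_crawl_scope("https://example.com/help", "https://example.com/blog/post-1"))    # False
```

Under this rule, a start URL such as https://example.com/blog/post-1 would match only that one page and anything nested beneath it, which is why a broader starting point is the better choice.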
Limit the number of pages to crawl to control processing time and costs. Recommended limits:
Be Respectful of Target Sites
The crawler is rate-limited to avoid overloading target servers. Large sites may take 10-30 minutes to fully crawl. Consider using URL lists for specific important pages if you need faster results.
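To see how a page limit bounds a crawl, consider this simplified breadth-first sketch over an in-memory link graph. It is an illustration only: the `crawl` function and the `site` data are hypothetical, and a real crawler would fetch pages over the network and pause between requests.

```python
from collections import deque


def crawl(start, links, max_pages):
    """Visit pages breadth-first, stopping once max_pages have been visited.

    `links` maps each page to the pages it links to.
    """
    seen = {start}
    visited = []
    queue = deque([start])
    while queue and len(visited) < max_pages:
        page = queue.popleft()
        visited.append(page)
        for linked_page in links.get(page, []):
            if linked_page not in seen:
                seen.add(linked_page)
                queue.append(linked_page)
    return visited


site = {"/": ["/a", "/b"], "/a": ["/c"], "/b": ["/c", "/d"]}
print(crawl("/", site, max_pages=3))  # ['/', '/a', '/b']
```

With a breadth-first order, pages closest to the starting point are indexed first, so a low limit tends to keep top-level pages and drop deeply nested ones.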
Index specific pages by providing a list of URLs to scrape without crawling entire sites.
https://docs.example.com/getting-started
https://docs.example.com/api/authentication
https://docs.example.com/api/rate-limits
https://blog.example.com/best-practices
https://help.example.com/troubleshooting
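A list like this is often assembled by hand, so it may be worth cleaning it before submitting. The sketch below is one possible pre-processing step, assuming one URL per line; `clean_url_list` is a hypothetical helper, and dropping fragments (the part after `#`) is a design assumption rather than documented behavior.

```python
from urllib.parse import urlsplit, urlunsplit


def clean_url_list(raw: str) -> list[str]:
    """Return unique http(s) URLs from a newline-separated string, in input order."""
    seen = set()
    result = []
    for line in raw.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip anything that is not an absolute web URL
        # Lowercase the host and drop the fragment so duplicates collapse.
        normalized = urlunsplit(
            (parts.scheme, parts.netloc.lower(), parts.path, parts.query, "")
        )
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


raw = """https://docs.example.com/getting-started
https://Docs.example.com/getting-started#install

not a url
https://docs.example.com/api/authentication
"""
print(clean_url_list(raw))
# ['https://docs.example.com/getting-started', 'https://docs.example.com/api/authentication']
```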
Pro Tips
Type or paste content directly into the knowledge base through the UI.
Manual entries support both plain text and markdown:
Our refund policy:
Customers can request refunds within 30 days of purchase.
Refunds are processed within 5-7 business days.
Original payment method will be credited.
# Refund Policy
## Eligibility
- Within 30 days of purchase
- Product must be unused
- Original packaging required
## Processing Time
Refunds are processed within **5-7 business days**.
## Payment
Original payment method will be credited.
Markdown is Recommended
Using markdown formatting (headings, lists, bold) helps the chunking algorithm preserve document structure and improves retrieval accuracy.
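To illustrate why headings help, here is a minimal sketch of heading-based chunking. It is not the product's actual chunking algorithm; `chunk_by_heading` is a hypothetical function that starts a new chunk at every markdown heading, and it does not account for `#` characters inside fenced code blocks.

```python
import re

# Matches ATX-style markdown headings such as "# Title" or "## Section".
HEADING = re.compile(r"^#{1,6}\s+(.+)$")


def chunk_by_heading(markdown):
    """Return (heading, body) pairs, starting a new chunk at every heading."""
    chunks = []
    heading, body = None, []

    def flush():
        # Keep a chunk if it has a heading or any non-blank body text.
        if heading is not None or any(line.strip() for line in body):
            chunks.append((heading, "\n".join(body).strip()))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, body = match.group(1).strip(), []
        else:
            body.append(line)
    flush()
    return chunks


doc = """# Refund Policy
## Eligibility
- Within 30 days of purchase
- Product must be unused
## Processing Time
Refunds are processed within **5-7 business days**.
"""
for heading, body in chunk_by_heading(doc):
    print(heading, "->", repr(body))
```

Each chunk carries its heading, so a question about processing time can match the "Processing Time" chunk directly. The plain-text version of the same policy offers no such boundaries.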
Keep your knowledge base up to date by automatically re-scraping website and URL sources on a schedule.
| Frequency | Best For |
|---|---|
| Manual | Static content, one-time imports, file uploads |
| Daily | Frequently updated documentation, news content |
| Weekly | Product docs, help centers (most common) |
| Monthly | Policy documents, infrequently updated content |
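As a rough mental model of these frequencies, the next sync time can be derived from the last one. This is a sketch under stated assumptions: `SYNC_INTERVALS` and `next_sync` are hypothetical names, and treating "Monthly" as 30 days is an approximation, since the actual scheduling rules are not specified here.

```python
from datetime import datetime, timedelta

# Intervals for the frequencies in the table above.
# Assumption: "monthly" is approximated as 30 days.
SYNC_INTERVALS = {
    "manual": None,  # never scheduled; sync only when triggered by hand
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
}


def next_sync(last_sync, frequency):
    """Return the next scheduled sync time, or None for manual sources."""
    interval = SYNC_INTERVALS[frequency.lower()]
    return None if interval is None else last_sync + interval


last = datetime(2024, 1, 1, 9, 0)
print(next_sync(last, "Weekly"))  # 2024-01-08 09:00:00
print(next_sync(last, "Manual"))  # None
```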
Syncing is Non-Disruptive
Your chatbot continues to use existing content while syncing happens in the background. New content becomes available as soon as processing completes.
You can trigger a sync manually at any time:
Respect Copyright and Terms of Service
Only scrape websites you have permission to use. Most public documentation is fine, but always check the site's terms of service before indexing external content.