
Docspell
Document management with OCR, tagging, and full-text search

Docspell is a self-hosted document management system focused on collecting, organizing, and finding documents such as scans, invoices, letters, and PDFs. It provides OCR, metadata extraction, and a workflow-oriented “collect/organize” model with powerful search.
Key Features
- Ingest documents through multiple channels (web upload and automated watch-folder style imports)
- OCR pipeline (commonly via Tesseract) that extracts text from scans so documents become searchable
- Full-text search across OCR text and document metadata
- Tagging, custom metadata, and structured organization (e.g., correspondents, document dates)
- Duplicate detection support (hash-based) to avoid storing the same document twice
- Email-based workflows and notifications (for example, importing from a mailbox or sending notification mails), depending on setup
- REST API for automations and integration with other tools
- Multi-user support, with accounts grouped into "collectives" that keep each group's documents separate
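The hash-based duplicate detection and watch-folder import mentioned above can be illustrated with a minimal sketch. This is not Docspell's actual implementation: the function names `file_digest` and `find_new_documents` are hypothetical, and the choice of SHA-256 is an assumption made for the example. It only shows the general idea of comparing content hashes before storing a file.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_new_documents(folder: Path, seen: set[str]) -> list[Path]:
    """Return files in `folder` whose content hash is not yet in `seen`.

    Hypothetical helper: `seen` stands in for whatever store of known
    hashes a real system would keep. It is updated in place.
    """
    new_files = []
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        digest = file_digest(path)
        if digest in seen:
            continue  # identical content was already recorded
        seen.add(digest)
        new_files.append(path)
    return new_files
```

A watch-folder script could call `find_new_documents` on a schedule and hand each returned path to an uploader. The upload step is left out here because it depends on the specific Docspell setup and API configuration.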
Use Cases
- Paperless home archive: scan mail and receipts, auto-OCR, then search later by content
- Small business bookkeeping support: store invoices/contracts and quickly retrieve by vendor/date
- Team document repository for shared operational documents with consistent tagging
Limitations and Considerations
- OCR quality depends heavily on input scans and the configured OCR engine/language packs
- Feature set is oriented toward document intake/search rather than full collaborative editing
Docspell fits users who want a robust pipeline from “incoming documents” to “indexed, searchable archive” with automation-friendly ingestion. It’s particularly strong when paired with a scanner or watch-folder/import workflow to minimize manual filing effort.
Similar Services

Stirling PDF
Self-hosted web app for PDF manipulation and conversion
Web-based PDF toolkit for merge/split/convert/OCR/redact/sign and more, with an optional API and Docker deployment.


Paperless-ngx
Document management with OCR and full-text search
Self-hosted document management system that ingests scans and emails, performs OCR, extracts metadata, and provides fast full-text search with tags and workflows.


Reactive Resume
A free and open-source resume builder you can host yourself
Self-hosted resume/CV builder with templates, versioning, JSON import/export, and PDF export for creating and managing multiple resumes.

ConvertX
Self-hosted file conversion server with a web UI and API
ConvertX is a self-hosted file conversion service that provides a web interface and API to convert documents, images, audio, and video using a containerized toolchain.


Documenso
Open-source document signing and workflow platform
Self-hosted platform for preparing, sending, and tracking legally binding e-signatures with templates, audit trails, and team workflows.


DocuSeal
Open-source document signing and form workflows
Self-hosted eSignature platform for creating templates, collecting form data, and signing PDFs with audit trails and team workflows.

Elasticsearch