Docspell

Document management with OCR, tagging, and full-text search

2k stars
154 forks
Last commit: 23d ago
Repo age: 7y

Docspell is a self-hosted document management system focused on collecting, organizing, and finding documents such as scans, invoices, letters, and PDFs. It provides OCR, metadata extraction, and full-text search, built around a “collect, then organize” workflow.

Key Features

  • Ingest documents through multiple channels (web upload and automated watch-folder style imports)
  • OCR pipeline (commonly via Tesseract) that extracts document text and makes it searchable
  • Full-text search across OCR text and document metadata
  • Tagging, custom metadata, and structured organization (e.g., correspondents, document dates)
  • Duplicate detection support (hash-based) to avoid storing the same document twice
  • Email-based workflows and notifications, available through integrations and background jobs depending on setup
  • REST API for automations and integration with other tools
  • Multi-user support, with accounts grouped into separate collectives
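
The hash-based duplicate detection mentioned above can be illustrated with a minimal sketch. This is not Docspell's implementation: the function names, the choice of SHA-256, and the `known_digests` parameter are assumptions made for illustration only. The idea is that two files with identical bytes produce the same digest, so a file whose digest is already known can be skipped before it is stored again.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_duplicates(paths, known_digests=None):
    """Partition paths into (new, duplicates) by content hash.

    known_digests is a hypothetical stand-in for the hashes of
    documents that are already in the archive.
    """
    seen = set(known_digests or ())
    new, duplicates = [], []
    for path in paths:
        digest = file_digest(Path(path))
        if digest in seen:
            duplicates.append(path)
        else:
            seen.add(digest)
            new.append(path)
    return new, duplicates
```

Calling `split_duplicates` on the contents of a scan folder returns the files worth uploading and the ones that repeat earlier content. Note that a content hash only catches byte-identical files; two separate scans of the same sheet of paper will normally differ and are not detected this way.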

Use Cases

  • Paperless home archive: scan mail and receipts, auto-OCR, then search later by content
  • Small business bookkeeping support: store invoices/contracts and quickly retrieve by vendor/date
  • Team document repository for shared operational documents with consistent tagging

Limitations and Considerations

  • OCR quality depends heavily on input scans and the configured OCR engine/language packs
  • Feature set is oriented toward document intake/search rather than full collaborative editing

Docspell fits users who want a robust pipeline from “incoming documents” to “indexed, searchable archive” with automation-friendly ingestion. It’s particularly strong when paired with a scanner or watch-folder/import workflow to minimize manual filing effort.

Similar Services

Stirling PDF

Self-hosted web app for PDF manipulation and conversion

72.9k stars
6.2k forks
Last commit: 1d ago

Web-based PDF toolkit for merge/split/convert/OCR/redact/sign and more, with an optional API and Docker deployment.

Alternative to: Adobe Acrobat (+2 more)

Paperless-ngx

Document management with OCR and full-text search

35.5k stars
2.2k forks
Last commit: 1d ago

Self-hosted document management system that ingests scans and emails, performs OCR, extracts metadata, and provides fast full-text search with tags and workflows.

Alternative to: Adobe Acrobat (+7 more)

Reactive Resume

A free and open-source resume builder you can host yourself

34.3k stars
3.8k forks
Last commit: 1d ago

Self-hosted resume/CV builder with templates, versioning, JSON import/export, and PDF export for creating and managing multiple resumes.

Alternative to: Canva Resume Builder (+5 more)

ConvertX

Self-hosted file conversion server with a web UI and API

13.6k stars
724 forks
Last commit: 1d ago

ConvertX is a self-hosted file conversion service that provides a web interface and API to convert documents, images, audio, and video using a containerized toolchain.

Alternative to: Smallpdf (+3 more)

Documenso

Open-source document signing and workflow platform

12.2k stars
2.2k forks
Last commit: 2d ago

Self-hosted platform for preparing, sending, and tracking legally binding e-signatures with templates, audit trails, and team workflows.

Alternative to: DocuSign (+9 more)

DocuSeal

Open-source document signing and form workflows

11.1k stars
917 forks
Last commit: 4d ago

Self-hosted eSignature platform for creating templates, collecting form data, and signing PDFs with audit trails and team workflows.

Alternative to: DocuSign (+9 more)