Pdf Discovery Demo

Demonstration of searching PDF documents with Solr, Tika, and Tesseract
Alternatives To Pdf Discovery Demo
Node Tika
Stars: 128; Downloads / Repos Using This / Packages Using This: 155; Most Recent Commit: 4 years ago; Total Releases: 23; Latest Release: February 22, 2017; Open Issues: 10; License: mit; Language: Java
Apache Tika bridge for Node.js. Text and metadata extraction, language detection and more.

Php Apache Tika
Stars: 104; Downloads / Repos Using This / Packages Using This: 33; Most Recent Commit: 8 months ago; Total Releases: 38; Latest Release: April 14, 2023; License: mit; Language: PHP
Apache Tika bindings for PHP: extract text and metadata from documents, images and other formats

Imagecat
Stars: 84; Most Recent Commit: 6 years ago; Language: Java
ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (images, but could be extended to other files) in place, and to extract metadata and OCR information from those files/images using Tika and Tesseract OCR.

Harvester
Stars: 59; Most Recent Commit: 7 years ago; Open Issues: 3; License: gpl-3.0; Language: JavaScript
Web crawling and document processing through a usable interface.

Rtika
Stars: 52; Downloads / Repos Using This / Packages Using This: 1; Most Recent Commit: a year ago; Total Releases: 8; Latest Release: April 25, 2020; Open Issues: 3; License: apache-2.0; Language: R
R Interface to Apache Tika

Doc_processing_toolkit
Stars: 52; Most Recent Commit: 7 years ago; Open Issues: 4; License: other; Language: Python
Python library to extract text from PDF, and default to OCR when text extraction fails.

Cogstack Pipeline
Stars: 39; Most Recent Commit: a year ago; License: other; Language: Java
Distributed, fault tolerant batch processing for Natural Language Applications and Search, using remote partitioning

Pdf Discovery Demo
Stars: 24; Most Recent Commit: a year ago; Open Issues: 2; License: apache-2.0; Language: JavaScript
Demonstration of searching PDF documents with Solr, Tika, and Tesseract

Tika Server
Stars: 18; Most Recent Commit: 3 years ago; Open Issues: 2; License: apache-2.0; Language: Java
Apache Tika Server with Tesseract 4 Docker Setup

Tika Service
Stars: 12; Most Recent Commit: a year ago; License: apache-2.0; Language: Java
Apache Tika running as a web service