Awesome Open Source
Search results for java tika
42 search results found
Tika
⭐
2,007
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Fscrawler
⭐
1,279
Elasticsearch File System Crawler (FS Crawler)
Datashare
⭐
519
A self-hosted search engine for documents.
Sparkler
⭐
401
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Extract
⭐
229
A cross-platform command line tool for parallelised content extraction and analysis.
Tikaondotnet
⭐
148
Use the Java Tika text extraction library on the .NET platform
Node Tika
⭐
128
Apache Tika bridge for Node.js. Text and metadata extraction, language detection and more.
Vorbisjava
⭐
109
A library for working with Ogg Vorbis files
Verticlesearchengine
⭐
98
Academic Search Engine using Scrapy, MongoDB, Lucene/Solr, Tika, Struts2, jQuery, Bootstrap, D3, CAS
Imagecat
⭐
84
ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika, and Apache OODT to ingest tens of millions of files in place (images, but extensible to other file types) and to extract metadata and OCR information from them using Tika and Tesseract OCR.
Es Amazon S3 River
⭐
59
Amazon S3 river for Elasticsearch
Rtika
⭐
52
R Interface to Apache Tika
Xponents
⭐
41
Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.
Cogstack Pipeline
⭐
39
Distributed, fault-tolerant batch processing for Natural Language Applications and Search, using remote partitioning
En4j
⭐
31
Java Desktop Client to Evernote
Nifi Extracttext Processor
⭐
28
Apache NiFi Custom Processor Extracting Text From Files with Apache Tika
Xltsearch
⭐
28
High-performance, portable and configurable desktop search application / information retrieval system
Ipfs Tika
⭐
27
Java web application that takes IPFS hashes and extracts (textual) content and metadata through Apache Tika.
Ruby_tika_app
⭐
23
A Ruby wrapper for the Tika jar (tika-app.jar) that extracts text from many file formats (PDF, XLS, DOC, etc.)
Document_search_engine_architecture
⭐
22
📄🚀 Unleash a powerful Document Search Engine with Apache NiFi for lightning-fast, comprehensive text indexing and search.
Simple Tika Server
⭐
19
Apache Tika as an HTTP service: PUT files and get the metadata as JSON
Jhighlight
⭐
18
JHighlight is an embeddable pure Java syntax highlighting library.
Tika Server
⭐
18
Apache Tika Server with Tesseract 4 Docker Setup
Alfresco Transform Core
⭐
15
Tika Text Extract
⭐
15
Extract text from a document by Apache Tika
Nanite
⭐
14
Nanite - a friendly swarm of format-identifying robots.
Tika Dl4j Spark Imgrec
⭐
13
Image recognition on Spark cluster powered by Deeplearning4j and Apache Tika
Tika Service
⭐
12
Apache Tika running as a web service
Tika Ner Corenlp
⭐
11
Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser
Quarkus Tika
⭐
10
Quarkus Tika extension
Tika Hadoop Mapreduce
⭐
10
Apache Tika integration with Java MapReduce for Hadoop
Dropwizard Tika Server
⭐
10
A Dropwizard wrapper around Apache Tika.
Leechcrawler
⭐
8
Incremental crawling capabilities for Apache Tika. Crawl content from sources such as file systems, HTTP(S) sources (web crawling), IMAP(S) servers, or your own arbitrary data sources. LeechCrawler offers additional Tika parsers that provide these crawling capabilities.
Kafka Connect Document Source
⭐
8
Kafka connector with content extraction to push extracted document contents.
Gravity
⭐
8
An efficient Java substring search library
Text Extractor
⭐
6
Tool for extracting text from Office and PDF files, as a very, very tiny alternative to Apache Tika
Tika Wrapper
⭐
6
Wraps the Apache Tika library (http://tika.apache.org/) to allow simple usage and to add or improve some features
Visualize Unstructured Data With Watson
⭐
6
Visualize unstructured data using Watson NLU
Springboot Fileserver
⭐
5
Tikaserver Ex
⭐
5
JAX-RS Server for Apache Tika
Tika Page Extractor
⭐
5
Tika per-page PDF extractor server returning content as JSON.
Lucene Example
⭐
5
Example project to show using Tika with Lucene
Copyright 2018-2024 Awesome Open Source. All rights reserved.