Informationretrieval

⇨ Designed and implemented a search engine architecture from scratch for the CACM corpus and a sample Wikipedia corpus.
⇨ Crawled the corpus; parsed and indexed the raw documents with a simple word-count program written in MapReduce; ranked pages with the standard PageRank algorithm; and retrieved relevant pages using variations of four IR approaches: BM25, TF-IDF, cosine similarity, and a Lucene-based IR model.
⇨ Conducted a comparative study to evaluate the performance of the different search engines.
⇨ Technologies used: Lucene, NetBeans, JSoup, Weka, MapReduce
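
The repository's own code is not reproduced on this page, so the following is only a minimal sketch of how one of the listed approaches, BM25, might rank documents over a small in-memory inverted index. The class name `Bm25Sketch`, the tokenizer, the parameter values `K1 = 1.2` and `B = 0.75`, and the non-negative IDF variant are all assumptions chosen for illustration, not the project's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative BM25 ranking over an in-memory inverted index (hypothetical class). */
class Bm25Sketch {
    // Assumed, commonly used defaults; the project's real values are not stated.
    static final double K1 = 1.2;
    static final double B = 0.75;

    // term -> (docId -> term frequency)
    private final Map<String, Map<Integer, Integer>> index = new HashMap<>();
    // docId -> number of tokens in the document
    private final Map<Integer, Integer> docLengths = new HashMap<>();

    /** Lowercases, splits on non-alphanumerics, and records term frequencies. */
    void addDocument(int docId, String text) {
        int length = 0;
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (token.isEmpty()) continue;
            index.computeIfAbsent(token, t -> new HashMap<>()).merge(docId, 1, Integer::sum);
            length++;
        }
        docLengths.put(docId, length);
    }

    /** Returns a BM25 score for every document matching at least one query term. */
    Map<Integer, Double> score(String query) {
        int n = docLengths.size();
        double avgLen = docLengths.values().stream()
                .mapToInt(Integer::intValue).average().orElse(0.0);
        Map<Integer, Double> scores = new HashMap<>();
        for (String term : query.toLowerCase().split("[^a-z0-9]+")) {
            Map<Integer, Integer> postings = index.get(term);
            if (postings == null) continue;
            int df = postings.size();
            // Non-negative IDF variant (an assumption of this sketch).
            double idf = Math.log(1.0 + (n - df + 0.5) / (df + 0.5));
            for (Map.Entry<Integer, Integer> e : postings.entrySet()) {
                double tf = e.getValue();
                // Length normalization relative to the average document length.
                double norm = 1 - B + B * docLengths.get(e.getKey()) / avgLen;
                double termScore = idf * tf * (K1 + 1) / (tf + K1 * norm);
                scores.merge(e.getKey(), termScore, Double::sum);
            }
        }
        return scores;
    }

    /** Returns matching document ids ordered by descending BM25 score. */
    List<Integer> rank(String query) {
        List<Map.Entry<Integer, Double>> entries = new ArrayList<>(score(query).entrySet());
        entries.sort(Map.Entry.<Integer, Double>comparingByValue().reversed());
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, Double> e : entries) ids.add(e.getKey());
        return ids;
    }

    public static void main(String[] args) {
        Bm25Sketch engine = new Bm25Sketch();
        engine.addDocument(1, "information retrieval with lucene");
        engine.addDocument(2, "lucene lucene search engine");
        engine.addDocument(3, "page rank for web graphs");
        System.out.println(engine.rank("lucene"));
    }
}
```

In this toy corpus, document 2 ranks ahead of document 1 for the query `lucene` because it contains the term twice at the same document length; document 3 does not match and is omitted.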


Stars: 7

License: No license specified

Most Recent Commit: a year ago

Programming Language: Java

Categories

Programming Languages > Java

Data Processing > Lucene

Data Processing > Information Retrieval

Computer Science > Search Algorithm

Mathematics > Information Theory


Alternatives To Informationretrieval

Project Name	Stars	Most Recent Commit	License	Language
Autocomplete	203	a year ago	apache-2.0	C#
Persistent, simple, powerful and portable autocomplete library
Informationretrieval	7	a year ago		Java



Popular Lucene Projects

Elasticsearch Analysis Ik ⭐ 15,853

The IK Analysis plugin integrates the Lucene IK analyzer into Elasticsearch and supports customized dictionaries.

dependent packages 1 | total releases 17 | latest release January 15, 2018 | most recent commit 3 months ago

Awesome Elasticsearch ⭐ 4,702

A curated list of the most important and useful resources about Elasticsearch: articles, videos, blogs, tips and tricks, use cases. All about Elasticsearch!

most recent commit 3 months ago

Lucene Solr ⭐ 4,363

Apache Lucene and Solr open-source search software

most recent commit 3 months ago

Crate ⭐ 3,864

CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.

dependent packages 1 | total releases 13 | latest release October 25, 2016 | most recent commit 3 months ago

Roaringbitmap ⭐ 3,308

A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Tablesaw, and many others

dependent packages 124 | total releases 187 | latest release September 22, 2023 | most recent commit 3 months ago

Popular Search Algorithm Projects

Algorithms ⭐ 16,429

A collection of algorithms and data structures

most recent commit 2 months ago

Cosmos ⭐ 13,428

World's largest contributor-driven code dataset | Used in Quark Search Engine, @OpenGenus IQ, OpenGenus Visual Project

most recent commit 5 months ago

Flexsearch ⭐ 11,139

Next-generation full-text search library for Browser and Node.js

dependent packages 212 | total releases 81 | latest release October 03, 2022 | most recent commit 3 months ago

Algods ⭐ 3,410

Implementation of Algorithms and Data Structures, Problems and Solutions

most recent commit 7 months ago

Data Structure And Algorithms With Es6 ⭐ 1,012

Data Structures and Algorithms using ES6

most recent commit 4 years ago

Popular Data Processing Categories