Informationretrieval

⇨ Designed and implemented a search engine architecture from scratch for the CACM corpus and a sample Wikipedia corpus.
⇨ Crawled the corpus, parsed and indexed the raw documents with a simple word-count program on MapReduce, ranked pages with the standard PageRank algorithm, and retrieved relevant pages using variations of four IR approaches: BM25, TF-IDF, cosine similarity, and a Lucene-based model.
⇨ Conducted a comparative study to evaluate the performance of the different search engines.
⇨ Technologies used: Lucene, NetBeans, JSoup, Weka, MapReduce
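To make the BM25 retrieval step named above more concrete, here is a minimal sketch of BM25 scoring over an in-memory, pre-tokenized corpus. It is not the project's code: the class name `Bm25Sketch`, the parameter values `K1 = 1.2` and `B = 0.75`, and the smoothed IDF variant are all assumptions chosen for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

// Hypothetical illustration of BM25; names and parameters are assumptions,
// not taken from the project described above.
class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation (commonly used default)
    static final double B = 0.75;  // strength of document-length normalization

    // Smoothed IDF: the "1 +" inside the log keeps the value non-negative.
    static double idf(int docCount, int docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Scores one tokenized document against a tokenized query.
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double avgLen = corpus.stream().mapToInt(List::size).average().orElse(0.0);
        if (avgLen == 0.0) {
            return 0.0;  // empty corpus or only empty documents
        }
        double total = 0.0;
        for (String term : new HashSet<>(query)) {  // count each query term once
            int tf = Collections.frequency(doc, term);
            if (tf == 0) {
                continue;  // term absent from this document
            }
            int df = 0;  // number of documents containing the term
            for (List<String> d : corpus) {
                if (d.contains(term)) {
                    df++;
                }
            }
            double norm = K1 * (1.0 - B + B * doc.size() / avgLen);
            total += idf(corpus.size(), df) * tf * (K1 + 1.0) / (tf + norm);
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
            List.of("information", "retrieval", "system"),
            List.of("page", "rank", "algorithm"),
            List.of("retrieval", "model"));
        List<String> query = List.of("retrieval");
        for (List<String> doc : corpus) {
            System.out.println(doc + " -> " + score(query, doc, corpus));
        }
    }
}
```

The sketch scans the whole corpus to compute document frequencies, which is only reasonable for a toy example; a real engine would read them from an inverted index such as the one the MapReduce word-count step would produce.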
Alternatives To Informationretrieval
Project Name          Stars  Most Recent Commit  License     Language
Autocomplete          203    a year ago          apache-2.0  C#
    Persistent, simple, powerful and portable autocomplete library
Informationretrieval  7      a year ago                      Java
Java
Lucene
Information Retrieval
Search Algorithm
Information Theory