Awesome Open Source
Search results for natural language processing information retrieval
73 search results found
Gensim
⭐
15,180
Topic Modelling for Humans
Haystack
⭐
12,474
🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Danswer
⭐
6,435
Ask questions in natural language and get answers backed by private sources. Connects to tools like Slack, GitHub, Confluence, etc.
Txtai
⭐
6,143
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Unstructured
⭐
4,404
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Marqo
⭐
3,893
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Catalyst
⭐
3,151
Accelerated deep learning R&D
Llmware
⭐
1,859
Provides an enterprise-grade LLM-based development framework, tools, and fine-tuned models.
Knowledge Graphs
⭐
1,599
A collection of research on knowledge graphs
Pke
⭐
1,431
Python Keyphrase Extraction module
Beir
⭐
1,370
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Awesome Fl
⭐
1,103
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
Rocketqa
⭐
707
🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
Talisman
⭐
666
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Awesome Persian Nlp Ir
⭐
658
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Fastrag
⭐
591
Efficient Retrieval Augmentation and Generation Framework
Deep Semantic Similarity Model
⭐
472
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014
Splade
⭐
462
SPLADE: sparse neural search (SIGIR21, SIGIR22)
Densephrases
⭐
459
ACL'2021: Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too
Cdqa
⭐
418
⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.
Sequence Semantic Embedding
⭐
409
Tools and recipes to train deep learning models and build services for NLP tasks such as text classification, semantic search ranking and recall fetching, cross-lingual information retrieval, and question answering.
Awesome Generative Information Retrieval
⭐
387
Automated Fact Checking Resources
⭐
303
Links to conference/journal publications in automated fact-checking (resources for the TACL22/EMNLP23 paper).
Awesome Semantic Search
⭐
301
A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.
Cherche
⭐
295
📑 Neural Search
Megabots
⭐
281
🤖 State-of-the-art, production-ready LLM apps made mega-easy, so you don't have to build them from scratch 🤯 Create a bot now 🫵
Text2text
⭐
268
Text2Text: Crosslingual NLP/G toolkit
Semantic Retrieval Models
⭐
250
A curated list of awesome papers for Semantic Retrieval (TOIS Accepted: Semantic Models for the First-stage Retrieval: A Comprehensive Review).
Vec4ir
⭐
224
Word Embeddings for Information Retrieval
Ocrpy
⭐
218
OCR, Archive, Index and Search: Implementation agnostic OCR framework.
Forte
⭐
215
Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/
Mindflow
⭐
213
🧠 AI-powered CLI git wrapper, boilerplate code generator, chat history manager, and code search engine to streamline your dev workflow 🌊
Neuralqa
⭐
207
NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT
Similarity Search Kit
⭐
195
🔎 SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.
Awesome Hungarian Nlp
⭐
192
A curated list of NLP resources for Hungarian
Gpl
⭐
170
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Dan Jurafsky Chris Manning Nlp
⭐
155
My solution to the Natural Language Processing course made by Dan Jurafsky, Chris Manning in Winter 2012.
Information Retrieval
⭐
139
Neural information retrieval / semantic-search / Bi-Encoders
Chatgpt Retrievalqa
⭐
130
A dataset for training and evaluating question answering retrieval models on ChatGPT responses, with the option of training and evaluating on real human responses.
Vtext
⭐
110
Simple NLP in Rust with Python bindings
Simxns
⭐
92
SimXNS, a research project for information retrieval, containing official implementations, by MSRA NLC team.
Bert Vietnamese Question Answering
⭐
87
Vietnamese question answering system with BERT
Multiplex Plot
⭐
87
Multiplex: visualizations that tell stories—A Python library to create and annotate beautiful network graph visualizations, text visualizations and more.
Sycamore
⭐
82
🍁 Sycamore is an LLM-powered semantic data preparation system for building search applications.
Machinelearningwithpython
⭐
79
Get started with Machine Learning with Python - An introduction with Python programming examples
Lexicalrichness
⭐
69
😸 💬 A module to compute textual lexical richness (aka lexical diversity).
Perke
⭐
67
A keyphrase extractor for Persian
Wordtokenizers.jl
⭐
63
High performance tokenizers for natural language processing and other related tasks
Awesome Nlp Research
⭐
63
Query Wellformedness
⭐
63
25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural language questions.
Freediscovery
⭐
60
Web Service for E-Discovery Analytics
Ake Datasets
⭐
57
Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
Rakun2
⭐
56
RaKUn 2.0 - A fast keyword detection algorithm
Bookworm
⭐
54
📚 social networks from novels
Nalcos
⭐
53
Search Git commits in natural language
Pydoxtools
⭐
52
Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable pipelines and diverse sources for your projects.
Textrank Keyword Extraction
⭐
51
Keyword extraction using TextRank algorithm after pre-processing the text with lemmatization, filtering unwanted parts-of-speech and other techniques.
Emdr2
⭐
48
Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 2021)
Persianstemmer Python
⭐
48
PersianStemmer-Python
Lamp
⭐
47
LaMP: When Large Language Models Meet Personalization
Aspire
⭐
46
Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers.
Oleanderstemminglibrary
⭐
43
Porter stemming library (C++)
Liblevenshtein Java
⭐
40
Various utilities regarding Levenshtein transducers. (Java)
Finbert Qa
⭐
39
Financial Domain Question Answering with pre-trained BERT Language Model
Coco Dr
⭐
38
[EMNLP 2022] This is the code repo for our EMNLP'22 paper "COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning".
Senet For Weakly Supervised Relation Extraction
⭐
35
Aila Artificial Intelligence For Legal Assistance
⭐
34
Python implementations of the various methods used at the FIRE 2019 conference.
Allsummarizer
⭐
33
Multilingual automatic text summarizer using statistical approach and extraction
Cs6101
⭐
32
The Web IR / NLP Group (WING)'s public reading group at the National University of Singapore.
Bsard
⭐
30
⚖️ A Statutory Article Retrieval Dataset in French. (ACL 2022)
Bm25transformer
⭐
30
(Python) transform a document-term matrix to an Okapi/BM25 representation
Pyplexity
⭐
30
Cleaning tool for web scraped text
Proqa
⭐
29
Progressively Pretrained Dense Corpus Index for Open-Domain QA and Information Retrieval
Drl4nlp.scratchpad
⭐
26
Notes on Deep Reinforcement Learning for Natural Language Processing papers
Text Clf Baselines
⭐
24
WideMLP for Text Classification
Fact Checking Rocks
⭐
24
Fact checking baseline combining dense retrieval and textual entailment
Colxlm
⭐
23
Multilingual Retrieval on Yelp Search Engine ⚡
Ares
⭐
21
SIGIR'22 paper: Axiomatically Regularized Pre-training for Ad hoc Search
Taste
⭐
21
[CIKM 2023] This is the code repo for our CIKM'23 paper "Text Matching Improves Sequential Recommendation by Reducing Popularity Biases".
Kex
⭐
19
Kex is a Python library for unsupervised keyword extraction from a document, providing an easy interface and benchmarks on 15 public datasets.
Neural Search Pills
⭐
18
Knowledge pills on Neural Search
Ai Distillery
⭐
17
Automatically modelling and distilling knowledge within AI. In other words, summarising the AI research firehose.
Corenlp Jmwe
⭐
15
Stanford CoreNLP annotator implementing jMWE for detecting Multi-Word Expressions / collocations
Contributions Ner Cs
⭐
15
This repository hosts the dataset for the paper Computer Science Named Entity Recognition in the Open Research Knowledge Graph
Learning2hash.github.io
⭐
15
Website for "A survey of learning to hash for Computer Vision" https://learning2hash.github.io
Stopword Trainer
⭐
14
A module for creating stopword lists for any language, based on a set of documents.
Ml Nlp Services
⭐
14
Machine learning, deep learning, natural language processing
Semantic Role Labeler
⭐
14
A semantic role labeling system for the Sumerian language. A Google Summer of Code '18 initiative.
Personified Chatbot
⭐
14
A personified chatbot responding to a query based on the answering pattern of Dr. APJ Abdul Kalam using Information Retrieval, Natural Language Processing, and Deep Learning techniques.
Cdqa Ui
⭐
13
⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Csk
⭐
13
Code for generating Quasimodo, a commonsense knowledge base.
Gdsr
⭐
13
⚖️ A Graph-augmented Dense Statute Retriever. (EACL 2023)
Information_retrieval_system
⭐
13
The goal of this project is to implement a basic information retrieval system using Python, NLTK and GenSIM.
Paperlist_nlp_ir_rec_ai_conference
⭐
13
Paper list for top NLP/IR/RecSys/AI conferences from 2016 to the present, with a table of contents, for convenient…
Dlkp
⭐
12
A deep learning library for identifying keyphrases from text
Who Killed Laura Palmer
⭐
12
Simple Question Answering system, based on data crawled from Twin Peaks Wiki. It is built using 🔍 Haystack, an awesome open-source framework for building search systems that work intelligently over large document collections.
Useb
⭐
12
Heterogeneous, Task- and Domain-Specific Benchmark for Unsupervised Sentence Embeddings used in the TSDAE paper: https://arxiv.org/abs/2104.06979.
Irel Reading Group
⭐
12
This repository contains the resources used for presentation/discussion in weekly iRE Lab meetings.
Quranic Search V2
⭐
12
Quranic Lexical/Semantic Search
Swim Ir
⭐
11
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.
Copyright 2018-2024 Awesome Open Source. All rights reserved.