Awesome Open Source
Search results for python information retrieval
230 search results found
Easyocr
⭐
20,438
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Gensim
⭐
15,180
Topic Modelling for Humans
Haystack
⭐
12,474
🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Danswer
⭐
6,435
Ask Questions in natural language and get Answers backed by private sources. Connects to tools like Slack, GitHub, Confluence, etc.
Txtai
⭐
6,143
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Marqo
⭐
3,893
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Catalyst
⭐
3,151
Accelerated deep learning R&D
Flagembedding
⭐
2,797
Dense Retrieval and Retrieval-augmented LLMs
Ranking
⭐
2,666
Learning to Rank in TensorFlow
Invoicenet
⭐
2,297
Deep neural network to extract intelligent information from invoice documents.
Llmware
⭐
1,859
Providing enterprise-grade LLM-based development framework, tools, and fine-tuned models.
Pke
⭐
1,431
Python Keyphrase Extraction module
Telegram Scraper
⭐
1,356
Telegram group scraper tool. Fetches all information about group members.
Pyserini
⭐
1,291
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
Awesome Fl
⭐
1,103
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
Osi.ig
⭐
1,027
Information gathering for Instagram.
Langroid
⭐
988
Harness LLMs with Multi-Agent Programming
Mteb
⭐
941
MTEB: Massive Text Embedding Benchmark
Allrank
⭐
722
allRank is a framework for training learning-to-rank neural models based on PyTorch.
Rocketqa
⭐
707
🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
Rank_bm25
⭐
613
A Collection of BM25 Algorithms in Python
Fastrag
⭐
591
Efficient Retrieval Augmentation and Generation Framework
Simsimd
⭐
514
Vector Similarity Functions 3x-200x Faster than SciPy and NumPy — for Python, JavaScript, and C 11, supporting f64, f32, f16, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐
Deep Semantic Similarity Model
⭐
472
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014
Splade
⭐
462
SPLADE: sparse neural search (SIGIR21, SIGIR22)
Densephrases
⭐
459
ACL'2021: Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too
Openmatch
⭐
435
An Open-Source Package for Information Retrieval.
Cdqa
⭐
418
⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.
Rmdl
⭐
409
RMDL: Random Multimodel Deep Learning for Classification
Sequence Semantic Embedding
⭐
409
Tools and recipes to train deep learning models and build services for NLP tasks such as text classification, semantic search ranking and recall fetching, cross-lingual information retrieval, and question answering.
Tevatron
⭐
334
Tevatron - A flexible toolkit for neural retrieval research and development.
Rankgpt
⭐
318
Is ChatGPT Good at Search? LLMs as Re-Ranking Agent
Aquiladb
⭐
311
An easy to use Neural Search Engine. Index latent vectors along with JSON metadata and do efficient k-NN search.
Teaching
⭐
307
Open-Source Information Retrieval Courses @ TU Wien
Getaltname
⭐
306
Extract subdomains from SSL certificates in HTTPS sites.
Cherche
⭐
295
📑 Neural Search
Agent Search
⭐
295
AgentSearch is a framework for powering search agents and enabling customizable local search.
Ir_datasets
⭐
284
Provides a common interface to many IR ranking datasets.
Megabots
⭐
281
🤖 State-of-the-art, production ready LLM apps made mega-easy, so you don't have to build them from scratch 🤯 Create a bot, now 🫵
Text2text
⭐
268
Text2Text: Crosslingual NLP/G toolkit
Hdltex
⭐
252
HDLTex: Hierarchical Deep Learning for Text Classification
Conceptualsearch
⭐
230
Train a Word2Vec or LSA model, and implement conceptual search/semantic search in Solr/Lucene - Simon Hughes, Dice.com, Dice Tech Jobs
Ranx
⭐
228
⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍
Vec4ir
⭐
224
Word Embeddings for Information Retrieval
Ocrpy
⭐
218
OCR, Archive, Index and Search: Implementation agnostic OCR framework.
Forte
⭐
215
Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/
Mindflow
⭐
213
🧠 AI-powered CLI git wrapper, boilerplate code generator, chat history manager, and code search engine to streamline your dev workflow 🌊
Retromae
⭐
171
Codebase for RetroMAE and beyond.
Gpl
⭐
170
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Annlite
⭐
159
⚡ A fast embedded library for approximate nearest neighbor search
K Nrm
⭐
157
K-NRM: End-to-End Neural Ad-hoc Ranking with Kernel Pooling
Entityduetneuralranking
⭐
138
Entity-Duet Neural Ranking Model
Retriv
⭐
137
A Python Search Engine for Humans 🥸
Webdork
⭐
131
A Python tool to automate some dorking stuff to find information disclosures.
Chatgpt Retrievalqa
⭐
130
A dataset for training/evaluating Question Answering Retrieval models on ChatGPT responses, with the possibility of training/evaluating on real human responses.
Bm25
⭐
123
A Python implementation of the BM25 ranking function.
Pytrec_eval
⭐
115
pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval.
Naacl2018 Fever
⭐
110
Fact Extraction and VERification baseline published in NAACL2018
Vtext
⭐
110
Simple NLP in Rust with Python bindings
Openmatch
⭐
107
An Open-Source Package for Information Retrieval
Tika Similarity
⭐
100
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
Simxns
⭐
92
SimXNS, a research project for information retrieval, containing official implementations, by MSRA NLC team.
Mimir
⭐
89
OSINT Threat Intel Interface - CLI for HoneyDB
Capreolus
⭐
89
A toolkit for end-to-end neural ad hoc retrieval
Ip Tracker
⭐
88
Track any IP address with IP-Tracker. IP-Tracker is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracker.
Multiplex Plot
⭐
87
Multiplex: visualizations that tell stories—A Python library to create and annotate beautiful network graph visualizations, text visualizations and more.
Bert Vietnamese Question Answering
⭐
87
Vietnamese question answering system with BERT
Sert
⭐
85
Semantic Entity Retrieval Toolkit
Patzilla
⭐
83
PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
Ml4ir
⭐
83
Machine Learning for Information Retrieval
Pyndri
⭐
83
pyndri is a Python interface to the Indri search engine.
Sycamore
⭐
82
🍁 Sycamore is an LLM-powered semantic data preparation system for building search applications.
Machinelearningwithpython
⭐
79
Get started with Machine Learning with Python - An introduction with Python programming examples
Continuous Eval
⭐
78
Evaluation for LLM / RAG pipelines, ready for CI/CD
Drhard
⭐
72
Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track (SIGIR 2021 Full Paper).
Lexicalrichness
⭐
69
😸 💬 A module to compute textual lexical richness (aka lexical diversity).
Perke
⭐
67
A keyphrase extractor for Persian
Mixgcf
⭐
65
MixGCF: An Improved Training Method for Graph Neural Network-based Recommender Systems, KDD2021
Evildork
⭐
63
Evildork targeting your fiancee👁️
Parade
⭐
63
Code and data to facilitate BERT/ELECTRA for document ranking. For details, refer to the paper: PARADE: Passage Representation Aggregation for Document Reranking.
Freediscovery
⭐
60
Web Service for E-Discovery Analytics
Vectorsinsearch
⭐
59
Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Searching with Vectors' talk from Haystack 2019 (US). Builds upon my conceptual search and semantic search work from 2015
Nba Search
⭐
56
Flask application designed to explore NBA data 🏀
Rakun2
⭐
56
RaKUn 2.0 - A fast keyword detection algorithm
Bookworm
⭐
54
📚 social networks from novels
Emnlp2020
⭐
53
This is the official PyTorch code and datasets for the paper "Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News", EMNLP 2020.
Nalcos
⭐
53
Search Git commits in natural language
Pydoxtools
⭐
52
Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable pipelines and diverse sources for your projects.
Chatfaq
⭐
49
Elevate user interactions with ChatFAQ: your open-source chatbot solution, offering the full spectrum of ChatGPT capabilities. AI + LLM + CMS
Gaanaapi
⭐
48
Unofficial Gaana API
Telegram Scraper Adder_channel Member Scraper
⭐
48
A Free Tool For Scraping Telegram Group Accounts and Telegram Channel Members To Fetch All Information About Group and Channel members
Persianstemmer Python
⭐
48
PersianStemmer-Python
Emdr2
⭐
48
Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 2021)
Repbert Index
⭐
47
RepBERT is a competitive first-stage retrieval technique. It represents documents and queries with fixed-length contextualized embeddings. The inner products of them are regarded as relevance scores. Its efficiency is comparable to bag-of-words methods.
Lamp
⭐
47
LaMP: When Large Language Models Meet Personalization
Aspire
⭐
46
Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers.
Finbert Qa
⭐
39
Financial Domain Question Answering with pre-trained BERT Language Model
Coco Dr
⭐
38
[EMNLP 2022] This is the code repo for our EMNLP'22 paper "COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning".
Clipmh
⭐
38
CLIPMH: CLIP Multi-modal Hashing
Hugging Face Qa Bot
⭐
37
Open source Hugging Face Question Answering Bot to aid users in developing and troubleshooting ML solutions.
1-100 of 230 search results