| Project Name | Stars | Most Recent Commit | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|
| Quivr | 22,922 | 2 days ago | 139 | apache-2.0 | TypeScript | 🧠 Your Second Brain supercharged by Generative AI 🧠 Dump all your files and chat with your personal assistant on your files & more using GPT 3.5/4, Private, Anthropic, VertexAI, LLMs... |
| H2ogpt | 7,573 | a day ago | 171 | apache-2.0 | Python | Private Q&A and summarization of documents+images or chat with local GPT, 100% private, Apache 2.0. Supports LLaMa2, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/ |
| Mygptreader | 4,196 | 10 days ago | 11 | mit | Python | A community-driven way to read and chat with AI bots - powered by chatGPT. |
| Awesome Knowledge Graph | 2,122 | 3 years ago | 4 | | | A curated collection of learning resources related to knowledge graphs. |
| Generative Ai | 2,041 | 2 days ago | 21 | apache-2.0 | Jupyter Notebook | Sample code and notebooks for Generative AI on Google Cloud |
| Gptdiscord | 1,482 | 25 days ago | 38 | mit | Python | A robust, all-in-one GPT3 interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more! |
| Llama Node | 759 | 2 months ago | 38 | apache-2.0 | Rust | Believe in AI democratization. llama for nodejs backed by llama-rs, llama.cpp and rwkv.cpp, works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models. |
| Nlp | 734 | 3 years ago | 7 | | Python | :memo: This repository recorded my NLP journey. |
| Nlp Notebooks | 710 | a year ago | 6 | | Jupyter Notebook | A collection of notebooks for Natural Language Processing from NLP Town |
| Seagoat | 581 | 18 hours ago | 28 | mit | Python | A local-first semantic code search engine |
1️⃣st place at SIGIR eCom Challenge 2020
2️⃣nd place and Best Paper Award at WSDM Booking.com Challenge 2021
2️⃣nd place at Twitter Recsys Challenge 2021
3️⃣rd place at KDD Cup 2021
Cleora is a genus of moths in the family Geometridae. Their scientific name derives from the Ancient Greek geo γῆ or γαῖα "the earth", and metron μέτρον "measure" in reference to the way their larvae, or "inchworms", appear to "measure the earth" as they move along in a looping fashion.
Cleora is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.
Read the whitepaper "Cleora: A Simple, Strong and Scalable Graph Embedding Scheme"
Cleora embeds entities in n-dimensional spherical spaces using extremely fast, stable, iterative random projections, which allows for unparalleled performance and scalability.
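For intuition, here is a minimal NumPy sketch of the kind of iterative scheme described above: random initialization, repeated multiplication by a random-walk transition matrix, and L2 normalization back onto the unit sphere. It is an illustrative approximation based on the whitepaper's description, not the Rust implementation itself; the names `cleora_style_embed` and `adjacency` are ours.

```python
import numpy as np

def cleora_style_embed(adjacency: np.ndarray, dim: int = 128, iterations: int = 4, seed: int = 0) -> np.ndarray:
    """Illustrative sketch of a Cleora-style embedding loop (see the whitepaper for the exact scheme).

    adjacency: symmetric (n x n) matrix of an undirected graph.
    Returns an (n x dim) matrix of L2-normalized embeddings.
    """
    rng = np.random.default_rng(seed)
    n = adjacency.shape[0]
    # Random-walk transition matrix M = D^-1 A.
    degrees = adjacency.sum(axis=1, keepdims=True)
    transition = adjacency / np.maximum(degrees, 1e-12)
    # Random initialization (here: uniform over {-1, +1}).
    emb = rng.choice([-1.0, 1.0], size=(n, dim))
    for _ in range(iterations):
        emb = transition @ emb                        # propagate over neighbours
        norms = np.linalg.norm(emb, axis=1, keepdims=True)
        emb = emb / np.maximum(norms, 1e-12)          # project back onto the unit sphere
    return emb

# Tiny usage example on a 4-node path graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(cleora_style_embed(A, dim=8, iterations=3).shape)  # (4, 8)
```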
Types of data which can be embedded include, for example:
Key competitive advantages of Cleora:
Embedding times - example:
| Algorithm | FB dataset | RoadNet dataset | LiveJournal dataset |
|---|---|---|---|
| Cleora | 00:00:43 h | 00:21:59 h | 01:31:42 h |
| PyTorch-BigGraph | 00:04:33 h | 00:31:11 h | 07:10:00 h |
Link Prediction results - example:
| Algorithm | FB MRR | FB HitRate@10 | RoadNet MRR | RoadNet HitRate@10 | LiveJournal MRR | LiveJournal HitRate@10 |
|---|---|---|---|---|---|---|
| Cleora | 0.072 | 0.172 | 0.929 | 0.942 | 0.586 | 0.627 |
| PyTorch-BigGraph | 0.035 | 0.072 | 0.850 | 0.866 | 0.565 | 0.672 |
Cleora is built as a multi-purpose "just embed it" tool, suitable for many different data types and formats.
Cleora ingests a relational table of rows representing a typed and undirected heterogeneous hypergraph, which can contain multiple:
For example, a relational table representing shopping baskets may have the following columns:
`user <\t> product <\t> store`
With the input file containing values:
`user_id <\t> product_id product_id product_id <\t> store_id`
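As a concrete illustration, the snippet below writes a tiny input file in this layout: tab-separated columns, with the multiple product identifiers of a basket separated by spaces inside their column. The file name and identifiers here are made up for the example.

```python
# Write a toy shopping-basket input file: user <TAB> products (space-separated) <TAB> store.
rows = [
    ("user_1", ["product_7", "product_12", "product_5"], "store_3"),
    ("user_2", ["product_12"], "store_1"),
]

with open("baskets.tsv", "w", encoding="utf-8") as f:   # hypothetical file name
    for user, products, store in rows:
        f.write(f"{user}\t{' '.join(products)}\t{store}\n")
```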
Every column has a type, which is used to determine whether spaces of identifiers between different columns are shared or distinct. It is possible for two columns to share a type, which is the case for homogeneous graphs:
`user <\t> user`
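If you drive the binary from a script, the column specification tells Cleora which columns share a type and how to treat each one. The sketch below is hedged: the flag names (`--input`, `--columns`, `--dimension`, `--number-of-iterations`) and the `complex::` modifier reflect our reading of the project documentation and should be treated as assumptions; check `cleora --help` and the full documentation for the exact syntax.

```python
import subprocess

# Hedged sketch of invoking the Cleora binary; flag names and the column-spec
# syntax below are assumptions, not a confirmed interface.
subprocess.run(
    [
        "./cleora",
        "--input", "baskets.tsv",                     # file written in the earlier sketch
        "--columns", "user complex::product store",   # assumed spec: "complex::" marks a
                                                      # column holding many space-separated values
        "--dimension", "128",
        "--number-of-iterations", "4",
    ],
    check=True,
)
```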
Based on the column format specification, Cleora performs:
The final output of Cleora consists of multiple files for each (undirected) pair of entity types in the table.
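To give a feel for consuming such an output file, here is a hedged loader sketch. It assumes the plain-text layout we understand from the documentation: a header line with the entity count and dimension, then one line per entity containing its name, an occurrence count, and the vector values. Verify the layout against the files your Cleora version produces before relying on this.

```python
import numpy as np

def load_cleora_embeddings(path: str) -> dict[str, np.ndarray]:
    """Hedged sketch of a loader for Cleora's text output.

    Assumed layout: first line "<num_entities> <dimension>", then one line per entity:
    "<entity> <occurrence_count> <v1> ... <vd>". Check your output files before use.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        header = f.readline().split()
        dim = int(header[-1])                     # embedding dimension from the header line
        for line in f:
            parts = line.split()
            entity = parts[0]
            vector = np.array(parts[-dim:], dtype=np.float32)  # last `dim` fields are the vector
            embeddings[entity] = vector
    return embeddings
```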
Those embeddings can then be utilized in a novel way thanks to their dim-wise independence property, which is described further below.
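Concretely, because every dimension is computed independently of the others, very wide embeddings can be produced in chunks (even on separate machines or runs) and simply concatenated. The toy illustration below reuses the `cleora_style_embed` sketch and the adjacency matrix `A` defined in the earlier example (our own helpers, not part of Cleora):

```python
import numpy as np

# Two independently computed 64-dimensional chunks (different random seeds)...
chunk_a = cleora_style_embed(A, dim=64, iterations=4, seed=1)
chunk_b = cleora_style_embed(A, dim=64, iterations=4, seed=2)

# ...concatenated into a single 128-dimensional embedding table.
wide = np.concatenate([chunk_a, chunk_b], axis=1)
print(wide.shape)  # (4, 128)
```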
The embeddings produced by Cleora differ from those produced by Node2vec, Word2vec, DeepWalk and other systems in this class in a number of key properties:
The technical properties described above make Cleora well suited for production use; from the end-user perspective, this can be summarized as follows:
More information can be found in the full documentation.
Cleora Enterprise is now available for selected customers. Key improvements over this open-source version:
For details contact us at [email protected]
Please cite our paper (and the respective papers of the methods used) if you use this code in your own work:
@article{DBLP:journals/corr/abs-2102-02302,
  author  = {Barbara Rychalska and Piotr Babel and Konrad Goluchowski and Andrzej Michalowski and Jacek Dabrowski},
  title   = {Cleora: {A} Simple, Strong and Scalable Graph Embedding Scheme},
  journal = {CoRR},
  volume  = {abs/2102.02302},
  year    = {2021}
}
Synerise Cleora is MIT licensed, as found in the LICENSE file.
You are welcome to contribute to this open-source toolbox. Detailed instructions will be released soon as issues.