Awesome Open Source
Search results for minhash
42 search results found
Datasketch
⭐
2,236
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Sourmash
⭐
431
Quickly search, compare, and analyze genomic and metagenomic data sets.
Bloom Filters
⭐
304
JS implementation of probabilistic data structures: Bloom Filter (and its derivatives), HyperLogLog, Count-Min Sketch, Top-K and MinHash
Lsh
⭐
243
Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents
Sketch
⭐
146
C++ Implementations of sketch data structures with SIMD Parallelism, including Python bindings
Sketchy
⭐
146
Sketching Algorithms for Clojure (Bloom filter, MinHash, HyperLogLog, Count-Min Sketch)
Minhashcuda
⭐
105
Weighted MinHash implementation on CUDA (multi-gpu).
Intertext
⭐
86
Detect and visualize text reuse
Hash4j
⭐
66
Dynatrace hash library for Java
Elasticsearch Minhash
⭐
59
Elasticsearch plugin for the b-bit MinHash algorithm
Consimilo
⭐
53
A Clojure library for querying large data-sets on similarity
Hyperminhash Java
⭐
49
Union, intersection, and set cardinality in loglog space
Groot
⭐
43
A resistome profiler for Graphing Resistance Out Of meTagenomes
Minhash
⭐
39
Quickly estimate the similarity between many sets
Flajolet
⭐
37
Probabilistic data structures for OCaml
Sketchy
⭐
35
Genomic neighbor typing of bacterial pathogens using MinHash 🐀
Lshr
⭐
33
Locality Sensitive Hashing In R
Minhash
⭐
33
Tools for the b-bit MinHash algorithm.
Rkmh
⭐
32
Classify sequencing reads using MinHash.
Set Sketch Paper
⭐
23
SetSketch: Filling the Gap between MinHash and HyperLogLog
Sampled Minhashing
⭐
22
A method to mine beyond-pairwise relationships using Min-Hashing for large-scale pattern discovery
Gaoya
⭐
19
Locality Sensitive Hashing
Mashing Pumpkins
⭐
19
Minhash and maxhash library in Python, combining flexibility, expressivity, and performance.
Mkmh
⭐
18
Generate kmers/minimizers/hashes/MinHash signatures, including with multiple kmer sizes.
Wort
⭐
17
A database for signatures of public genomic sources
Mnemophonix
⭐
17
A simple audio fingerprinting system
Spark
⭐
16
Python 2.7 code and learning notes for Spark 2.1.1
Similarity Search Java
⭐
16
Easy-to-use Java similarity algorithms for text and numeric-series
Minhash Lsh
⭐
16
Minhash LSH in Golang
Text Shingles
⭐
15
k-shingling for text to help compare similarity
Neural Scam Artist
⭐
15
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Catch Me If You Can
⭐
13
plagiarism detector
Probminhash
⭐
13
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
Treeminhash
⭐
12
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation
Minhash Node Rs
⭐
12
MinHash and LSH index written in Rust for Node.js
Gsearch
⭐
11
Approximate nearest neighbor search for microbial genomes based on probminhash and HNSW
Sourmash Rust
⭐
7
Rust implementation of sourmash core functionality
Document Similarity
⭐
6
Using Jaccard-Similarity and Minhashing to determine similarity between two text documents
2017 Recomb
⭐
6
Poster presented at RECOMB 2017
Vokter
⭐
5
Document store that periodically checks for changes in web documents
K Freqitems
⭐
5
Massive Sparse Data Clustering Based on Frequent Items (SIGMOD 2023)
Plasmidpicker
⭐
5
Software to identify plasmid sequence data from metagenomes using logistic regression and MinHash
Copyright 2018-2024 Awesome Open Source. All rights reserved.