Awesome Open Source
Search results for spark deduplication
7 search results found
Splink
⭐ 939
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
Zingg
⭐ 828
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Spark Lucenerdd
⭐ 127
Spark RDD with Lucene's query and entity linkage capabilities
Spark Matcher
⭐ 27
Record matching and entity resolution at scale in Spark
Sparkclean
⭐ 20
A Scalable Data Cleaning Library for PySpark.
Spark Search
⭐ 20
Spark Search - high performance advanced search features based on Apache Lucene
Sparklyclean
⭐ 6
Optimal distributed data deduplication and supervised learning pipeline using Apache Spark
Copyright 2018-2024 Awesome Open Source. All rights reserved.