Dedupe

🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.
Alternatives To Dedupe
Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language
Libpostal | Stars: 3,897 | Most Recent Commit: 3 months ago | Open Issues: 315 | License: mit | Language: C
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
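Dedupe | Stars: 3,879 | Downloads/Repos Using This/Packages Using This: 3910 | Most Recent Commit: 4 months ago | Total Releases: 174 | Latest Release: February 17, 2023 | Open Issues: 72 | License: mit | Language: Python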
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.
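Splink | Stars: 939 | Downloads/Repos Using This/Packages Using This: 2 | Most Recent Commit: 3 months ago | Total Releases: 119 | Latest Release: November 14, 2023 | Open Issues: 167 | License: mit | Language: Python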
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
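Recordlinkage | Stars: 808 | Downloads/Repos Using This/Packages Using This: 93 | Most Recent Commit: 9 months ago | Total Releases: 23 | Latest Release: July 20, 2023 | Open Issues: 57 | License: bsd-3-clause | Language: Python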
A powerful and modular toolkit for record linkage and duplicate detection in Python
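Talisman | Stars: 666 | Downloads/Repos Using This/Packages Using This: 1,135 48 | Most Recent Commit: a year ago | Total Releases: 30 | Latest Release: January 21, 2021 | Open Issues: 80 | License: mit | Language: JavaScript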
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Csvdedupe | Stars: 398 | Most Recent Commit: 4 years ago | Open Issues: 21 | License: other | Language: Python
🆔 Command-line tool for deduplicating CSV files
Data Matching Software | Stars: 329 | Most Recent Commit: 5 months ago | Open Issues: 8
A list of free data matching and record linkage software.
Dedupe Examples | Stars: 306 | Most Recent Commit: 2 years ago | Open Issues: 7 | License: mit | Language: Python
🆔 Examples for using the dedupe library
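Spark Lucenerdd | Stars: 127 | Most Recent Commit: 3 months ago | Total Releases: 39 | Latest Release: June 02, 2021 | Open Issues: 36 | License: apache-2.0 | Language: Scala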
Spark RDD with Lucene's query and entity linkage capabilities
Entity Embed | Stars: 98 | Most Recent Commit: 2 years ago | Total Releases: 6 | Latest Release: July 16, 2021 | License: mit | Language: Jupyter Notebook
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.