soweego is a pipeline that connects Wikidata to large-scale third-party catalogs.
soweego is the only system that makes statisticians, epidemiologists, historians, and computer scientists agree. Why? Because it performs record linkage, data matching, and entity resolution at the same time. Too easy, they all seem to be synonyms!
soweego is made possible thanks to the Wikimedia Foundation.
$ git clone https://github.com/Wikidata/soweego.git
$ cd soweego
$ ./docker/run.sh
Building soweego
...
:/app/soweego#
Now it's too late to get out!
Piece of cake:
:/app/soweego# python -m soweego run CATALOG
These steps are executed by default:
Results are in
You can launch every single soweego action with CLI commands:
:/app/soweego# python -m soweego
Usage: soweego [OPTIONS] COMMAND [ARGS]...

  Link Wikidata to large catalogs.

Options:
  -l, --log-level <TEXT CHOICE>...
                  Module name followed by one of [DEBUG, INFO, WARNING,
                  ERROR, CRITICAL]. Multiple pairs allowed.
  --help          Show this message and exit.

Commands:
  importer  Import target catalog dumps into a SQL database.
  ingester  Take soweego output into Wikidata items.
  linker    Link Wikidata items to target catalog identifiers.
  run       Launch the whole pipeline.
  sync      Sync Wikidata to target catalogs.
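The usage line above puts options before the command, and each --log-level takes a module name followed by a level. As a rough sketch of how such a command line could be assembled from a script, the helper below builds the argument list without running anything. The helper name build_soweego_command and the module name soweego.linker are assumptions made for illustration; only the commands, the --log-level option, and the level names come from the help text, and CATALOG remains a placeholder.

```python
import shlex
from typing import List, Sequence, Tuple

# Values copied from the help output above.
LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
COMMANDS = ("importer", "ingester", "linker", "run", "sync")


def build_soweego_command(
    command: str,
    args: Sequence[str] = (),
    log_levels: Sequence[Tuple[str, str]] = (),
) -> List[str]:
    """Build an argument list for `python -m soweego` (hypothetical helper).

    `log_levels` holds (module name, level) pairs; each pair becomes one
    `--log-level MODULE LEVEL` option placed before the command.
    """
    if command not in COMMANDS:
        raise ValueError(f"unknown soweego command: {command!r}")

    argv = ["python", "-m", "soweego"]
    for module, level in log_levels:
        if level not in LOG_LEVELS:
            raise ValueError(f"unknown log level: {level!r}")
        argv += ["--log-level", module, level]

    argv.append(command)
    argv.extend(args)
    return argv


if __name__ == "__main__":
    # 'soweego.linker' is an assumed module name; CATALOG is a placeholder.
    argv = build_soweego_command(
        "run", ["CATALOG"], log_levels=[("soweego.linker", "DEBUG")]
    )
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(argv))
```

The resulting list can be handed to subprocess.run inside the container once CATALOG is replaced with a real catalog name.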
Just two things to remember:
The source code is under the terms of the GNU General Public License, version 3.