This library helps generate fingerprints for entity data. In this context, a fingerprint is a simplified entity identifier, derived from its name or address, that is used to cross-reference entities across different datasets.
    import fingerprints

    fp = fingerprints.generate('Mr. Sherlock Holmes')
    assert fp == 'holmes sherlock'

    fp = fingerprints.generate('Siemens Aktiengesellschaft')
    assert fp == 'ag siemens'

    fp = fingerprints.generate('New York, New York')
    assert fp == 'new york'
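To give a rough intuition for why differently ordered or repeated words can end up with the same fingerprint, here is a minimal sketch of token-based normalization. It is not the library's implementation: `sketch_fingerprint` is a hypothetical helper that only lowercases, strips punctuation, deduplicates and sorts tokens. It does not remove titles such as "Mr." or shorten legal forms, both of which the real `fingerprints.generate` handles.

```python
import re


def sketch_fingerprint(name: str) -> str:
    """Hypothetical helper: a simplified, order-independent key for a name."""
    # Lowercase the text and replace punctuation with spaces.
    text = re.sub(r"[^\w\s]", " ", name.casefold())
    # Deduplicate tokens and sort them so that word order does not matter.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


# Different spellings of the same entity collapse to one key.
assert sketch_fingerprint("Sherlock Holmes") == sketch_fingerprint("Holmes, Sherlock")
assert sketch_fingerprint("New York, New York") == "new york"
```

Sorting and deduplicating tokens is what makes the key usable for cross-referencing: two datasets that write the same name in a different order still produce an identical string.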
A significant part of what fingerprints does is to recognize company legal form names. For example, fingerprints will be able to simplify Общество с ограниченной ответственностью to ООО, or Aktiengesellschaft to AG. The required database is based on two different sources:
Wikipedia also maintains an index of types of business entity.
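The idea of legal form recognition can be sketched as a lookup from long legal form names to their common abbreviations. The mapping below is a tiny, hand-written assumption for illustration only; it is not the database that fingerprints ships, and `simplify_legal_forms` is a hypothetical helper rather than part of the library's API.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny mapping of legal form names to abbreviations.
LEGAL_FORMS = {
    "aktiengesellschaft": "ag",
    "общество с ограниченной ответственностью": "ооо",
    "limited liability company": "llc",
}


def simplify_legal_forms(name: str) -> str:
    """Hypothetical helper: replace known legal form names with abbreviations."""
    # Normalize Unicode, lowercase, and replace punctuation with spaces.
    text = unicodedata.normalize("NFKC", name).casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    text = " ".join(text.split())
    # Try longer names first so that multi-word forms match as a whole.
    for long_form in sorted(LEGAL_FORMS, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(long_form) + r"(?!\w)"
        text = re.sub(pattern, LEGAL_FORMS[long_form], text)
    return text


assert simplify_legal_forms("Siemens Aktiengesellschaft") == "siemens ag"
assert simplify_legal_forms('Общество с ограниченной ответственностью "Ромашка"') == "ооо ромашка"
```

The lookarounds in the pattern keep the replacement from firing inside a longer word. A real database of legal forms would be far larger and cover many more jurisdictions and spelling variants.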