Awesome Open Source

Snowball: Extracting Relations from Large Plain-Text Collections

This is my own implementation of the Snowball system to bootstrap relationship instances. You can find more details about the original system here:

For more details about this particular implementation please refer to:

A sample file containing sentences in which the named entities are already tagged can be downloaded. It has 1 million sentences taken from the New York Times articles that are part of the English Gigaword collection.

NOTE: look at the description of BREDS to understand how to provide a tagged document collection and seeds to set up the bootstrapping of relationship instances with Snowball. Both systems have a similar setup.
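
To give a rough idea of what "tagged sentences plus seeds" means in practice, here is a minimal sketch. It is not code from this repository: the tag format (`<ORG>...</ORG>`, `<LOC>...</LOC>`), the helper names `entity_pairs` and `matches_seed`, and the example sentence and seed are all assumptions made for illustration. Check the BREDS description for the actual input format.

```python
import re

# ASSUMPTION: entities are marked with XML-like tags such as <ORG>...</ORG>.
# This format is illustrative only; see the BREDS description for the real one.
ENTITY = re.compile(r"<([A-Z]+)>([^<]+)</\1>")


def entity_pairs(sentence, e1_type, e2_type):
    """Return (e1, text between, e2) for adjacent entities of the wanted types."""
    matches = list(ENTITY.finditer(sentence))
    pairs = []
    for left, right in zip(matches, matches[1:]):
        if left.group(1) == e1_type and right.group(1) == e2_type:
            between = sentence[left.end():right.start()].strip()
            pairs.append((left.group(2), between, right.group(2)))
    return pairs


def matches_seed(pair, seeds):
    """True if the entity pair is one of the known seed instances."""
    e1, _, e2 = pair
    return (e1, e2) in seeds


# Hypothetical sentence and seed, made up for this example.
sentence = "The company <ORG>Acme</ORG> is based in <LOC>Springfield</LOC>."
seeds = {("Acme", "Springfield")}

pairs = entity_pairs(sentence, "ORG", "LOC")
seed_hits = [p for p in pairs if matches_seed(p, seeds)]
print(seed_hits)
```

Sentences whose entity pair matches a seed are the starting point for bootstrapping: the text around and between the two entities is what a Snowball-style system turns into extraction patterns.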
