Alternatives To Pke
Projects compared (stars, most recent commit, total releases, latest release, open issues, license, language):

Arxivtimes: 3,709 stars; most recent commit a year ago; 2,071 open issues; license mit
Repository to research & share the machine learning articles

Data Science: 3,708 stars; most recent commit 2 days ago; 5 open issues; language Jupyter Notebook
Collection of useful data science topics along with articles, videos, and code

News Please: 1,612 stars; most recent commit 65 days ago; 118 total releases; latest release April 04, 2022; 20 open issues; license apache-2.0; language Python
news-please - an integrated web crawler and information extractor for news that just works

Pke: 1,391 stars; downloads or dependents: 1; most recent commit a month ago; 1 total release; latest release September 01, 2021; 5 open issues; license gpl-3.0; language Python
Python Keyphrase Extraction module

Paperai: 915 stars; most recent commit 2 months ago; 10 total releases; latest release March 12, 2022; license apache-2.0; language Python
📄 🤖 Semantic search and workflows for medical/scientific papers

Nlp In Practice: 861 stars; most recent commit 2 years ago; 1 open issue; language Jupyter Notebook
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.

Inltk: 760 stars; most recent commit a year ago; 24 total releases; latest release October 11, 2020; 24 open issues; license mit; language Python
Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer might need

Machine Learning Articles: 519 stars; most recent commit 4 years ago
Monthly Series - Top 10 Machine Learning Articles

Headlines: 506 stars; most recent commit 5 years ago; 28 open issues; license mit; language Jupyter Notebook
Automatically generate headlines to short articles

Building A Simple Chatbot In Python Using Nltk: 482 stars; most recent commit a month ago; 6 open issues; language Jupyter Notebook
Building a Simple Chatbot from Scratch in Python (using NLTK)

pke - python keyphrase extraction

pke is an open source python-based keyphrase extraction toolkit. It provides an end-to-end keyphrase extraction pipeline in which each component can be easily modified or extended to develop new models. pke also allows for easy benchmarking of state-of-the-art keyphrase extraction models, and ships with supervised models trained on the SemEval-2010 dataset.

Installation

To install pke from GitHub using pip:

pip install git+https://github.com/boudinfl/pke.git

pke relies on spacy (>= 3.2.3) for text processing and requires models to be installed:

# download the english model
python -m spacy download en_core_web_sm
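
Before running pke, it can help to confirm that the installed spacy satisfies the version floor stated above. The sketch below is only an illustration: `parse_version`, `meets_minimum` and `installed_spacy_version` are hypothetical helper names, not part of pke or spacy, and the parsing assumes plain dotted numeric versions.

```python
from importlib import metadata


def parse_version(text):
    """Turn a dotted version such as '3.2.3' into a tuple of ints.

    Only leading numeric components are kept, so '3.4.0.dev1' becomes (3, 4, 0).
    """
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum="3.2.3"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_spacy_version():
    """Return the installed spacy version string, or None if it is missing."""
    try:
        return metadata.version("spacy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_spacy_version()
    if version is None:
        print("spacy is not installed")
    elif meets_minimum(version):
        print(f"spacy {version} satisfies >= 3.2.3")
    else:
        print(f"spacy {version} is older than 3.2.3")
```

The check reads package metadata only, so it does not import spacy itself and stays cheap to run.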

Minimal example

pke provides a standardized API for extracting keyphrases from a document. Start by typing the few lines below. To use another model, simply replace pke.unsupervised.TopicRank with another model (see the list of implemented models).

import pke

# initialize keyphrase extraction model, here TopicRank
extractor = pke.unsupervised.TopicRank()

# load the content of the document; here the document is expected to be a
# plain text string, and preprocessing is carried out using spacy
extractor.load_document(input='text', language='en')

# keyphrase candidate selection, in the case of TopicRank: sequences of nouns
# and adjectives (i.e. `(Noun|Adj)*`)
extractor.candidate_selection()

# candidate weighting, in the case of TopicRank: using a random walk algorithm
extractor.candidate_weighting()

# N-best selection, keyphrases contains the 10 highest scored candidates as
# (keyphrase, score) tuples
keyphrases = extractor.get_n_best(n=10)

A detailed example is provided in the examples/ directory.
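
To make the three stages above concrete, here is a dependency-free toy sketch of the same select, weight, rank flow. It is not pke's implementation: the input is assumed to be already POS-tagged, `select_candidates`, `weight_candidates` and `n_best` are hypothetical names, consecutive candidates are simply linked to each other, and the random walk is a plain PageRank-style power iteration rather than TopicRank's topic-level graph.

```python
from collections import defaultdict

# Toy input: (word, coarse POS tag) pairs; a real pipeline would get these from spacy.
TAGGED = [
    ("keyphrase", "NOUN"), ("extraction", "NOUN"), ("is", "VERB"),
    ("a", "DET"), ("useful", "ADJ"), ("task", "NOUN"), (".", "PUNCT"),
    ("keyphrase", "NOUN"), ("extraction", "NOUN"), ("ranks", "VERB"),
    ("candidate", "NOUN"), ("phrases", "NOUN"), (".", "PUNCT"),
]


def select_candidates(tagged, keep=("NOUN", "ADJ")):
    """Collect maximal runs of nouns/adjectives, in order of occurrence."""
    occurrences, current = [], []
    for word, tag in tagged:
        if tag in keep:
            current.append(word.lower())
        elif current:
            occurrences.append(" ".join(current))
            current = []
    if current:
        occurrences.append(" ".join(current))
    return occurrences


def weight_candidates(occurrences, window=1, damping=0.85, iterations=50):
    """Score candidates with a PageRank-style random walk on a co-occurrence graph."""
    graph = defaultdict(set)
    for i, source in enumerate(occurrences):
        graph[source]  # touching the key keeps isolated candidates as nodes
        for target in occurrences[i + 1:i + 1 + window]:
            if target != source:
                graph[source].add(target)
                graph[target].add(source)
    nodes = list(graph)
    scores = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            incoming = sum(scores[other] / len(graph[other]) for other in graph[node])
            updated[node] = (1 - damping) / len(nodes) + damping * incoming
        scores = updated
    return scores


def n_best(scores, n=10):
    """Return the n highest scored candidates as (keyphrase, score) tuples."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:n]


candidates = select_candidates(TAGGED)
keyphrases = n_best(weight_candidates(candidates), n=2)
# 'keyphrase extraction' ranks first: it is linked to both other candidates.
```

Each stage only consumes the output of the previous one, which mirrors why a single stage can be swapped out without touching the rest of the pipeline.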

Getting started

To get your hands dirty with pke, we invite you to try our tutorials out.

Name | Link
Getting started with pke and keyphrase extraction | Open In Colab
Model parameterization | Open In Colab
Benchmarking models | Open In Colab

Implemented models

pke currently implements the following keyphrase extraction models:

Model performances

For comparison purposes, overall results of the implemented models on commonly used benchmark datasets are available in results. The code for reproducing these experiments is in the benchmarking notebook (also available on Colab).
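
Keyphrase extraction results are commonly reported as precision, recall and F1 over the top-ranked candidates. The sketch below shows one way such scores can be computed; it is an assumption about the general setup, not the code from the benchmarking notebook. `evaluate_at_k` is a hypothetical helper, and it uses exact matching on lowercased strings, whereas published benchmarks often also apply stemming.

```python
def evaluate_at_k(predicted, gold, k=10):
    """Return (precision, recall, f1) for the top-k predicted keyphrases.

    predicted: ranked list of keyphrase strings, best first.
    gold: iterable of reference keyphrase strings.
    Matching is exact after lowercasing and trimming whitespace.
    """
    def normalize(phrase):
        return phrase.strip().lower()

    top_k = []
    for phrase in predicted[:k]:
        phrase = normalize(phrase)
        if phrase not in top_k:  # ignore duplicates in the ranking
            top_k.append(phrase)
    references = {normalize(phrase) for phrase in gold}
    matches = sum(1 for phrase in top_k if phrase in references)
    precision = matches / len(top_k) if top_k else 0.0
    recall = matches / len(references) if references else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matches else 0.0
    return precision, recall, f1


# Made-up (keyphrase, score) tuples in the shape returned by get_n_best.
ranked = [("keyphrase extraction", 0.49), ("candidate phrases", 0.26), ("useful task", 0.26)]
gold = ["Keyphrase extraction", "graph ranking"]
precision, recall, f1 = evaluate_at_k([phrase for phrase, _ in ranked], gold, k=2)
```

With k=2, one of the two predictions matches one of the two references, so precision, recall and F1 all come out at 0.5.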

Citing pke

If you use pke, please cite the following paper:

@InProceedings{boudin:2016:COLINGDEMO,
  author    = {Boudin, Florian},
  title     = {pke: an open source python-based keyphrase extraction toolkit},
  booktitle = {Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations},
  month     = {December},
  year      = {2016},
  address   = {Osaka, Japan},
  pages     = {69--73},
  url       = {http://aclweb.org/anthology/C16-2015}
}