Project Name | Stars | Most Recent Commit | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|
Sotawhat | 1,154 | 3 years ago | 14 | | Python | Returns latest research results by crawling arXiv papers and summarizing abstracts. Helps you stay afloat with so many new papers every day. |
Summarization Papers | 879 | 16 hours ago | | | TeX | Summarization Papers |
Text Summarization Papers | 245 | 3 years ago | | | HTML | An Exhaustive Paper List for Text Summarization |
Ml4se | 206 | 23 days ago | 1 | | | A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering |
Scisumm Corpus | 187 | a year ago | | cc-by-4.0 | | Scientific Document Summarization Corpus and Annotations from the WING NUS group. |
Text Summarization Repo | 184 | a year ago | | bsd-3-clause | | A repository that organizes the main research topics in text summarization, must-read papers, and available models and data, together with recommended resources. |
Video Summarization With Lstm | 150 | 6 years ago | 10 | other | Matlab | Implementation of our ECCV 2016 Paper (Video Summarization with Long Short-term Memory) |
Neusum | 118 | 4 years ago | 3 | | Python | Code for the ACL 2018 paper "Neural Document Summarization by Jointly Learning to Score and Select Sentences" |
Hiersumm | 116 | 4 years ago | 10 | apache-2.0 | Python | Code for paper Hierarchical Transformers for Multi-Document Summarization in ACL2019 |
Sa Papers | 108 | 5 years ago | | | | 📄 A compilation and analysis of Sentiment Analysis papers in Deep Learning 😀😡☹️😭🙄🤢 |
Dataset for the CIKM 2018 paper "Multi-Source Pointer Network for Product Title Summarization"
Each line in corpus.txt consists of a pair of titles (original title, short title), their brand, and their commodity name. Fields are separated by two tab characters, in the following format:
<original title>\t\t<short title>\t\t<brand>\t\t<commodity name>
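The format above can be read with a few lines of Python. This is a minimal sketch, not part of the released dataset: the names `TitleRecord`, `parse_line`, and `read_corpus` are hypothetical, and UTF-8 encoding of corpus.txt is an assumption.

```python
from typing import Iterator, NamedTuple


class TitleRecord(NamedTuple):
    """One corpus line; field names mirror the format description above."""
    original_title: str
    short_title: str
    brand: str
    commodity_name: str


def parse_line(line: str) -> TitleRecord:
    """Split one line on the double-tab delimiter into its four fields."""
    # Strip only the line ending so that empty trailing fields are preserved.
    fields = line.rstrip("\r\n").split("\t\t")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return TitleRecord(*fields)


def read_corpus(path: str) -> Iterator[TitleRecord]:
    """Yield one record per non-empty line (UTF-8 is assumed, not documented)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

Calling `read_corpus("corpus.txt")` would then iterate over the records lazily; a line that does not contain exactly four fields raises `ValueError` rather than being skipped silently.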
corpus: the dataset used in the CIKM 2018 paper; the length of each short title is < 11.
big_corpus: a much larger dataset; the length of each short title is < 13.
The big_corpus archive is split into 5 files with the prefix big_corpus.tar.gz_
because of the file size limit on github.com (less than 100 MB).
To reconstruct the big_corpus file:
cd big_corpus
cat big_corpus.tar.gz_* > big_corpus.tar.gz
tar zxvf big_corpus.tar.gz
Note: brand may contain multi-language versions (separated by “/”) for some products, e.g., Nintendo/任天堂.
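When individual brand variants are needed, the brand field can be split on the separator. A minimal sketch, assuming the separator is the plain ASCII slash and that a slash never occurs inside a single brand name (the dataset description does not confirm either point); `split_brand` is a hypothetical helper name.

```python
from typing import List


def split_brand(brand: str) -> List[str]:
    """Return the language variants of a brand field, in their original order."""
    # Assumption: "/" only ever separates variants and is not part of a name.
    return [variant.strip() for variant in brand.split("/") if variant.strip()]


variants = split_brand("Nintendo/任天堂")  # example value taken from the note above
```

A brand with no separator comes back as a single-element list, so callers can treat both cases uniformly.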
@inproceedings{Sun:CIKM2018,
author = {Fei Sun and Peng Jiang and Hanxiao Sun and Changhua Pei and Wenwu Ou and Xiaobo Wang},
title = {{Multi-Source Pointer Network for Product Title Summarization}},
booktitle = {CIKM},
year = 2018
}