Cmrc2018

A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018)

This repository contains the data for the Second Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC 2018). The accompanying paper was published at EMNLP-IJCNLP 2019.

Title: A Span-Extraction Dataset for Chinese Machine Reading Comprehension
Authors: Yiming Cui, Ting Liu, Wanxiang Che, Li Xiao, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu
Link: https://www.aclweb.org/anthology/D19-1600/
Venue: EMNLP-IJCNLP 2019

Open Challenge Leaderboard (New!)

Keep track of the latest state-of-the-art systems on the CMRC 2018 dataset.
https://ymcui.github.io/cmrc2018/

CMRC 2018 Public Datasets

Please download the CMRC 2018 public datasets from the following CodaLab worksheet.
https://worksheets.codalab.org/worksheets/0x92a80d2fab4b4f79a2b4064f7ddca9ce

Submission Guidelines

If you would like to test your model on the hidden test and challenge sets, please follow the instructions on how to submit your model via the CodaLab worksheet.
https://worksheets.codalab.org/worksheets/0x96f61ee5e9914aee8b54bd11e66ec647/

**Note that the test set on CLUE is NOT the complete test set. If you wish to evaluate your model OFFICIALLY on CMRC 2018, you should follow the guidelines here.**

Quick Load Through 🤗datasets

You can also access this dataset through the HuggingFace datasets library as follows:

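# Install the library first: pip install datasets
# (in a notebook cell, use: !pip install datasets)
from datasets import load_dataset

# Load CMRC 2018 by its dataset name
dataset = load_dataset('cmrc2018')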

More details on the options and usage of this library can be found in the datasets repository at huggingface/datasets (formerly huggingface/nlp).
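
Because CMRC 2018 is a span-extraction dataset, each answer should be a contiguous substring of its passage. The following is a minimal sketch of how that property could be checked. It assumes a SQuAD-style record layout with `context`, `question`, and `answers` (`text`, `answer_start`) fields; these field names are an assumption, so inspect the loaded dataset (for example via its `features` attribute) before relying on them. The record below is a hypothetical example written for illustration, not an actual CMRC 2018 sample.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical record in an assumed SQuAD-style layout (not real CMRC 2018 data).
record: Dict[str, Any] = {
    "id": "EXAMPLE_0",
    "context": "哈尔滨工业大学位于黑龙江省哈尔滨市。",
    "question": "哈尔滨工业大学位于哪里?",
    "answers": {"text": ["黑龙江省哈尔滨市"], "answer_start": [9]},
}


def extract_spans(rec: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Pair each gold answer with the text sliced from the context at its offset."""
    context = rec["context"]
    pairs = []
    for text, start in zip(rec["answers"]["text"], rec["answers"]["answer_start"]):
        # Offsets are character offsets, so plain string slicing works for Chinese text.
        pairs.append((text, context[start:start + len(text)]))
    return pairs


def spans_are_consistent(rec: Dict[str, Any]) -> bool:
    """Return True if every gold answer matches the context slice at its offset."""
    return all(gold == sliced for gold, sliced in extract_spans(rec))


if __name__ == "__main__":
    print(extract_spans(record))
    print(spans_are_consistent(record))
```

If the assumed field names match, the same check could be applied to a loaded example by passing one record from the dataset to `spans_are_consistent`.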

Reference

If you wish to use our data in your research, please cite:

@inproceedings{cui-emnlp2019-cmrc2018,
    title = "A Span-Extraction Dataset for {C}hinese Machine Reading Comprehension",
    author = "Cui, Yiming  and
      Liu, Ting  and
      Che, Wanxiang  and
      Xiao, Li  and
      Chen, Zhipeng  and
      Ma, Wentao  and
      Wang, Shijin  and
      Hu, Guoping",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1600",
    doi = "10.18653/v1/D19-1600",
    pages = "5886--5891",
}

International Standard Language Resource Number (ISLRN)

ISLRN: 013-662-947-043-2

http://www.islrn.org/resources/resources_info/7952/

Official HFL WeChat Account

Follow the Joint Laboratory of HIT and iFLYTEK Research (HFL) on WeChat.


Contact us

Please submit an issue.
