This repository contains the data for the Third Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC 2019). We will present our paper at COLING 2020:
Title: A Sentence Cloze Dataset for Chinese Machine Reading Comprehension
Authors: Yiming Cui, Ting Liu, Ziqing Yang, Zhipeng Chen, Wentao Ma, Wanxiang Che, Shijin Wang, Guoping Hu
Link: https://arxiv.org/abs/2004.03116
Venue: COLING 2020
Keep track of the latest state-of-the-art systems on the CMRC 2019 dataset: https://ymcui.github.io/cmrc2019/
If you would like to test your model on the hidden test and challenge sets, please follow the instructions on how to submit your model via the CodaLab worksheet: https://worksheets.codalab.org/worksheets/0xe856b40d21de45bf898cd1d3c5135afe
baseline: a simple Chinese BERT-based baseline system
eval: contains the official evaluation script
data: contains the official evaluation data
sample_submission: sample submissions for the CodaLab competition platform (trial_rand_submission.zip is a randomly generated prediction file; trial_submission.zip is the BERT baseline prediction file)
We provide a BERT-based baseline system for participants (see the baseline directory for more information).
Results on other sets will be announced later.
QAC: Question-Level Accuracy
PAC: Passage-Level Accuracy
Data | Passage # | Query # | QAC | PAC | Fake Candidates | Availability |
---|---|---|---|---|---|---|
Trial Data | 139 | 1,504 | 71.941% | 28.776% | No | Public |
Train Data | 9,638 | 100,009 | N/A | N/A | No | Public |
Development Data | 300 | 3,053 | 70.586% | 13.333% | Yes | Public |
Qualifying Data | 500 | 5,081 | 70.01% | 8.20% | Yes | Semi-Hidden |
Test Data | - | - | - | - | Yes | Hidden |
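
The two metrics in the table can be illustrated with a short sketch. It assumes that QAC is the percentage of individual blanks answered correctly and that PAC is the percentage of passages in which every blank is answered correctly. The function name `qac_pac` and the dictionary layout (passage ID mapped to a list of chosen candidate indices) are hypothetical and used only for illustration; the official script in the eval directory remains the reference for actual scoring.

```python
from typing import Dict, List, Tuple


def qac_pac(gold: Dict[str, List[int]],
            pred: Dict[str, List[int]]) -> Tuple[float, float]:
    """Return (QAC, PAC) as percentages.

    gold / pred map a passage ID to the candidate index chosen for each
    blank, in blank order. This layout is an assumption for illustration,
    not the official CMRC 2019 file format.
    """
    total_blanks = 0
    correct_blanks = 0
    correct_passages = 0

    for passage_id, gold_answers in gold.items():
        pred_answers = pred.get(passage_id, [])
        # Count blanks where the predicted index equals the gold index.
        hits = sum(
            1
            for i, g in enumerate(gold_answers)
            if i < len(pred_answers) and pred_answers[i] == g
        )
        total_blanks += len(gold_answers)
        correct_blanks += hits
        # A passage counts only if every one of its blanks is correct.
        if hits == len(gold_answers) and len(pred_answers) == len(gold_answers):
            correct_passages += 1

    qac = 100.0 * correct_blanks / total_blanks if total_blanks else 0.0
    pac = 100.0 * correct_passages / len(gold) if gold else 0.0
    return qac, pac


# Toy example: passage p1 is fully correct, p2 has one wrong blank.
gold = {"p1": [0, 2, 1], "p2": [1, 0]}
pred = {"p1": [0, 2, 1], "p2": [1, 3]}
print(qac_pac(gold, pred))  # (80.0, 50.0)
```

Under this reading, PAC is the stricter metric, since a single wrong blank invalidates the whole passage. That is consistent with the PAC figures in the table being far lower than the QAC figures.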
ISLRN: 813-010-842-493-2
http://www.islrn.org/resources/resources_info/8624/
If you wish to use our data in your research, please cite our paper:
@inproceedings{cui-etal-2020-cmrc2019,
  title={A Sentence Cloze Dataset for Chinese Machine Reading Comprehension},
  author={Cui, Yiming and Liu, Ting and Yang, Ziqing and Chen, Zhipeng and Ma, Wentao and Che, Wanxiang and Wang, Shijin and Hu, Guoping},
  booktitle={Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020)},
  year={2020}
}
Host: Chinese Information Processing Society of China (CIPS)
Organizer: Joint Laboratory of HIT and iFLYTEK Research (HFL)
Sponsor: iFLYTEK Co., Ltd. and iFLYTEK Research (Hebei)
Ting Liu, Harbin Institute of Technology
Yiming Cui, Joint Laboratory of HIT and iFLYTEK Research
Follow Joint Laboratory of HIT and iFLYTEK Research (HFL) on WeChat.
Any problems? Feel free to contact us.
Email: cmrc2019 [aT] 126 [DoT] com
Forum: CodaLab Competition Forum
CMRC 2019 Official Website (Chinese): https://cmrc2019.hfl-rc.com/
CMRC 2019 Official Website (English): https://cmrc2019.hfl-rc.com/english/