Awesome Open Source
Search results for dataset bert
58 search results found
Nlp_chinese_corpus
⭐
8,344
Large Scale Chinese Corpus for NLP
Transformers Tutorials
⭐
6,731
This repository contains demos I made with the Transformers library by HuggingFace.
Awesome Pretrained Chinese Nlp Models
⭐
3,738
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pre-trained models, large models, multimodal models, and large language models
Clue
⭐
3,345
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Codesearchnet
⭐
2,054
Datasets, tools, and benchmarks for representation learning of code.
Chineseglue
⭐
1,765
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus and leaderboard
Cluener2020
⭐
1,384
CLUENER2020: Fine-Grained Named Entity Recognition for Chinese
Beir
⭐
1,332
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Bert Ner
⭐
1,000
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
K Bert
⭐
793
Source code of K-BERT (AAAI2020)
Cluepretrainedmodels
⭐
536
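A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models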
Fastbert
⭐
527
The source code of FastBERT (ACL2020)
Cluecorpus2020
⭐
517
Large-scale Pre-training Corpus for Chinese (100G)
Indonlu
⭐
494
The first-ever vast natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)
Casrel
⭐
440
A Novel Cascade Binary Tagging Framework for Relational Triple Extraction. Accepted by ACL 2020.
Openai Clip
⭐
404
Simple implementation of OpenAI CLIP model in PyTorch.
Arabert
⭐
372
Pre-trained Transformers for Arabic Language Understanding and Generation (Arabic BERT, Arabic GPT2, Arabic Electra)
Cmrc2018
⭐
313
A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018)
Bert Attributeextraction
⭐
185
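Using BERT for attribute extraction in knowledge graphs, with fine-tuning and feature extraction. Applies BERT-based fine-tuning and feature-extraction methods to extract attributes from Baidu Baike person entries for a knowledge graph.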
Awesome Llm Eval
⭐
183
Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs and models, mainly for evaluation of LLMs.
Robbert
⭐
180
A Dutch RoBERTa-based language model
Tabformer
⭐
144
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
French Sentiment Analysis With Bert
⭐
124
How good is BERT? Comparing BERT to other state-of-the-art approaches on a French sentiment analysis dataset
Bond
⭐
114
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Bertqa Attention On Steroids
⭐
105
BertQA - Attention on Steroids
Prosody
⭐
104
Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Scientificsummarizationdatasets
⭐
88
Datasets I have created for scientific summarization, and a trained BertSum model
Dialogue Understanding
⭐
82
This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
Marathinlp
⭐
80
Marathi NLP is a repository dedicated to the development of tools and resources for the Marathi language.
Hvpnet
⭐
66
Code for the NAACL2022 paper "Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction"
Codar
⭐
56
✅ CODAR is a framework built using PyTorch to analyze posts (text and media) and predict cyberbullying and offensive content. 💬📷
Arekit
⭐
52
Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and prompting mass-media news into datasets for ML-model training
Xpersona
⭐
51
XPersona: Evaluating Multilingual Personalized Chatbot
Transformer Srl
⭐
45
Reimplementation of a BERT-based model (Shi et al., 2019), currently the state of the art for English SRL. This model also implements predicate disambiguation.
Colbert Using Bert Sentence Embedding For Humor Detection
⭐
44
Novel model and dataset for the task of humor detection
Finbert Qa
⭐
39
Financial Domain Question Answering with pre-trained BERT Language Model
Pn Summary
⭐
37
A well-structured summarization dataset for the Persian language!
Sentilare
⭐
34
Codes for our paper "SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge" (EMNLP 2020)
Tasknet
⭐
30
Easy multi-task learning with HuggingFace Datasets and Trainer
Sohu2019
⭐
24
2019 Sohu Campus Algorithm Competition
Tradetheevent
⭐
20
Implementation of "Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading." In Findings of ACL2021
Squad2.q Augmented Dataset
⭐
19
Augmented version of SQUAD 2.0 for Questions
Told Br
⭐
19
Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis
Quasi Attention Absa
⭐
18
The codebase for a new quasi-attention BERT model for TABSA tasks
Pragmeval
⭐
18
Discourse Based Evaluation of Language Understanding
Expbert
⭐
17
Representation Engineering with Natural Language Explanations
Covid Q
⭐
16
COVID-19 Question Dataset from the paper "What Are People Asking About COVID-19? A Question Classification Dataset"
Protonet Bert Text Classification
⭐
16
Fine-tune BERT for small-dataset text classification in a few-shot learning manner using ProtoNet
Berserker
⭐
16
Berserker - BERt chineSE woRd toKenizER
Filipino Text Benchmarks
⭐
13
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Ec Darkpattern
⭐
13
[IEEE BigData 2022, 5th Workshop on Big Data for CyberSecurity (BigCyber-2022)] Dark patterns in e-commerce: a dataset and its baseline evaluations
Mcqa
⭐
13
🔮 Answering multiple choice questions with Language Models.
Jd2skills Bert Xmlc
⭐
12
Code and Dataset for the Bhola et al. (2020) Retrieving Skills from Job Descriptions: A Language Model Based Extreme Multi-label Classification Framework
Bert Disambiguation
⭐
12
Code and CoarseWSD-20 datasets for "Language Models and Word Sense Disambiguation: An Overview and Analysis"
Offenseval2020 Code
⭐
11
Malay Fake News Classification
⭐
10
Malay Fake News Classification using CNN, BiLSTM, C-LSTM, RCNN, FT-BERT and BERTCNN.
Personality_detection
⭐
9
BB-SVM model for automatic personality detection of the essays dataset (Big-Five personality labeled traits)
Meddistant19
⭐
8
MedDistant19: Towards an Accurate Benchmark for Broad-Coverage Biomedical Relation Extraction (COLING 2022)
Expanda
⭐
8
The universal integrated corpus-building environment.
Orangesum
⭐
8
The French summarization dataset introduced in "BARThez: a Skilled Pretrained French Sequence-to-Sequence Model".
Sum_liputan6
⭐
8
The first large-scale summarization corpus for the Indonesian language. AACL 2020.
Bert4nlu
⭐
7
BERT-For-NLU-Tasks
Inferdoc
⭐
6
Generate a SQuAD-style dataset from a raw text file and train a transformer-based question answering model. This repo has code from https://github.com/facebookresearch/UnsupervisedQA and https://github.com/deepset-ai/haystack
Multi Label Text Classification For Chinese
⭐
6
PyTorch implementation of multi-label text classification, including several kinds of models and pretrained models, with a focus on Chinese preprocessing.
Kaznerd
⭐
6
An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models.
Indexda
⭐
5
Natural Language Processing of academic papers for dataset indexing
Kr3
⭐
5
KR3: Korean Restaurant Review with Ratings / Experiments on Parameter-efficient Tuning and Task-adaptive Pre-training
Copyright 2018-2024 Awesome Open Source. All rights reserved.