BERT-NER

Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).

For better performance, you can try NLPGNN; see that repository for more details.

BERT-NER Version 2

Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).

The original version (see old_version for more details) contains some hard-coded values and lacks annotations, which makes it hard to follow. This updated version adds some new ideas and tricks (in data preprocessing and layer design) that help you quickly implement the fine-tuning model: you only need to modify crf_layer or softmax_layer, as sketched below.
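
The --crf flag toggles between a plain softmax token classifier and a CRF output layer on top of BERT's sequence output. The following is a minimal illustrative sketch in TF 1.x style (matching the original BERT repo), not the exact code in BERT_NER.py; the names sequence_output, label_ids, seq_lengths and num_labels are assumptions:

# Sketch of the two output-layer choices toggled by --crf (illustrative, not the repo's code).
import tensorflow as tf

def softmax_layer(sequence_output, label_ids, num_labels, seq_lengths):
    """Per-token classification: dense projection + masked cross-entropy loss."""
    logits = tf.layers.dense(sequence_output, num_labels)   # [batch, seq_len, num_labels]
    log_probs = tf.nn.log_softmax(logits, axis=-1)
    one_hot = tf.one_hot(label_ids, depth=num_labels, dtype=tf.float32)
    per_token_loss = -tf.reduce_sum(one_hot * log_probs, axis=-1)
    mask = tf.sequence_mask(seq_lengths, maxlen=tf.shape(label_ids)[1], dtype=tf.float32)
    loss = tf.reduce_sum(per_token_loss * mask) / tf.reduce_sum(mask)
    predictions = tf.argmax(logits, axis=-1)
    return loss, predictions

def crf_layer(sequence_output, label_ids, num_labels, seq_lengths):
    """CRF output: model label transitions instead of independent per-token softmax."""
    logits = tf.layers.dense(sequence_output, num_labels)
    log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
        inputs=logits, tag_indices=label_ids, sequence_lengths=seq_lengths)
    loss = tf.reduce_mean(-log_likelihood)
    predictions, _ = tf.contrib.crf.crf_decode(logits, transition_params, seq_lengths)
    return loss, predictions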

Folder Description:

BERT-NER
|____ bert                          # clone from [here](https://github.com/google-research/bert)
|____ cased_L-12_H-768_A-12         # download from [here](https://storage.googleapis.com/bert_models/2018_10_18/cased_L-12_H-768_A-12.zip)
|____ data                          # training data
|____ middle_data                   # intermediate data (label id map)
|____ output                        # output (final model, prediction results)
|____ BERT_NER.py                   # main code
|____ conlleval.pl                  # evaluation script
|____ run_ner.sh                    # run the model and evaluate the results

Usage:

bash run_ner.sh

What's in run_ner.sh:

python BERT_NER.py \
    --task_name="NER"  \
    --do_lower_case=False \
    --crf=False \
    --do_train=True   \
    --do_eval=True   \
    --do_predict=True \
    --data_dir=data   \
    --vocab_file=cased_L-12_H-768_A-12/vocab.txt  \
    --bert_config_file=cased_L-12_H-768_A-12/bert_config.json \
    --init_checkpoint=cased_L-12_H-768_A-12/bert_model.ckpt   \
    --max_seq_length=128   \
    --train_batch_size=32   \
    --learning_rate=2e-5   \
    --num_train_epochs=3.0   \
    --output_dir=./output/result_dir

perl conlleval.pl -d '\t' < ./output/result_dir/label_test.txt
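
conlleval.pl with -d '\t' expects one token per line with the gold and predicted labels tab-separated, and a blank line between sentences. The helper below is a hypothetical sketch of how such a file could be written; the function name and the sentence data structure are assumptions, not the exact structures used in BERT_NER.py:

# Hypothetical helper illustrating the tab-separated layout that conlleval.pl -d '\t' consumes.
def write_conll_eval_file(path, sentences):
    """`sentences` is a list of (tokens, gold_labels, pred_labels) triples."""
    with open(path, "w", encoding="utf-8") as f:
        for tokens, gold_labels, pred_labels in sentences:
            for token, gold, pred in zip(tokens, gold_labels, pred_labels):
                f.write(f"{token}\t{gold}\t{pred}\n")
            f.write("\n")  # blank line marks the sentence boundary

# Example usage (illustrative labels):
# write_conll_eval_file(
#     "./output/result_dir/label_test.txt",
#     [(["EU", "rejects", "German", "call"],
#       ["B-ORG", "O", "B-MISC", "O"],
#       ["B-ORG", "O", "B-MISC", "O"])])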

Notice: the cased model is recommended, according to the BERT paper [1]. The CoNLL-2003 dataset and the Perl evaluation script come from here.

RESULTS (on test set):

Parameter setting:

  • do_lower_case=False
  • num_train_epochs=4.0
  • crf=False
accuracy:  98.15%; precision:  90.61%; recall:  88.85%; FB1:  89.72
              LOC: precision:  91.93%; recall:  91.79%; FB1:  91.86  1387
             MISC: precision:  83.83%; recall:  78.43%; FB1:  81.04  668
              ORG: precision:  87.83%; recall:  85.18%; FB1:  86.48  1191
              PER: precision:  95.19%; recall:  94.83%; FB1:  95.01  1311

Result description:

Here I just use the default parameters, but as Google's paper says, an error of about 0.2% is reasonable (the paper reports 92.4%). Some additional tricks probably need to be added to the above model to close the gap.
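
For reference, the FB1 column in the results above is the F-measure with beta = 1, i.e. the harmonic mean of precision and recall. A quick check against the overall numbers reported above:

# FB1 is the F-measure with beta = 1 (the harmonic mean of precision and recall).
def f_beta(precision, recall, beta=1.0):
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(round(f_beta(90.61, 88.85), 2))  # 89.72, matching the reported overall FB1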

References:

[1] Devlin, Chang, Lee, Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. https://arxiv.org/abs/1810.04805

[2] google-research/bert: https://github.com/google-research/bert
