Pretrained Models for Multilingual Federated Learning

Code and data setup for our paper: Pretrained Models for Multilingual Federated Learning by *Orion Weller, *Marc Marone, Vladimir Braverman, Dawn Lawrie, and Benjamin Van Durme. Many thanks to the flwr team for their excellent examples.

Environment Setup

NOTE: we used poetry following the advice of the flwr framework.

  1. Install poetry (bash enviroment_setup/install_poetry.sh)
  2. Activate poetry (bash enviroment_setup/activate_poetry.sh)
  3. Install dependencies (poetry install). NOTE: this takes a few minutes. The full sequence is sketched below.
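
Putting those steps together, a minimal setup sketch (the script paths are the ones from this repository; run from the repository root) looks like:

```bash
# Run from the repository root.
bash enviroment_setup/install_poetry.sh    # install poetry
bash enviroment_setup/activate_poetry.sh   # activate the poetry environment
poetry install                             # install dependencies (takes a few minutes)
```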

Data Setup

  1. After deciding which data setup you would like, look for the corresponding dataset in create_data. For the purposes of this readme, we will use the MTNT data.
  2. cd into the folder (cd create_data/make_mtnt_data)
  3. Follow the instructions in the readme located in that folder. It will typically have scripts for downloading, preprocessing, splitting, and then moving the data into the final location for the model (a rough sketch of this flow is shown below).
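
As a rough illustration only, the MTNT flow would look something like the following. The script names here are hypothetical placeholders; the actual file names and order are given in the readme inside create_data/make_mtnt_data.

```bash
# Hypothetical script names for illustration; see the readme in
# create_data/make_mtnt_data for the actual files and order.
cd create_data/make_mtnt_data
bash download.sh      # fetch the raw MTNT data
bash preprocess.sh    # clean and tokenize
bash split.sh         # build the per-client (IID / non-IID) splits
bash move_data.sh     # copy the splits to where the training scripts expect them
```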

Training/Evaluating Federated Learning Models

  1. Make sure the environment and the data have been set up as above.
  2. Depending on the type of model you want to train (classification, LM, or MT) see the corresponding scripts in bin/run_fl_{mt,tc,lm}.sh. Each script contains information about how to run centralized, non-IID FL, or IID FL learning, as well as random initialization and/or evaluation.
  3. To evaluate BLEU scores, install sacrebleu and evaluate using the format described in bin/run_sacrebleu_eval.sh (a sketch is shown after this list).
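
For example, training a machine translation model and scoring it might look roughly like the following. The scripts bin/run_fl_mt.sh and bin/run_sacrebleu_eval.sh come from this repository, but the sacrebleu invocation and the file names are illustrative assumptions; consult the scripts for the exact arguments.

```bash
# Train/evaluate an MT model under federated learning; the comments inside
# bin/run_fl_mt.sh describe the centralized vs. IID vs. non-IID FL settings.
bash bin/run_fl_mt.sh

# Score the generated translations with sacrebleu.
# The hypothesis/reference file names are placeholders; the exact invocation
# is described in bin/run_sacrebleu_eval.sh.
pip install sacrebleu
sacrebleu references.txt -i hypotheses.txt -m bleu
```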

Citation

If you found this code or paper helpful, please consider citing:

@inproceedings{Weller2022PretrainedMF,
  title={Pretrained Models for Multilingual Federated Learning},
  author={Orion Weller and Marc Marone and Vladimir Braverman and Dawn J Lawrie and Benjamin Van Durme},
  booktitle={Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)},
  year={2022}
}
