Cross Language Dataset

A multilingual, multi-style and multi-granularity dataset for cross-language textual similarity detection
Alternatives To Cross Language Dataset
Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description
The Pile | 1,048 | - | - | - | a year ago | 1 | October 17, 2020 | 19 | mit | Python | -
Multilingual_text_to_speech | 740 | - | - | - | 6 months ago | - | - | 1 | mit | Python | An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Github Typo Corpus | 289 | - | - | - | 4 years ago | - | - | 1 | - | Python | GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors
Xl Sum | 209 | - | - | - | 10 months ago | - | - | - | - | Python | This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
Mtdata | 115 | - | - | - | 10 months ago | 21 | November 25, 2022 | 22 | apache-2.0 | Python | A tool that locates, downloads, and extracts machine translation corpora
Ml Mkqa | 94 | - | - | - | 2 years ago | - | - | 1 | apache-2.0 | Python | We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). The goal of this dataset is to provide a challenging benchmark for question answering quality across a wide set of languages. Please refer to our paper for details, MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering
Mmner | 69 | - | - | - | 2 years ago | - | - | 1 | apache-2.0 | Python | Massively Multilingual Transfer for NER
Anuvaad | 65 | - | - | - | 3 years ago | 7 | April 11, 2021 | 3 | gpl-3.0 | Python | State of the art open-source translation for Indic languages.
Glot500 | 65 | - | - | - | 4 months ago | - | - | - | other | Python | Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages (ACL'23)
Miracl | 61 | - | - | - | 5 months ago | - | - | 1 | apache-2.0 | - | A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages.