Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Language Style Transfer | 491 | | | | 3 years ago | | | 21 | apache-2.0 | Roff |
Bible Corpus | 134 | | | | a year ago | | | 2 | cc0-1.0 | |
A multilingual parallel corpus created from translations of the Bible. | ||||||||||
Bicleaner | 134 | 1 | | | 4 months ago | 35 | March 29, 2023 | | gpl-3.0 | Python |
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. | ||||||||||
Korean Parallel Corpora | 129 | | | | a year ago | | | 1 | | |
Korean Parallel Corpus | ||||||||||
Awesome Danish | 110 | | | | a year ago | | | | other | |
A curated list of awesome resources for Danish language technology | ||||||||||
Lingtrain Aligner | 98 | | | | 5 months ago | 53 | November 26, 2023 | 3 | gpl-3.0 | Python |
Lingtrain Aligner: an ML-powered library for accurate text alignment. | ||||||||||
Small_parallel_enja | 61 | | | | 5 years ago | | | | | Roff |
50k English-Japanese Parallel Corpus for Machine Translation Benchmark. | ||||||||||
Wikipedia Parallel Titles | 53 | | | | 9 years ago | | | 4 | | Perl |
Tools for extracting parallel corpora from article titles across languages in Wikipedia | ||||||||||
Cross Language Dataset | 50 | | | | 7 years ago | | | 1 | other | |
A multilingual, multi-style and multi-granularity dataset for cross-language textual similarity detection | ||||||||||
Naki | 49 | | | | 3 years ago | | | | gpl-3.0 | |
List of research and engineering of NLP for American Native/Indigenous Languages. |