Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Franc | 3,745 | 276 | 61 | 4 months ago | 34 | August 15, 2021 | 4 | mit | JavaScript | |
Natural language detection | ||||||||||
Spark Nlp | 3,158 | 2 | 2 | 9 hours ago | 90 | March 05, 2021 | 35 | apache-2.0 | Scala | |
State of the Art Natural Language Processing | ||||||||||
Nlp Models Tensorflow | 1,329 | 3 years ago | 3 | mit | Jupyter Notebook | |||||
Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 < Tensorflow < 2.0 | ||||||||||
Lingua Go | 862 | 2 | 11 days ago | 8 | December 28, 2021 | 5 | apache-2.0 | Go | ||
The most accurate natural language detection library for Go, suitable for long and short text alike | ||||||||||
Language Detection | 710 | 17 | 7 | a year ago | 18 | March 05, 2021 | 3 | mit | PHP | |
A language detection library for PHP. Detects the language from a given text string. | ||||||||||
Lingua Rs | 594 | 1 | 20 days ago | 3 | February 16, 2022 | 13 | apache-2.0 | Rust | ||
The most accurate natural language detection library for Rust, suitable for long and short text alike | ||||||||||
Awesome Persian Nlp Ir | 565 | 9 months ago | ||||||||
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources | ||||||||||
Lingua | 520 | 1 | 5 months ago | 16 | June 09, 2022 | 6 | apache-2.0 | Kotlin | ||
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike | ||||||||||
Lingua Py | 453 | 5 days ago | 6 | January 24, 2022 | 11 | apache-2.0 | Python | |||
The most accurate natural language detection library for Python, suitable for long and short text alike | ||||||||||
Malaya | 347 | 2 | 20 days ago | 149 | June 01, 2022 | 7 | mit | Jupyter Notebook | ||
Natural Language Toolkit for bahasa Malaysia, https://malaya.readthedocs.io/ |
Detect the language of text.
franc supports many languages, which means it's easily confused on small samples. Make sure to pass it big documents to get reliable results.
Note: this installs the `franc` package, with support for 187 languages (languages which have 1 million or more speakers). `franc-min` (82 languages, 8 million or more speakers) and `franc-all` (all 414 possible languages) are also available. Finally, use `franc-cli` to install the CLI.
This package is ESM only. In Node.js (version 14.14+, 16.0+), install with npm:
npm install franc
In Deno with esm.sh:
import {franc, francAll} from 'https://esm.sh/franc@6'
In browsers with esm.sh:
<script type="module">
import {franc, francAll} from 'https://esm.sh/franc@6?bundle'
</script>
import {franc, francAll} from 'franc'
franc('Alle menslike wesens word vry') //=> 'afr'
franc('এটি একটি ভাষা একক IBM স্ক্রিপ্ট') //=> 'ben'
franc('Alle menneske er fødde til fridom') //=> 'nno'
franc('') //=> 'und' (language code that stands for undetermined)
// You can change what's too short (default: 10):
franc('the') //=> 'und'
franc('the', {minLength: 3}) //=> 'sco'
console.log(francAll('Considerando ser essencial que os direitos humanos'))
//=> [['por', 1], ['glg', 0.771284519307895], ['spa', 0.6034146900423971], 123 more items]
console.log(francAll('Considerando ser essencial que os direitos humanos', {only: ['por', 'spa']}))
//=> [['por', 1 ], ['spa', 0.6034146900423971]]
console.log(francAll('Considerando ser essencial que os direitos humanos', {ignore: ['spa', 'glg']}))
//=> [['por', 1], ['cat', 0.5367251059928957], ['src', 0.47461899851037015], 121 more items]
This package exports the identifiers `franc` and `francAll`.
There is no default export.
franc(value[, options])
Get the most probable language for the given value.
Parameters:
- `value` (`string`): value to test
- `options` (`Options`, optional): configuration
Returns:
The most probable language (`string`).
francAll(value[, options])
Get a list of probable languages for the given value.
Parameters:
- `value` (`string`): value to test
- `options` (`Options`, optional): configuration
Returns:
Array containing language-distance tuples (`Array<[string, number]>`).
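Because `francAll` returns every candidate with a score, a caller can check how far the best guess is ahead of the runner-up before trusting it, which may help with the short, ambiguous inputs mentioned earlier. The following is a minimal sketch under stated assumptions: `pickConfident` and its `margin` threshold are hypothetical and not part of franc, and the sample scores are copied from the `francAll` output shown in the usage example.

```javascript
// Hypothetical helper (not part of franc): accept the top guess only when it
// leads the runner-up by at least `margin`; otherwise report 'und'.
// `guesses` has the shape returned by francAll: Array<[string, number]>,
// ordered from most to least probable.
function pickConfident(guesses, margin = 0.1) {
  if (guesses.length === 0) return 'und'
  const [bestCode, bestScore] = guesses[0]
  if (bestCode === 'und') return 'und'
  if (guesses.length === 1) return bestCode
  const runnerUpScore = guesses[1][1]
  return bestScore - runnerUpScore >= margin ? bestCode : 'und'
}

// Sample scores taken from the francAll output in the usage example.
const sample = [
  ['por', 1],
  ['glg', 0.771284519307895],
  ['spa', 0.6034146900423971]
]

console.log(pickConfident(sample)) // gap is about 0.23, so this logs 'por'
console.log(pickConfident(sample, 0.5)) // gap is below 0.5, so this logs 'und'
```

With franc installed, the call would look like `pickConfident(francAll(text))`. The right margin depends on the input and the language set, so treat `0.1` as a placeholder.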
Options
Configuration (`Object`, optional) with the following fields:
options.only
Languages to allow (`Array<string>`, optional).
options.ignore
Languages to ignore (`Array<string>`, optional).
options.minLength
Minimum length to accept (`number`, default: `10`).
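One possible use of `options.only` is to limit detection to the languages an application actually supports and to fall back to a default when the result is `'und'`. The sketch below is illustrative only: `detectSupported` and `stubDetect` are hypothetical names, and the stub merely mimics the documented contract so that the example runs without franc installed.

```javascript
// Hypothetical wrapper (not part of franc): detect among the languages an
// application supports, and use a fallback when the result is undetermined.
// `detect` must behave like franc: (value, options) => ISO 639-3 code or 'und'.
function detectSupported(detect, text, supported, fallback) {
  const code = detect(text, {only: supported})
  return code === 'und' ? fallback : code
}

// Stand-in detector so the sketch runs without franc installed. It only mimics
// the documented contract (default minLength of 10, result limited to `only`);
// it does no real language detection.
function stubDetect(value, options = {}) {
  const minLength = options.minLength ?? 10
  if (value.length < minLength) return 'und'
  return options.only ? options.only[0] : 'eng'
}

// Long enough for the stub, so the first allowed language is returned.
console.log(detectSupported(stubDetect, 'Considerando ser essencial', ['por', 'spa'], 'eng')) // logs 'por'
// Shorter than minLength, so the stub answers 'und' and the fallback is used.
console.log(detectSupported(stubDetect, 'oi', ['por', 'spa'], 'eng')) // logs 'eng'
```

With the real package, pass `franc` as the first argument: `detectSupported(franc, text, ['por', 'spa'], 'eng')`.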
Install:
npm install franc-cli --global
Use:
CLI to detect the language of text
Usage: franc [options] <string>
Options:
-h, --help output usage information
-v, --version output version number
-m, --min-length <number> minimum length to accept
-o, --only <string> allow languages
-i, --ignore <string> disallow languages
-a, --all display all guesses
Usage:
# output language
$ franc "Alle menslike wesens word vry"
# afr
# output language from stdin (expects utf8)
$ echo "এটি একটি ভাষা একক IBM স্ক্রিপ্ট" | franc
# ben
# ignore certain languages
$ franc --ignore por,glg "O Brasil caiu 26 posições"
# src
# output language from stdin with only
$ echo "Alle mennesker er født frie og" | franc --only nob,dan
# nob
Package | Languages | Speakers |
---|---|---|
`franc-min` | 82 | 8M or more |
`franc` | 187 | 1M or more |
`franc-all` | 414 | - |
Note: franc returns ISO 639-3 codes (three-letter codes), not ISO 639-1 or ISO 639-2 codes. See also GH-10 and GH-30.
To get more info about the languages represented by ISO 639-3, use `iso-639-3`.
There is also an index available to map ISO 639-3 to ISO 639-1 codes, `iso-639-3/to-1.json`, but note that not all 639-3 codes can be represented in 639-1.
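To illustrate the mapping and its gaps, here is a minimal sketch that converts a three-letter code to its two-letter form. The lookup table is a small hand-written subset used only for illustration, and `toIso6391` is a hypothetical helper; a real project would more likely load the full index mentioned above.

```javascript
// Small hand-written subset of an ISO 639-3 to ISO 639-1 mapping, for
// illustration only. It is an assumption of this sketch, not data shipped
// with franc.
const iso6393To1Subset = {
  afr: 'af',
  ben: 'bn',
  nno: 'nn',
  por: 'pt',
  spa: 'es'
}

// Hypothetical helper: return the two-letter code, or `undefined` when the
// three-letter code has no entry in the table. Not every 639-3 code has a
// 639-1 equivalent, so callers need to handle the `undefined` case.
function toIso6391(code, table = iso6393To1Subset) {
  return Object.prototype.hasOwnProperty.call(table, code)
    ? table[code]
    : undefined
}

console.log(toIso6391('por')) // logs 'pt'
console.log(toIso6391('sco')) // logs undefined (Scots has no 639-1 code)
```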
These packages are fully typed with TypeScript.
They export the additional types `TrigramTuple` and `Options`.
These packages are at least compatible with all maintained versions of Node.js. As of now, that is Node.js 14.14+ and 16.0+. They also work in Deno and modern browsers.
Franc has been ported to several other programming languages.
- Elixir: paasaa
- Erlang: efranc
- Go: franco, whatlanggo
- R: franc
- Rust: whatlang-rs
- Dart: francd
- Python: pyfranc
The works franc is derived from have themselves also been ported to other languages.
Franc is a derivative work from guess-language (Python, LGPL), guesslanguage (C++, LGPL), and Language::Guess (Perl, GPL). Their creators granted me the rights to distribute franc under the MIT license: respectively, Kent S. Johnson, Jacob R. Rideout, and Maciej Ceglowski.
Yes please! See How to Contribute to Open Source.
This package is safe.