Awesome Open Source
Search
Programming Languages
Languages
All Categories
Categories
About
Search results for corpus word segmentation
corpus
x
word-segmentation
x
14 search results found
Ekphrasis
⭐
583
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (english Wikipedia, twitter - 330mil english tweets).
Pycantonese
⭐
290
Cantonese Linguistics and NLP
Python Wordsegment
⭐
268
English word segmentation, written in pure-Python, and based on a trillion-word corpus.
Multi Criteria Cws
⭐
260
Simple Solution for Multi-Criteria Chinese Word Segmentation
Bi Lstm Crf
⭐
180
A PyTorch implementation of the BI-LSTM-CRF model.
Id Cnn Cws
⭐
130
Source codes and corpora of paper "Iterated Dilated Convolutions for Chinese Word Segmentation"
Toiro
⭐
110
A comparison tool of Japanese tokenizers
Rdrsegmenter
⭐
67
A Fast and Accurate Vietnamese Word Segmenter (LREC 2018)
Mypos
⭐
55
myPOS (Myanmar Part-of-Speech) Corpus for Myanmar NLP Research and Developments
Embedding Matching Word Segmenter
⭐
38
Code for the ACL-2015 paper "Accurate Linear-Time Chinese Word Segmentation via Embedding Matching"
Codeprep
⭐
24
A toolkit for pre-processing large source code corpora
Iparser
⭐
9
Yet another dependency parser, integrated with tokenizer, tagger and visualization tool.
Multi Embedding Cws
⭐
8
Multiple Character Embeddings for Chinese Word Segmentation, ACL 2019
Cdswordseg
⭐
7
Investigation into word segmentation for child- versus adult-directed speech transcriptions
Related Searches
Python Corpus (2,447)
Natural Language Processing Corpus (510)
Dataset Corpus (342)
Java Corpus (308)
Language Corpus (261)
Chinese Corpus (184)
Python Word Segmentation (128)
Tensorflow Corpus (108)
1-14 of 14 search results
Privacy
|
About
|
Terms
|
Follow Us On Twitter
Copyright 2018-2024 Awesome Open Source. All rights reserved.