Awesome Open Source
Search results for format corpus
21 search results found
Readtext
⭐
112
an R package for reading text files
Gum
⭐
76
Repository for the Georgetown University Multilayer Corpus (GUM)
Eventstoryline
⭐
70
Event StoryLine Corpus - annotated data, baselines and evaluation scripts, evaluation data.
Folia
⭐
60
FoLiA: Format for Linguistic Annotation. FoLiA is a rich XML-based annotation format for representing language resources (including corpora) with linguistic annotations. It supports a wide variety of linguistic annotations, which makes it useful for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library as well as the full documentation, validation schemas,
Craft
⭐
58
Deft_corpus
⭐
57
The Definition Extraction From Text corpus and relevant formatting scripts
Ronec
⭐
54
Romanian Named Entity Corpus (RONEC) version 2.0
Broad_twitter_corpus
⭐
52
The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors
Morphorueval 2017
⭐
41
Discoursegraphs
⭐
34
Graph-based linguistic converter and merging tool for multi-level annotated corpora (using Python and NetworkX).
Machinelearningphishing
⭐
32
This project determines which of five supervised machine learning classification algorithms performs best at detecting phishing emails
Openconvert
⭐
19
Text conversion tool (from e.g. Word, HTML, txt) to the corpus formats TEI or FoLiA
Asp Source
⭐
18
Source stories from the African Storybook Project in Markdown format
Word2vec4kor
⭐
9
Annotald
⭐
6
A program for annotation in the Penn Treebank format
Scotus Speech
⭐
6
Corpus of oral arguments (recorded speech + official transcripts) of the United States Supreme Court
Bc2gm Corpus
⭐
6
Work related to the BioCreative II Gene Mention corpus
Spam Email Classifier Dataset
⭐
5
Some simple code to format the CSDMC2010 SPAM corpus
Openpaas Sp5 Lm Preparation
⭐
5
Suc_to_iob
⭐
5
Convert the SUC 3.0 corpus from a custom format to IOB2 for use in training NER applications
Cluster Preprocessing
⭐
5
Preprocessing of large corpora to induce various cluster types
Copyright 2018-2024 Awesome Open Source. All rights reserved.