Awesome Open Source
Search results for xml wikipedia
28 search results found
Wikiteam
⭐
661
Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2023, WikiTeam has preserved more than 350,000 wikis.
Wikipedia Extractor
⭐
247
A mirror of the script by Giuseppe Attardi, containing history from before the official repo started: https://github.com/attardi/wikiextractor. Extracts and cleans text from a Wikipedia database dump and stores the output in a number of files of similar size in a given directory.
Json Wikipedia
⭐
244
Contains code to convert the Wikipedia XML dump into a JSON/Avro dump.
Dumpster Dive
⭐
214
Roll a Wikipedia dump into MongoDB.
Go Xml Parse
⭐
117
Streaming XML parser example in Go
Annotated Wikiextractor
⭐
88
Simple Wikipedia plain text extractor with article link annotations and Hadoop support.
Xs4s
⭐
50
XML Streaming for Scala including FS2/cats support
Wikidump
⭐
41
Tools to manipulate and extract data from Wikipedia dumps
Wikihistoryflow
⭐
39
Visualise Wikipedia page edits using History Flow
Wikiforia
⭐
31
A Utility Library for Wikipedia dumps
Wikibot
⭐
26
Some MediaWiki bot examples, including Wikipedia and Wikidata, built with the MediaWiki module of the CeJS automation library.
Solr Wikipedia
⭐
23
Wikixmlj
⭐
20
WikiXMLJ provides easy access to Wikipedia XML dumps.
Mediawiki Dump
⭐
19
Python package for working with MediaWiki XML content dumps
Wikicorpusextractor
⭐
19
Extracts text from WikiMedia XML Dump files
Epub_conversion
⭐
16
Python package for converting xml and epubs to text files
Wikiprep
⭐
16
Wikipedia preprocessor and information extractor.
Grisp
⭐
13
Knowledge Base stuff
Wikise
⭐
11
A Wikipedia search engine built entirely in Java that works on Wikipedia XML dumps
Wikiparser
⭐
10
Fast C++-based parser for English Wikipedia
Wikipedia.org Xmldump Mongodb
⭐
9
Legacy code has been replaced
Secondmarket Nyu Spring 2012
⭐
8
Offline Wiki Reader
⭐
7
📚 A shell script for searching Wikipedia index files and extracting single page content straight from the related compressed Wikipedia XML dumps.
Inxs
⭐
7
A Python framework for XML transformations without boilerplate.
Wikitools
⭐
5
A few tools for working with Wikipedia XML dumps.
Wikipoff Tools
⭐
5
Python tools to build databases for wikipoff app
Wikipedia2json
⭐
5
Converts a Wikipedia dump from XML to JSON
Wikiprep Postprocess
⭐
5
Postprocess XML output from wikiprep (Wikipedia preprocessor) into JSON
Copyright 2018-2024 Awesome Open Source. All rights reserved.