Project Name | Stars | Most Recent Commit | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|
Awesome Datahoarding | 892 | 7 months ago | 4 | | | List of data-hoarding related tools |
Archivebot | 328 | 5 months ago | 169 | mit | Python | ArchiveBot, an IRC bot for archiving websites |
Bitextor | 260 | 7 months ago | 4 | gpl-3.0 | Python | Bitextor generates translation memories from multilingual websites |
Google Group Crawler | 213 | 2 years ago | 6 | | Shell | [Deprecated] Get (almost) original messages from Google Group archives. Your data is yours. |
Authority Data | 83 | 3 months ago | 1 | gpl-3.0 | Python | Official authoritative data (Chinese): statistical yearbooks, statistical bulletins, internet industry reports, Ministry of Industry and Information Technology data, ICT reports, etc. |
Fetchurls | 79 | 2 years ago | 1 | mit | Shell | A bash script to spider a site, follow links, and fetch URLs (with built-in filtering) into a generated text file. |
Wget Lua | 72 | 4 months ago | 10 | gpl-3.0 | C | Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. |
Wmirror | 11 | 2 years ago | | gpl-3.0 | Shell | wmirror downloads any website from the Internet to a local directory, recursively rebuilding its directory structure and fetching HTML, images, and other files from the server to your computer. |
Metagoofeel | 10 | a year ago | | mit | Shell | Web crawler and downloader based on GNU Wget. |
Pywebquery | 10 | 12 years ago | | | Python | A jQuery-like Pythonic web crawler library based on BeautifulSoup and wget. |