Awesome Open Source
Search results for python web archiving
19 search results found
Archivebox
⭐
19,721
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Conifer
⭐
1,434
Collect and revisit web pages.
Pywb
⭐
1,259
Core Python Web Archiving Toolkit for replay and recording of web archives
Ipwb
⭐
577
InterPlanetary Wayback: A distributed and persistent archive replay system using IPFS
Auto Archiver
⭐
439
Automatically archive links to videos, images, and social media content from Google Sheets (and more).
Archivenow
⭐
376
A Tool To Push Web Resources Into Web Archives
Wail
⭐
330
🐋 Web Archiving Integration Layer: One-Click User Instigated Preservation
Waybackpy
⭐
235
Wayback Machine API interface & a command-line tool
Warcio
⭐
173
Streaming WARC/ARC library for fast web archive IO
Sfm Ui
⭐
148
Social Feed Manager user interface application.
Ph Submissions
⭐
133
The repository and website hosting the peer review process for new Programming Historian lessons
Cdx_toolkit
⭐
121
A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine
Fatcat
⭐
98
Perpetual Access To The Scholarly Record
Warcworker
⭐
33
A dockerized, queued high fidelity web archiver based on Squidwarc
Metawarc
⭐
21
metawarc: a command-line tool for extracting metadata from files stored in WARC (Web ARChive) archives
Sandcrawler
⭐
19
Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki
Bookmark Archiver
⭐
18
🗄 Save an archived copy of websites from Pocket/Pinboard/Bookmarks/RSS. Outputs HTML, PDFs, and more...
Cdxj Indexer
⭐
17
CDXJ Indexing of WARC/ARCs
Seeder
⭐
15
Seeder - Czech webarchive curating tool and public site
Debian Archivebox
⭐
13
Home of the official apt/deb package for Ubuntu/Debian-based systems.
Pip Archivebox
⭐
13
Official Python package for ArchiveBox, the self-hosted internet archiving solution.
Webarchiver
⭐
9
Decentralized web archiving
Capture Urls
⭐
5
Archive a list of URLs using the Wayback Machine
Copyright 2018-2024 Awesome Open Source. All rights reserved.