Project Name | Stars | Most Recent Commit | Total Releases | Latest Release | License | Language | Description |
---|---|---|---|---|---|---|---|
Web2db | 13 | 3 years ago | 6 | September 22, 2020 | | Python | Fetch webpage full text and persist the link and full text to a SQLite3 DB; resumable, with a tqdm progress bar. |
Firelinks | 10 | 11 years ago | | | | Ruby | Sync elinks with Firefox. |
Web Scraping Box Office Mojo | 6 | 4 years ago | | | | Python | Scrapes the table for each year and collects all tables into a SQLite database. |
Reask | 4 | 5 years ago | | | mit | JavaScript | Reask is a full-stack project developed with React and Flask. :rocket: |
Sqlite_bookstore | 2 | 7 years ago | | | | Python | An example SQLite implementation of an Online Bookstore (well... Marketplace) for my Database Systems class. |

Fetches the full text of the input URLs and persists it to a SQLite3 DB file.
Fetching is resumable and shows a progress bar.
Install:
pip install web2db
Usage:
import web2db

# Fetch each URL and store its full text in data.db
web2db.dump('data.db', urls=[
    'https://www.google.com',
    'https://www.yahoo.com',
    'https://www.msn.com',
])
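The resumable behaviour described above can be illustrated with a minimal sketch. This is not web2db's actual implementation: `dump_resumable` and its `fetch` parameter are hypothetical names, the primary key on `url` is an assumption, and the progress bar is left out. The idea shown is simply to skip URLs that are already stored and to commit after every page, so an interrupted run can pick up where it stopped.

```python
import sqlite3
from typing import Callable, Iterable, Tuple


def dump_resumable(
    db_path: str,
    urls: Iterable[str],
    fetch: Callable[[str], Tuple[str, int]],
) -> int:
    """Fetch every URL not yet stored; return the number of rows added.

    `fetch` is a hypothetical callable returning (fulltext, status_code).
    """
    conn = sqlite3.connect(db_path)
    try:
        # Assumed schema, modelled on the WebPages table in this README.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS WebPages ("
            "url TEXT PRIMARY KEY, fulltext TEXT, status_code INT)"
        )
        # URLs saved by an earlier (possibly interrupted) run.
        done = {row[0] for row in conn.execute("SELECT url FROM WebPages")}
        added = 0
        for url in urls:
            if url in done:
                continue  # already stored, so a re-run skips it
            text, status = fetch(url)
            conn.execute(
                "INSERT INTO WebPages VALUES (?, ?, ?)", (url, text, status)
            )
            conn.commit()  # commit per page: an interruption loses at most one
            done.add(url)
            added += 1
        return added
    finally:
        conn.close()
```

Calling the function a second time with the same DB file and a longer URL list fetches only the URLs that are new.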
Query the DB file:
# Load the stored pages from the DB file written by dump()
df = web2db.to_df('data.db')
print(df.shape)
print(df)
DB schema (table WebPages):
url | fulltext | status_code |
---|---|---|
text | text | int |
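
Given the schema above, the DB file can also be read with Python's standard `sqlite3` module, without going through web2db. The sketch below assumes only the table and column names listed above; `pages_with_status` is a hypothetical helper, not part of web2db.

```python
import sqlite3
from typing import List, Tuple


def pages_with_status(db_path: str, status_code: int = 200) -> List[Tuple[str, int]]:
    """Return (url, length of fulltext) for rows with the given status code."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT url, length(fulltext) FROM WebPages "
            "WHERE status_code = ? ORDER BY url",
            (status_code,),
        ).fetchall()
    finally:
        conn.close()
```

For example, `pages_with_status('data.db', 404)` would list the stored URLs that returned a 404, which can help decide which pages to fetch again.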