Daily Scraper

Fetches information about every webpage 🤖

The service uses Puppeteer to drive headless Chrome and scrape webpages. Currently its only purpose is to provide information when a user suggests a new source. For a given page, the scraper can find the icon, RSS feed, name, and other relevant information.
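
To make the kind of extraction described above more concrete, here is a minimal sketch. It is not the project's code and it does not use Puppeteer: it assumes the HTML is already available as a string and pulls the fields out with regular expressions. The names extractPageInfo and getAttr are hypothetical, chosen for illustration only.

```javascript
// Hypothetical helper: read one attribute value from a single HTML tag string.
function getAttr(tag, name) {
  const match = tag.match(new RegExp(`\\b${name}\\s*=\\s*["']([^"']*)["']`, "i"));
  return match ? match[1] : null;
}

// Hypothetical sketch: pull the name, icon, and RSS feed out of raw HTML.
// A real scraper would query the rendered DOM instead of using regexes.
function extractPageInfo(html, baseUrl) {
  const titleMatch = html.match(/<title[^>]*>([^<]*)<\/title>/i);
  const linkTags = html.match(/<link\b[^>]*>/gi) || [];

  let icon = null;
  let rss = null;
  for (const tag of linkTags) {
    const rel = (getAttr(tag, "rel") || "").toLowerCase();
    const type = (getAttr(tag, "type") || "").toLowerCase();
    const href = getAttr(tag, "href");
    if (!href) continue;
    // Resolve relative links against the page URL.
    const absolute = new URL(href, baseUrl).href;
    if (!icon && rel.split(/\s+/).includes("icon")) icon = absolute;
    if (!rss && rel === "alternate" && type === "application/rss+xml") rss = absolute;
  }

  return { name: titleMatch ? titleMatch[1].trim() : null, icon, rss };
}
```

Calling extractPageInfo(html, "https://example.com/blog/") returns an object with name, icon, and rss fields, each set to null when the page does not declare it. Running the same lookups against the DOM in headless Chrome, as the service does, also covers pages that add these tags with JavaScript.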

Stack

  • Node v12.19.0 (an .nvmrc file is provided for nvm users).
  • NPM for managing dependencies.
  • Fastify as the web framework.

Project structure

  • __tests__ - Contains all the tests and fixtures. Tests are written with Jest.
  • helm - Contains the service's Helm chart for deploying it to Kubernetes.
  • src - Contains the source files.
    • scrape - Utility functions for scraping information from a webpage.

Local environment

Daily Scraper does not need a database or any other service to run locally.

The .env file sets the required environment variables and is loaded automatically by the project.

Finally, run npm run dev to start the service, which listens on port 5001.
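
As a rough illustration of how a .env file can feed the listening port, here is a small sketch. It is not the project's loader: parseEnv and resolvePort are hypothetical helpers, and PORT is an assumed variable name that the README does not confirm. Only the fallback value 5001 comes from the text above.

```javascript
// Hypothetical sketch: parse simple KEY=VALUE lines, skipping blanks and comments.
// Real .env loaders also handle quoting and escapes, which this does not.
function parseEnv(text) {
  const vars = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;
    vars[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return vars;
}

// PORT is an assumed variable name; 5001 is the port mentioned in the README.
function resolvePort(env) {
  const port = Number.parseInt(env.PORT, 10);
  return Number.isInteger(port) && port > 0 ? port : 5001;
}
```

With an empty or missing PORT entry, resolvePort falls back to 5001, which matches the port the service is described as listening on.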

Want to Help?

So you want to contribute to Daily Scraper and make an impact? We are glad to hear it. 😍

Before you proceed, we have a few contribution guidelines that will make everything much easier. We would appreciate it if you took the time to read them carefully: https://github.com/dailydotdev/.github/blob/master/CONTRIBUTING.md
