Awesome Open Source

Search results for web crawler

1,484 search results found

Web_scraper ⭐ 73

A very basic web scraper implementation to scrape HTML elements from a web page.

Spotifyscraper ⭐ 72

Spotify scraper to extract all the information from Spotify and download MP3s with the song's cover art

Davedavefind ⭐ 71

A simple search engine based on the web crawler developed in Udacity's CS101 course.

Catalyst ⭐ 71

A VS code Extension to accelerate the process of solving problems on Codeforces.

Yet Another Web Spider

Bancocentralbrasil ⭐ 71

💵 💰 🇧🇷 Information on official daily rates for inflation, Selic, savings (Poupança), dollar, PTAX dollar, euro, and PTAX euro from the Banco Central do Brasil website

Brokenlinkhijacker ⭐ 71

A Fast Broken Link Hijacker Tool written in Python

Scrapy Wayback Machine ⭐ 70

A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

The Hawker gem is a web scraper which allows you to pull the basic information for a given social media profile URL

Autoscrape Py ⭐ 70

An automated, programming-free web scraper for interactive sites

Webkitcrawler ⭐ 69

QtWebKit-based web crawler

Robotstxt ⭐ 68

robots.txt file parsing and checking for R

ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-0

Top Github Scraper ⭐ 67

Scrape top GitHub repositories and users based on keywords

Web Scraping ⭐ 67

Web Scraping with Beautiful Soup and Selenium

Schweizermesser ⭐ 66

🎯 Python 3 web crawler practice and data analysis collection | Dangdang | NetEase Cloud Music | Unsplash | Pizza Hut | Maoyan |

Data Wrangling With Python ⭐ 66

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

Youtube Audio ⭐ 66

Extract videos from YouTube in audio format using web scraping techniques 🎶

Cvpr2019 ⭐ 65

Displays all the accepted CVPR 2019 papers in a way that is easy to parse.

Decapitated ⭐ 64

Headless 'Chrome' Orchestration in R

I needed a serious web crawler for search engine applications. This is it.

Amazon_scraper ⭐ 64

Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt

Phd Seeker ⭐ 64

Finding the latest fully funded PhD positions for international students through web scraping

🍎 Receive an email or Telegram message as soon as Migros Sanalmarket is available for delivery in your neighborhood.

Malaysianpaygap ⭐ 63

Scraping malaysianpaygap and extracting data from the Instagram posts

Newspaperjs ⭐ 63

News extraction and scraping. Article Parsing

Dotnetcrawler ⭐ 63

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output based on dotnet core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-w

Faexport ⭐ 62

The API for Furaffinity you wish existed

Animewatcher ⭐ 62

The goal of this project/app is to let the user watch anime without ads. It uses Jsoup to extract data from the website and Exoplayer to show videos.

Simplestorm ⭐ 62

Simple Storm-like distributed application implementation

Distributed task redisqueue (the simplest Python distributed function scheduling framework)

Kenpompy ⭐ 61

A simple yet comprehensive web scraper for kenpom.com.

Apple Autofill Domains ⭐ 61

Apple's allowed autofill domains

Covid_19_jhu_data_web_scrap_and_cleaning ⭐ 61

This repository contains data and code used to get and clean data from https://github.com/CSSEGISandData/COVID-19 and https://www.worldometers.info/coronavirus/

Crawlzone ⭐ 60

Crawlzone is a fast asynchronous internet crawling framework for PHP.

Gocrawler ⭐ 60

A distributed web crawler implemented using Go, Postgres, RabbitMQ and Docker

The Python library for QtsApp, which displays the option chain in near real-time. This program retrieves this data from the QtsApp site and then generates useful analysis of the Option Chain for the specified Index or Stock. It also continuously refreshes the Option Chain along with Implied Volatility (IV), Open Interest (OI), Delta, Theta, Vega, Gamma, Vanna, Charm, Speed, Zomma, Color, Volga, Veta at an interval of a second and visually displays the trend in various indicators useful for Techn

Webreaper ⭐ 59

Web scraper, crawler and parser in C#. Designed as a simple, declarative and scalable web scraping solution.

Instagram Giveaways Winner ⭐ 59

Instagram Bot which when given a post url will spam mentions to increase the chances of winning. Win Instagram Giveaways!

Keyword_based_sina_weibo_crawler ⭐ 59

A web crawler for Sina that searches for and retrieves microblogs containing certain keywords. (A simple Python crawler exercise that crawls Sina Weibo posts containing the keywords.)

Song Cli ⭐ 58

A command-line interface for downloading Bollywood and Punjabi songs

Node Js Functionalities ⭐ 58

This repository contains useful RESTful APIs and functionality in Node.js, including tutorial code for mastering Node.js. All tutorials have been published on medium.com; tutorials link is given below

Scraping Tripadvisor With Python 2020 ⭐ 58

Python implementation of web scraping TripAdvisor with Selenium on the new 2019 website

Siteshooter ⭐ 58

📷 Automate full website screenshots and PDF generation with multiple viewport support.

Feedbridge ⭐ 57

Plugin based RSS feed generator for sites that don't offer any. Serves RSS, Atom and JSON Feeds.

Linkedin Email Extractor ⭐ 57

A Node web scraper to extract your LinkedIn connection emails

Clauneck ⭐ 57

A tool for scraping emails, social media accounts, and much more information from websites using Google Search Results.

Tripadvisor Scraper ⭐ 56

The basics of forming an input code for scraping travel industry pages with Tripadvisor Scraper API + an example of results.

Searchifyx ⭐ 56

Fast flashcard searcher study tool

Nba Search ⭐ 56

Flask application designed to explore NBA data 🏀

Talospider ⭐ 55

talospider - A simple, lightweight scraping micro-framework

Selectorlib ⭐ 55

A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them

Nodejs Web Scraper ⭐ 54

Pythonscrapybasicsetup ⭐ 54

Basic setup with random user agents and IP addresses for Python Scrapy Framework.

One repo to rule them all !!?!?!! 🤓 😎

Owlcrawler ⭐ 54

Crawl the web using nats.io and Go

Lectures ⭐ 53

Lecture material for Big Data in Economics (EC 410/510)

Mobile Phone Dataset Gsmarena ⭐ 53

Python script for creating Mobile Phones Dataset on GSMArena website.

Google Search Results Nodejs ⭐ 53

SerpApi client library for Node.js. Previously: Google Search Results Node.js.

Metacritic_api ⭐ 52

PHP Metacritic API - Mirror from my GitLab

Animex V2 ⭐ 52

animeX is a CLI tool for downloading anime directly to your PC

Web_check ⭐ 52

Script for checking changes in webpages

Llm_osint ⭐ 52

LLM OSINT is a proof-of-concept method of using LLMs to gather information from the internet and then perform a task with this information.

Fast command-line app that quickly requests millions of URLs and saves/echoes the results

Linkedin Profiles Scraping ⭐ 51

Automatically scrape the web data of people's profiles on LinkedIn based on a specific search query

Dashboard ⭐ 50

A tkinter GUI collating various data

Kampus Scraper ⭐ 50

Scraper & GraphQL API for data on higher education institutions in Indonesia, based on the website of the RISTEKDIKTI ministry (Kementrian RISTEKDIKTI).

Datadoubleconfirm ⭐ 50

Simple datasets and notebooks for data visualization, statistical analysis and modelling - with write-ups here: http://projectosyo.wix.com/datadoubleconfirm.

Comp_thinking_social_science ⭐ 50

Computational Thinking for Social Scientists book project

Laravel Books Api ⭐ 49

Fully documented & tested Laravel 9 RESTful books API scraped from Gramedia.

Hk0weather ⭐ 49

Web scraper project to collect useful Hong Kong weather data from the HKO website

Ds Ml Public ⭐ 49

Python Scripts and Jupyter Notebooks

Instagram Bot ⭐ 48

🤖 Python bot to view stories, like and comment on Instagram

Project Tauro ⭐ 48

A Router WiFi key recovery/cracking tool with a twist.

Python Libzim ⭐ 48

Libzim binding for Python: read/write ZIM files in Python

Maps To Lead ⭐ 48

This project aims to obtain leads in JSON format and send them to a webhook

Pysearch ⭐ 48

Web crawler and Search engine in Python.

Nightmare Heroku ⭐ 48

😱 a setup for nightmarejs on heroku

Python Assistant ⭐ 47

Python Assistant (PA) is a voice command based assistant service written in Python 3.9+. It can recognize human speech or voice, talk to user and execute basic commands.

Scrapy Craigslist ⭐ 47

Web Scraping Craigslist's Engineering Jobs in NY with Scrapy

Bookingscraper ⭐ 47

🌎 🏨 Scrape Booking.com 🏨 🌎

Webcrawler ⭐ 47

Just a simple web crawler which returns crawled links as IObservable using Reactive Extensions and async/await.

Webcollector Python ⭐ 47

WebCollector-Python is an open source web crawler framework based on Python. It provides some simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.

Trscraper ⭐ 47

TRScraper is an application developed for use in natural language processing applications; it enables text mining on large platforms with Turkish-language content.

Lead Generation ⭐ 46

A Python script that empowers people with no programming background to generate robust leads on a mass scale. This repo will comprise various versatile techniques used in lead generation.

Marketools ⭐ 46

Tools for stock market analysis.

Twitterbot ⭐ 45

🤖 CLI Twitter Bot. It's made to reach more engagement based on your interests.

Unofficial Python wrapper for NyaaPantsu API and Nyaa.si

Android Web Scraper ⭐ 45

Android Web Scraper is a simple library for Android web automation. You can perform web tasks in the background to fetch website data programmatically.

Cultured Downloader ⭐ 44

A project to automate the process of downloading images and other attachment files from Pixiv, Pixiv Fanbox, and Fantia

Uoft Scrapers ⭐ 44

Public web scraping scripts for the University of Toronto.

Keeper Core Api ⭐ 44

Nunux Keeper core API

Pulsarrpapro ⭐ 44

PulsarRPAPro is the professional edition of PulsarRPA with industrial grade scraping demos and the most advanced AI for auto extraction.

Blazingly fast web crawler for mapping and updating data

Public, free to use, repository with diggers configs for scraping / extracting data from various e-commerce websites and online stores

Letterboxd Watchlist Picker ⭐ 43

A simple website that gives you a random film off your Letterboxd watchlist (or any list).

Crawling Projects ⭐ 43

Web scraping and automation using python

Yellowpages Scraper ⭐ 43

Yellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular category and location.

Slack Gpt Bot ⭐ 43

GPT4-powered Slack bot that can scrape URL contents

301-400 of 1,484 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.