Awesome Open Source

Search results for crawler

3,834 search results found

Filemasta ⭐ 646

A search application to explore, discover and share online files

Public Amazon Crawler ⭐ 645

Google Play Scraper ⭐ 645

Google play scraper for Python inspired by <facundoolano/google-play-scraper>

Tailwindui Crawler ⭐ 642

tailwindui-crawler downloads the component HTML files locally

Yacy_grid_crawler ⭐ 639

Crawler Microservice for the YaCy Grid

Course Crawler ⭐ 633

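🎓 MOOC course downloader for China University MOOC (中国大学MOOC), XuetangX (学堂在线), NetEase Cloud Classroom (网易云课堂), CNMOOC (好大学在线), and iCourse (爱课程).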

English dictionary and word-list database: CET-4, CET-6, postgraduate entrance exam, IELTS, and TOEFL vocabulary; SAT GMAT TOEFL GRE

Hacker News Digest ⭐ 620

📰 Let ChatGPT Summarize Hacker News for You

Easy Scraping Tutorial ⭐ 618

Simple but useful Python web scraping tutorial code.

Voyager ⭐ 618

crawl and scrape web pages in rust

Brozzler ⭐ 613

brozzler - distributed browser-based web crawler

Fictiondown ⭐ 601

Novel download | novel scraping | Qidian | Biquge | export to Markdown | export to txt | convert to epub | ad filtering | automatic proofreading

Magnet Dht ⭐ 591

✌️ Python3 BitTorrent DHT crawler

Xehentai ⭐ 591

Doujinshi (adult manga) downloader

Http Status Check ⭐ 587

CLI tool to crawl a website and check HTTP status codes

Fast high-level web crawling Ruby framework

Newcrawler ⭐ 583

Free Web Scraping Tool with Java

Adminhack ⭐ 581

today we will hack the admin panel of the site.

Chatweb ⭐ 573

ChatWeb can crawl web pages, read PDF, DOCX, TXT, and extract the main content, then answer your questions based on the content, or summarize the key points.

Scrapedin ⭐ 572

LinkedIn Scraper (currently working 2020)

Python Fxxk Spider ⭐ 571

A collection of free Python crawler projects

OSINT Swiss Army Knife

Netdiscovery ⭐ 557

NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.

Thea11ymachine ⭐ 553

The A11y Machine is an automated accessibility testing tool which crawls and tests pages of any web application to produce detailed reports.

Tumblthree ⭐ 552

A Tumblr and Twitter Blog Backup Application

API of DouYin for Humans used to Crawl Popular Videos and Music

Jvppeteer ⭐ 549

Headless Chrome For Java (Java crawler)

Python Gems ⭐ 544

Beautifully constructed python scripts

Google Play Crawler ⭐ 539

Play with Google Play API :)

Go_jobs ⭐ 536

A look at the job market for Golang

A request wrapper built on the Xiaohongshu web client. https://reajason.github.io/xhs/

Bounty Targets ⭐ 522

This project crawls bug bounty platform scopes (like Hackerone/Bugcrowd/Intigriti/etc) hourly and dumps them into the bounty-targets-data repo

Nintendo Switch Eshop ⭐ 513

Crawler for Nintendo Switch eShop

a new Python-based crawler with more functions, including network fingerprint search

Domains ⭐ 508

World’s single largest Internet domains dataset

Progresskit ⭐ 498

Progress Views for Cocoa

An example of use of compute shaders and procedural instancing.

Crawljax ⭐ 493

Arrowdl ⭐ 487

ArrowDL (Arrow Downloader) is a download manager for Windows, MacOS and Linux

Spidermon ⭐ 486

Scrapy Extension for monitoring spiders execution.

Nodejs Stuff ⭐ 484

Node.js libs I want to keep in mind.

Blackfire Player is a powerful Web Crawling, Web Testing, and Web Scraper application. It provides a nice DSL to crawl HTTP services, assert responses, and extract data from HTML/XML/JSON responses.

Personrelationknowledgegraph ⭐ 480

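ChinesePersonRelationGraph, person relationship extraction based on NLP methods. A Chinese person-relationship knowledge graph project, covering construction of the Chinese person-relationship graph, knowledge-base-driven data back-labeling, based on distant…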

swiss army knife for hackers

React Scanner ⭐ 474

Extract React components and props usage from code.

Browsertrix Crawler ⭐ 470

Run a high-fidelity browser-based crawler in a single Docker container

Commoncrawl ⭐ 466

Common Crawl support library to access 2008-2012 crawl archives (ARC files)

Webster ⭐ 465

a reliable high-level web crawling & scraping framework for Node.js.

Sentiment Analysis In Event Driven Stock Price Movement Prediction ⭐ 462

Use NLP to predict stock price movement associated with news

Convert HTML to Markdown.

Pywebcopy ⭐ 455

Locally saves webpages to your hard disk with images, css, js & links as is.

Tarantula ⭐ 453

a big hairy fuzzy spider that crawls your site, wreaking havoc

Scrapple ⭐ 452

A framework for creating semi-automatic web content extractors

Awesome Scrapy ⭐ 450

A curated list of awesome packages, articles, and other cool resources from the Scrapy community.

Spidersuite ⭐ 447

Advanced web spider/crawler for cyber security professionals

Crack Js Spider ⭐ 442

JS reverse engineering: cracking encrypted anti-crawler JS parameters. Already cracked: Geetest slider w (2022.2.19), QQ Music sign (20

Learnpython ⭐ 437

Basic Python practice code and assorted crawler code

Movie metadata scraper

Code repository for the book 《爬虫逆向进阶实战》 (Advanced Crawler Reverse Engineering in Practice)

The fastest web crawler written in Rust. Maintained by @a11ywatch.

Html2article ⭐ 425

HTML web page main-content extraction

Malspider ⭐ 425

Malspider is a web spidering framework that detects characteristics of web compromises.

Comicbook ⭐ 420

This project is no longer maintained; join the group for details: https://t.me/onecomicbook

Opensearchserver ⭐ 419

Open-source Enterprise Grade Search Engine Software

Fbcrawl ⭐ 415

A Facebook crawler

Wscan is a web security scanner that focuses on web security, dedicated to making web security accessible to everyone.

Pulsarrpa ⭐ 413

Automate webpages at scale, scrape web data completely and accurately with high performance, distributed RPA.

Simple, but smart, multi-threaded web crawler for randomly gathering huge lists of unique domain names.

Playdrone ⭐ 405

Google Play Crawler

Linkedin Profile Scraper Api ⭐ 404

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Fastladder ⭐ 402

Fastladder Open Source [Forked]

Sparkler ⭐ 401

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Isp Data Pollution ⭐ 400

ISP Data Pollution to Protect Private Browsing History with Obfuscation

dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators

Music Recover ⭐ 397

🎵 Convert cache files to MP3 files

Dataflowkit ⭐ 394

Extract structured data from web sites. Web sites scraping.

a smart stream-like crawler & etl python library

Sstap_ip_crawl_tool ⭐ 392

A script that automatically obtains a game's remote IPs and writes them out as SSTAP/NETCH rule files

Signature_algorithm ⭐ 387

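Request signature or encryption algorithms for various apps, mini programs, and websites. Currently covers: Ziroom (自如), Xiaohongshu (小红书), Danke Apartment (蛋壳公寓), luckin coffee, bangkokair (Bangkok Airways)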

Scavenger ⭐ 384

Crawler (Bot) searching for credential leaks on paste sites.

WarcDB: Web crawl data as SQLite databases.

Scrapybook ⭐ 378

Scrapy Book Code

Search Engines Scraper ⭐ 377

Search google, bing, yahoo, and other search engines with python

Instagramcrawler ⭐ 373

A non API python program to crawl public photos, posts or followers

Coronadatascraper ⭐ 372

COVID-19 Coronavirus data scraped from government and curated data sources.

Linkcheck ⭐ 371

Fast link checker

Blockchain Transactions Investigation Tool

Bitcoin Seeder ⭐ 369

Ants Go ⭐ 368

open source, distributed, restful crawler engine in golang

Iclr2019 Openreviewdata ⭐ 366

Script that crawls meta data from ICLR OpenReview webpage. Tutorials on installing and using Selenium and ChromeDriver on Ubuntu.

Iclr2021 Openreviewdata ⭐ 365

Crawl & visualize ICLR papers and reviews.

Prerender_rails ⭐ 358

Rails middleware gem for prerendering javascript-rendered pages on the fly for SEO

Keras Quora Question Pairs ⭐ 357

A Keras model that addresses the Quora Question Pairs dyadic prediction task.

Iclr2020 Openreviewdata ⭐ 355

Script that crawls meta data from ICLR OpenReview webpage. Tutorials on installing and using Selenium and ChromeDriver on Ubuntu.

Moodle Dl ⭐ 355

Moodle-DL downloads course content fast from Moodle (eg. lecture pdfs)

Lighthouse Parade ⭐ 355

A Node.js command line tool that crawls a domain and gathers lighthouse performance data for every page.

Videodl ⭐ 354

Videodl: A lightweight video downloader written by pure python.

Gospider ⭐ 354

A crawler framework implemented in Golang; users only need to care about page rules, and a web management UI is provided. Built on colly.

👩 Crawler for model photo sets (part 1)

Related Searches

Python Crawler (4,545)

Javascript Crawler (1,142)

Crawler Scrapy (988)

Scraper Crawler (896)

Java Crawler (807)

Crawler Spider (709)

201-300 of 3,834 search results
