Awesome Open Source
Search results for robots txt
64 search results found
Gocrawl (⭐ 1,929): Polite, slim and concurrent web crawler.
Advertools (⭐ 975): Online marketing productivity and analysis tools.
Fetchbot (⭐ 758): A simple and flexible web crawler that follows robots.txt policies and crawl delays.
Robots (⭐ 364): NuxtJS module for robots.txt.
Polite (⭐ 310): Be nice on the web.
Robotstxt (⭐ 242): The robots.txt exclusion protocol implementation for the Go language.
Infinitycrawler (⭐ 221): A simple but powerful web crawler library for .NET.
Crawler Commons (⭐ 217): A set of reusable Java components that implement functionality common to any web crawler.
Robots Txt (⭐ 201): Determine if a page may be crawled from robots.txt, robots meta tags and robots headers.
Weboptout (⭐ 191): Opt-out tool to check copyright reservations in a way that even machines can understand.
Robots Parser (⭐ 137): Node.js robots.txt parser with support for wildcard (*) matching.
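Several of the parsers listed here advertise wildcard support. Under the Robots Exclusion Protocol (RFC 9309), `*` in a rule path matches any sequence of characters and a trailing `$` anchors the pattern to the end of the URL path. A minimal Python sketch of that matching rule (an illustration of the semantics, not any listed library's implementation):

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt rule path against a URL path.

    '*' matches any character sequence; a trailing '$' anchors
    the pattern to the end of the path (RFC 9309 semantics).
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments, join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # Rules match as path prefixes, so match from the start only.
    return re.match(regex, path) is not None

print(rule_matches("/private/*.html$", "/private/a.html"))  # True
print(rule_matches("/*.php$", "/index.php?x=1"))            # False
```

Real parsers also apply the longest-match-wins rule between competing Allow and Disallow lines, which this sketch omits.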
Gflare Tk (⭐ 110): Open-source Python-based SEO web crawler.
Gatsby Plugin Robots Txt (⭐ 107): Gatsby plugin that automatically creates robots.txt for your site.
Grobotstxt (⭐ 85): A native Go port of Google's robots.txt parser and matcher library.
Robots.txt Parser Class (⭐ 79): PHP class for parsing robots.txt.
Astro Lib (⭐ 79): Makes it easy to add robots.txt, a sitemap and a web app manifest to your Astro app at build time.
Ultimate Sitemap Parser (⭐ 76): Ultimate website sitemap parser.
Robots.txt (⭐ 69): Simple robots.txt template that keeps unwanted robots out (disallow) and whitelists (allow) legitimate user agents. Useful for all websites.
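A template in that spirit needs only a handful of directives. A minimal sketch using standard Robots Exclusion Protocol syntax (the paths, bot name and sitemap URL are placeholders):

```
# Default policy: allow everything except private areas
User-agent: *
Disallow: /admin/
Disallow: /tmp/

# Block one specific crawler entirely
User-agent: BadBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
```

Rules are grouped per `User-agent`; an empty `Disallow:` (or an explicit `Allow:`) permits crawling, and `Disallow: /` blocks the whole site for that agent.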
Robotstxt
⭐
68
robots.txt file parsing and checking for R
Robots.js
⭐
63
Parser for robots.txt for node.js
Generate Robotstxt
⭐
63
Generator robots.txt for node js
Robotstxt
⭐
58
A native Rust port of Google's robots.txt parser and matcher C++ library.
Librengine
⭐
55
Privacy Web Search Engine (not meta, own crawler)
Protego
⭐
44
A pure-Python robots.txt parser with support for modern conventions.
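Most of these parsers expose the same check-before-fetch shape of API. Python's standard library ships one in `urllib.robotparser`; a minimal usage sketch (using the stdlib parser, not Protego's own API, and parsing an inline robots.txt string rather than fetching a live site):

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 10
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Ask before fetching: blocked path vs. allowed path.
print(rp.can_fetch("mybot", "https://example.com/admin/page"))  # False
print(rp.can_fetch("mybot", "https://example.com/public"))      # True
print(rp.crawl_delay("mybot"))                                  # 10
```

In a real crawler you would call `rp.set_url(...)` and `rp.read()` to fetch the site's robots.txt instead of parsing a string.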
Dark_web.py (⭐ 43): Dark Web information-gathering, footprinting and recon tool written in Python 3. It needs only a domain or IP to run and works on any Linux distribution that supports Python 3. Author: AKASHBLACKHAT (help for ethical hackers).
Useful Links (⭐ 42): List of useful links, tools and resources.
Robotstxt Webpack Plugin (⭐ 32): A webpack plugin to generate a robots.txt file.
Jsitemapgenerator (⭐ 26): Java sitemap generator. The library generates a web sitemap, can ping Google, and generates an RSS feed, robots.txt and more in a friendly, easy-to-use Java 8 functional style.
.netcorepluginmanager (⭐ 26): .NET Core plugin manager; extends web applications using plugin technology, enabling true SOLID and DRY principles when developing applications.
Nuxt Humans Txt (⭐ 24): 🧑🏻👩🏻 "We are people, not machines": an initiative to credit the creators of a website. A Nuxt module to statically integrate and generate a humans.txt author file, based on the HumansTxt Project.
Waybackrobots (⭐ 24): Enumerate old versions of robots.txt paths using the Wayback Machine for content discovery.
Ai Training Opt Out (⭐ 23): Known tags and settings suggested to opt out of having your content used for AI training.
Robotstxtparser (⭐ 21): An extensible robots.txt parser and client library with full support for every directive and specification.
Seohelper (⭐ 20): This package helps you add meta tags, sitemap.xml and robots.txt to your project easily.
Webscraper (⭐ 19): Python-based web crawling script with randomized intervals, user-agent rotation and proxy IP rotation to avoid detection and blocking.
Robotsexclusiontools (⭐ 18): A robots.txt parsing and querying library for .NET.
Kirby3 Robots Txt (⭐ 18): Manage robots.txt from the Kirby config file.
Astro Launchpad (⭐ 17): An Astro project template with auth, i18next, Bootstrap, sitemap, web worker, robots.txt, Preact, React, endpoints, endpoint clients, OAuth and various other Astro features and data loading preconfigured.
Robotify Netcore (⭐ 15): Provides robots.txt middleware for .NET Core.
Robotspy (⭐ 13): Alternative robots.txt parser module for Python.
Laravel Robots (⭐ 13): Laravel package to manage robots.
Behat Seo Contexts (⭐ 13): Behat extension for testing on-page SEO factors: meta title/description, canonical, hreflang, meta robots, robots.txt, redirects, sitemap validation, HTML validation, performance and more.
Robots.txt (⭐ 13): 🤖 robots.txt as a service. Crawls robots.txt files, then downloads and parses them to check rules through an API.
Silverstripe Robots (⭐ 12): Simple robots.txt generation module for SilverStripe (SS 4 and above).
Gollum (⭐ 12): robots.txt parser and fetcher for Elixir.
Next With Sitemap (⭐ 10): Higher-order Next.js config to generate sitemap.xml and robots.txt.
Robotstxt (⭐ 10): Go robots.txt parser.
Spiderbar (⭐ 10): Lightweight R wrapper around rep-cpp for robots.txt (Robots Exclusion Protocol) parsing and path testing.
Multisite Robotstxt Manager (⭐ 10): Quickly and easily manage all robots.txt files on a WordPress multisite network.
Robots_txt (⭐ 9): Lightweight robots.txt parser and generator written in Rust.
Sitecrawler (⭐ 9): TYPO3 sitemap crawler.
Robots Txt Parser (⭐ 9): A lightweight robots.txt parser for Node.js with support for wildcards, caching and promises.
Kirby Robots Writer (⭐ 8): Robots for Kirby CMS.
Robotstxt Change Monitor (⭐ 8): Monitor and report changes across one or more robots.txt files.
Friendly Robots (⭐ 7): A friendly tool for creating dynamic robots.txt files in Umbraco.
Pico Robots (⭐ 7): Pico's official robots plugin to add a robots.txt and sitemap.xml to your website. Pico is a stupidly simple, blazing-fast, flat-file CMS.
Scrawler (⭐ 7): Declarative, scriptable web robot (crawler) and scraper.
Pyrobots (⭐ 7): A tool that collects all paths from robots.txt and opens them in the browser.
Robotsvalidator (⭐ 6): A Python script to check whether URLs are allowed or disallowed by a robots.txt file.
Robots Txt (⭐ 6): Robots Exclusion Standard/Protocol parser for web crawling/scraping.
Blue (⭐ 6): 🕵️♂️ Information gathering tool.
Chanakya (⭐ 6): Scan websites for multiple things such as honeypots, WHOIS data, open ports and more.
Nexttypes (⭐ 5): NextTypes is a standards-based information storage, processing and transmission system that integrates characteristics of other systems, such as databases, programming languages, communication protocols, file systems, document managers, operating systems, frameworks, file formats and hardware, into a single, tightly integrated system with a common data type system.
Robots Parse (⭐ 5): A lightweight and simple robots.txt parser for Node.
Copyright 2018-2024 Awesome Open Source. All rights reserved.