This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.

Overview

Robots.txt tester

With this script, you can enumerate all URLs present in robots.txt files, and test whether you can access them or not.
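As a rough illustration of the enumeration step, the sketch below extracts the Allow, Disallow and Sitemap entries from a robots.txt body and turns them into absolute URLs. It is a minimal sketch and not the tool's actual implementation: the function name `extract_robots_urls` and the sample data are assumptions made for this example, and wildcard patterns such as `*` or `$` are passed through untouched.

```python
from urllib.parse import urljoin


def extract_robots_urls(base_url, robots_body):
    """Return absolute URLs for Allow/Disallow/Sitemap entries, in order, without repeats."""
    urls = []
    for raw_line in robots_body.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        # Skip unrelated fields (e.g. User-agent) and empty rules.
        if field not in ("allow", "disallow", "sitemap") or not value:
            continue
        url = urljoin(base_url, value)
        if url not in urls:
            urls.append(url)
    return urls


# Hypothetical robots.txt content used only for this example.
sample = """User-agent: *
Disallow: /admin/
Allow: /public/  # comment
Disallow:
Sitemap: https://www.example.com/sitemap.xml
Disallow: /admin/
"""
found = extract_robots_urls("http://www.example.com/", sample)
for url in found:
    print(url)
```

Each URL returned this way can then be requested to see whether the server actually serves it.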

Setup

Clone the repository and install the dependencies:

git clone https://github.com/p0dalirius/robotstester
cd robotstester
python3 setup.py install

Usage

robotstester -u http://www.example.com/

The complete list of options is shown below:

[~] Robots.txt tester, v1.2.0

usage: robotstester.py [-h] (-u URL | -f URLSFILE) [-v] [-q] [-k] [-L] [-t THREADS] [-p] [-j JSONFILE] [-x PROXY] [-b COOKIES]

This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.

optional arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     URL to the robots.txt to test e.g. https://example.com:port/path
  -f URLSFILE, --urlsfile URLSFILE
                        List of robots.txt urls to test
  -v, --verbose         verbosity level (-v for verbose, -vv for debug)
  -q, --quiet           Show no information at all
  -k, --insecure        Allow insecure server connections when using SSL (default: False)
  -L, --location        Follow redirects (default: False)
  -t THREADS, --threads THREADS
                        Number of threads (default: 5)
  -p, --parsable        Parsable output
  -j JSONFILE, --jsonfile JSONFILE
                        Save results to specified JSON file.
  -x PROXY, --proxy PROXY
                        Specify a proxy to use for requests (e.g., http://localhost:8080)
  -b COOKIES, --cookies COOKIES
                        Specify cookies to use in requests. (e.g., --cookies "cookie1=blah;cookie2=blah")
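To give an idea of what testing URLs with a pool of threads can look like, here is a minimal sketch built on the standard library only. It is not taken from the robotstester source: `fetch_status` and `check_urls` are hypothetical names, the default of 5 workers simply mirrors the documented `--threads` default, and `urllib` follows redirects by default, which differs from the tool's documented default for `--location`.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code
    except (urllib.error.URLError, OSError, ValueError):
        return None


def check_urls(urls, fetch=fetch_status, threads=5):
    """Check every URL concurrently and map each one to its result."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so zip pairs each URL with its own result.
        return dict(zip(urls, pool.map(fetch, urls)))


# Offline demonstration: a stub lookup stands in for real HTTP requests.
stub = {
    "http://www.example.com/admin/": 403,
    "http://www.example.com/public/": 200,
}
results = check_urls(list(stub), fetch=stub.get)
for url, status in results.items():
    print(status, url)
```

Passing the fetch function as a parameter keeps the concurrency logic separate from the HTTP details, so proxy, cookie or TLS handling could be added in `fetch_status` without touching `check_urls`.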

Contributing

Pull requests are welcome. Feel free to open an issue if you want to add other features.

Comments
  • [Feature] Add Wayback Machine capability

    In the past few days I've been experimenting with using the Wayback Machine to enumerate robots.txt endpoints.

    Sometimes robots.txt gets removed, and sometimes the removed content can be juicy. Hence the idea of searching every Wayback Machine snapshot for old robots.txt entries.

    I've implemented a quick and basic script as a PoC, but I feel this repo has the power to bring it to the next level, since a lot of good features are already in place.

    https://gist.github.com/felipecaon/035ad1718c3cae681d2afb03c699795f

    The gist works by getting all the robots.txt entries from the Wayback Machine, parsing them, and sending them to stdout. The script does not remove duplicates; it only does a basic word removal.

    If I have the time I may be able to open a PR. But if someone wants to take it further, I would love to see that. The core Wayback Machine endpoints to use are in my gist file.

    opened by felipecaon
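One way to combine robots.txt entries gathered from several archived snapshots without repeats is sketched below. It assumes the snapshot bodies have already been downloaded as strings; how they are fetched is left out on purpose. The function name `merge_snapshot_paths` and the sample snapshots are hypothetical and do not come from the linked gist or from robotstester.

```python
def merge_snapshot_paths(snapshot_bodies):
    """Collect unique Allow/Disallow paths across several robots.txt bodies."""
    seen = set()
    merged = []
    for body in snapshot_bodies:
        for raw_line in body.splitlines():
            # Ignore comments, then split "Field: value" on the first colon.
            line = raw_line.split("#", 1)[0].strip()
            field, sep, value = line.partition(":")
            if not sep or field.strip().lower() not in ("allow", "disallow"):
                continue
            path = value.strip()
            # Keep first-seen order while skipping paths already recorded.
            if path and path not in seen:
                seen.add(path)
                merged.append(path)
    return merged


# Hypothetical snapshot contents used only for this example.
old = "User-agent: *\nDisallow: /backup/\nDisallow: /admin/\n"
new = "User-agent: *\nDisallow: /admin/\nAllow: /public/\n"
paths = merge_snapshot_paths([old, new])
for path in paths:
    print(path)
```

Paths that appear only in older snapshots, such as `/backup/` here, are exactly the removed entries the feature request is interested in.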
Releases (1.2)
  • 1.2 (Jul 7, 2021)

    Added --parsable option

  • 1.0 (Jul 5, 2021)

    [~] Robots.txt tester, v1.0
    
    usage: robotstester.py [-h] [-u URL | -f URLSFILE] [-v] [-q] [-k] [-L] [-t THREADS] [-j JSONFILE] [-x PROXY] [-b COOKIES]
    
    This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.
    
    optional arguments:
      -h, --help            show this help message and exit
      -u URL, --url URL     URL to the robots.txt to test e.g. https://example.com:port/path
      -f URLSFILE, --urlsfile URLSFILE
                            List of robots.txt urls to test
      -v, --verbose         verbosity level (-v for verbose, -vv for debug)
      -q, --quiet           Show no information at all
      -k, --insecure        Allow insecure server connections when using SSL (default: False)
      -L, --location        Follow redirects (default: False)
      -t THREADS, --threads THREADS
                            Number of threads (default: 5)
      -j JSONFILE, --jsonfile JSONFILE
                            Save results to specified JSON file.
      -x PROXY, --proxy PROXY
                            Specify a proxy to use for requests (e.g., http://localhost:8080)
      -b COOKIES, --cookies COOKIES
                            Specify cookies to use in requests. (e.g., --cookies "cookie1=blah;cookie2=blah")
    
Owner
Podalirius
Hacker of everything